Re: [PATCH 3/3] Handle const_vector in mulv4si3 for pre-sse4.1.
On Mon, Jun 18, 2012 at 10:06 PM, Richard Henderson r...@redhat.com wrote:

Please note that you will probably hit PR33329, this is the reason that we expand multiplications after reload. Please see [1] for further explanation. There is the gcc.target/i386/pr33329.c test to cover this issue, but it is not effective anymore since the simplification happens at the tree level.

[1] http://gcc.gnu.org/ml/gcc-patches/2007-09/msg00668.html

Well, even with the test case changed (s/*2/*12345/) so that it continues to use a multiply instead of devolving to a shift, it does not fail. There have been a lot of changes since 2007; I might hope that the underlying bug has been fixed.

Should we also change mulVI1_AVX23 and mulVI8_AVX23 from a pre-reload splitter to an expander in the same way?

Uros.
Re: [patch] Deal with #ident without
On Wed, Jun 20, 2012 at 2:21 AM, Hans-Peter Nilsson h...@bitrange.com wrote:
On Tue, 19 Jun 2012, Steven Bosscher wrote:

I've now committed this, see r188791.

Breaking cris-elf. Just try rebuilding cc1:

./gcc/gcc/../libdecnumber/dpd -I../libdecnumber \
  /tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c -o cris.o
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c: In function 'cris_asm_output_ident':
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2480: error: 'cgraph_state' undeclared (first use in this function)
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2480: error: (Each undeclared identifier is reported only once
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2480: error: for each function it appears in.)
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2480: error: 'CGRAPH_STATE_PARSING' undeclared (first use in this function)
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2478: warning: unused variable 'buf'
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2477: warning: unused variable 'size'
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2476: warning: unused variable 'section_asm_op'
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c: In function 'cris_option_override':
/tmp/hpautotest-gcc1/gcc/gcc/config/cris/cris.c:2538: error: 'flag_no_gcc_ident' undeclared (first use in this function)
make[2]: *** [cris.o] Error 1

Grr. A merge f*ck-up. This was in my testing tree on the compile farm but not in the patch I committed:

Index: config/cris/cris.c
===================================================================
--- config/cris/cris.c	(revision 188808)
+++ config/cris/cris.c	(working copy)
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.
 #include "optabs.h"
 #include "df.h"
 #include "opts.h"
+#include "cgraph.h"
 
 /* Usable when we have an amount to add or subtract, and want
    the optimal size of the insn.  */
@@ -2533,10 +2534,6 @@ cris_asm_output_case_end (FILE *stream,
 static void
 cris_option_override (void)
 {
-  /* We don't want an .ident for gcc.
-     It isn't really clear anymore why not.  */
-  flag_no_gcc_ident = true;
-
   if (cris_max_stackframe_str)
     {
       cris_max_stackframe = atoi (cris_max_stackframe_str);

I'm building a cross to cris-elf now, to be sure, and I'll commit this ASAP after that. Sorry for this...

Ciao!
Steven
Re: [patch committed testsuite] Tweak gcc.dg/stack-usage-1.c on SH
I've applied the attached patch, a tiny SH-specific change to the gcc.dg/stack-usage-1.c test. Tested on sh-linux and i686-pc-linux-gnu.

This is wrong, please remove the dg-options line and do like the other targets.

-- 
Eric Botcazou
RFA: PATCH to Makefile.def/tpl to add libgomp to make check-c++
The recent regression in libgomp leads me to want to add libgomp tests to the check-c++ target. OK for trunk?

commit 3eaa6c5b268115cbf4ab762b5d7b50022389ef25
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jun 19 18:16:34 2012 -0700

    * Makefile.tpl (check-target-libgomp-c++): New.
    * Makefile.def (c++): Add it.
    * Makefile.in: Regenerate.

diff --git a/Makefile.def b/Makefile.def
index 1449a50..2a0b8fa 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -518,7 +518,8 @@ dependencies = { module=configure-target-libgfortran; on=all-target-libquadmath;
 languages = { language=c; gcc-check-target=check-gcc; };
 languages = { language=c++; gcc-check-target=check-c++;
 	lib-check-target=check-target-libstdc++-v3;
-	lib-check-target=check-target-libmudflap-c++; };
+	lib-check-target=check-target-libmudflap-c++;
+	lib-check-target=check-target-libgomp-c++; };
 languages = { language=fortran; gcc-check-target=check-fortran;
 	lib-check-target=check-target-libquadmath;
 	lib-check-target=check-target-libgfortran; };
diff --git a/Makefile.in b/Makefile.in
index def860e..9cf3543 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -41116,6 +41116,13 @@ check-target-libmudflap-c++:
 
 @endif target-libmudflap
 
+@if target-libgomp
+.PHONY: check-target-libgomp-c++
+check-target-libgomp-c++:
+	$(MAKE) RUNTESTFLAGS="$(RUNTESTFLAGS) c++.exp" check-target-libgomp
+
+@endif target-libgomp
+
 # --
 # GCC module
 # --
@@ -41150,7 +41157,7 @@ check-gcc-c++:
 	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 	$(HOST_EXPORTS) \
 	(cd gcc && $(MAKE) $(GCC_FLAGS_TO_PASS) check-c++);
-check-c++: check-gcc-c++ check-target-libstdc++-v3 check-target-libmudflap-c++
+check-c++: check-gcc-c++ check-target-libstdc++-v3 check-target-libmudflap-c++ check-target-libgomp-c++
 
 .PHONY: check-gcc-fortran check-fortran
 check-gcc-fortran:
diff --git a/Makefile.tpl b/Makefile.tpl
index 371c3b6..f06a7ce 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -1415,6 +1415,13 @@ check-target-libmudflap-c++:
 
 @endif target-libmudflap
 
+@if target-libgomp
+.PHONY: check-target-libgomp-c++
+check-target-libgomp-c++:
+	$(MAKE) RUNTESTFLAGS="$(RUNTESTFLAGS) c++.exp" check-target-libgomp
+
+@endif target-libgomp
+
 # --
 # GCC module
 # --
Re: [patch committed testsuite] Tweak gcc.dg/stack-usage-1.c on SH
Eric Botcazou ebotca...@adacore.com wrote: This is wrong, please remove the dg-options line and do like the other targets. I'll revert that line and use my patch in the trail #11 of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621. Regards, kaz
Re: [PATCH] C++11, grammar fix for late-specified return types and virt-specifiers
Applied, thanks. Note that your dg-error regexp doesn't make much sense:

// { dg-error "expected type-specifier before 'final'||expected ';'||declaration doesn't declare anything" }

Regular expression "or" uses a single |, so with the empty alternatives this ends up being a long way of writing

// { dg-error "" }

I adjusted the dg-error lines to check only for the "expected type-specifier" error, and used dg-prune-output to discard the extra errors.

Jason
Re: C++ PATCH for c++/53484 (wrong auto in template)
On 06/15/2012 02:59 PM, Dominique Dhumieres wrote: Back when we added C++11 auto deduction, I thought we could shortcut the normal deduction in some templates, when the type is adequately describable (thus the late, unlamented function describable_type). Over time various problems with this have arisen, of which this is the most recent; as a result, I'm giving up the attempt as a bad idea and just deferring auto deduction if the initializer is type-dependent. ... This has caused http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00085.html Did you mean to put a bugzilla link here? Jason
Re: [patch committed testsuite] Tweak gcc.dg/stack-usage-1.c on SH
This is wrong, please remove the dg-options line and do like the other targets.

I'll revert that line and use my patch in the trail #11 of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53621.

I've applied the patch below. I'll backport it to release branches.

Regards,
kaz
--
2012-06-19  Kaz Kojima  <kkoj...@gcc.gnu.org>

	* gcc.dg/stack-usage-1.c: Remove dg-options line for sh targets
	and add __sh__ case.

--- ORIG/trunk/gcc/testsuite/gcc.dg/stack-usage-1.c	2012-06-20 10:01:51.0 +0900
+++ trunk/gcc/testsuite/gcc.dg/stack-usage-1.c	2012-06-20 16:28:31.0 +0900
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-fstack-usage" } */
-/* { dg-options "-fstack-usage -fomit-frame-pointer" { target { sh*-*-* } } } */
 
 /* This is aimed at testing basic support for -fstack-usage in the back-ends.
    See the SPARC back-end for example (grep flag_stack_usage_info in sparc.c).
@@ -61,6 +60,8 @@
 #  define SIZE (256 - __EPIPHANY_STACK_OFFSET__)
 #elif defined (__RL78__)
 #  define SIZE 254
+#elif defined (__sh__)
+#  define SIZE 252
 #else
 #  define SIZE 256
 #endif
Re: [patch committed testsuite] Tweak gcc.dg/stack-usage-1.c on SH
I've applied the patch below. I'll backport it to release branches.

Thanks!

-- 
Eric Botcazou
Re: [PATCH] C++11, grammar fix for late-specified return types and virt-specifiers
On 20 June 2012 10:35, Jason Merrill ja...@redhat.com wrote:

Applied, thanks. Note that your dg-error regexp doesn't make much sense:
// { dg-error "expected type-specifier before 'final'||expected ';'||declaration doesn't declare anything" }
Regular expression "or" uses a single |, so this ends up being a long way of writing
// { dg-error "" }

Funny. The testcase-writing GCC wiki page at http://gcc.gnu.org/wiki/TestCaseWriting suggests a double pipe. Quoth the Raven: "Should a line produce two errors, the regular expression should include an || (ie. a regular expression OR) between the possible message fragments." If a single pipe is indeed to be used, perhaps we want to correct that piece of documentation, lest fools follow its advice. :)
Re: [PING ARM Patches] PR53447: optimizations of 64bit ALU operation with constant
Hi Michael,

It seems the wiki page describes 64bit operations on NEON only. My patches improve 64bit operations on core registers only. I touched the neon patterns simply because those DI mode operations are enabled separately according to the TARGET_NEON value, so in the neon patterns I duplicated the alternatives of the normal cases.

thanks
Carrot

On Wed, Jun 20, 2012 at 9:58 AM, Michael Hope michael.h...@linaro.org wrote:
On 18 June 2012 22:17, Carrot Wei car...@google.com wrote:

Hi, could ARM maintainers review the following patches?

http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00497.html 64bit add/sub constants.
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01834.html 64bit and with constants.
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01974.html 64bit xor with constants.
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00287.html 64bit ior with constants.

Hi Carrot. Out of interest, how do these interact with the 64 bit in NEON patches that Andrew has been doing? They seem to touch many of the same patterns, and I'm concerned that they'd cause GCC to prefer core registers instead of NEON, especially as the constant values you can use in a vmov are limited. There's an (in progress) summary of the current state for the standard C operators here: https://wiki.linaro.org/MichaelHope/Sandbox/64BitOperations

-- Michael
Re: [PATCH] Fix PR53708
On Tue, 19 Jun 2012, Iain Sandoe wrote:
On 19 Jun 2012, at 22:41, Mike Stump wrote:
On Jun 19, 2012, at 12:22 PM, Iain Sandoe i...@codesourcery.com wrote:
On 19 Jun 2012, at 13:53, Dominique Dhumieres wrote:
On Tue, 19 Jun 2012, Richard Guenther wrote:
Richard Guenther rguent...@suse.de writes:

We are too eager to bump alignment of some decls when vectorizing. The fix is to not bump alignment of decls the user explicitly aligned or that are used in an unknown way.

I thought attribute((__aligned__)) only set a minimum alignment for variables? Most uses I've seen have been trying to get better performance from higher alignment, so it might not go down well if the attribute stopped the vectoriser from increasing the alignment still further.

That's what the documentation says indeed. I'm not sure which part of the patch fixes the ObjC failures where the alignment is part of the ABI (and I suppose ObjC then mis-uses the aligned attribute?).

A quick test shows that if (DECL_PRESERVE_P (decl)) alone is enough to fix the objc failures, while they are still there if one uses only if (DECL_USER_ALIGN (decl)).

That makes sense. I had a quick look at the ObjC code, and it appears that the explicit ALIGNs were never committed to trunk. Thus, the question becomes: what should ObjC (or any other) FE do to ensure that specific ABI (upper) alignment constraints are met?

Hum, upper is easy... I thought the issue was that extra alignment would kill it? I know that extra alignment does kill some of the objc metadata.

Clearly, ambiguous phrasing on my part. I mean when we want to say "no more than this much".

I think the only way would be to lay out things inside a structure. Otherwise, if extra alignment can break things, cannot re-ordering of symbols break them, too? Or can you elaborate on how extra alignment breaks stuff here?

Thanks,
Richard.
Re: C++ PATCH for c++/53484 (wrong auto in template)
Did you mean to put a bugzilla link here?

Yes ;-( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53565 (copy and paste from the wrong window).

Dominique
Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
On Tue, Jun 19, 2012 at 11:36 PM, Mikael Pettersson mi...@it.uu.se wrote:
Richard Guenther writes:
On Fri, Jun 15, 2012 at 5:00 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
Richard Guenther wrote:
On Fri, Jun 15, 2012 at 3:13 PM, Ulrich Weigand uweig...@de.ibm.com wrote:

However, there is a second case where we need to check every pass: if we're not actually vectorizing any loop, but are performing basic-block SLP. In this case, it would appear that we need the same check as described in the comment above, i.e. to verify that the stride is a multiple of the vector size. The patch below adds this check, and this indeed fixes the invalid access I was seeing in the test case (in the final assembler, we now get a vld1.16 instead of vldr). Tested on arm-linux-gnueabi with no regressions. OK for mainline?

Ok.

Thanks for the quick review; I've checked this in to mainline now. I just noticed that the test case also crashes on 4.7, but not on 4.6. Would a backport to 4.7 also be OK, once testing passes?

Yes. Please leave it on mainline a few days to catch fallout from autotesters.

This patch caused

FAIL: gcc.dg/vect/bb-slp-16.c scan-tree-dump-times slp "basic block vectorized using SLP" 1

on sparc64-linux. Comparing the pre and post patch dumps for that file shows:

 22: vect_compute_data_ref_alignment:
 22: misalign = 4 bytes of ref MEM[(unsigned int *)pout_90 + 28B]
 22: vect_compute_data_ref_alignment:
-22: force alignment of arr[i_87]
-22: misalign = 0 bytes of ref arr[i_87]
+22: SLP: step doesn't divide the vector-size.
+22: Unknown alignment for access: arr

(lots of stuff that's simply gone)

-22: BASIC BLOCK VECTORIZED
-
-22: basic block vectorized using SLP
+22: not vectorized: unsupported unaligned store.arr[i_87]
+22: not vectorized: unsupported alignment in basic block.

In this testcase the alignment of arr[i] should be irrelevant - it is not part of the stmts that are going to be vectorized.
But of course this may be simply an ordering issue in how we analyze data-references / statements in basic-block vectorization (thus we possibly did not yet declare the arr[i] = i statement as not taking part in the vectorization). The line

-22: force alignment of arr[i_87]

is odd, too - as said, we do not need to touch arr when vectorizing the basic-block. Ulrich, can you look into this, or do you want me to take a look here? Mikael - please open a bugreport for this.

Thanks,
Richard.

/Mikael
Re: [Patch] Adjustments for Windows x64 SEH
On Jun 19, 2012, at 6:47 PM, Richard Henderson wrote:
On 2012-06-18 05:22, Tristan Gingold wrote:

+  /* Win64 SEH, very large frames need a frame-pointer as maximum stack
+     allocation is 4GB (add a safety guard for saved registers).  */
+  if (TARGET_64BIT_MS_ABI && get_frame_size () + 4096 > SEH_MAX_FRAME_SIZE)
+    return true;

Elsewhere you say this is an upper bound for stack use by the prologue. It's clearly a wild guess. The maximum stack use is 10*sse + 8*int registers saved, which is a lot less than 4096. That said, I'm ok with *using* 4096 so long that the comment clearly states that it's a large over-estimate. I do suggest, however, folding this into the SEH_MAX_FRAME_SIZE value, and expanding on the comment there. I see no practical difference between 0x8000 and 0x7fffe000 being the limit.

Here is the new comment. I have reduced the estimation to 256.

/* According to Windows x64 software convention, the maximum stack
   allocatable in the prologue is 4G - 8 bytes.  Furthermore, there is
   a limited set of instructions allowed to adjust the stack pointer in
   the epilog, forcing the use of frame pointer for frames larger than
   2 GB.  This theoretical limit is reduced by 256, an over-estimated
   upper bound for the stack use by the prologue.
   We define only one threshold for both the prolog and the epilog.  When
   the frame size is larger than this threshold, we allocate the area to
   save SSE regs, then save them, and then allocate the remaining.  There
   is no SEH unwind info for this later allocation.  */
#define SEH_MAX_FRAME_SIZE ((2U << 30) - 256)

+/* Output assembly code to get the establisher frame (Windows x64 only).
+   This corresponds to what will be computed by Windows from Frame Register
+   and Frame Register Offset fields of the UNWIND_INFO structure.  Since
+   these values are computed very late (by ix86_expand_prologue), we cannot
+   express this using only RTL.
*/
+
+const char *
+ix86_output_establisher_frame (rtx target)
+{
+  if (!frame_pointer_needed)
+    {
+      /* Note that we have advertized an lea operation.  */
+      output_asm_insn ("lea{q}\t{0(%%rsp), %0|%0, 0[rsp]}", &target);
+    }
+  else
+    {
+      rtx xops[3];
+      struct ix86_frame frame;
+
+      /* Recompute the frame layout here.  */
+      ix86_compute_frame_layout (&frame);
+
+      /* Closely follow how the frame pointer is set in
+         ix86_expand_prologue.  */
+      xops[0] = target;
+      xops[1] = hard_frame_pointer_rtx;
+      if (frame.hard_frame_pointer_offset == frame.reg_save_offset)
+        xops[2] = GEN_INT (0);
+      else
+        xops[2] = GEN_INT (-(frame.stack_pointer_offset
+                             - frame.hard_frame_pointer_offset));
+      output_asm_insn ("lea{q}\t{%a2(%1), %0|%0, %a2[%1]}", xops);

This is what register elimination is for; the value substitution happens during reload. Now, one *could* add a new pseudo-hard-register for this (we support as many register eliminations as needed), but before we do that we need to decide if we can adjust the soft frame pointer to be the value required. If so, you can then rely on the existing __builtin_frame_address. Which is a very attractive sounding solution.

I'm 99% sure moving the sfp will work. Thank you for this idea. I am trying to implement it.

Tristan.
[patch][RFC] Move the C front end to gcc/c/
Hello, Attached is a concept patch to move the C front end to its own sub-directory of the main gcc directory. Things like updates of sourcebuild.texi are not yet included. I'm posting this as an RFC: Does this look like the right approach? Have I overlooked other things than just documentation updates? I hope this would not cause too much trouble for branches like the cxx-conversion branch? Bootstrapped on x86_64-unknown-linux-gnu with c,objc,c++,obj-c++ enabled, FWIW. Thanks, Ciao! Steven move_C_fe.diff Description: Binary data
Re: RFA: PATCH to Makefile.def/tpl to add libgomp to make check-c++
On Wed, Jun 20, 2012 at 9:26 AM, Jason Merrill ja...@redhat.com wrote: The recent regression in libgomp leads me to want to add libgomp tests to the check-c++ target. OK for trunk? Ok. (what about libitm?) Thanks, Richard.
[PATCH] Adjust call stmt cost for tailcalls
Tailcalls have no argument setup cost and no return value cost. This patch adjusts estimate_num_insns to reflect that. Honza, does this look correct?

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Richard.

2012-06-20  Richard Guenther  <rguent...@suse.de>

	* tree-inline.c (estimate_num_insns): Estimate call cost for
	tailcalls properly.

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	(revision 188817)
+++ gcc/tree-inline.c	(working copy)
@@ -3611,12 +3611,15 @@ estimate_num_insns (gimple stmt, eni_wei
 	  }
 	cost = node ? weights->call_cost : weights->indirect_call_cost;
-	if (gimple_call_lhs (stmt))
-	  cost += estimate_move_cost (TREE_TYPE (gimple_call_lhs (stmt)));
-	for (i = 0; i < gimple_call_num_args (stmt); i++)
+	if (!gimple_call_tail_p (stmt))
 	  {
-	    tree arg = gimple_call_arg (stmt, i);
-	    cost += estimate_move_cost (TREE_TYPE (arg));
+	    if (gimple_call_lhs (stmt))
+	      cost += estimate_move_cost (TREE_TYPE (gimple_call_lhs (stmt)));
+	    for (i = 0; i < gimple_call_num_args (stmt); i++)
+	      {
+		tree arg = gimple_call_arg (stmt, i);
+		cost += estimate_move_cost (TREE_TYPE (arg));
+	      }
 	  }
 	break;
       }
Re: [PATCH] Fix PR53708
On Tue, 19 Jun 2012, Dominique Dhumieres wrote:
On Tue, 19 Jun 2012, Richard Guenther wrote:
Richard Guenther rguent...@suse.de writes:

We are too eager to bump alignment of some decls when vectorizing. The fix is to not bump alignment of decls the user explicitly aligned or that are used in an unknown way.

I thought attribute((__aligned__)) only set a minimum alignment for variables? Most uses I've seen have been trying to get better performance from higher alignment, so it might not go down well if the attribute stopped the vectoriser from increasing the alignment still further.

That's what the documentation says indeed. I'm not sure which part of the patch fixes the ObjC failures where the alignment is part of the ABI (and I suppose ObjC then mis-uses the aligned attribute?).

A quick test shows that if (DECL_PRESERVE_P (decl)) alone is enough to fix the objc failures, while they are still there if one uses only if (DECL_USER_ALIGN (decl)).

Thus, the following. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-06-20  Richard Guenther  <rguent...@suse.de>

	* tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Allow
	adjusting alignment of user-aligned decls again.

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	(revision 188817)
+++ gcc/tree-vect-data-refs.c	(working copy)
@@ -4731,10 +4720,9 @@ vect_can_force_dr_alignment_p (const_tre
   if (TREE_ASM_WRITTEN (decl))
     return false;
 
-  /* Do not override explicit alignment set by the user or the alignment
-     as specified by the ABI when the used attribute is set.  */
-  if (DECL_USER_ALIGN (decl)
-      || DECL_PRESERVE_P (decl))
+  /* Do not override the alignment as specified by the ABI when the used
+     attribute is set.  */
+  if (DECL_PRESERVE_P (decl))
    return false;
 
   if (TREE_STATIC (decl))
[patch][m32c] Remove unnecessary includes from m32c-pragma.c
Hello,

m32c-pragma.c doesn't need the includes that the patch below removes. Tested with a cross from powerpc64-unknown-linux-gnu to m32c-elf. Will commit as obvious unless someone objects.

Ciao!
Steven

	* config/m32c/m32c-pragma.c: Remove unnecessary includes.

Index: config/m32c/m32c-pragma.c
===================================================================
--- config/m32c/m32c-pragma.c	(revision 188820)
+++ config/m32c/m32c-pragma.c	(working copy)
@@ -27,13 +27,7 @@
 #include "c-family/c-common.h"
 #include "diagnostic-core.h"
 #include "cpplib.h"
-#include "hard-reg-set.h"
-#include "output.h"
 #include "m32c-protos.h"
-#include "function.h"
-#define MAX_RECOG_OPERANDS 10
-#include "reload.h"
-#include "target.h"
 
 /* Implements the GCC "memregs" pragma.  This pragma takes only an
    integer, and is semantically identical to the -memregs= command
Re: [Patch] PR 51938: extend ifcombine
On Sun, Jun 10, 2012 at 4:16 PM, Marc Glisse marc.gli...@inria.fr wrote:

Hello, currently tree-ssa-ifcombine handles pairs of nested ifs that share the same then branch, or the same else branch. There is no particular reason why it couldn't also handle the case where the then branch of one is the else branch of the other, which is what I do here. Any comments?

The general idea looks good, but I think the patch is too invasive. As far as I can see, the only callers with a non-zero 'inv' argument come from ifcombine_ifnotorif and ifcombine_ifnotandif (and both with inv == 2). I would rather see a more localized patch that makes use of invert_tree_comparison to perform the inversion on the call arguments of maybe_fold_and/or_comparisons. Is there any reason that would not work? At least

+  if (inv & 1)
+    lcompcode2 = COMPCODE_TRUE - lcompcode2;

looks as if it were not semantically correct - you cannot simply invert floating-point comparisons (see the restrictions invert_tree_comparison has).

Thanks,
Richard.

2012-06-10  Marc Glisse  <marc.gli...@inria.fr>

gcc/
	PR tree-optimization/51938
	* fold-const.c (combine_comparisons): Extra argument.  Handle
	inverted conditions.
	(fold_truth_andor_1): Update call to combine_comparisons.
	* gimple-fold.c (swap12): New function.
	(and_comparisons_1): Extra argument.  Handle inverted conditions.
	(and_var_with_comparison_1): Update call to and_comparisons_1.
	(maybe_fold_and_comparisons): Extra argument.  Update call to
	and_comparisons_1.
	(or_comparisons_1): Extra argument.  Handle inverted conditions.
	(or_var_with_comparison_1): Update call to or_comparisons_1.
	(maybe_fold_or_comparisons): Extra argument.  Update call to
	or_comparisons_1.
	* tree-ssa-ifcombine.c (ifcombine_ifnotandif): New function.
	(ifcombine_ifnotorif): New function.
	(tree_ssa_ifcombine_bb): Call them.
	(ifcombine_iforif): Update call to maybe_fold_or_comparisons.
	(ifcombine_ifandif): Update call to maybe_fold_and_comparisons.
	* tree-ssa-reassoc.c (eliminate_redundant_comparison): Update calls
	to maybe_fold_or_comparisons and maybe_fold_and_comparisons.
	* tree-if-conv.c (fold_or_predicates): Update call to
	maybe_fold_or_comparisons.
	* gimple.h (maybe_fold_and_comparisons): Match gimple-fold.c
	prototype.
	(maybe_fold_or_comparisons): Likewise.
	* tree.h (combine_comparisons): Match fold-const.c prototype.

gcc/testsuite/
	PR tree-optimization/51938
	* gcc.dg/tree-ssa/ssa-ifcombine-8.c: New testcase.
	* gcc.dg/tree-ssa/ssa-ifcombine-9.c: New testcase.

-- 
Marc Glisse
Re: [patch][ARM] Do not include output.h in arm-c.c
On 19/06/12 23:44, Steven Bosscher wrote: Hello, Only a few front-end files to go that need output.h, and some of them are in the c_target_objs: arm, mep, m32c, and rl78. This patch tackles the ARM case. arm-c.c needs output.h because EMIT_EABI_ATTRIBUTE wants to print to asm_out_file. Solved by replacing EMIT_EABI_ATTRIBUTE with a function arm.c:arm_emit_eabi_attribute. Tested by building a cross-compiler from powerpc64-unknown-linux-gnu X arm-eabi, and comparing assembly on a set of files. OK for trunk? Ciao! Steven= arm_C_no_output_h.diff OK. R.
[patch] Implement -fcallgraph-info option
Hi,

this is a repost of http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02468.html earlier in the development cycle, so with hopefully more time for discussion.

The command line option -fcallgraph-info is added and makes the compiler generate another output file (xxx.ci) for each compilation unit, which is a valid VCG file (you can launch your favorite VCG viewer on it unmodified) and contains the "final" callgraph of the unit. "final" is a bit of a misnomer, as this is actually the callgraph at RTL expansion time, but since most high-level optimizations are done at the Tree level and RTL doesn't usually fiddle with calls, it's final in almost all cases.

Moreover, the nodes can be decorated with additional info: -fcallgraph-info=su adds stack usage info and -fcallgraph-info=da dynamic allocation info. This is useful for embedded applications with stringent requirements in terms of memory usage, for example. You can provide the .ci files (or an aggregated version) for pre-compiled libraries. This is again strictly orthogonal to code and debug info generation.

There are a few non-obvious changes to libfuncs.h, builtins.c, expr.c and optabs.c to deal with quirks of the RTL expander, but this mostly removes dead code.

This version takes into account Joseph's comments on the first submission. As for Richard's comments, the implementation is low-level by design because we want to be able to trust it, and the IPA callgraph isn't suitable for this, as the RTL expander can introduce function calls that need to be accounted for.

Tested on x86_64-suse-linux.

2012-06-20  Eric Botcazou  <ebotca...@adacore.com>

	Callgraph info support
	* common.opt (-fcallgraph-info[=]): New option.
	* doc/invoke.texi (Debugging options): Document it.
	* opts.c (common_handle_option): Handle it.
	* flag-types.h (enum callgraph_info_type): New type.
	* builtins.c (set_builtin_user_assembler_name): Do not initialize
	memcpy_libfunc and memset_libfunc.
	* calls.c (expand_call): If -fcallgraph-info, record the call.
	(emit_library_call_value_1): Likewise.
	* cgraph.h (struct cgraph_final_info): New structure.
	(struct cgraph_dynamic_alloc): Likewise.
	(cgraph_final_edge): Likewise.
	(cgraph_node): Add 'final' field.
	(dump_cgraph_final_vcg): Declare.
	(cgraph_final_record_call): Likewise.
	(cgraph_final_record_dynamic_alloc): Likewise.
	(cgraph_final_info): Likewise.
	* cgraph.c: Include expr.h and output.h.
	(cgraph_create_empty_node): Initialize 'final' field.
	(final_create_edge): New static function.
	(cgraph_final_record_call): New global function.
	(cgraph_final_record_dynamic_alloc): Likewise.
	(cgraph_final_info): Likewise.
	(dump_cgraph_final_indirect_call_node_vcg): New static function.
	(dump_cgraph_final_edge_vcg): Likewise.
	(dump_cgraph_final_node_vcg): Likewise.
	(external_node_needed_p): Likewise.
	(dump_cgraph_final_vcg): New global function.
	* expr.c (emit_block_move_via_libcall): Set input_location on the call.
	(set_storage_via_libcall): Likewise.
	(block_move_fn): Make global.  Do not include gt-expr.h.
	* expr.h (block_move_fn): Declare.
	* gimplify.c (gimplify_decl_expr): Record dynamically-allocated object
	by calling cgraph_final_record_dynamic_alloc if -fcallgraph-info=da.
	* libfuncs.h (enum libfunc_index): Delete LTI_memcpy and LTI_memset.
	(memcpy_libfunc): Delete.
	(memset_libfunc): Likewise.
	* optabs.c (init_one_libfunc): Do not zap the SYMBOL_REF_DECL.
	(init_optabs): Do not initialize memcpy_libfunc and memset_libfunc.
	* print-tree.c (print_decl_identifier): New function.
	* output.h (enum stack_usage_kind_type): New type.
	(stack_usage_qual): Declare.
	* toplev.c (callgraph_info_file): New global variable.
	(stack_usage_qual): Likewise.
	(output_stack_usage): If -fcallgraph-info=su, set stack_usage_kind
	and stack_usage of associated callgraph node.  If -fstack-usage, use
	print_decl_identifier for pretty-printing.
	(lang_dependent_init): Open file if -fcallgraph-info.
	(finalize): If callgraph_info_file is not null, invoke dump_cgraph_vcg
	and close file.
	* tree.h (print_decl_identifier): Declare.
	(PRINT_DECL_ORIGIN, PRINT_DECL_NAME, PRINT_DECL_UNIQUE_NAME): New.
	* Makefile.in (expr.o): Remove gt-expr.h.
	(cgraph.o): Add $(EXPR_H) and output.h.
	* config/picochip/picochip.c: Adjust comment.

-- 
Eric Botcazou

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 188651)
+++ doc/invoke.texi	(working copy)
@@ -328,7 +328,7 @@ Objective-C and Objective-C++ Dialects}.
 -feliminate-unused-debug-symbols -femit-class-debug-always @gol
Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs
On Wed, Jun 20, 2012 at 12:52:12AM -0300, Alexandre Oliva wrote:
On Jun 16, 2012, H.J. Lu hjl.to...@gmail.com wrote:

from  Alexandre Oliva  <aol...@redhat.com>

	PR debug/53671
	PR debug/49888
	* alias.c (memrefs_conflict_p): Improve handling of AND for
	alignment.

from  Alexandre Oliva  <aol...@redhat.com>

	PR debug/53671
	PR debug/49888
	* var-tracking.c (vt_initialize): Record initial offset between
	arg pointer and stack pointer.

from  Alexandre Oliva  <aol...@redhat.com>

	PR debug/53671
	PR debug/49888
	* var-tracking.c (vt_get_canonicalize_base): New.
	(vt_canonicalize_addr, vt_stack_offset_p): New.
	(vt_canon_true_dep): New.
	(drop_overlapping_mem_locs): Use vt_canon_true_dep.
	(clobber_overlaping_mems): Use vt_canonicalize_addr.

from  Alexandre Oliva  <aol...@redhat.com>

	PR debug/53671
	PR debug/49888
	* var-tracking.c (vt_init_cfa_base): Drop redundant recording of
	CFA base.

Ok, thanks.

	Jakub
Re: [PR debug/53682] avoid crash in cselib promote_debug_loc
On Wed, Jun 20, 2012 at 12:39:29AM -0300, Alexandre Oliva wrote:

When promote_debug_loc was first introduced, it would never be called with a NULL loc list. However, because of the strategy of temporarily resetting loc lists before recursion introduced a few months ago in alias.c, the earlier assumption no longer holds. This patch adjusts promote_debug_loc to deal with this case.

The thing I'm worried about is what will happen with -g0 in that case. If the loc list is temporarily reset, it will be restored again; won't that mean that for -g0 we'll then have a loc that is, in the corresponding -g compilation, referenced by DEBUG_INSNs only (and thus non-promoted)?

for gcc/ChangeLog
from  Alexandre Oliva  <aol...@redhat.com>

	PR debug/53682
	* cselib.c (promote_debug_loc): Don't crash on NULL argument.

Index: gcc/cselib.c
===================================================================
--- gcc/cselib.c.orig	2012-06-17 22:52:27.740087279 -0300
+++ gcc/cselib.c	2012-06-18 08:55:32.948832112 -0300
@@ -322,7 +322,7 @@ new_elt_loc_list (cselib_val *val, rtx l
 static inline void
 promote_debug_loc (struct elt_loc_list *l)
 {
-  if (l->setting_insn && DEBUG_INSN_P (l->setting_insn)
+  if (l && l->setting_insn && DEBUG_INSN_P (l->setting_insn)
       && (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn)))
     {
       n_debug_values--;

	Jakub
Re: [patch] Implement -fcallgraph-info option
On Wed, Jun 20, 2012 at 12:40 PM, Steven Bosscher stevenb@gmail.com wrote: On Wed, Jun 20, 2012 at 12:30 PM, Eric Botcazou ebotca...@adacore.com wrote: * cgraph.c: Include expr.h and output.h. What for? Never mind, I see why you need this.
[patch] Poison removed target macros ASM_OUTPUT_IDENT and IDENT_ASM_OP
Hello, These macros are now gone and should be poisoned. I'll commit this later today unless someone objects. Ciao! Steven * system.h: Poison ASM_OUTPUT_IDENT and IDENT_ASM_OP. Index: system.h === --- system.h(revision 188825) +++ system.h(working copy) @@ -815,7 +815,7 @@ LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP JUMP_ALIGN_MAX_SKIP \ CAN_DEBUG_WITHOUT_FP UNLIKELY_EXECUTED_TEXT_SECTION_NAME\ HOT_TEXT_SECTION_NAME LEGITIMATE_CONSTANT_P ALWAYS_STRIP_DOTDOT \ - OUTPUT_ADDR_CONST_EXTRA SMALL_REGISTER_CLASSES + OUTPUT_ADDR_CONST_EXTRA SMALL_REGISTER_CLASSES ASM_OUTPUT_IDENT /* Target macros only used for code built for the target, that have moved to libgcc-tm.h or have never been present elsewhere. */ @@ -887,7 +887,8 @@ SETJMP_VIA_SAVE_AREA FORBIDDEN_INC_DEC_CLASSES \ PREFERRED_OUTPUT_RELOAD_CLASS SYSTEM_INCLUDE_DIR \ STANDARD_INCLUDE_DIR STANDARD_INCLUDE_COMPONENT\ - LINK_ELIMINATE_DUPLICATE_LDIRECTORIES MIPS_DEBUGGING_INFO + LINK_ELIMINATE_DUPLICATE_LDIRECTORIES MIPS_DEBUGGING_INFO \ + IDENT_ASM_OP /* Hooks that are no longer used. */ #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE \
[PATCH, i386]: Some more int iterator macroizations
Hello! 2012-06-20 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (rounding_insn): New int attribute. (rounding_insnxf2): Macroize insn from {floor,ceil,btrunc}xf2 using FRNDINT_ROUNDING int iterator. (lrounding_insnxfmode2): Rename from lroundingxfmode2. 2012-06-20 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (IEEE_MAXMIN): New int iterator. (ieee_maxmin): New int attribute. (*ieee_sieee_maxminmode3): Macroize insn from *ieee_s{max,min}mode3 using IEEE_MAXMIN mode iterator. Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 188808) +++ i386.md (working copy) @@ -15108,6 +15108,14 @@ [UNSPEC_FIST_FLOOR UNSPEC_FIST_CEIL]) +;; Base name for define_insn +(define_int_attr rounding_insn + [(UNSPEC_FRNDINT_FLOOR floor) +(UNSPEC_FRNDINT_CEIL ceil) +(UNSPEC_FRNDINT_TRUNC btrunc) +(UNSPEC_FIST_FLOOR floor) +(UNSPEC_FIST_CEIL ceil)]) + (define_int_attr rounding [(UNSPEC_FRNDINT_FLOOR floor) (UNSPEC_FRNDINT_CEIL ceil) @@ -15161,17 +15169,14 @@ (set_attr i387_cw rounding) (set_attr mode XF)]) -(define_expand floorxf2 - [(use (match_operand:XF 0 register_operand)) - (use (match_operand:XF 1 register_operand))] +(define_expand rounding_insnxf2 + [(parallel [(set (match_operand:XF 0 register_operand) + (unspec:XF [(match_operand:XF 1 register_operand)] + FRNDINT_ROUNDING)) + (clobber (reg:CC FLAGS_REG))])] TARGET_USE_FANCY_MATH_387 -flag_unsafe_math_optimizations -{ - if (optimize_insn_for_size_p ()) -FAIL; - emit_insn (gen_frndintxf2_floor (operands[0], operands[1])); - DONE; -}) +flag_unsafe_math_optimizations +!optimize_insn_for_size_p ()) (define_expand floormode2 [(use (match_operand:MODEF 0 register_operand)) @@ -15213,18 +15218,6 @@ DONE; }) -(define_expand ceilxf2 - [(use (match_operand:XF 0 register_operand)) - (use (match_operand:XF 1 register_operand))] - TARGET_USE_FANCY_MATH_387 -flag_unsafe_math_optimizations -{ - if (optimize_insn_for_size_p ()) -FAIL; - emit_insn 
(gen_frndintxf2_ceil (operands[0], operands[1])); - DONE; -}) - (define_expand ceilmode2 [(use (match_operand:MODEF 0 register_operand)) (use (match_operand:MODEF 1 register_operand))] @@ -15265,18 +15258,6 @@ DONE; }) -(define_expand btruncxf2 - [(use (match_operand:XF 0 register_operand)) - (use (match_operand:XF 1 register_operand))] - TARGET_USE_FANCY_MATH_387 -flag_unsafe_math_optimizations -{ - if (optimize_insn_for_size_p ()) -FAIL; - emit_insn (gen_frndintxf2_trunc (operands[0], operands[1])); - DONE; -}) - (define_expand btruncmode2 [(use (match_operand:MODEF 0 register_operand)) (use (match_operand:MODEF 1 register_operand))] @@ -15357,14 +15338,12 @@ (set_attr mode XF)]) (define_expand nearbyintxf2 - [(use (match_operand:XF 0 register_operand)) - (use (match_operand:XF 1 register_operand))] + [(parallel [(set (match_operand:XF 0 register_operand) + (unspec:XF [(match_operand:XF 1 register_operand)] + UNSPEC_FRNDINT_MASK_PM)) + (clobber (reg:CC FLAGS_REG))])] TARGET_USE_FANCY_MATH_387 -flag_unsafe_math_optimizations -{ - emit_insn (gen_frndintxf2_mask_pm (operands[0], operands[1])); - DONE; -}) +flag_unsafe_math_optimizations) (define_expand nearbyintmode2 [(use (match_operand:MODEF 0 register_operand)) @@ -15531,7 +15510,7 @@ (use (match_dup 2)) (use (match_dup 3))])]) -(define_expand lroundingxfmode2 +(define_expand lrounding_insnxfmode2 [(parallel [(set (match_operand:SWI248x 0 nonimmediate_operand) (unspec:SWI248x [(match_operand:XF 1 register_operand)] FIST_ROUNDING)) @@ -16616,31 +16595,24 @@ ;; Their operands are not commutative, and thus they may be used in the ;; presence of -0.0 and NaN. 
-(define_insn *ieee_sminmode3 - [(set (match_operand:MODEF 0 register_operand =x,x) - (unspec:MODEF - [(match_operand:MODEF 1 register_operand 0,x) - (match_operand:MODEF 2 nonimmediate_operand xm,xm)] -UNSPEC_IEEE_MIN))] - SSE_FLOAT_MODE_P (MODEmode) TARGET_SSE_MATH - @ - minssemodesuffix\t{%2, %0|%0, %2} - vminssemodesuffix\t{%2, %1, %0|%0, %1, %2} - [(set_attr isa noavx,avx) - (set_attr prefix orig,vex) - (set_attr type sseadd) - (set_attr mode MODE)]) +(define_int_iterator IEEE_MAXMIN + [UNSPEC_IEEE_MAX +UNSPEC_IEEE_MIN]) -(define_insn *ieee_smaxmode3 +(define_int_attr ieee_maxmin + [(UNSPEC_IEEE_MAX max) +(UNSPEC_IEEE_MIN min)]) + +(define_insn *ieee_sieee_maxminmode3 [(set (match_operand:MODEF 0 register_operand
Re: [Patch ping] Strength reduction
On Thu, Jun 14, 2012 at 3:21 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Pro forma ping. :) ;) I notice (with all of these functions)

+unsigned
+negate_cost (enum machine_mode mode, bool speed)
+{
+  static unsigned costs[NUM_MACHINE_MODES];
+  rtx seq;
+  unsigned cost;
+
+  if (costs[mode])
+    return costs[mode];
+
+  start_sequence ();
+  force_operand (gen_rtx_fmt_e (NEG, mode,
+				gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
+		 NULL_RTX);
+  seq = get_insns ();
+  end_sequence ();
+
+  cost = seq_cost (seq, speed);
+  if (!cost)
+    cost = 1;

that the costs[] array is independent of the speed argument. Thus whatever comes first determines the cost. Odd, and probably not good. A fix would be appreciated (even for the current code ...) - simply make the array costs[NUM_MACHINE_MODES][2]. As for the renaming - can you name the functions consistently? Thus the above would be negate_reg_cost? And maybe rename the other FIXME function, too?

Index: gcc/tree-ssa-strength-reduction.c
===
--- gcc/tree-ssa-strength-reduction.c	(revision 0)
+++ gcc/tree-ssa-strength-reduction.c	(revision 0)
@@ -0,0 +1,1611 @@
+/* Straight-line strength reduction.
+   Copyright (C) 2012 Free Software Foundation, Inc.

I know we have these 'tree-ssa-' names, but really this is gimple-ssa now ;) So, please name it gimple-ssa-strength-reduction.c.

+  /* Access to the statement for subsequent modification.  Cached to
+     save compile time.  */
+  gimple_stmt_iterator cand_gsi;

this is an iterator for cand_stmt? Then caching it is no longer necessary as the iterator is the stmt itself after recent infrastructure changes.

+/* Hash table embodying a mapping from statements to candidates.  */
+static htab_t stmt_cand_map;
...
+static hashval_t
+stmt_cand_hash (const void *p)
+{
+  return htab_hash_pointer (((const_slsr_cand_t) p)->cand_stmt);
+}

use a pointer-map instead.

+/* Callback to produce a hash value for a candidate chain header.
*/
+
+static hashval_t
+base_cand_hash (const void *p)
+{
+  tree ssa_name = ((const_cand_chain_t) p)->base_name;
+
+  if (TREE_CODE (ssa_name) != SSA_NAME)
+    return (hashval_t) 0;
+
+  return (hashval_t) SSA_NAME_VERSION (ssa_name);
+}

does it ever happen that ssa_name is not an SSA_NAME? I'm not sure the memory savings over simply using a fixed-size (num_ssa_names) array indexed by SSA_NAME_VERSION pointing to the chain is worth using a hashtable for this?

+  node = (cand_chain_t) pool_alloc (chain_pool);
+  node->base_name = c->base_name;

If you never free pool entries it's more efficient to use an obstack. alloc-pool only pays off if you get freed item re-use.

+  switch (gimple_assign_rhs_code (gs))
+    {
+    case MULT_EXPR:
+      rhs2 = gimple_assign_rhs2 (gs);
+
+      if (TREE_CODE (rhs2) == INTEGER_CST)
+	return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);
+
+      if (TREE_CODE (rhs1) == INTEGER_CST)
+	return multiply_by_cost (TREE_INT_CST_LOW (rhs1), lhs_mode, speed);

In theory all commutative statements should have constant operands only at rhs2 ... Also you do not verify that the constant fits in a host-wide-int - but maybe you do not care? Thus, I'd do

  if (host_integerp (rhs2, 0))
    return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed);

or make multiply_by[_const?]_cost take a double-int instead. Likewise below for add.

+    case MODIFY_EXPR:
+      /* Be suspicious of assigning costs to copies that may well go away.  */
+      return 0;

MODIFY_EXPR is never a gimple_assign_rhs_code. Simple copies have a code of SSA_NAME for example. But as you assert if you get to an unhandled code I wonder why you needed the above ...

+static slsr_cand_t
+base_cand_from_table (tree base_in)
+{
+  slsr_cand mapping_key;
+
+  gimple def = SSA_NAME_DEF_STMT (base_in);
+  if (!def)
+    return (slsr_cand_t) NULL;
+
+  mapping_key.cand_stmt = def;
+  return (slsr_cand_t) htab_find (stmt_cand_map, &mapping_key);

isn't that reachable via the base-name -> chain mapping for base_in?
+  if (TREE_CODE (rhs2) == SSA_NAME && operand_equal_p (rhs1, rhs2, 0))
+    return;

SSA_NAMEs can be compared by pointer equality, thus the above is equivalent to

  if (TREE_CODE (rhs2) == SSA_NAME && rhs1 == rhs2)

or even just

  if (rhs1 == rhs2)

applies elsewhere as well.

+/* Return TRUE iff PRODUCT is an integral multiple of FACTOR, and return
+   the multiple in *MULTIPLE.  Otherwise return FALSE and leave *MULTIPLE
+   unchanged.  */
+/* ??? - Should this be moved to double-int.c?  */

I think so.

+static bool
+double_int_multiple_of (double_int product, double_int factor,
+			bool unsigned_p, double_int *multiple)
+{
+  double_int remainder;
+  double_int quotient = double_int_divmod (product, factor, unsigned_p,
+					   TRUNC_DIV_EXPR, &remainder);
+  if
Re: [patch] Implement -fcallgraph-info option
On Wed, Jun 20, 2012 at 12:30 PM, Eric Botcazou ebotca...@adacore.com wrote: Hi, this is a repost of http://gcc.gnu.org/ml/gcc-patches/2010-10/msg02468.html earlier in the development cycle, so with hopefully more time for discussion. The command line option -fcallgraph-info is added and makes the compiler generate another output file (xxx.ci) for each compilation unit, which is a valid VCG file (you can launch your favorite VCG viewer on it unmodified) and contains the final callgraph of the unit. final is a bit of a misnomer as this is actually the callgraph at RTL expansion time, but since most high-level optimizations are done at the Tree level and RTL doesn't usually fiddle with calls, it's final in almost all cases. Moreover, the nodes can be decorated with additional info: -fcallgraph-info=su adds stack usage info and -fcallgraph-info=da dynamic allocation info. This is useful for embedded applications with stringent requirements in terms of memory usage for example. You can provide the .ci files (or an aggregated version) for pre-compiled libraries. This is again strictly orthogonal to code and debug info generation. There are a few non-obvious changes to libfuncs.h, builtins.c, expr.c and optabs.c to deal with quirks of the RTL expander, but this mostly removes dead code. This version takes into account Joseph's comments for the first submission. As for Richard's comments, the implementation is low-level by design because we want to be able to trust it and the IPA callgraph isn't suitable for this, as the RTL expander can introduce function calls that need to be accounted for. Hmm. I wonder why we cannot do the following (some of this might be already the case): 1) preserve cgraph nodes of functions we expanded 2) at expand time, remove call edges from the cgraph node of the currently expanding function (they are not kept up-to-date anyway) 3) add cgraph edges to the regular cgraph when expand_call expands a call (what about indirect calls? 
you seem to ignore those ...) 4) for dynamic allocs simply record an edge to a callgraph node for alloca Thus, go away with the notion of a final cgraph. And make it possible to specify the dump should be emitted at other useful points of the compilation (LTO WPA phase comes to my mind, similar the callgraph as constructed by the frontend). Thus, make the whole thing a little less special case and more useful in general? (yeah, I still detest VCG and prefer DOT ... maybe we can think of a simple abstraction layer that would allow switching the output format ...) Thanks, Richard. Tested on x86_64-suse-linux. 2012-06-20 Eric Botcazou ebotca...@adacore.com Callgraph info support * common.opt (-fcallgraph-info[=]): New option. * doc/invoke.texi (Debugging options): Document it. * opts.c (common_handle_option): Handle it. * flag-types.h (enum callgraph_info_type): New type. * builtins.c (set_builtin_user_assembler_name): Do not initialize memcpy_libfunc and memset_libfunc. * calls.c (expand_call): If -fcallgraph-info, record the call. (emit_library_call_value_1): Likewise. * cgraph.h (struct cgraph_final_info): New structure. (struct cgraph_dynamic_alloc): Likewise. (cgraph_final_edge): Likewise. (cgraph_node): Add 'final' field. (dump_cgraph_final_vcg): Declare. (cgraph_final_record_call): Likewise. (cgraph_final_record_dynamic_alloc): Likewise. (cgraph_final_info): Likewise. * cgraph.c: Include expr.h and output.h. (cgraph_create_empty_node): Initialize 'final' field. (final_create_edge): New static function. (cgraph_final_record_call): New global function. (cgraph_final_record_dynamic_alloc): Likewise. (cgraph_final_info): Likewise. (dump_cgraph_final_indirect_call_node_vcg): New static function. (dump_cgraph_final_edge_vcg): Likewise. (dump_cgraph_final_node_vcg): Likewise. (external_node_needed_p): Likewise. (dump_cgraph_final_vcg): New global function. * expr.c (emit_block_move_via_libcall): Set input_location on the call. (set_storage_via_libcall): Likewise. 
(block_move_fn): Make global. Do not include gt-expr.h. * expr.h (block_move_fn): Declare. * gimplify.c (gimplify_decl_expr): Record dynamically-allocated object by calling cgraph_final_record_dynamic_alloc if -fcallgraph-info=da. * libfuncs.h (enum libfunc_index): Delete LTI_memcpy and LTI_memset. (memcpy_libfunc): Delete. (memset_libfunc): Likewise. * optabs.c (init_one_libfunc): Do not zap the SYMBOL_REF_DECL. (init_optabs): Do not initialize memcpy_libfunc and memset_libfunc. * print-tree.c (print_decl_identifier): New function. * output.h (enum stack_usage_kind_type): New type. (stack_usage_qual):
Re: [Patch ARM] PR51980 / PR49081 Improve Neon permute intrinsics.
On Wed, 20 Jun 2012 11:56:39 +0100 Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: Hi, This patch helps use the __builtin_shuffle intrinsics to implement the Neon permute intrinsics following on from Julian's and my patch last week. It needed support for __builtin_shuffle in the C++ frontend, which is now in and has been for the past few days, so I'm a little happier with this going in now. The changes to Julian's patch are to drop the mask generation and now this directly generates the vector constants instead. A small stylistic point I noticed: in,

     let rec print_lines = function
         [] -> ()
-      | [line] -> Format.printf "%s" line
-      | line::lines -> Format.printf "%s@," line; print_lines lines in
+      | [line] -> if line <> "" then Format.printf "%s" line else ()
+      | line::lines -> (if line <> "" then Format.printf "%s@," line);
+                       print_lines lines in
     print_lines body;
     close_braceblock ffmt;
     end_function ffmt

You can use constant strings in pattern matches, so this can be just:

  let rec print_lines = function
      [] | ""::_ -> ()
    | [line] -> Format.printf...
    | line::lines -> Format.printf...

You didn't need the brackets () around the if, btw. It's semantically quite like C: only a single statement after the then is conditional. If you want multiple statements conditionalised, the idiomatic way to do it is use begin...end (equivalent to { } in C) after the then keyword. HTH, Julian
Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
Richard Guenther writes: On Tue, Jun 19, 2012 at 11:36 PM, Mikael Pettersson mi...@it.uu.se wrote: Richard Guenther writes: On Fri, Jun 15, 2012 at 5:00 PM, Ulrich Weigand uweig...@de.ibm.com wrote: Richard Guenther wrote: On Fri, Jun 15, 2012 at 3:13 PM, Ulrich Weigand uweig...@de.ibm.com wrote: However, there is a second case where we need to check every pass: if we're not actually vectorizing any loop, but are performing basic-block SLP. In this case, it would appear that we need the same check as described in the comment above, i.e. to verify that the stride is a multiple of the vector size. The patch below adds this check, and this indeed fixes the invalid access I was seeing in the test case (in the final assembler, we now get a vld1.16 instead of vldr). Tested on arm-linux-gnueabi with no regressions. OK for mainline? Ok. Thanks for the quick review; I've checked this in to mainline now. I just noticed that the test case also crashes on 4.7, but not on 4.6. Would a backport to 4.7 also be OK, once testing passes? Yes. Please leave it on mainline a few days to catch fallout from autotesters. This patch caused FAIL: gcc.dg/vect/bb-slp-16.c scan-tree-dump-times slp basic block vectorized using SLP 1 on sparc64-linux. Comparing the pre and post patch dumps for that file shows 22: vect_compute_data_ref_alignment: 22: misalign = 4 bytes of ref MEM[(unsigned int *)pout_90 + 28B] 22: vect_compute_data_ref_alignment: -22: force alignment of arr[i_87] -22: misalign = 0 bytes of ref arr[i_87] +22: SLP: step doesn't divide the vector-size. +22: Unknown alignment for access: arr (lots of stuff that's simply gone) -22: BASIC BLOCK VECTORIZED - -22: basic block vectorized using SLP +22: not vectorized: unsupported unaligned store.arr[i_87] +22: not vectorized: unsupported alignment in basic block. In this testcase the alignment of arr[i] should be irrelevant - it is not part of the stmts that are going to be vectorized. 
But of course this may be simply an ordering issue in how we analyze data-references / statements in basic-block vectorization (thus we possibly did not yet declare the arr[i] = i statement as not taking part in the vectorization). The line

-22: force alignment of arr[i_87]

is odd, too - as said we do not need to touch arr when vectorizing the basic-block. Ulrich, can you look into this or do you want me to take a look here? Mikael - please open a bugreport for this. I opened PR53729 for this, with an update saying that powerpc64-linux also has this regression. /Mikael
[PATCH] Fix PR30318 - handle more cases of + in VRP
This concludes the VRP and anti-ranges series for now (well, it was the motivation for this patch which was pending for quite some time). This re-implements PLUS_EXPR support on integer ranges to cover all cases, even those that generate an anti-range as result. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-06-20 Richard Guenther rguent...@suse.de PR tree-optimization/30318 * tree-vrp.c (range_int_cst_p): Do not reject overflowed constants here. (range_int_cst_singleton_p): But explicitely here. (zero_nonzero_bits_from_vr): And here. (extract_range_from_binary_expr_1): Re-implement PLUS_EXPR to cover all cases we can perform arbitrary precision arithmetic with double-ints. (intersect_ranges): Handle adjacent anti-ranges. * gcc.dg/tree-ssa/vrp69.c: New testcase. Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c.orig 2012-06-19 16:58:53.0 +0200 --- gcc/tree-vrp.c 2012-06-19 17:16:31.569517561 +0200 *** range_int_cst_p (value_range_t *vr) *** 844,852 { return (vr-type == VR_RANGE TREE_CODE (vr-max) == INTEGER_CST ! TREE_CODE (vr-min) == INTEGER_CST ! !TREE_OVERFLOW (vr-max) ! !TREE_OVERFLOW (vr-min)); } /* Return true if VR is a INTEGER_CST singleton. */ --- 844,850 { return (vr-type == VR_RANGE TREE_CODE (vr-max) == INTEGER_CST ! TREE_CODE (vr-min) == INTEGER_CST); } /* Return true if VR is a INTEGER_CST singleton. */ *** static inline bool *** 855,860 --- 853,860 range_int_cst_singleton_p (value_range_t *vr) { return (range_int_cst_p (vr) + !TREE_OVERFLOW (vr-min) + !TREE_OVERFLOW (vr-max) tree_int_cst_equal (vr-min, vr-max)); } *** zero_nonzero_bits_from_vr (value_range_t *** 1970,1976 { *may_be_nonzero = double_int_minus_one; *must_be_nonzero = double_int_zero; ! if (!range_int_cst_p (vr)) return false; if (range_int_cst_singleton_p (vr)) --- 1970,1978 { *may_be_nonzero = double_int_minus_one; *must_be_nonzero = double_int_zero; ! if (!range_int_cst_p (vr) ! || TREE_OVERFLOW (vr-min) ! 
|| TREE_OVERFLOW (vr-max)) return false; if (range_int_cst_singleton_p (vr)) *** extract_range_from_binary_expr_1 (value_ *** 2376,2414 range and see what we end up with. */ if (code == PLUS_EXPR) { ! /* If we have a PLUS_EXPR with two VR_ANTI_RANGEs, drop to !VR_VARYING. It would take more effort to compute a precise !range for such a case. For example, if we have op0 == 1 and !op1 == -1 with their ranges both being ~[0,0], we would have !op0 + op1 == 0, so we cannot claim that the sum is in ~[0,0]. !Note that we are guaranteed to have vr0.type == vr1.type at !this point. */ ! if (vr0.type == VR_ANTI_RANGE) { set_value_range_to_varying (vr); return; } - - /* For operations that make the resulting range directly -proportional to the original ranges, apply the operation to -the same end of each range. */ - min = vrp_int_const_binop (code, vr0.min, vr1.min); - max = vrp_int_const_binop (code, vr0.max, vr1.max); - - /* If both additions overflowed the range kind is still correct. -This happens regularly with subtracting something in unsigned -arithmetic. - ??? See PR30318 for all the cases we do not handle. */ - if ((TREE_OVERFLOW (min) !is_overflow_infinity (min)) - (TREE_OVERFLOW (max) !is_overflow_infinity (max))) - { - min = build_int_cst_wide (TREE_TYPE (min), - TREE_INT_CST_LOW (min), - TREE_INT_CST_HIGH (min)); - max = build_int_cst_wide (TREE_TYPE (max), - TREE_INT_CST_LOW (max), - TREE_INT_CST_HIGH (max)); - } } else if (code == MIN_EXPR || code == MAX_EXPR) --- 2378,2538 range and see what we end up with. */ if (code == PLUS_EXPR) { ! /* If we have a PLUS_EXPR with two VR_RANGE integer constant ! ranges compute the precise range for such case if possible. */ ! if (range_int_cst_p (vr0) ! range_int_cst_p (vr1) ! /* We attempt to do infinite precision signed integer arithmetic, !thus we need two more bits than the possibly unsigned inputs. */ ! TYPE_PRECISION (expr_type) HOST_BITS_PER_DOUBLE_INT - 1) ! { ! double_int min0 = tree_to_double_int (vr0.min); ! 
double_int max0 = tree_to_double_int (vr0.max); ! double_int min1 = tree_to_double_int (vr1.min); !
Re: [PATCH] Fix PR tree-optimization/53636 (SLP generates invalid misaligned access)
Richard Guenther wrote: In this testcase the alignment of arr[i] should be irrelevant - it is not part of the stmts that are going to be vectorized. Agreed. But of course this may be simply an odering issue in how we analyze data-references / statements in basic-block vectorization (thus we possibly did not yet declare the arr[i] = i statement as not taking part in the vectorization). The line -22: force alignment of arr[i_87] is odd, too - as said we do not need to touch arr when vectorizing the basic-block. Ulrich, can you look into this or do you want me to take a look here? I'll have a look. Bye, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [patch][RFC] Move the C front end to gcc/c/
On Wed, Jun 20, 2012 at 4:44 AM, Steven Bosscher stevenb@gmail.com wrote: I'm posting this as an RFC: Does this look like the right approach? Have I overlooked other things than just documentation updates? I hope this would not cause too much trouble for branches like the cxx-conversion branch? It should be fine. SVN knows about moves and, as it happens, I don't think we have changes in any of the c-* files. Thanks for doing this. Diego.
[patch testsuite]: Fix two testcases for x86_64-*-mingw* target
Hi, ChangeLog 2012-06-20 Kai Tietz * gcc.target/i386/pr23943.c (size_t): Use compatible type-definition for LLP64 targets. * gcc.target/i386/pr38988.c: Likewise. Regression-tested for x86_64-w64-mingw32, and x86_64-unknown-linux-gnu. Ok for apply? Regards, Kai Index: gcc/testsuite/gcc.target/i386/pr23943.c === --- gcc/testsuite/gcc.target/i386/pr23943.c (revision 188753) +++ gcc/testsuite/gcc.target/i386/pr23943.c (working copy) @@ -4,7 +4,7 @@ /* { dg-require-effective-target fpic } */ /* { dg-options -O2 -fPIC } */ -typedef long unsigned int size_t; +__extension__ typedef __SIZE_TYPE__ size_t; extern size_t strlen (__const char *__s) __attribute__ ((__nothrow__)) __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1))); Index: gcc/testsuite/gcc.target/i386/pr38988.c === --- gcc/testsuite/gcc.target/i386/pr38988.c (revision 188753) +++ gcc/testsuite/gcc.target/i386/pr38988.c (working copy) @@ -3,7 +3,7 @@ /* { dg-require-effective-target fpic } */ /* { dg-options -O2 -fpic -mcmodel=large } */ -typedef long unsigned int size_t; +__extension__ typedef __SIZE_TYPE__ size_t; typedef void (*func_ptr) (void); static func_ptr __DTOR_LIST__[1] = { (func_ptr) (-1) };
Re: [Patch] Don't test for pr53425 on mingw
On Mon, Jun 18, 2012 at 10:09 AM, Kai Tietz ktiet...@googlemail.com wrote: 2012/6/18 JonY jo...@users.sourceforge.net: Hi, I am told that this ABI test does not apply to mingw targets. OK to apply? Hi JonY, The test doesn't apply to x64 Windows targets, as SSE is part of its ABI. As the test already checks for !ia32, we could simply check for x86_64/i?86-*-mingw* targets instead. We don't need to check for ilp32 here again.

The test needs to be skipped if the target is:

  x86_64-*-mingw*
  i*86-*-mingw* with -m64 multilib option

and it needs to run if the target is:

  i*86-*-mingw*
  x86_64-*-mingw* with -m32 multilib option

Does anyone know how to make that happen?
Re: [Patch] Don't test for pr53425 on mingw
As both tests are checking already for !ia32, there is no additiona check beside the targets necessary. Cheers, Kai
[Patch ARM] Improve vdup_n intrinsics.
Hi, This improves the vdup_n intrinsics where one tries to form constant vectors. This uses targetm.fold_builtin to fold these vector initializations to actual vector constants. The vdup_n cases are fine with either endianness, as the vector constant is just duplicated. In addition I've made the *neon_vmov patterns take a const_zero vector to allow the compiler to generate vmov.i32 reg, #0 for vdup_n_f32 (0.0f) type operations. It has the nice side effect that zero initialization of FP vectors for Neon doesn't need a load from the literal pool. I will point out that the vcreate and a lot of the other intrinsics could be improved in a similar vein (caveat big-endian). This helps in a number of cases where we were initially generating a mov of a constant into an integer register and then dupping it over, and indeed helps the tree optimizers recognize the value for the constant vector that it is. This also needed some work with making a testcase for vabd more robust, which just showed that the folding works! In the process I've also cleaned up a few prototypes where that was obvious. Tested cross on arm-linux-gnueabi with no regressions. Ok (to commit as 2 separate patches, one for the prototype cleanup and the other for the vdup case)? regards, Ramana 2012-06-20 Ramana Radhakrishnan ramana.radhakrish...@linaro.org * config/arm/arm.c (arm_vector_alignment_reachable): Fix declaration. (arm_builtin_support_vector_misalignment): Likewise. (arm_preferred_rename_class): Likewise. (arm_vectorize_vec_perm_const_ok): Likewise. (arm_fold_builtin): New. (TARGET_FOLD_BUILTIN): New. * config/arm/neon.md (*neon_movmode:VDX, VQX): Add Dz alternative. testsuite/ * gcc.target/arm/neon-combine-sub-abs-into-abd.c: Make test more robust. vmovzero.patch Description: Binary data
Re: [Target maintainers]: Please update libjava/sysdep/*/locks.h with new atomic builtins
Alan and I both re-implemented the locks and settled on the following patch. This uses the __atomic intrinsics, not the __sync instrinsics, to avoid generating expensive instructions for a memory model that is stricter than necessary. If these intrinsics correctly represent the semantics of the libjava barriers, it probably can be used as a generic implementation for targets that support the __atomic intrinsics. - David 2012-06-20 David Edelsohn dje@gmail.com Alan Modra amo...@gmail.com * sysdep/powerpc/locks.h (compare_and_swap): Use GCC atomic intrinsics. (release_set): Same. (compare_and_swap_release): Same. (read_barrier): Same. (write_barrier): Same. Index: locks.h === --- locks.h (revision 188778) +++ locks.h (working copy) @@ -11,87 +11,63 @@ #ifndef __SYSDEP_LOCKS_H__ #define __SYSDEP_LOCKS_H__ -#ifdef __LP64__ -#define _LARX ldarx -#define _STCX stdcx. -#else -#define _LARX lwarx -#ifdef __PPC405__ -#define _STCX sync; stwcx. -#else -#define _STCX stwcx. -#endif -#endif - typedef size_t obj_addr_t; /* Integer type big enough for object */ /* address. */ +// Atomically replace *addr by new_val if it was initially equal to old. +// Return true if the comparison succeeded. +// Assumed to have acquire semantics, i.e. later memory operations +// cannot execute before the compare_and_swap finishes. + inline static bool -compare_and_swap (volatile obj_addr_t *addr, obj_addr_t old, +compare_and_swap (volatile obj_addr_t *addr, + obj_addr_t old, obj_addr_t new_val) { - obj_addr_t ret; - - __asm__ __volatile__ ( - _LARX %0,0,%1 \n -xor. %0,%3,%0\n -bne $+12\n - _STCX %2,0,%1\n -bne- $-16\n - : =r (ret) - : r (addr), r (new_val), r (old) - : cr0, memory); - - /* This version of __compare_and_swap is to be used when acquiring - a lock, so we don't need to worry about whether other memory - operations have completed, but we do need to be sure that any loads - after this point really occur after we have acquired the lock. 
*/ - __asm__ __volatile__ (isync : : : memory); - return ret == 0; + return __atomic_compare_exchange_n (addr, old, new_val, 0, + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); } + +// Set *addr to new_val with release semantics, i.e. making sure +// that prior loads and stores complete before this +// assignment. + inline static void release_set (volatile obj_addr_t *addr, obj_addr_t new_val) { - __asm__ __volatile__ (sync : : : memory); - *addr = new_val; + __atomic_store_n(addr, val, __ATOMIC_RELEASE); } + +// Compare_and_swap with release semantics instead of acquire semantics. + inline static bool compare_and_swap_release (volatile obj_addr_t *addr, obj_addr_t old, obj_addr_t new_val) { - obj_addr_t ret; - - __asm__ __volatile__ (sync : : : memory); - - __asm__ __volatile__ ( - _LARX %0,0,%1 \n -xor. %0,%3,%0\n -bne $+12\n - _STCX %2,0,%1\n -bne- $-16\n - : =r (ret) - : r (addr), r (new_val), r (old) - : cr0, memory); - - return ret == 0; + return __atomic_compare_exchange_n (addr, old, new_val, 0, + __ATOMIC_RELEASE, __ATOMIC_RELAXED); } + // Ensure that subsequent instructions do not execute on stale // data that was loaded from memory before the barrier. + inline static void read_barrier () { - __asm__ __volatile__ (isync : : : memory); + __atomic_thread_fence (__ATOMIC_ACQUIRE); } + // Ensure that prior stores to memory are completed with respect to other // processors. + inline static void write_barrier () { - __asm__ __volatile__ (sync : : : memory); + __atomic_thread_fence (__ATOMIC_RELEASE); } #endif
Re: [Target maintainers]: Please update libjava/sysdep/*/locks.h with new atomic builtins
On Wed, Jun 20, 2012 at 09:10:44AM -0400, David Edelsohn wrote:
>  inline static void
>  release_set (volatile obj_addr_t *addr, obj_addr_t new_val)
>  {
> -  __asm__ __volatile__ ("sync" : : : "memory");
> -  *addr = new_val;
> +  __atomic_store_n(addr, val, __ATOMIC_RELEASE);

A typo seems to have crept in here.  s/val/new_val/

-- 
Alan Modra
Australia Development Lab, IBM
Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin
On 06.06.2012 11:00, Ramana Radhakrishnan wrote: Ok with those changes. Ramana . Hi Ramana, How about this version? Christophe. commit f57ce4b63ca1c30ee88e8c1a431d6e90ffbecb82 Author: Christophe Lyon christophe.l...@st.com Date: Wed Jun 20 15:30:50 2012 +0200 2012-06-20 Christophe Lyon christophe.l...@st.com * gcc/config/arm/neon.md (UNSPEC_VLD1_DUP): Remove. (neon_vld1_dup): Restrict to VQ operands. (neon_vld1_dupv2di): New, fixes vld1q_dup_s64. * gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c: New test. diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 4568dea..b3b925c 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -45,7 +45,6 @@ UNSPEC_VHADD UNSPEC_VHSUB UNSPEC_VLD1 - UNSPEC_VLD1_DUP UNSPEC_VLD1_LANE UNSPEC_VLD2 UNSPEC_VLD2_DUP @@ -4381,8 +4380,7 @@ (define_insn neon_vld1_dupmode [(set (match_operand:VDX 0 s_register_operand =w) -(unspec:VDX [(match_operand:V_elem 1 neon_struct_operand Um)] -UNSPEC_VLD1_DUP))] +(vec_duplicate:VDX (match_operand:V_elem 1 neon_struct_operand Um)))] TARGET_NEON { if (GET_MODE_NUNITS (MODEmode) 1) @@ -4397,20 +4395,30 @@ ) (define_insn neon_vld1_dupmode - [(set (match_operand:VQX 0 s_register_operand =w) -(unspec:VQX [(match_operand:V_elem 1 neon_struct_operand Um)] -UNSPEC_VLD1_DUP))] + [(set (match_operand:VQ 0 s_register_operand =w) +(vec_duplicate:VQ (match_operand:V_elem 1 neon_struct_operand Um)))] TARGET_NEON { - if (GET_MODE_NUNITS (MODEmode) 2) -return vld1.V_sz_elem\t{%e0[], %f0[]}, %A1; - else -return vld1.V_sz_elem\t%h0, %A1; + return vld1.V_sz_elem\t{%e0[], %f0[]}, %A1; } - [(set (attr neon_type) - (if_then_else (gt (const_string V_mode_nunits) (const_string 1)) -(const_string neon_vld2_2_regs_vld1_vld2_all_lanes) -(const_string neon_vld1_1_2_regs)))] + [(set_attr neon_type neon_vld2_2_regs_vld1_vld2_all_lanes)] +) + +(define_insn_and_split neon_vld1_dupv2di + [(set (match_operand:V2DI 0 s_register_operand =w) +(vec_duplicate:V2DI (match_operand:DI 1 neon_struct_operand Um)))] + 
TARGET_NEON + # +reload_completed + [(const_int 0)] + { +rtx tmprtx = gen_lowpart (DImode, operands[0]); +emit_insn (gen_neon_vld1_dupdi (tmprtx, operands[1])); +emit_move_insn (gen_highpart (DImode, operands[0]), tmprtx ); +DONE; +} + [(set_attr length 8) + (set_attr neon_type neon_vld2_2_regs_vld1_vld2_all_lanes)] ) (define_expand vec_store_lanesmodemode diff --git a/gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c b/gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c new file mode 100644 index 000..b5793bf --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/neon-vld1_dupQ.c @@ -0,0 +1,24 @@ +/* Test the `vld1q_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options -O0 } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include stdlib.h + +int main (void) +{ + int64x1_t input[2] = {(int64x1_t)0x0123456776543210LL, + (int64x1_t)0x89abcdeffedcba90LL}; + int64x1_t output[2] = {0, 0}; + int64x2_t var = vld1q_dup_s64(input); + + vst1q_s64(output, var); + if (output[0] != (int64x1_t)0x0123456776543210LL) +abort(); + if (output[1] != (int64x1_t)0x0123456776543210LL) +abort(); + return 0; +}
Re: [PATCH] Fix PR53708
Hi, On 20 Jun 2012, at 09:23, Richard Guenther wrote: On Tue, 19 Jun 2012, Iain Sandoe wrote: On 19 Jun 2012, at 22:41, Mike Stump wrote: On Jun 19, 2012, at 12:22 PM, Iain Sandoe i...@codesourcery.com wrote: On 19 Jun 2012, at 13:53, Dominique Dhumieres wrote: On Tue, 19 Jun 2012, Richard Guenther wrote: Richard Guenther rguent...@suse.de writes: We are too eager to bump alignment of some decls when vectorizing. The fix is to not bump alignment of decls the user explicitly aligned or that are used in an unknown way. I thought attribute((__aligned__)) only set a minimum alignment for variables? Most uses I've seen have been trying to get better performance from higher alignment, so it might not go down well if the attribute stopped the vectoriser from increasing the alignment still further. That's what the documentation says indeed. I'm not sure which part of the patch fixes the ObjC failures where the alignment is part of the ABI (and I suppose ObjC then mis-uses the aligned attribute?). A quick test shows that if (DECL_PRESERVE_P (decl)) alone is enough to fix the objc failures, while they are still there if one uses only if (DECL_USER_ALIGN (decl)) That makes sense, I had a quick look at the ObjC code, and it appears that the explicit ALIGNs were never committed to trunk. Thus, the question becomes: what should ObjC (or any other) FE do to ensure that specific ABI (upper) alignment constraints are met? Hum, upper is easy... I thought the issue was that extra alignment would kill it? I know that extra alignment does kill some of the objc metadata. Clearly, ambiguous phrasing on my part. I mean when we want to say no more than this much. I think the only way would be to lay out things inside a structure. Otherwise, if extra alignment can break things, cannot re-ordering of symbols break, too? Or can you elaborate on how extra alignment breaks stuff here? 
The NeXT runtime meta-data are anonymous and stored in named sections; the names of the sections are known to the runtime, which looks up the data directly. In the case that's failing we have class references (which are, effectively, pointers to strings). The (m32, v0 or v1) runtime knows that the number of names is the section size / sizeof(long). Bumping up the alignment of these items makes it look like there are more name pointers present. For later versions of (Darwin) ld, this is caught by the linker; for earlier versions the exe will fail. The order of the names or other items is not significant (or has already been handled by the runtime). [It might be possible to rejig the class ref. list as an array, or a structure containing only longs, but I'll need to look at that later]. However, it seems reasonable that ABIs could require both upper and lower limits on alignment; are we saying that the only way to handle the 'upper' is by declaring things 'packed' and putting them into a crafted structure? thanks Iain
Re: [Target maintainers]: Please update libjava/sysdep/*/locks.h with new atomic builtins
On Wed, Jun 20, 2012 at 9:35 AM, Alan Modra amo...@gmail.com wrote: On Wed, Jun 20, 2012 at 09:10:44AM -0400, David Edelsohn wrote: inline static void release_set (volatile obj_addr_t *addr, obj_addr_t new_val) { - __asm__ __volatile__ (sync : : : memory); - *addr = new_val; + __atomic_store_n(addr, val, __ATOMIC_RELEASE); A typo seems to have crept in here. s/val/new_val/ Fixed. Thanks, David
Re: [cxx-conversion] Remove option to build without a C++ compiler (issue6296093)
dnovi...@google.com (Diego Novillo) writes: Ian, could you please take a look to double check I have not missed anything? There was more code dealing with it than I was expecting. It all looks plausible to me. Ian
Re: [PATCH] Fix PR53708
On Wed, 20 Jun 2012, Richard Guenther wrote: On Wed, 20 Jun 2012, Iain Sandoe wrote: Hi, On 20 Jun 2012, at 09:23, Richard Guenther wrote: On Tue, 19 Jun 2012, Iain Sandoe wrote: On 19 Jun 2012, at 22:41, Mike Stump wrote: On Jun 19, 2012, at 12:22 PM, Iain Sandoe i...@codesourcery.com wrote: On 19 Jun 2012, at 13:53, Dominique Dhumieres wrote: On Tue, 19 Jun 2012, Richard Guenther wrote: Richard Guenther rguent...@suse.de writes: We are too eager to bump alignment of some decls when vectorizing. The fix is to not bump alignment of decls the user explicitely aligned or that are used in an unknown way. I thought attribute((__aligned__)) only set a minimum alignment for variables? Most usees I've seen have been trying to get better performance from higher alignment, so it might not go down well if the attribute stopped the vectoriser from increasing the alignment still further. That's what the documentation says indeed. I'm not sure which part of the patch fixes the ObjC failures where the alignment is part of the ABI (and I suppose ObjC then mis-uses the aligned attribute?). A quick test shows that if (DECL_PRESERVE_P (decl)) alone is enough to fix the objc failures, while they are still there if one uses only if (DECL_USER_ALIGN (decl)) That makes sense, I had a quick look at the ObjC code, and it appears that the explicit ALIGNs were never committed to trunk. Thus, the question becomes; what should ObjC (or any other) FE do to ensure that specific ABI (upper) alignment constraints are met? Hum, upper is easy... I thought the issue was that extra alignment would kill it? I know that extra alignment does kill some of the objc metadata. clearly, ambiguous phrasing on my part. I mean when we want to say no more than this much. I think the only way would be to lay out things inside a structure. Otherwise if extra alignment can break things cannot re-ordering of symbols break, too? Or can you elaborate on how extra alignment breaks stuff here? 
The NeXT runtime meta-data are anonymous and stored in named sections; the names of the sections are known to the runtime, which looks up the data directly. In the case that's failing we have class references (which are, effectively, pointers to strings). The (m32, v0 or v1) runtime knows that the number of names is the section size / sizeof(long). Bumping up the alignment of these items makes it look like there are more name pointers present. For later versions of (Darwin) ld, this is caught by the linker; for earlier versions the exe will fail. The order of the names or other items is not significant (or has already been handled by the runtime). [It might be possible to rejig the class ref. list as an array, or a structure containing only longs, but I'll need to look at that later]. However, it seems reasonable that ABIs could require both upper and lower limits on alignment; are we saying that the only way to handle the 'upper' is by declaring things 'packed' and putting them into a crafted structure? Yes, I think so. It would also be reasonable to have __attribute__((aligned(8),packed)) specify that's a hard alignment requirement, not a lower bound. Not sure if that works to that effect though. At least int c __attribute__((aligned(8),packed)); tells you that the packed attribute is ignored. If you pack things inside an array the array itself might still get larger alignment (though of course not its elements). So if you rely on section concatenation not producing gaps even the packed structure may not be a good solution - its start address can still get a bigger alignment. So I think there does not exist a way to tell GCC that the start address of an object ought not to be aligned in another way than the ABI specifies (though the ABIs I know only specify minimum alignments, not maximum ones ...). From reading your description of the issue again I think that an array of names is what you want. Richard.
Re: [cxx-conversion] Remove option to build without a C++ compiler (issue6296093)
On Wed, Jun 20, 2012 at 1:08 AM, Diego Novillo dnovi...@google.com wrote: diff --git a/configure.ac b/configure.ac index 071b5e2..2a2a0c6 100644 --- a/configure.ac +++ b/configure.ac @@ -1667,7 +1653,7 @@ ACX_ELF_TARGET_IFELSE([# ELF platforms build the lto-plugin always. ]) -# By default, C is the only stage 1 language. +# By default, C and C++ are the only stage 1 languages. stage1_languages=,c, So shouldn't you add c++ here? # Target libraries that we bootstrap. @@ -1705,15 +1691,14 @@ if test -d ${srcdir}/gcc; then ;; esac - # If bootstrapping, then using --enable-build-with-cxx or - # --enable-build-poststage1-with-cxx requires enabling C++. - case ,$enable_languages,:,$ENABLE_BUILD_WITH_CXX,$ENABLE_BUILD_POSTSTAGE1_WITH_CXX,:$enable_bootstrap in - *,c++,*:*:*) ;; - *:*,yes,*:yes) + # If bootstrapping, C++ must be enabled. Hmn, perhaps I misunderstand, but shouldn't C++ also be enabled if not bootstrapping? + case ,$enable_languages,:$enable_bootstrap in + *,c++,*:*) ;; + *:yes) if test -f ${srcdir}/gcc/cp/config-lang.in; then enable_languages=${enable_languages},c++ else - AC_MSG_ERROR([bootstrapping with --enable-build-with-cxx or --enable-build-poststage1-with-cxx requires c++ sources]) + AC_MSG_ERROR([bootstrapping requires c++ sources]) fi ;; esac @@ -1808,10 +1793,7 @@ if test -d ${srcdir}/gcc; then fi if test $language = c++; then - if test $ENABLE_BUILD_WITH_CXX = yes \ - || test $ENABLE_BUILD_POSTSTAGE1_WITH_CXX = yes; then - boot_language=yes - fi + boot_language=yes fi This shouldn't be necessary if you add c++ to stage1_languages case ,${enable_languages}, in @@ -3198,26 +3180,6 @@ case $build in esac ;; esac You can also remove the lang_requires_boot_languages machinery again. It is only used by Go to enable c++ for bootstrapping the Go front end, but with c++ enabled by default, there is no need for this hack for Go anymore. Ciao! Steven
Re: [cxx-conversion] Remove option to build without a C++ compiler (issue6296093)
On Wed, Jun 20, 2012 at 4:10 PM, Steven Bosscher stevenb@gmail.com wrote: On Wed, Jun 20, 2012 at 1:08 AM, Diego Novillo dnovi...@google.com wrote: diff --git a/configure.ac b/configure.ac index 071b5e2..2a2a0c6 100644 --- a/configure.ac +++ b/configure.ac @@ -1667,7 +1653,7 @@ ACX_ELF_TARGET_IFELSE([# ELF platforms build the lto-plugin always. ]) -# By default, C is the only stage 1 language. +# By default, C and C++ are the only stage 1 languages. stage1_languages=,c, So shouldn't you add c++ here? If you are not bootstrapping you only need frontends to build target libraries - unless that includes a C++ library by default no, you only need a C++ host compiler then. I think stage1_languages should be empty and Makefile.def should be properly set-up to add frontends required for required target libraries. Richard. # Target libraries that we bootstrap. @@ -1705,15 +1691,14 @@ if test -d ${srcdir}/gcc; then ;; esac - # If bootstrapping, then using --enable-build-with-cxx or - # --enable-build-poststage1-with-cxx requires enabling C++. - case ,$enable_languages,:,$ENABLE_BUILD_WITH_CXX,$ENABLE_BUILD_POSTSTAGE1_WITH_CXX,:$enable_bootstrap in - *,c++,*:*:*) ;; - *:*,yes,*:yes) + # If bootstrapping, C++ must be enabled. Hmn, perhaps I misunderstand, but shouldn't C++ also be enabled if not bootstrapping? 
+ case ,$enable_languages,:$enable_bootstrap in + *,c++,*:*) ;; + *:yes) if test -f ${srcdir}/gcc/cp/config-lang.in; then enable_languages=${enable_languages},c++ else - AC_MSG_ERROR([bootstrapping with --enable-build-with-cxx or --enable-build-poststage1-with-cxx requires c++ sources]) + AC_MSG_ERROR([bootstrapping requires c++ sources]) fi ;; esac @@ -1808,10 +1793,7 @@ if test -d ${srcdir}/gcc; then fi if test $language = c++; then - if test $ENABLE_BUILD_WITH_CXX = yes \ - || test $ENABLE_BUILD_POSTSTAGE1_WITH_CXX = yes; then - boot_language=yes - fi + boot_language=yes fi This shouldn't be necessary if you add c++ to stage1_languages case ,${enable_languages}, in @@ -3198,26 +3180,6 @@ case $build in esac ;; esac You can also remove the lang_requires_boot_languages machinery again. It is only used by Go to enable c++ for bootstrapping the Go front end, but with c++ enabled by default, there is no need for this hack for Go anymore. Ciao! Steven
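[For what it's worth, the enabling logic under discussion can be modelled in isolation like this - a toy stand-in for the configure fragment, not the actual configure.ac text: C++ is forced on only when bootstrapping, since a non-bootstrap build compiles GCC with the host C++ compiler anyway.]

```shell
#!/bin/sh
# Toy model: append c++ to the language list only when bootstrapping
# and c++ is not already enabled.
maybe_enable_cxx () {
  enable_languages=$1
  enable_bootstrap=$2
  case ,$enable_languages,:$enable_bootstrap in
    *,c++,*:*) ;;                                      # already there
    *:yes) enable_languages="$enable_languages,c++" ;; # bootstrap needs it
  esac
  echo "$enable_languages"
}

test "$(maybe_enable_cxx c,fortran yes)" = c,fortran,c++ || exit 1
test "$(maybe_enable_cxx c,c++ yes)" = c,c++ || exit 1
test "$(maybe_enable_cxx c no)" = c || exit 1
echo ok
```

The comma-padding on both sides of the list is what lets the glob `*,c++,*` match c++ anywhere in the list without also matching, say, a hypothetical language whose name merely contains "c++".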
Re: [arm] Remove obsolete FPA support (1/n): obsolete target removal
On 06/13/2012 02:51 PM, Richard Earnshaw wrote: This patch is the first of a series to remove support for the now obsolete FPA and Maverick co-processors. This patch removes those targets and configuration options that were marked as deprecated in GCC-4.7 and removes the config fragments that depended on them. * config.gcc (unsupported): Move obsoleted FPA-based configurations here from ... (obsolete): ... here. [...] I am not sure, but I think the libgcc/config.host needs some cleanup too? -- Sebastian Huber, embedded brains GmbH Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany Phone : +49 89 18 90 80 79-6 Fax : +49 89 18 90 80 79-9 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. This message is not a commercial notification within the meaning of the EHUG.
Re: RFA: Fix PR53688
Hi, On Tue, 19 Jun 2012, Richard Guenther wrote: The MEM_REF is acceptable to the tree oracle and it can extract points-to information from it. Thus for simplicity unconditionally building the above is the best. But it doesn't work, as refs_may_alias_p_1 only accepts certain operands in MEM_REFs. So, I opted to check the operand for is_gimple_mem_ref_addr after it's built, and if not acceptable at least build a mem-ref for the base object, if possible. In order not to lose info we had before the patch I had to improve get_base_address a little to not give up on MEM_REFs like MEM[&p.c]. Regstrapped on x86_64-linux, no regressions. Okay for trunk? Ciao, Michael. PR middle-end/53688 * gimple.c (get_base_address): Strip components also from inner arguments to MEM_REFs. * builtins.c (get_memory_rtx): Always build an all-aliasing MEM_REF with correct size. testsuite/ * gcc.c-torture/execute/pr53688.c: New test. Index: gimple.c === --- gimple.c(revision 188772) +++ gimple.c(working copy) @@ -2911,7 +2911,11 @@ get_base_address (tree t) if ((TREE_CODE (t) == MEM_REF || TREE_CODE (t) == TARGET_MEM_REF) && TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR) -t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); +{ + t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); + while (handled_component_p (t)) + t = TREE_OPERAND (t, 0); +} if (TREE_CODE (t) == SSA_NAME || DECL_P (t) Index: builtins.c === --- builtins.c (revision 188772) +++ builtins.c (working copy) @@ -1252,7 +1252,6 @@ get_memory_rtx (tree exp, tree len) { tree orig_exp = exp; rtx addr, mem; - HOST_WIDE_INT off; /* When EXP is not resolved SAVE_EXPR, MEM_ATTRS can be still derived from its expression, for expr->a.b only variable.a.b is recorded. 
*/ @@ -1269,114 +1268,30 @@ get_memory_rtx (tree exp, tree len) POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (exp, 0 exp = TREE_OPERAND (exp, 0); - off = 0; - if (TREE_CODE (exp) == POINTER_PLUS_EXPR - TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR - host_integerp (TREE_OPERAND (exp, 1), 0) - (off = tree_low_cst (TREE_OPERAND (exp, 1), 0)) 0) -exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0); - else if (TREE_CODE (exp) == ADDR_EXPR) -exp = TREE_OPERAND (exp, 0); - else if (POINTER_TYPE_P (TREE_TYPE (exp))) -exp = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (exp)), exp); - else -exp = NULL; - - /* Honor attributes derived from exp, except for the alias set - (as builtin stringops may alias with anything) and the size - (as stringops may access multiple array elements). */ - if (exp) + /* Build a MEM_REF representing the whole accessed area as a byte blob, + (as builtin stringops may alias with anything). */ + exp = fold_build2 (MEM_REF, +build_array_type (char_type_node, + build_range_type (sizetype, +size_one_node, len)), +exp, build_int_cst (ptr_type_node, 0)); + + /* If the MEM_REF has no acceptable address, try to get the base object, + and build an all-aliasing unknown-sized access to that one. */ + if (!is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)) + (exp = get_base_address (exp))) { - set_mem_attributes (mem, exp, 0); - - if (off) - mem = adjust_automodify_address_nv (mem, BLKmode, NULL, off); - - /* Allow the string and memory builtins to overflow from one -field into another, see http://gcc.gnu.org/PR23561. -Thus avoid COMPONENT_REFs in MEM_EXPR unless we know the whole -memory accessed by the string or memory builtin will fit -within the field. 
*/ - if (MEM_EXPR (mem) TREE_CODE (MEM_EXPR (mem)) == COMPONENT_REF) - { - tree mem_expr = MEM_EXPR (mem); - HOST_WIDE_INT offset = -1, length = -1; - tree inner = exp; - - while (TREE_CODE (inner) == ARRAY_REF -|| CONVERT_EXPR_P (inner) -|| TREE_CODE (inner) == VIEW_CONVERT_EXPR -|| TREE_CODE (inner) == SAVE_EXPR) - inner = TREE_OPERAND (inner, 0); - - gcc_assert (TREE_CODE (inner) == COMPONENT_REF); - - if (MEM_OFFSET_KNOWN_P (mem)) - offset = MEM_OFFSET (mem); - - if (offset = 0 len host_integerp (len, 0)) - length = tree_low_cst (len, 0); - - while (TREE_CODE (inner) == COMPONENT_REF) - { - tree field = TREE_OPERAND (inner, 1); - gcc_assert (TREE_CODE (mem_expr) == COMPONENT_REF); - gcc_assert (field == TREE_OPERAND (mem_expr, 1)); - - /* Bitfields are generally not byte-addressable. */ - gcc_assert (!DECL_BIT_FIELD
Re: RFA: Fix PR53688
On Wed, Jun 20, 2012 at 4:57 PM, Michael Matz m...@suse.de wrote: Hi, On Tue, 19 Jun 2012, Richard Guenther wrote: The MEM_REF is acceptable to the tree oracle and it can extract points-to information from it. Thus for simplicity unconditionally building the above is the best. But it doesn't work, as refs_may_alias_p_1 only accepts certain operands in MEM_REFs. So, I opted to check the operand for is_gimple_mem_ref_addr after it's built, and if not acceptable at least build a mem-ref for the base object, if possible. In order not to loose info we had before the patch I had to improve get_base_address a little to not give up on MEM_REFs like MEM[p.c]. Regstrapped on x86_64-linux, no regressions. Okay for trunk? Hrm ... Ciao, Michael. PR middle-end/53688 * gimple.c (get_base_address): Strip components also from inner arguments to MEM_REFs. * builtins.c (get_memory_rtx): Always build an all-aliasing MEM_REF with correct size. testsuite/ * gcc.c-torture/execute/pr53688.c: New test. Index: gimple.c === --- gimple.c (revision 188772) +++ gimple.c (working copy) @@ -2911,7 +2911,11 @@ get_base_address (tree t) if ((TREE_CODE (t) == MEM_REF || TREE_CODE (t) == TARGET_MEM_REF) TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR) - t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); + { + t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); + while (handled_component_p (t)) + t = TREE_OPERAND (t, 0); + } if (TREE_CODE (t) == SSA_NAME || DECL_P (t) Index: builtins.c === --- builtins.c (revision 188772) +++ builtins.c (working copy) @@ -1252,7 +1252,6 @@ get_memory_rtx (tree exp, tree len) { tree orig_exp = exp; rtx addr, mem; - HOST_WIDE_INT off; /* When EXP is not resolved SAVE_EXPR, MEM_ATTRS can be still derived from its expression, for expr-a.b only variable.a.b is recorded. 
*/ @@ -1269,114 +1268,30 @@ get_memory_rtx (tree exp, tree len) POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (exp, 0 exp = TREE_OPERAND (exp, 0); - off = 0; - if (TREE_CODE (exp) == POINTER_PLUS_EXPR - TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR - host_integerp (TREE_OPERAND (exp, 1), 0) - (off = tree_low_cst (TREE_OPERAND (exp, 1), 0)) 0) - exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0); - else if (TREE_CODE (exp) == ADDR_EXPR) - exp = TREE_OPERAND (exp, 0); - else if (POINTER_TYPE_P (TREE_TYPE (exp))) - exp = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (exp)), exp); - else - exp = NULL; - - /* Honor attributes derived from exp, except for the alias set - (as builtin stringops may alias with anything) and the size - (as stringops may access multiple array elements). */ - if (exp) + /* Build a MEM_REF representing the whole accessed area as a byte blob, + (as builtin stringops may alias with anything). */ + exp = fold_build2 (MEM_REF, + build_array_type (char_type_node, + build_range_type (sizetype, + size_one_node, len)), + exp, build_int_cst (ptr_type_node, 0)); + + /* If the MEM_REF has no acceptable address, try to get the base object, + and build an all-aliasing unknown-sized access to that one. */ + if (!is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)) + (exp = get_base_address (exp))) The get_base_address massaging should be not necessary if you'd use the original exp here, not the built MEM_REF. Otherwise looks ok. Thanks, Richard. { - set_mem_attributes (mem, exp, 0); - - if (off) - mem = adjust_automodify_address_nv (mem, BLKmode, NULL, off); - - /* Allow the string and memory builtins to overflow from one - field into another, see http://gcc.gnu.org/PR23561. - Thus avoid COMPONENT_REFs in MEM_EXPR unless we know the whole - memory accessed by the string or memory builtin will fit - within the field. 
*/ - if (MEM_EXPR (mem) TREE_CODE (MEM_EXPR (mem)) == COMPONENT_REF) - { - tree mem_expr = MEM_EXPR (mem); - HOST_WIDE_INT offset = -1, length = -1; - tree inner = exp; - - while (TREE_CODE (inner) == ARRAY_REF - || CONVERT_EXPR_P (inner) - || TREE_CODE (inner) == VIEW_CONVERT_EXPR - || TREE_CODE (inner) == SAVE_EXPR) - inner = TREE_OPERAND (inner, 0); - - gcc_assert (TREE_CODE (inner) == COMPONENT_REF); - - if (MEM_OFFSET_KNOWN_P (mem)) - offset = MEM_OFFSET (mem); - - if (offset = 0 len host_integerp (len, 0)) - length = tree_low_cst (len, 0); - - while (TREE_CODE (inner) ==
Re: [cxx-conversion] Remove option to build without a C++ compiler (issue6296093)
On 12-06-20 10:10 , Steven Bosscher wrote: -# By default, C is the only stage 1 language. +# By default, C and C++ are the only stage 1 languages. stage1_languages=,c, So shouldn't you add c++ here? That was a bad change on the comment. We only need C for stage1. Thanks for spotting it. - # If bootstrapping, then using --enable-build-with-cxx or - # --enable-build-poststage1-with-cxx requires enabling C++. - case ,$enable_languages,:,$ENABLE_BUILD_WITH_CXX,$ENABLE_BUILD_POSTSTAGE1_WITH_CXX,:$enable_bootstrap in -*,c++,*:*:*) ;; -*:*,yes,*:yes) + # If bootstrapping, C++ must be enabled. Hmn, perhaps I misunderstand, but shouldn't C++ also be enabled if not bootstrapping? It's only needed if we are building C++ code. Everything else uses the host compiler. You can also remove the lang_requires_boot_languages machinery again. It is only used by Go to enable c++ for bootstrapping the Go front end, but with c++ enabled by default, there is no need for this hack for Go anymore. Good point. I'll send a separate patch for that. Diego.
Re: RFA: Fix PR53688
Hi, On Wed, 20 Jun 2012, Richard Guenther wrote: + exp = fold_build2 (MEM_REF, + build_array_type (char_type_node, + build_range_type (sizetype, + size_one_node, len)), + exp, build_int_cst (ptr_type_node, 0)); + + /* If the MEM_REF has no acceptable address, try to get the base object, + and build an all-aliasing unknown-sized access to that one. */ + if (!is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)) + (exp = get_base_address (exp))) The get_base_address massaging should be not necessary if you'd use the original exp here, not the built MEM_REF. Hmm? The original expression is an address, I have to build a MEM_REF out of that, and the is_gimple_mem_ref_addr() just checked that that very address (after going through fold) is not acceptable as MEM_REF operand. So how could I avoid the massaging of the address to make it an acceptable operand? Ciao, Michael.
Re: RFA: Fix PR53688
On Wed, Jun 20, 2012 at 5:09 PM, Michael Matz m...@suse.de wrote: Hi, On Wed, 20 Jun 2012, Richard Guenther wrote: + exp = fold_build2 (MEM_REF, + build_array_type (char_type_node, + build_range_type (sizetype, + size_one_node, len)), + exp, build_int_cst (ptr_type_node, 0)); + + /* If the MEM_REF has no acceptable address, try to get the base object, + and build an all-aliasing unknown-sized access to that one. */ + if (!is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)) + && (exp = get_base_address (exp))) The get_base_address massaging should not be necessary if you'd use the original exp here, not the built MEM_REF. Hmm? The original expression is an address, I have to build a MEM_REF out of that, and the is_gimple_mem_ref_addr() just checked that that very address (after going through fold) is not acceptable as MEM_REF operand. So how could I avoid the massaging of the address to make it an acceptable operand? Not change get_base_address and use if (!is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)) && (exp = get_base_address (TREE_OPERAND (orig_exp, 0 Richard.
Re: [Patch ping] Strength reduction
On Wed, 2012-06-20 at 13:11 +0200, Richard Guenther wrote: On Thu, Jun 14, 2012 at 3:21 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Pro forma ping. :) ;) I notice (with all of these functions) +unsigned +negate_cost (enum machine_mode mode, bool speed) +{ + static unsigned costs[NUM_MACHINE_MODES]; + rtx seq; + unsigned cost; + + if (costs[mode]) +return costs[mode]; + + start_sequence (); + force_operand (gen_rtx_fmt_e (NEG, mode, + gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)), + NULL_RTX); + seq = get_insns (); + end_sequence (); + + cost = seq_cost (seq, speed); + if (!cost) +cost = 1; that the cost[] array is independent on the speed argument. Thus whatever comes first determines the cost. Odd, and probably not good. A fix would be appreciated (even for the current code ...) - simply make the array costs[NUM_MACHINE_MODES][2]. As for the renaming - can you name the functions consistently? Thus the above would be negate_reg_cost? And maybe rename the other FIXME function, too? I agree with all this. I'll prepare all the cost model changes as a separate preliminaries patch. Index: gcc/tree-ssa-strength-reduction.c === --- gcc/tree-ssa-strength-reduction.c (revision 0) +++ gcc/tree-ssa-strength-reduction.c (revision 0) @@ -0,0 +1,1611 @@ +/* Straight-line strength reduction. + Copyright (C) 2012 Free Software Foundation, Inc. I know we have these 'tree-ssa-' names, but really this is gimple-ssa now ;) So, please name it gimple-ssa-strength-reduction.c. Will do. Vive la revolution? ;) + /* Access to the statement for subsequent modification. Cached to + save compile time. */ + gimple_stmt_iterator cand_gsi; this is a iterator for cand_stmt? Then caching it is no longer necessary as the iterator is the stmt itself after recent infrastructure changes. Oh yeah, I remember seeing that go by. Nice. Will change. +/* Hash table embodying a mapping from statements to candidates. */ +static htab_t stmt_cand_map; ... 
+static hashval_t +stmt_cand_hash (const void *p) +{ + return htab_hash_pointer (((const_slsr_cand_t) p)-cand_stmt); +} use a pointer-map instead. +/* Callback to produce a hash value for a candidate chain header. */ + +static hashval_t +base_cand_hash (const void *p) +{ + tree ssa_name = ((const_cand_chain_t) p)-base_name; + + if (TREE_CODE (ssa_name) != SSA_NAME) +return (hashval_t) 0; + + return (hashval_t) SSA_NAME_VERSION (ssa_name); +} does it ever happen that ssa_name is not an SSA_NAME? Not in this patch, but when I introduce CAND_REF in a later patch it could happen since the base field of a CAND_REF is a MEM_REF. It's a safety valve in case of misuse. I'll think about this some more. I'm not sure the memory savings over simply using a fixed-size (num_ssa_names) array indexed by SSA_NAME_VERSION pointing to the chain is worth using a hashtable for this? That's reasonable. I'll do that. + node = (cand_chain_t) pool_alloc (chain_pool); + node-base_name = c-base_name; If you never free pool entries it's more efficient to use an obstack. alloc-pool only pays off if you get freed item re-use. OK. I'll change both cand_pool and chain_pool to obstacks. + switch (gimple_assign_rhs_code (gs)) +{ +case MULT_EXPR: + rhs2 = gimple_assign_rhs2 (gs); + + if (TREE_CODE (rhs2) == INTEGER_CST) + return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed); + + if (TREE_CODE (rhs1) == INTEGER_CST) + return multiply_by_cost (TREE_INT_CST_LOW (rhs1), lhs_mode, speed); In theory all commutative statements should have constant operands only at rhs2 ... I'm glad I'm not the only one who thought that was the theory. ;) I wasn't sure, and I've seen violations of this come up in practice. Should I assert when that happens instead, and track down the offending optimizations? Also you do not verify that the constant fits in a host-wide-int - but maybe you do not care? 
Thus, I'd do if (host_integerp (rhs2, 0)) return multiply_by_cost (TREE_INT_CST_LOW (rhs2), lhs_mode, speed); or make multiply_by[_const?]_cost take a double-int instead. Likewise below for add. Ok. Name change looks good also, I'll include that in the cost model changes. +case MODIFY_EXPR: + /* Be suspicious of assigning costs to copies that may well go away. */ + return 0; MODIFY_EXPR is never a gimple_assign_rhs_code. Simple copies have a code of SSA_NAME for example. But as you assert if you get to an unhandled code I wonder why you needed the above ... I'll remove this, and document that we are deliberately not touching copies (which was my original intent). +static slsr_cand_t +base_cand_from_table (tree base_in) +{ + slsr_cand mapping_key;
Re: [PATCH] C++11, grammar fix for late-specified return types and virt-specifiers
On 06/20/2012 12:57 AM, Ville Voutilainen wrote: If a single pipe is indeed to be used, perhaps we want to correct that piece of documentation, lest fools follow its advice. :) Done, thanks. Jason
Re: [PATCH] ARM: exclude fixed_regs for stack-alignment save/restore
On Mon, Jun 18, 2012 at 9:34 AM, Roland McGrath mcgra...@google.com wrote: OK then. If you like the original patch, would you like to commit it for me? ping?
Re: [arm] Remove obsolete FPA support (1/n): obsolete target removal
On 20/06/12 15:41, Sebastian Huber wrote: On 06/13/2012 02:51 PM, Richard Earnshaw wrote: This patch is the first of a series to remove support for the now obsolete FPA and Maverick co-processors. This patch removes those targets and configuration options that were marked as deprecated in GCC-4.7 and removes the config fragments that depended on them. * config.gcc (unsupported): Move obsoleted FPA-based configurations here from ... (obsolete): ... here. [...] I am not sure, but I think the libgcc/config.host needs some cleanup too? Undoubtedly. But I'm not finished yet... R.
[PATCH, i386]: Macroize remaining rounding expanders
Hello!

2012-06-20  Uros Bizjak  <ubiz...@gmail.com>

	* config/i386/i386.md (<rounding_insn><mode>2): Macroize expander
	from {floor,ceil,btrunc}<mode>2 using FIST_ROUNDING int iterator.
	(l<rounding_insn><MODEF:mode><SWI48:mode>2): Macroize expander
	from l{floor,ceil}<MODEF:mode><SWI48:mode>2 using FIST_ROUNDING
	int iterator.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: i386.md
===================================================================
--- i386.md	(revision 188837)
+++ i386.md	(working copy)
@@ -15178,9 +15178,11 @@
    && flag_unsafe_math_optimizations
    && !optimize_insn_for_size_p ())
 
-(define_expand "floor<mode>2"
-  [(use (match_operand:MODEF 0 "register_operand"))
-   (use (match_operand:MODEF 1 "register_operand"))]
+(define_expand "<rounding_insn><mode>2"
+  [(parallel [(set (match_operand:MODEF 0 "register_operand")
+		   (unspec:MODEF [(match_operand:MODEF 1 "register_operand")]
+				 FRNDINT_ROUNDING))
+	      (clobber (reg:CC FLAGS_REG))])]
   "(TARGET_USE_FANCY_MATH_387
     && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
	 || TARGET_MIX_SSE_I387)
@@ -15193,53 +15195,31 @@
 {
   if (TARGET_ROUND)
     emit_insn (gen_sse4_1_round<mode>2
-	       (operands[0], operands[1], GEN_INT (ROUND_FLOOR)));
+	       (operands[0], operands[1], GEN_INT (ROUND_<ROUNDING>)));
   else if (optimize_insn_for_size_p ())
-    FAIL;
-  else if (TARGET_64BIT || (<MODE>mode != DFmode))
-    ix86_expand_floorceil (operands[0], operands[1], true);
-  else
-    ix86_expand_floorceildf_32 (operands[0], operands[1], true);
-}
-  else
-    {
-      rtx op0, op1;
-
-      if (optimize_insn_for_size_p ())
	FAIL;
-
-      op0 = gen_reg_rtx (XFmode);
-      op1 = gen_reg_rtx (XFmode);
-      emit_insn (gen_extend<mode>xf2 (op1, operands[1]));
-      emit_insn (gen_frndintxf2_floor (op0, op1));
-
-      emit_insn (gen_truncxf<mode>2_i387_noop (operands[0], op0));
-    }
-  DONE;
-})
-
-(define_expand "ceil<mode>2"
-  [(use (match_operand:MODEF 0 "register_operand"))
-   (use (match_operand:MODEF 1 "register_operand"))]
-  "(TARGET_USE_FANCY_MATH_387
-    && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-	 || TARGET_MIX_SSE_I387)
-    && flag_unsafe_math_optimizations)
-   || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
-       && !flag_trapping_math)"
-{
-  if (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
-      && !flag_trapping_math)
-    {
-      if (TARGET_ROUND)
-	emit_insn (gen_sse4_1_round<mode>2
-		   (operands[0], operands[1], GEN_INT (ROUND_CEIL)));
-      else if (optimize_insn_for_size_p ())
-	FAIL;
       else if (TARGET_64BIT || (<MODE>mode != DFmode))
-	ix86_expand_floorceil (operands[0], operands[1], false);
+	{
+	  if (ROUND_<ROUNDING> == ROUND_FLOOR)
+	    ix86_expand_floorceil (operands[0], operands[1], true);
+	  else if (ROUND_<ROUNDING> == ROUND_CEIL)
+	    ix86_expand_floorceil (operands[0], operands[1], false);
+	  else if (ROUND_<ROUNDING> == ROUND_TRUNC)
+	    ix86_expand_trunc (operands[0], operands[1]);
+	  else
+	    gcc_unreachable ();
+	}
       else
-	ix86_expand_floorceildf_32 (operands[0], operands[1], false);
+	{
+	  if (ROUND_<ROUNDING> == ROUND_FLOOR)
+	    ix86_expand_floorceildf_32 (operands[0], operands[1], true);
+	  else if (ROUND_<ROUNDING> == ROUND_CEIL)
+	    ix86_expand_floorceildf_32 (operands[0], operands[1], false);
+	  else if (ROUND_<ROUNDING> == ROUND_TRUNC)
+	    ix86_expand_truncdf_32 (operands[0], operands[1]);
+	  else
+	    gcc_unreachable ();
+	}
     }
   else
     {
@@ -15251,53 +15231,13 @@
       op0 = gen_reg_rtx (XFmode);
       op1 = gen_reg_rtx (XFmode);
       emit_insn (gen_extend<mode>xf2 (op1, operands[1]));
-      emit_insn (gen_frndintxf2_ceil (op0, op1));
+      emit_insn (gen_frndintxf2_<rounding> (op0, op1));
 
       emit_insn (gen_truncxf<mode>2_i387_noop (operands[0], op0));
     }
   DONE;
 })
 
-(define_expand "btrunc<mode>2"
-  [(use (match_operand:MODEF 0 "register_operand"))
-   (use (match_operand:MODEF 1 "register_operand"))]
-  "(TARGET_USE_FANCY_MATH_387
-    && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-	 || TARGET_MIX_SSE_I387)
-    && flag_unsafe_math_optimizations)
-   || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
-       && !flag_trapping_math)"
-{
-  if (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH
-      && !flag_trapping_math)
-    {
-      if (TARGET_ROUND)
-	emit_insn (gen_sse4_1_round<mode>2
-		   (operands[0], operands[1], GEN_INT (ROUND_TRUNC)));
-      else if (optimize_insn_for_size_p ())
-	FAIL;
-      else if (TARGET_64BIT || (<MODE>mode != DFmode))
-	ix86_expand_trunc (operands[0], operands[1]);
-      else
-	ix86_expand_truncdf_32 (operands[0], operands[1]);
-    }
-  else
-    {
-
Re: [PATCH] add DECL_SOURCE_COLUMN to tree.h (trivial)
On 12-06-20 13:43, Rüdiger Sonderfeld wrote: The patch is extremely trivial and probably doesn't need copyright assignment.  However, I have signed a copyright assignment for Emacs, and maybe that will work too (not sure if this has to be signed for every project).

It does, unfortunately.

gcc/ChangeLog
2012-06-20  Rüdiger Sonderfeld  <ruedi...@c-plusplus.de>

	* tree.h (DECL_SOURCE_COLUMN): New accessor.

OK.  I suppose you do not have write access to the repo, so I will commit it for you.

Diego.
Re: [PATCH] add DECL_SOURCE_COLUMN to tree.h (trivial)
On 12-06-20 13:50, Diego Novillo wrote: OK. I suppose you do not have write access to the repo, so I will commit it for you.

Committed r188841.

Diego.
Re: [Patch ping] Strength reduction
On 06/20/2012 04:11 AM, Richard Guenther wrote:

I notice (with all of these functions)

+unsigned
+negate_cost (enum machine_mode mode, bool speed)
+{
+  static unsigned costs[NUM_MACHINE_MODES];
+  rtx seq;
+  unsigned cost;
+
+  if (costs[mode])
+    return costs[mode];
+
+  start_sequence ();
+  force_operand (gen_rtx_fmt_e (NEG, mode,
+				gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
+		 NULL_RTX);

I don't suppose there's any way to share data with what init_expmed computes?  Not, strictly speaking, the cleanest thing to include expmed.h here, but surely a tad better than re-computing identical data (and without the clever rtl garbage avoidance tricks).

r~
Re: [Patch] PR 51938: extend ifcombine
On Wed, 20 Jun 2012, Richard Guenther wrote: On Sun, Jun 10, 2012 at 4:16 PM, Marc Glisse <marc.gli...@inria.fr> wrote:

Hello,

currently, tree-ssa-ifcombine handles pairs of imbricated ifs that share the same then branch, or the same else branch.  There is no particular reason why it couldn't also handle the case where the then branch of one is the else branch of the other, which is what I do here.  Any comments?

The general idea looks good, but I think the patch is too invasive.  As far as I can see, the only callers with a non-zero 'inv' argument come from ifcombine_ifnotorif and ifcombine_ifnotandif (and both with inv == 2).  I would rather see a more localized patch that makes use of invert_tree_comparison to perform the inversion on the call arguments of maybe_fold_and/or_comparisons.  Is there any reason that would not work?

invert_tree_comparison is useless for floating point (the case I am most interested in) unless we specify -fno-trapping-math (writing this patch taught me to add this flag to my default flags, but I can't expect everyone to do the same).  An issue is that gcc mixes the behaviors of qnan and snan (it is not really an issue, it just means that !(comparison) can't be represented as comparison2).

At least

+  if (inv & 1)
+    lcompcode2 = COMPCODE_TRUE - lcompcode2;

looks as if it were not semantically correct -- you cannot simply invert floating-point comparisons (see the restrictions invert_tree_comparison has).

I don't remember all the details, but I specifically thought of that, and the trapping behavior is handled a few lines below.

-- 
Marc Glisse
[PATCH, i386]: Macroize with int iterators remaining insn patterns
Hello!

2012-06-20  Uros Bizjak  <ubiz...@gmail.com>

	* config/i386/i386.md (SINCOS): New int iterator.
	(sincos): New int attribute.
	(*<sincos>xf2_i387): Macroize insn from *{sin,cos}xf2_i387 using
	SINCOS int iterator.
	(*<sincos>_extend<mode>xf2_i387): Macroize insn from
	*{sin,cos}_extend<mode>xf2_i387 using SINCOS int iterator.

2012-06-20  Uros Bizjak  <ubiz...@gmail.com>

	* config/i386/i386.md (RDFSGSBASE): New int iterator.
	(WRFSGSBASE): Ditto.
	(fsgs): New int attribute.
	(rd<fsgs>base<mode>): Macroize insn from rd<fsgs>base<mode> using
	RDFSGSBASE int iterator.
	(wr<fsgs>base<mode>): Macroize insn from wr<fsgs>base<mode> using
	WRFSGSBASE int iterator.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: i386.md
===================================================================
--- i386.md	(revision 188840)
+++ i386.md	(working copy)
@@ -13863,47 +13863,34 @@
   DONE;
 })
 
-(define_insn "*sinxf2_i387"
-  [(set (match_operand:XF 0 "register_operand" "=f")
-	(unspec:XF [(match_operand:XF 1 "register_operand" "0")] UNSPEC_SIN))]
-  "TARGET_USE_FANCY_MATH_387
-   && flag_unsafe_math_optimizations"
-  "fsin"
-  [(set_attr "type" "fpspc")
-   (set_attr "mode" "XF")])
+(define_int_iterator SINCOS
+	[UNSPEC_SIN
+	 UNSPEC_COS])
 
-(define_insn "*sin_extend<mode>xf2_i387"
-  [(set (match_operand:XF 0 "register_operand" "=f")
-	(unspec:XF [(float_extend:XF
-		      (match_operand:MODEF 1 "register_operand" "0"))]
-		   UNSPEC_SIN))]
-  "TARGET_USE_FANCY_MATH_387
-   && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
-       || TARGET_MIX_SSE_I387)
-   && flag_unsafe_math_optimizations"
-  "fsin"
-  [(set_attr "type" "fpspc")
-   (set_attr "mode" "XF")])
+(define_int_attr sincos
+	[(UNSPEC_SIN "sin")
+	 (UNSPEC_COS "cos")])
 
-(define_insn "*cosxf2_i387"
+(define_insn "*<sincos>xf2_i387"
   [(set (match_operand:XF 0 "register_operand" "=f")
-	(unspec:XF [(match_operand:XF 1 "register_operand" "0")] UNSPEC_COS))]
+	(unspec:XF [(match_operand:XF 1 "register_operand" "0")]
+		   SINCOS))]
   "TARGET_USE_FANCY_MATH_387
    && flag_unsafe_math_optimizations"
-  "fcos"
+  "f<sincos>"
   [(set_attr "type" "fpspc")
   (set_attr "mode" "XF")])
 
-(define_insn "*cos_extend<mode>xf2_i387"
+(define_insn "*<sincos>_extend<mode>xf2_i387"
   [(set (match_operand:XF 0 "register_operand" "=f")
	(unspec:XF [(float_extend:XF
		      (match_operand:MODEF 1 "register_operand" "0"))]
-		   UNSPEC_COS))]
+		   SINCOS))]
   "TARGET_USE_FANCY_MATH_387
    && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
       || TARGET_MIX_SSE_I387)
    && flag_unsafe_math_optimizations"
-  "fcos"
+  "f<sincos>"
   [(set_attr "type" "fpspc")
   (set_attr "mode" "XF")])
 
@@ -18087,38 +18074,36 @@
   (set (attr "length")
	(symbol_ref "ix86_attr_length_address_default (insn) + 9"))])
 
-(define_insn "rdfsbase<mode>"
-  [(set (match_operand:SWI48 0 "register_operand" "=r")
-	(unspec_volatile:SWI48 [(const_int 0)] UNSPECV_RDFSBASE))]
-  "TARGET_64BIT && TARGET_FSGSBASE"
-  "rdfsbase %0"
-  [(set_attr "type" "other")
-   (set_attr "prefix_extra" "2")])
+(define_int_iterator RDFSGSBASE
+	[UNSPECV_RDFSBASE
+	 UNSPECV_RDGSBASE])
 
-(define_insn "rdgsbase<mode>"
+(define_int_iterator WRFSGSBASE
+	[UNSPECV_WRFSBASE
+	 UNSPECV_WRGSBASE])
+
+(define_int_attr fsgs
+	[(UNSPECV_RDFSBASE "fs")
+	 (UNSPECV_RDGSBASE "gs")
+	 (UNSPECV_WRFSBASE "fs")
+	 (UNSPECV_WRGSBASE "gs")])
+
+(define_insn "rd<fsgs>base<mode>"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
-	(unspec_volatile:SWI48 [(const_int 0)] UNSPECV_RDGSBASE))]
+	(unspec_volatile:SWI48 [(const_int 0)] RDFSGSBASE))]
   "TARGET_64BIT && TARGET_FSGSBASE"
-  "rdgsbase %0"
+  "rd<fsgs>base\t%0"
   [(set_attr "type" "other")
   (set_attr "prefix_extra" "2")])
 
-(define_insn "wrfsbase<mode>"
+(define_insn "wr<fsgs>base<mode>"
   [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")]
-		    UNSPECV_WRFSBASE)]
+		    WRFSGSBASE)]
   "TARGET_64BIT && TARGET_FSGSBASE"
-  "wrfsbase %0"
+  "wr<fsgs>base\t%0"
   [(set_attr "type" "other")
   (set_attr "prefix_extra" "2")])
 
-(define_insn "wrgsbase<mode>"
-  [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")]
-		    UNSPECV_WRGSBASE)]
-  "TARGET_64BIT && TARGET_FSGSBASE"
-  "wrgsbase %0"
-  [(set_attr "type" "other")
-   (set_attr "prefix_extra" "2")])
-
 (define_insn "rdrand<mode>_1"
   [(set (match_operand:SWI248 0 "register_operand" "=r")
	(unspec_volatile:SWI248 [(const_int 0)] UNSPECV_RDRAND))
[PATCH] PR c/53702: Fix -Wunused-local-typedefs with nested functions
Hi,

A few weeks ago I submitted a fix for a garbage collection issue I ran into involving -Wunused-local-typedefs [1].  The analysis for that patch still stands, but unfortunately the patch is wrong.  The problem is that the allocation reuse can't be removed, otherwise the information about local typedefs for a parent function is lost after a nested function is parsed.  I obviously missed that distinction the first time.  This patch restores the previous behavior and just clears the 'x_cur_stmt_list' field to avoid the GC issue.

The patch was tested by building mips-linux-gnu (to verify that the GC crash that I originally encountered is still fixed) and by bootstrapping and running the full test suite for i686-pc-linux-gnu.  OK?

P.S. If it is OK, then can someone commit for me (I don't have write access)?

[1] http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01936.html

gcc/

2012-06-20  Meador Inge  <mead...@codesourcery.com>

	PR c/53702
	* c-decl.c (c_push_function_context): Restore the behavior to reuse
	the language function allocated for -Wunused-local-typedefs.
	(c_pop_function_context): If necessary, clear the language function
	created in c_push_function_context.  Always clear out the
	x_cur_stmt_list field of the restored language function.

gcc/testsuite/

2012-06-20  Meador Inge  <mead...@codesourcery.com>

	PR c/53702
	* gcc.dg/Wunused-local-typedefs.c: New testcase.

Index: gcc/testsuite/gcc.dg/Wunused-local-typedefs.c
===================================================================
--- gcc/testsuite/gcc.dg/Wunused-local-typedefs.c	(revision 0)
+++ gcc/testsuite/gcc.dg/Wunused-local-typedefs.c	(revision 0)
@@ -0,0 +1,36 @@
+/* Origin PR c/53702
+   { dg-options "-Wunused-local-typedefs" }
+   { dg-do compile }
+*/
+
+/* Only test nested functions for C.  More tests that work for C and C++
+   can be found in c-c++-common.
+*/
+
+void
+test0 ()
+{
+  typedef int foo; /* { dg-warning "locally defined but not used" } */
+  void f ()
+  {
+  }
+}
+
+void
+test1 ()
+{
+  void f ()
+  {
+    typedef int foo; /* { dg-warning "locally defined but not used" } */
+  }
+}
+
+
+void
+test2 ()
+{
+  void f ()
+  {
+  }
+  typedef int foo; /* { dg-warning "locally defined but not used" } */
+}
Index: gcc/c-decl.c
===================================================================
--- gcc/c-decl.c	(revision 188841)
+++ gcc/c-decl.c	(working copy)
@@ -8579,9 +8579,11 @@ check_for_loop_decls (location_t loc, bo
 void
 c_push_function_context (void)
 {
-  struct language_function *p;
-  p = ggc_alloc_language_function ();
-  cfun->language = p;
+  struct language_function *p = cfun->language;
+  /* cfun->language might have been already allocated by the use of
+     -Wunused-local-typedefs.  In that case, just re-use it.  */
+  if (p == NULL)
+    cfun->language = p = ggc_alloc_cleared_language_function ();
 
   p->base.x_stmt_tree = c_stmt_tree;
   c_stmt_tree.x_cur_stmt_list
@@ -8607,7 +8609,12 @@ c_pop_function_context (void)
   pop_function_context ();
   p = cfun->language;
-  cfun->language = NULL;
+
+  /* When -Wunused-local-typedefs is in effect, cfun->language is
+     used to store data throughout the life time of the current cfun,
+     so don't deallocate it.  */
+  if (!warn_unused_local_typedefs)
+    cfun->language = NULL;
 
   if (DECL_STRUCT_FUNCTION (current_function_decl) == 0
       && DECL_SAVED_TREE (current_function_decl) == NULL_TREE)
@@ -8620,6 +8627,7 @@ c_pop_function_context (void)
     }
 
   c_stmt_tree = p->base.x_stmt_tree;
+  p->base.x_stmt_tree.x_cur_stmt_list = NULL;
   c_break_label = p->x_break_label;
   c_cont_label = p->x_cont_label;
   c_switch_stack = p->x_switch_stack;
Re: [Patch ping] Strength reduction
On Wed, 2012-06-20 at 11:52 -0700, Richard Henderson wrote: On 06/20/2012 04:11 AM, Richard Guenther wrote: I notice (with all of these functions)

+unsigned
+negate_cost (enum machine_mode mode, bool speed)
+{
+  static unsigned costs[NUM_MACHINE_MODES];
+  rtx seq;
+  unsigned cost;
+
+  if (costs[mode])
+    return costs[mode];
+
+  start_sequence ();
+  force_operand (gen_rtx_fmt_e (NEG, mode,
+				gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1)),
+		 NULL_RTX);

I don't suppose there's any way to share data with what init_expmed computes?  Not, strictly speaking, the cleanest thing to include expmed.h here, but surely a tad better than re-computing identical data (and without the clever rtl garbage avoidance tricks).

Interesting.  I was building on what ivopts already has; not sure of the history there.  It looks like there is some overlap in function, but expmed doesn't have everything ivopts uses today (particularly the hash table of costs for multiplies by various constants).  The stuff I need for type promotion/demotion is also not present (which I'm computing on demand for whatever mode pairs are encountered).  Not sure how great it would be to precompute that for all pairs, and obviously precomputing the costs of multiplying by all constants isn't going to work.  So if the two functionalities were to be combined, it would seem to require some modification to how expmed works.

Thanks,
Bill

r~
[Patch, mips] Fix warning when using --with-synci
This patch addresses the problem of building GCC for MIPS with the '--with-synci' configure option.  If you do that and then compile a program with GCC and specify an architecture that does not support synci (such as with the -mips32 option), GCC will issue a warning that synci is not supported.  This results in many problems, including an inability to build a multilib version of GCC that includes -mips32.

This patch changes the gcc driver to pass -msynci-if-supported to cc1 instead of -msynci, so that cc1 will only turn on synci if it is supported on the architecture being compiled for, and will leave it off (and not generate a warning) if it is not supported on that architecture.  If the user specifically uses -msynci, they still get the warning.

Tested on mips-linux-gnu.  OK for checkin?

Steve Ellcey
sell...@mips.com

2012-06-20  Steve Ellcey  <sell...@mips.com>

	* config.gcc: Set with_synci to synci-if-supported instead of synci.
	* config/mips/mips.c (mips_option_override): Check
	TARGET_SYNCI_IF_SUPPORTED and update target_flags.
	* config/mips/mips.opt (msynci-if-supported): New.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f2b0936..58ee3e9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3281,7 +3281,7 @@ case ${target} in
 		case ${with_synci} in
 		yes)
-			with_synci=synci
+			with_synci=synci-if-supported
 			;;
 		"" | no)
 			# No is the default.
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 5bcb7a8..f17d39b 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -16172,6 +16172,9 @@ mips_option_override (void)
 	     : !TARGET_BRANCHLIKELY))
     sorry ("%qs requires branch-likely instructions", "-mfix-r10000");
 
+  if (TARGET_SYNCI_IF_SUPPORTED && !TARGET_SYNCI && ISA_HAS_SYNCI)
+    target_flags |= MASK_SYNCI;
+
   if (TARGET_SYNCI && !ISA_HAS_SYNCI)
     {
       warning (0, "the %qs architecture does not support the synci"
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index e3294a7..1dbce65 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -338,6 +338,9 @@ msynci
 Target Report Mask(SYNCI)
 Use synci instruction to invalidate i-cache
 
+msynci-if-supported
+Target Mask(SYNCI_IF_SUPPORTED) RejectNegative Undocumented
+
 mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value)
 -mtune=PROCESSOR	Optimize the output for PROCESSOR
[Patch, fortran] PR 39654 FTELL intrinsic
Hi, the attached patch makes the FTELL intrinsic function work on offsets larger than 2 GB on 32-bit systems that support large files. As this is an ABI change the old library function is left untouched, to be removed when/if the library ABI is bumped. Regtested on x86_64-unknown-linux-gnu, Ok for trunk? frontend ChangeLog: 2012-06-21 Janne Blomqvist j...@gcc.gnu.org PR fortran/39654 * iresolve.c (gfc_resolve_ftell): Fix result kind and use new library function. library ChangeLog: 2012-06-21 Janne Blomqvist j...@gcc.gnu.org PR fortran/39654 * io/intrinsics.c (ftell2): New function. * gfortran.map (_gfortran_ftell2): Export function. -- Janne Blomqvist ftell.diff Description: Binary data
Re: Updated to respond to various email comments from Jason, Diego and Cary (issue6197069)
+      /* If we're putting types in their own .debug_types sections,
+	 the .debug_pubtypes table will still point to the compile
+	 unit (not the type unit), so we want to use the offset of
+	 the skeleton DIE (if there is one).  */
+      if (pub->die->comdat_type_p && names == pubtype_table)
+	{
+	  comdat_type_node_ref type_node = pub->die->die_id.die_type_node;
+
+	  if (type_node != NULL && type_node->skeleton_die != NULL)
+	    die_offset = type_node->skeleton_die->die_offset;
+	}

I think we had agreed that if there is no skeleton, we should use an offset of 0.

You're right, I forgot to handle that case.  How's this look?

  if (type_node != NULL)
    die_offset = (type_node->skeleton_die != NULL
		  ? type_node->skeleton_die->die_offset
		  : 0);

Is that OK if it passes regression tests?

-cary
New option to turn off stack reuse for temporaries
One of the most common runtime errors we have seen in gcc-4_7 is caused by dangling references to temporaries whose lifetime has ended, e.g.,

   const A &a = foo();

or

   foo (A());  // where the temp's address is saved and used after foo.

Of course this is a user error according to the standard, but triaging bugs like this is pretty time consuming.  This patch introduces an option to disable stack reuse for temporaries, which can be used for debugging purposes.

Is this good for trunk?

thanks,

David

2012-06-20  Xinliang David Li  <davi...@google.com>

	* common.opt: New -ftemp-stack-reuse option.
	* gimplify.c (gimplify_target_expr): Check new flag.

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 188362)
+++ doc/invoke.texi	(working copy)
@@ -1003,6 +1003,7 @@ See S/390 and zSeries Options.
 -fstack-limit-register=@var{reg} -fstack-limit-symbol=@var{sym} @gol
 -fno-stack-limit -fsplit-stack @gol
 -fleading-underscore -ftls-model=@var{model} @gol
+-ftemp-stack-reuse @gol
 -ftrapv -fwrapv -fbounds-check @gol
 -fvisibility -fstrict-volatile-bitfields}
 @end table
@@ -19500,6 +19501,10 @@ indices used to access arrays are within
 currently only supported by the Java and Fortran front ends, where
 this option defaults to true and false respectively.
 
+@item -ftemp-stack-reuse
+@opindex ftemp_stack_reuse
+This option enables stack space reuse for temporaries.  The default is on.
+
 @item -ftrapv
 @opindex ftrapv
 This option generates traps for signed overflow on addition, subtraction,
Index: gimplify.c
===================================================================
--- gimplify.c	(revision 188362)
+++ gimplify.c	(working copy)
@@ -5487,7 +5487,8 @@ gimplify_target_expr (tree *expr_p, gimp
       /* Add a clobber for the temporary going out of scope, like
	 gimplify_bind_expr.  */
       if (gimplify_ctxp->in_cleanup_point_expr
-	  && needs_to_live_in_memory (temp))
+	  && needs_to_live_in_memory (temp)
+	  && flag_temp_stack_reuse)
	{
	  tree clobber = build_constructor (TREE_TYPE (temp), NULL);
	  TREE_THIS_VOLATILE (clobber) = true;
Index: common.opt
===================================================================
--- common.opt	(revision 188362)
+++ common.opt	(working copy)
@@ -1322,6 +1322,10 @@ fif-conversion2
 Common Report Var(flag_if_conversion2) Optimization
 Perform conversion of conditional jumps to conditional execution
 
+ftemp-stack-reuse
+Common Report Var(flag_temp_stack_reuse) Init(1)
+Enable stack reuse for compiler generated temps
+
 ftree-loop-if-convert
 Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
 Convert conditional jumps in innermost loops to branchless equivalents
[wwwdocs] Fix typo in gcc-4.7/changes.html
Applied.

Gerald

Index: gcc-4.7/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.119
diff -u -3 -p -r1.119 changes.html
--- gcc-4.7/changes.html	14 Jun 2012 17:57:18 -0000	1.119
+++ gcc-4.7/changes.html	20 Jun 2012 23:32:58 -0000
@@ -834,7 +834,7 @@ void set_portb (uint8_t value)
     functions when the user switches the target machine using the
     <code>#pragma GCC target</code> or
     <code>__attribute__ ((__target__ (<em>target</em>)))</code>
-    code sequences.  In additon, the target macros are updated.
+    code sequences.  In addition, the target macros are updated.
     However, due to the way the <code>-save-temps</code> switch is
     implemented, you won't see the effect of these additional macros
     being defined in preprocessor output.</li>
Re: [PR debug/53682] avoid crash in cselib promote_debug_loc
On Jun 20, 2012, Jakub Jelinek <ja...@redhat.com> wrote: On Wed, Jun 20, 2012 at 12:39:29AM -0300, Alexandre Oliva wrote:

When promote_debug_loc was first introduced, it would never be called with a NULL loc list.  However, because of the strategy of temporarily resetting loc lists before recursion, introduced a few months ago in alias.c, the earlier assumption no longer holds.  This patch adjusts promote_debug_loc to deal with this case.

The thing I'm worried about is what will happen with -g0 in that case.  If the loc list is temporarily reset, it will be restored again; won't that mean that for -g0 we'll then have a loc that is in the corresponding -g compilation referenced by DEBUG_INSNs only (and thus non-promoted)?

I don't see how.  If we get to a NULL loc list, it means it's not the first time we visit that node (it was visited upstack), so if it needed promotion, it would have already been promoted then.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
Re: [Patch ARM/ configury] Add fall-back check for gnu_unique_object
On 10 April 2012 10:11, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: The patch with correct configure output is ok. Thanks - this is what I committed. Is this something that can be considered for backporting to release branches ? This patch technically doesn't fix a regression but brings in line behaviour of the normal bootstrap for %gnu_unique_object ? Ping - Is this ok to backport to the 4.7 and 4.6 branches ? Ramana
Re: Updated to respond to various email comments from Jason, Diego and Cary (issue6197069)
OK. Jason
Re: [wwwdocs] Make codingconventions.html pass W3 validator.
Lawrence, you ask a number of awfully good questions. :-) First of all you made me realize that we were missing a cross-link from http://gcc.gnu.org/projects/web.html to http://gcc.gnu.org/contribute.html#webchanges which the first patch included below does now. On Tue, 5 Jun 2012, Lawrence Crowl wrote: Where do these prepended pages come from? How do I test the page as it will appear? This is covered in http://gcc.gnu.org/contribute.html#webchanges . I guess maybe I'm asking for the makefile that produces what one would see. I want to validate that. This is now documented via the second patch below. BTW, part of the problem is that the pages are complete enough as they are to be considered complete. I.e. they are not obviously fragments. Would it be better to make them clearly fragments? The idea was for them to be basic HTML, so that people can view them in their browsers and use some clever editors without problems. So far this has generally worked well. Is there something we can tweak to make it better for you? Doesn't the prepending prevent incremental migration to new standards? This is true, though we can mitigate this by adding separate tags or annotations to either old or new pages during such a transition. The last time I did such a transition, it was not a big issue, though, and I expect web standards to be more incremental and usually quite compatible. But, you are right. Since you ran into this, I would like to document this better. Would http://gcc.gnu.org/projects/web.html be a good place, or do you have a different suggestion? My entry point was http://gcc.gnu.org/cvs.html, so at a minimum it need to be cross linked with http://gcc.gnu.org/projects/web.html. Done via the patch below, which also shortens the cvs.html page to make it easier to consume (and move/integrate somewhere else later on). Anything else I can answer / document, let me know! 
Gerald

Index: projects/web.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/web.html,v
retrieving revision 1.11
diff -u -3 -p -r1.11 web.html
--- projects/web.html	30 Mar 2008 18:59:30 -0000	1.11
+++ projects/web.html	20 Jun 2012 23:50:38 -0000
@@ -8,6 +8,9 @@
 
 <h1>GCC: Web Pages</h1>
 
+<p><a href="../contribute.html#webchanges">Contributing changes to
+our web pages</a> is simple.</p>
+
 <p>Our web pages are managed via CVS and can be accessed using the
 directions for <a href="../cvs.html">our CVS setup</a>.</p>
 
Index: projects/web.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/web.html,v
retrieving revision 1.12
diff -u -3 -p -r1.12 web.html
--- projects/web.html	20 Jun 2012 23:55:35 -0000	1.12
+++ projects/web.html	21 Jun 2012 00:02:02 -0000
@@ -14,6 +14,13 @@ our web pages</a> is simple.</p>
 <p>Our web pages are managed via CVS and can be accessed using the
 directions for <a href="../cvs.html">our CVS setup</a>.</p>
 
+<p>As changes are checked in, the respective pages are preprocessed
+via the script <code>wwwdocs/bin/preprocess</code> which in turn
+uses a tool called MetaHTML.  Among others, this preprocessing
+adds CSS style sheets, XML and HTML headers, and our standard
+footer.  The MetaHTML style sheet is in
+<code>wwwdocs/htdocs/style.mhtml</code>.</p>
+
 <h2>TODO</h2>
 
 <p>Any help concerning open issues is highly welcome, as are
 
Index: cvs.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/cvs.html,v
retrieving revision 1.220
diff -u -3 -p -r1.220 cvs.html
--- cvs.html	3 Apr 2011 13:00:43 -0000	1.220
+++ cvs.html	21 Jun 2012 00:25:42 -0000
@@ -12,7 +12,8 @@
 <p>Our web pages and related scripts are available via our CVS
 repository.
 You can also <a href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/">browse them
-online</a>.</p>
+online</a> or view <a href="projects/web.html">details on the
+setup</a>.</p>
 
 <h2>Using the CVS repository</h2>
 
@@ -28,8 +29,6 @@ and SSH installed, you can check out the
 <p>For anonymous access, use
 <code>-d :pserver:c...@gcc.gnu.org:/cvs/gcc</code> instead.</p>
 
-<p>Patches should be marked with the tag [wwwdocs] in the subject line.</p>
-
 <hr />
 
 <h2><a name="checkin">Checking in a change</a></h2>
 
@@ -37,25 +36,21 @@ and SSH installed, you can check out the
 <p>When you check in changes to our web pages, they will automatically
 be checked out into the web server's data area.</p>
 
-<p>The following is meant to provide a very quick overview of how
+<p>The following is a very quick overview of how
 to check in a change.  We recommend you list files explicitly to
 avoid accidental checkins and prefer that each checkin be of a
 complete, single logical change.</p>
 
 <ol>
 <li>Sync your sources with the master repository via <code>cvs
-update</code> before attempting a checkin; this will save you a little
-time if someone else has modified that file since the last time you
-synced your sources.  It will also identify any files in your local
Re: New option to turn off stack reuse for temporaries
The documentation needs to explain more what the option controls, and why you might want it on or off. Other than that it looks fine. Jason
[gimplefe] creating individual gimple_assign statements
Hi, This patch creates basic gimple_assign statements. It is a little raw not considering all types of gimple_assign statements for which I have already started working. Here is the Changelog. 2012-06-09 Sandeep Soni soni.sande...@gmail.com * parser.c (gimple_symtab_get): New. (gimple_symtab_get_token): New. (gp_parse_expect_lhs): Returns tree node. (gp_parse_expect_rhs_op): Returns the op as tree node. (gp_parse_assign_stmt) : Builds gimple_assign statement. Index: gcc/gimple/parser.c === --- gcc/gimple/parser.c (revision 188546) +++ gcc/gimple/parser.c (working copy) @@ -105,6 +105,7 @@ gimple_symtab_eq_hash, NULL); } + /* Registers DECL with the gimple symbol table as having identifier ID. */ static void @@ -123,6 +124,41 @@ *slot = new_entry; } + +/* Gets the tree node for the corresponding identifier ID */ + +static tree +gimple_symtab_get (tree id) +{ + struct gimple_symtab_entry_def temp; + gimple_symtab_entry_t entry; + void **slot; + + gimple_symtab_maybe_init_hash_table(); + temp.id = id; + slot = htab_find_slot (gimple_symtab, temp, NO_INSERT); + if (slot) +{ + entry = (gimple_symtab_entry_t) *slot; + return entry-decl; +} + else +return NULL; +} + + +/* Gets the tree node of token TOKEN from the global gimple symbol table. */ + +static tree +gimple_symtab_get_token (const gimple_token *token) +{ + const char *name = gl_token_as_text(token); + tree id = get_identifier(name); + tree decl = gimple_symtab_get (id); + return decl; +} + + /* Return the string representation of token TOKEN. */ static const char * @@ -360,10 +396,11 @@ /* Helper for gp_parse_assign_stmt. The token read from reader PARSER should be the lhs of the tuple. */ -static void +static tree gp_parse_expect_lhs (gimple_parser *parser) { const gimple_token *next_token; + tree lhs; /* Just before the name of the identifier we might get the symbol of dereference too. 
     If we do get it then consume that token, else
@@ -372,18 +409,22 @@
   if (next_token->type == CPP_MULT)
     next_token = gl_consume_token (parser->lexer);
 
-  gl_consume_expected_token (parser->lexer, CPP_NAME);
+  next_token = gl_consume_token (parser->lexer);
+  lhs = gimple_symtab_get_token (next_token);
   gl_consume_expected_token (parser->lexer, CPP_COMMA);
+  return lhs;
+
 }
 
 /* Helper for gp_parse_assign_stmt.  The token read from reader PARSER
    should be the first operand in rhs of the tuple.  */
 
-static void
+static tree
 gp_parse_expect_rhs_op (gimple_parser *parser)
 {
   const gimple_token *next_token;
+  tree rhs = NULL_TREE;
 
   next_token = gl_peek_token (parser->lexer);
 
@@ -402,11 +443,13 @@
     case CPP_NUMBER:
     case CPP_STRING:
       next_token = gl_consume_token (parser->lexer);
+      rhs = gimple_symtab_get_token (next_token);
       break;
 
     default:
       break;
     }
+
 }
 
@@ -420,9 +463,10 @@
   gimple_token *optoken;
   enum tree_code opcode;
   enum gimple_rhs_class rhs_class;
+  tree op1 = NULL_TREE, op2 = NULL_TREE, op3 = NULL_TREE;
 
   opcode = gp_parse_expect_subcode (parser, optoken);
-  gp_parse_expect_lhs (parser);
+  tree lhs = gp_parse_expect_lhs (parser);
 
   rhs_class = get_gimple_rhs_class (opcode);
   switch (rhs_class)
@@ -436,16 +480,16 @@
     case GIMPLE_UNARY_RHS:
     case GIMPLE_BINARY_RHS:
     case GIMPLE_TERNARY_RHS:
-      gp_parse_expect_rhs_op (parser);
+      op1 = gp_parse_expect_rhs_op (parser);
       if (rhs_class == GIMPLE_BINARY_RHS || rhs_class == GIMPLE_TERNARY_RHS)
        {
          gl_consume_expected_token (parser->lexer, CPP_COMMA);
-         gp_parse_expect_rhs_op (parser);
+         op2 = gp_parse_expect_rhs_op (parser);
        }
       if (rhs_class == GIMPLE_TERNARY_RHS)
        {
          gl_consume_expected_token (parser->lexer, CPP_COMMA);
-         gp_parse_expect_rhs_op (parser);
+         op3 = gp_parse_expect_rhs_op (parser);
        }
       break;
 
@@ -454,6 +498,9 @@
     }
 
   gl_consume_expected_token (parser->lexer, CPP_GREATER);
+
+  gimple stmt = gimple_build_assign_with_ops (code, lhs, op1, op2, op3);
+  gcc_assert (verify_gimple_stmt (stmt));
 }
 
 /* Helper for gp_parse_cond_stmt.
    The token read from reader PARSER should

-- 
Cheers
Sandy
Re: New option to turn off stack reuse for temporaries
I modified the documentation and it now looks like this:

@item -ftemp-stack-reuse
@opindex ftemp-stack-reuse
This option enables stack space reuse for temporaries.  The default is on.
The lifetime of a compiler-generated temporary is well defined by the C++
standard.  When the lifetime of a temporary ends, and if the temporary
lives in memory, an optimizing compiler is free to reuse its stack space
for other temporaries or for scoped local variables whose live ranges do
not overlap with it.  However, some legacy code relies on the behavior of
older compilers in which a temporary's stack space is not reused; for such
code, aggressive stack reuse can lead to runtime errors.  This option is
used to control the temporary stack reuse optimization.

Does it look ok?

thanks,

David

On Wed, Jun 20, 2012 at 5:29 PM, Jason Merrill ja...@redhat.com wrote:
> The documentation needs to explain more what the option controls, and
> why you might want it on or off.  Other than that it looks fine.
>
> Jason
Re: RFA: PATCH to Makefile.def/tpl to add libgomp to make check-c++
On Jun 20, 2012, at 12:26 AM, Jason Merrill ja...@redhat.com wrote:
> The recent regression in libgomp leads me to want to add libgomp tests
> to the check-c++ target.

I'm fine with the idea...