date:20120904

RE: [Patch ARM] Update the test case to differ movs and lsrs for ARM mode and non-ARM mode

2012-09-04 Thread Terry Guo

 -Original Message-
 From: Richard Earnshaw
 Sent: Wednesday, August 22, 2012 10:00 PM
 To: Terry Guo
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [Patch ARM] Update the test case to differ movs and lsrs
 for ARM mode and non-ARM mode

 On 22/08/12 12:16, Terry Guo wrote:

  Due to the impact of ARM UAL, Thumb1 and Thumb2 modes use the LSRS
  instruction while ARM mode uses the MOVS instruction, so the
  following test case is updated accordingly. Is it OK for trunk?

  BR,
  Terry

  2012-08-21  Terry Guo  terry@arm.com

  * gcc.target/arm/combine-movs.c: Check movs for ARM mode
  and lsrs for other mode.

  This can't be right.  Thumb1 doesn't use unified syntax.

  R.

  Oops, you are right. Sorry for making such an obvious mistake.
  Here is the patch, updated to distinguish ARM and Thumb2.
  Tested for Thumb1, Thumb2 and ARM modes. No regressions.

  Is it OK?

  BR,
  Terry

  2012-08-21  Terry Guo  terry@arm.com

  * gcc.target/arm/combine-movs.c: Check movs for ARM mode
  and lsrs for Thumb2 mode.

 OK.

 R.

Hi Richard,

Is it ok to apply this fix to gcc 4.7 branch?

BR,
Terry
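
For context, here is a minimal sketch of the kind of source such a test exercises, assuming a loop whose bound is a right shift of an argument. The function name `clear_words` is hypothetical and this is not the content of combine-movs.c. That the shift and the compare get merged into one flag-setting instruction (MOVS in ARM mode, LSRS in Thumb2) is the thread's expectation; the snippet itself does not verify it.

```c
#include <stddef.h>

/* Hypothetical example: zero the first d / 32 words of r and return the
   number of words cleared.  The shift result is immediately tested by
   the loop condition, which gives the combine pass a chance to use a
   flag-setting shift instruction.  */
size_t
clear_words (unsigned long r[], unsigned int d)
{
  size_t i, n = d >> 5;

  for (i = 0; i < n; ++i)
    r[i] = 0;
  return n;
}
```

Compiling this with -O -S for each mode and inspecting the output would show which mnemonic is emitted.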

[Ping]RE: [Patch, test] Enable to prune warnings for tests defined in one exp file

2012-09-04 Thread Terry Guo

Hi Mike,

Is it ok to document this feature in README.gcc? Is it ok to back port this
feature to 4.7 branch? Thanks.

BR,
Terry

 -Original Message-
 From: Terry Guo [mailto:terry@arm.com]
 Sent: Thursday, August 30, 2012 10:45 AM
 To: 'Mike Stump'
 Cc: gcc-patches@gcc.gnu.org; Richard Guenther
 Subject: RE: [Patch, test] Enable to prune warnings for tests defined
 in one exp file
 
  -Original Message-
  From: Mike Stump [mailto:mikest...@comcast.net]
  Sent: Tuesday, August 28, 2012 1:21 AM
  To: Terry Guo
  Cc: gcc-patches@gcc.gnu.org; Richard Guenther
  Subject: Re: [Patch, test] Enable to prune warnings for tests defined
  in one exp file
 
  On Aug 27, 2012, at 1:14 AM, Terry Guo wrote:
   This patch intends to provide a chance to prune common warning
  messages for
   tests defined in an exp file.
 
   Is it OK to trunk?
 
  Ok.
 
  If you can find where to document this...  :-)  That'd be nice.
 
 
 I checked the texi files in the gcc/doc folder, but couldn't find a
 suitable place. So I resorted to README.gcc in gcc/testsuite, which
 claims to list notes for those writing testcases and those writing
 expect scripts. The patch follows. Is it OK?
 
 BR,
 Terry
 
 2012-08-30  Terry Guo  terry@arm.com
 
 * README.gcc: Document new variable dg_runtest_extra_prunes.
 
 Index: gcc/testsuite/README.gcc
 ===
 --- gcc/testsuite/README.gcc  (revision 190795)
 +++ gcc/testsuite/README.gcc  (working copy)
 @@ -79,6 +79,11 @@
 
  If a test does not fit into the torture framework, use the dg
 framework.
 
 +If some tests in an exp file need to skip same warning messages, just
 define
 +variable dg_runtest_extra_prunes in this exp file and let it contain
 this warning
 +message pattern.  This can avoid duplicating dg-prune in these cases.
 +Always remember to clear this variable when leave this exp file.
 +
 
 
  Copyright (C) 1997, 1998, 2004 Free Software Foundation, Inc.

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Denis Chertykov

2012/9/4 Joerg Wunsch j...@uriah.heep.sax.de:
 As Georg-Johann Lay wrote:

 What do you propose?

 I'm fine with that option, and think it's a good idea.


I have no objections to the patch.

Denis

Re: [Patch,avr] Fix PR54220

2012-09-04 Thread Denis Chertykov

2012/9/3 Georg-Johann Lay a...@gjlay.de:
 This implements TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS as
 obvious fix for PR54220.

 Ok to install?

 Johann


 PR target/54220
 * config/avr/avr.c (TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS): New
 define to...
 (avr_allocate_stack_slots_for_args): ...this new static function.

Approved.

Denis.

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Gabriel Dos Reis

On Mon, Sep 3, 2012 at 4:23 PM, Georg-Johann Lay a...@gjlay.de wrote:
 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 AVR-Libc comes with hand-optimized float support functions written
 in assembler.  These functions use the same naming conventions as
 libgcc.  There are situations where this name clash leads to a
 performance regression because the functions from libgcc are linked.
 One example is the new fixed-point support, which converts fixed-point
 to/from float and references float/int conversion functions from
 within libgcc.

 The float implementation in libm.a has been discussed several times
 with the only result that it is very unlikely that the code will
 ever be integrated into libgcc because the original authors are no
 longer around.  And it is much less work to add a new configure switch
 than to port and integrate the code, given there were no license
 issues.
 One point against such an extension was that such a change to the
 compiler establishes a dependency between the compiler and AVR-Libc,
 but this decision was made long ago by accepting code that actually
 should have been added to libgcc -- but was not for whatever reason.

 This patch removes these performance regressions by removing the
 doubly implemented functions from libgcc by means of a new configure
 option --with-avrlibc.


 as I stated yesterday, I do not understand why there needs to be yet
 another configure option. The NATURAL libc for AVR targets is
 AVR-libc.  We should not need a switch for that.


 There is also newlib that is used with avr-gcc.  I know this because
 some bugs are only triggered for newlib.  If there are users that
 report bugs if avr-gcc is built for newlib, I'd guess these users are
 actually interested in using newlib.


 I did not say there was no other libc library.  I said
 that the *natural* libc appears to be AVR-libc.

 We don't configure GCC/g++ saying --with-libstdc++.


 That's a different story because these libraries support in-tree
 build just like newlib does.  This is not true for AVR-Libc which
 does not support in-tree builds.

 I agree that AVR-Libc is the most common libc implementation
 used with avr-gcc and it has many advantages over other libc
 implementations (except that it does not support in-tree builds).

I think the in-tree builds thing is a red herring.


 However, a --with-avrlibc is not needed to *get* the support
 from AVR-Libc, it's just used to fix some problems that arise
 in certain use cases.

so, let's make it the default -- see below.


 Besides that, the proposed arrangement does not affect the
 configuration if the switch is *not* specified, thus the patch
 is appropriate to be backported.

 My intention is to backport it to 4.7 as indicated by the milestone,
 but if the change were unconditional I don't think it would be
 appropriate for a backport.


It is perfectly reasonable and OK to make the backport more
guarded (e.g. by the configure option) than on mainline.

 And after all it's just a *configure* option that some distribution
 maintainers can set if they want to.

yes, but it is still one more configure option.

  The tool chain user is not
 bothered at all by the new option and won't even notice it.
 From the user perspective it's just as if some optimizations
 had been added to the tool chain.

 What do you propose?

 Use the setting by default and support --with-avrlibc=no if
 the user wants full libgcc support with nothing removed from it?

Yes. Let's make the sane behaviour the default.

-- Gaby
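
To make the name clash discussed in this thread concrete, here is a minimal sketch of source code that depends on such support routines. The function names are hypothetical. It is an assumption, not something stated in the thread, that on a target without floating-point hardware these casts compile to calls to helpers such as `__floatsisf` and `__fixsfsi`; which library's copy gets linked is what the proposed option would influence.

```c
/* Hypothetical example.  On a target without floating-point hardware,
   each of these casts is expected to become a call to a support routine
   instead of inline code.  */
float
int_to_float (long i)
{
  return (float) i;
}

long
float_to_int (float f)
{
  /* Conversion to an integer type truncates toward zero.  */
  return (long) f;
}
```

Looking at the linker map of a program that uses these functions would show where the support routines came from.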

[SH] Define NO_IMPLICIT_EXTERN_C for newlib targets

2012-09-04 Thread Christian Bruel

newlib uses extern "C" wrappers in its headers, so GCC can be told it is
C++ compatible.

this patch fixes :

FAIL: g++.dg/lookup/builtin5.C -std=c++11  scan-assembler _ZSt5atanhd t

Tested on the 4.7 and 4.8 branches, OK for both?

nb: newlib can be added to the list of runtimes that need it (see
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01164.html), in case this
macro is removed in the future.

Thanks

Christian



2012-09-04  Christian Bruel  christian.br...@st.com

	* config/sh/newlib.h (NO_IMPLICIT_EXTERN_C): Define.

Index: config/sh/newlib.h
===
--- config/sh/newlib.h	(revision 190714)
+++ config/sh/newlib.h	(working copy)
@@ -23,3 +23,7 @@

 #undef LIB_SPEC
 #define LIB_SPEC "-lc -lgloss"
+
+#undef  NO_IMPLICIT_EXTERN_C
+#define NO_IMPLICIT_EXTERN_C 1
+
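
As an illustration of what "extern C wrappers in its headers" means, here is a minimal sketch of a header-style declaration that guards itself. The name `wrapped_add` is hypothetical and the snippet is not taken from newlib. When headers are written this way, a C++ compiler does not need to treat them implicitly as C, which is what defining NO_IMPLICIT_EXTERN_C tells GCC.

```c
/* Hypothetical header content, shown inline.  The guard makes the
   declaration use C linkage when the file is included from C++ and
   has no effect when it is compiled as C.  */
#ifdef __cplusplus
extern "C" {
#endif

int wrapped_add (int a, int b);

#ifdef __cplusplus
}
#endif

/* Definition, as it would appear in the library's C source.  */
int
wrapped_add (int a, int b)
{
  return a + b;
}
```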

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Georg-Johann Lay


Gabriel Dos Reis schrieb:

Georg-Johann Lay wrote:

Gabriel Dos Reis schrieb:

Georg-Johann Lay wrote:

Gabriel Dos Reis schrieb:

Georg-Johann Lay wrote:

AVR-Libc comes with hand-optimized float support functions
written in assembler.  These functions use the same naming
conventions as libgcc.  There are situations where this
name clash leads to a performance regression because the
functions from libgcc are linked.  One example is the new
fixed-point support, which converts fixed-point to/from float
and references float/int conversion functions from within
libgcc.

The float implementation in libm.a has been discussed
several times with the only result that it is very unlikely
that the code will ever be integrated into libgcc because
the original authors are no longer around.  And it is much
less work to add a new configure switch than to port and
integrate the code, given there were no license issues. One
point against such an extension was that such a change to the
compiler establishes a dependency between the compiler and
AVR-Libc, but this decision was made long ago by
accepting code that actually should have been added to
libgcc -- but was not for whatever reason.

This patch removes these performance regressions by removing
the doubly implemented functions from libgcc by means of a
new configure option --with-avrlibc.


as I stated yesterday, I do not understand why there needs to
be yet another configure option. The NATURAL libc for AVR
targets is AVR-libc.  We should not need a switch for that.


There is also newlib that is used with avr-gcc.  I know this
because some bugs are only triggered for newlib.  If there are
users that report bugs if avr-gcc is built for newlib, I'd
guess these users are actually interested in using newlib.


I did not say there was no other libc library.  I said that the
*natural* libc appears to be AVR-libc.

We don't configure GCC/g++ saying --with-libstdc++.


That's a different story because these libraries support in-tree 
build just like newlib does.  This is not true for AVR-Libc which 
does not support in-tree builds.


I agree that AVR-Libc is the most common libc implementation used
with avr-gcc and it has many advantages over other libc
implementations (except that it does not support in-tree builds).


I think the in-tree builds thing is a red herring.


I don't think so.  If there were an in-tree build, gcc could detect
by itself whether or not AVR-Libc is present.  With the
current setup the user has to specify that -- in whatever
direction: that libc is there or that libc is not there, depending
on whatever is the default.


However, a --with-avrlibc is not needed to *get* the support from
AVR-Libc, it's just used to fix some problems that arise in certain
use cases.


so, let's make it the default -- see below.

Besides that, the proposed arrangement does not affect the
configuration if the switch is *not* specified, thus the patch is
appropriate to be backported.

My intention is to backport it to 4.7 as indicated by the
milestone, but if the change were unconditional I don't think it
would be appropriate for a backport.


It is perfectly reasonable and OK to make the backport more
guarded (e.g. by the configure option) than on mainline.



And after all it's just a *configure* option that some distribution
 maintainers can set if they want to.


yes, but it is still one more configure option.


hmm.  The configure machinery was not changed, it automatically sets
with_foo if --with-foo is specified.  It's just about who is to
be blamed if he does not read the release notes ;-)

Whatever, I think we two are stuck now and enough arguments passed
back and forth.  Let the port maintainers decide.

And Jörg, would you check the excludes list in t-avrlibc?

Johann


The tool chain user is not bothered at all by the new option and
won't even notice it. From the user perspective it's just as if
some optimizations had been added to the tool chain.

What do you propose?

Use the setting by default and support --with-avrlibc=no if the
user wants full libgcc support with nothing removed from it?


Yes. Let's make the sane behaviour the default.

-- Gaby

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Gabriel Dos Reis

On Tue, Sep 4, 2012 at 1:55 AM, Georg-Johann Lay a...@gjlay.de wrote:
 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 AVR-Libc comes with hand-optimized float support functions
 written in assembler.  These functions use the same naming
 conventions as libgcc.  There are situations where this
 name clash leads to a performance regression because the
 functions from libgcc are linked.  One example is the new
 fixed-point support, which converts fixed-point to/from float and
 references float/int conversion functions from within
 libgcc.

 The float implementation in libm.a has been discussed
 several times with the only result that it is very unlikely
 that the code will ever be integrated into libgcc because
 the original authors are no longer around.  And it is much
 less work to add a new configure switch than to port and
 integrate the code, given there were no license issues. One
 point against such an extension was that such a change to the
 compiler establishes a dependency between the compiler and
 AVR-Libc, but this decision was made long ago by
 accepting code that actually should have been added to
 libgcc -- but was not for whatever reason.

 This patch removes these performance regressions by removing
 the doubly implemented functions from libgcc by means of a
 new configure option --with-avrlibc.


 as I stated yesterday, I do not understand why there needs to
 be yet another configure option. The NATURAL libc for AVR
 targets is AVR-libc.  We should not need a switch for that.


 There is also newlib that is used with avr-gcc.  I know this
 because some bugs are only triggered for newlib.  If there are
 users that report bugs if avr-gcc is built for newlib, I'd
 guess these users are actually interested in using newlib.


 I did not say there was no other libc library.  I said that the
 *natural* libc appears to be AVR-libc.

 We don't configure GCC/g++ saying --with-libstdc++.


 That's a different story because these libraries support in-tree build
 just like newlib does.  This is not true for AVR-Libc which does not support
 in-tree builds.

 I agree that AVR-Libc is the most common libc implementation used
 with avr-gcc and it has many advantages over other libc implementations
 (except that it does not support in-tree builds).


 I think the in-tree builds thing is a red herring.


 I don't think so.  If there were an in-tree build, gcc could detect
 by itself whether or not AVR-Libc is present.  With the
 current setup the user has to specify that -- in whatever
 direction: that libc is there or that libc is not there, depending
 on whatever is the default.

obviously that situation isn't ideal, and we shouldn't build patches
as if it should be perpetuated.

[...]

 yes, but it is still one more configure option.


 hmm.  The configure machinery was not changed,

It is one more configure option for the user to specify, no
matter how the internal configury is implemented.

-- Gaby

[PATCH] Fix PR 54362 (COND_EXPR not understood by ITM)

2012-09-04 Thread Andrew Pinski

Hi,
  The problem here is that trans-mem.c does not take into account that
COND_EXPR can happen for pointers.  This patch modifies
thread_private_new_memory to handle COND_EXPR the same way it handles PHI
nodes.  The testcase is a modified version of memopt-12.c, but with a
loop in which LIM and if-conversion together can change the conditional
into a COND_EXPR.

I found this problem while writing a pass that does a full
if-conversion before expand (well, changing the last phi-opt pass); it
produces COND_EXPRs, and memopt-12.c started to fail.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* trans-mem.c (thread_private_new_memory): Handle COND_EXPR also.

testsuite/ChangeLog:
* gcc.dg/tm/memopt-16.c: New testcase.
Index: testsuite/gcc.dg/tm/memopt-16.c
===
--- testsuite/gcc.dg/tm/memopt-16.c (revision 0)
+++ testsuite/gcc.dg/tm/memopt-16.c (revision 0)
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm -O3 -fdump-tree-tmmark" } */
+/* Like memopt-12.c but the phi is inside a loop which causes
+   it to be converted into a COND_EXPR.  */
+
+extern int test(void) __attribute__((transaction_safe));
+extern void *malloc (__SIZE_TYPE__) __attribute__((malloc,transaction_safe));
+
+struct large { int foo[500]; };
+
+int f(int j)
+{
+  int *p1, *p2, *p3;
+
+  p1 = malloc (sizeof (*p1)*5000);
+  __transaction_atomic {
+_Bool t;
+int i = 1;
+*p1 = 0;
+
+p2 = malloc (sizeof (*p2)*6000);
+*p2 = 1;
+t = test();
+
+for (i = 0;i < j;i++)
+{
+
+/* p3 = PHI (p1, p2) */
+if (t)
+  p3 = p1;
+else
+  p3 = p2;
+
+/* Since both p1 and p2 are thread-private, we can inherit the
+   logging already done.  No ITM_W* instrumentation necessary.  */
+*p3 = 555;
+}
+  }
+  return p3[something()];
+}
+
+/* { dg-final { scan-tree-dump-times "ITM_WU" 0 "tmmark" } } */
+/* { dg-final { cleanup-tree-dump "tmmark" } } */
Index: trans-mem.c
===
--- trans-mem.c (revision 190908)
+++ trans-mem.c (working copy)
@@ -1379,6 +1379,19 @@ thread_private_new_memory (basic_block e
  /* x = (cast*) foo ==> foo */
  else if (code == VIEW_CONVERT_EXPR || code == NOP_EXPR)
x = gimple_assign_rhs1 (stmt);
+ /* x = c ? op1 : op2 ==> op1 or op2 just like a PHI */
+ else if (code == COND_EXPR)
+   {
+ tree op1 = gimple_assign_rhs2 (stmt);
+ tree op2 = gimple_assign_rhs3 (stmt);
+ enum thread_memory_type mem;
+ retval = thread_private_new_memory (entry_block, op1);
+ if (retval == mem_non_local)
+   goto new_memory_ret;
+ mem = thread_private_new_memory (entry_block, op2);
+ retval = MIN (retval, mem);
+ goto new_memory_ret;
+   }
  else
{
  retval = mem_non_local;
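
The idea behind the COND_EXPR handling above can be sketched in isolation: classify both arms and keep the less private result. The following is a simplified, self-contained illustration, not the trans-mem.c code. All type and function names (`mem_kind`, `node`, `classify`) are hypothetical stand-ins, and the ordering of the enum values is an assumption chosen so that taking the minimum is the conservative answer.

```c
/* Hypothetical, simplified stand-ins for the GCC data structures.  Lower
   values are less private, so the minimum of two classifications is the
   conservative answer.  */
enum mem_kind { MEM_NON_LOCAL = 0, MEM_THREAD_LOCAL, MEM_TRANSACTION_LOCAL };

enum node_code { NODE_LEAF, NODE_COND };

struct node
{
  enum node_code code;
  enum mem_kind leaf_kind;          /* Used when code == NODE_LEAF.  */
  const struct node *op1, *op2;     /* Used when code == NODE_COND.  */
};

enum mem_kind
classify (const struct node *x)
{
  enum mem_kind a, b;

  if (x->code == NODE_LEAF)
    return x->leaf_kind;

  /* x = c ? op1 : op2: the result is no more private than either arm.  */
  a = classify (x->op1);
  if (a == MEM_NON_LOCAL)
    return MEM_NON_LOCAL;           /* Cannot get any worse; stop early.  */
  b = classify (x->op2);
  return a < b ? a : b;
}
```

The early return mirrors the `goto new_memory_ret` in the patch: once one arm is known to be non-local, the other arm cannot improve the result.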

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Oleg Endo

On Mon, 2012-09-03 at 01:58 +0200, Oleg Endo wrote:
 OKOK -- I'll do it :)
 (within the next couple of days)
 

And so I did.  Attached is an updated patch that adds the address space
parameter to the address_cost function.  I hope that this change does
not reset the ACKs so far:

[x] target-independent bits
[ ] alpha     [ ] arm       [ ] avr       [ ] bfin
[ ] cr16      [ ] cris      [ ] epiphany  [ ] i386
[ ] ia64      [ ] iq2000    [ ] lm32      [ ] m32c
[ ] m32r      [ ] mcore     [ ] mep       [x] microblaze
[x] mips      [ ] mmix      [x] mn10300   [ ] pa
[ ] rs6000    [ ] rx        [ ] s390      [ ] score
[x] sh        [ ] sparc     [ ] spu       [ ] stormy16
[ ] v850      [ ] vax       [ ] xtensa

Tested with 'make all-gcc' on SH xgcc and i386 native build.
No functional changes, except on MIPS, as requested by Richard
Sandiford.

Cheers,
Oleg

ChangeLog:

* hooks.c (hook_int_rtx_mode_as_bool_0): New function.
* hooks.h (hook_int_rtx_mode_as_bool_0): Declare it.
* output.h (default_address_cost): Add machine_mode 
and address space arguments.
* target.def (address_cost): Likewise.
* rtlanal.c (address_cost): Pass mode and address space to
target hook.
(default_address_cost): Add unnamed machine_mode and address 
space arguments.
* doc/tm.texi: Regenerate.
* config/alpha/alpha.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/arm/arm.c (arm_address_cost): Add machine_mode 
and address space arguments.
* config/avr/avr.c (avr_address_cost): Likewise.
* config/bfin/bfin.c (bfin_address_cost): Likewise.
* config/cr16/cr16.c (cr16_address_cost): Likewise.
* config/cris/cris.c (cris_address_cost): Likewise.
* config/epiphany/epiphany.c (epiphany_address_cost): Likewise.
* config/i386/i386.c (ix86_address_cost): Likewise.
* config/ia64/ia64.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/iq2000/iq2000.c (iq2000_address_cost): Add 
machine_mode and address space arguments.  Pass them on in
recursive invocation.
* config/lm32/lm32.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/m32c/m32c.c (m32c_address_cost): Add machine_mode 
and address space arguments.
* config/m32r/m32r.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/mcore/mcore.c (TARGET_ADDRESS_COST): Likewise.
* config/mep/mep.c (mep_address_cost): Add machine_mode 
and address space arguments.
* config/microblaze/microblaze.c (microblaze_address_cost): 
Likewise.
* config/mips/mips.c (mips_address_cost): Likewise.
* config/mmix/mmix.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/mn10300/mn10300.c (mn10300_address_cost): Add
machine_mode and address space arguments.  Use GET_MODE (x) and 
ADDR_SPACE_GENERIC in recursive invocation.
* config/pa/pa.c (hppa_address_cost): Add machine_mode and 
address space arguments.
* config/rs6000/rs6000.c (rs6000_debug_address_cost): Likewise.
(TARGET_ADDRESS_COST): Use hook_int_rtx_mode_as_bool_0 instead 
of hook_int_rtx_bool_0.
* config/rx/rx.c (rx_address_cost): Add machine_mode and 
address space arguments.
* config/s390/s390.c (s390_address_cost): Likewise.
* config/score/score-protos.h (score_address_cost): Likewise.
* config/score/score.c (score_address_cost): Likewise.
* config/sh/sh.c (sh_address_cost): Likewise.
* config/sparc/sparc.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/spu/spu.c (TARGET_ADDRESS_COST): Likewise.
* config/stormy16/stormy16.c (xstormy16_address_cost): Add 
machine_mode and address space arguments.
* config/v850/v850.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/vax/vax.c (vax_address_cost): Add machine_mode 
and address space arguments.
* config/xtensa/xtensa.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c	(revision 190865)
+++ gcc/rtlanal.c	(working copy)
@@ -3820,13 +3820,13 @@
   if (!memory_address_addr_space_p (mode, x, as))
 return 1000;
 
-  return targetm.address_cost (x, speed);
+  return targetm.address_cost (x, mode, as, speed);
 }
 
 /* If the target doesn't override, compute the cost as with arithmetic.  */
 
 int
-default_address_cost (rtx
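
The shape of the interface change can be sketched outside of GCC as follows. This is an illustration only: the typedefs `sk_rtx`, `sk_mode` and `sk_addr_space` are hypothetical stand-ins for GCC's real types, and `hook_zero` and `example_target_cost` are made-up hooks, not functions from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's rtx, machine_mode and addr_space_t.  */
typedef const void *sk_rtx;
typedef int sk_mode;
typedef unsigned char sk_addr_space;

/* The hook now receives the access mode and the address space in
   addition to the address and the speed flag.  */
typedef int (*address_cost_hook) (sk_rtx addr, sk_mode mode,
                                  sk_addr_space as, bool speed);

/* A hook that returns 0 whatever its arguments are, for targets that do
   not distinguish address costs.  */
int
hook_zero (sk_rtx addr, sk_mode mode, sk_addr_space as, bool speed)
{
  (void) addr; (void) mode; (void) as; (void) speed;
  return 0;
}

/* A made-up target hook that charges more for a non-generic space.  */
int
example_target_cost (sk_rtx addr, sk_mode mode, sk_addr_space as, bool speed)
{
  (void) addr; (void) mode; (void) speed;
  return as == 0 ? 1 : 4;
}

/* Caller side: pass mode and address space through to the hook.  */
int
address_cost (address_cost_hook hook, sk_rtx addr, sk_mode mode,
              sk_addr_space as, bool speed)
{
  return hook (addr, mode, as, speed);
}
```

Targets that ignore the new arguments only need their signature updated, which is why most ChangeLog entries above read "Add machine_mode and address space arguments".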

Re: [PATCH, M68K] Fix ICE from scheduler improvement

2012-09-04 Thread Andreas Schwab

Maxim Kuvyrkov maxim_kuvyr...@mentor.com writes:

 2012-09-03  Maxim Kuvyrkov  ma...@codesourcery.com

   * config/m68k/m68k.c (m68k_sched_dfa_post_advance_cycle): Fix ICE
   caused by save scheduler state patch.

The change log entry should describe what was changed.  Save scheduler
state doesn't say anything to me.

 + {
 +   /* The instruction buffer appears to be more filled than we
 +  anticipated.  We should have inheritted the state from

s/inheritted/inherited/

 +  the previous basic block.  Adjust buffer counter.  */
 +   ++sched_ib.filled;
 + }

The comment appears to suggest that this is rather a workaround for a
deficiency elsewhere.  Is that true?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread nick clifton


Hi Oleg,


And so I did.  Attached is an updated patch that adds the address space
parameter to the address_cost function.  I hope that this change does
not reset the ACKs so far:

[x] target-independent bits
[ ] alpha     [ ] arm       [ ] avr       [ ] bfin
[ ] cr16      [ ] cris      [ ] epiphany  [ ] i386
[ ] ia64      [ ] iq2000    [ ] lm32      [ ] m32c
[ ] m32r      [ ] mcore     [ ] mep       [x] microblaze
[x] mips      [ ] mmix      [x] mn10300   [ ] pa
[ ] rs6000    [ ] rx        [ ] s390      [ ] score
[x] sh        [ ] sparc     [ ] spu       [ ] stormy16
[ ] v850      [ ] vax       [ ] xtensa


Please add ACKs for: iq2000, m32r, mcore, rx, stormy16 and v850.

Cheers
  Nick

Re: [PATCH] PR45070: Fix wrong epilogue code for cortex-m0/Os

2012-09-04 Thread Ramana Radhakrishnan



I ran regression test with/without Os for cortex-m0 and everything is ok.
Ok for trunk and 4.7/4.6 release branches?


OK for trunk.

Ok to backport if no release manager objects in 24 hours and if it tests 
without regressions there.



Thanks,
Ramana

Re: [SH] PR 51244 - Add CANONICALIZE_COMPARISON macro

2012-09-04 Thread Oleg Endo

On Mon, 2012-09-03 at 19:37 +0900, Kaz Kojima wrote:
 Oleg Endo oleg.e...@t-online.de wrote:
  In any case, I have no problem with changing the multi line comments  
  to /* ... */.  Just let me know.
 
 Other than that, the patch is OK.

I've committed the attached version of the patch as rev 190909.

Cheers,
Oleg
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 190865)
+++ gcc/config/sh/sh.md	(working copy)
@@ -881,10 +881,9 @@
   if (TARGET_SHMEDIA)
 emit_jump_insn (gen_cbranchint4_media (operands[0], operands[1],
 	   operands[2], operands[3]));
-  else if (TARGET_CBRANCHDI4)
-expand_cbranchsi4 (operands, LAST_AND_UNUSED_RTX_CODE, -1);
   else
-sh_emit_compare_and_branch (operands, SImode);
+expand_cbranchsi4 (operands, LAST_AND_UNUSED_RTX_CODE, -1);
+
   DONE;
 })
 
Index: gcc/config/sh/sh-protos.h
===
--- gcc/config/sh/sh-protos.h	(revision 190865)
+++ gcc/config/sh/sh-protos.h	(working copy)
@@ -106,6 +106,9 @@
 extern rtx sh_gen_truncate (enum machine_mode, rtx, int);
 extern bool sh_vector_mode_supported_p (enum machine_mode);
 extern bool sh_cfun_trap_exit_p (void);
+extern void sh_canonicalize_comparison (enum rtx_code, rtx, rtx,
+	enum machine_mode mode = VOIDmode);
+
 #endif /* RTX_CODE */
 
 extern const char *output_jump_label_table (void);
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 190865)
+++ gcc/config/sh/sh.c	(working copy)
@@ -21,6 +21,12 @@
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
+/* FIXME: This is a temporary hack, so that we can include <algorithm>
+   below.  <algorithm> will try to include <cstdlib> which will reference
+   malloc & co, which are poisoned by system.h.  The proper solution is
+   to include <cstdlib> in system.h instead of <stdlib.h>.  */
+#include <cstdlib>
+
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -56,6 +62,7 @@
 #include "tm-constrs.h"
 #include "opts.h"
 
+#include <algorithm>
 
 int code_for_indirect_jump_scratch = CODE_FOR_indirect_jump_scratch;
 
@@ -1791,65 +1798,124 @@
 }
 }
 
-enum rtx_code
-prepare_cbranch_operands (rtx *operands, enum machine_mode mode,
-			  enum rtx_code comparison)
+/* Implement the CANONICALIZE_COMPARISON macro for the combine pass.
+   This function is also re-used to canonicalize comparisons in cbranch
+   pattern expanders.  */
+void
+sh_canonicalize_comparison (enum rtx_code cmp, rtx op0, rtx op1,
+			enum machine_mode mode)
 {
-  rtx op1;
-  rtx scratch = NULL_RTX;
+  /* When invoked from within the combine pass the mode is not specified,
+ so try to get it from one of the operands.  */
+  if (mode == VOIDmode)
+mode = GET_MODE (op0);
+  if (mode == VOIDmode)
+mode = GET_MODE (op1);
 
-  if (comparison == LAST_AND_UNUSED_RTX_CODE)
-comparison = GET_CODE (operands[0]);
-  else
-scratch = operands[4];
-  if (CONST_INT_P (operands[1])
-   && !CONST_INT_P (operands[2]))
+  // We need to have a mode to do something useful here.
+  if (mode == VOIDmode)
+return;
+
+  // Currently, we don't deal with floats here.
+  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
+return;
+
+  // Make sure that the constant operand is the second operand.
+  if (CONST_INT_P (op0) && !CONST_INT_P (op1))
 {
-  rtx tmp = operands[1];
+  std::swap (op0, op1);
+  cmp = swap_condition (cmp);
+}
 
-  operands[1] = operands[2];
-  operands[2] = tmp;
-  comparison = swap_condition (comparison);
-}
-  if (CONST_INT_P (operands[2]))
+  if (CONST_INT_P (op1))
 {
-  HOST_WIDE_INT val = INTVAL (operands[2]);
-  if ((val == -1 || val == -0x81)
-	   && (comparison == GT || comparison == LE))
+  /* Try to adjust the constant operand in such a way that available
+ comparison insns can be utilized better and the constant can be
+ loaded with a 'mov #imm,Rm' insn.  This avoids a load from the
+ constant pool.  */
+  const HOST_WIDE_INT val = INTVAL (op1);
+
+  /* x > -1		  --> x >= 0
+	 x > 0xFF7F	  --> x >= 0xFF80
+	 x <= -1	  --> x < 0
+	 x <= 0xFF7F  --> x < 0xFF80  */
+  if ((val == -1 || val == -0x81) && (cmp == GT || cmp == LE))
 	{
-	  comparison = (comparison == GT) ? GE : LT;
-	  operands[2] = gen_int_mode (val + 1, mode);
+	  cmp = cmp == GT ? GE : LT;
+	  op1 = gen_int_mode (val + 1, mode);
+}
+
+  /* x >= 1 --> x > 0
+	 x >= 0x80  --> x > 0x7F
+	 x < 1  --> x <= 0
+	 x < 0x80   --> x <= 0x7F  */
+  else if ((val == 1 || val == 0x80) && (cmp == GE || cmp == LT))
+	{
+	  cmp = cmp == GE ? GT : LE;
+	  op1 = gen_int_mode (val - 1, mode);
 	}
-  else if ((val == 1 || val == 0x80)
-	&& (comparison == GE || comparison == LT))
+
+  /* unsigned x >= 1  --> x != 0
+	 unsigned x < 1   --> x == 0  */
+  else if (val == 1 && (cmp == GEU ||
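
The signed constant adjustments in the patch can be tried out in isolation. The sketch below is a simplified, stand-alone version, assuming plain `long` values instead of RTL operands. The names `cmp_code` and `canonicalize_const_compare` are hypothetical, and the unsigned cases, which are cut off in the excerpt above, are deliberately left out.

```c
/* Hypothetical stand-in for the relevant rtx comparison codes.  */
enum cmp_code { CMP_GT, CMP_GE, CMP_LT, CMP_LE };

/* Rewrite "x CMP val" into an equivalent comparison against a constant
   that is one closer to zero, for the boundary values used in the patch.
   Other inputs are left unchanged.  */
void
canonicalize_const_compare (enum cmp_code *cmp, long *val)
{
  /* x > -1 --> x >= 0 and x <= -1 --> x < 0 (likewise for -0x81).  */
  if ((*val == -1 || *val == -0x81) && (*cmp == CMP_GT || *cmp == CMP_LE))
    {
      *cmp = *cmp == CMP_GT ? CMP_GE : CMP_LT;
      *val += 1;
    }
  /* x >= 1 --> x > 0 and x < 1 --> x <= 0 (likewise for 0x80).  */
  else if ((*val == 1 || *val == 0x80) && (*cmp == CMP_GE || *cmp == CMP_LT))
    {
      *cmp = *cmp == CMP_GE ? CMP_GT : CMP_LE;
      *val -= 1;
    }
}
```

Both rewrites preserve the truth value of the comparison for every integer x, since x > c and x >= c + 1 are the same condition.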

[Patch, Fortran, committed] PR 54435 54443

2012-09-04 Thread Janus Weil

Hi all,

I have just committed to trunk a trivial fix for two recent OOP
regressions (affecting 4.7 and trunk), both of which originate from
the same problem:

http://gcc.gnu.org/viewcvs?view=revision&revision=190910

I will commit this fix also to the 4.7 branch in a few days.

Cheers,
Janus

[PATCH] Fix PR54458

2012-09-04 Thread Richard Guenther


This fixes PR54458 where DOM jump threading turns a loop into
one with multiple latches but does not mark it so.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-09-04  Richard Guenther  rguent...@suse.de

PR tree-optimization/54458
* tree-ssa-threadupdate.c (thread_through_loop_header): If we
turn the loop into one with multiple latches mark it so.

* gcc.dg/torture/pr54458.c: New testcase.

Index: gcc/tree-ssa-threadupdate.c
===
--- gcc/tree-ssa-threadupdate.c (revision 190889)
+++ gcc/tree-ssa-threadupdate.c (working copy)
@@ -1037,11 +1037,21 @@ thread_through_loop_header (struct loop
}
   free (bblocks);
 
+  /* If the new header has multiple latches mark it so.  */
+  FOR_EACH_EDGE (e, ei, loop->header->preds)
+   if (e->src->loop_father == loop
+       && e->src != loop->latch)
+ {
+   loop->latch = NULL;
+   loops_state_set (LOOPS_MAY_HAVE_MULTIPLE_LATCHES);
+ }
+
   /* Cancel remaining threading requests that would make the
 loop a multiple entry loop.  */
   FOR_EACH_EDGE (e, ei, header->preds)
{
  edge e2;
+
  if (e->aux == NULL)
continue;
 
Index: gcc/testsuite/gcc.dg/torture/pr54458.c
===
--- gcc/testsuite/gcc.dg/torture/pr54458.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr54458.c  (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+unsigned int a, b, c;
+
+void
+foo (unsigned int x)
+{
+  do
+{
+  if (a == 0 ? 1 : 1 % a)
+   for (; b; b--)
+ lab:;
+  else
+   while (x)
+ ;
+  if (c)
+   goto lab;
+}
+  while (1);
+}
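
The check added to thread_through_loop_header can be pictured with a small stand-alone model: a loop has a single latch only if exactly one predecessor of its header lies inside the loop. The sketch below assumes a simplified edge representation; `header_pred` and `unique_latch` are hypothetical names, not GCC APIs.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a CFG edge entering a loop header.  */
struct header_pred
{
  int src_block;      /* Index of the block the edge comes from.  */
  int src_in_loop;    /* Nonzero if that block belongs to the loop.  */
};

/* Return the index of the loop's unique latch block, or -1 when the header
   has no in-loop predecessor or more than one (multiple latches).  */
int
unique_latch (const struct header_pred *preds, size_t n)
{
  int latch = -1;
  size_t i, count = 0;

  for (i = 0; i < n; ++i)
    if (preds[i].src_in_loop)
      {
        latch = preds[i].src_block;
        ++count;
      }
  return count == 1 ? latch : -1;
}
```

Returning -1 plays the role of setting `loop->latch` to NULL in the patch: later passes then know they cannot rely on a single latch block.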

Re: [PATCH 3/3] Compute predicates for phi node results in ipa-inline-analysis.c

2012-09-04 Thread Richard Guenther

On Mon, Sep 3, 2012 at 5:52 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Fri, Aug 31, 2012 at 7:24 PM, Martin Jambor mjam...@suse.cz wrote:
  Hi,
 
  On Thu, Aug 30, 2012 at 05:11:35PM +0200, Martin Jambor wrote:
  this is a new version of the patch which makes ipa analysis produce
  predicates for PHI node results, at least at the bottom of the
  simplest diamond and semi-diamond CFG subgraphs.  This time I also
  analyze the conditions again rather than extracting information from
  CFG edges, which means I can reason about substantially more PHI
  nodes.
 
  This patch makes us produce loop bounds hint for the pr48636.f90
  testcase.
 
  Bootstrapped and tested on x86_64-linux.  OK for trunk?
 
  Thanks,
 
  Martin
 
 
  2012-08-29  Martin Jambor  mjam...@suse.cz
 
* ipa-inline-analysis.c (phi_result_unknown_predicate): New 
  function.
(predicate_for_phi_result): Likewise.
(estimate_function_body_sizes): Use the above two functions.
 
 
  This patch, on top of the one doing loop calculations almost always,
  introduces a number of testsuite failures which somehow I had not
  caught during my testing.  The problem is that either
  calculate_dominance_info or loop_optimizer_init introduce new SSA
  names for which there is no index in nonconstant_names which is
  allocated before the dominance and loop computations.  I'm currently
  bootstrapping and testing the following fix which simply allocates the
  vector after doing the two computations.  If it passes I will commit
  it straight away so that the regression is fixed before I leave for
  the weekend, I hope it's obvious enough for that.
 
  On the other hand, it would really be better if we did not change
  function bodies during IPA summary generation phase...

 Um ... we shouldn't do this.  Can you track down where it happens?  I
 suppose it might come from CFG manipulations loop_optimizer_init
 performs when not passing AVOID_CFG_MODIFICATIONS.

 I bet it comes from loop normalization :) (i.e. loop closed form
 or preheader construction both need new SSA names.)
 I think it would be best to make the pass manager handle this and make
 loop normalization happen once before all SSA IPA analysis

And compute loops as well.

Richard.

 Honza

Re: [PATCH, C] Mixed pointer types in call to streamer_tree_cache_lookup() in gcc/lto-streamer-out.c

2012-09-04 Thread Richard Guenther

On Mon, Sep 3, 2012 at 6:10 PM, Andris Pavenis andris.pave...@iki.fi wrote:
 On 09/03/2012 03:27 PM, Richard Guenther wrote:

 On Sat, Sep 1, 2012 at 2:21 PM, Andris Pavenis andris.pave...@iki.fi
 wrote:

 uint32_t * is used as the 3rd parameter in calls to
 streamer_tree_cache_lookup()
 in 2 places in gcc/lto-streamer-out.c, while the procedure prototype has
 unsigned *. They are not guaranteed to be the same for all targets
 (I got an error when building for DJGPP)


 Ok.


 I do not have SVN write access, so I cannot commit myself

Hmm, OTOH your patch looks wrong as

@@ -1131,7 +1131,7 @@ lto_output_decl_state_refs (struct output_block *ob,
struct lto_out_decl_state *state)
 {
   unsigned i;
-  uint32_t ref;
+  unsigned ref;
   tree decl;

   /* Write reference to FUNCTION_DECL.  If there is not function,

conflicts with

  streamer_tree_cache_lookup (ob->writer_cache, decl, &ref);
  gcc_assert (ref != (unsigned)-1);
  lto_output_data_stream (out_stream, &ref, sizeof (uint32_t));

where the on-disk format expects uint32_t layout.  Thus I think
streamer_tree_cache_lookup should instead use uint32_t consistently.
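
A rough sketch of why the fixed-width type matters, assuming toy stand-ins for the lookup and the stream writer (`toy_cache_lookup` and `toy_output_ref` are hypothetical, not GCC functions): when the lookup fills a `uint32_t` and the writer emits exactly `sizeof (uint32_t)` bytes, the in-memory type and the on-disk layout agree on every host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the cache lookup: the slot is returned
   through a uint32_t pointer, matching the on-disk width.  */
static int
toy_cache_lookup (uint32_t key, uint32_t *slot)
{
  assert (slot != NULL);
  *slot = key * 2u;   /* arbitrary toy mapping */
  return 1;
}

/* Append REF to BUF at offset OFF using exactly sizeof (uint32_t) bytes
   and return the new offset.  */
static size_t
toy_output_ref (unsigned char *buf, size_t off, uint32_t ref)
{
  memcpy (buf + off, &ref, sizeof (uint32_t));
  return off + sizeof (uint32_t);
}
```

With `unsigned` in place of `uint32_t`, the pointer types differ on targets where `uint32_t` is a typedef of `unsigned long`, which is consistent with the DJGPP error reported above.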

Richard.

 Andris



 Thanks,
 Richard.

 Andris

 ChangeLog entry

 2012-09-01  Andris Pavenis andris.pave...@iki.fi

  * lto-streamer-out.c (write_global_references,
 lto_output_decl_state_refs):
  Fix parameter type in call to streamer_tree_cache_lookup

Re: combine BIT_FIELD_REF and VEC_PERM_EXPR

2012-09-04 Thread Richard Guenther

On Mon, Sep 3, 2012 at 6:12 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Mon, 3 Sep 2012, Richard Guenther wrote:

 +  if (code == VEC_PERM_EXPR)
 +{
 +  tree p, m, index, tem;
 +  unsigned nelts;
 +  m = gimple_assign_rhs3 (def_stmt);
 +  if (TREE_CODE (m) != VECTOR_CST)
 +   return false;
 +  nelts = VECTOR_CST_NELTS (m);
 +  idx = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx));
 +  idx %= 2 * nelts;
 +  if (idx < nelts)
 +   {
 + p = gimple_assign_rhs1 (def_stmt);
 +   }
 +  else
 +   {
 + p = gimple_assign_rhs2 (def_stmt);
 + idx -= nelts;
 +   }
 +  index = build_int_cst (TREE_TYPE (TREE_TYPE (m)), idx * size);
 +  tem = fold_build3 (BIT_FIELD_REF, TREE_TYPE (op), p, op1,
 index);



 This shouldn't simplify, so you can use build3 instead.



 I think that it is possible for p to be a VECTOR_CST, if the shuffle
 involves one constant and one non-constant vectors, no?


 Well, constant propagation should have handled it ...


 When it sees __builtin_shuffle(cst1,var,cst2)[cst3], CCP should basically do
 the same thing I am doing here, in the hope that the element will be part of
 cst1 instead of var?

Yes, if CCP sees

  vec_1 = VEC_PERM <cst1, var, cst2>;
  scalar_2 = BIT_FIELD_REF <vec_1, cst3>;

then if vec_1 ends up being constant it should figure out that vec_1 is constant
and that scalar_2 is constant.  Of course if we have

  vec_1 = VEC_PERM <cst1, var1, var2>;

and var1/var2 are CONSTRUCTORS with some elements constants then
it won't be able to do that and forwprop should do it.  So I suppose handling
constants in forwprop is fine (but it would be nice to double-check if in
the first example CCP figures out that vec_1 and scalar_2 are constant).

 What if builtin_shuffle takes 2 constructors, one of
 which contains at least one constant? It looks easier to handle it here and
 let the next run of CCP notice the simplified expression. Or do you mean I
 should add the new function to CCP (or even fold) instead of forwprop?
 (wouldn't be the first time CCP does more than constant propagation)

CCP should only do lattice-based constant propagation, other cases need
to be handled in forwprop.

 If you use fold_build3 you need to check that the result is in expected
 form
 (a is_gimple_invariant or an SSA_NAME).

 Now that I look at this line, I wonder if I am missing some unshare_expr
 for
 p and/or op1.


 If either is a CONSTRUCTOR and its def stmt is not removed and it survives
 into tem then yes ...


 But the integer_cst doesn't need it. Ok, thanks.

 --
 Marc Glisse

Re: [PATCH][RFC] Add -Og

2012-09-04 Thread Richard Guenther

On Mon, 3 Sep 2012, H.J. Lu wrote:

 On Mon, Sep 3, 2012 at 11:50 AM,  rguent...@suse.de wrote:
  H.J. Lu hjl.to...@gmail.com wrote:
 
 On Mon, Sep 3, 2012 at 6:28 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
  On Fri, Aug 10, 2012 at 1:30 PM, Richard Guenther rguent...@suse.de
 wrote:
 
  This adds a new optimization level, -Og, as previously discussed.
  It aims at providing fast compilation, a superior debugging
  experience and reasonable runtime performance.  Instead of making
  -O1 this optimization level this adds a new -Og.
 
  It's a first cut, highlighting that our fixed pass pipeline and
  simply enabling/disabling individual passes (but not pass copies
  for example) doesn't scale to properly differentiate between
  -Og and -O[23].  -O1 should get similar treatment, eventually
  just building on -Og but not focusing on debugging experience.
  That is, I expect that in the end we will at least have two post-IPA
  optimization pipelines.  It also means that you cannot enable
  PRE or VRP with -Og at the moment because these passes are not
  anywhere scheduled (similar to the situation with -O0).
 
  It has some funny effect on dump-file naming of the pass copies
  though, which hints at that the current setup is too static.
  For that reason the new queue comes after the old, to not confuse
  too many testcases.
 
  It also does not yet disable any of the early optimizations that
  make debugging harder (SRA comes to my mind here, as does
  switch-conversion and partial inlining).
 
  The question arises if we want to support in any reasonable
  way using profile-feedback or LTO for -O[01g], thus if we
  rather want to delay some of the early opts to after IPA
  optimizations.
 
  Not bootstrapped or fully tested, but it works for the compile
  torture.
 
  Comments welcome,
 
  No comments?  Then I'll drop this idea for 4.8.
 
 
 When I debug binutils, I have to use -O0 -g to get precise
 line and variable info.  Also glibc has to be compiled with
 -O, which makes debug a challenge.  Will -Og help binutils
 and glibc debug?
 
  I suppose so, but it is hard to tell without knowing more about the issues.
 
 
 The main issues are
 
 1. I need to know precise values for all local variables at all times.

That would certainly be a good design goal for -Og (but surely the
first cut at it won't do it).

 2. Compiler shouldn't inline a function or move lines around.

Let's split that.

2. Compiler shouldn't inline a function

well - we need to inline always_inline functions.  And I am positively
sure people want trivial C++ abstraction penalty to be removed even with
-Og, thus getter/setter methods inlined.  Let's say the compiler should
not inline a function not declared inline and the compiler should not
inline a function if that would increase code size even if it is declared
inline?

3. Compiler shouldn't move lines around.

A good goal as well, probably RTL pieces are least ready for this.

4. Generated code should be small and fast, compile-time and memory
usage should be low.  Unless either of it defeats 1. to 3.

The patch only provides a starting point and from the GIMPLE side
should be reasonably close to the goals above.

Richard.

[Patch,avr,committed] Fix PR54476

2012-09-04 Thread Georg-Johann Lay

http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190920
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190921

Applied this obvious fix for PR54476.

Johann

--


Index: config/avr/avr.c
===
--- config/avr/avr.c(revision 190914)
+++ config/avr/avr.c(working copy)
@@ -10449,7 +10449,7 @@ avr_mem_clobber (void)
 static void
 avr_expand_delay_cycles (rtx operands0)
 {
-  unsigned HOST_WIDE_INT cycles = UINTVAL (operands0);
+  unsigned HOST_WIDE_INT cycles = UINTVAL (operands0) & GET_MODE_MASK (SImode);
   unsigned HOST_WIDE_INT cycles_used;
   unsigned HOST_WIDE_INT loop_count;
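
The effect of the added mask can be sketched in plain C, assuming a 64-bit host-wide integer and a 32-bit SImode (`truncate_to_si` is a hypothetical helper, not part of avr.c):

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a host-wide value, as a stand-in for
   "UINTVAL (x) & GET_MODE_MASK (SImode)".  */
static uint64_t
truncate_to_si (uint64_t host_wide_value)
{
  const uint64_t si_mask = UINT64_C (0xffffffff);
  uint64_t result = host_wide_value & si_mask;
  assert ((result >> 32) == 0);   /* nothing above bit 31 survives */
  return result;
}
```

A 32-bit constant that was sign-extended into the wider host integer would otherwise be read as a huge cycle count; after masking it is the intended 32-bit value.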

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Richard Earnshaw

On 04/09/12 08:52, Oleg Endo wrote:
 On Mon, 2012-09-03 at 01:58 +0200, Oleg Endo wrote:
 OKOK -- I'll do it :)
 (within the next couple of days)

 
 And so I did.  Attached is an updated patch that adds the address space
 parameter to the address_cost function.  I hope that this change does
 not reset the ACKs so far:
 
 [x] target-independent bits
 [ ] alpha [ ] arm   [ ] avr [ ] bfin
 [ ] cr16  [ ] cris  [ ] epiphany[ ] i386
 [ ] ia64  [ ] iq2000[ ] lm32[ ] m32c
 [ ] m32r  [ ] mcore [ ] mep [x] microblaze
 [x] mips  [ ] mmix  [x] mn10300 [ ] pa
 [ ] rs6000[ ] rx[ ] s390[ ] score
 [x] sh[ ] sparc [ ] spu [ ] stormy16
 [ ] v850  [ ] vax   [ ] xtensa
 
 Tested with 'make all-gcc' on SH xgcc and i386 native build.
 No functional changes, except on MIPS, as requested by Richard
 Sandiford.
 
 Cheers,
 Oleg
 
 ChangeLog:
 
   * hooks.c (hook_int_rtx_mode_as_bool_0): New function.
   * hooks.h (hook_int_rtx_mode_as_bool_0): Declare it.
   * output.h (default_address_cost): Add machine_mode 
   and address space arguments.
   * target.def (address_cost): Likewise.
   * rtlanal.c (address_cost): Pass mode and address space to
   target hook.
   (default_address_cost): Add unnamed machine_mode and address 
   space arguments.
   * doc/tm.texi: Regenerate.
   * config/alpha/alpha.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/arm/arm.c (arm_address_cost): Add machine_mode 
   and address space arguments.
   * config/avr/avr.c (avr_address_cost): Likewise.
   * config/bfin/bfin.c (bfin_address_cost): Likewise.
   * config/cr16/cr16.c (cr16_address_cost): Likewise.
   * config/cris/cris.c (cris_address_cost): Likewise.
   * config/epiphany/epiphany.c (epiphany_address_cost): Likewise.
   * config/i386/i386.c (ix86_address_cost): Likewise.
   * config/ia64/ia64.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/iq2000/iq2000.c (iq2000_address_cost): Add 
 machine_mode and address space arguments.  Pass them on in
   recursive invocation.
   * config/lm32/lm32.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/m32c/m32c.c (m32c_address_cost): Add machine_mode 
   and address space arguments.
   * config/m32r/m32r.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/mcore/mcore.c (TARGET_ADDRESS_COST): Likewise.
   * config/mep/mep.c (mep_address_cost): Add machine_mode 
   and address space arguments.
   * config/microblaze/microblaze.c (microblaze_address_cost): 
   Likewise.
   * config/mips/mips.c (mips_address_cost): Likewise.
   * config/mmix/mmix.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/mn10300/mn10300.c (mn10300_address_cost): Add
 machine_mode and address space arguments.  Use GET_MODE (x) and 
   ADDR_SPACE_GENERIC in recursive invocation.
   * config/pa/pa.c (hppa_address_cost): Add machine_mode and 
   address space arguments.
   * config/rs6000/rs6000.c (rs6000_debug_address_cost): Likewise.
   (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_as_bool_0 instead 
   of hook_int_rtx_bool_0.
 * config/rx/rx.c (rx_address_cost): Add machine_mode and 
   address space arguments.
   * config/s390/s390.c (s390_address_cost): Likewise.
   * config/score/score-protos.h (score_address_cost): Likewise.
   * config/score/score.c (score_address_cost): Likewise.
   * config/sh/sh.c (sh_address_cost): Likewise.
   * config/sparc/sparc.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/spu/spu.c (TARGET_ADDRESS_COST): Likewise.
   * config/stormy16/stormy16.c (xstormy16_address_cost): Add 
   machine_mode and address space arguments.
   * config/v850/v850.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
   * config/vax/vax.c (vax_address_cost): Add machine_mode 
   and address space arguments.
   * config/xtensa/xtensa.c (TARGET_ADDRESS_COST): Use 
   hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
 
 
 

The arm bits are OK.

R.

Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Bin Cheng

  -Original Message-
  From: Richard Earnshaw
  Sent: Thursday, July 26, 2012 9:19 PM
  To: Andrew Pinski
  Cc: Bin Cheng; gcc-patches@gcc.gnu.org
  Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
  fold-const.c

  On 26/07/12 11:27, Andrew Pinski wrote:
   On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
   Hi,
   This patch removes the duplicate check on BRANCH_COST in
   fold_truth_andor.
   The BRANCH_COST condition removed is a duplicate of the default
   definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
   All current targets (mips and rs6000) that provide non-default
   definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this patch
   is therefore just a code cleanup and does not change behaviour in
   the compiler.

   I built mipsel-elf cross compiler and compared newlib/libstdc++
   compiled by the patched/original compilers.

   Is it OK?

   Just some history here on this.  The BRANCH COST check was there
   before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
   a patch which changes the MIPS definition soon but it will not be
   based on the branch cost but rather than another option.  So in the
   end it might not be redundant as it is currently.

   Thanks,
   Andrew

  You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
  (as the default currently does), so there's no loss of functionality
  from removing this currently redundant check.  However, the current
  definition is broken in that it makes it impossible to force the
  compiler to use this optimization when the branch cost is low.

Hi, is this change ok? Or do we need more discussion on it?

Thanks very much.

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 8:55 AM, Georg-Johann Lay a...@gjlay.de wrote:
 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 Gabriel Dos Reis schrieb:

 Georg-Johann Lay wrote:

 AVR-Libc comes with hand-optimized float support functions
 written in assembler.  These functions use the same naming
 conventions as libgcc.  There are situations where this
 name clash leads to performance regressions because the
 functions from libgcc are linked.  One example is the new
 fixed-point support, which converts fixed-point to/from float and
 references float/int conversion functions from within
 libgcc.

 The float implementation in libm.a has been discussed
 several times with the only result that it is very unlikely
 that the code will ever be integrated into libgcc because
 the original authors are no longer around.  And it is much
 less work to add a new configure switch than to port and
 integrate the code, given there were no license issues. One
 point against such an extension was that such a change to the
 compiler establishes a dependency between the compiler and
 AVR-Libc, but this decision was made long ago by
 accepting code that actually should have been added to
 libgcc -- but was not for whatever reason.

 This patch removes those performance regressions by removing
 the doubly implemented functions from libgcc by means of a
 new configure option --with-avrlibc.


 as I stated yesterday, I do not understand why there needs to
 be yet another configure option. The NATURAL libc for AVR
 targets is AVR-Libc.  We should not need a switch for that.


 There is also newlib that is used with avr-gcc.  I know this
 because some bugs are only triggered for newlib.  If there are
 users that report bugs if avr-gcc is built for newlib, I'd
 guess these users are actually interested in using newlib.


 I did not say there was no other libc library.  I said that the
 *natural* libc appears to be AVR-libc.

 We don't configure GCC/g++ saying --with-libstdc++.


 That's a different story because these libraries support in-tree build
 just like newlib does.  This is not true for AVR-Libc which does not support
 in-tree builds.

 I agree that AVR-Libc is the most common libc implementation used
 with avr-gcc and it has many advantages over other libc implementations
 (except that it does not support in-tree builds).


 I think the in-tree builds thing is a red herring.


 I don't think so.  If there was an in-tree build gcc could detect
 itself whether or not AVR-Libc is present or not.  With the
 current setup the user has to specify that -- in whatever
 direction: that libc is there or that libc is not there depending
 on whatever is default.

You can do a link check on whether -lc provides those functions
and skip those that overlap with libgcc.

Richard.


 However, a --with-avrlibc is not needed to *get* the support from
 AVR-Libc, it's just used to fix some problems that arise in certain
 use cases.


 so, let's make it the default -- see below.

 Besides that, the proposed arrangement does not affect the configuration
 if the switch is *not* specified, thus the patch is
 appropriate to be backported.

 My intention is to backport it to 4.7 as indicated by the
 milestone, but if the change were unconditional I don't think the
 change would be appropriate for a backport.


 It is perfectly reasonable and OK to make the backport more guarded
 (e.g. by the configure option) than on mainline.

 And after all it's just a *configure* option that some distribution
  maintainers can set if they want to.


 yes, but it is still one more configure option.


 hmm.  The configure machinery was not changed, it automatically sets
 with_foo if --with-foo is specified.  It's just about who is to
 be blamed if he does not read the release notes ;-)

 Whatever, I think we two are stuck now and enough arguments passed
 back and forth.  Let the port maintainers decide.

 And Jörg, would you check the excludes list in t-avrlibc?

 Johann


 The tool chain user is not bothered at all by the new option and
 won't even notice it. From the user perspective it's just as if
 some optimizations had been added to the tool chain.

 What do you propose?

 Use the setting by default and support --with-avrlibc=no if the
 user wants full libgcc support and nothing removed from it?


 Yes. Let's make the sane behaviour the default.

 -- Gaby

Ping: [PATCH GCC/ARM] Fix problem that hardreg_cprop opportunities are missed on thumb1

2012-09-04 Thread Bin Cheng

 Hi,
 For thumb1, arm-gcc intentionally rewrites move insns into subtract of ZERO
 in the peephole2 pass, then executes
 pass_if_after_reload/pass_regrename/pass_cprop_hardreg sequentially.
 
 In this scenario, copy propagation opportunities are missed because:
   1. the move insns are re-written.
   2. pass_cprop_hardreg currently doesn't notice the subtract of ZERO.
 
 This patch fixes the problem and the logic is:
   1. notice the plus/subtract of ZERO in pass_cprop_hardreg.
   2. if the last insn providing information about condition codes is in
  the form of dest_reg = src_reg - 0, record the src_reg in the newly added
  field thumb1_cc_op0_src of structure machine_function.
   3. in pattern cbranchsi4_insn, check thumb1_cc_op0_src along with
 thumb1_cc_op0 to save one comparison insn.
 
 I measured the patch on CSiBE; about 600 bytes are saved for both O2 and
 Os on cortex-m0 without any regression.
 
 I also tested the patch on
 arm-none-eabi+cortex-m0/arm-none-eabi+cortex-m3/i686-pc-linux and no
 regressions introduced.
 
 So is it OK?
 
 Thanks
 
 2012-08-13  Bin Cheng  bin.ch...@arm.com
 
   * regcprop.c (copyprop_hardreg_forward_1): Notice copies in the form
   of subtract of ZERO.
   * config/arm/arm.h (thumb1_cc_op0_src): New field.
   * config/arm/arm.c (thumb1_final_prescan_insn): Record
   thumb1_cc_op0_src.
   * config/arm/arm.md (cbranchsi4_insn): Check thumb1_cc_op0_src along
   with thumb1_cc_op0.

Ping?

Hi Ramana, could you help me review this patch?
Hi Eric, Richard, could you help me review the change in regcprop.c?

Thanks very much

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 11:56 AM, Bin Cheng bin.ch...@arm.com wrote:
  -Original Message-
  From: Richard Earnshaw
  Sent: Thursday, July 26, 2012 9:19 PM
  To: Andrew Pinski
  Cc: Bin Cheng; gcc-patches@gcc.gnu.org
  Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
  fold-const.c

  On 26/07/12 11:27, Andrew Pinski wrote:
   On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
   Hi,
   This patch removes the duplicate check on BRANCH_COST in
   fold_truth_andor.
   The BRANCH_COST condition removed is a duplicate of the default
   definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
   All current targets (mips and rs6000) that provide non-default
   definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this patch
   is therefore just a code cleanup and does not change behaviour in
   the compiler.

   I built mipsel-elf cross compiler and compared newlib/libstdc++
   compiled by the patched/original compilers.

   Is it OK?

   Just some history here on this.  The BRANCH COST check was there
   before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
   a patch which changes the MIPS definition soon but it will not be
   based on the branch cost but rather than another option.  So in the
   end it might not be redundant as it is currently.

   Thanks,
   Andrew

  You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
  (as the default currently does), so there's no loss of functionality
  from removing this currently redundant check.  However, the current
  definition is broken in that it makes it impossible to force the
  compiler to use this optimization when the branch cost is low.

 Hi, is this change ok? Or do we need more discussion on it?

It's not ok (I btw fail to see the patch in this thread).  The current
way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
be changed to always match the pattern

  LOGICAL_OP_NON_SHORT_CIRCUIT
  && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                   false) >= 2)

and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
defined in defaults.h (and the docs updated).
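
A small sketch of that arrangement, using hypothetical `TOY_` names in place of the real macros (in GCC, BRANCH_COST is a target macro taking a speed flag and a predictability flag; here the cost is just passed in as a number):

```c
#include <assert.h>

/* Proposed default: the target switch is simply "on", and the branch
   cost test lives at the use site instead of inside the default.  */
#ifndef TOY_LOGICAL_OP_NON_SHORT_CIRCUIT
#define TOY_LOGICAL_OP_NON_SHORT_CIRCUIT 1
#endif

/* Use-site pattern: both the target switch and the branch cost
   threshold must hold before the non-short-circuit form is used.  */
static int
toy_use_non_short_circuit (int branch_cost)
{
  assert (branch_cost >= 0);
  return TOY_LOGICAL_OP_NON_SHORT_CIRCUIT && branch_cost >= 2;
}
```

A target that wants to disable the transformation defines the macro to 0; a target that only wants to tune it adjusts its branch cost, and the two knobs no longer overlap.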

Richard.

 Thanks very much.

Re: combine vec_perm_expr with constructor

2012-09-04 Thread Richard Guenther

On Mon, Sep 3, 2012 at 5:50 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Mon, 3 Sep 2012, Richard Guenther wrote:

 On Mon, Sep 3, 2012 at 4:00 PM, Marc Glisse marc.gli...@inria.fr wrote:

 On Mon, 3 Sep 2012, Richard Guenther wrote:

 You shouldn't need the VECTOR_CST handling - constant propagation should
 already ensure properly simplified code here (and is the more canonical
 place
 to handle this).



 IIRC, I added VECTOR_CST because of mixed constructor/vector_cst shuffles
 (and because it wasn't too hard). If I remove it (I can), I guess some of
 the testcases won't work anymore.


 I see.  If you still have a testcase can you look if CCP does not do
 something it should?


 I think CCP is working fine, the fold_ternary patch you approved today tests
 some of that (without that patch, sometimes ccp1 does half the work and fre1
 finishes it, and since forwprop1 is before fre1, I hit that case there). Is
 there a particular scenario you have in mind that might not be handled?

No.  In theory it should be handled just fine via gimple_fold_stmt_to_constant_1
dispatching to fold_ternary.

Richard.

 Here I was concerned with:
 x={a,b}; // constructor
 y={18,42}; // vector_cst
 m={0,3};
 __builtin_shuffle(x,y,m) // should be {a,42}
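
 The element selection in this example can be sketched in plain C, assuming int elements and a hypothetical helper `perm_element`; mask values are reduced modulo twice the element count, as in the patch quoted in the BIT_FIELD_REF thread:

```c
#include <assert.h>
#include <stddef.h>

/* Select element I of a two-input permute.  Mask values are taken
   modulo 2 * NELTS; values below NELTS index V0, the rest index V1.  */
static int
perm_element (const int *v0, const int *v1, const unsigned *mask,
              size_t nelts, size_t i)
{
  assert (nelts > 0 && i < nelts);
  size_t idx = mask[i] % (2 * nelts);
  return idx < nelts ? v0[idx] : v1[idx - nelts];
}
```

 With x = {a, b}, y = {18, 42} and m = {0, 3}, element 0 comes from x[0] and element 1 from y[1], giving {a, 42}.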

 --
 Marc Glisse

Re: Ping: [PATCH] Enable bbro for -Os

2012-09-04 Thread Richard Guenther

On Wed, Aug 29, 2012 at 10:42 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
 -Original Message-
 From: Steven Bosscher [mailto:stevenb@gmail.com]
 Sent: Friday, August 24, 2012 8:17 PM
 To: Zhenqiang Chen
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: Ping: [PATCH] Enable bbro for -Os

 On Wed, Aug 22, 2012 at 8:49 AM, Zhenqiang Chen
 zhenqiang.c...@arm.com wrote:
  The patch is to enable bbro for -Os. When optimizing for size, it
  * avoid duplicating block.
  * keep its original order if there is no chance to fall through.
  * ignore edge frequency and probability.
  * handle predecessor first if its index is smaller to break long trace.

 You do this by inserting the index as a key. I don't fully understand this
 change. You're assuming that a block with a lower index has a lower pre-
 order number in the CFG's DFS spanning tree, IIUC (i.e. the blocks are
 numbered sequentially)? I'm not sure that's always true. I think you
 should
 add an explanation for this heuristic.

 Thank you for the comments.

 cleanup_cfg is called at the end of cfg_layout_initialize, before
 reorder_basic_blocks. cleanup_cfg does lots of optimization on the cfg and
 renumbers the basic blocks. After cleanup_cfg, the blocks are roughly
 numbered sequentially.

Well, sequentially in their current order which is not in any way
flow-controlled.

 The heuristic is based on the result of cleanup_cfg. It just wants to keep the
 order of cleanup_cfg, since logs show we will have code size improvement (by
 cleanup_cfg) even if we do not call reorder_basic_blocks. Using the index as
 a key is a simple way to keep the original order.

That's true.

 Comments are added in the updated patch.

  * only connect Trace n with Trace n + 1 to reduce long jump.
 ...
* bb-reorder.c (connect_better_edge_p): New added.
(find_traces_1_round): When optimizing for size, ignore edge
  frequency
and probability, and handle all in one round.
(bb_to_key): Use bb->index as key for size.
(better_edge_p): The smaller bb index is better for size.
(connect_traces): Connect block n with block n + 1;
connect trace m with trace m + 1 if falling through.
(copy_bb_p): Avoid duplicating blocks.
(gate_handle_reorder_blocks): Enable bbro when optimizing for
 -Os.

 This probably fixes PR54364.

 Try the case in PR54364, the patch does reduce several jmp.

  @@ -1169,6 +1272,10 @@ copy_bb_p (const_basic_block bb, int
 code_may_grow)
 int max_size = uncond_jump_length;
 rtx insn;

  +  /* Avoid duplicating blocks for size.  */
  +  if (optimize_function_for_size_p (cfun))
  +    return false;
  +
     if (!bb->frequency)
       return false;

 This shouldn't be necessary, due to the CODE_MAY_GROW argument, and
 this change should result in a code size increase because jumps to
 conditional
 jumps aren't removed anymore. What did you make this change for, do you
 have a test case where code size increases if you allow copy_bb_p to
 return
 true?

 Thanks. It is not necessary.

 Here is the updated ChangeLog. The updated patch is attached.

 ChangeLog
 2012-08-29  Zhenqiang Chen zhenqiang.c...@arm.com

 PR middle-end/54364
 * bb-reorder.c (connect_better_edge_p): New added.
 (find_traces_1_round): When optimizing for size, ignore edge
 frequency
 and probability, and handle all in one round.
 (bb_to_key): Use bb->index as key for size.
 (better_edge_p): The smaller bb index is better for size.
 (connect_traces): Connect block n with block n + 1;
 connect trace m with trace m + 1 if falling through.
 (gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.

@@ -530,10 +544,11 @@ find_traces_1_round (int branch_th, int exec_th,
gcov_type count_th,
}

  /* Edge that cannot be fallthru or improbable or infrequent
-successor (i.e. it is unsuitable successor).  */
+successor (i.e. it is unsuitable successor).
+For size, ignore the frequency and probability.  */
  if (!(e->flags & EDGE_CAN_FALLTHRU) || (e->flags & EDGE_COMPLEX)
- || prob < branch_th || EDGE_FREQUENCY (e) < exec_th
- || e->count < count_th)
+ || (prob < branch_th || EDGE_FREQUENCY (e) < exec_th
+ || e->count < count_th) && !for_size)
continue;

why that change?  It seems you do re-orderings that would not be done with -Os
even though your goal was to preserve the original ordering.
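
For reference, the patched condition can be read as the following predicate. This is a sketch under the assumption that the threshold comparisons are "less than"; the function and parameter names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if an edge should be skipped as a successor candidate.
   Edges that cannot fall through, or that are complex, are always
   unsuitable.  The probability, frequency and count thresholds apply
   only when not optimizing for size.  */
static bool
toy_edge_unsuitable (bool can_fallthru, bool complex,
                     int prob, int freq, long count,
                     int branch_th, int exec_th, long count_th,
                     bool for_size)
{
  assert (prob >= 0 && freq >= 0 && count >= 0);
  if (!can_fallthru || complex)
    return true;
  return (prob < branch_th || freq < exec_th || count < count_th)
         && !for_size;
}
```

The consequence questioned above is visible here: with for_size set, an improbable or infrequent edge is no longer rejected, so it can still be chosen as the next block in the trace.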

+ /* Wait for the predecessors.  */
+ if ((e == best_edge) && for_size
+     && (EDGE_COUNT (best_edge->dest->succs) > 1
+         || EDGE_COUNT (best_edge->dest->preds) > 1))
+   {
+     best_edge = NULL;
+   }

I don't understand this (well, I'm not very familiar with bb-reorder),
doesn't that
mean you rather want to push this block to the next round?

Overall I

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Paolo Bonzini

Il 04/09/2012 09:52, Oleg Endo ha scritto:
 [x] target-independent bits
 [ ] alpha [ ] arm   [ ] avr [ ] bfin
 [ ] cr16  [ ] cris  [ ] epiphany[ ] i386
 [ ] ia64  [ ] iq2000[ ] lm32[ ] m32c
 [ ] m32r  [ ] mcore [ ] mep [x] microblaze
 [x] mips  [ ] mmix  [x] mn10300 [ ] pa
 [ ] rs6000[ ] rx[ ] s390[ ] score
 [x] sh[ ] sparc [ ] spu [ ] stormy16
 [ ] v850  [ ] vax   [ ] xtensa
 
 Tested with 'make all-gcc' on SH xgcc and i386 native build.
 No functional changes, except on MIPS, as requested by Richard
 Sandiford.

I think you only need explicit approval for mn10300.  All other changes
are trivial.

 +hook_int_rtx_mode_as_bool_0 (rtx, enum machine_mode, addr_space_t, bool)

So we're using C++ already?  Or do we want ATTRIBUTE_UNUSED here?
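
For context: C++ allows parameter names to be omitted in a function definition, while C90 and C99 require them, so a C build would need named parameters marked as unused. A minimal sketch of the C form, with hypothetical `toy_` types standing in for rtx, machine_mode and addr_space_t, and `TOY_UNUSED` standing in for ATTRIBUTE_UNUSED:

```c
#include <assert.h>
#include <stdbool.h>

/* GCC/Clang extension; plays the role of ATTRIBUTE_UNUSED here.  */
#define TOY_UNUSED __attribute__ ((unused))

/* Hypothetical opaque stand-ins for the real GCC types.  */
typedef struct toy_rtx_def *toy_rtx;
typedef int toy_mode;
typedef unsigned char toy_addr_space;

/* Generic hook returning 0 regardless of its arguments.  */
static int
toy_hook_int_rtx_mode_as_bool_0 (toy_rtx x TOY_UNUSED,
                                 toy_mode mode TOY_UNUSED,
                                 toy_addr_space as TOY_UNUSED,
                                 bool speed TOY_UNUSED)
{
  return 0;
}
```

Either spelling yields the same hook; the choice only depends on which language the file is compiled as.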

Paolo

Re: [PATCH][RFC] Add -Og

2012-09-04 Thread Richard Earnshaw

On 04/09/12 10:45, Richard Guenther wrote:
 On Mon, 3 Sep 2012, H.J. Lu wrote:
 
 On Mon, Sep 3, 2012 at 11:50 AM,  rguent...@suse.de wrote:
 H.J. Lu hjl.to...@gmail.com wrote:

 On Mon, Sep 3, 2012 at 6:28 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Fri, Aug 10, 2012 at 1:30 PM, Richard Guenther rguent...@suse.de
 wrote:

 This adds a new optimization level, -Og, as previously discussed.
 It aims at providing fast compilation, a superior debugging
 experience and reasonable runtime performance.  Instead of making
 -O1 this optimization level this adds a new -Og.

 It's a first cut, highlighting that our fixed pass pipeline and
 simply enabling/disabling individual passes (but not pass copies
 for example) doesn't scale to properly differentiate between
 -Og and -O[23].  -O1 should get similar treatment, eventually
 just building on -Og but not focusing on debugging experience.
 That is, I expect that in the end we will at least have two post-IPA
 optimization pipelines.  It also means that you cannot enable
 PRE or VRP with -Og at the moment because these passes are not
 anywhere scheduled (similar to the situation with -O0).

 It has some funny effect on dump-file naming of the pass copies
 though, which hints at that the current setup is too static.
 For that reason the new queue comes after the old, to not confuse
 too many testcases.

 It also does not yet disable any of the early optimizations that
 make debugging harder (SRA comes to my mind here, as does
 switch-conversion and partial inlining).

 The question arises if we want to support in any reasonable
 way using profile-feedback or LTO for -O[01g], thus if we
 rather want to delay some of the early opts to after IPA
 optimizations.

 Not bootstrapped or fully tested, but it works for the compile
 torture.

 Comments welcome,

 No comments?  Then I'll drop this idea for 4.8.


 When I debug binutils, I have to use -O0 -g to get precise
 line and variable info.  Also glibc has to be compiled with
 -O, which makes debug a challenge.  Will -Og help binutils
 and glibc debug?

 I suppose so, but it is hard to tell without knowing more about the issues.


 The main issues are

 1. I need to know precise values for all local variables at all times.
 
 That would certainly be a good design goal for -Og (but surely the
 first cut at it won't do it).
 
 2. Compiler shouldn't inline a function or move lines around.
 
 Let's split that.
 
 2. Compiler shouldn't inline a function
 
 well - we need to inline always_inline functions.  And I am positively
 sure people want trivial C++ abstraction penalty to be removed even with
 -Og, thus getter/setter methods inlined.  Let's say the compiler should
 not inline a function not declared inline and the compiler should not
 inline a function if that would increase code size even if it is declared
 inline?
 
 3. Compiler shouldn't move lines around.
 
 A good goal as well, probably RTL pieces are least ready for this.
 
 4. Generated code should be small and fast, compile-time and memory
 usage should be low.  Unless either of it defeats 1. to 3.
 
 The patch only provides a starting point and from the GIMPLE side
 should be reasonably close to the goals above.
 
 Richard.
 

I'd add

5. User variables don't have to live in memory (or in any single place),
but there should only be one 'live' location at any one time.  Changing
the value of a variable at a sequence point/statement/line boundary
(pick a definition and document it) should affect all subsequent uses of
that variable.  Values assigned to variables remain available until the
declaration goes out of scope.

R.

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Richard Earnshaw

On 04/09/12 11:11, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 11:56 AM, Bin Cheng bin.ch...@arm.com wrote:
 -Original Message-
 From: Richard Earnshaw
 Sent: Thursday, July 26, 2012 9:19 PM
 To: Andrew Pinski
 Cc: Bin Cheng; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
 fold-const.c

 On 26/07/12 11:27, Andrew Pinski wrote:
 On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 This patch removes the duplicate check on BRANCH_COST in
 fold_truth_andor.
 The BRANCH_COST condition removed is a duplicate of the default
 definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
 All current targets (mips and rs6000) that provide non-default
 definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this
 patch is just a code cleanup and does not change behaviour in the
 compiler.

 I built mipsel-elf cross compiler and compared newlib/libstdc++
 compiled by the patched/original compilers.

 Is it OK?

 Just some history here on this.  The BRANCH COST check was there
 before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
 a patch which changes the MIPS definition soon but it will not be
 based on the branch cost but rather on another option.  So in the
 end it might not be redundant as it is currently.

 Thanks,
 Andrew

 You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
 (as the default currently does), so there's no loss of functionality
 from removing this currently redundant check.  However, the current
 definition is broken in that it makes it impossible to force the
 compiler to use this optimization when the branch cost is low.

 Hi, is this change ok? Or we need more discussion on it?

 It's not ok (I btw fail to see the patch in this thread).  The current
 way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
 be changed to always match the pattern

   LOGICAL_OP_NON_SHORT_CIRCUIT
    && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                     false) >= 2)

 and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
 defined in defaults.h (and the docs updated).

That's not going to work for modern ARM cores.  We want to set
BRANCH_COST to 1 but still have it generate the non-short-circuit code
(because conditional compares are really cheap).

R.

 Richard.

 Thanks very much.
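
For readers following the LOGICAL_OP_NON_SHORT_CIRCUIT discussion above, here is a minimal sketch of the two code shapes at stake. The function names are hypothetical and do not come from GCC or from the patch. It only assumes that both operands are side-effect free, which is the case in which evaluating both of them unconditionally is safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Short-circuit shape: the second comparison is evaluated only when the
   first one holds, which typically needs a conditional branch.  */
bool in_range_short_circuit (int x, int lo, int hi)
{
  return x >= lo && x <= hi;
}

/* Non-short-circuit shape: both comparisons are always evaluated and
   their 0/1 results are combined with a bitwise AND.  This is only
   equivalent because neither operand has side effects.  */
bool in_range_non_short_circuit (int x, int lo, int hi)
{
  return (x >= lo) & (x <= hi);
}

/* Hypothetical self-check: both shapes agree on a few inputs.  */
void check_shapes_agree (void)
{
  assert (in_range_short_circuit (5, 1, 10) == in_range_non_short_circuit (5, 1, 10));
  assert (in_range_short_circuit (0, 1, 10) == in_range_non_short_circuit (0, 1, 10));
}
```

On a target where conditional compares are cheap, the second shape can be attractive even when branches are cheap too, which is the ARM case raised above.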

Re: [wwwdocs] PATCH for Re: [PATCH] Remove matrix-reorg

2012-09-04 Thread Gerald Pfeifer

On Mon, 3 Sep 2012, Richard Guenther wrote:
 I'd not mention the command-line flags.

I was thinking to point out what to not use any longer, in case.
Doesn't that make sense for the release notes?

 They were not working correctly and they did not work with LTO
 which made them useless apart from for single-TU programs.

How about the following?

Gerald

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.26
diff -u -3 -p -r1.26 changes.html
--- changes.html2 Sep 2012 15:56:24 -   1.26
+++ changes.html4 Sep 2012 10:59:32 -
@@ -38,7 +38,7 @@ explicit use of vector types may be inco
 built with older versions of GCC.  Auto-vectorized code is not affected
 by this change.</p>
 
-<h2>General Optimizer Improvements</h2>
+<h2>General Optimizer Improvements (and Changes)</h2>
 
   <ul>
 <li>A new option <code>-ftree-partial-pre</code> was added to control
@@ -46,6 +46,11 @@ by this change.</p>
   This option is enabled by default at the <code>-O3</code> optimization
   level, and it makes PRE more aggressive.
 </li>
+<li>The struct reorg and matrix reorg optimizations (command-line
+options <code>-fipa-struct-reorg</code> and
+<code>-fipa-matrix-reorg</code>) have been removed.  They did not
+work correctly nor with link-time optimization (LTO), hence were only
+applicable to programs consisting of a single translation unit.</li>
   </ul>

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Georg-Johann Lay

Richard Guenther wrote:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 AVR-Libc comes with hand-optimized float support functions
 written in assembler.  These functions use the same naming
 conventions as libgcc.  There are situations where this
 name clash leads to performance regressions because the
 functions from libgcc are linked.  One example is the new
 fixed-point support: its functions convert fixed-point to/from
 float and reference float/int conversion functions from within
 libgcc.

 The float implementation in libm.a has been discussed
 several times with the only result that it is very unlikely
 that the code will ever be integrated into libgcc because
 the original authors are no longer around.  And it is much
 less work to add a new configure switch than to port and
 integrate the code, given there were no license issues. One
 point against such an extension was that such a change to the
 compiler establishes a dependency between the compiler and
 AVR-Libc, but this decision was made long ago by
 accepting code that actually should have been added to
 libgcc -- but was not for whatever reason.

 This patch removes these performance regressions by removing
 the doubly implemented functions from libgcc by means of a
 new configure option --with-avrlibc.

 as I stated yesterday, I do not understand why there needs to
 be yet another configure option. The NATURAL libc for AVR
 targets is AVR-libc.  We should not need a switch for that.

 There is also newlib that is used with avr-gcc.  I know this
 because some bugs are only triggered for newlib.  If there are
 users that report bugs if avr-gcc is built for newlib, I'd
 guess these users are actually interested in using newlib.

 I did not say there was no other libc library.  I said that the
 *natural* libc appears to be AVR-libc.

 We don't configure GCC/g++ saying --with-libstdc++.

 That's a different story because these libraries support in-tree
 builds just like newlib does.  This is not true for AVR-Libc, which
 does not support in-tree builds.

 I agree that AVR-Libc is the most common libc implementation used
 with avr-gcc and it has many advantages over other libc implementations
 (except that it does not support in-tree builds).

 I think the in-tree builds thing is a red herring.

 I don't think so.  If there was an in-tree build gcc could detect
 itself whether or not AVR-Libc is present or not.  With the
 current setup the user has to specify that -- in whatever
 direction: that libc is there or that libc is not there depending
 on whatever is default.
 
 You can do a link check on whether -lc provides those functions
 and skip those that overlap with libgcc.

Can you explain this?  A typical build of avr tools goes like

1) configure, build and install binutils
2) configure, build and install the compiler
3) configure, build and install AVR-Libc

so that in step 2 no checking is possible because there is no -lc yet.
Or do you mean a check at run time (of the compiler)?


Johann

Re: [PATCH][RFC] Add -Og

2012-09-04 Thread Matthew Gretton-Dann

On 4 September 2012 10:45, Richard Guenther rguent...@suse.de wrote:
 On Mon, 3 Sep 2012, H.J. Lu wrote:

 On Mon, Sep 3, 2012 at 11:50 AM,  rguent...@suse.de wrote:
  H.J. Lu hjl.to...@gmail.com wrote:
 
 On Mon, Sep 3, 2012 at 6:28 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
  On Fri, Aug 10, 2012 at 1:30 PM, Richard Guenther rguent...@suse.de
 wrote:
 
  This adds a new optimization level, -Og, as previously discussed.
  It aims at providing fast compilation, a superior debugging
  experience and reasonable runtime performance.  Instead of making
  -O1 this optimization level this adds a new -Og.
 
  It's a first cut, highlighting that our fixed pass pipeline and
  simply enabling/disabling individual passes (but not pass copies
  for example) doesn't scale to properly differentiate between
  -Og and -O[23].  -O1 should get similar treatment, eventually
  just building on -Og but not focusing on debugging experience.
  That is, I expect that in the end we will at least have two post-IPA
  optimization pipelines.  It also means that you cannot enable
  PRE or VRP with -Og at the moment because these passes are not
  anywhere scheduled (similar to the situation with -O0).
 
  It has some funny effect on dump-file naming of the pass copies
  though, which hints at that the current setup is too static.
  For that reason the new queue comes after the old, to not confuse
  too many testcases.
 
  It also does not yet disable any of the early optimizations that
  make debugging harder (SRA comes to my mind here, as does
  switch-conversion and partial inlining).
 
  The question arises if we want to support in any reasonable
  way using profile-feedback or LTO for -O[01g], thus if we
  rather want to delay some of the early opts to after IPA
  optimizations.
 
  Not bootstrapped or fully tested, but it works for the compile
  torture.
 
  Comments welcome,
 
  No comments?  Then I'll drop this idea for 4.8.
 
 
 When I debug binutils, I have to use -O0 -g to get precise
 line and variable info.  Also glibc has to be compiled with
 -O, which makes debug a challenge.  Will -Og help binutils
 and glibc debug?
 
  I suppose so, but it is hard to tell without knowing more about the issues.
 

 The main issues are

 1. I need to know precise values for all local variables at all times.

 That would certainly be a good design goal for -Og (but surely the
 first cut at it won't do it).

 2. Compiler shouldn't inline a function or move lines around.

 Let's split that.

 2. Compiler shouldn't inline a function

 well - we need to inline always_inline functions.  And I am positively
 sure people want trivial C++ abstraction penalty to be removed even with
 -Og, thus getter/setter methods inlined.  Let's say the compiler should
 not inline a function not declared inline and the compiler should not
 inline a function if that would increase code size even if it is declared
 inline?

I don't see a problem with inlining functions under -Og - under a
couple of assumptions:
 * The debug table format can correctly mark inlined functions (DWARF
can - I don't know about other formats).
 * The compiler is executing sequence points in order - and so the
function being inlined doesn't 'spill out' into the function it is
inlined into.  See below for further comments.

This should provide enough information to the debugger to allow it to
maintain the illusion that an inlined function is a separate function,
and enable a user to set breakpoints on all calls to the function.

 3. Compiler shouldn't move lines around.

 A good goal as well, probably RTL pieces are least ready for this.

I would change this to say something like (using C language terms):
The compiler should provide enough information to allow breakpoints to
be set at each sequence point, and that the state of the machine is
such that everything before that sequence point will have been
completed and that nothing after that sequence point will have been
started.

It is probably also possible to argue that there is a case for having
points between sequence points where we say the code would be in a
good state (let's call them observation points).  So for instance we
might want to say that in:

 int x, a, b, c;
 ...
 x = a + b * c;

If we just say we only promise a known state at sequence points then
the compiler is free to use some form of multiply-accumulate
instruction here.  But a user may want to see the multiply followed by
addition split out.  So we could define the observation points to be
on the *, +, and =.
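
A minimal sketch of that idea, using hypothetical function names (`fused` and `split`) that are not part of any patch: the first function is the statement as written, and the second spells out the three proposed observation points as separate steps a debugger could stop on.

```c
#include <assert.h>

/* The statement as written: between sequence points the compiler is
   free to combine the multiply and the add, for example into a
   multiply-accumulate instruction.  */
int fused (int a, int b, int c)
{
  int x;
  x = a + b * c;
  return x;
}

/* The same computation with each proposed observation point made an
   explicit step.  */
int split (int a, int b, int c)
{
  int product = b * c;    /* observation point on '*' */
  int sum = a + product;  /* observation point on '+' */
  int x = sum;            /* observation point on '=' */
  return x;
}

/* Hypothetical self-check: both forms compute the same value.  */
void check_equivalent (void)
{
  assert (fused (2, 3, 4) == split (2, 3, 4));
}
```

Both functions return the same value for the same inputs; the difference is only in which intermediate states a user could expect to observe.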

 4. Generated code should be small and fast, compile-time and memory
 usage should be low.  Unless either of it defeats 1. to 3.

 The patch only provides a starting point and from the GIMPLE side
 should be reasonably close to the goals above.

 Richard.

Thanks,

Matt

-- 
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-d...@linaro.org

Re: [PATCH 3/3] Compute predicates for phi node results in ipa-inline-analysis.c

2012-09-04 Thread Martin Jambor

On Tue, Sep 04, 2012 at 11:27:47AM +0200, Richard Guenther wrote:
 On Mon, Sep 3, 2012 at 5:52 PM, Jan Hubicka hubi...@ucw.cz wrote:
  On Fri, Aug 31, 2012 at 7:24 PM, Martin Jambor mjam...@suse.cz wrote:
   Hi,
  
   On Thu, Aug 30, 2012 at 05:11:35PM +0200, Martin Jambor wrote:
   this is a new version of the patch which makes ipa analysis produce
   predicates for PHI node results, at least at the bottom of the
   simplest diamond and semi-diamond CFG subgraphs.  This time I also
   analyze the conditions again rather than extracting information from
   CFG edges, which means I can reason about substantially more PHI
   nodes.
  
   This patch makes us produce loop bounds hint for the pr48636.f90
   testcase.
  
   Bootstrapped and tested on x86_64-linux.  OK for trunk?
  
   Thanks,
  
   Martin
  
  
   2012-08-29  Martin Jambor  mjam...@suse.cz
  
 * ipa-inline-analysis.c (phi_result_unknown_predicate): New 
   function.
 (predicate_for_phi_result): Likewise.
 (estimate_function_body_sizes): Use the above two functions.
  
  
   This patch, on top of the one doing loop calculations almost always,
   introduces a number of testsuite failures which somehow I had not
   caught during my testing.  The problem is that either
   calculate_dominance_info or loop_optimizer_init introduce new SSA
   names for which there is no index in nonconstant_names which is
   allocated before the dominance and loop computations.  I'm currently
   bootstrapping and testing the following fix which simply allocates the
   vector after doing the two computations.  If it passes I will commit
   it straight away so that the regression is fixed before I leave for
   the weekend, I hope it's obvious enough for that.
  
   On the other hand, it would really be better if we did not change
   function bodies during IPA summary generation phase...
 
  Um ... we shouldn't do this.  Can you track down where it happens?  I
  suppose it might come from CFG manipulations loop_optimizer_init
  performs when not passing AVOID_CFG_MODIFICATIONS.
 
  I bet it comes from loop normalization :) (i.e. loop closed form
  or preheader construction both need new SSA names.)
  I think it would be best to make the pass manager handle this and make
  loop normalization happen once before all SSA IPA analysis
 
 And compute loops as well.

OK, this is now PR 54477 so that we don't forget.

Thanks,

Martin

Re: [wwwdocs] PATCH for Re: [PATCH] Remove matrix-reorg

2012-09-04 Thread Richard Guenther

On Tue, 4 Sep 2012, Gerald Pfeifer wrote:

 On Mon, 3 Sep 2012, Richard Guenther wrote:
  I'd not mention the command-line flags.
 
 I was thinking to point out what to not use any longer, in case.
 Doesn't that make sense for the release notes?
 
  They were not working correctly and they did not work with LTO
  which made them useless apart from for single-TU programs.
 
 How about the following?

Looks good to me.

Thanks,
Richard.

 Gerald
 
 Index: changes.html
 ===
 RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
 retrieving revision 1.26
 diff -u -3 -p -r1.26 changes.html
 --- changes.html  2 Sep 2012 15:56:24 -   1.26
 +++ changes.html  4 Sep 2012 10:59:32 -
 @@ -38,7 +38,7 @@ explicit use of vector types may be inco
  built with older versions of GCC.  Auto-vectorized code is not affected
  by this change.</p>
  
 -<h2>General Optimizer Improvements</h2>
 +<h2>General Optimizer Improvements (and Changes)</h2>
  
    <ul>
  <li>A new option <code>-ftree-partial-pre</code> was added to control
 @@ -46,6 +46,11 @@ by this change.</p>
    This option is enabled by default at the <code>-O3</code> optimization
    level, and it makes PRE more aggressive.
  </li>
 +<li>The struct reorg and matrix reorg optimizations (command-line
 +options <code>-fipa-struct-reorg</code> and
 +<code>-fipa-matrix-reorg</code>) have been removed.  They did not
 +work correctly nor with link-time optimization (LTO), hence were only
 +applicable to programs consisting of a single translation unit.</li>
    </ul>
  
  
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

Re: [PATCH][RFC] Add -Og

2012-09-04 Thread H.J. Lu

On Tue, Sep 4, 2012 at 4:16 AM, Matthew Gretton-Dann
matthew.gretton-d...@linaro.org wrote:
 On 4 September 2012 10:45, Richard Guenther rguent...@suse.de wrote:
 On Mon, 3 Sep 2012, H.J. Lu wrote:

 On Mon, Sep 3, 2012 at 11:50 AM,  rguent...@suse.de wrote:
  H.J. Lu hjl.to...@gmail.com wrote:
 
 On Mon, Sep 3, 2012 at 6:28 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
  On Fri, Aug 10, 2012 at 1:30 PM, Richard Guenther rguent...@suse.de
 wrote:
 
  This adds a new optimization level, -Og, as previously discussed.
  It aims at providing fast compilation, a superior debugging
  experience and reasonable runtime performance.  Instead of making
  -O1 this optimization level this adds a new -Og.
 
  It's a first cut, highlighting that our fixed pass pipeline and
  simply enabling/disabling individual passes (but not pass copies
  for example) doesn't scale to properly differentiate between
  -Og and -O[23].  -O1 should get similar treatment, eventually
  just building on -Og but not focusing on debugging experience.
  That is, I expect that in the end we will at least have two post-IPA
  optimization pipelines.  It also means that you cannot enable
  PRE or VRP with -Og at the moment because these passes are not
  anywhere scheduled (similar to the situation with -O0).
 
  It has some funny effect on dump-file naming of the pass copies
  though, which hints at that the current setup is too static.
  For that reason the new queue comes after the old, to not confuse
  too many testcases.
 
  It also does not yet disable any of the early optimizations that
  make debugging harder (SRA comes to my mind here, as does
  switch-conversion and partial inlining).
 
  The question arises if we want to support in any reasonable
  way using profile-feedback or LTO for -O[01g], thus if we
  rather want to delay some of the early opts to after IPA
  optimizations.
 
  Not bootstrapped or fully tested, but it works for the compile
  torture.
 
  Comments welcome,
 
  No comments?  Then I'll drop this idea for 4.8.
 
 
 When I debug binutils, I have to use -O0 -g to get precise
 line and variable info.  Also glibc has to be compiled with
 -O, which makes debug a challenge.  Will -Og help binutils
 and glibc debug?
 
  I suppose so, but it is hard to tell without knowing more about the 
  issues.
 

 The main issues are

 1. I need to know precise values for all local variables at all times.

 That would certainly be a good design goal for -Og (but surely the
 first cut at it won't do it).

It will be harder to use it to debug binutils.

 2. Compiler shouldn't inline a function or move lines around.

 Let's split that.

 2. Compiler shouldn't inline a function

 well - we need to inline always_inline functions.  And I am positively
 sure people want trivial C++ abstraction penalty to be removed even with
 -Og, thus getter/setter methods inlined.  Let's say the compiler should
 not inline a function not declared inline and the compiler should not
 inline a function if that would increase code size even if it is declared
 inline?

 I don't see a problem with inlining functions under -Og - under a
 couple of assumptions:
  * The debug table format can correctly mark inlined functions (DWARF
 can - I don't know about other formats).
  * The compiler is executing sequence points in order - and so the
 function being inlined doesn't 'spill out' into the function it is
 inlined into.  See below for further comments.

 This should provide enough information to the debugger to allow it to
 maintain the illusion that an inlined function is a separate function,
 and enable a user to set breakpoints on all calls to the function.

It works for me.

 3. Compiler shouldn't move lines around.

 A good goal as well, probably RTL pieces are least ready for this.

 I would change this to say something like (using C language terms):
 The compiler should provide enough information to allow breakpoints to
 be set at each sequence point, and that the state of the machine is
 such that everything before that sequence point will have been
 completed and that nothing after that sequence point will have been
 started.

 It is probably also possible to argue that there is a case for having
 points between sequence points where we say the code would be in a
 good state (let's call them observation points).  So for instance we
 might want to say that in:

  int x, a, b, c;
  ...
  x = a + b * c;

 If we just say we only promise a known state at sequence points then
 the compiler is free to use some form of multiply-accumulate
 instruction here.  But a user may want to see the multiply followed by
 addition split out.  So we could define the observation points to be
 on the *, +, and =.

The problem I run into is that "next" in gdb can go backward within
the same function when compiled with optimization.  It makes it harder
for me to use breakpoints to track where/when the problem happens.

-- 
H.J.

[PATCH] Fix bogus use of cfun in gen_subprogram_die and premark_used_types

2012-09-04 Thread Martin Jambor

Hi,

while looking into how to remove push/pop_cfun from dwarf2out.c, I
have noticed the following wrong use of cfun in premark_used_types,
which is the first thing called by gen_subprogram_die.

What happens is that:

1. early inliner calls dwarf2out_abstract_function, cfun corresponds
   to the function being inlined to, argument decl is the function
   being inlined.

2. dwarf2out_abstract_function calls gen_type_die_for_member to
   generate an in-class declaration DIE.  It does this before
   changing cfun.

3. gen_type_die_for_member calls gen_subprogram_die because
   member is a function decl.

4. gen_subprogram_die calls premark_used_types to mark DIEs of all
   types in cfun->used_types_hash as perennial.  But cfun does not
   correspond to the decl it is supposed to be emitting a DIE for;
   instead, the used_types of the function that decl is being inlined
   into are marked as perennial.

Similarly, when dealing with nested functions, gen_subprogram_die can
call itself, just with a different decl parameter but unchanged cfun
through decls_for_scope.

I was not able to produce a failing testcase similar to
gcc.dg/20060410.c, mainly because dwarf2out_abstract_function then
changes cfun and indirectly invokes gen_subprogram_die again but still
I believe the intention was to use DECL_STRUCT_FUNCTION(decl) rather
than cfun in premark_used_types and everywhere in gen_subprogram_die.
The patch below does exactly that and as far as my experiments go,
seems to work.
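
The shape of the problem can be sketched with a small standalone model.  Everything below is hypothetical illustration code, not code from dwarf2out.c: `model_cfun` stands in for the global cfun, and the two `model_gen_die_*` functions stand in for the old and new behaviour of gen_subprogram_die.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct function and a function decl.  */
struct model_function { int has_used_types; int premarked; };
struct model_decl { struct model_function *fn; };

/* Stand-in for the global cfun: "the function currently being compiled".  */
static struct model_function *model_cfun;

/* Old shape: the helper consults the global, whatever it points to.  */
void model_premark_global (void)
{
  if (model_cfun && model_cfun->has_used_types)
    model_cfun->premarked = 1;
}

/* New shape: the caller passes the function that belongs to the decl.  */
void model_premark_arg (struct model_function *fun)
{
  if (fun && fun->has_used_types)
    fun->premarked = 1;
}

/* Emitting a DIE for DECL while the global still points at the caller
   marks the caller's types instead of DECL's.  */
void model_gen_die_old (struct model_decl *decl)
{
  (void) decl;
  model_premark_global ();
}

void model_gen_die_new (struct model_decl *decl)
{
  model_premark_arg (decl->fn);
}

/* Hypothetical self-check of the difference between the two shapes.  */
void model_check (void)
{
  struct model_function caller = { 1, 0 }, callee = { 1, 0 };
  struct model_decl callee_decl = { &callee };

  model_cfun = &caller;
  model_gen_die_old (&callee_decl);
  assert (caller.premarked == 1 && callee.premarked == 0);

  caller.premarked = 0;
  model_gen_die_new (&callee_decl);
  assert (caller.premarked == 0 && callee.premarked == 1);
}
```

In the model, the old shape marks the caller's types while a DIE is being emitted for the callee; passing the function explicitly removes the dependence on what the global happens to point at.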

This patch also removes push/pop cfun from dwarf2out_abstract_function
and only leaves the change of current_function_decl.  Richi suggested
that we push NULL cfun at this point but my goal is to enforce that
cfun and current_function_decl match at each push_cfun and since
dwarf2out_abstract_function can call itself, that is not the case.
Nevertheless, I also bootstrapped, tested and compiled Firefox with a
version in which I do push_cfun(NULL) when cfun is not already NULL
and there were no problems.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2012-08-30  Martin Jambor  mjam...@suse.cz

* dwarf2out.c (dwarf2out_abstract_function): Do not change cfun.
(premark_used_types): New parameter fun, use it instead of cfun.
(gen_subprogram_die): Use DECL_STRUCT_FUNCTION (decl) instead of cfun,
also pass it to premark_used_types.


*** /tmp/3GCMxa_dwarf2out.c Tue Sep  4 15:10:23 2012
--- gcc/dwarf2out.c Mon Sep  3 14:48:02 2012
*** dwarf2out_abstract_function (tree decl)
*** 16765,16771 
/* Pretend we've just finished compiling this function.  */
save_fn = current_function_decl;
current_function_decl = decl;
-   push_cfun (DECL_STRUCT_FUNCTION (decl));
  
was_abstract = DECL_ABSTRACT (decl);
set_decl_abstract_flags (decl, 1);
--- 16765,16770 
*** dwarf2out_abstract_function (tree decl)
*** 16779,16785 
call_arg_locations = old_call_arg_locations;
call_site_count = old_call_site_count;
tail_call_site_count = old_tail_call_site_count;
-   pop_cfun ();
  }
  
  /* Helper function of premark_used_types() which gets called through
--- 16778,16783 
*** premark_types_used_by_global_vars_helper
*** 16838,16847 
  /* Mark all members of used_types_hash as perennial.  */
  
  static void
! premark_used_types (void)
  {
!   if (cfun && cfun->used_types_hash)
!     htab_traverse (cfun->used_types_hash, premark_used_types_helper, NULL);
  }
  
  /* Mark all members of types_used_by_vars_entry as perennial.  */
--- 16836,16845 
  /* Mark all members of used_types_hash as perennial.  */
  
  static void
! premark_used_types (struct function *fun)
  {
!   if (fun && fun->used_types_hash)
!     htab_traverse (fun->used_types_hash, premark_used_types_helper, NULL);
  }
  
  /* Mark all members of types_used_by_vars_entry as perennial.  */
*** gen_subprogram_die (tree decl, dw_die_re
*** 16904,16910 
int declaration = (current_function_decl != decl
 || class_or_namespace_scope_p (context_die));
  
!   premark_used_types ();
  
/* It is possible to have both DECL_ABSTRACT and DECLARATION be true if we
   started to generate the abstract instance of an inline, decided to output
--- 16902,16908 
int declaration = (current_function_decl != decl
 || class_or_namespace_scope_p (context_die));
  
!   premark_used_types (DECL_STRUCT_FUNCTION (decl));
  
/* It is possible to have both DECL_ABSTRACT and DECLARATION be true if we
   started to generate the abstract instance of an inline, decided to output
*** gen_subprogram_die (tree decl, dw_die_re
*** 17067,17079 
else if (!DECL_EXTERNAL (decl))
  {
HOST_WIDE_INT cfa_fb_offset;
  
if (!old_die || !get_AT (old_die, DW_AT_inline))
equate_decl_number_to_die (decl, subr_die);
  
if (!flag_reorder_blocks_and_partition)
{
!

[PATCH] Simplify FRE parts of PRE, try to save some memory

2012-09-04 Thread Richard Guenther


Currently compute_avail consumes an unreasonable amount of memory
in the FRE case for PR46590.  The following patch makes some
obvious adjustments but does not cure the underlying issue.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-09-04  Richard Guenther  rguent...@suse.de

* tree-ssa-pre.c (add_to_exp_gen): Adjust.
(make_values_for_phi): Do not add to PHI_GEN for FRE.
(compute_avail): Stop processing after adding all defs to
AVAIL_OUT for FRE.
(init_pre): Do not allocate not needed bitmap sets for FRE.

Index: gcc/tree-ssa-pre.c
===
*** gcc/tree-ssa-pre.c  (revision 190918)
--- gcc/tree-ssa-pre.c  (working copy)
*** insert (void)
*** 3786,3799 
  static void
  add_to_exp_gen (basic_block block, tree op)
  {
!   if (!in_fre)
! {
!   pre_expr result;
!       if (TREE_CODE (op) == SSA_NAME && ssa_undefined_value_p (op))
!   return;
!   result = get_or_alloc_expr_for_name (op);
!   bitmap_value_insert_into_set (EXP_GEN (block), result);
! }
  }
  
  /* Create value ids for PHI in BLOCK.  */
--- 3786,3800 
  static void
  add_to_exp_gen (basic_block block, tree op)
  {
!   pre_expr result;
! 
!   gcc_checking_assert (!in_fre);
! 
!   if (TREE_CODE (op) == SSA_NAME && ssa_undefined_value_p (op))
! return;
! 
!   result = get_or_alloc_expr_for_name (op);
!   bitmap_value_insert_into_set (EXP_GEN (block), result);
  }
  
  /* Create value ids for PHI in BLOCK.  */
*** make_values_for_phi (gimple phi, basic_b
*** 3805,3827 
  
/* We have no need for virtual phis, as they don't represent
   actual computations.  */
!   if (!virtual_operand_p (result))
  {
!   pre_expr e = get_or_alloc_expr_for_name (result);
!   add_to_value (get_expr_value_id (e), e);
bitmap_insert_into_set (PHI_GEN (block), e);
!   bitmap_value_insert_into_set (AVAIL_OUT (block), e);
!   if (!in_fre)
{
! unsigned i;
! for (i = 0; i < gimple_phi_num_args (phi); ++i)
{
! tree arg = gimple_phi_arg_def (phi, i);
! if (TREE_CODE (arg) == SSA_NAME)
!   {
! e = get_or_alloc_expr_for_name (arg);
! add_to_value (get_expr_value_id (e), e);
!   }
}
}
  }
--- 3806,3828 
  
/* We have no need for virtual phis, as they don't represent
   actual computations.  */
!   if (virtual_operand_p (result))
! return;
! 
!   pre_expr e = get_or_alloc_expr_for_name (result);
!   add_to_value (get_expr_value_id (e), e);
!   bitmap_value_insert_into_set (AVAIL_OUT (block), e);
!   if (!in_fre)
  {
!   unsigned i;
bitmap_insert_into_set (PHI_GEN (block), e);
!   for (i = 0; i < gimple_phi_num_args (phi); ++i)
{
! tree arg = gimple_phi_arg_def (phi, i);
! if (TREE_CODE (arg) == SSA_NAME)
{
! e = get_or_alloc_expr_for_name (arg);
! add_to_value (get_expr_value_id (e), e);
}
}
  }
*** compute_avail (void)
*** 3934,3939 
--- 3935,3944 
  bitmap_value_insert_into_set (AVAIL_OUT (block), e);
}
  
+ /* That's all we need to do when doing FRE.  */
+ if (in_fre)
+   continue;
+ 
  if (gimple_has_side_effects (stmt) || stmt_could_throw_p (stmt))
continue;
  
*** compute_avail (void)
*** 3992,3999 
  
get_or_alloc_expression_id (result);
add_to_value (get_expr_value_id (result), result);
!   if (!in_fre)
! bitmap_value_insert_into_set (EXP_GEN (block), result);
  }
continue;
  }
--- 3997,4003 
  
get_or_alloc_expression_id (result);
add_to_value (get_expr_value_id (result), result);
!   bitmap_value_insert_into_set (EXP_GEN (block), result);
  }
continue;
  }
*** compute_avail (void)
*** 4105,4112 
  
get_or_alloc_expression_id (result);
add_to_value (get_expr_value_id (result), result);
!   if (!in_fre)
! bitmap_value_insert_into_set (EXP_GEN (block), result);
  
continue;
  }
--- 4109,4115 
  
get_or_alloc_expression_id (result);
add_to_value (get_expr_value_id (result), result);
!   bitmap_value_insert_into_set (EXP_GEN (block), result);
  
continue;
  }
*** my_rev_post_order_compute (int *post_ord
*** 4733,4747 
src = ei_edge (ei)->src;
dest = ei_edge (ei)->dest;
  
!   /* Check if the edge destination has been visited yet.  */
if (src !=

MAINTAINERS (Write After Approval): Add myself.

2012-09-04 Thread Christophe Lyon


Hi all,

I have added my name in the Write After Approval section, with the attached 
patch.

Christophe.

Index: ChangeLog
===
--- ChangeLog   (revision 190926)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2012-09-04  Christophe Lyon  christophe.l...@st.com
+
+   * MAINTAINERS (Write After Approval): Add myself.
+
 2012-09-03  Richard Guenther  rguent...@suse.de
 
PR bootstrap/54138
Index: MAINTAINERS
===
--- MAINTAINERS (revision 190926)
+++ MAINTAINERS (working copy)
@@ -439,6 +439,7 @@ Manuel López-Ibáñez m...@gcc.gnu.org
 Martin v. Löwis		loe...@informatik.hu-berlin.de
 H.J. Lu			hjl.to...@gmail.com
 Xinliang David Li	davi...@google.com
+Christophe Lyon		christophe.l...@st.com
 Luis Machado   luis...@br.ibm.com
 Ziga Mahkovec  ziga.mahko...@klika.si
 Simon Martin   simar...@users.sourceforge.net

Re: Re-implement VEC_* to be member functions of vec_t<T>

2012-09-04 Thread Diego Novillo

On Fri, Aug 24, 2012 at 2:32 PM, Diego Novillo dnovi...@google.com wrote:
 On 2012-08-23 23:08 , Diego Novillo wrote:

 I've tested this patch on x86_64 and ppc64 with all languages plus
 ada, go and obj-c++.  I am going to be offline for several days
 starting on Saturday, so I will not commit it until I return.


 I've also done memory and time comparisons to make sure I didn't change
 behaviour.  No differences.

I've now committed the patch.


Diego.

Fix bootstrap failure with Sun linker

2012-09-04 Thread Eric Botcazou

The generated libstdc++-symbols.ver-sun cannot be parsed by the linker anymore.

Bootstrapped on SPARC64/Solaris 9 & SPARC/Solaris 10, applied on the mainline.


2012-09-04  Eric Botcazou  ebotca...@adacore.com

* make_sunver.pl: Add missing newline at the end of extern "C++" block.


-- 
Eric Botcazou
Index: make_sunver.pl
===
--- make_sunver.pl	(revision 190863)
+++ make_sunver.pl	(working copy)
@@ -185,7 +185,7 @@ while (<F>) {
 	$glob = 'glob';
 	if ($in_extern) {
 	$in_extern--;
-	print "$1##$2";
+	print "$1##$2\n";
 	} else {
 	print;
 	}

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 1:01 PM, Georg-Johann Lay a...@gjlay.de wrote:
 Richard Guenther wrote:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 Gabriel Dos Reis schrieb:
 Georg-Johann Lay wrote:
 AVR-Libc comes with hand-optimized float support functions
 written in assembler.  These functions use the same naming
 conventions as libgcc.  There are situations where this
 name clash leads to performance regressions because the
 functions from libgcc are linked.  One example is the new
 fixed-point support, which converts fixed-point to/from float and
 references float/int conversion functions from within
 libgcc.

 The float implementation in libm.a has been discussed
 several times, with the only result that it is very unlikely
 that the code will ever be integrated into libgcc because
 the original authors are no longer around.  And it is much
 less work to add a new configure switch than to port and
 integrate the code, given there were no license issues. One
 point against such an extension was that such a change to the
 compiler establishes a dependency between the compiler and
 AVR-Libc, but this decision was made long ago by
 accepting code that actually should have been added to
 libgcc -- but was not for whatever reason.

 This patch removes these performance regressions by removing
 the doubly implemented functions from libgcc by means of a
 new configure option --with-avrlibc.

 As I stated yesterday, I do not understand why there needs to
 be yet another configure option. The NATURAL libc for AVR
 targets is AVR-libc.  We should not need a switch for that.

 There is also newlib that is used with avr-gcc.  I know this
 because some bugs are only triggered for newlib.  If there are
 users that report bugs if avr-gcc is built for newlib, I'd
 guess these users are actually interested in using newlib.

 I did not say there was no other libc library.  I said that the
 *natural* libc appears to be AVR-libc.

 We don't configure GCC/g++ saying --with-libstdc++.

 That's a different story because these libraries support in-tree builds
 just like newlib does.  This is not true for AVR-Libc, which does not
 support in-tree builds.

 I agree that AVR-Libc is the most common libc implementation used
 with avr-gcc and it has many advantages over other libc implementations
 (except that it does not support in-tree builds).

 I think the in-tree builds thing is a red herring.

 I don't think so.  If there was an in-tree build, gcc could detect
 by itself whether AVR-Libc is present or not.  With the
 current setup the user has to specify that -- in whatever
 direction: that libc is there or that libc is not there, depending
 on whatever is the default.

 You can do a link check on whether -lc provides those functions
 and skip those that overlap with libgcc.

 Can you explain this?  A typical build of avr tools goes like

 1) configure, build and install binutils
 2) configure, build and install the compiler
 3) configure, build and install AVR-Libc

 so that in step 2 no checking is possible because there is no -lc yet.
 Or do you mean a check at run time (of the compiler)?

4) build and install the real compiler

at which time you have AVR-libc available.  At least that's how you
bootstrap a glibc cross.

Richard.


 Johann

Re: [PATCH] Simplify FRE parts of PRE, try to save some memory

2012-09-04 Thread Steven Bosscher

On Tue, Sep 4, 2012 at 3:19 PM, Richard Guenther rguent...@suse.de wrote:

 Currently compute_avail consumes an unreasonable amount of memory
 in the FRE case for PR46590.  The following patch makes some
 obvious adjustments but does not cure the underlying issue.

I don't think there's any way to cure the underlying issue, it's just
the result of having SSA form that so many values are available. You
can improve the representation of the sets (e.g. something similar to
the views of the tree-ssa-live machinery) but that's it.

Ciao!
Steven

Re: [PATCH] Simplify FRE parts of PRE, try to save some memory

2012-09-04 Thread Richard Guenther

On Tue, 4 Sep 2012, Steven Bosscher wrote:

 On Tue, Sep 4, 2012 at 3:19 PM, Richard Guenther rguent...@suse.de wrote:
 
  Currently compute_avail consumes an unreasonable amount of memory
  in the FRE case for PR46590.  The following patch makes some
  obvious adjustments but does not cure the underlying issue.
 
 I don't think there's any way to cure the underlying issue, it's just
 the result of having SSA form that so many values are available. You
 can improve the representation of the sets (e.g. something similar to
 the views of the tree-ssa-live machinery) but that's it.

One idea is that we only need AVAIL for the block we are currently
doing elimination on (well, for FRE - PRE is another story).  And
we need AVAIL only for values we are going to replace.  The first
can be exploited by a domwalk unifying elimination and AVAIL
computation for FRE, the 2nd one is harder ;)  At least sounds like
a reason to finally split FRE from PRE ...

Richard.
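
The domwalk idea above can be sketched in a few lines of C.  This is only
a toy model under stated assumptions, not the tree-ssa-pre.c code: the
names toy_block, dom_walk_eliminate and MAX_VALUES are hypothetical, and
"values" are plain small integers.  One shared avail[] array is kept for
the block currently being visited instead of one AVAIL_OUT set per block;
entries are added on entry to a block and undone when leaving it.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VALUES 16
#define MAX_PER_BLOCK 4
#define MAX_KIDS 4

/* Hypothetical toy block: the value ids it computes and its children
   in the dominator tree.  */
struct toy_block
{
  int values[MAX_PER_BLOCK];
  int n_values;
  struct toy_block *kids[MAX_KIDS];
  int n_kids;
};

/* Walk the dominator tree rooted at BB.  AVAIL[v] is true while a
   computation of value V dominates the current point.  Returns the
   number of computations that were already available, i.e. the ones
   elimination could replace.  */
static int
dom_walk_eliminate (struct toy_block *bb, bool avail[MAX_VALUES])
{
  int pushed[MAX_PER_BLOCK];
  int n_pushed = 0;
  int redundant = 0;

  assert (bb->n_values <= MAX_PER_BLOCK && bb->n_kids <= MAX_KIDS);

  for (int i = 0; i < bb->n_values; i++)
    {
      int v = bb->values[i];
      assert (v >= 0 && v < MAX_VALUES);
      if (avail[v])
        redundant++;            /* A dominating computation exists.  */
      else
        {
          avail[v] = true;      /* Available from here on.  */
          pushed[n_pushed++] = v;
        }
    }

  for (int i = 0; i < bb->n_kids; i++)
    redundant += dom_walk_eliminate (bb->kids[i], avail);

  /* Leaving BB: its values do not dominate what is visited next.  */
  while (n_pushed > 0)
    avail[pushed[--n_pushed]] = false;

  return redundant;
}
```

With a root computing values {1, 2} and two dominated children computing
{1, 3} and {3}, only the second computation of value 1 is counted: value 3
in the first child does not dominate the second child.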

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 12:53 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 04/09/12 11:11, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 11:56 AM, Bin Cheng bin.ch...@arm.com wrote:
 -Original Message-
 From: Richard Earnshaw
 Sent: Thursday, July 26, 2012 9:19 PM
 To: Andrew Pinski
 Cc: Bin Cheng; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
 fold-const.c

 On 26/07/12 11:27, Andrew Pinski wrote:
 On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 This patch removes the duplicate check on BRANCH_COST in
 fold_truth_andor.
 The BRANCH_COST condition removed is a duplicate of the default
 definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
 All current targets (mips and rs6000) that provide non-default
 definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this
 patch is therefore just a code cleanup and does not change behaviour
 in the compiler.

 I built mipsel-elf cross compiler and compared newlib/libstdc++
 compiled by the patched/original compilers.

 Is it OK?

 Just some history here on this.  The BRANCH COST check was there
 before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
 a patch which changes the MIPS definition soon but it will not be
 based on the branch cost but rather on another option.  So in the
 end it might not be redundant as it is currently.

 Thanks,
 Andrew

 You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
 (as the default currently does), so there's no loss of functionality
 from removing this currently redundant check.  However, the current
 definition is broken in that it makes it impossible to force the
 compiler to use this optimization when the branch cost is low.

 Hi, is this change ok? Or we need more discussion on it?

 It's not ok (I btw fail to see the patch in this thread).  The current
 way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
 be changed to always match the pattern

   LOGICAL_OP_NON_SHORT_CIRCUIT
   && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                    false) >= 2)

 and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
 defined in defaults.h (and the docs updated).

 That's not going to work for modern ARM cores.  We want to set
 BRANCH_COST to 1 but still have it generate the non-short-circuit code
 (because conditional compares are really cheap).

Then you define LOGICAL_OP_NON_SHORT_CIRCUIT to zero.  The above
would be an identity transform for all targets currently, so it is not working
for modern ARM cores anyway.

Richard.

 R.

 Richard.

 Thanks very much.

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 12:38 PM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 04/09/2012 09:52, Oleg Endo ha scritto:
 [x] target-independent bits
 [ ] alpha [ ] arm   [ ] avr [ ] bfin
 [ ] cr16  [ ] cris  [ ] epiphany[ ] i386
 [ ] ia64  [ ] iq2000[ ] lm32[ ] m32c
 [ ] m32r  [ ] mcore [ ] mep [x] microblaze
 [x] mips  [ ] mmix  [x] mn10300 [ ] pa
 [ ] rs6000[ ] rx[ ] s390[ ] score
 [x] sh[ ] sparc [ ] spu [ ] stormy16
 [ ] v850  [ ] vax   [ ] xtensa

 Tested with 'make all-gcc' on SH xgcc and i386 native build.
 No functional changes, except on MIPS, as requested by Richard
 Sandiford.

 I think you only need explicit approval for mn10300.  All other changes
 are trivial.

 +hook_int_rtx_mode_as_bool_0 (rtx, enum machine_mode, addr_space_t, bool)

 So we're using C++ already?  Or do we want ATTRIBUTE_UNUSED here?

Use C++ where it is so nicely obvious an improvement ;)

Richard.

 Paolo

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Richard Earnshaw

On 04/09/12 15:31, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 12:53 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 04/09/12 11:11, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 11:56 AM, Bin Cheng bin.ch...@arm.com wrote:
 -Original Message-
 From: Richard Earnshaw
 Sent: Thursday, July 26, 2012 9:19 PM
 To: Andrew Pinski
 Cc: Bin Cheng; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
 fold-const.c

 On 26/07/12 11:27, Andrew Pinski wrote:
 On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 This patch removes the duplicate check on BRANCH_COST in
 fold_truth_andor.
 The BRANCH_COST condition removed is a duplicate of the default
 definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
 All current targets (mips and rs6000) that provide non-default
 definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this
 patch is therefore just a code cleanup and does not change behaviour
 in the compiler.

 I built mipsel-elf cross compiler and compared newlib/libstdc++
 compiled by the patched/original compilers.

 Is it OK?

 Just some history here on this.  The BRANCH COST check was there
 before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
 a patch which changes the MIPS definition soon but it will not be
 based on the branch cost but rather on another option.  So in the
 end it might not be redundant as it is currently.

 Thanks,
 Andrew

 You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
 (as the default currently does), so there's no loss of functionality
 from removing this currently redundant check.  However, the current
 definition is broken in that it makes it impossible to force the
 compiler to use this optimization when the branch cost is low.

 Hi, is this change ok? Or we need more discussion on it?

 It's not ok (I btw fail to see the patch in this thread).  The current
 way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
 be changed to always match the pattern

   LOGICAL_OP_NON_SHORT_CIRCUIT
   && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                    false) >= 2)

 and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
 defined in defaults.h (and the docs updated).

 That's not going to work for modern ARM cores.  We want to set
 BRANCH_COST to 1 but still have it generate the non-short-circuit code
 (because conditional compares are really cheap).

 Then you define LOGICAL_OP_NON_SHORT_CIRCUIT to zero.  The above
 would be an identity transform for all targets currently, so it is not
 working for modern ARM cores anyway.

No, that's backwards.  That gives us branches around compares, not
formation of or'ed cflag values that we can then transform into
conditional compares.

R.

 Richard.

 R.

 Richard.

 Thanks very much.
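
The two code shapes under discussion can be written out in C.  This is a
minimal sketch of the source-level forms only; which instructions a
given target emits for them is not shown here, and the function names are
hypothetical.  The short-circuit form examines b only when a is non-zero,
which usually implies a branch between the compares; the
non-short-circuit form evaluates both compares and combines the flag
values, which is the shape a back end can then turn into conditional
compares.

```c
#include <assert.h>

/* Short-circuit form: B is examined only when A is non-zero.  */
static int
both_nonzero_short_circuit (int a, int b)
{
  return a != 0 && b != 0;
}

/* Non-short-circuit form: both compares are always evaluated and the
   two flag values are combined with a bitwise AND.  This is only
   equivalent because neither operand has side effects.  */
static int
both_nonzero_non_short_circuit (int a, int b)
{
  return (a != 0) & (b != 0);
}

/* Check that the two forms agree on a few sample inputs.  */
static void
check_forms_agree (void)
{
  static const int samples[] = { -1, 0, 1, 7 };
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      assert (both_nonzero_short_circuit (samples[i], samples[j])
              == both_nonzero_non_short_circuit (samples[i], samples[j]));
}
```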

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Richard Guenther

On Tue, Sep 4, 2012 at 4:33 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 04/09/12 15:31, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 12:53 PM, Richard Earnshaw rearn...@arm.com wrote:
 On 04/09/12 11:11, Richard Guenther wrote:
 On Tue, Sep 4, 2012 at 11:56 AM, Bin Cheng bin.ch...@arm.com wrote:
 -Original Message-
 From: Richard Earnshaw
 Sent: Thursday, July 26, 2012 9:19 PM
 To: Andrew Pinski
 Cc: Bin Cheng; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH]Remove duplicate check on BRANCH_COST in
 fold-const.c

 On 26/07/12 11:27, Andrew Pinski wrote:
 On Thu, Jul 26, 2012 at 3:20 AM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 This patch removes the duplicate check on BRANCH_COST in
 fold_truth_andor.
 The BRANCH_COST condition removed is a duplicate of the default
 definition of LOGICAL_OP_NON_SHORT_CIRCUIT.
 All current targets (mips and rs6000) that provide non-default
 definitions of LOGICAL_OP_NON_SHORT_CIRCUIT set it to 0, so this
 patch is therefore just a code cleanup and does not change behaviour
 in the compiler.

 I built mipsel-elf cross compiler and compared newlib/libstdc++
 compiled by the patched/original compilers.

 Is it OK?

 Just some history here on this.  The BRANCH COST check was there
 before LOGICAL_OP_NON_SHORT_CIRCUIT was added.  I will be submitting
 a patch which changes the MIPS definition soon but it will not be
 based on the branch cost but rather on another option.  So in the
 end it might not be redundant as it is currently.

 Thanks,
 Andrew

 You can always factor BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT
 (as the default currently does), so there's no loss of functionality
 from removing this currently redundant check.  However, the current
 definition is broken in that it makes it impossible to force the
 compiler to use this optimization when the branch cost is low.

 Hi, is this change ok? Or we need more discussion on it?

 It's not ok (I btw fail to see the patch in this thread).  The current
 way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
 be changed to always match the pattern

   LOGICAL_OP_NON_SHORT_CIRCUIT
   && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                    false) >= 2)

 and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
 defined in defaults.h (and the docs updated).

 That's not going to work for modern ARM cores.  We want to set
 BRANCH_COST to 1 but still have it generate the non-short-circuit code
 (because conditional compares are really cheap).

 Then you define LOGICAL_OP_NON_SHORT_CIRCUIT to zero.  The above
 would be an identity transform for all targets currently, so it is not
 working for modern ARM cores anyway.

 No, that's backwards.  That gives us branches around compares, not
 formation of or'ed cflag values that we can then transform into
 conditional compares.

I see.  So I suppose for that reason the original patch is ok.

Thanks,
Richard.

 R.

 Richard.

 R.

 Richard.

 Thanks very much.

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Alexandre Oliva

On Sep  4, 2012, Oleg Endo oleg.e...@t-online.de wrote:

   * config/mn10300/mn10300.c (mn10300_address_cost): Add
 machine_mode and address space arguments.  Use GET_MODE (x) and 
   ADDR_SPACE_GENERIC in recursive invocation.

Ok with a change, see below.

   * config/sh/sh.c (sh_address_cost): Likewise.

Ok, thanks.

 Index: gcc/config/mn10300/mn10300.c

 -  total = mn10300_address_cost (XEXP (x, 0), speed);
 +  total = mn10300_address_cost (XEXP (x, 0), GET_MODE (x),
 + ADDR_SPACE_GENERIC, speed);

Instead of ADDR_SPACE_GENERIC, this should be MEM_ADDR_SPACE (x), no?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Gabriel Dos Reis

On Tue, Sep 4, 2012 at 9:17 AM, Richard Guenther
richard.guent...@gmail.com wrote:

 Can you explain this?  A typical build of avr tools goes like

 1) configure, build and install binutils
 2) configure, build and install the compiler
 3) configure, build and install AVR-Libc

 so that in step 2 no checking is possible because there is no -lc yet.
 Or do you mean a check at run time (of the compiler)?

 4) build and install the real compiler

 at which time you have AVR-libc available.  At least that's how you
 bootstrap a glibc cross.

avr-gcc has had a simplified build process for a while, as it almost never
needed to have an avr-gcc hosted on an avr platform.  It is usually
built as a cross-compiler that always runs on the build platform.

What I was suggesting earlier is that we shouldn't continue patching
the AVR target as if the current state is almost ideal.  Pick a libc -- avr-libc
appears to be the natural implementation -- and make it the default as
opposed to adding knobs.

-- Gaby

RE: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Weddington, Eric

 -Original Message-
 From: dosr...@gmail.com [] On Behalf Of Gabriel Dos
 Reis
 Sent: Tuesday, September 04, 2012 9:08 AM
 To: Richard Guenther
 Cc: Georg-Johann Lay; gcc-patches@gcc.gnu.org; Denis Chertykov; Weddington,
 Eric; Joerg Wunsch
 Subject: Re: [Patch,avr] PR54461: Better AVR-Libc integration

 On Tue, Sep 4, 2012 at 9:17 AM, Richard Guenther
 richard.guent...@gmail.com wrote:

  Can you explain this?  A typical build of avr tools goes like

  1) configure, build and install binutils
  2) configure, build and install the compiler
  3) configure, build and install AVR-Libc

  so that in step 2 no checking is possible because there is no -lc yet.
  Or do you mean a check at run time (of the compiler)?

  4) build and install the real compiler

  at which time you have AVR-libc available.  At least that's how you
  bootstrap a glibc cross.

 avr-gcc has had a simplified build process for a while, as it almost never
 needed to have an avr-gcc hosted on an avr platform.  It is usually
 built as a cross-compiler that always runs on the build platform.

 What I was suggesting earlier is that we shouldn't continue patching
 the AVR target as if the current state is almost ideal.  Pick a libc --
 avr-libc appears to be the natural implementation -- and make it the
 default as opposed to adding knobs.

I also strongly agree with this.

AFAIK, the only project that uses newlib as the C library for the AVR target is 
RTEMS, because, AIUI, they need to have the POSIX interface. The vast majority 
of AVR users have a toolchain that uses avr-libc.

Eric Weddington

Fix bootstrap with release checking

2012-09-04 Thread Diego Novillo


PR bootstrap/54479
* vec.h (vec_t::copy): Add cast in call to reserve_exact.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3d20ebd..c605432 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2012-09-04   Diego Novillo  dnovi...@google.com
+
+   PR bootstrap/54479
+   * vec.h (vec_t::copy): Add cast in call to reserve_exact.
+
 2012-09-04  Richard Guenther  rguent...@suse.de
 
* tree-ssa-pre.c (add_to_exp_gen): Adjust.
diff --git a/gcc/vec.h b/gcc/vec.h
index 74a85c7..ac426e9 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -699,7 +699,8 @@ vec_t<T>::copy (ALONE_MEM_STAT_DECL)
 
   if (len)
 {
-      new_vec = vec_t<T>::reserve_exact<A> (NULL, len PASS_MEM_STAT);
+      new_vec = vec_t<T>::reserve_exact<A> (static_cast<vec_t<T> *> (NULL),
+                                            len PASS_MEM_STAT);
       new_vec->embedded_init (len, len);
       memcpy (new_vec->address (), vec_, sizeof (T) * len);
 }

Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c

2012-09-04 Thread Bin.Cheng

Sorry, I mis-sent this offline.

On Tue, Sep 4, 2012 at 11:00 PM, Bin.Cheng amker.ch...@gmail.com wrote:

 It's not ok (I btw fail to see the patch in this thread).  The current
 way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead
 be changed to always match the pattern

   LOGICAL_OP_NON_SHORT_CIRCUIT
   && (BRANCH_COST (optimize_function_for_speed_p (cfun),
                    false) >= 2)

 and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
 defined in defaults.h (and the docs updated).


 That's not going to work for modern ARM cores.  We want to set
 BRANCH_COST to 1 but still have it generate the non-short-circuit code
 (because conditional compares are really cheap).


 Hi Richard,
 For now, the LOGICAL_OP_NON_SHORT_CIRCUIT macro is defined as below, which
 duplicates the BRANCH_COST condition.

 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
                 false) >= 2)
 #endif

 Recently we measured performance on some ARM processors and found it
 would be better to have the non-short-circuit optimization while setting
 BRANCH_COST to 1, which is impossible with the present code.  Hence
 the patch below:

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c(revision 189835)
 +++ gcc/fold-const.c(working copy)
 @@ -8443,9 +8443,7 @@
if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
  return tem;

 -  if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
 -                    false) >= 2)
 -      && LOGICAL_OP_NON_SHORT_CIRCUIT
 +  if (LOGICAL_OP_NON_SHORT_CIRCUIT
        && (code == TRUTH_AND_EXPR
            || code == TRUTH_ANDIF_EXPR
            || code == TRUTH_OR_EXPR

 The purpose is to remove the duplicate check on BRANCH_COST.

 As Andrew pointed out, the patch may change behavior if some
 back ends define the macro independently of BRANCH_COST.  Looking
 into the code, there are two uses of the macro in fold-const.c, each
 controlling one kind of code transformation.  The first use is:

   else if (LOGICAL_OP_NON_SHORT_CIRCUIT
            && lhs != 0 && rhs != 0
            && (code == TRUTH_ANDIF_EXPR
                || code == TRUTH_ORIF_EXPR)
            && operand_equal_p (lhs, rhs, 0))

 The second one is:

   if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
                     false) >= 2)
       && LOGICAL_OP_NON_SHORT_CIRCUIT
       && (code == TRUTH_AND_EXPR
           || code == TRUTH_ANDIF_EXPR
           || code == TRUTH_OR_EXPR
           || code == TRUTH_ORIF_EXPR))

 I am not sure why the 2nd condition is designed in the current way, and
 I haven't found any useful changelog entry on it.
 But considering that a back end can choose whether or not to factor
 BRANCH_COST into LOGICAL_OP_NON_SHORT_CIRCUIT, we can conclude that the
 behavior will only change if some back end wants to control the two
 transformations differently. So the problem becomes whether the 2nd
 condition should be changed. Either way there is a scenario that cannot
 be covered.


And for now, FTR, only two targets redefine L_O_N_S_C: mips and rs6000.
Both set it to zero, so they won't be affected by this change.


-- 
Best Regards.
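
The effect of the patch on the second guard can be checked with a small
model.  This is a sketch under stated assumptions, not code from
fold-const.c: struct target_model and the three functions are
hypothetical stand-ins for a target's BRANCH_COST value and its optional
definition of LOGICAL_OP_NON_SHORT_CIRCUIT.  It shows that both guards
agree when the default definition is used or when a target defines the
macro to 0, and that they differ only for a target that defines it to 1
while keeping a low branch cost, which is the ARM case described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a target description.  */
struct target_model
{
  int branch_cost;
  int nsc_override;   /* -1: use the default definition, else 0 or 1.  */
};

/* Models LOGICAL_OP_NON_SHORT_CIRCUIT: the default is
   BRANCH_COST >= 2 unless the target overrides it.  */
static bool
logical_op_non_short_circuit (const struct target_model *t)
{
  assert (t->nsc_override >= -1 && t->nsc_override <= 1);
  if (t->nsc_override >= 0)
    return t->nsc_override != 0;
  return t->branch_cost >= 2;
}

/* Guard before the patch: explicit BRANCH_COST test and the macro.  */
static bool
guard_before_patch (const struct target_model *t)
{
  return t->branch_cost >= 2 && logical_op_non_short_circuit (t);
}

/* Guard after the patch: the macro alone.  */
static bool
guard_after_patch (const struct target_model *t)
{
  return logical_op_non_short_circuit (t);
}
```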

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Dehao Chen

On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

For this case, I don't think that we want a testcase to rely on
addr2line in the system. So how about we add a test once
assembly scanning is available in the libgcj testsuite?

Thanks,
Dehao



 r~

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Andrew Haley

On 09/04/2012 05:07 PM, Dehao Chen wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.
 
 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test once
 assembly scanning is available in the libgcj testsuite?

Fine by me.  I guess you can just copy the scanning code from the gcc
testsuite.

Andrew.

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Bryce McKinlay

On Tue, Sep 4, 2012 at 5:07 PM, Dehao Chen de...@google.com wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test once
 assembly scanning is available in the libgcj testsuite?

Once Ian Lance Taylor's libbacktrace patch is integrated (see:
http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html), we'll be able to get
rid of the code that calls addr2line from libgcj.

So, I think it would be fine to write a Java test case using
Throwable.getStackTrace(). Whichever approach is easiest for you is
fine.

Bryce

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Andrew Haley

On 09/04/2012 05:32 PM, Bryce McKinlay wrote:
 On Tue, Sep 4, 2012 at 5:07 PM, Dehao Chen de...@google.com wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test once
 assembly scanning is available in the libgcj testsuite?
 
 Once Ian Lance Taylor's libbacktrace patch is integrated (see:
 http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html), we'll be able to get
 rid of the code that calls addr2line from libgcj.

As I understand it, Ian Taylor's backtrace patch is intended for use in
gcc development; as he puts it, "Since its use in GCC would
be purely for GCC developers, it's not essential that it be fully
portable."  It is not meant for the gcj runtime.

Andrew.

[PATCH] Clarify gcc-{ar,nm,ranlib} usage in the documentation

2012-09-04 Thread Andi Kleen

From: Andi Kleen a...@linux.intel.com

Make it clear in the documentation that with -fno-fat-lto-objects the
gcc-* wrappers should be used to pass the linker plugin.

gcc/:

2012-09-04  Andi Kleen a...@linux.intel.com

* doc/invoke.texi (-ffat-lto-objects): Clarify that gcc-ar
et al. should be used.
---
 gcc/doc/invoke.texi |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6cf7cec..197803d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8138,7 +8138,9 @@ requires the complete toolchain to be aware of LTO. It requires a linker with
 linker plugin support for basic functionality.  Additionally,
 @command{nm}, @command{ar} and @command{ranlib}
 need to support linker plugins to allow a full-featured build environment
-(capable of building static libraries etc).
+(capable of building static libraries etc). gcc provides the @command{gcc-ar},
+@command{gcc-nm}, @command{gcc-ranlib} wrappers to pass the right options
+to these tools. With non fat LTO makefiles need to be modified to use them.
 
 The default is @option{-ffat-lto-objects} but this default is intended to
 change in future releases when linker plugin enabled environments become more
-- 
1.7.7

[PATCH] Reduce memory usage for storing LTO decl resolutions

2012-09-04 Thread Andi Kleen

From: Andi Kleen a...@linux.intel.com

With an LTO build of a large project (11k subfiles incrementally linked),
storing the LTO resolutions took over 0.5GB of memory:

lto/lto.c:1087 (lto_resolution_read)  0: 0.0%  540398500  15903: 0.0%

The reason is that the declaration indexes are quite sparse, but every subfile
got a full contiguous vector for them. Since there are so many subfiles, the
many vectors add up.

This patch instead stores the resolutions initially in a compact
(index, resolution) format. This is only expanded into a sparse vector for
fast lookup when the subfile is actually read, and the vector is freed
immediately afterwards. This means only one vector is allocated at a time.

This brings the overhead for this down to less than 3MB for the test case:

lto/lto.c:1087 (lto_resolution_read)  0: 0.0%  2821456  42186: 0.0%

Passed bootstrap and test suite on x86_64-linux.

Ok for 4.8 and possibly for 4.7?

-Andi
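
The storage scheme described above can be sketched in C as follows.  This
is a simplified illustration, not the patch itself: resolution_t is a
hypothetical stand-in for ld_plugin_symbol_resolution_t, plain arrays
replace the VEC macros, and expand_resolutions is an assumed helper name.
Each subfile keeps only its (index, resolution) pairs; the full vector
indexed by declaration index is built when the subfile is read and freed
right after, so at most one such vector is live at a time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ld_plugin_symbol_resolution_t.  */
typedef int resolution_t;

/* Compact (index, resolution) pair kept per subfile.  */
struct res_pair
{
  resolution_t res;
  unsigned index;
};

/* Expand N pairs into a zero-initialized vector of MAX_INDEX + 1
   entries so that lookups by declaration index are O(1).  The caller
   frees the result once the subfile has been processed.  Returns NULL
   on allocation failure.  */
static resolution_t *
expand_resolutions (const struct res_pair *pairs, size_t n,
                    unsigned max_index)
{
  resolution_t *vec = calloc ((size_t) max_index + 1, sizeof *vec);
  if (vec == NULL)
    return NULL;
  for (size_t i = 0; i < n; i++)
    {
      assert (pairs[i].index <= max_index);
      vec[pairs[i].index] = pairs[i].res;
    }
  return vec;
}
```

The memory cost per subfile is then proportional to the number of
resolved symbols rather than to the largest declaration index.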

2012-09-04  Andi Kleen  a...@linux.intel.com

* gcc/lto-streamer.h (res_pair): Add.
(lto_file_decl_data): Replace resolutions with respairs.
Add max_index.
* gcc/lto/lto.c (lto_resolution_read): Remove max_index.  Add rp.
Initialize respairs.
(lto_file_finalize): Set up resolutions vector lazily from respairs.
---
 gcc/lto-streamer.h |   15 ++-
 gcc/lto/lto.c  |   31 ++-
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index bed408a..19a35cb 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -513,6 +513,18 @@ typedef struct lto_out_decl_state *lto_out_decl_state_ptr;
 DEF_VEC_P(lto_out_decl_state_ptr);
 DEF_VEC_ALLOC_P(lto_out_decl_state_ptr, heap);
 
+/* Compact representation of a index - resolution pair. Unpacked to an 
+   vector later. */
+struct res_pair 
+{
+  ld_plugin_symbol_resolution_t res;
+  unsigned index;
+};
+typedef struct res_pair res_pair;
+
+DEF_VEC_P(res_pair);
+DEF_VEC_ALLOC_P(res_pair, heap);
+
 /* One of these is allocated for each object file that being compiled
by lto.  This structure contains the tables that are needed by the
serialized functions and ipa passes to connect themselves to the
@@ -548,7 +560,8 @@ struct GTY(()) lto_file_decl_data
   unsigned HOST_WIDE_INT id;
 
   /* Symbol resolutions for this file */
-  VEC(ld_plugin_symbol_resolution_t,heap) * GTY((skip)) resolutions;
+  VEC(res_pair, heap) * GTY((skip)) respairs;
+  unsigned max_index;
 
   struct gcov_ctr_summary GTY((skip)) profile_info;
 };
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index bd91c39..5da5412 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -1012,7 +1012,6 @@ lto_resolution_read (splay_tree file_ids, FILE 
*resolution, lto_file *file)
   unsigned int num_symbols;
   unsigned int i;
   struct lto_file_decl_data *file_data;
-  unsigned max_index = 0;
   splay_tree_node nd = NULL; 
 
   if (!resolution)
@@ -1054,13 +1053,12 @@ lto_resolution_read (splay_tree file_ids, FILE 
*resolution, lto_file *file)
   unsigned int j;
   unsigned int lto_resolution_str_len =
sizeof (lto_resolution_str) / sizeof (char *);
+  res_pair rp;
 
       t = fscanf (resolution, "%u " HOST_WIDE_INT_PRINT_HEX_PURE " %26s %*[^\n]\n",
                   &index, &id, r_str);
       if (t != 3)
         internal_error ("invalid line in the resolution file");
-      if (index > max_index)
-        max_index = index;
 
       for (j = 0; j < lto_resolution_str_len; j++)
{
@@ -1082,11 +1080,13 @@ lto_resolution_read (splay_tree file_ids, FILE 
*resolution, lto_file *file)
}
 
       file_data = (struct lto_file_decl_data *)nd->value;
-      VEC_safe_grow_cleared (ld_plugin_symbol_resolution_t, heap, 
-                             file_data->resolutions,
-                             max_index + 1);
-      VEC_replace (ld_plugin_symbol_resolution_t, 
-                   file_data->resolutions, index, r);
+      /* The indexes are very sparse. To save memory save them in a compact
+         format that is only unpacked later when the subfile is processed. */
+      rp.res = r;
+      rp.index = index;
+      VEC_safe_push (res_pair, heap, file_data->respairs, rp);
+      if (file_data->max_index < index)
+        file_data->max_index = index;
 }
 }
 
@@ -1166,6 +1166,18 @@ lto_file_finalize (struct lto_file_decl_data *file_data, 
lto_file *file)
 {
   const char *data;
   size_t len;
+  VEC(ld_plugin_symbol_resolution_t,heap) *resolutions = NULL;
+  int i;
+  res_pair *rp;
+
+  /* Create vector for fast access of resolution. We do this lazily
+ to save memory. */ 
+  VEC_safe_grow_cleared (ld_plugin_symbol_resolution_t, heap, 
+                         resolutions,
+                         file_data->max_index + 1);
+  for (i = 0; VEC_iterate (res_pair, file_data->respairs, i, rp); i++)
+    VEC_replace (ld_plugin_symbol_resolution_t, resolutions, rp->index, rp->res);
+  VEC_free

Re: [Ping]RE: [Patch, test] Enable to prune warnings for tests defined in one exp file

2012-09-04 Thread Mike Stump

On Sep 3, 2012, at 11:05 PM, Terry Guo terry@arm.com wrote:
 Is it ok to document this feature in README.gcc?

Sure.  I was almost hoping someone had a pointer to a wiki page that had new 
bits...

 Is it ok to back port this feature to 4.7 branch?

Ok.

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Bryce McKinlay

On Tue, Sep 4, 2012 at 5:39 PM, Andrew Haley a...@redhat.com wrote:
 On 09/04/2012 05:32 PM, Bryce McKinlay wrote:
 On Tue, Sep 4, 2012 at 5:07 PM, Dehao Chen de...@google.com wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test when
 assembly scan is available in the libgcj testsuite?

 Once Ian Lance Taylor's libbacktrace patch is integrated (see:
 http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html), we'll be able to get
 rid of the code that calls addr2line from libgcj.

 As I understand it, Ian Taylor's backtrace patch is intended for use in
 gcc development, and as he puts it "Since its use in GCC would
 be purely for GCC developers, it's not essential that it be fully
 portable."  Not for gcj runtime.

He's also planning to use it for libgo, and other gcc runtime libs
have indicated interest. It doesn't have to work on all platforms, and
I can't see how it would be any less portable than addr2line!

Bryce

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Andrew Haley

On 09/04/2012 06:08 PM, Bryce McKinlay wrote:
 On Tue, Sep 4, 2012 at 5:39 PM, Andrew Haley a...@redhat.com wrote:
 On 09/04/2012 05:32 PM, Bryce McKinlay wrote:
 On Tue, Sep 4, 2012 at 5:07 PM, Dehao Chen de...@google.com wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test when
 assembly scan is available in the libgcj testsuite?

 Once Ian Lance Taylor's libbacktrace patch is integrated (see:
 http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html), we'll be able to get
 rid of the code that calls addr2line from libgcj.

 As I understand it, Ian Taylor's backtrace patch is intended for use in
 gcc development, and as he puts it "Since its use in GCC would
 be purely for GCC developers, it's not essential that it be fully
 portable."  Not for gcj runtime.
 
 He's also planning to use it for libgo, and other gcc runtime libs
 have indicated interest. It doesn't have to work on all platforms, and
 I can't see how it would be any less portable than addr2line!

I certainly can.  Maybe once it's shaken-down so it's at least as
robust as what we have now it'll be OK.  I suspect it hasn't had much
testing with, for example, unwinding through signal handlers.

Andrew.

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Bryce McKinlay

On Tue, Sep 4, 2012 at 6:12 PM, Andrew Haley a...@redhat.com wrote:

 He's also planning to use it for libgo, and other gcc runtime libs
 have indicated interest. It doesn't have to work on all platforms, and
 I can't see how it would be any less portable than addr2line!

 I certainly can.  Maybe once it's shaken-down so it's at least as
 robust as what we have now it'll be OK.  I suspect it hasn't had much
 testing with, for example, unwinding through signal handlers.

libgcj wouldn't actually use it for unwinding, we already have all
that. We'd just use it to read DWARF debug info and give us the source
code line numbers.

re: [google/gcc-4_7] Fix GDB test suite regression with -fdebug-types-section patch

2012-09-04 Thread Sterling Augustine

On Sat, Sep 1, 2012 at 1:44 PM,  gcc-patches-digest-h...@gcc.gnu.org wrote:
 From: ccout...@google.com (Cary Coutant)
 To: d...@google.com, gcc-patches@gcc.gnu.org
 Cc:
 Date: Fri, 31 Aug 2012 16:55:40 -0700 (PDT)
 Subject: [google/gcc-4_7] Fix GDB test suite regression with 
 -fdebug-types-section patch
 This patch is for the google/gcc-4_7 branch.

 This patch fixes a problem caused by the previous patch that removed
 the code to copy children of a DIE referenced by a type unit.

 I don't believe that it's necessary to copy the children of the class
 declaration at all, and this patch simply removes the code that copies
 those children. If there's a reference in the type unit to one of the
 children of that class, that one child will get copied in as needed.

 The problem was that it IS necessary to copy the children of a
 non-declaration -- such as a DW_TAG_array_type.  I've restored the
 loop that calls clone_tree_partial, but placed it within a test
 for is_declaration_die.

 Bootstraps and passes regression tests. Also tested with parts of the
 GDB testsuite, and is still able to build a large internal test case
 that previously resulted in out-of-memory during compilation.

 Google ref b/7041390.

 2012-08-31   Cary Coutant  ccout...@google.com

 * gcc/dwarf2out.c (clone_tree_partial): Restore.
 (copy_decls_walk): Call clone_tree_partial to copy children
 of non-declaration DIEs.

This is OK for google branches.

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Andrew Haley

On 09/04/2012 06:17 PM, Bryce McKinlay wrote:
 On Tue, Sep 4, 2012 at 6:12 PM, Andrew Haley a...@redhat.com wrote:
 
 He's also planning to use it for libgo, and other gcc runtime libs
 have indicated interest. It doesn't have to work on all platforms, and
 I can't see how it would be any less portable than addr2line!

 I certainly can.  Maybe once it's shaken-down so it's at least as
 robust as what we have now it'll be OK.  I suspect it hasn't had much
 testing with, for example, unwinding through signal handlers.
 
 libgcj wouldn't actually use it for unwinding, we already have all
 that. We'd just use it to read DWARF debug info and give us the source
 code line numbers.

OK, as long as that's all it does.  I think I was perhaps a bit
misled by its description of a stack backtrace library.  It
certainly looks like a nicer approach than addr2line, but is
going to be much less well-ported.  I guess we'll see how it goes.

Andrew.

Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86

2012-09-04 Thread Uros Bizjak

On Mon, Aug 13, 2012 at 9:39 PM, Igor Zamyatin izamya...@gmail.com wrote:

 Main idea of this activity is mostly to provide user a possibility to
 safely turn on first scheduler for his codes. In some cases this could
 positively affect performance, especially for in-order Atom.

 It would be great to hear some feedback from the community about the change.

I don't think it is necessary to set dependence for CALL_INSN
arguments. It seems to me, that it is enough to set scheduling
priority of moves to hard registers to zero, to schedule them as late
as possible, presumably just before call insn.

The attached patch builds on your idea of setting priorities of moves
from hard registers to pseudos to maximum (these are moves from
function arguments, they should be scheduled as soon as possible to
free hard registers). Please note that it is enough to handle only
likely spilled hard regs (for moves from and to registers), since
these regs are causing all the troubles.

The patch assumes that likely spilled hard regs didn't propagate to
other instructions, and that other hard registers didn't propagate to
operands with wrong constraints (recent x86 improvement).
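
The priority rule described above can be modelled outside the compiler. The following is a sketch only: move_insn and its two flags are hypothetical simplifications of what the real hook derives from the RTL (single_set, HARD_REGISTER_P, class_likely_spilled_p), and max_priority stands in for the scheduler's maximum insn priority.

```cpp
// Hypothetical, simplified view of a single-set move instruction.
struct move_insn
{
  bool src_is_likely_spilled_hard_reg;   // e.g. a move from an incoming argument register
  bool dest_is_likely_spilled_hard_reg;  // e.g. a move to an outgoing argument register
};

// Returns the adjusted scheduling priority.  A higher priority means the
// insn is scheduled earlier, a lower priority means later.
int adjust_priority (const move_insn &insn, int priority,
                     int max_priority, bool reload_completed)
{
  // Only the first (pre-reload) scheduling pass is affected.
  if (reload_completed)
    return priority;

  // Moves out of likely spilled hard registers: schedule as early as
  // possible so the hard register is freed quickly.
  if (insn.src_is_likely_spilled_hard_reg)
    return max_priority;

  // Moves into likely spilled hard registers: schedule as late as
  // possible, ideally right before the call that consumes them.
  if (insn.dest_is_likely_spilled_hard_reg)
    return 0;

  return priority;
}
```

Both cases shorten the live range of the hard register, which is what reduces the chance of a spill failure for the instructions in between.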

Unfortunately, the patch doesn't fix PR 54472 (the spill failure with
selective scheduler). No matter what TARGET_SCHED_ADJUST_PRIORITY
returns, the offending move to ax register always gets scheduled before
problematic string instruction. The patch however builds on the promise
from the documentation that:

 -- Target Hook: int TARGET_SCHED_ADJUST_PRIORITY (rtx INSN, int PRIORITY)

 This hook adjusts the integer scheduling priority PRIORITY of
 INSN.  It should return the new priority.  Increase the priority to
 execute INSN earlier, reduce the priority to execute INSN later.
 Do not define this hook if you do not need to adjust the
 scheduling priorities of insns.

The patch is in RFC state, but survives quite some -fschedule-insns
testing on current mainline, with and without added -fsched-pressure.

Uros.
Index: i386.c
===
--- i386.c  (revision 190932)
+++ i386.c  (working copy)
@@ -24314,6 +24314,49 @@ ix86_sched_reorder(FILE *dump, int sched_verbose,
   return issue_rate;
 }
 
+/* Before reload, adjust priority of moves to/from likely spilled
+   hard registers.  This reduces hard register life times and consequently
+   the chance of spill failures for enclosed instructions.  */
+static int
+ix86_adjust_priority (rtx insn, int priority)
+{
+  rtx set;
+
+  if (reload_completed)
+return priority;
+
+  if (!NONJUMP_INSN_P (insn))
+return priority;
+
+  set = single_set (insn);
+
+  if (set)
+{
+  rtx tmp;
+
+  /* Set priority of moves from likely spilled hard registers to maximum,
+to schedule them as soon as possible.  These are moves from
+function argument registers at the top of the function entry.  */
+  tmp = SET_SRC (set);
+  if (REG_P (tmp)
+      && HARD_REGISTER_P (tmp)
+      && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (tmp))
+      && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO (tmp))))
+    return current_sched_info->sched_max_insns_priority;
+
+  /* Set priority of moves to likely spilled hard registers to minimum,
+to schedule them as late as possible.  These are moves to
+function argument registers before function call.  */
+  tmp = SET_DEST (set);
+  if (REG_P (tmp)
+      && HARD_REGISTER_P (tmp)
+      && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (tmp))
+      && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO (tmp))))
+    return 0;
+}
+
+  return priority;
+}
 
 
 /* Model decoder of Core 2/i7.
@@ -39608,6 +39651,8 @@ ix86_enum_va_list (int idx, const char **pname, tr
 #define TARGET_SCHED_REASSOCIATION_WIDTH ix86_reassociation_width
 #undef TARGET_SCHED_REORDER
 #define TARGET_SCHED_REORDER ix86_sched_reorder
+#undef TARGET_SCHED_ADJUST_PRIORITY
+#define TARGET_SCHED_ADJUST_PRIORITY ix86_adjust_priority
 
 /* The size of the dispatch window is the total number of bytes of
object code allowed in a window.  */

[RFC] PowerPC / rs6000 call glue removal

2012-09-04 Thread David Edelsohn

Segher and I are planning to remove the machinery supporting
RS6000_CALL_GLUE.  In the AIX ABI, used by AIX, PowerOpen, PPC64 Linux
and mcall-aixdesc, direct calls to named functions that may be
external are followed by a special no-op instruction that the linker
can replace with an instruction to restore the TOC addressability
register.  This no-op instruction changed over time prior to the
PowerPC architecture: initially cror 15,15,15, then cror 31,31,31
and finally settling on the PowerPC nop instruction.

PPC64 Linux only has ever used the PowerPC nop instruction.  All
versions of AIX targeting PowerPC use the PowerPC nop instruction.
All current configurations of GCC targeting PPC64 Linux and AIX only
generate the nop instruction.  The machinery is not used for the
PPC32 SVR4 ABI and eABI.

With the recent removal of support for the original POWER instruction
set from GCC, we propose to remove the machinery to control the no-op
instruction emitted by GCC. The linkers will continue to handle the
older no-op instructions in any old object files.
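
For readers unfamiliar with the three no-op forms mentioned above, the sketch below shows how they can be told apart by encoding. It is an illustration, not code from GCC or any linker: the helper names are hypothetical, and the bit layouts assume the standard Power ISA XL-form for cror (primary opcode 19, extended opcode 449) and D-form for ori (primary opcode 24), with nop being ori 0,0,0.

```cpp
#include <cstdint>

// XL-form: primary opcode 19, then BT, BA, BB, extended opcode 449.
constexpr std::uint32_t encode_cror (unsigned bt, unsigned ba, unsigned bb)
{
  return (19u << 26) | (bt << 21) | (ba << 16) | (bb << 11) | (449u << 1);
}

// D-form: primary opcode 24, then RS, RA and a 16-bit immediate.
constexpr std::uint32_t encode_ori (unsigned ra, unsigned rs, unsigned ui)
{
  return (24u << 26) | (rs << 21) | (ra << 16) | ui;
}

// True if INSN is one of the historical call-glue no-ops that may follow
// a direct call in the AIX ABI.
bool is_call_glue_noop (std::uint32_t insn)
{
  return insn == encode_ori (0, 0, 0)       // PowerPC nop
         || insn == encode_cror (15, 15, 15)
         || insn == encode_cror (31, 31, 31);
}
```

Under those assumptions the three forms encode as 0x60000000, 0x4def7b82 and 0x4ffffb82 respectively, so a linker that still accepts all three only needs a comparison like the one in is_call_glue_noop.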

However, the default value of RS6000_CALL_GLUE, which almost all
targets override, is cror 31,31,31.  The only place this potentially
still could be used is in the old mcall-aixdesc embedded mode
originally designed by Cygnus for a customer who desired an embedded
mode with AIX compatibility.  We do not believe there are any
remaining users of this option, but we want to distribute this
announcement widely to ensure that anyone affected has an opportunity
to comment.

http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01849.html

As the saying goes: Speak now, or forever hold your peace.

Thanks, David

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Georg-Johann Lay

Weddington, Eric wrote:
 
 Can you explain this?  A typical build of avr tools goes like

 1) configure, build and install binutils
 2) configure, build and install the compiler
 3) configure, build and install AVR-Libc

 so that in step 2 no checking is possible because there is no -lc yet.
 Or do you mean a check at run time (of the compiler)?
 4) build and install the real compiler

 at which time you have AVR-libc available.  AT least that's how you
 bootstrap a glibc cross.
 avr-gcc has had a simplified build process for a while, as it almost never
 needed to have a avr-gcc hosted on an avr platform.  It is usually
 built as a cross-compiler that always run on the build platform.

 What I was suggesting earlier is that we shouldn't continue patching
 the AVR target as if the current state is almost ideal.  Pick a libc --
 avr-libc appears to be the natural implementation -- and make it the
 default as opposed to adding knobs.
 
 I also strongly agree with this.
 
 AFAIK, the only project that uses newlib as the C library for the AVR target
 is RTEMS, because, AIUI, they need to have the POSIX interface. The vast
 majority of AVR users have a toolchain that uses avr-libc.

So here is an updated version of the patch.
Instead of testing "with_avrlibc = yes" it tests "with_avrlibc != no".

Just like the first version, --with-avrlibc[=*] is only recognized
if avr-gcc is not configured for RTEMS, i.e. RTEMS users don't need
to set --with-avrlibc=no in order to get a complete libgcc.


Johann

--

PR target/54461
* configure.ac (noconfigdirs,target=avr-*-*): Add target-newlib,
target-libgloss if not configured --with-avrlibc=no.
* configure: Regenerate.

libgcc/
PR target/54461
* config.host (tmake_file,host=avr-*-*): Add avr/t-avrlibc if
not configured --with-avrlibc=no.
* config/avr/t-avrlibc: New file.
* Makefile.in (FPBIT_FUNCS): filter-out LIB2FUNCS_EXCLUDE.
(DPBIT_FUNCS): Ditto.
(TPBIT_FUNCS): Ditto.

gcc/
PR target/54461
* config.gcc (tm_file,target=avr-*-*): Add avr/avrlibc.h if
not configured --with-avrlibc=no.
(tm_defines,target=avr-*-*): Add WITH_AVRLIBC if not configured
--with-avrlibc=no.
* config/avr/avrlibc.h: New file.
* config/avr/avr-c.c: Build-in define __WITH_AVRLIBC__ if
not configured --with-avrlibc=no.

Index: configure
===
--- configure	(revision 190922)
+++ configure	(working copy)
@@ -3499,6 +3499,13 @@ case ${target} in
   arm-*-riscix*)
     noconfigdirs="$noconfigdirs ld target-libgloss"
     ;;
+  avr-*-rtems*)
+    ;;
+  avr-*-*)
+    if test x${with_avrlibc} != xno; then
+      noconfigdirs="$noconfigdirs target-newlib target-libgloss"
+    fi
+    ;;
   c4x-*-* | tic4x-*-*)
     noconfigdirs="$noconfigdirs target-libgloss"
     ;;
Index: configure.ac
===
--- configure.ac	(revision 190922)
+++ configure.ac	(working copy)
@@ -891,6 +891,13 @@ case ${target} in
   arm-*-riscix*)
     noconfigdirs="$noconfigdirs ld target-libgloss"
     ;;
+  avr-*-rtems*)
+    ;;
+  avr-*-*)
+    if test x${with_avrlibc} != xno; then
+      noconfigdirs="$noconfigdirs target-newlib target-libgloss"
+    fi
+    ;;
   c4x-*-* | tic4x-*-*)
     noconfigdirs="$noconfigdirs target-libgloss"
     ;;
Index: libgcc/config/avr/t-avrlibc
===
--- libgcc/config/avr/t-avrlibc	(revision 0)
+++ libgcc/config/avr/t-avrlibc	(revision 0)
@@ -0,0 +1,66 @@
+# This file is used if not configured --with-avrlibc=no
+#
+# AVR-Libc comes with hand-optimized float routines.
+# For historical reasons, these routines live in AVR-Libc
+# and not in libgcc and use the same function names like libgcc.
+# To get the best support, i.e. always use the routines from
+# AVR-Libc, we remove these routines from libgcc.
+#
+# See also PR54461.
+#
+#
+# Arithmetic:
+# __addsf3 __subsf3 __divsf3 __mulsf3 __negsf2
+#
+# Comparison:
+# __cmpsf2 __unordsf2
+# __eqsf2 __lesf2 __ltsf2 __nesf2 __gesf2 __gtsf2
+#
+# Conversion:
+# __fixsfdi __fixunssfdi __floatdisf __floatundisf
+# __fixsfsi __fixunssfsi __floatsisf __floatunsisf
+#
+#
+# These functions are contained in modules:
+#
+# _addsub_sf.o:   __addsf3  __subsf3
+# _mul_sf.o:  __mulsf3
+# _div_sf.o:  __divsf3
+# _negate_sf.o:   __negsf2
+#
+# _compare_sf.o:  __cmpsf2
+# _unord_sf.o:__unordsf2
+# _eq_sf.o:   __eqsf2
+# _ne_sf.o:   __nesf2
+# _ge_sf.o:   __gesf2
+# _gt_sf.o:   __gtsf2
+# _le_sf.o:   __lesf2
+# _lt_sf.o:   __ltsf2
+#
+# _fixsfdi.o: __fixsfdi
+# _fixunssfdi.o:  __fixunssfdi
+# _fixunssfsi.o:  __fixunssfsi
+# _floatdisf.o:   __floatdisf
+# _floatundisf.o: __floatundisf
+# _sf_to_si.o:__fixsfsi
+# _si_to_sf.o:__floatsisf
+# _usi_to_sf.o:   __floatunsisf
+
+
+# SFmode
+LIB2FUNCS_EXCLUDE += \

RE: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Weddington, Eric

 -Original Message-
 From: Georg-Johann Lay []
 Sent: Tuesday, September 04, 2012 12:00 PM
 To: Weddington, Eric
 Cc: Gabriel Dos Reis; Richard Guenther; gcc-patches@gcc.gnu.org; Denis
 Chertykov; Joerg Wunsch
 Subject: Re: [Patch,avr] PR54461: Better AVR-Libc integration

 So here is an updated version of the patch.
 Instead of with_avrlibc = yes it does with_avrlibc != no.

 Just like the first version, --with-avrlibc[=*] is only recognized
 if avr-gcc is not configured for RTEMS, i.e. RTEMS users don't need
 to set --with-avrlibc=no in order to get a complete libgcc.

Sorry, I'm a bit confused. With your new patch...

- If I build GCC, for the avr target (plain), without specifying the 
--with-avr-libc= switch, does it default to yes?

- If I build GCC, for the avr-rtems target, without specifying the 
--with-avr-libc= switch, does it default to no?

Because the above is what I would expect the default behavior to be. Doing that 
would certainly help with backwards compatibility for those building toolchain 
distributions.

I would think that the user has to specify the --with-avr-libc= flag to 
explicitly deviate from common usage and practice.

Eric Weddington

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Gabriel Dos Reis

On Tue, Sep 4, 2012 at 1:00 PM, Georg-Johann Lay a...@gjlay.de wrote:
 Weddington, Eric wrote:

 Can you explain this?  A typical build of avr tools goes like

 1) configure, build and install binutils
 2) configure, build and install the compiler
 3) configure, build and install AVR-Libc

 so that in step 2 no checking is possible because there is no -lc yet.
 Or do you mean a check at run time (of the compiler)?
 4) build and install the real compiler

 at which time you have AVR-libc available.  AT least that's how you
 bootstrap a glibc cross.
 avr-gcc has had a simplified build process for a while, as it almost never
 needed to have a avr-gcc hosted on an avr platform.  It is usually
 built as a cross-compiler that always run on the build platform.

 What I was suggesting earlier is that we shouldn't continue patching
 the AVR target as if the current state is almost ideal.  Pick a libc --
 avr-libc appears to be the natural implementation -- and make it the
 default as opposed to adding knobs.

 I also strongly agree with this.

 AFAIK, the only project that uses newlib as the C library for the AVR target
 is RTEMS, because, AIUI, they need to have the POSIX interface. The vast
 majority of AVR users have a toolchain that uses avr-libc.

 So here is an updated version of the patch.
 Instead of with_avrlibc = yes it does with_avrlibc != no.

 Just like the first version, --with-avrlibc[=*] is only recognized
 if avr-gcc is not configured for RTEMS, i.e. RTEMS users don't need
 to set --with-avrlibc=no in order to get a complete libgcc.

Thanks!  I am satisfied with this.

-- Gaby

Fix PR 54478 - Work around g++ 4.3 parsing bug

2012-09-04 Thread Diego Novillo

This patch works around a parsing problem with g++ 4.3.  The parser is
failing to lookup calls to the template function reserve when called
from other member functions:

vec_t<T>::reserve<A> (...)

The parser thinks that the '<' in reserve<A> is a less-than operation.
This problem does not happen after 4.3.

This code is going to change significantly, so this won't be needed
soon.
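
A reduced illustration of the two spellings involved, assuming a much simplified vec_t (the member names qualified and unqualified and the enumerators alloc_heap and alloc_gc are hypothetical, not from vec.h). A conforming compiler accepts both forms; the patch switches to the unqualified call, which g++ 4.3 also parses correctly.

```cpp
#include <cstddef>

enum vec_allocation_t { alloc_heap = 0, alloc_gc = 1 };

template<typename T>
struct vec_t
{
  // Member template parameterized on the allocation strategy.
  template<vec_allocation_t A>
  static std::size_t reserve (std::size_t nelems)
  {
    return nelems + static_cast<std::size_t> (A);
  }

  // Qualified call.  The explicit 'template' keyword tells the parser
  // that the following '<' starts a template argument list.
  template<vec_allocation_t A>
  static std::size_t qualified (std::size_t nelems)
  {
    return vec_t<T>::template reserve<A> (nelems);
  }

  // Unqualified call, the form the patch switches to.
  template<vec_allocation_t A>
  static std::size_t unqualified (std::size_t nelems)
  {
    return reserve<A> (nelems);
  }
};
```

Both members resolve to the same reserve<A>, so the change is purely a matter of how the call is spelled.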

Tested on x86_64 with g++ 4.3 and g++ 4.6.


Diego.


PR bootstrap/54478
* vec.h (vec_t::alloc): Remove explicit type specification
in call to reserve.
(vec_t::copy): Likewise.
(vec_t::reserve): Likewise.
(vec_t::reserve_exact): Likewise.
(vec_t::safe_splice): Likewise.
(vec_t::safe_push): Likewise.
(vec_t::safe_grow): Likewise.
(vec_t::safe_grow_cleared): Likewise.
(vec_t::safe_insert): Likewise.

diff --git a/gcc/vec.h b/gcc/vec.h
index ac426e9..c0f1bb2 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -655,7 +655,7 @@ template<enum vec_allocation_t A>
 vec_t<T> *
 vec_t<T>::alloc (int nelems MEM_STAT_DECL)
 {
-  return vec_t<T>::reserve_exact<A> ((vec_t<T> *) NULL, nelems PASS_MEM_STAT);
+  return reserve_exact<A> ((vec_t<T> *) NULL, nelems PASS_MEM_STAT);
 }
 
 template<typename T>
@@ -699,8 +699,8 @@ vec_t<T>::copy (ALONE_MEM_STAT_DECL)
 
   if (len)
     {
-      new_vec = vec_t<T>::reserve_exact<A> (static_cast<vec_t<T> *> (NULL),
-                                            len PASS_MEM_STAT);
+      new_vec = reserve_exact<A> (static_cast<vec_t<T> *> (NULL),
+                                  len PASS_MEM_STAT);
       new_vec->embedded_init (len, len);
       memcpy (new_vec->address (), vec_, sizeof (T) * len);
     }
@@ -736,7 +736,7 @@ vec_t<T>::reserve (vec_t<T> **vec, int nelems VEC_CHECK_DECL MEM_STAT_DECL)
   bool extend = (*vec) ? !(*vec)->space (nelems VEC_CHECK_PASS) : nelems != 0;
 
   if (extend)
-    *vec = vec_t<T>::reserve<A> (*vec, nelems PASS_MEM_STAT);
+    *vec = reserve<A> (*vec, nelems PASS_MEM_STAT);
 
   return extend;
 }
@@ -755,7 +755,7 @@ vec_t<T>::reserve_exact (vec_t<T> **vec, int nelems VEC_CHECK_DECL
   bool extend = (*vec) ? !(*vec)->space (nelems VEC_CHECK_PASS) : nelems != 0;
 
   if (extend)
-    *vec = vec_t<T>::reserve_exact<A> (*vec, nelems PASS_MEM_STAT);
+    *vec = reserve_exact<A> (*vec, nelems PASS_MEM_STAT);
 
   return extend;
 }
@@ -796,8 +796,7 @@ vec_t<T>::safe_splice (vec_t<T> **dst, vec_t<T> *src VEC_CHECK_DECL
 {
   if (src)
     {
-      vec_t<T>::reserve_exact<A> (dst, VEC_length (T, src) VEC_CHECK_PASS
-                                  MEM_STAT_INFO);
+      reserve_exact<A> (dst, VEC_length (T, src) VEC_CHECK_PASS MEM_STAT_INFO);
       (*dst)->splice (src VEC_CHECK_PASS);
     }
 }
@@ -843,7 +842,7 @@ template<enum vec_allocation_t A>
 T &
 vec_t<T>::safe_push (vec_t<T> **vec, T obj VEC_CHECK_DECL MEM_STAT_DECL)
 {
-  vec_t<T>::reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
+  reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
   return (*vec)->quick_push (obj VEC_CHECK_PASS);
 }
 
@@ -858,7 +857,7 @@ template<enum vec_allocation_t A>
 T *
 vec_t<T>::safe_push (vec_t<T> **vec, const T *ptr VEC_CHECK_DECL MEM_STAT_DECL)
 {
-  vec_t<T>::reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
+  reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
   return (*vec)->quick_push (ptr VEC_CHECK_PASS);
 }
 
@@ -898,8 +897,8 @@ vec_t<T>::safe_grow (vec_t<T> **vec, int size VEC_CHECK_DECL MEM_STAT_DECL)
 {
   VEC_ASSERT (size >= 0 && VEC_length (T, *vec) <= (unsigned)size,
               "grow", T, A);
-  vec_t<T>::reserve_exact<A> (vec, size - (int)VEC_length (T, *vec)
-                              VEC_CHECK_PASS PASS_MEM_STAT);
+  reserve_exact<A> (vec, size - (int)VEC_length (T, *vec)
+                    VEC_CHECK_PASS PASS_MEM_STAT);
   (*vec)->prefix_.num_ = size;
 }
 
@@ -915,7 +914,7 @@ vec_t<T>::safe_grow_cleared (vec_t<T> **vec, int size VEC_CHECK_DECL
                              MEM_STAT_DECL)
 {
   int oldsize = VEC_length (T, *vec);
-  vec_t<T>::safe_grow<A> (vec, size VEC_CHECK_PASS PASS_MEM_STAT);
+  safe_grow<A> (vec, size VEC_CHECK_PASS PASS_MEM_STAT);
   memset (&((*vec)->address ()[oldsize]), 0, sizeof (T) * (size - oldsize));
 }
 
@@ -972,7 +971,7 @@ void
 vec_t<T>::safe_insert (vec_t<T> **vec, unsigned ix, T obj VEC_CHECK_DECL
                        MEM_STAT_DECL)
 {
-  vec_t<T>::reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
+  reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
   (*vec)->quick_insert (ix, obj VEC_CHECK_PASS);
 }
 
@@ -988,7 +987,7 @@ void
 vec_t<T>::safe_insert (vec_t<T> **vec, unsigned ix, T *ptr VEC_CHECK_DECL
                        MEM_STAT_DECL)
 {
-  vec_t<T>::reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
+  reserve<A> (vec, 1 VEC_CHECK_PASS PASS_MEM_STAT);
   (*vec)->quick_insert (ix, ptr VEC_CHECK_PASS);
 }

PATCH to configure.ac to fix --enable-languages=all

2012-09-04 Thread Jason Merrill

I configure GCC with --enable-languages=all,obj-c++ for testing, and 
this started breaking recently, because I ended up with 'c' twice in the 
list of languages, so reconfiguring breaks.  This seems to be because 
r189080 didn't adjust the all case along with the language name case.


Tested x86_64-pc-linux-gnu, applying to trunk as obvious.
commit 1e901c1bd7814cd5e3b6800fe035255ab6c3976f
Author: Jason Merrill ja...@redhat.com
Date:   Tue Sep 4 13:25:21 2012 -0400

	* configure.ac: Fix --enable-languages=all.

diff --git a/configure b/configure
index 0f655b8..cd06e4e 100755
--- a/configure
+++ b/configure
@@ -6112,6 +6112,7 @@ if test -d ${srcdir}/gcc; then
 	  boot_language=yes
 	fi
 
+add_this_lang=no
 case ,${enable_languages}, in
   *,${language},*)
 # Language was explicitly selected; include it
@@ -6122,10 +6123,9 @@ if test -d ${srcdir}/gcc; then
 ;;
   *,all,*)
 # 'all' was selected, select it if it is a default language
-add_this_lang=${build_by_default}
-;;
-  *)
-add_this_lang=no
+	if test $language != c; then
+	  add_this_lang=${build_by_default}
+	fi
 ;;
 esac
 
diff --git a/configure.ac b/configure.ac
index 02174b3..9bee624 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1758,6 +1758,7 @@ if test -d ${srcdir}/gcc; then
 	  boot_language=yes
 	fi
 
+add_this_lang=no
 case ,${enable_languages}, in
   *,${language},*)
 # Language was explicitly selected; include it
@@ -1768,10 +1769,9 @@ if test -d ${srcdir}/gcc; then
 ;;
   *,all,*)
 # 'all' was selected, select it if it is a default language
-add_this_lang=${build_by_default}
-;;
-  *)
-add_this_lang=no
+	if test $language != c; then
+	  add_this_lang=${build_by_default}
+	fi
 ;;
 esac

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Richard Sandiford

Oleg Endo oleg.e...@t-online.de writes:
 On Mon, 2012-09-03 at 01:58 +0200, Oleg Endo wrote:
 OKOK -- I'll do it :)
 (within the next couple of days)
 

 And so I did.  Attached is an updated patch that adds the address space
 parameter to the address_cost function.  I hope that this change does
 not reset the ACKs so far:

 [x] target-independent bits
 [ ] alpha     [ ] arm       [ ] avr       [ ] bfin
 [ ] cr16      [ ] cris      [ ] epiphany  [ ] i386
 [ ] ia64      [ ] iq2000    [ ] lm32      [ ] m32c
 [ ] m32r      [ ] mcore     [ ] mep       [x] microblaze
 [x] mips      [ ] mmix      [x] mn10300   [ ] pa
 [ ] rs6000    [ ] rx        [ ] s390      [ ] score
 [x] sh        [ ] sparc     [ ] spu       [ ] stormy16
 [ ] v850      [ ] vax       [ ] xtensa

 Tested with 'make all-gcc' on SH xgcc and i386 native build.
 No functional changes, except on MIPS, as requested by Richard
 Sandiford.

Thanks, looks good to me.

Hopefully a friendly global maintainer will approve the whole thing in
one go (modulo Alex's comment) so that you don't need to get individual
approvals for all targets.

Richard

[PATCH] Further OpenBSD/amd64 and OpenBSD/i386 improvements

2012-09-04 Thread Mark Kettenis

Here are some additional fixes for OpenBSD that fix a fair number of
failing testcases.  I can split this up in smaller patches if that's
preferred.

I believe I submitted the openbsd-stdint.h bit before.  We consistently
use long long types for the *max_t types, on both 32-bit and 64-bit
platforms whereas GCC defaults to using long on 32-bit platforms and
long long on 64-bit platforms.  Hence the need for overrides.


libgcc/:

2012-09-02  Mark Kettenis  kette...@gnu.org

* config.host (*-*-openbsd*): Add t-eh-dw2-dip to tmake_file.
(i[34567]86-*-openbsd* and x86_64-*-openbsd*): Add to list of
i[34567]86-*-* and x86_64-*-* soft-fp targets.
* unwind-dw2-fde-dip.c: Don't include elf.h on OpenBSD.
(USE_PT_GNU_EH_FRAME): Define for OpenBSD.
(ElfW): Likewise.

gcc/:

2012-09-02  Mark Kettenis  kette...@gnu.org

* config.gcc (*-*-openbsd4.[3-9]|*-*-openbsd[5-9]*): Set
default_use_cxa_atexit to yes.
* config/openbsd-stdint.h (INTMAX_TYPE, UINTMAX_TYPE): Define.
* config/i386/openbsdelf.h (LIBGCC2_HAS_TF_MODE, LIBGCC2_TF_CEXT) 
(TF_SIZE): Define.

Index: libgcc/unwind-dw2-fde-dip.c
===
--- libgcc/unwind-dw2-fde-dip.c (revision 190863)
+++ libgcc/unwind-dw2-fde-dip.c (working copy)
@@ -33,7 +33,7 @@
 
 #include "tconfig.h"
 #include "tsystem.h"
-#ifndef inhibit_libc
+#if !defined(inhibit_libc) && !defined(__OpenBSD__)
 #include <elf.h>		/* Get DT_CONFIG.  */
 #endif
 #include "coretypes.h"
@@ -65,6 +65,12 @@
 #endif
 
 #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
+    && defined(__OpenBSD__)
+# define ElfW(type) Elf_##type
+# define USE_PT_GNU_EH_FRAME
+#endif
+
+#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
     && defined(TARGET_DL_ITERATE_PHDR) \
     && defined(__sun__) && defined(__svr4__)
 # define USE_PT_GNU_EH_FRAME
Index: libgcc/config.host
===
--- libgcc/config.host  (revision 190863)
+++ libgcc/config.host  (working copy)
@@ -213,7 +213,7 @@
   esac
   ;;
 *-*-openbsd*)
-  tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic"
+  tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic t-eh-dw2-dip"
   case ${target_thread_file} in
 posix)
   tmake_file="$tmake_file t-openbsd-thread"
@@ -1150,7 +1150,8 @@
   i[34567]86-*-gnu* | \
   i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]* | \
   i[34567]86-*-cygwin* | i[34567]86-*-mingw* | x86_64-*-mingw* | \
-  i[34567]86-*-freebsd* | x86_64-*-freebsd*)
+  i[34567]86-*-freebsd* | x86_64-*-freebsd* | \
+  i[34567]86-*-openbsd* | x86_64-*-openbsd*)
 	tmake_file="${tmake_file} t-softfp-tf"
 	if test "${host_address}" = 32; then
 		tmake_file="${tmake_file} i386/${host_address}/t-softfp"
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 190863)
+++ gcc/config.gcc  (working copy)
@@ -708,6 +708,11 @@
 *-*-openbsd2.*|*-*-openbsd3.[012])
   tm_defines="${tm_defines} HAS_LIBC_R=1" ;;
   esac
+  case ${target} in
+*-*-openbsd4.[3-9]|*-*-openbsd[5-9]*)
+  default_use_cxa_atexit=yes
+  ;;
+  esac
   ;;
 *-*-rtems*)
   case ${enable_threads} in
Index: gcc/config/i386/openbsdelf.h
===
--- gcc/config/i386/openbsdelf.h	(revision 190863)
+++ gcc/config/i386/openbsdelf.h	(working copy)
@@ -111,3 +111,9 @@
 #define OBSD_HAS_CORRECT_SPECS
 
 #define HAVE_ENABLE_EXECUTE_STACK
+
+/* Put all *tf routines in libgcc.  */
+#undef LIBGCC2_HAS_TF_MODE
+#define LIBGCC2_HAS_TF_MODE 1
+#define LIBGCC2_TF_CEXT q
+#define TF_SIZE 113
Index: gcc/config/openbsd-stdint.h
===
--- gcc/config/openbsd-stdint.h (revision 190863)
+++ gcc/config/openbsd-stdint.h (working copy)
@@ -26,6 +26,9 @@
 #define UINT_FAST16_TYPE   unsigned int
 #define UINT_FAST32_TYPE   unsigned int
 #define UINT_FAST64_TYPE   long long unsigned int
+
+#define INTMAX_TYPE		long long int
+#define UINTMAX_TYPE		long long unsigned int
 
 #define INTPTR_TYPE		long int
 #define UINTPTR_TYPE		long unsigned int

C++ PATCH for c++/54437 (firefox build failure)

2012-09-04 Thread Jason Merrill

Here, the problem was that we were resolving the address of an 
overloaded function in the context of the template being called (which 
doesn't have access to the function) rather than the caller (which 
does).  We need to massage explicit template arguments before we enter 
the callee's context.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit c17767b10d05c0ea47107a3b7f067da76cc5ad8d
Author: Jason Merrill ja...@redhat.com
Date:   Tue Sep 4 11:27:03 2012 -0400

	PR c++/54437
	PR c++/51213
	* pt.c (fn_type_unification): Call coerce_template_parms before
	entering substitution context.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4a39427..6f6235c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14591,11 +14591,22 @@ fn_type_unification (tree fn,
   static int deduction_depth;
   struct pending_template *old_last_pend = last_pending_template;
   struct tinst_level *old_error_tinst = last_error_tinst_level;
+  tree tparms = DECL_INNERMOST_TEMPLATE_PARMS (fn);
   tree tinst;
   tree r = error_mark_node;
 
-  if (excessive_deduction_depth)
-return error_mark_node;
+  /* Adjust any explicit template arguments before entering the
+ substitution context.  */
+  if (explicit_targs)
+{
+  explicit_targs
+	= (coerce_template_parms (tparms, explicit_targs, NULL_TREE,
+  complain,
+  /*require_all_args=*/false,
+  /*use_default_args=*/false));
+  if (explicit_targs == error_mark_node)
+	return error_mark_node;
+}
 
   /* In C++0x, it's possible to have a function template whose type depends
  on itself recursively.  This is most obvious with decltype, but can also
@@ -14608,6 +14619,8 @@ fn_type_unification (tree fn,
  substitutions back up to the initial one.
 
  This is, of course, not reentrant.  */
+  if (excessive_deduction_depth)
+return error_mark_node;
   tinst = build_tree_list (fn, targs);
   if (!push_tinst_level (tinst))
 {
@@ -14640,23 +14653,10 @@ fn_type_unification (tree fn,
 	 specified template argument values.  If a substitution in a
 	 template parameter or in the function type of the function
 	 template results in an invalid type, type deduction fails.  */
-  tree tparms = DECL_INNERMOST_TEMPLATE_PARMS (fn);
   int i, len = TREE_VEC_LENGTH (tparms);
   location_t loc = input_location;
-  tree converted_args;
   bool incomplete = false;
 
-  if (explicit_targs == error_mark_node)
-	goto fail;
-
-  converted_args
-	= (coerce_template_parms (tparms, explicit_targs, NULL_TREE,
-  complain,
-   /*require_all_args=*/false,
-   /*use_default_args=*/false));
-  if (converted_args == error_mark_node)
-	goto fail;
-
   /* Substitute the explicit args into the function type.  This is
 	 necessary so that, for instance, explicitly declared function
 	 arguments can match null pointed constants.  If we were given
@@ -14667,7 +14667,7 @@ fn_type_unification (tree fn,
 {
   tree parm = TREE_VALUE (TREE_VEC_ELT (tparms, i));
   bool parameter_pack = false;
-	  tree targ = TREE_VEC_ELT (converted_args, i);
+	  tree targ = TREE_VEC_ELT (explicit_targs, i);
 
   /* Dig out the actual parm.  */
   if (TREE_CODE (parm) == TYPE_DECL
@@ -14705,7 +14705,7 @@ fn_type_unification (tree fn,
 
   processing_template_decl += incomplete;
   input_location = DECL_SOURCE_LOCATION (fn);
-  fntype = tsubst (TREE_TYPE (fn), converted_args,
+  fntype = tsubst (TREE_TYPE (fn), explicit_targs,
 		   complain | tf_partial, NULL_TREE);
   input_location = loc;
   processing_template_decl -= incomplete;
@@ -14714,8 +14714,8 @@ fn_type_unification (tree fn,
 	goto fail;
 
   /* Place the explicitly specified arguments in TARGS.  */
-  for (i = NUM_TMPL_ARGS (converted_args); i--;)
-	TREE_VEC_ELT (targs, i) = TREE_VEC_ELT (converted_args, i);
+  for (i = NUM_TMPL_ARGS (explicit_targs); i--;)
+	TREE_VEC_ELT (targs, i) = TREE_VEC_ELT (explicit_targs, i);
 }
 
   /* Never do unification on the 'this' parameter.  */
diff --git a/gcc/testsuite/g++.dg/template/access24.C b/gcc/testsuite/g++.dg/template/access24.C
new file mode 100644
index 000..9f19226
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/access24.C
@@ -0,0 +1,8 @@
+// PR c++/54437
+
+template <void (*P)()> void f();
+class A {
+  template <class T> static void g();
+  template <class T> static void h () { f<g<T> > (); }
+  static void i() { h<int>(); }
+};

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Oleg Endo

On Tue, 2012-09-04 at 12:02 -0300, Alexandre Oliva wrote:

  Index: gcc/config/mn10300/mn10300.c
 
  -  total = mn10300_address_cost (XEXP (x, 0), speed);
  +  total = mn10300_address_cost (XEXP (x, 0), GET_MODE (x),
  +   ADDR_SPACE_GENERIC, speed);
 
 Instead of ADDR_SPACE_GENERIC, this should be MEM_ADDR_SPACE (x), no?

Effectively, it actually doesn't matter, since the address space is not
used in the cost function.  But yeah, true, fixed thusly.  The change
log entry for this was also wrong.  Fixed that, too.  Thanks.  Updated
patch and change log below.

On Tue, 2012-09-04 at 12:38 +0200, Paolo Bonzini wrote:

  I think you only need explicit approval for mn10300.  All other
 changes are trivial.
 

On Tue, 2012-09-04 at 19:43 +0100, Richard Sandiford wrote:

 Thanks, looks good to me.
 
 Hopefully a friendly global maintainer will approve the whole thing in
 one go (modulo Alex's comment) so that you don't need to get
 individual
 approvals for all targets.

Hmm .. the ACK status so far is:

[x] target-independent bits
[ ] alpha [x] arm   [ ] avr [ ] bfin
[ ] cr16  [ ] cris  [ ] epiphany[ ] i386
[ ] ia64  [x] iq2000[ ] lm32[ ] m32c
[x] m32r  [x] mcore [ ] mep [x] microblaze
[x] mips  [ ] mmix  [x] mn10300 [ ] pa
[ ] rs6000[x] rx[ ] s390[ ] score
[x] sh[ ] sparc [ ] spu [x] stormy16
[x] v850  [ ] vax   [ ] xtensa

I think I'll wait until Friday.  If there are no further objections
by then, I'd like to go ahead and install the patch even if some of the
boxes still remain unchecked.  Would that be OK?


On Tue, 2012-09-04 at 16:32 +0200, Richard Guenther wrote:
 
  +hook_int_rtx_mode_as_bool_0 (rtx, enum machine_mode, addr_space_t,
 bool)
 
  So we're using C++ already?  Or do we want ATTRIBUTE_UNUSED here?
 
 Use C++ where it is so nicely obvious an improvement ;)

ProblemFactoryManagerListenerSingleton? ;)
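
For context on the exchange above: a C++ definition may leave parameters unnamed, whereas plain C (before C23) needs names and then an unused-parameter annotation.  A minimal C sketch of a widened cost hook follows, assuming GCC-style attributes; SKETCH_UNUSED, address_cost_fn, the integer "mode" and both functions are hypothetical stand-ins, not the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCC's ATTRIBUTE_UNUSED.  */
#if defined(__GNUC__)
# define SKETCH_UNUSED __attribute__ ((unused))
#else
# define SKETCH_UNUSED
#endif

/* Hypothetical hook type: address, access width in bytes standing in
   for the machine mode, address space, and the speed/size switch.  */
typedef int (*address_cost_fn) (const void *address, int mode,
				unsigned char addr_space, bool speed);

/* Default in the spirit of hook_int_rtx_mode_as_bool_0: ignore every
   argument and report a cost of zero.  */
static int
cost_always_zero (const void *address SKETCH_UNUSED,
		  int mode SKETCH_UNUSED,
		  unsigned char addr_space SKETCH_UNUSED,
		  bool speed SKETCH_UNUSED)
{
  return 0;
}

/* A target that does care about the mode: wide accesses cost more.  */
static int
cost_by_mode (const void *address SKETCH_UNUSED, int mode,
	      unsigned char addr_space SKETCH_UNUSED,
	      bool speed SKETCH_UNUSED)
{
  assert (mode >= 0);
  return mode > 4 ? 2 : 1;
}
```

Targets that ignore the new arguments can keep pointing the hook at the zero-cost default, while a target like the second function can start using the mode without any further interface change.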


Cheers,
Oleg

ChangeLog:

* hooks.c (hook_int_rtx_mode_as_bool_0): New function.
* hooks.h (hook_int_rtx_mode_as_bool_0): Declare it.
* output.h (default_address_cost): Add machine_mode 
and address space arguments.
* target.def (address_cost): Likewise.
* rtlanal.c (address_cost): Pass mode and address space to
target hook.
(default_address_cost): Add unnamed machine_mode and address 
space arguments.
* doc/tm.texi: Regenerate.
* config/alpha/alpha.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/arm/arm.c (arm_address_cost): Add machine_mode 
and address space arguments.
* config/avr/avr.c (avr_address_cost): Likewise.
* config/bfin/bfin.c (bfin_address_cost): Likewise.
* config/cr16/cr16.c (cr16_address_cost): Likewise.
* config/cris/cris.c (cris_address_cost): Likewise.
* config/epiphany/epiphany.c (epiphany_address_cost): Likewise.
* config/i386/i386.c (ix86_address_cost): Likewise.
* config/ia64/ia64.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/iq2000/iq2000.c (iq2000_address_cost): Add 
machine_mode and address space arguments.  Pass them on in
recursive invocation.
* config/lm32/lm32.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/m32c/m32c.c (m32c_address_cost): Add machine_mode 
and address space arguments.
* config/m32r/m32r.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/mcore/mcore.c (TARGET_ADDRESS_COST): Likewise.
* config/mep/mep.c (mep_address_cost): Add machine_mode 
and address space arguments.
* config/microblaze/microblaze.c (microblaze_address_cost): 
Likewise.
* config/mips/mips.c (mips_address_cost): Likewise.
* config/mmix/mmix.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_as_bool_0 instead of hook_int_rtx_bool_0.
* config/mn10300/mn10300.c (mn10300_address_cost): Add
machine_mode and address space arguments.
(mn10300_rtx_costs):  Pass GET_MODE (x) and MEM_ADDR_SPACE (x)
to mn10300_address_cost.
* config/pa/pa.c (hppa_address_cost): Add machine_mode and 
address space arguments.
* config/rs6000/rs6000.c (rs6000_debug_address_cost): Likewise.
(TARGET_ADDRESS_COST): Use hook_int_rtx_mode_as_bool_0 instead 
of hook_int_rtx_bool_0.
* config/rx/rx.c (rx_address_cost): Add machine_mode and 
address space arguments.
* config/s390/s390.c (s390_address_cost): Likewise.
* config/score/score-protos.h (score_address_cost): Likewise.
* config/score/score.c (score_address_cost): Likewise.
*

Re: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Georg-Johann Lay

Weddington, Eric wrote:
 
 From: Georg-Johann Lay

 So here is an updated version of the patch.
 Instead of with_avrlibc = yes it does with_avrlibc != no.

 Just like the first version, --with-avrlibc[=*] is only recognized
 if avr-gcc is not configured for RTEMS, i.e. RTEMS users don't need
 to set --with-avrlibc=no in order to get a complete libgcc.
 
 Sorry, I'm a bit confused. With your new patch...
 
 - If I build GCC, for the avr target (plain), without specifying
 the --with-avr-libc= switch, does it default to yes?

Yes.  Anything except an explicit no is treated like yes.

 - If I build GCC, for the avr-rtems target, without specifying
 the --with-avr-libc= switch, does it default to no?

Note that the switch is called --with-avrlibc.  The option is ignored for
avr-*-rtems*, which has the same effect as no, so the answer is yes.

 Because the above is what I would expect the default behavior to be.
 Doing that would certainly help with backwards compatibility for those
 building toolchain distributions.
 
 I would think that the user has to specify the --with-avr-libc= flag
 to explicitly deviate from common usage and practice.

Yes, that's the case.  Except for users that want avr-*-* without
AVR-Libc and with newlib or some other libc flavor.


Johann

Fix PR rtl-optimization/54456

2012-09-04 Thread Eric Botcazou

This patch fixes PR rtl-optimization/54456 by running the first scheduling 
pass only when optimizing, as is already done for the second scheduling pass.
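
A minimal sketch of the gate change, assuming a hypothetical options struct in place of GCC's global optimize, flag_schedule_insns and dbg_cnt state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the global option state consulted by the
   real gate function.  */
struct gate_options
{
  int optimize;			/* 0 for -O0, 1 for -O1, ...  */
  bool flag_schedule_insns;	/* -fschedule-insns given?  */
  bool debug_counter_allows;	/* Models dbg_cnt (sched_func).  */
};

/* Old behaviour: the pass runs whenever the flag is set, even at -O0.  */
static bool
gate_before_fix (const struct gate_options *opts)
{
  assert (opts != NULL);
  return opts->flag_schedule_insns && opts->debug_counter_allows;
}

/* New behaviour: the pass additionally requires optimization.  */
static bool
gate_after_fix (const struct gate_options *opts)
{
  assert (opts != NULL);
  return opts->optimize > 0
	 && opts->flag_schedule_insns
	 && opts->debug_counter_allows;
}
```

With optimize set to 0 and the scheduling flag set, only the old gate lets the pass run; for any optimize value above 0 the two gates agree.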

Tested on x86_64-suse-linux, applied on the mainline.


2012-09-04  Eric Botcazou  ebotca...@adacore.com

PR rtl-optimization/54456
* sched-rgn.c (gate_handle_sched): Return 1 only if optimize > 0.


-- 
Eric Botcazou
Index: sched-rgn.c
===
--- sched-rgn.c	(revision 190863)
+++ sched-rgn.c	(working copy)
@@ -3473,7 +3473,7 @@ static bool
 gate_handle_sched (void)
 {
 #ifdef INSN_SCHEDULING
-  return flag_schedule_insns && dbg_cnt (sched_func);
+  return optimize > 0 && flag_schedule_insns && dbg_cnt (sched_func);
 #else
   return 0;
 #endif

RE: [Patch,avr] PR54461: Better AVR-Libc integration

2012-09-04 Thread Weddington, Eric

 -Original Message-
 From: Georg-Johann Lay []
 Sent: Tuesday, September 04, 2012 1:03 PM
 To: Weddington, Eric
 Cc: Gabriel Dos Reis; Richard Guenther; gcc-patches@gcc.gnu.org; Denis
 Chertykov; Joerg Wunsch
 Subject: Re: [Patch,avr] PR54461: Better AVR-Libc integration

  I would think that the user has to specify the --with-avr-libc= flag
  to explicitly deviate from common usage and practice.

 Yes, that's the case.  Except for users that want avr-*-* without
 AVR-Libc and with newlib or some other libc flavor.

Excellent! Thanks for the detailed explanation, and sorry for my confusion.

I'm good with the patch, then.

Eric

Minor reorganization in bb-reorder.c

2012-09-04 Thread Eric Botcazou

The file contains 3 RTL optimization passes, the gate and worker functions of 
which are strangely intertwined.

Fixed thusly, tested on x86_64-suse-linux, applied on the mainline.


2012-09-04  Eric Botcazou  ebotca...@adacore.com

* bb-reorder.c (gate_handle_reorder_blocks): Move around.
(rest_of_handle_reorder_blocks): Likewise.
(pass_reorder_blocks): Likewise.
(gate_handle_partition_blocks): Likewise.


-- 
Eric Botcazou
Index: bb-reorder.c
===
--- bb-reorder.c	(revision 190863)
+++ bb-reorder.c	(working copy)
@@ -2037,6 +2037,65 @@ insert_section_boundary_note (void)
 }
 }
 
+static bool
+gate_handle_reorder_blocks (void)
+{
+  if (targetm.cannot_modify_jumps_p ())
+return false;
+  /* Don't reorder blocks when optimizing for size because extra jump insns may
+ be created; also barrier may create extra padding.
+
+ More correctly we should have a block reordering mode that tried to
+ minimize the combined size of all the jumps.  This would more or less
+ automatically remove extra jumps, but would also try to use more short
+ jumps instead of long jumps.  */
+  if (!optimize_function_for_speed_p (cfun))
+return false;
+  return (optimize > 0
+	  && (flag_reorder_blocks || flag_reorder_blocks_and_partition));
+}
+
+static unsigned int
+rest_of_handle_reorder_blocks (void)
+{
+  basic_block bb;
+
+  /* Last attempt to optimize CFG, as scheduling, peepholing and insn
+ splitting possibly introduced more crossjumping opportunities.  */
+  cfg_layout_initialize (CLEANUP_EXPENSIVE);
+
+  reorder_basic_blocks ();
+  cleanup_cfg (CLEANUP_EXPENSIVE);
+
+  FOR_EACH_BB (bb)
+if (bb->next_bb != EXIT_BLOCK_PTR)
+  bb->aux = bb->next_bb;
+  cfg_layout_finalize ();
+
+  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes.  */
+  insert_section_boundary_note ();
+  return 0;
+}
+
+struct rtl_opt_pass pass_reorder_blocks =
+{
+ {
+  RTL_PASS,
+  "bbro",   /* name */
+  gate_handle_reorder_blocks,   /* gate */
+  rest_of_handle_reorder_blocks,/* execute */
+  NULL, /* sub */
+  NULL, /* next */
+  0,/* static_pass_number */
+  TV_REORDER_BLOCKS,/* tv_id */
+  0,/* properties_required */
+  0,/* properties_provided */
+  0,/* properties_destroyed */
+  0,/* todo_flags_start */
+  TODO_verify_rtl_sharing,  /* todo_flags_finish */
+ }
+};
+
 /* Duplicate the blocks containing computed gotos.  This basically unfactors
computed gotos that were factored early on in the compilation process to
speed up edge based data flow.  We used to not unfactoring them again,
@@ -2178,6 +2237,21 @@ struct rtl_opt_pass pass_duplicate_compu
  }
 };
 
+static bool
+gate_handle_partition_blocks (void)
+{
+  /* The optimization to partition hot/cold basic blocks into separate
+ sections of the .o file does not work well with linkonce or with
+ user defined section attributes.  Don't call it if either case
+ arises.  */
+  return (flag_reorder_blocks_and_partition
+	  && optimize
+	  /* See gate_handle_reorder_blocks.  We should not partition if
+	     we are going to omit the reordering.  */
+	  && optimize_function_for_speed_p (cfun)
+	  && !DECL_ONE_ONLY (current_function_decl)
+	  && !user_defined_section_attribute);
+}
 
 /* This function is the main 'entrance' for the optimization that
partitions hot and cold basic blocks into separate sections of the
@@ -2346,83 +2420,6 @@ partition_hot_cold_basic_blocks (void)
 
   return TODO_verify_flow | TODO_verify_rtl_sharing;
 }
-
-static bool
-gate_handle_reorder_blocks (void)
-{
-  if (targetm.cannot_modify_jumps_p ())
-return false;
-  /* Don't reorder blocks when optimizing for size because extra jump insns may
- be created; also barrier may create extra padding.
-
- More correctly we should have a block reordering mode that tried to
- minimize the combined size of all the jumps.  This would more or less
- automatically remove extra jumps, but would also try to use more short
- jumps instead of long jumps.  */
-  if (!optimize_function_for_speed_p (cfun))
-return false;
-  return (optimize > 0
-	  && (flag_reorder_blocks || flag_reorder_blocks_and_partition));
-}
-
-
-/* Reorder basic blocks.  */
-static unsigned int
-rest_of_handle_reorder_blocks (void)
-{
-  basic_block bb;
-
-  /* Last attempt to optimize CFG, as scheduling, peepholing and insn
- splitting possibly introduced more crossjumping opportunities.  */
-  cfg_layout_initialize (CLEANUP_EXPENSIVE);
-
-  reorder_basic_blocks ();
-  cleanup_cfg (CLEANUP_EXPENSIVE);
-
-  FOR_EACH_BB (bb)
-if (bb->next_bb != EXIT_BLOCK_PTR)
-  bb->aux =

[PATCH, libstdc++] Add proper OpenBSD support

2012-09-04 Thread Mark Kettenis

Fixes a few testcases.  Mostly based on the existing
NetBSD/FreeBSD/Darwin code.

2012-09-04  Mark Kettenis  kette...@openbsd.org

* configure.host (*-*-openbsd*): Set os_include_dir.
* config/os/bsd/openbsd/ctype_base.h: New file.
* config/os/bsd/openbsd/ctype_configure_char.cc: New file.
* config/os/bsd/openbsd/ctype_inline.h: New file.
* config/os/bsd/openbsd/os_defines.h: New file.

Index: configure.host
===
--- configure.host  (revision 190863)
+++ configure.host  (working copy)
@@ -270,6 +270,9 @@
   netbsd*)
 os_include_dir=os/bsd/netbsd
 ;;
+  openbsd*)
+os_include_dir=os/bsd/openbsd
+;;
   qnx6.[12]*)
 os_include_dir=os/qnx/qnx6.1
 c_model=c
Index: config/os/bsd/openbsd/ctype_base.h
===
--- config/os/bsd/openbsd/ctype_base.h  (revision 0)
+++ config/os/bsd/openbsd/ctype_base.h  (working copy)
@@ -0,0 +1,59 @@
+// Locale support -*- C++ -*-
+
+// Copyright (C) 2000, 2009, 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// http://www.gnu.org/licenses/.
+
+//
+// ISO C++ 14882: 22.1  Locales
+//
+  
+// Information as gleaned from /usr/include/ctype.h on OpenBSD.
+  
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  /// @brief  Base class for ctype.
+  struct ctype_base
+  {
+// Non-standard typedefs.
+typedef const short*   __to_type;
+
+// NB: Offsets into ctype<char>::_M_table force a particular size
+// on the mask type. Because of this, we don't use an enum.
+typedef char   mask;
+
+static const mask upper= _U;
+static const mask lower= _L;
+static const mask alpha= _U | _L;
+static const mask digit= _N;
+static const mask xdigit   = _N | _X;
+static const mask space= _S;
+static const mask print= _P | _U | _L | _N | _B;
+static const mask graph= _P | _U | _L | _N;
+static const mask cntrl= _C;
+static const mask punct= _P;
+static const mask alnum= _U | _L | _N;
+  };
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace
Index: config/os/bsd/openbsd/os_defines.h
===
--- config/os/bsd/openbsd/os_defines.h  (revision 0)
+++ config/os/bsd/openbsd/os_defines.h  (working copy)
@@ -0,0 +1,41 @@
+// Specific definitions for OpenBSD  -*- C++ -*-
+
+// Copyright (C) 2000, 2002, 2009, 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// http://www.gnu.org/licenses/.
+
+/** @file bits/os_defines.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{iosfwd}
+ */
+
+#ifndef _GLIBCXX_OS_DEFINES
+#define _GLIBCXX_OS_DEFINES 1
+
+// System-specific #define, typedefs, corrections, etc, go here.  This
+// file will come before all others.
+
+#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
+#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC

C++ PATCH for c++/54198

2012-09-04 Thread Jason Merrill

My patch to change check_default_argument to call 
perform_implicit_conversion_flags in order to get the diagnostics we 
want there had the undesired side-effect of causing the instantiation of 
templates that would be used by that conversion, even though the 
conversion isn't really used.  So this patch avoids that by setting 
cp_unevaluated_operand.
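
A minimal sketch of the counter-bracketing pattern the patch uses, with hypothetical names throughout (unevaluated_depth stands in for cp_unevaluated_operand, and the "instantiation" is reduced to a counter):

```c
#include <assert.h>

/* Hypothetical stand-in for cp_unevaluated_operand: nonzero while
   processing an operand that will never be evaluated.  */
static int unevaluated_depth;

/* Counts the side effects we want to avoid (template instantiations
   in the real compiler).  */
static int instantiation_count;

/* Hypothetical conversion check: it only triggers the side effect
   when it believes the result is really going to be used.  */
static int
check_conversion (int value)
{
  if (unevaluated_depth == 0)
    ++instantiation_count;
  return value;
}

/* Run the check for its diagnostics only, bracketing the call with
   the counter so that the side effect is suppressed.  */
static int
check_default_argument_sketch (int arg)
{
  int result;

  ++unevaluated_depth;
  result = check_conversion (arg);
  --unevaluated_depth;
  assert (unevaluated_depth >= 0);
  return result;
}
```

Using a counter rather than a boolean lets such regions nest: an inner bracket cannot switch the suppression off while an outer one is still active.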


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ce91c2a524880f727a114cc40e0ad94ac6755631
Author: Jason Merrill ja...@redhat.com
Date:   Tue Sep 4 15:20:32 2012 -0400

	PR c++/54198
	* decl.c (check_default_argument): Set cp_unevaluated_operand
	around call to perform_implicit_conversion_flags.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 8b94e26..8024373 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -10575,8 +10575,10 @@ check_default_argument (tree decl, tree arg)
 
  A default argument expression is implicitly converted to the
  parameter type.  */
+  ++cp_unevaluated_operand;
   perform_implicit_conversion_flags (decl_type, arg, tf_warning_or_error,
  LOOKUP_NORMAL);
+  --cp_unevaluated_operand;
 
   if (warn_zero_as_null_pointer_constant
       && c_inhibit_evaluation_warnings == 0
diff --git a/gcc/testsuite/g++.dg/template/defarg15.C b/gcc/testsuite/g++.dg/template/defarg15.C
new file mode 100644
index 000..fea3dee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/defarg15.C
@@ -0,0 +1,19 @@
+// PR c++/54198
+
+template <typename T> void
+refIfNotNull (T* p1)
+{
+p1->ref;
+}
+template <typename T> struct A
+{
+A (T* p1)
+{
+refIfNotNull (p1);
+}
+};
+class B;
+class C
+{
+void getParent (A <B> = 0);
+};

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread David Edelsohn

On Tue, Sep 4, 2012 at 2:57 PM, Oleg Endo oleg.e...@t-online.de wrote:

 Hmm .. the ACK status so far is:

 [x] target-independent bits
 [ ] alpha [x] arm   [ ] avr [ ] bfin
 [ ] cr16  [ ] cris  [ ] epiphany[ ] i386
 [ ] ia64  [x] iq2000[ ] lm32[ ] m32c
 [x] m32r  [x] mcore [ ] mep [x] microblaze
 [x] mips  [ ] mmix  [x] mn10300 [ ] pa
 [x] rs6000[x] rx[ ] s390[ ] score
 [x] sh[ ] sparc [x] spu [x] stormy16
 [x] v850  [ ] vax   [ ] xtensa


 * config/rs6000/rs6000.c (rs6000_debug_address_cost): Likewise.
 (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_as_bool_0 instead
 of hook_int_rtx_bool_0.
* config/spu/spu.c (TARGET_ADDRESS_COST): Likewise.

The rs6000 and spu bits are okay.

Thanks, David

Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-09-04 Thread David Edelsohn

On Wed, Aug 29, 2012 at 3:09 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 For things that do mftb with high frequency, maybe you should also add a
 builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
 implementations.

 Are you thinking in a function that returns only the TBL?

 On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine
 instruction to read just TBL on 64-bit, so it doesn't make much
 sense to have it return a 32-bit number).

It sounds like you are asking for an additional interface for
high-frequency events that only reads one register on both PPC32 and
PPC64.  I do not believe that interface currently exists for PPC in
GLibc and that seems out of the scope of this patch.  It could be a
nice feature, but it's a new feature request that is not necessary for
this round of patches.

Thanks, David

Re: [middle-end] Add machine_mode to address_cost target hook

2012-09-04 Thread Joern Rennecke


Quoting David Edelsohn dje@gmail.com:


On Tue, Sep 4, 2012 at 2:57 PM, Oleg Endo oleg.e...@t-online.de wrote:


Hmm .. the ACK status so far is:


Not sure if we are supposed to acknowledge all the straightforward argument
additions... at any rate, the epiphany hunk is OK.

I think I'll make use of the new functionality eventually, but prefer
to be able to test such a functional change separately, so I'm fine
with the approach to just introduce the infrastructure first.

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Dehao Chen

Looks like even with addr2line properly installed, the gcj generated
code cannot get the correct source file/lineno. Do I need to pass in
anything to gcj to let it know where addr2line is?

Thanks,
Dehao

#javac stacktrace.java
#java stacktrace
stacktrace.e(stacktrace.java:42)
stacktrace.d(stacktrace.java:38)
stacktrace.c(stacktrace.java:31)
stacktrace.b(stacktrace.java:26)
stacktrace.a(stacktrace.java:19)
stacktrace.main(stacktrace.java:12)
#gcj *.class -o stacktrace.exe
#./stacktrace.exe
stacktrace.e(stacktrace.exe:-1)
stacktrace.d(stacktrace.exe:-1)
stacktrace.c(stacktrace.exe:-1)
stacktrace.b(stacktrace.exe:-1)
stacktrace.a(stacktrace.exe:-1)
stacktrace.main(stacktrace.exe:-1)

The java code is shown below:
stacktrace.java
/* This test should test the stacktrace functionality.
   We only print ClassName and MethName since the other information
   like FileName and LineNumber are not consistent while building
   native or interpreted and we want to test the output inside the dejagnu
   test environment.
   Also, we have to make the methods public since they might be optimized away
   with inline's and then the -O3/-O2 execution might fail.
*/
public class stacktrace {
  public static void main(String args[]) {
try {
  new stacktrace().a();
} catch (TopException e) {
}
  }

  public void a() throws TopException {
try {
  b();
} catch (MiddleException e) {
  throw new TopException(e);
}
  }

  public void b() throws MiddleException {
c();
  }

  public void c() throws MiddleException {
try {
  d();
} catch (BottomException e) {
  throw new MiddleException(e);
}
  }

  public void d() throws BottomException {
e();
  }

  public void e() throws BottomException {
throw new BottomException();
  }
}

class TopException extends Exception {
  TopException(Throwable cause) {
super(cause);
  }
}

class MiddleException extends Exception {
  MiddleException(Throwable cause) {
super(cause);
  }
}

class BottomException extends Exception {
  BottomException() {
StackTraceElement stack[] = this.getStackTrace();
for (int i = 0; i < stack.length; i++) {
  String className = stack[i].getClassName();
  String methodName = stack[i].getMethodName();
  System.out.println(className + "." + methodName + "(" +
 stack[i].getFileName() + ":" +
 stack[i].getLineNumber() + ")");
}
  }
}

Re: [PATCH] Set correct source location for deallocator calls

2012-09-04 Thread Dehao Chen

On Tue, Sep 4, 2012 at 9:22 AM, Andrew Haley a...@redhat.com wrote:
 On 09/04/2012 05:07 PM, Dehao Chen wrote:
 On Thu, Aug 30, 2012 at 9:33 AM, Richard Henderson r...@redhat.com wrote:
 On 08/30/2012 08:20 AM, Andrew Haley wrote:
 Is the problem simply that the logic to
 scan the assembly code isn't present in the libgcj testsuite?

 Yes, exactly.

 For this case, I don't think that we want a testcase to rely on
 addr2line in the system. So how about we add a test once assembly
 scanning is available in the libgcj testsuite?

 Fine by me.  I guess you can just copy the scanning code from the gcc
 testsuite.

I tried that, but it is not trivial, and simply copying proc
scan-assembler to libjava seems ugly. Do the libjava people really think
it's worth adding scan-assembler and other primitives from the gcc
testsuite to the libjava testsuite? If yes, I'll put it on the TODO list.

Thanks,
Dehao

 Andrew.

Re: [PATCH] Combine location with block using block_locations

2012-09-04 Thread Dehao Chen

ping...

Thanks,
Dehao
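
As a reading aid for the patch quoted further down: its line-map.c comment describes an adhoc location as a 32-bit value whose highest bit marks it as an index into a side table holding the original locus and the attached data.  A minimal sketch of that encoding follows.  All names here are hypothetical, the table is a fixed-size array, and the hash table the patch uses to avoid duplicate entries is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t loc_t;

#define ADHOC_BIT 0x80000000u	/* Highest bit set: value is an index.  */
#define MAX_ADHOC 16		/* Fixed capacity, for the sketch only.  */

struct adhoc_entry
{
  loc_t locus;			/* The original, plain location.  */
  void *data;			/* Arbitrary data attached to it.  */
};

static struct adhoc_entry adhoc_table[MAX_ADHOC];
static uint32_t adhoc_count;

static int
is_adhoc_loc (loc_t loc)
{
  return (loc & ADHOC_BIT) != 0;
}

/* Attach DATA to LOCUS and return the adhoc location standing for the
   pair.  A NULL DATA needs no entry, so LOCUS is returned unchanged.  */
static loc_t
combine_location_data (loc_t locus, void *data)
{
  assert (!is_adhoc_loc (locus));
  if (data == NULL)
    return locus;
  assert (adhoc_count < MAX_ADHOC);
  adhoc_table[adhoc_count].locus = locus;
  adhoc_table[adhoc_count].data = data;
  return ADHOC_BIT | adhoc_count++;
}

/* Recover the plain location; the lower 31 bits index the table.  */
static loc_t
plain_location (loc_t loc)
{
  return is_adhoc_loc (loc) ? adhoc_table[loc & ~ADHOC_BIT].locus : loc;
}

/* Recover the attached data, or NULL for a plain location.  */
static void *
location_data (loc_t loc)
{
  return is_adhoc_loc (loc) ? adhoc_table[loc & ~ADHOC_BIT].data : NULL;
}
```

The cost of this scheme is that plain locations lose one bit of range, which is presumably why the ChangeLog lists a new value for MAX_SOURCE_LOCATION.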

On Tue, Aug 21, 2012 at 4:54 PM, Dehao Chen de...@google.com wrote:
 On Tue, Aug 21, 2012 at 6:25 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Mon, Aug 20, 2012 at 3:18 AM, Dehao Chen de...@google.com wrote:
 ping

 Conceptually I like the change.  Can a libcpp maintainer please have a 2nd
 look?

 Dehao, did you do any compile-time and memory-usage benchmarks?

 I don't have memory benchmarks at hand. But I've tested it on
 some huge apps, each of which takes more than 1 hour to build on a
 modern machine. None of them showed a noticeable increase in memory
 footprint or compile time.

 Thanks,
 Dehao


 Thanks,
 Richard.

 Thanks,
 Dehao

 On Tue, Aug 14, 2012 at 10:13 AM, Dehao Chen de...@google.com wrote:
 Hi, Dodji,

 Thanks for the review. I've fixed all the addressed issues. I'm
 attaching the related changes:

 Thanks,
 Dehao

 libcpp/ChangeLog:
 2012-08-01  Dehao Chen  de...@google.com

 * include/line-map.h (MAX_SOURCE_LOCATION): New value.
 (location_adhoc_data_init): New.
 (location_adhoc_data_fini): New.
 (get_combined_adhoc_loc): New.
 (get_data_from_adhoc_loc): New.
 (get_location_from_adhoc_loc): New.
 (COMBINE_LOCATION_DATA): New.
 (IS_ADHOC_LOC): New.
 (expanded_location): New field.
 * line-map.c (location_adhoc_data): New.
 (location_adhoc_data_htab): New.
 (curr_adhoc_loc): New.
 (location_adhoc_data): New.
 (allocated_location_adhoc_data): New.
 (location_adhoc_data_hash): New.
 (location_adhoc_data_eq): New.
 (location_adhoc_data_update): New.
 (get_combined_adhoc_loc): New.
 (get_data_from_adhoc_loc): New.
 (get_location_from_adhoc_loc): New.
 (location_adhoc_data_init): New.
 (location_adhoc_data_fini): New.
 (linemap_lookup): Change to use new location.
 (linemap_ordinary_map_lookup): Likewise.
 (linemap_macro_map_lookup): Likewise.
 (linemap_macro_map_loc_to_def_point): Likewise.
 (linemap_macro_map_loc_unwind_toward_spel): Likewise.
 (linemap_get_expansion_line): Likewise.
 (linemap_get_expansion_filename): Likewise.
 (linemap_location_in_system_header_p): Likewise.
 (linemap_location_from_macro_expansion_p): Likewise.
 (linemap_macro_loc_to_spelling_point): Likewise.
 (linemap_macro_loc_to_def_point): Likewise.
 (linemap_macro_loc_to_exp_point): Likewise.
 (linemap_resolve_location): Likewise.
 (linemap_unwind_toward_expansion): Likewise.
 (linemap_unwind_to_first_non_reserved_loc): Likewise.
 (linemap_expand_location): Likewise.
 (linemap_dump_location): Likewise.

 Index: libcpp/line-map.c
 ===
 --- libcpp/line-map.c   (revision 190209)
 +++ libcpp/line-map.c   (working copy)
 @@ -25,6 +25,7 @@
  #include "line-map.h"
  #include "cpplib.h"
  #include "internal.h"
 +#include "hashtab.h"

  static void trace_include (const struct line_maps *, const struct line_map *);
  static const struct line_map * linemap_ordinary_map_lookup (struct line_maps *,
 @@ -50,6 +51,135 @@
  extern unsigned num_expanded_macros_counter;
  extern unsigned num_macro_tokens_counter;

 +/* Data structure to associate an arbitrary data to a source location.  */
 +struct location_adhoc_data {
 +  source_location locus;
 +  void *data;
 +};
 +
 +/* The following data structure encodes a location with some adhoc data
 +   and maps it to a new unsigned integer (called an adhoc location)
 +   that replaces the original location to represent the mapping.
 +
 +   The new adhoc_loc uses the highest bit as the enabling bit, i.e. if the
 +   highest bit is 1, then the number is adhoc_loc. Otherwise, it serves as
 +   the original location. Once identified as the adhoc_loc, the lower 31
 +   bits of the integer is used to index the location_adhoc_data array,
 +   in which the locus and associated data is stored.  */
 +
 +static htab_t location_adhoc_data_htab;
 +static source_location curr_adhoc_loc;
 +static struct location_adhoc_data *location_adhoc_data;
 +static unsigned int allocated_location_adhoc_data;
 +
 +/* Hash function for location_adhoc_data hashtable.  */
 +
 +static hashval_t
 +location_adhoc_data_hash (const void *l)
 +{
 +  const struct location_adhoc_data *lb =
 +  (const struct location_adhoc_data *) l;
 +  return (hashval_t) lb->locus + (size_t) lb->data;
 +}
 +
 +/* Compare function for location_adhoc_data hashtable.  */
 +
 +static int
 +location_adhoc_data_eq (const void *l1, const void *l2)
 +{
 +  const struct location_adhoc_data *lb1 =
 +  (const struct location_adhoc_data *) l1;
 +  const struct location_adhoc_data *lb2 =
 +  (const struct location_adhoc_data *) l2;
 +  return lb1->locus == lb2->locus && lb1->data == lb2->data;
 +}
 +
 +/* Update the hashtable when
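
The comment in the patch above describes the encoding: the highest bit of the 32-bit value marks an adhoc location, and the lower 31 bits index an array of (locus, data) pairs. A minimal sketch of that idea follows. All names here are hypothetical, the lookup is a linear search where the patch uses a hash table, and returning the plain locus when there is no data is an assumption of this sketch, not something the quoted diff shows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the libcpp implementation.
typedef std::uint32_t location_t;

struct adhoc_entry {
  location_t locus;  // the original source location
  void *data;        // the data attached to it (a block, in the patch)
};

// The highest bit marks a value as an adhoc location.
const location_t ADHOC_BIT = 0x80000000u;

static std::vector<adhoc_entry> adhoc_table;

inline bool is_adhoc_loc (location_t loc) { return (loc & ADHOC_BIT) != 0; }

// Return LOCUS itself when there is no data (an assumption of this sketch);
// otherwise record the pair and return its table index with the high bit set.
location_t combine_location_data (location_t locus, void *data)
{
  assert (!is_adhoc_loc (locus));
  if (data == nullptr)
    return locus;
  // Linear search keeps the sketch short; the patch uses a hash table.
  for (std::size_t i = 0; i < adhoc_table.size (); i++)
    if (adhoc_table[i].locus == locus && adhoc_table[i].data == data)
      return ADHOC_BIT | static_cast<location_t> (i);
  assert (adhoc_table.size () < ADHOC_BIT);  // index must fit in 31 bits
  adhoc_table.push_back (adhoc_entry{locus, data});
  return ADHOC_BIT | static_cast<location_t> (adhoc_table.size () - 1);
}

// Recover the original location from either kind of value.
location_t get_location (location_t loc)
{
  return is_adhoc_loc (loc) ? adhoc_table[loc & ~ADHOC_BIT].locus : loc;
}

// Recover the attached data, or nullptr for a plain location.
void *get_data (location_t loc)
{
  return is_adhoc_loc (loc) ? adhoc_table[loc & ~ADHOC_BIT].data : nullptr;
}
```

Because identical (locus, data) pairs map to the same index, two adhoc locations can be compared as plain integers.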

Re: [google/integration] Add a configure option to disable system header canonicalizations (issue6489063)

2012-09-04 Thread Ollie Wild

On Fri, Aug 31, 2012 at 10:30 AM, Simon Baldwin sim...@google.com wrote:

 Yes.  I meant --disable-canonical-prefixes.  That is a gcc configure
 flag that we use to control the default setting for
 -[no-]canonical-prefixes where neither flag is supplied on the gcc
 command line.  --disable/enable-canonical-prefixes is only in google
 branches.

I did a little archaeology.  AFAICT, there was no specific objection
to pushing --disable-canonical-prefixes into upstream trunk.  The
feedback I see to your initial post was "send us a trunk-based patch"
and "here are some minor nits to clean up".  It basically sounds like
upstream was neutral to the patch and would probably accept it if we
actually sent something for review.

I still think this is something that is both reasonable and feasible
to push upstream.  We should at least try to get some feedback first.
While there aren't a lot of people using symlink farms, I'd be
surprised if we were the only ones.

Ollie

Fix bootstrap failure with clang++ (PR 54484)

2012-09-04 Thread Diego Novillo

Fix bootstrap failure with clang++.

This patch fixes a bootstrap failure when using clang as the host
compiler.  Default arguments for class template member functions
should be added in the declaration, not the definition.

From Jason:

 8.3.6 says "Default arguments for a member function of a class template shall
 be specified on the initial declaration of the member function within the
 class template."

2012-09-04  Diego Novillo  dnovi...@google.com

PR bootstrap/54484
* vec.h (vec_t::embedded_init): Move default argument value
to function declaration.

diff --git a/gcc/vec.h b/gcc/vec.h
index c0f1bb2..441c9b5 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -171,7 +171,7 @@ struct GTY(()) vec_t
   T &last (ALONE_VEC_CHECK_DECL);
   const T &operator[] (unsigned) const;
   T &operator[] (unsigned);
-  void embedded_init (int, int);
+  void embedded_init (int, int = 0);
 
   template<enum vec_allocation_t A>
   vec_t<T> *copy (ALONE_MEM_STAT_DECL);
@@ -599,7 +599,7 @@ vec_t<T>::iterate (const vec_t<T> *vec, unsigned ix, T **ptr)
final member):
 
size_t vec_t<T>::embedded_size<T> (int reserve);
-   void v->embedded_init(int reserve, int active = 0);
+   void v->embedded_init(int reserve, int active);
 
These allow the caller to perform the memory allocation.  */
 
@@ -616,7 +616,7 @@ vec_t<T>::embedded_size (int nelems)
 
 template<typename T>
 void
-vec_t<T>::embedded_init (int nelems, int active = 0)
+vec_t<T>::embedded_init (int nelems, int active)
 {
   prefix_.num_ = active;
   prefix_.alloc_ = nelems;
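
The rule the patch applies can be shown in isolation. The sketch below uses a hypothetical stand-in for vec_t (small_vec and its two fields are invented for illustration); only the placement of the default argument mirrors the change above.

```cpp
#include <cassert>

// Hypothetical stand-in for vec_t; only the placement of "= 0" matters here.
template<typename T>
struct small_vec
{
  int num_;
  int alloc_;

  // The default argument belongs on this in-class declaration.
  void embedded_init (int nelems, int active = 0);
};

// The out-of-class definition repeats the parameters without the default.
template<typename T>
void
small_vec<T>::embedded_init (int nelems, int active)
{
  assert (active <= nelems);
  num_ = active;
  alloc_ = nelems;
}
```

Callers may still write v.embedded_init (8) and get active == 0; moving the "= 0" back to the out-of-class definition is what clang++ rejected.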

Re: Fix bootstrap failure with clang++ (PR 54484)

2012-09-04 Thread Steven Bosscher

On Tue, Sep 4, 2012 at 11:07 PM, Diego Novillo dnovi...@google.com wrote:
 Fix bootstrap failure with clang++.

 This patch fixes a bootstrap failure when using clang as the host
 compiler.  Default arguments for class template member functions
 should be added in the declaration, not the definition.

 From Jason:

 8.3.6 says Default arguments for a member function of a class template shall
 be specified on the initial declaration of the member function within the
 class template.

If GCC doesn't diagnose this, what is there to avoid this problem in the future?

Ciao!
Steven

Re: Fix bootstrap failure with clang++ (PR 54484)

2012-09-04 Thread Diego Novillo


On 2012-09-04 17:10 , Steven Bosscher wrote:

On Tue, Sep 4, 2012 at 11:07 PM, Diego Novillo dnovi...@google.com wrote:

Fix bootstrap failure with clang++.

This patch fixes a bootstrap failure when using clang as the host
compiler.  Default arguments for class template member functions
should be added in the declaration, not the definition.

 From Jason:


8.3.6 says Default arguments for a member function of a class template shall
be specified on the initial declaration of the member function within the
class template.


If GCC doesn't diagnose this, what is there to avoid this problem in the future?


I'm filing a separate PR for this.


Diego

[patch, mips] New mips triplet for multilib linux builds

2012-09-04 Thread Steve Ellcey

I would like to create a new mips target triplet (mips-mti-linux-gnu).
This target would be multilib by default and would have --enable-synci
on by default.  It would mainly be used for building mips cross compilers
with glibc.  I hope to extend this target to support the n32 and 64 bit
ABIs in the future and add a corresponding mips-mti-elf triplet that would
be like mips-sde-elf but have fewer/different multilib versions.

Other than adding the new target, the only changes are to the --enable-synci
default setting (enabled for mips-mti-linux-gnu, still disabled for other
targets) and in mips.h to use a new macro SYNCI_SPEC so that I don't have
to copy all of OPTION_DEFAULT_SPECS into mti-linux.h just to change the
-msynci handling.

I tested the changes by building and running the testsuite with the qemu
simulator.  No glibc or binutils changes were needed for this.

OK to check in?

Steve Ellcey
sell...@mips.com


2012-09-04  Steve Ellcey  sell...@mips.com

* config.gcc: Add mips*-mti-linux* target and make with_synci
true by default for that target.
* config/mips/mips.h (SYNCI_SPEC): New.
(OPTION_DEFAULT_SPECS): Use new SYNCI_SPEC.
* mti-linux.h: New file.
* t-mti-linux: New file.


diff --git a/gcc/config.gcc b/gcc/config.gcc
index 9ec8a41..6923211 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1685,6 +1685,13 @@ mips*-*-netbsd*) # NetBSD/mips, either endian.
tm_file="elfos.h ${tm_file} mips/elf.h netbsd.h netbsd-elf.h mips/netbsd.h"
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
;;
+mips*-mti-linux*)
+   tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h mips/mti-linux.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
+   tmake_file="${tmake_file} mips/t-mti-linux"
+   gnu_ld=yes
+   gas=yes
+   test x$with_llsc != x || with_llsc=yes
+   ;;
 mips64*-*-linux* | mipsisa64*-*-linux*)
tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} mips/gnu-user.h mips/gnu-user64.h mips/linux64.h mips/linux-common.h"
tmake_file="${tmake_file} mips/t-linux64"
@@ -3262,10 +3269,19 @@ case ${target} in
yes)
with_synci=synci
;;
-   "" | no)
-   # No is the default.
+   no)
with_synci=no-synci
;;
+   "")
+   case ${target} in
+   mips*-mti-*)
+   with_synci=synci
+   ;;
+   *)
+   with_synci=no-synci
+   ;;
+   esac
+   ;;
*)
echo "Unknown synci type used in --with-synci" 1>&2
exit 1
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 9ce466d..b98b434 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -748,6 +748,9 @@ struct mips_cpu_info {
  specified.
--with-divide is ignored if -mdivide-traps or -mdivide-breaks are
  specified. */
+#ifndef SYNCI_SPEC
+#define SYNCI_SPEC "-m%(VALUE)"
+#endif
 #define OPTION_DEFAULT_SPECS \
   {"arch", "%{" MIPS_ARCH_OPTION_SPEC ":;: -march=%(VALUE)}" }, \
   {"arch_32", "%{" OPT_ARCH32 ":%{" MIPS_ARCH_OPTION_SPEC ":;: -march=%(VALUE)}}" }, \
@@ -760,7 +763,7 @@ struct mips_cpu_info {
   {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \
   {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \
   {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \
-  {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" }
+  {"synci", "%{!msynci:%{!mno-synci:" SYNCI_SPEC "}}" }
 
 
 /* A spec that infers the -mdsp setting from an -march argument.  */
diff --git a/gcc/config/mips/mti-linux.h b/gcc/config/mips/mti-linux.h
new file mode 100644
index 000..af3d71f
--- /dev/null
+++ b/gcc/config/mips/mti-linux.h
@@ -0,0 +1,35 @@
+/* Target macros for mips*-mti-linux* targets.
+   Copyright (C) 2012
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Use the (o)32 ABI and the mips32r2 architecture by default.  */
+#undef MIPS_ABI_DEFAULT
+#define MIPS_ABI_DEFAULT ABI_32
+#undef MIPS_ISA_DEFAULT
+#define

Re: [PATCH] Add counter histogram to fdo summary (issue6465057)

2012-09-04 Thread Teresa Johnson

I just committed the patch (included below). I implemented the
occupancy bit vector approach for recording non-zero histogram
entries, and fixed a few issues with the merging that a profiled
bootstrap uncovered.

Passes both bootstrap and profiledbootstrap builds and regression tests.

Thanks,
Teresa

Enhances the gcov program summary by adding a histogram of arc counter
entries. This is used to compute working set information in the compiler
for use by optimizations that need information on hot vs cold counter
values or the rough working set size in terms of the number of counters.
Each working set data point is the minimum counter value and number of
counters required to reach a given percentage of the cumulative counter
sum across the profiled execution (sum_all in the program summary).
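
As a rough illustration of the working set definition above, the sketch below computes one data point from raw counter values. The type and function names are hypothetical, and it sorts the individual counters for clarity, whereas the patch derives the same information from histogram buckets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical type; the patch's struct gcov_working_set_info may differ.
struct working_set
{
  unsigned num_counters;      // hottest counters needed to reach the target
  std::uint64_t min_counter;  // smallest counter value among them
};

// Sort the counters from hottest to coldest and accumulate until PERCENT of
// the total (sum_all) has been covered.  The products below can overflow for
// very large sums; a real implementation would need to guard against that.
working_set
compute_working_set (std::vector<std::uint64_t> counters, unsigned percent)
{
  assert (percent <= 100);
  std::sort (counters.begin (), counters.end (),
             std::greater<std::uint64_t> ());

  std::uint64_t sum_all = 0;
  for (std::uint64_t c : counters)
    sum_all += c;

  working_set ws = { 0, 0 };
  std::uint64_t cum = 0;
  for (std::uint64_t c : counters)
    {
      if (cum * 100 >= sum_all * percent)
        break;
      cum += c;
      ws.num_counters++;
      ws.min_counter = c;
    }
  return ws;
}
```

For counters {100, 50, 30, 10, 10} the 90% working set is the three hottest counters, with a minimum counter value of 30.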

2012-09-04  Teresa Johnson  tejohn...@google.com

* libgcc/libgcov.c (struct gcov_summary_buffer): New structure.
(gcov_histogram_insert): New function.
(gcov_compute_histogram): Ditto.
(gcov_exit): Invoke gcov_compute_histogram, and perform merging of
histograms during summary merging.
* gcc/gcov-io.c (gcov_write_summary): Write out non-zero histogram
entries to function summary along with an occupancy bit vector.
(gcov_read_summary): Read in the histogram entries.
(gcov_histo_index): New function.
(void gcov_histogram_merge): Ditto.
* gcc/gcov-io.h (gcov_type_unsigned): New type.
(struct gcov_bucket_type): Ditto.
(struct gcov_ctr_summary): Include histogram.
(GCOV_TAG_SUMMARY_LENGTH): Update to include histogram entries.
(GCOV_HISTOGRAM_SIZE): New macro.
(GCOV_HISTOGRAM_BITVECTOR_SIZE): Ditto.
* gcc/profile.c (NUM_GCOV_WORKING_SETS): Ditto.
(gcov_working_sets): New global variable.
(compute_working_sets): New function.
(find_working_set): Ditto.
(get_exec_counts): Invoke compute_working_sets.
* gcc/coverage.c (read_counts_file): Merge histograms, and
fix bug with accessing summary info for non-summable counters.
* gcc/basic-block.h (gcov_type_unsigned): New type.
(struct gcov_working_set_info): Ditto.
(find_working_set): Declare.
* gcc/gcov-dump.c (tag_summary): Dump out histogram.
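
The occupancy bit vector mentioned for gcov_write_summary can be sketched as follows: one bit per histogram bucket says whether the bucket is non-zero, and only the non-zero buckets are stored. The names and the exact layout below are assumptions for illustration and are not taken from gcov-io.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: one bit per bucket, followed by the non-zero buckets.
struct packed_histogram
{
  std::vector<std::uint32_t> bitvector;  // bit i set => bucket i is stored
  std::vector<std::uint64_t> entries;    // non-zero buckets, in index order
};

// Keep only the non-zero buckets and remember which ones they were.
packed_histogram
pack_histogram (const std::vector<std::uint64_t> &buckets)
{
  packed_histogram p;
  p.bitvector.assign ((buckets.size () + 31) / 32, 0);
  for (std::size_t i = 0; i < buckets.size (); i++)
    if (buckets[i] != 0)
      {
        p.bitvector[i / 32] |= std::uint32_t (1) << (i % 32);
        p.entries.push_back (buckets[i]);
      }
  return p;
}

// Rebuild the full histogram of SIZE buckets from the packed form.
std::vector<std::uint64_t>
unpack_histogram (const packed_histogram &p, std::size_t size)
{
  assert (p.bitvector.size () * 32 >= size);
  std::vector<std::uint64_t> buckets (size, 0);
  std::size_t next = 0;
  for (std::size_t i = 0; i < size; i++)
    if (p.bitvector[i / 32] & (std::uint32_t (1) << (i % 32)))
      buckets[i] = p.entries[next++];
  return buckets;
}
```

When most buckets are empty, the packed form costs one bit per bucket plus the few populated entries, instead of a full entry for every bucket.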

Index: libgcc/libgcov.c
===
--- libgcc/libgcov.c   (revision 190950)
+++ libgcc/libgcov.c   (working copy)
@@ -97,6 +97,12 @@ struct gcov_fn_buffer
   /* note gcov_fn_info ends in a trailing array.  */
 };

+struct gcov_summary_buffer
+{
+  struct gcov_summary_buffer *next;
+  struct gcov_summary summary;
+};
+
 /* Chain of per-object gcov structures.  */
 static struct gcov_info *gcov_list;

@@ -276,6 +282,76 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
   return 1;
 }

+/* Insert counter VALUE into HISTOGRAM.  */
+
+static void
+gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
+{
+  unsigned i;
+
+  i = gcov_histo_index(value);
+  histogram[i].num_counters++;
+  histogram[i].cum_value += value;
+  if (value < histogram[i].min_value)
+    histogram[i].min_value = value;
+}
+
+/* Computes a histogram of the arc counters to place in the summary SUM.  */
+
+static void
+gcov_compute_histogram (struct gcov_summary *sum)
+{
+  struct gcov_info *gi_ptr;
+  const struct gcov_fn_info *gfi_ptr;
+  const struct gcov_ctr_info *ci_ptr;
+  struct gcov_ctr_summary *cs_ptr;
+  unsigned t_ix, f_ix, ctr_info_ix, ix;
+  int h_ix;
+
+  /* This currently only applies to arc counters.  */
+  t_ix = GCOV_COUNTER_ARCS;
+
+  /* First check if there are any counts recorded for this counter.  */
+  cs_ptr = &(sum->ctrs[t_ix]);
+  if (!cs_ptr->num)
+    return;
+
+  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
+    {
+      cs_ptr->histogram[h_ix].num_counters = 0;
+      cs_ptr->histogram[h_ix].min_value = cs_ptr->run_max;
+      cs_ptr->histogram[h_ix].cum_value = 0;
+    }
+
+  /* Walk through all the per-object structures and record each of
+     the count values in histogram.  */
+  for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
+    {
+      if (!gi_ptr->merge[t_ix])
+        continue;
+
+      /* Find the appropriate index into the gcov_ctr_info array
+         for the counter we are currently working on based on the
+         existence of the merge function pointer for this object.  */
+      for (ix = 0, ctr_info_ix = 0; ix < t_ix; ix++)
+        {
+          if (gi_ptr->merge[ix])
+            ctr_info_ix++;
+        }
+      for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
+        {
+          gfi_ptr = gi_ptr->functions[f_ix];
+
+          if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+            continue;
+
+          ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
+          for (ix = 0; ix < ci_ptr->num; ix++)
+            gcov_histogram_insert (cs_ptr->histogram, ci_ptr->values[ix]);
+        }
+    }
+}
+
 /* Dump the coverage counts. We merge with existing counts when

Re: [patch,libgcc] fp-bit.c: filter-out LIB2FUNCS_EXCLUDE

2012-09-04 Thread Ian Lance Taylor

On Mon, Sep 3, 2012 at 8:30 AM, Georg-Johann Lay a...@gjlay.de wrote:

 * Makefile.in (FPBIT_FUNCS): filter-out LIB2FUNCS_EXCLUDE.
 (DPBIT_FUNCS): Ditto.
 (TPBIT_FUNCS): Ditto.

This is OK.

Thanks.

Ian

Re: Fix bootstrap failure with clang++ (PR 54484)

2012-09-04 Thread Diego Novillo


On 2012-09-04 17:10 , Steven Bosscher wrote:

On Tue, Sep 4, 2012 at 11:07 PM, Diego Novillo dnovi...@google.com wrote:

Fix bootstrap failure with clang++.

This patch fixes a bootstrap failure when using clang as the host
compiler.  Default arguments for class template member functions
should be added in the declaration, not the definition.

 From Jason:


8.3.6 says Default arguments for a member function of a class template shall
be specified on the initial declaration of the member function within the
class template.


If GCC doesn't diagnose this, what is there to avoid this problem in the future?


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54485


Diego.

Re: Ping: [PATCH GCC/ARM] Fix problem that hardreg_cprop opportunities are missed on thumb1

2012-09-04 Thread Richard Sandiford

Bin Cheng bin.ch...@arm.com writes:
 Hi,
 For thumb1, arm-gcc intentionally rewrites move insns into subtract of ZERO
 in the peephole2 pass, then executes
 pass_if_after_reload/pass_regrename/pass_cprop_hardreg sequentially.
 
 In this scenario, copy propagation opportunities are missed because:
   1. the move insns are re-written.
   2. pass_cprop_hardreg currently doesn't notice the subtract of ZERO.
 
 This patch fixes the problem and the logic is:
   1. notice the plus/subtract of ZERO in pass_cprop_hardreg.
   2. if the last insn providing information about condition codes is in
 the form of dest_reg = src_reg - 0, record the src_reg in the newly added
 field thumb1_cc_op0_src of structure machine_function.
   3. in pattern cbranchsi4_insn, check thumb1_cc_op0_src along with
 thumb1_cc_op0 to save one comparison insn.
 
 I measured the patch on CSiBE; about 600 bytes are saved for both O2 and
 Os on cortex-m0 without any regression.
 
 I also tested the patch on
 arm-none-eabi+cortex-m0/arm-none-eabi+cortex-m3/i686-pc-linux and no
 regressions were introduced.
 
 So is it OK?
 
 Thanks
 
 2012-08-13  Bin Cheng  bin.ch...@arm.com
 
  * regcprop.c (copyprop_hardreg_forward_1): Notice copies in the form
  of subtract of ZERO.
  * config/arm/arm.h (thumb1_cc_op0_src): New field.
  * config/arm/arm.c (thumb1_final_prescan_insn): Record
  thumb1_cc_op0_src.
  * config/arm/arm.md (cbranchsi4_insn): Check thumb1_cc_op0_src along
  with thumb1_cc_op0.

 Ping?

 Hi Ramana, could you help me review this patch?
 Hi Eric, Richard, could you help me review the change in regcprop.c?

Subtraction of zero isn't canonical rtl though.  Passes after peephole2
would be well within their rights to simplify the expression back to a move.
From that point of view, making the passes recognise (plus X 0) and
(minus X 0) as special cases would be inconsistent.

Rather than make the Thumb 1 CC usage implicit in the rtl stream,
and carry the current state around in cfun->machine, it seems like it
would be better to get md_reorg to rewrite the instructions into a form
that makes the use of condition codes explicit.

md_reorg also sounds like a better place in the pipeline than peephole2
to be doing this kind of transformation, although I admit I have zero
evidence to back that up...

Richard

Re: [PATCH] Reduce memory usage for storing LTO decl resolutions

2012-09-04 Thread Steven Bosscher

On Tue, Sep 4, 2012 at 6:43 PM, Andi Kleen a...@firstfloor.org wrote:
 +/* Compact representation of a index <-> resolution pair. Unpacked to an
 +   vector later. */
 +struct res_pair
 +{
 +  ld_plugin_symbol_resolution_t res;
 +  unsigned index;
 +};
 +typedef struct res_pair res_pair;
 +
 +DEF_VEC_P(res_pair);
 +DEF_VEC_ALLOC_P(res_pair, heap);

Did you mean to use DEF_VEC_O here?
(Not sure it matters after the vec rewrite for c++)

Ciao!
Steven

Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-09-04 Thread Segher Boessenkool

For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.


Are you thinking in a function that returns only the TBL?


On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine
instruction to read just TBL on 64-bit, so it doesn't make much
sense to have it return a 32-bit number).


It sounds like you are asking for an additional interface for
high-frequency events that only reads one register on both PPC32 and
PPC64.


Yes.  A builtin only makes sense for measuring very short intervals;
the builtin is quite a hassle (the timebase is not part of the UISA,
and as we see it actually differs a lot between implementations), and
there is no advantage over having it in some library if you're
measuring big intervals.


I do not believe that interface currently exists for PPC in GLibc


Does glibc implement the timebase thing at all?  I lost track of
those patches.


and that seems out of the scope of this patch.  It could be a
nice feature, but it's a new feature request that is not necessary for
this round of patches.


Sure; on the other hand, it seems simple enough to implement.  It
was just a request.


Segher
