Re: [tsan] Instrument atomics

2012-11-26 Thread Dmitry Vyukov
On Sat, Nov 24, 2012 at 12:58 PM, Dmitry Vyukov dvyu...@google.com wrote:
 Ok.  A slight problem then is that where the tsan pass sits right now, there
 is no easy way to find out if the builtin call will be expanded inline or
 not, so (similar for asan), if we instrument them in the pass, it might be
 instrumented twice at runtime if the builtin is expanded as a library call
 (once the added instrumentation for the builtin, once in the intercepted
 library call).  That isn't wrong, just might need slightly more resources
 than if we ensured we only instrument the builtin if it isn't expanded
 inline.


 Should inlining of those functions be disabled as if -fno-builtins is 
 specified?

 Yes, it sounds reasonable. Performance characteristics under tsan
 differ significantly, so most likely we don't care.


Do we still need range access functions then?


Re: [PATCH] Section anchors and thread-local storage

2012-11-26 Thread Richard Sandiford
David Edelsohn dje@gmail.com writes:
 I have been working to enable native thread-local storage on AIX. One
 problem I encountered is the AIX assembler has difficulty with the
 anchor symbol for TLS CSECTs.  While the section anchors machinery
 uses a separate pool for TLS entries, should section anchor blocks be
 used for TLS symbols at all?

 powerpc-linux uses GOT annotations directly and never places TLS
 symbols in the TOC at all. Section anchors seem to be avoided by TLS
 code already.  The appended patch rejects TLS symbols for object
 blocks in general. I could add a target hook, but I wanted to propose
 this policy in general before pursing a target-specific hook.

Yeah, TLS anchors work on mips64-linux-gnu, although admittedly
section anchors aren't yet enabled by default there, so perhaps not
many people have tried it.

A use_blocks_for_decl_p hook sounds good, and fits nicely with
targetm.use_blocks_for_constant_p.

Thanks,
Richard


Re: [tsan] Instrument atomics

2012-11-26 Thread Jakub Jelinek
On Mon, Nov 26, 2012 at 12:17:44PM +0400, Dmitry Vyukov wrote:
 On Sat, Nov 24, 2012 at 12:58 PM, Dmitry Vyukov dvyu...@google.com wrote:
  Ok.  A slight problem then is that where the tsan pass sits right now, 
  there
  is no easy way to find out if the builtin call will be expanded inline or
  not, so (similar for asan), if we instrument them in the pass, it might be
  instrumented twice at runtime if the builtin is expanded as a library call
  (once the added instrumentation for the builtin, once in the intercepted
  library call).  That isn't wrong, just might need slightly more resources
  than if we ensured we only instrument the builtin if it isn't expanded
  inline.
 
 
  Should inlining of those functions be disabled as if -fno-builtins is 
  specified?
 
  Yes, it sounds reasonable. Performance characteristics under tsan
  differ significantly, so most likely we don't care.
 
 
 Do we still need range access functions then?

Yes.  I think whether to instrument builtins inline or not should be a user
decision, -fsanitize-thread and either nothinng or -fno-builtin.  Implying
-fno-builtin might penalize some code unnecessarily (remember, builtin
handling is not only about expanding some builtins inline, but also about
folding them if they are used with constant arguments, performing all sorts
of optimizations etc., all of which are ok, just we don't have a knob to
disable inline expansion of some builtins (which, some are certainly ok,
e.g. those that don't have a library counterpart).
Also note that many programs use __builtin_* calls directly and then you
can't disable recognizing them as builtins.

Jakub


[patch] Re: MULTIARCH_DIRNAME in gcc/config/rs6000/t-linux doesn't work

2012-11-26 Thread Matthias Klose
Am 26.11.2012 09:50, schrieb Jan-Benedict Glaw:
 The ifeq in this new makefile fragment doesn't work due to unbalanced
 brackets.

Fixed, and committing as obvious. Sorry, apparently I missed testing the
non-multilib'd configuration.

  Matthias

2012-11-26  Matthias Klose  d...@ubuntu.com

	* config/rs6000/t-linux (MULTIARCH_DIRNAME): Fix unbalanced
	parentheses.

Index: config/rs6000/t-linux
===
--- config/rs6000/t-linux	(Revision 193804)
+++ config/rs6000/t-linux	(Arbeitskopie)
@@ -1,5 +1,5 @@
 # do not define the multiarch name if configured for a soft-float cpu
 # or soft-float.
-ifeq (,$(filter $(with_cpu),$(SOFT_FLOAT_CPUS))$(findstring soft,$(with_float))
+ifeq (,$(filter $(with_cpu),$(SOFT_FLOAT_CPUS))$(findstring soft,$(with_float)))
 MULTIARCH_DIRNAME = powerpc-linux-gnu
 endif


Re: [Patch,testsuite] ad PR52641: More fixes for not-so-common targets

2012-11-26 Thread Georg-Johann Lay
Hans-Peter Nilsson wrote:
 On Fri, 23 Nov 2012, Georg-Johann Lay wrote:
 Here are some more fixes for 16-bit int and similar.
 
  * gcc.c-torture/execute/20120919-1.x: New file (int32plus).
 
 No, you should be able to use dg-directives in the main file these days.
 (The .x files are obsolete since a few years, IIRC.)

Would you be more specific?

AFAIK this works only with dg.

After removing the x file and adding

/* { dg-do run } */
/* { dg-require-effective-target int32plus } */

to the test it is still executed.

Thanks,

Johann




Re: [PATCH][AARCH64] Fix the name mangling of va_list

2012-11-26 Thread Marcus Shawcroft

On 21/11/12 14:31, Yufeng Zhang wrote:

Hi,

This patch updates the AArch64 port to mangle __va_list as it is in
namespace std in C++.  This is specified in the AArch64 AAPCS64 ABI doc.

OK for the trunk?

Thanks,
Yufeng

gcc/ChangeLog

2012-11-21  Yufeng Zhang  yufeng.zh...@arm.com

  * config/aarch64/aarch64.c (aarch64_mangle_type): New function.
  (TARGET_MANGLE_TYPE): Define.



The change to the AArch64 port itself is OK.


gcc/testsuite/ChangeLog

2012-11-21  Yufeng Zhang  yufeng.zh...@arm.com

  * g++.dg/abi/arm_va_list.C: Also test on aarch64*-*-*.



 // AAPCS \S 7.1.4 requires that va_list be a typedef for struct
 // __va_list.  The mangling is as if it were std::__va_list.
+// So is required for the AArch64 target.


The functional change in this test makes sense however the comment 
change is slightly confusing.  The original comment refers to the 
procedure call standard for AArch32: IHI0042D_aapcs.pdf.


The procedure call standard for AArch64 is defined in 
IHI0055A_aapcs.pdf.  This document also discusses va_list in chapter 
7.1.4.  Perhaps a different form of words distinguishes between the two 
different PCS documents would be better? Perhaps something along these 
lines:


// AAPCS \S 7.1.4 requires that va_list be a typedef for struct
// __va_list.  The mangling is as if it were std::__va_list.
// AArch64 PCS IHI0055A_aapcs64.pdf \S 7.1.4 requires that va_list
// be a typedef for struct __va_list.  The mangling is as if it
// were std::__va_list.

In any case I don;t believe I can OK this change to the generic part of 
the test suite. Suggest you CC Mike Stump or Janis Johnson.



Cheers

/Marcus




Re: [Patch, Fortran] PR 54997: -Wunused-function gives false warnings for procedures passed as actual argument

2012-11-26 Thread Janus Weil
2012/11/25 Mikael Morin mikael.mo...@sfr.fr:
 Le 22/10/2012 16:49, Janus Weil a écrit :

 Minor update to the patch: It now also sets TREE_USED for entry
 masters in order to avoid bogus warnings for procedures with ENTRY
 (cf. comment 6 in the PR, which like comment 0 is a 4.8 regression).

 Still regtests cleanly. Ok?


 OK with an extra test for comment 6.

Thanks a bunch. Committed as r193811. (A test for comment 6 was
already included in the updated test case I sent together with the
updated patch.)


 Thanks. And sorry for the delay.

No reason to apologize! Better late than never ;)

Cheers,
Janus





 2012/10/21 Janus Weilja...@gcc.gnu.org:

 Hi all,

 here is another patch to silence some more of the bogus warnings about
 unused functions that gfortran is currently throwing (cf. also the
 previous patch for PR 54224).

 It fixes the usage of the 'referenced' attribute, which should only be
 given to procedures which are actually 'used' (called/referenced).
 Then TREE_USED is set according to this attribute, which in turn
 silences the warning in the middle-end.

 The patch was regtested on x86_64-unknown-linux-gnu. Ok for trunk?

 Cheers,
 Janus


 2012-10-21  Janus Weilja...@gcc.gnu.org

  PR fortran/54997
  * decl.c (match_procedure_decl): Don't set 'referenced'
 attribute
  for PROCEDURE declarations.
  * parse.c (gfc_fixup_sibling_symbols,parse_contained): Don't set
  'referenced' attribute for all contained procedures.
  * trans-decl.c (gfc_get_symbol_decl): Allow for unreferenced
 procedures.
  (build_function_decl): Set TREE_USED for referenced procedures.

 2012-10-21  Janus Weilja...@gcc.gnu.org

  PR fortran/54997
  * gfortran.dg/warn_unused_function_2.f90: New.


Re: [Patch,testsuite] ad PR52641: More fixes for not-so-common targets

2012-11-26 Thread Hans-Peter Nilsson
On Mon, 26 Nov 2012, Georg-Johann Lay wrote:
 Hans-Peter Nilsson wrote:
  On Fri, 23 Nov 2012, Georg-Johann Lay wrote:
  Here are some more fixes for 16-bit int and similar.
 
 * gcc.c-torture/execute/20120919-1.x: New file (int32plus).
 
  No, you should be able to use dg-directives in the main file these days.
  (The .x files are obsolete since a few years, IIRC.)

 Would you be more specific?

You seem to have understood what I meant, so I guess not needed. :)

 AFAIK this works only with dg.

I had another look.  I thought Mark Mitchells overhaul in
2003 adjusted *all* gcc/testsuite subdirectories; I grepped
gcc.c-torture/execute/* and gcc.c-torture/compile/* for dg-,
lots of apparently-working examples.

But that's wrong; observe the differences between
gcc.c-torture/execute/execute.exp and
gcc.c-torture/compile/compile.exp; they should differ only in
the dg-do-what-default, but the whole dg- thing is missing in
gcc.c-torture/execute/execute.exp.  I haven't dug into the ml
archives for the reason, maybe there were problems and not
enough time to solve them.  At least according to svn it doesn't
seem to be a reverted change.

 After removing the x file and adding

 /* { dg-do run } */
 /* { dg-require-effective-target int32plus } */

 to the test it is still executed.

Yeah...  If it had been a new test, I would have suggested to
put it in gcc.dg/torture instead, the new preferred home for
executed torture-tests.  So unless you feel like completing
Mark's dg- makeover for gcc.c-torture/execute you'll have to go
with your .x file.  Sorry for putting you on the wrong track,
but thanks for getting light on the issue.

Mike, other testsuite maintainers; heads-up: there are several
dg- directives in files in gcc.c-torture/execute.  That's just
misleading, since they have no effect.  (I just verified that,
testing a few examples.)

brgds, H-P


Re: [Patch, Fortran] PR 54997: -Wunused-function gives false warnings for procedures passed as actual argument

2012-11-26 Thread Mikael Morin

Le 26/11/2012 12:21, Janus Weil a écrit :

2012/11/25 Mikael Morinmikael.mo...@sfr.fr:

Le 22/10/2012 16:49, Janus Weil a écrit :


Minor update to the patch: It now also sets TREE_USED for entry
masters in order to avoid bogus warnings for procedures with ENTRY
(cf. comment 6 in the PR, which like comment 0 is a 4.8 regression).

Still regtests cleanly. Ok?



OK with an extra test for comment 6.


Thanks a bunch. Committed as r193811. (A test for comment 6 was
already included in the updated test case I sent together with the
updated patch.)


Ah, yes, indeed. Didn't see it.


[PATCH] Remove redundant variable in hash_set

2012-11-26 Thread Marek Polacek
I don't see why we need the `hash' variable, when we can use the
`regno' variable directly.  Regtested/bootstrapped on x86_64-linux.
Ok for trunk?

2012-11-26  Marek Polacek  pola...@redhat.com

* cprop.c (hash_set): Remove hash variable.  Use regno
variable directly.

--- gcc/cprop.c.mp  2012-11-26 12:13:08.631193625 +0100
+++ gcc/cprop.c 2012-11-26 12:13:25.836238566 +0100
@@ -170,10 +170,7 @@ reg_available_p (const_rtx x, const_rtx
 static unsigned int
 hash_set (int regno, int hash_table_size)
 {
-  unsigned int hash;
-
-  hash = regno;
-  return hash % hash_table_size;
+  return (unsigned) regno % hash_table_size;
 }
 
 /* Insert assignment DEST:=SET from INSN in the hash table.

Marek


Re: [RFA/4.7/ARM] Backport arm-*-linux-gnueabihf triplet support to 4.7

2012-11-26 Thread Richard Biener
On Tue, 20 Nov 2012, Matthew Gretton-Dann wrote:

 All,
 
 This patch backports Matthais Klose's arm*-*-linux-gnueabihf triplet
 support patch of 2012-10-15 to 4.7.
 
 The backport was not clean as 4.8 has obsoleted various arm*-*-*
 triplets which are valid in 4.7.
 
 I have tested this cross with arm-none-linux-gnueabihf and
 arm-none-linux-gnueabi.
 
 One question I do have having done this work - is there a canonical way to 
 test for the arm*-*-linux-gnueabi triplet (or variants)?  Various configure 
 and testsuite files test for this, but there doesn't seem to be a consistent 
 method.
 
 OK for 4.7?

I'll defer to the arm maintainers for this.

Richard.

 Thanks,
 
 Matt
 
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * config.gcc: Match arm*-*-linux-* for ARM Linux/GNU.
   * doc/install.texi: Use arm-*-*linux-* instead of arm-*-*linux-gnueabi.
 
 gcc/ada/ChangeLog:
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline.
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * gcc-interface/Makefile.in: Match arm*-*-linux-*eabi* for
   ARM Linux/GNU.
 
 gcc/testsuite/ChangeLog:
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * lib/target-supports.exp (check_profiling_available): Match
   arm*-*-linux-* for ARM Linux/GNU.
   * gfortran.dg/enum_10.f90: Likewise.
   * gfortran.dg/enum_9.f90: Likewise.
   * gcc.target/arm/synchronize.c: Likewise.
   * g++.old-deja/g++.jason/enum6.C: Likewise.
   * g++.old-deja/g++.law/enum9.C: Likewise.
   * g++.old-deja/g++.other/enum4.C: Likewise.
 
 libgcc/ChangeLog:
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline.
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * config.host: Match arm*-*-linux-* for ARM Linux/GNU.
 
 libjava/ChangeLog:
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline.
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * configure.ac: Match arm*-*-linux-* for ARM Linux/GNU.
   * configure: Regenerate.
 
 libstdc++-v3/ChangeLog:
 2012-11-08  Matthew Gretton-Dann  matthew.gretton-d...@linaro.org
 
   Backport from mainline
   2012-10-15  Matthias Klose  d...@ubuntu.com
 
   * configure.host: Match arm*-*-linux-* for ARM Linux/GNU.
   * testsuite/20_util/make_signed/requirements/typedefs-2.cc: Likewise.
   * testsuite/20_util/make_unsigned/requirements/typedefs-2.cc: Likewise.


Patch ping

2012-11-26 Thread Jakub Jelinek
- PR55137 fold reassociation fix
 
  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00636.html   
 
  (alternatively also retry with unsigned atype if the reassociation
 
  would fail otherwise) 
 

- PR43631 fix up var-tracking notes in between bbs
  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01201.html

- PR55331 substitute_and_fold fix for memmove folding to nothing
  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01302.html

- convert asan to use builtins instead of building fndecls on the fly
  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01875.html

- PR55380 tiny libcpp performance fix + hack for asan/valgrind
  http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01973.html

Jakub


Re: [Patch AArch64] Add support for TARGET_BUILTIN_DECL

2012-11-26 Thread Marcus Shawcroft

On 21/11/12 15:05, James Greenhalgh wrote:


Hi,

This patch wires up support for TARGET_BUILTIN_DECL in the AArch64
backend.

Is this OK to commit?

Thanks,
James Greenhalgh

---
gcc/

2012-11-21  James Greenhalgh  james.greenha...@arm.com

* config/aarch64/aarch64-builtins.c (aarch64_builtin_decls): New.
(aarch64_init_simd_builtins): Store declaration after builtin
initialisation.
(aarch64_builtin_decl): New.
* config/aarch64/aarch64-protos.h (aarch64_builtin_decl): New.
* config/aarch64/aarch64.c (TARGET_BUILTIN_DECL): Define.



OK and please back port to ARM/aarch64-4.7-branch

Thanks
/Marcus



Re: [PATCH, generic] Fix for define_subst

2012-11-26 Thread Kirill Yukhin
Guys, this is a ping.

Thanks, K

On Thu, Nov 22, 2012 at 9:36 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 Here is copy-and-paste from issue raised by Ian (in the bottom).

 Fix is attached.
 ChangeLog entry:
 2012-11-22  Michael Zolotukhin  michael.v.zolotuk...@intel.com

 * gensupport.c (add_c_test): Check if expr isn't NULL.

 Is it ok for trunk?

 === CUT HERE ===
 It looks like the recent VEC rewrite might have interacted badly with
 this patch.

 I am getting a segfault in genconditions.c when add_c_test is called
 for desc.

 I think it is because the condition is not within a vector.


 Example of desc for working pattern:

 (define_insn (*zero_extendqihi2_aarch64)
  [
 (set (match_operand:HI 0 (register_operand) (=r,r))
 (zero_extend:HI (match_operand:QI 1 (nonimmediate_operand)
 (r,m
 ] () (@
uxtb\t%w0, %w1
ldrb\t%w0, %1)
  [
 (set_attr (v8type) (extend,load1))
 (set_attr (mode) (HI))
 ])


 Example of desc for broken one that uses define_subst:

 (define_insn (*addsi3_aarch64_noextend)
  [
 (set (match_operand:SI 0 (register_operand) (=rk,rk,rk))
 (plus:SI (match_operand:SI 1 (register_operand) (%rk,rk,rk))
 (match_operand:SI 2 (aarch64_plus_operand) (I,r,J
 ]  (@
   add\t%w0, %w1, %2
   add\t%w0, %w1, %w2
   sub\t%w0, %w1, #%n2)
  [
 (set_attr (v8type) (alu))
 (set_attr (mode) (SI))
 ])


 Note that there are no brackets around the  condition.  That's the issue,
 I
 think.

 === CUT HERE ===

 Thanks, K


Re: [patch] reorg.c janitor patch 3: remove rare_destination()

2012-11-26 Thread Jeff Law

On 11/25/2012 11:30 AM, Richard Sandiford wrote:

FWIW, the vast majority of condjumps have a REG_BR_PROB note so the
part of mostly_true_jump that handles the jumps without a REG_BR_PROB
note is itself a rare_destination :-) GCC is very careful about
preserving the notes. Perhaps all that code should just be removed.


Yeah, just returning 0 if there's no note sounds good.  Preapproved if
noone objects in 24 hours.
Fine with me -- all the branch prediction code in reorg.c pre-dates the 
branch prediction notes and information we carry around these days.


I wouldn't lose any sleep if all the reorg.c bits were converted to use 
the existing notes and the now redundant bits removed.


jeff



Re: [PATCH] Remove unused DELAY_SLOTS_FOR_EPILOGUE target macro

2012-11-26 Thread Jeff Law

On 11/24/2012 03:08 PM, Steven Bosscher wrote:

Hello,

The DELAY_SLOTS_FOR_EPILOGUE target macro, and the related
ELIGIBLE_FOR_EPILOGUE_DELAY macro, are unused. The attached patch
removes them.

OK for trunk?
OK, though this arguably shouldn't be going in during stage3. But given 
those are totally unused, OK.


jeff


Re: [PATCH] Remove redundant variable in hash_set

2012-11-26 Thread Richard Biener
On Mon, Nov 26, 2012 at 1:09 PM, Marek Polacek pola...@redhat.com wrote:
 I don't see why we need the `hash' variable, when we can use the
 `regno' variable directly.  Regtested/bootstrapped on x86_64-linux.
 Ok for trunk?

Ok.

Thanks,
Richard.

 2012-11-26  Marek Polacek  pola...@redhat.com

 * cprop.c (hash_set): Remove hash variable.  Use regno
 variable directly.

 --- gcc/cprop.c.mp  2012-11-26 12:13:08.631193625 +0100
 +++ gcc/cprop.c 2012-11-26 12:13:25.836238566 +0100
 @@ -170,10 +170,7 @@ reg_available_p (const_rtx x, const_rtx
  static unsigned int
  hash_set (int regno, int hash_table_size)
  {
 -  unsigned int hash;
 -
 -  hash = regno;
 -  return hash % hash_table_size;
 +  return (unsigned) regno % hash_table_size;
  }

  /* Insert assignment DEST:=SET from INSN in the hash table.

 Marek


[PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)

2012-11-26 Thread Marek Polacek
In this PR, for the C testcase, in .cse1 we have:

ENTRY
 |   
 |   
 2   
 |   
 |   
   + 4 --+
   |/ \  |   
   |   /   \ |   
   |  6 5|   
   | /\ |\   |   
   |/  \| \  |   
   |   73   |  8 |   
   |   ||   |  /\/   
   +---||   | /  \  /
   |  --109/  
   |-/  
  EXIT-/

(3-4 and 9-4 are back edges).  Now, in bypass_block, when we're
trying to bypass BB 4, we iterate over BB 4's incoming edges,
then we're iterating over reg_use_table (registers used in
insn).  Here we call
set = find_bypass_set (regno, e-src-index);
but since it returns non-NULL value, redirect_edge_and_branch_force 
later redirects edges and BB 4 is gone.
We then end up with
Redirecting fallthru edge 3-4 to 6
JUMP-BYPASS: Proved reg 59 in jump_insn 15 equals constant (const_int 1 [0x1])
Bypass edge from 3-4 to 6
Redirecting fallthru edge 9-4 to 5
JUMP-BYPASS: Proved reg 59 in jump_insn 15 equals constant (const_int 3 [0x3])
Bypass edge from 9-4 to 5
i.e., it is assumed that in one reg there are two constants, that can't
be right, right?!  I've tried to fix this by not redirecting edges (thus
bypassing block) if BB has more
incoming latch edges and the hash table has more different SRC rtxs with
the same hash value.
FWIW, the table looks like the below:
SET hash table (11 buckets, 3 entries)
Index 0 (hash value 4)
  (reg:SI 59 [ D.1735 ]) := (const_int 1 [0x1])
Index 1 (hash value 5)
  (reg/v/f:DI 60 [ b ]) := (const_int 0 [0])
Index 2 (hash value 4)
  (reg:SI 59 [ D.1735 ]) := (const_int 3 [0x3])

(Only not bypassing BB when it has more latch edges would work too,
but I'm afraid it could pessimize stuff somewhere else.)
I've also changed some zeros into NULLs for pointers.
Also, as folowup, we could get rid of may_be_loop_header variable 
completely, at one place where it is used we can use just 'n_latches  0'.
And add the C++ TC as well.
Regtested/bootstrapped on x86_64-linux, ok for trunk?

2012-11-26  Marek Polacek  pola...@redhat.com

PR middle-end/54838
* cprop.c (only_one_const_p): New function.
(find_bypass_set): New parameter.  Conditionally call
only_one_const_p.  Use NULL instead of 0.
(bypass_block): Adjust caller.  Determine number of latch edges.
(n_latches): New variable.

* gcc.dg/pr54838.c: New test.

--- gcc/cprop.c.mp  2012-11-26 12:13:08.631193625 +0100
+++ gcc/cprop.c 2012-11-26 14:41:32.864034283 +0100
@@ -1429,14 +1429,26 @@ find_implicit_sets (void)
 
 static int bypass_last_basic_block;
 
+/* Determine whether there is only one constant SRC rtx in the hash table
+   for REGNO.  Here we assume that for the same hash value there won't be
+   more entries with the same SRC rtx.  Return true if there is only one
+   constant, false otherwise.  */
+
+static bool
+only_one_const_p (int regno)
+{
+  struct expr *set = lookup_set (regno, set_hash_table);
+  return next_set (regno, set) == NULL;
+}
+
 /* Find a set of REGNO to a constant that is available at the end of basic
block BB.  Return NULL if no such set is found.  Based heavily upon
find_avail_set.  */
 
 static struct expr *
-find_bypass_set (int regno, int bb)
+find_bypass_set (int regno, int bb, bool multiple_latches)
 {
-  struct expr *result = 0;
+  struct expr *result = NULL;
 
   for (;;)
 {
@@ -1450,9 +1462,15 @@ find_bypass_set (int regno, int bb)
  set = next_set (regno, set);
}
 
-  if (set == 0)
+  if (set == NULL)
break;
 
+  /* If there are more latch edges coming to the BB, we need to
+make sure that for the same hash value there aren't
+more constants.  */
+  if (multiple_latches  !only_one_const_p (regno))
+   return NULL;
+
   src = set-src;
   if (cprop_constant_p (src))
result = set;
@@ -1502,6 +1520,7 @@ bypass_block (basic_block bb, rtx setcc,
   int may_be_loop_header;
   unsigned removed_p;
   unsigned i;
+  unsigned n_latches;
   edge_iterator ei;
 
   insn = (setcc != NULL) ? setcc : jump;
@@ -1514,12 +1533,12 @@ bypass_block (basic_block bb, rtx setcc,
 find_used_regs (XEXP (note, 0), NULL);
 
   may_be_loop_header = false;
+  n_latches = 0;
   FOR_EACH_EDGE (e, ei, bb-preds)
 if (e-flags  EDGE_DFS_BACK)
-  {
-   may_be_loop_header = true;
-   break;
-  }
+  n_latches++;
+
+  may_be_loop_header = n_latches  0;
 
   change = 0;
   for (ei = ei_start (bb-preds); (e = ei_safe_edge (ei)); )
@@ -1557,7 +1576,7 @@ bypass_block (basic_block bb, rtx setcc,
  struct expr *set;
  rtx src, new_rtx;
 
- set = find_bypass_set (regno, e-src-index);
+ set = find_bypass_set (regno, e-src-index, n_latches  1);
 
  if (! set)
continue;
--- gcc/testsuite/gcc.dg/pr54838.c.mp   2012-11-26 14:48:43.783980854 +0100
+++ 

Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)

2012-11-26 Thread Dominique Dhumieres
 Or rather this one. Same hammer, different color. It turns out that
 the rtlanal.c change caused problems, so I've got to use a home-brewn
 equivalent of remove_reg_equal_equiv_notes_for_regno...

This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
does).

 Test case is unchanged so I'm omitting it here.

This test passes on x86_64-apple-darwin10 for a clean trunk.

Thanks for looking at the problem.

Dominique


[PATCH][ARM] AArch32 vmaxnm, vminnm support

2012-11-26 Thread Kyrylo Tkachov
Hi all,
This patch adds support for the AArch32 vmaxnm and vminnm VFP instructions
in that can be used to implement the smax[sf,df]3 and smin[sf,df]3 RTL
patterns.
The patterns are only used by gcc when unsafe math optimisations are turned
on.
Two new values for the type attribute are introduced: f_minmaxs and
f_minmaxd.

New compilation tests are added. They pass and no regressions on
arm-none-eabi.

Ok for trunk?

Thanks,
Kyrill

gcc/ChangeLog

2012-11-26  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm.md (f_minmaxs, f_minmaxd): New types.
* config/arm/vfp.md (smaxmode3): New pattern.
 (sminmode3): Likewise.


gcc/testsuite/ChangeLog

2012-11-26  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/arm/vmaxnmdf.c: New test.
* gcc.target/arm/vmaxnmsf.c: Likewise.
* gcc.target/arm/vminnmsf.c: Likewise.
* gcc.target/arm/vminnmdf.c: Likewise.--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -274,6 +274,8 @@
   fmacd,\
   f_rints,\
   f_rintd,\
+  f_minmaxs,\
+  f_minmaxd,\
   f_flag,\
   f_loads,\
   f_loadd,\
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 480a88d..82b277a 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1263,6 +1263,31 @@
(set_attr type f_rintvfp_type)]
 )
 
+;; MIN_EXPR and MAX_EXPR eventually map to 'smin' and 'smax' in RTL.
+;; The 'smax' and 'smin' RTL standard pattern names do not specify which
+;; operand will be returned when both operands are zero (i.e. they may not
+;; honour signed zeroes), or when either operand is NaN.  Therefore GCC
+;; only introduces MIN_EXPR/MAX_EXPR in fast math mode or when not honouring
+;; NaNs.
+
+(define_insn smaxmode3
+  [(set (match_operand:SDF 0 register_operand =F_constraint)
+(smax:SDF (match_operand:SDF 1 register_operand F_constraint)
+ (match_operand:SDF 2 register_operand F_constraint)))]
+  TARGET_HARD_FLOAT  TARGET_FPU_ARMV8 vfp_double_cond
+  vmaxnm.V_if_elem\\t%V_reg0, %V_reg1, %V_reg2
+  [(set_attr type f_minmaxvfp_type)]
+)
+
+(define_insn sminmode3
+  [(set (match_operand:SDF 0 register_operand =F_constraint)
+(smin:SDF (match_operand:SDF 1 register_operand F_constraint)
+ (match_operand:SDF 2 register_operand F_constraint)))]
+  TARGET_HARD_FLOAT  TARGET_FPU_ARMV8 vfp_double_cond
+  vminnm.V_if_elem\\t%V_reg0, %V_reg1, %V_reg2
+  [(set_attr type f_minmaxvfp_type)]
+)
+
 ;; Unimplemented insns:
 ;; fldm*
 ;; fstm*
diff --git a/gcc/testsuite/gcc.target/arm/vmaxnmdf.c 
b/gcc/testsuite/gcc.target/arm/vmaxnmdf.c
new file mode 100644
index 000..1a172b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vmaxnmdf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_vfp_ok } */
+/* { dg-options -ffast-math } */
+/* { dg-add-options arm_v8_vfp } */
+
+double
+foo (double x, double y)
+{
+  return __builtin_fmax (x, y);
+}
+
+/* { dg-final { scan-assembler-times vmaxnm.f64\td\[0-9\]+ 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/vmaxnmsf.c 
b/gcc/testsuite/gcc.target/arm/vmaxnmsf.c
new file mode 100644
index 000..bc23261
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vmaxnmsf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_vfp_ok } */
+/* { dg-options -ffast-math } */
+/* { dg-add-options arm_v8_vfp } */
+
+float
+foo (float x, float y)
+{
+  return __builtin_fmaxf (x, y);
+}
+
+/* { dg-final { scan-assembler-times vmaxnm.f32\ts\[0-9\]+ 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/vminnmdf.c 
b/gcc/testsuite/gcc.target/arm/vminnmdf.c
new file mode 100644
index 000..c2a6915
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vminnmdf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_vfp_ok } */
+/* { dg-options -ffast-math } */
+/* { dg-add-options arm_v8_vfp } */
+
+double
+foo (double x, double y)
+{
+  return __builtin_fmin (x, y);
+}
+
+/* { dg-final { scan-assembler-times vminnm.f64\td\[0-9\]+ 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/vminnmsf.c 
b/gcc/testsuite/gcc.target/arm/vminnmsf.c
new file mode 100644
index 000..eee43bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vminnmsf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_vfp_ok } */
+/* { dg-options -ffast-math } */
+/* { dg-add-options arm_v8_vfp } */
+
+float
+foo (float x, float y)
+{
+  return __builtin_fminf (x, y);
+}
+
+/* { dg-final { scan-assembler-times vminnm.f32\ts\[0-9\]+ 1 } } */


Re: [RFA 6/8] validate_failures.py: remove pass/fail from GetBuildData

2012-11-26 Thread Diego Novillo
On Sat, Nov 24, 2012 at 5:54 PM, Doug Evans d...@google.com wrote:

 2012-11-24  Doug Evans  d...@google.com

 * testsuite-management/validate_failures.py: Remove pass/fail
 indicator from
 result of GetBuildData.

OK.


Diego.


Re: [PATCH] Fix make_range_step with -fwrapv (PR tree-optimization/55236)

2012-11-26 Thread Richard Biener
On Thu, Nov 8, 2012 at 9:04 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 With -fwrapv the range test optimization miscompiles the following testcase
 (both inter-bb version in 4.8+ in first function and the pure fold-const.c
 one since 3.4).  The problem is that with -fwrapv and signed type
 -x in [-,b] doesn't imply x in [-b,-] and -x in [a,-] doesn't imply
 x in [-,-a], as -INT_MIN is INT_MIN and thus if INT_MIN was in the range,
 after negation it needs to be in the range too (and similarly if it wasn't,
 it can't be there).  Turns out that if we make sure both low and high
 are non-NULL (i.e. replace - with INT_MIN resp. INT_MAX), normalize:
 will do the right thing that it does also for unsigned operation
 - if it detects the high bound is lower than low bound, it adjusts the
 range by inverting it.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk (what
 about 4.7 where bar is miscompiled too)?

 Perhaps for PLUS_EXPR/MINUS_EXPR and signed -fwrapv type we could do the
 same instead of return NULL_TREE; that we do right now (added by Kazu
 for PR23518).

Ok.

Thanks,
Richard.

 2012-11-08  Jakub Jelinek  ja...@redhat.com

 PR tree-optimization/55236
 * fold-const.c (make_range_step) case NEGATE_EXPR: For -fwrapv
 and signed ARG0_TYPE, force low and high to be non-NULL.

 * gcc.dg/pr55236.c: New test.

 --- gcc/fold-const.c.jj 2012-11-07 11:24:34.0 +0100
 +++ gcc/fold-const.c2012-11-08 16:54:38.978339040 +0100
 @@ -3880,6 +3880,17 @@ make_range_step (location_t loc, enum tr
return arg0;

  case NEGATE_EXPR:
 +  /* If flag_wrapv and ARG0_TYPE is signed, make sure
 +low and high are non-NULL, then normalize will DTRT.  */
 +  if (!TYPE_UNSIGNED (arg0_type)
 +  !TYPE_OVERFLOW_UNDEFINED (arg0_type))
 +   {
 + if (low == NULL_TREE)
 +   low = TYPE_MIN_VALUE (arg0_type);
 + if (high == NULL_TREE)
 +   high = TYPE_MAX_VALUE (arg0_type);
 +   }
 +
/* (-x) IN [a,b] - x in [-b, -a]  */
n_low = range_binop (MINUS_EXPR, exp_type,
build_int_cst (exp_type, 0),
 --- gcc/testsuite/gcc.dg/pr55236.c.jj   2012-11-08 16:40:49.171781137 +0100
 +++ gcc/testsuite/gcc.dg/pr55236.c  2012-11-08 16:41:20.044578919 +0100
 @@ -0,0 +1,31 @@
 +/* PR tree-optimization/55236 */
 +/* { dg-do run } */
 +/* { dg-options -O2 -fwrapv } */
 +
 +extern void abort ();
 +
 +__attribute__((noinline, noclone)) void
 +foo (int i)
 +{
 +  if (i  0)
 +abort ();
 +  i = -i;
 +  if (i  0)
 +return;
 +  abort ();
 +}
 +
 +__attribute__((noinline, noclone)) void
 +bar (int i)
 +{
 +  if (i  0 || (-i) = 0)
 +abort ();
 +}
 +
 +int
 +main ()
 +{
 +  foo (-__INT_MAX__ - 1);
 +  bar (-__INT_MAX__ - 1);
 +  return 0;
 +}

 Jakub


Re: [PATCH] Reset source location for instructions moved out of its original residing basic block

2012-11-26 Thread Richard Biener
On Sun, Nov 4, 2012 at 11:26 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Nov  1, 2012, Dehao Chen de...@google.com wrote:

 I see your point. How about we guard these changes with a flag, say
 -gless-jumpy

 The right (DWARF-expected) approach to avoid this sortof jumpiness is to
 teach GCC to emit .locs with is_stmt=1, along the lines (and
 limitations) described in my GCC Summit paper on stmt frontier notes,
 that proposes further extensions to deal with the foreseen limitations.

Right.  gdb could also instead of show a single line (and that changing
all the time) show all currently active lines (thus scan the current
function for lines that appear and compute their liveness).  A step
would then jump to the next stmt that changes line liveness.

Richard.

 http://people.redhat.com/~aoliva/papers/sfn/gcc2010.pdf

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PATCH] Allocate extra 16 bytes for -fsanitize=address

2012-11-26 Thread H.J. Lu
On Fri, Nov 23, 2012 at 11:50 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, Nov 23, 2012 at 11:33:37AM -0800, H.J. Lu wrote:
 2012-11-21  H.J. Lu  hongjiu...@intel.com

   PR bootstrap/55380
   * charset.c (_cpp_convert_input): Clear CPP_PAD_BUFFER_SIZE
   bytes if CLEAR_CPP_PAD_BUFFER isn't 0.  Allocate extra
   CPP_PAD_BUFFER_SIZE bytes and clear it if CLEAR_CPP_PAD_BUFFER
   isn't 0.

   * files.c (read_file_guts): Allocate extra CPP_PAD_BUFFER_SIZE
   bytes.

   * internal.h (CPP_PAD_BUFFER_SIZE): New.  Defined to 16 for
   -fsanitize=address if __SANITIZE_ADDRESS__ is defined.
   (CLEAR_CPP_PAD_BUFFER): New.

 I'd say you are making this way too much complicated.
 The following patch just changes those + 1 to + 16, adds memset of the 16
 bytes unconditionally, but as it also fixes a thinko which IMHO affects
 the most common cases (regular file, with no conversion needed), I'd say
 it ought to be faster than the old version (the old version would allocate
 st.st_size + 1 bytes long buffer, read st.st_size bytes into it,
 call _cpp_convert_input with len st.st_size and size st.st_size (the latter
 should have been st.st_size + 1, i.e. the allocated size) and thus
 to.len = to.asize was true and we called realloc with the same length
 as malloc has been called originally.

 2012-11-23  Jakub Jelinek  ja...@redhat.com

 PR bootstrap/55380
 * files.c (read_file_guts): Allocate extra 16 bytes instead of
 1 byte at the end of buf.  Pass size + 16 instead of size
 to _cpp_convert_input.
 * charset.c (_cpp_convert_input): Reallocate if there aren't
 at least 16 bytes beyond to.len in the buffer.  Clear 16 bytes
 at to.text + to.len.

 --- libcpp/files.c.jj   2012-11-22 11:04:24.0 +0100
 +++ libcpp/files.c  2012-11-23 20:37:03.146379853 +0100
 @@ -671,7 +671,7 @@ read_file_guts (cpp_reader *pfile, _cpp_
 the majority of C source files.  */
  size = 8 * 1024;

 -  buf = XNEWVEC (uchar, size + 1);
 +  buf = XNEWVEC (uchar, size + 16);
total = 0;
while ((count = read (file-fd, buf + total, size - total))  0)
  {
 @@ -682,7 +682,7 @@ read_file_guts (cpp_reader *pfile, _cpp_
   if (regular)
 break;
   size *= 2;
 - buf = XRESIZEVEC (uchar, buf, size + 1);
 + buf = XRESIZEVEC (uchar, buf, size + 16);
 }
  }

 @@ -699,7 +699,7 @@ read_file_guts (cpp_reader *pfile, _cpp_

file-buffer = _cpp_convert_input (pfile,
  CPP_OPTION (pfile, input_charset),
 -buf, size, total,
 +buf, size + 16, total,
  file-buffer_start,
  file-st.st_size);
file-buffer_valid = true;
 --- libcpp/charset.c.jj 2011-01-06 10:22:00.0 +0100
 +++ libcpp/charset.c2012-11-23 20:40:39.736116642 +0100
 @@ -1,6 +1,6 @@
  /* CPP Library - charsets
 Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2008, 2009,
 -   2010 Free Software Foundation, Inc.
 +   2010, 2012 Free Software Foundation, Inc.

 Broken out of c-lex.c Apr 2003, adding valid C99 UCN ranges.

 @@ -1729,9 +1729,12 @@ _cpp_convert_input (cpp_reader *pfile, c
  iconv_close (input_cset.cd);

/* Resize buffer if we allocated substantially too much, or if we
 - haven't enough space for the \n-terminator.  */
 -  if (to.len + 4096  to.asize || to.len = to.asize)
 -to.text = XRESIZEVEC (uchar, to.text, to.len + 1);
 + haven't enough space for the \n-terminator or following
 + 15 bytes of padding.  */
 +  if (to.len + 4096  to.asize || to.len + 16  to.asize)
 +to.text = XRESIZEVEC (uchar, to.text, to.len + 16);
 +
 +  memset (to.text + to.len, '\0', 16);

/* If the file is using old-school Mac line endings (\r only),
   terminate with another \r, not an \n, so that we do not mistake


 Jakub

You should also mention:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54691

in ChangeLog entry.

Thanks.


-- 
H.J.


Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-11-26 Thread Richard Biener
On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

 On 11/04/2012 11:54 AM, Richard Biener wrote:

 On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:

 Kenneth Zadeck zad...@naturalbridge.com writes:

 I would like you to respond to at least point 1 of this email.   In it
 there is code from the rtl level that was written twice, once for the
 case when the size of the mode is less than the size of a HWI and once
 for the case where the size of the mode is less that 2 HWIs.

 my patch changes this to one instance of the code that works no matter
 how large the data passed to it is.

 you have made a specific requirement for wide int to be a template that
 can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
 would like to know how this particular fragment is to be rewritten in
 this model?   It seems that I would have to retain the structure where
 there is one version of the code for each size that the template is
 instantiated.

 I think richi's argument was that wide_int should be split into two.
 There should be a bare-metal class that just has a length and HWIs,
 and the main wide_int class should be an extension on top of that
 that does things to a bit precision instead.  Presumably with some
 template magic so that the length (number of HWIs) is a constant for:

typedef foo2 double_int;

 and a variable for wide_int (because in wide_int the length would be
 the number of significant HWIs rather than the size of the underlying
 array).  wide_int would also record the precision and apply it after
 the full HWI operation.

 So the wide_int class would still provide as wide as we need
 arithmetic,
 as in your rtl patch.  I don't think he was objecting to that.

 That summarizes one part of my complaints / suggestions correctly.  In
 other
 mails I suggested to not make it a template but a constant over object
 lifetime
 'bitsize' (or maxlen) field.  Both suggestions likely require more thought
 than
 I put into them.  The main reason is that with C++ you can abstract from
 where
 wide-int information pieces are stored and thus use the arithmetic /
 operation
 workers without copying the (source) wide-int objects.  Thus you should
 be able to write adaptors for double-int storage, tree or RTX storage.

 We had considered something along these lines and rejected it.   I am not
 really opposed to doing something like this, but it is not an obvious
 winning idea and is likely not to be a good idea.   Here was our thought
 process:

 if you abstract away the storage inside a wide int, then you should be able
 to copy a pointer to the block of data from either the rtl level integer
 constant or the tree level one into the wide int.   It is certainly true
 that making a wide_int from one of these is an extremely common operation
 and doing this would avoid those copies.

 However, this causes two problems:
 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make
 the object.   it created the base object and then it allocated the array.
 Richard S noticed that we could just allocate one CONST_WIDE_INT that had
 the array in it.   Doing it this way saves one ggc allocation and one
 indirection when accessing the data within the CONST_WIDE_INT.   Our plan is
 to use the same trick at the tree level.   So to avoid the copying, you seem
 to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.

I did not propose having a pointer to the data in the RTX or tree int.  Just
the short-lived wide-ints (which are on the stack) would have a pointer to
the data - which can then obviously point into the RTX and tree data.

 2) You are now stuck either ggcing the storage inside a wide_int when they
 are created as part of an expression or you have to play some game to
 represent the two different storage plans inside of wide_int.

Hm?  wide-ints are short-lived and thus never live across a garbage collection
point.  We create non-GCed objects pointing to GCed objects all the time
and everywhere this way.

   Clearly this
 is where you think that we should be going by suggesting that we abstract
 away the internal storage.   However, this comes at a price:   what is
 currently an array access in my patches would (i believe) become a function
 call.

No, the workers (that perform the array accesses) will simply get
a pointer to the first data element.  Then whether it's embedded or
external is of no interest to them.

  From a performance point of view, i believe that this is a non
 starter. If you can figure out how to design this so that it is not a
 function call, i would consider this a viable option.

 On the other side of this you are clearly correct that we are copying the
 data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.But
 this is why we represent data inside of the wide_ints, the INT_CSTs and the
 CONST_WIDE_INTs in a compressed form.   Even with very big types, which are
 

Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2012-11-26 Thread Richard Biener
On Wed, Nov 7, 2012 at 10:51 PM, Meador Inge mead...@codesourcery.com wrote:
 Ping ^ 4.

Ok.

Thanks,
Richard.

 On 10/29/2012 10:46 AM, Meador Inge wrote:
 Ping ^ 3.

 On 10/18/2012 10:30 AM, Meador Inge wrote:
 Ping ^ 2.

 On 10/09/2012 09:44 PM, Meador Inge wrote:
 Ping.

 On 10/04/2012 03:45 PM, Meador Inge wrote:
 Hi All,

 Currently the gcc-{ar,nm,ranlib} utilities assume that binutils is in
 path when invoking the wrapped binutils program.  This goes against the
 accepted practice in GCC to find sub-programs relative to where the
 GCC binaries are stored and to not make assumptions about the PATH.

 This patch changes the gcc-{ar,nm,ranlib} utilities to do the same
 by factoring out some utility code for finding files from collect2.c.
 These functions are then leveraged to find the binutils programs.
 Note that similar code exist in gcc.c.  Perhaps one day everything
 can be merged to the file-find files.

 Tested for Windows and GNU/Linux hosts and i686-pc-linux-gnu and
 arm-none-eabi targets.  OK?

 P.S. I am not quite sure what is best for the copyrights and contributed
 by comments in the file-find* files I added since that code was just 
 moved.
 This patch drops the contributed by and keeps all the copyright dates from
 collect2.c.

 2012-10-04  Meador Inge  mead...@codesourcery.com

* collect2.c (main): Call find_file_set_debug.
(find_a_find, add_prefix, prefix_from_env, prefix_from_string):
Factor out into ...
* file-find.c (New file): ... here and ...
* file-find.h (New file): ... here.
* gcc-ar.c (standard_exec_prefix): New variable.
(standard_libexec_prefix): Ditto.
(tooldir_base_prefix) Ditto.
(self_exec_prefix): Ditto.
(self_libexec_prefix): Ditto.
(self_tooldir_prefix): Ditto.
(target_version): Ditto.
(path): Ditto.
(target_path): Ditto.
(setup_prefixes): New function.
(main): Rework how wrapped programs are found.
* Makefile.in (OBJS-libcommon-target): Add file-find.o.
(AR_OBJS): New variable.
(gcc-ar$(exeext)): Add dependency on $(AR_OBJS).
(gcc-nm$(exeext)): Ditto.
(gcc-ranlib(exeext)): Ditto.
(COLLECT2_OBJS): Add file-find.o.
(collect2.o): Add file-find.h prerequisite.
(file-find.o): New rule.

 Index: gcc/gcc-ar.c
 ===
 --- gcc/gcc-ar.c   (revision 192099)
 +++ gcc/gcc-ar.c   (working copy)
 @@ -21,21 +21,110 @@
  #include config.h
  #include system.h
  #include libiberty.h
 +#include file-find.h

  #ifndef PERSONALITY
  #error Please set personality
  #endif

 +/* The exec prefix as derived at compile-time from --prefix.  */
 +
 +static const char standard_exec_prefix[] = STANDARD_EXEC_PREFIX;
 +
 +/* The libexec prefix as derived at compile-time from --prefix.  */
 +
  static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX;
 +
 +/* The bindir prefix as derived at compile-time from --prefix.  */
 +
  static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX;
 -static const char *const target_machine = TARGET_MACHINE;

 +/* A relative path to be used in finding the location of tools
 +   relative to this program.  */
 +
 +static const char *const tooldir_base_prefix = TOOLDIR_BASE_PREFIX;
 +
 +/* The exec prefix as relocated from the location of this program.  */
 +
 +static const char *self_exec_prefix;
 +
 +/* The libexec prefix as relocated from the location of this program.  */
 +
 +static const char *self_libexec_prefix;
 +
 +/* The tools prefix as relocated from the location of this program.  */
 +
 +static const char *self_tooldir_prefix;
 +
 +/* The name of the machine that is being targeted.  */
 +
 +static const char *const target_machine = DEFAULT_TARGET_MACHINE;
 +
 +/* The target version.  */
 +
 +static const char *const target_version = DEFAULT_TARGET_VERSION;
 +
 +/* The collection of target specific path prefixes.  */
 +
 +static struct path_prefix target_path;
 +
 +/* The collection path prefixes.  */
 +
 +static struct path_prefix path;
 +
 +/* The directory separator.  */
 +
  static const char dir_separator[] = { DIR_SEPARATOR, 0 };

 +static void
 +setup_prefixes (const char *exec_path)
 +{
 +  const char *self;
 +
 +  self = getenv (GCC_EXEC_PREFIX);
 +  if (!self)
 +self = exec_path;
 +  else
 +self = concat (self, gcc- PERSONALITY, NULL);
 +
 +  /* Relocate the exec prefix.  */
 +  self_exec_prefix = make_relative_prefix (self,
 + standard_bin_prefix,
 + standard_exec_prefix);
 +  if (self_exec_prefix == NULL)
 +self_exec_prefix = standard_exec_prefix;
 +
 +  /* Relocate libexec prefix.  */
 +  self_libexec_prefix = make_relative_prefix (self,
 +standard_bin_prefix,
 +standard_libexec_prefix);
 +  if (self_libexec_prefix == NULL)
 +self_libexec_prefix = standard_libexec_prefix;
 +
 +
 +  /* Build the relative 

Re: [PATCH] Fix substitute_and_fold ICE (PR tree-optimization/55331)

2012-11-26 Thread Richard Biener
On Thu, Nov 15, 2012 at 9:09 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 On the following testcase substitute_and_fold ICEs because memmove
 of length 1 on an empty class is optimized away, and this function wasn't
 prepared to handle that.

 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
 trunk/4.7?

I think the bug is that the stmt is removed.  fold_stmt is not supposed to
remove the stmt in any case - it may replace it with a gimple_nop at most.

Thus, gimplify_and_update_call_from_tree is at fault here.

I am testing a patch that fixes it.

Thanks,
Richard.

 2012-11-15  Jakub Jelinek  ja...@redhat.com

 PR tree-optimization/55331
 * tree-ssa-propagate.c (substitute_and_fold): Handle fold_stmt
 turning a stmt into nothing.

 * g++.dg/opt/pr55331.C: New test.

 --- gcc/tree-ssa-propagate.c.jj 2012-11-02 09:01:55.0 +0100
 +++ gcc/tree-ssa-propagate.c2012-11-15 12:27:37.218543417 +0100
 @@ -1094,7 +1094,7 @@ substitute_and_fold (ssa_prop_get_value_
   gimple stmt = gsi_stmt (i);
   gimple old_stmt;
   enum gimple_code code = gimple_code (stmt);
 - gimple_stmt_iterator oldi;
 + gimple_stmt_iterator oldi, nexti;

   oldi = i;
   if (do_dce)
 @@ -1147,6 +1147,8 @@ substitute_and_fold (ssa_prop_get_value_
 }

   old_stmt = stmt;
 + nexti = oldi;
 + gsi_next (nexti);

   /* Some statements may be simplified using propagator
  specific information.  Do this before propagating
 @@ -1171,36 +1173,63 @@ substitute_and_fold (ssa_prop_get_value_
   /* Now cleanup.  */
   if (did_replace)
 {
 - stmt = gsi_stmt (oldi);
 -
 -  /* If we cleaned up EH information from the statement,
 - remove EH edges.  */
 - if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
 -   gimple_purge_dead_eh_edges (bb);
 -
 -  if (is_gimple_assign (stmt)
 -   (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
 -  == GIMPLE_SINGLE_RHS))
 -  {
 -tree rhs = gimple_assign_rhs1 (stmt);
 -
 -if (TREE_CODE (rhs) == ADDR_EXPR)
 -  recompute_tree_invariant_for_addr_expr (rhs);
 -  }
 + /* old_stmt might have been also replaced by nothing,
 +in that case set stmt to NULL.  */
 + if (gsi_end_p (oldi))
 +   {
 + gcc_checking_assert (gsi_end_p (nexti));
 + stmt = NULL;
 +   }
 + else
 +   {
 + stmt = gsi_stmt (oldi);
 + if (!gsi_end_p (nexti)  gsi_stmt (nexti) == stmt)
 +   stmt = NULL;
 +   }

 - /* Determine what needs to be done to update the SSA form.  */
 - update_stmt (stmt);
 - if (!is_gimple_debug (stmt))
 -   something_changed = true;
 + if (stmt == NULL)
 +   {
 + if (maybe_clean_eh_stmt (old_stmt))
 +   gimple_purge_dead_eh_edges (bb);
 + something_changed = true;
 +   }
 + else
 +   {
 + /* If we cleaned up EH information from the statement,
 +remove EH edges.  */
 + if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
 +   gimple_purge_dead_eh_edges (bb);
 +
 + if (is_gimple_assign (stmt)
 +  (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
 + == GIMPLE_SINGLE_RHS))
 +   {
 + tree rhs = gimple_assign_rhs1 (stmt);
 +
 + if (TREE_CODE (rhs) == ADDR_EXPR)
 +   recompute_tree_invariant_for_addr_expr (rhs);
 +   }
 +
 + /* Determine what needs to be done to update the SSA
 +form.  */
 + update_stmt (stmt);
 + if (!is_gimple_debug (stmt))
 +   something_changed = true;
 +   }
 }

   if (dump_file  (dump_flags  TDF_DETAILS))
 {
   if (did_replace)
 {
 - fprintf (dump_file, Folded into: );
 - print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 - fprintf (dump_file, \n);
 + if (stmt == NULL)
 +   fprintf (dump_file, Folded into nothing\n);
 + else
 +   {
 + fprintf (dump_file, Folded into: );
 + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 + fprintf (dump_file, \n);
 +   }
 }
   else
 fprintf (dump_file, Not 

Re: [PATCH] Fix substitute_and_fold ICE (PR tree-optimization/55331)

2012-11-26 Thread Jakub Jelinek
On Mon, Nov 26, 2012 at 04:24:41PM +0100, Richard Biener wrote:
 On Thu, Nov 15, 2012 at 9:09 PM, Jakub Jelinek ja...@redhat.com wrote:
  On the following testcase substitute_and_fold ICEs because memmove
  of length 1 on an empty class is optimized away, and this function wasn't
  prepared to handle that.
 
  Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
  trunk/4.7?
 
 I think the bug is that the stmt is removed.  fold_stmt is not supposed to
 remove the stmt in any case - it may replace it with a gimple_nop at most.
 
 Thus, gimplify_and_update_call_from_tree is at fault here.
 
 I am testing a patch that fixes it.

Note that fold_stmt_1 also has code to handle a call folding into nothing,
so perhaps that could be removed too if you change it to fold into a nop
instead.

Jakub


Re: [PATCH] Cleanup last_location and update input_location in ipa_prop

2012-11-26 Thread Richard Biener
On Thu, Nov 8, 2012 at 6:39 PM, Dehao Chen de...@google.com wrote:
 ping...

The emit-rtl.c hunk is ok.  I'm questioning the ipa-prop.c hunk though - what
looks at input_location (nothing outside of the frontend should, really).

Thanks,
Richard.


 On Mon, Nov 5, 2012 at 5:20 PM, Dehao Chen de...@google.com wrote:

 Hi,

 This is a patch to do some obvious cleanup and setting correct
 input_location in ipa_prop (because it invokes gimplification
 routines).

 Bootstrapped and passed gcc regression tests.

 Is it okay for trunk?

 Thanks,
 Dehao

 gcc/ChangeLog:
 2010-11-05  Dehao Chen  de...@google.com

 * ipa-prop.c (ipa_modify_call_arguments): Set input_location so that
 gimplification routines can have right location.
 * emit-rtl.c (last_location): Remove unused variable.


Re: [PATCH] Do not dump location for compare_debug

2012-11-26 Thread Richard Biener
On Thu, Nov 1, 2012 at 1:26 AM, Dehao Chen de...@google.com wrote:
 Hi,

 When -fcompare_debug is used, what we really want to do is to compare
 instructions between the -g version and -gtoggle version. However,
 current dump file still contains the source line in its rtl dump. This
 patch changes to only dump rtl without dumping its source info.

 Bootstrapped and passed gcc regression test on x86_64.

 Is it okay for trunk?

I'm not sure - it at least works comparing with them.  Alex - was source
location info included on purpose here?

Thanks,
Richard.

 Thanks,
 Dehao

 2012-10-31  Dehao Chen  de...@google.com

 * final.c (rest_of_clean_state): Dump the final rtl without location.


Re: patch to fix constant math - second small patch

2012-11-26 Thread Richard Biener
On Thu, Nov 8, 2012 at 7:13 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:
 I have added the proper doc.  OK to commit?

Ok.

Thanks,
Richard.

 Kenny



 On 10/08/2012 05:06 AM, Richard Guenther wrote:

 On Sat, Oct 6, 2012 at 12:48 AM, Kenneth Zadeck
 zad...@naturalbridge.com wrote:

 This patch adds machinery to genmodes.c so that largest possible sizes of
 various data structures can be determined at gcc build time.  These
 functions create 3 symbols that are available in insn-modes.h:
 MAX_BITSIZE_MODE_INT - the bitsize of the largest int.
 MAX_BITSIZE_MODE_PARTIAL_INT - the bitsize of the largest partial int.
 MAX_BITSIZE_MODE_ANY_INT - the largest bitsize of any kind of int.

 Ok.  Please document these macros in rtl.texi.

 Richard.




Re: Tighten checking of vector comparisons

2012-11-26 Thread Richard Biener
On Thu, Nov 22, 2012 at 11:59 AM, Marc Glisse marc.gli...@inria.fr wrote:
 (I forgot to send this at the time)


 On Sun, 4 Nov 2012, Richard Biener wrote:

 - else if (!INTEGRAL_TYPE_P (type)  TREE_CODE (type) !=
 VECTOR_TYPE)
 + else if (!INTEGRAL_TYPE_P (type)  !VOID_TYPE_P (type)
 +   TREE_CODE (type) != VECTOR_TYPE)

 [...]

 Ok for this part.


 Applied, thanks.


 Index: gcc/tree-cfg.c
 ===
 --- gcc/tree-cfg.c  (revision 193060)
 +++ gcc/tree-cfg.c  (working copy)
 @@ -3263,21 +3263,30 @@ verify_gimple_comparison (tree type, tre
error (mismatching comparison operand types);
debug_generic_expr (op0_type);
debug_generic_expr (op1_type);
return true;
  }

/* The resulting type of a comparison may be an effective boolean
 type.
 */
if (INTEGRAL_TYPE_P (type)
 (TREE_CODE (type) == BOOLEAN_TYPE
   || TYPE_PRECISION (type) == 1))
 -;
 +{
 +  if (TREE_CODE (op0_type) == VECTOR_TYPE
 + || TREE_CODE (op1_type) == VECTOR_TYPE)
 +{
 +  error (vector comparison returning a boolean);
 +  debug_generic_expr (op0_type);
 +  debug_generic_expr (op1_type);
 +  return true;
 +}


 verify_gimple_* should have positive checks, thus, check that
 if there are vector operands the comparison result should be a
 vector.  Not complaining about a vector comparison having a
 boolean result.


 I wasn't sure what that was supposed to look like, so I dropped it for now.

Ok, looking closer we have

  /* The resulting type of a comparison may be an effective boolean type.  */
  if (INTEGRAL_TYPE_P (type)
   (TREE_CODE (type) == BOOLEAN_TYPE
  || TYPE_PRECISION (type) == 1))
;
  /* Or an integer vector type with the same size and element count
 as the comparison operand types.  */
  else if (TREE_CODE (type) == VECTOR_TYPE
TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
{
...

Thus you are rejecting a boolean valued vector comparison which we
otherwise happily accept.  I suppose that makes sense (even though
at least equality compares can make sense).

Thus that hunk is ok as well.

Thanks,
Richard.

 --
 Marc Glisse


Re: [PATCH] Update source location for PRE inserted stmt

2012-11-26 Thread Dehao Chen
On Sun, Nov 25, 2012 at 11:37 AM, Xinliang David Li davi...@google.com wrote:
 On Sun, Nov 25, 2012 at 4:40 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Nov 15, 2012 at 5:46 PM, Eric Botcazou ebotca...@adacore.com wrote:
 But UNKNOWN_LOCATION is effectively wrong as well.  If other
 optimizations move the statements above the inserted instruction, then
 the new instruction ends up inheriting whatever location happens to be
 in the previous statement.

 That's correct, UNKNOWN_LOCATION isn't a panacea either, although it can 
 help
 in some cases.  The only safe thing to do is to put the exact source 
 location
 of the statement, i.e. the location of the source construct for which the
 statement has been generated.

 There may not be a source location for a generated statement, the computed
 value might not have been computed in the source at any point even.  Consider
 re-association of an expression and then a re-associated part becoming
 partially redundant.

  if ()
tem = a + b;
  tem2 = a + c  + b;

 after re-assoc + PRE:

  if ()
tem = a + b;
 else
tem' = a + b;  // which sloc here?

For inserted expr/stmt, inheriting location from placeholder is making
sense. But when the placeholder has no location in the first place,
explicit NOWHERE is needed, otherwise it will just has random location
from other basic block.

Thus for this cases, sloc here should be NOWHERE.

However, in the unittest I provided:

  if (x  0)
ret += *a;
  else
a++;
  ret += *a;

After PRE:

if (x  0) {
  pretmp_1 = *a;
  ret += pretmp_1;
} else {
  a++;
  pretmp_2 = *a; // it should inherit location from prev line
}
pretmp_3 = PHI(pretmp_1, pretmp_2)

How does that sound?

Thanks,
Dehao


 If there was a else branch, the proposed patch by Dehao will use the
 source location of the previous stmt 'tem' = a + b' was inserted
 after.

 I did not see any special source location handling in
 tree-ssa-reassoc.c -- looks like the same principle is used there --
 the new assignment after the operand shuffling will inherit the source
 location of the placeholder gimple stmt.


 tem'' = PHI tem, tem'  //  or here?  on the args?
 tem2 = tem'' + c;

 so what source location would you use for the inserted expression?  If
 UNKNOWN is not
 correct here then I think we need an explicit NOWHERE?




 Can that break line coverage info too, say, when source location get
 stripped from reassociated expressions?

 David





 Fixing the location when the statement is inserted will reduce this
 randomness.  I'm not sure I understand why you think this will break
 coverage.  The examples given in this thread were helped by this
 location fixing.

 What do you call randomness exactly?  GDB jumpiness?  I'm a little wary of
 fixing GDB jumpiness by distorting the debug info.  Ian has clearly outlined
 the correct approach to addressing it.

 Note that I didn't specifically reply to Dehao's patch here, which might be
 acceptable in the end, only to David's seemingly general statement about the
 natural way of putting a location on these statements.

 Richard.

 --
 Eric Botcazou


[C++ Patch / RFC] PR 51242

2012-11-26 Thread Paolo Carlini

Hi,

this PR is about the rejection (in C++11 mode of course) of:

enum class MyEnum { A = 1 };

struct MyClass
{
  MyEnum Field1 : 3;
};


whereas the corresponding unscoped enum case already works. As noticed 
by Jon, it seems that the below straightforward patchlet is enough to 
solve the problem but then we have a warning issue:


51242.C:5:19: warning: ‘MyClass::Field1’ is too small to hold all values 
of ‘enum class MyEnum’ [enabled by default]


which is easily explained: the underlying type is fixed (to an implicit 
int type), thus in finish_enum_value_list, fixed_underlying_type_p is 
true and the code restricting the ENUM_UNDERLYING_TYPE for diagnostic 
purposes doesn't run. In other terms, we warn for the same reason we for 
example -Wnarrowing warn (before erroring out) for:


enum class Code {
  SUCCESS = 0
};

Code a;

short r[] = {a};


That seems intended, but especially annoying in the bitfield case. An 
idea could be giving a name to the too small to hold all values 
warning and making possible disabling it. What do you suggest?


Thanks!
Paolo.
Index: decl2.c
===
--- decl2.c (revision 193814)
+++ decl2.c (working copy)
@@ -1030,7 +1030,7 @@ grokbitfield (const cp_declarator *declarator,
   if (TREE_CODE (value) == VOID_TYPE)
 return void_type_node;
 
-  if (!INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (TREE_TYPE (value))
+  if (!INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (value))
(POINTER_TYPE_P (value)
   || !dependent_type_p (TREE_TYPE (value
 {


Re: Fix twolf -funroll-loops -O3 miscompilation (a semi-latent web.c bug)

2012-11-26 Thread Steven Bosscher
On Mon, Nov 26, 2012 at 3:38 PM, Dominique Dhumieres domi...@lps.ens.fr wrote:
 Or rather this one. Same hammer, different color. It turns out that
 the rtlanal.c change caused problems, so I've got to use a home-brewn
 equivalent of remove_reg_equal_equiv_notes_for_regno...

 This does not fix pr55006 on x86_64-apple-darwin10;-(while the patch in
 http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01607.html
 does).

Does it fail on x86_64-linux also? If not as-is, can you try with -fPIC?


 Test case is unchanged so I'm omitting it here.

 This test passes on x86_64-apple-darwin10 for a clean trunk.

Yes, I know. As you can probably tell from the patch, I've been
working on an older revision because I can't reproduce the problem on
the trunk. It's a very elusive problem.

Ciao!
Steven


Re: [patch] Rework RTL CFG graph dumping to dump DOT format

2012-11-26 Thread Richard Biener
On Fri, Nov 23, 2012 at 10:46 AM, Steven Bosscher stevenb@gmail.com wrote:
 Hello,

 The graph.[ch] code to dump the CFG for viewing with VCG is quite
 broken and hasn't been updated to follow the advances in the CFG
 infrastructure.

 The attached patch is the first in a series of patches I have planned
 to update the CFG-as-graph dumping code to dump GraphViz DOT input
 instead. The attached archive is an example input/output.

 There's still quite a lot of work to be done:
 - use pretty-print in the slim RTL dumping
 - move the escaped-string printing into pretty-print
 - make the basic block content dumping a cfghook
 - make these dumps work for GIMPLE also

 For the moment, this is a good starting point.

 Obviously there are no changes in code generation due to this patch,
 so IMHO this patch (and the rest of the series) should be safe for
 stage3. Hopefully everyone else agrees :-)

 Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk?

Hmm, you are removing an (appearant) abstraction that I didn't know was
there but remember to have asked for in the thread for dumping a late
callgraph for Ada purposes.  Is the abstraction not working or was it simply
a hassle to keep the VCG path working (I don't thjnk I have ever seen a
VCG consumer ... ;)).

Btw, I of course have my own CFG dumper (producing graphviz input)
in my local tree - attached for reference (I'm simply using it from
gdb sessions).

Richard.

 Ciao!
 Steven


p
Description: Binary data


Re: [PATCH] Cleanup last_location and update input_location in ipa_prop

2012-11-26 Thread Dehao Chen
On Mon, Nov 26, 2012 at 7:28 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Thu, Nov 8, 2012 at 6:39 PM, Dehao Chen de...@google.com wrote:
 ping...

 The emit-rtl.c hunk is ok.  I'm questioning the ipa-prop.c hunk though - what
 looks at input_location (nothing outside of the frontend should, really).

ipa_modify_call_arguments invokes force_gimple_operand_gsi, which uses
frontend routines to gimplify expr and uses input_location.

Thanks,
Dehao


 Thanks,
 Richard.


 On Mon, Nov 5, 2012 at 5:20 PM, Dehao Chen de...@google.com wrote:

 Hi,

 This is a patch to do some obvious cleanup and setting correct
 input_location in ipa_prop (because it invokes gimplification
 routines).

 Bootstrapped and passed gcc regression tests.

 Is it okay for trunk?

 Thanks,
 Dehao

 gcc/ChangeLog:
 2010-11-05  Dehao Chen  de...@google.com

 * ipa-prop.c (ipa_modify_call_arguments): Set input_location so that
 gimplification routines can have right location.
 * emit-rtl.c (last_location): Remove unused variable.


Re: [patch] Rework RTL CFG graph dumping to dump DOT format

2012-11-26 Thread Steven Bosscher
On Mon, Nov 26, 2012 at 4:46 PM, Richard Biener wrote:
 On Fri, Nov 23, 2012 at 10:46 AM, Steven Bosscher wrote:
 Hello,

 The graph.[ch] code to dump the CFG for viewing with VCG is quite
 broken and hasn't been updated to follow the advances in the CFG
 infrastructure.

 The attached patch is the first in a series of patches I have planned
 to update the CFG-as-graph dumping code to dump GraphViz DOT input
 instead. The attached archive is an example input/output.

 There's still quite a lot of work to be done:
 - use pretty-print in the slim RTL dumping
 - move the escaped-string printing into pretty-print
 - make the basic block content dumping a cfghook
 - make these dumps work for GIMPLE also

 For the moment, this is a good starting point.

 Obviously there are no changes in code generation due to this patch,
 so IMHO this patch (and the rest of the series) should be safe for
 stage3. Hopefully everyone else agrees :-)

 Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk?

 Hmm, you are removing an (appearant) abstraction that I didn't know was
 there but remember to have asked for in the thread for dumping a late
 callgraph for Ada purposes.  Is the abstraction not working or was it simply
 a hassle to keep the VCG path working (I don't thjnk I have ever seen a
 VCG consumer ... ;)).

I'm not sure what abstraction you mean. If you're talking about the
old graph.[ch], there's nothing abstract about it, it only dumps an
RTL CFG and, indeed, that wasn't working properly. The dumps were ugly
if they work at all. Each insn was dumped as a separate node and there
was no option to dump complete translation units to a single file (my
dumper does that just fine, it dumps each CFG as a disjoint subgraph).
(Some examples of what the dumps look like are attached to this
e-mail: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01977.html)

A generic graph dumping interface would be nice (graph_traits?) but
for the moment I'm just trying to make a debugging tool available.


 Btw, I of course have my own CFG dumper (producing graphviz input)
 in my local tree - attached for reference (I'm simply using it from
 gdb sessions).

I'll integrate this in my work for a follow-up patch. Well, if you'd
OK the one from this thread first, of course ;-)

Ciao!
Steven


Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2012-11-26 Thread Teresa Johnson
Ping.
Teresa

On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson tejohn...@google.com wrote:
 Revised patch that fixes failures encountered when enabling
 -freorder-blocks-and-partition, including the failure reported in PR 53743.

 This includes new verification code to ensure no cold blocks dominate hot
 blocks contributed by Steven Bosscher.

 I attempted to make the handling of partition updates through the optimization
 passes much more consistent, removing a number of partial fixes in the code
 stream in the process. The code to fixup partitions (including the 
 BB_PARTITION
 assignement, region crossing jump notes, and switch text section notes) is
 now handled in a few centralized locations. For example, inside
 rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that 
 callers
 don't need to attempt the fixup themselves.

 For optimization passes that make adjustments to the cfg while in cfg layout
 mode that are not easy to fix up incrementally, the new routine
 fixup_partitions handles the cleanup globally. This does require calculation
 of the dominance relation, however, as far as I can tell the routines which
 now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
 are invoked typically once (or a small number of times in the case of
 try_optimize_cfg) per optimization pass. Additionally, I compared the
 -ftime-report output for some large fdo compilations and saw only minimal
 increases in the dominance computation times, which were only a tiny percent
 of the overall compile time.

 Additionally, I added a flag to the rtl_data structure to indicate whether
 any partitioning was actually performed, so that optimizations which were
 conservatively disabled whenever the flag_reorder_blocks_and_partition
 is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
 conservative for functions where no partitions were formed (e.g. they are
 completely hot).

 Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with 
 SPEC2006 int
 benchmarks and internal google benchmarks using profile feedback and
 -freorder-blocks-and-partition to get more coverage. Ok for trunk?

 Thanks,
 Teresa

 2012-11-14  Teresa Johnson  tejohn...@google.com
 Steven Bosscher  ste...@gcc.gnu.org

 * cfghooks.h (cfg_layout_finalize): New parameter.
 * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize
 parameter.
 * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
 as this is now done by redirect_edge_and_branch_force.
 * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
 barriers, new cfg_layout_finalize parameter, and don't store exit
 predecessor BB until after it is potentially split.
 * function.h (struct rtl_data): New flag has_bb_partition.
 * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter.
 * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if
 any blocks in function actually partitioned.
 (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean
 up partitioning.
 * bb-reorder.c (connect_traces): Only look for partitions and skip
 block copying if any blocks in function actually partitioned.
 (emit_barrier_after_bb): Handle insertion in non-cfglayout mode.
 (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure
 that no cold blocks dominate a hot block.
 (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert
 as this is now done by force_nonfallthru_and_redirect.
 (add_reg_crossing_jump_notes): Handle the fact that some jumps may
 already be marked with region crossing note.
 (reorder_basic_blocks): Only need to verify partitions if any
 blocks in function actually partitioned.
 (insert_section_boundary_note): Only need to insert note if any
 blocks in function actually partitioned.
 (rest_of_handle_reorder_blocks): New cfg_layout_finalize
 parameter, and remove call to insert_section_boundary_note as this
 is now called via cfg_layout_finalize/fixup_reorder_chain.
 (duplicate_computed_gotos): New cfg_layout_finalize
 parameter.
 (partition_hot_cold_basic_blocks): Set flag indicating function
 has bb partitions.
 * bb-reorder.h: Declare insert_section_boundary_note and
 emit_barrier_after_bb, which are no longer static.
 * basic-block.h: Declare new function fixup_partitions.
 * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary
 check for region crossing note.
 (fixup_partition_crossing): New function.
 (fixup_bb_partition): Ditto.
 (rtl_redirect_edge_and_branch): Fixup partition boundaries.
 (force_nonfallthru_and_redirect): Fixup partition boundaries,
 remove old code that tried to 

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-11-26 Thread Kenneth Zadeck

On 11/26/2012 10:03 AM, Richard Biener wrote:

On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck zad...@naturalbridge.com wrote:

On 11/04/2012 11:54 AM, Richard Biener wrote:

On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Kenneth Zadeck zad...@naturalbridge.com writes:

I would like you to respond to at least point 1 of this email.   In it
there is code from the rtl level that was written twice, once for the
case when the size of the mode is less than the size of a HWI and once
for the case where the size of the mode is less that 2 HWIs.

my patch changes this to one instance of the code that works no matter
how large the data passed to it is.

you have made a specific requirement for wide int to be a template that
can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.   I
would like to know how this particular fragment is to be rewritten in
this model?   It seems that I would have to retain the structure where
there is one version of the code for each size that the template is
instantiated.

I think richi's argument was that wide_int should be split into two.
There should be a bare-metal class that just has a length and HWIs,
and the main wide_int class should be an extension on top of that
that does things to a bit precision instead.  Presumably with some
template magic so that the length (number of HWIs) is a constant for:

typedef foo2 double_int;

and a variable for wide_int (because in wide_int the length would be
the number of significant HWIs rather than the size of the underlying
array).  wide_int would also record the precision and apply it after
the full HWI operation.

So the wide_int class would still provide as wide as we need
arithmetic,
as in your rtl patch.  I don't think he was objecting to that.

That summarizes one part of my complaints / suggestions correctly.  In
other
mails I suggested to not make it a template but a constant over object
lifetime
'bitsize' (or maxlen) field.  Both suggestions likely require more thought
than
I put into them.  The main reason is that with C++ you can abstract from
where
wide-int information pieces are stored and thus use the arithmetic /
operation
workers without copying the (source) wide-int objects.  Thus you should
be able to write adaptors for double-int storage, tree or RTX storage.

We had considered something along these lines and rejected it.   I am not
really opposed to doing something like this, but it is not an obvious
winning idea and is likely not to be a good idea.   Here was our thought
process:

if you abstract away the storage inside a wide int, then you should be able
to copy a pointer to the block of data from either the rtl level integer
constant or the tree level one into the wide int.   It is certainly true
that making a wide_int from one of these is an extremely common operation
and doing this would avoid those copies.

However, this causes two problems:
1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to make
the object.   it created the base object and then it allocated the array.
Richard S noticed that we could just allocate one CONST_WIDE_INT that had
the array in it.   Doing it this way saves one ggc allocation and one
indirection when accessing the data within the CONST_WIDE_INT.   Our plan is
to use the same trick at the tree level.   So to avoid the copying, you seem
to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.

I did not propose having a pointer to the data in the RTX or tree int.  Just
the short-lived wide-ints (which are on the stack) would have a pointer to
the data - which can then obviously point into the RTX and tree data.
There is the issue then what if some wide-ints are not short lived. It 
makes me nervous to create internal pointers to gc ed memory.

2) You are now stuck either ggcing the storage inside a wide_int when they
are created as part of an expression or you have to play some game to
represent the two different storage plans inside of wide_int.

Hm?  wide-ints are short-lived and thus never live across a garbage collection
point.  We create non-GCed objects pointing to GCed objects all the time
and everywhere this way.
Again, this makes me nervous but it could be done.  However, it does 
mean that now the wide ints that are not created from rtxes or trees 
will be more expensive because they are not going to get their storage 
for free, they are going to alloca it.


however, it still is not clear, given that 99% of the wide ints are 
going to fit in a single hwi, that this would be a noticeable win.



   Clearly this
is where you think that we should be going by suggesting that we abstract
away the internal storage.   However, this comes at a price:   what is
currently an array access in my patches would (i believe) become a function
call.

No, the workers (that perform the array accesses) will simply get
a pointer to the first data element.  Then whether it's embedded or
external is of no interest to 

Re: [PATCH] Cleanup last_location and update input_location in ipa_prop

2012-11-26 Thread Richard Biener
On Mon, Nov 26, 2012 at 4:54 PM, Dehao Chen de...@google.com wrote:
 On Mon, Nov 26, 2012 at 7:28 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Nov 8, 2012 at 6:39 PM, Dehao Chen de...@google.com wrote:
 ping...

 The emit-rtl.c hunk is ok.  I'm questioning the ipa-prop.c hunk though - what
 looks at input_location (nothing outside of the frontend should, really).

 ipa_modify_call_arguments invokes force_gimple_operand_gsi, which uses
 frontend routines to gimplify expr and uses input_location.

Can you be more specific?  That's the place that needs fixing - there are a lot
more force_gimple_operand callers.

Richard.

 Thanks,
 Dehao


 Thanks,
 Richard.


 On Mon, Nov 5, 2012 at 5:20 PM, Dehao Chen de...@google.com wrote:

 Hi,

 This is a patch to do some obvious cleanup and setting correct
 input_location in ipa_prop (because it invokes gimplification
 routines).

 Bootstrapped and passed gcc regression tests.

 Is it okay for trunk?

 Thanks,
 Dehao

 gcc/ChangeLog:
 2010-11-05  Dehao Chen  de...@google.com

 * ipa-prop.c (ipa_modify_call_arguments): Set input_location so 
 that
 gimplification routines can have right location.
 * emit-rtl.c (last_location): Remove unused variable.


Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2012-11-26 Thread Christophe Lyon
Hi,

I have tested your patch on Spec2000 on ARM, and I can still see
several failures caused by:
error: fallthru edge crosses section boundary, including the case
described in PR55121.

On 26 November 2012 16:55, Teresa Johnson tejohn...@google.com wrote:
 Ping.
 Teresa

 On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson tejohn...@google.com wrote:
 Revised patch that fixes failures encountered when enabling
 -freorder-blocks-and-partition, including the failure reported in PR 53743.

 This includes new verification code to ensure no cold blocks dominate hot
 blocks contributed by Steven Bosscher.

 I attempted to make the handling of partition updates through the 
 optimization
 passes much more consistent, removing a number of partial fixes in the code
 stream in the process. The code to fixup partitions (including the 
 BB_PARTITION
 assignement, region crossing jump notes, and switch text section notes) is
 now handled in a few centralized locations. For example, inside
 rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that 
 callers
 don't need to attempt the fixup themselves.

 For optimization passes that make adjustments to the cfg while in cfg layout
 mode that are not easy to fix up incrementally, the new routine
 fixup_partitions handles the cleanup globally. This does require calculation
 of the dominance relation, however, as far as I can tell the routines which
 now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
 are invoked typically once (or a small number of times in the case of
 try_optimize_cfg) per optimization pass. Additionally, I compared the
 -ftime-report output for some large fdo compilations and saw only minimal
 increases in the dominance computation times, which were only a tiny percent
 of the overall compile time.

 Additionally, I added a flag to the rtl_data structure to indicate whether
 any partitioning was actually performed, so that optimizations which were
 conservatively disabled whenever the flag_reorder_blocks_and_partition
 is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less
 conservative for functions where no partitions were formed (e.g. they are
 completely hot).

 Bootstrapped and tested on x86_64-unknown-linux-gnu. Also tested with 
 SPEC2006 int
 benchmarks and internal google benchmarks using profile feedback and
 -freorder-blocks-and-partition to get more coverage. Ok for trunk?

 Thanks,
 Teresa

 2012-11-14  Teresa Johnson  tejohn...@google.com
 Steven Bosscher  ste...@gcc.gnu.org

 * cfghooks.h (cfg_layout_finalize): New parameter.
 * modulo-sched.c (rest_of_handle_sms): New cfg_layout_finalize
 parameter.
 * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
 as this is now done by redirect_edge_and_branch_force.
 * function.c (thread_prologue_and_epilogue_insns): Insert new bb 
 after
 barriers, new cfg_layout_finalize parameter, and don't store exit
 predecessor BB until after it is potentially split.
 * function.h (struct rtl_data): New flag has_bb_partition.
 * hw-doloop.c (reorder_loops): New cfg_layout_finalize parameter.
 * cfgcleanup.c (try_crossjump_to_edge): Only skip optimization if
 any blocks in function actually partitioned.
 (try_optimize_cfg): If cfg changed, invoke fixup_partitions to clean
 up partitioning.
 * bb-reorder.c (connect_traces): Only look for partitions and skip
 block copying if any blocks in function actually partitioned.
 (emit_barrier_after_bb): Handle insertion in non-cfglayout mode.
 (find_rarely_executed_basic_blocks_and_crossing_edges): Ensure
 that no cold blocks dominate a hot block.
 (fix_up_fall_thru_edges): Replace BB_COPY_PARTITION with assert
 as this is now done by force_nonfallthru_and_redirect.
 (add_reg_crossing_jump_notes): Handle the fact that some jumps may
 already be marked with region crossing note.
 (reorder_basic_blocks): Only need to verify partitions if any
 blocks in function actually partitioned.
 (insert_section_boundary_note): Only need to insert note if any
 blocks in function actually partitioned.
 (rest_of_handle_reorder_blocks): New cfg_layout_finalize
 parameter, and remove call to insert_section_boundary_note as this
 is now called via cfg_layout_finalize/fixup_reorder_chain.
 (duplicate_computed_gotos): New cfg_layout_finalize
 parameter.
 (partition_hot_cold_basic_blocks): Set flag indicating function
 has bb partitions.
 * bb-reorder.h: Declare insert_section_boundary_note and
 emit_barrier_after_bb, which are no longer static.
 * basic-block.h: Declare new function fixup_partitions.
 * cfgrtl.c (try_redirect_by_replacing_jump): Remove unnecessary
 check for region crossing note.
  

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-11-26 Thread Richard Biener
On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 On 11/26/2012 10:03 AM, Richard Biener wrote:

 On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck zad...@naturalbridge.com
 wrote:

 On 11/04/2012 11:54 AM, Richard Biener wrote:

 On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:

 Kenneth Zadeck zad...@naturalbridge.com writes:

 I would like you to respond to at least point 1 of this email.   In it
 there is code from the rtl level that was written twice, once for the
 case when the size of the mode is less than the size of a HWI and once
 for the case where the size of the mode is less that 2 HWIs.

 my patch changes this to one instance of the code that works no matter
 how large the data passed to it is.

 you have made a specific requirement for wide int to be a template
 that
 can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
 I
 would like to know how this particular fragment is to be rewritten in
 this model?   It seems that I would have to retain the structure where
 there is one version of the code for each size that the template is
 instantiated.

 I think richi's argument was that wide_int should be split into two.
 There should be a bare-metal class that just has a length and HWIs,
 and the main wide_int class should be an extension on top of that
 that does things to a bit precision instead.  Presumably with some
 template magic so that the length (number of HWIs) is a constant for:

 typedef foo2 double_int;

 and a variable for wide_int (because in wide_int the length would be
 the number of significant HWIs rather than the size of the underlying
 array).  wide_int would also record the precision and apply it after
 the full HWI operation.

 So the wide_int class would still provide as wide as we need
 arithmetic,
 as in your rtl patch.  I don't think he was objecting to that.

 That summarizes one part of my complaints / suggestions correctly.  In
 other
 mails I suggested to not make it a template but a constant over object
 lifetime
 'bitsize' (or maxlen) field.  Both suggestions likely require more
 thought
 than
 I put into them.  The main reason is that with C++ you can abstract from
 where
 wide-int information pieces are stored and thus use the arithmetic /
 operation
 workers without copying the (source) wide-int objects.  Thus you
 should
 be able to write adaptors for double-int storage, tree or RTX storage.

 We had considered something along these lines and rejected it.   I am not
 really opposed to doing something like this, but it is not an obvious
 winning idea and is likely not to be a good idea.   Here was our thought
 process:

 if you abstract away the storage inside a wide int, then you should be
 able
 to copy a pointer to the block of data from either the rtl level integer
 constant or the tree level one into the wide int.   It is certainly true
 that making a wide_int from one of these is an extremely common operation
 and doing this would avoid those copies.

 However, this causes two problems:
 1)  Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
 make
 the object.   it created the base object and then it allocated the array.
 Richard S noticed that we could just allocate one CONST_WIDE_INT that had
 the array in it.   Doing it this way saves one ggc allocation and one
 indirection when accessing the data within the CONST_WIDE_INT.   Our plan
 is
 to use the same trick at the tree level.   So to avoid the copying, you
 seem
 to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.

 I did not propose having a pointer to the data in the RTX or tree int.
 Just
 the short-lived wide-ints (which are on the stack) would have a pointer to
 the data - which can then obviously point into the RTX and tree data.

 There is the issue then what if some wide-ints are not short lived. It makes
 me nervous to create internal pointers to gc ed memory.

I thought they were all short-lived.

 2) You are now stuck either ggcing the storage inside a wide_int when
 they
 are created as part of an expression or you have to play some game to
 represent the two different storage plans inside of wide_int.

 Hm?  wide-ints are short-lived and thus never live across a garbage
 collection
 point.  We create non-GCed objects pointing to GCed objects all the time
 and everywhere this way.

 Again, this makes me nervous but it could be done.  However, it does mean
 that now the wide ints that are not created from rtxes or trees will be more
 expensive because they are not going to get their storage for free, they
 are going to alloca it.

No, those would simply use the embedded storage model.

 however, it still is not clear, given that 99% of the wide ints are going to
 fit in a single hwi, that this would be a noticeable win.

Currently even if they fit into a HWI you will still allocate 4 times the
larges integer mode size.  You say that doesn't matter because they
are 

[Patch, Fortran] PR5469

2012-11-26 Thread Tobias Burnus
l_push_char allocates memory which is freed with free_line. However, 
currently, the memory is not always freed when calling generate_error.


If one aborts, that's fine. However, generate_error can also set the 
iostat variable. Thus, one should ensure that the memory is always freed.


I wouldn't mind if someone, who knows libgfortran/io better than I, 
could confirm that the patch is okay. I don't want too free memory which 
is later used.


Build and regtested on x86-64-linux.
OK for the trunk?

Tobias


PS: Test case see PR. (I don't think it makes sense to include it in the 
test suite.)


PPS: I see the following failures when regtesting; the LTO and the 
realloc one I know, but I haven't see the reassoc_4.f before. It cannot 
be due to my patch and when I run it manually, it also works:


FAIL: gfortran.dg/lto/pr45586 f_lto_pr45586_0.o-f_lto_pr45586_0.o link, 
-O0 -flto  (internal compiler error)

FAIL: gfortran.dg/realloc_on_assign_5.f03  -O0  execution test
FAIL: gfortran.dg/reassoc_4.f  -O   scan-tree-dump-times reassoc1 [0-9] 
\\*  22
2012-11-26  Tobias Burnus  bur...@net-b.de

	PR fortran/5469
	* io/list_read (parse_repeat, read_integer, read_character,
	parse_real, read_real, check_type, list_formatted_read_scalar,
	finish_list_read): Call list_free.

diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index 403e719..49615b3 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -617,6 +617,7 @@ parse_repeat (st_parameter_dt *dtp)
   free_saved (dtp);
   if (c == EOF)
 {
+  free_line (dtp);
   hit_eof (dtp);
   return 1;
 }
@@ -905,11 +906,14 @@ read_integer (st_parameter_dt *dtp, int length)
   free_saved (dtp);  
   if (c == EOF)
 {
+  free_line (dtp);
   hit_eof (dtp);
   return;
 }
   else if (c != '\n')
 eat_line (dtp);
+
+  free_line (dtp);
   snprintf (message, MSGLEN, Bad integer for item %d in list input,
 	  dtp-u.p.item_count);
   generate_error (dtp-common, LIBERROR_READ_VALUE, message);
@@ -1078,7 +1082,6 @@ read_character (st_parameter_dt *dtp, int length __attribute__ ((unused)))
   unget_char (dtp, c);
   eat_separator (dtp);
   dtp-u.p.saved_type = BT_CHARACTER;
-  free_line (dtp);
 }
   else 
 {
@@ -1087,10 +1090,12 @@ read_character (st_parameter_dt *dtp, int length __attribute__ ((unused)))
 		  dtp-u.p.item_count);
   generate_error (dtp-common, LIBERROR_READ_VALUE, message);
 }
+  free_line (dtp);
   return;
 
  eof:
   free_saved (dtp);
+  free_line (dtp);
   hit_eof (dtp);
 }
 
@@ -1283,11 +1288,14 @@ parse_real (st_parameter_dt *dtp, void *buffer, int length)
   free_saved (dtp);
   if (c == EOF)
 {
+  free_line (dtp);
   hit_eof (dtp);
   return 1;
 }
   else if (c != '\n')
 eat_line (dtp);
+
+  free_line (dtp);
   snprintf (message, MSGLEN, Bad floating point number for item %d,
 	  dtp-u.p.item_count);
   generate_error (dtp-common, LIBERROR_READ_VALUE, message);
@@ -1387,11 +1395,14 @@ eol_4:
   free_saved (dtp);
   if (c == EOF)
 {
+  free_line (dtp);
   hit_eof (dtp);
   return;
 }
   else if (c != '\n')   
 eat_line (dtp);
+
+  free_line (dtp);
   snprintf (message, MSGLEN, Bad complex value in item %d of list input,
 	  dtp-u.p.item_count);
   generate_error (dtp-common, LIBERROR_READ_VALUE, message);
@@ -1762,12 +1773,14 @@ read_real (st_parameter_dt *dtp, void * dest, int length)
   free_saved (dtp);
   if (c == EOF)
 {
+  free_line (dtp);
   hit_eof (dtp);
   return;
 }
   else if (c != '\n')
 eat_line (dtp);
 
+  free_line (dtp);
   snprintf (message, MSGLEN, Bad real number in item %d of list input,
 	  dtp-u.p.item_count);
   generate_error (dtp-common, LIBERROR_READ_VALUE, message);
@@ -1784,6 +1797,7 @@ check_type (st_parameter_dt *dtp, bt type, int len)
 
   if (dtp-u.p.saved_type != BT_UNKNOWN  dtp-u.p.saved_type != type)
 {
+  free_line (dtp);
   snprintf (message, MSGLEN, Read type %s where %s was expected for item %d,
 		  type_name (dtp-u.p.saved_type), type_name (type),
 		  dtp-u.p.item_count);
@@ -1797,6 +1811,7 @@ check_type (st_parameter_dt *dtp, bt type, int len)
 
   if (dtp-u.p.saved_length != len)
 {
+  free_line (dtp);
   snprintf (message, MSGLEN,
 		  Read kind %d %s where kind %d is required for item %d,
 		  dtp-u.p.saved_length, type_name (dtp-u.p.saved_type), len,
@@ -1970,7 +1985,10 @@ list_formatted_read_scalar (st_parameter_dt *dtp, bt type, void *p,
 
 cleanup:
   if (err == LIBERROR_END)
-hit_eof (dtp);
+{
+  free_line (dtp);
+  hit_eof (dtp);
+}
   return err;
 }
 
@@ -2018,7 +2036,10 @@ finish_list_read (st_parameter_dt *dtp)
 
   err = eat_line (dtp);
   if (err == LIBERROR_END)
-hit_eof (dtp);
+{
+  free_line (dtp);
+  hit_eof (dtp);
+}
 }
 
 /*			NAMELIST INPUT


[PATCH][AARCH64] Refactor constant generation

2012-11-26 Thread Sofiane Naci
Hi,

Constant building in the AArch64 backend spits out assembly code, which
affects scheduling of the generated code.
This patch rewrites the code to use RTL patterns.

A full aarch64-none-elf regression run shows no issues.
 
Thanks
Sofiane

-

ChangeLog:

2012-11-26  Sofiane Naci  sofiane.n...@arm.com

* config/aarch64/aarch64.c (aarch64_build_constant): Update
prototype.
Call emit_move_insn instead of printing movi/movn/movz
instructions.
Call gen_insv_immdi instead of printing movk instruction.
(aarch64_add_constant): Update prototype.
Generate RTL instead of priting add/sub instructions.
(aarch64_output_mi_thunk): Update calls to aarch64_build_constant
and aarch64_add_constant.


aarch64-refactor-const-gen.patch
Description: Binary data


RFA: Enable both ld and gold in GCC 4.8

2012-11-26 Thread H.J. Lu
Hi,

There is a patch to enable both ld and gold in GCC:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02076.html

and Diego asked if there is anything in particular blocking
this patch in trunk:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02121.html

I'd like to see it in GCC 4.8.

Thanks.


-- 
H.J.


Re: [PATCH] Section anchors and thread-local storage

2012-11-26 Thread David Edelsohn
On Mon, Nov 26, 2012 at 3:26 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 David Edelsohn dje@gmail.com writes:
 I have been working to enable native thread-local storage on AIX. One
 problem I encountered is the AIX assembler has difficulty with the
 anchor symbol for TLS CSECTs.  While the section anchors machinery
 uses a separate pool for TLS entries, should section anchor blocks be
 used for TLS symbols at all?

 powerpc-linux uses GOT annotations directly and never places TLS
 symbols in the TOC at all. Section anchors seem to be avoided by TLS
 code already.  The appended patch rejects TLS symbols for object
 blocks in general. I could add a target hook, but I wanted to propose
 this policy in general before pursing a target-specific hook.

 Yeah, TLS anchors work on mips64-linux-gnu, although admittedly
 section anchors aren't yet enabled by default there, so perhaps not
 many people have tried it.

I know that it *CAN* work. Part of my question was whether
thread-local symbols *SHOULD* be placed in section anchor blocks,
whether or not they can be placed there on some systems.

 A use_blocks_for_decl_p hook sounds good, and fits nicely with
 targetm.use_blocks_for_constant_p.

Below is the implementation as a new target hook.

Thanks, David

* target.def (use_blocks_for_decl_p): New hook.
* varasm.c (use_blocks_for_decl_p): Apply hook as final condition.

Index: target.def
===
--- target.def  (revision 193798)
+++ target.def  (working copy)
@@ -1495,6 +1495,13 @@
  bool, (enum machine_mode mode, const_rtx x),
  hook_bool_mode_const_rtx_false)

+/* True if the given decl can be put into an object_block.  */
+DEFHOOK
+(use_blocks_for_decl_p,
+ ,
+ bool, (const_tree decl),
+ hook_bool_const_tree_true)
+
 /* The minimum and maximum byte offsets for anchored addresses.  */
 DEFHOOKPOD
 (min_anchor_offset,
Index: varasm.c
===
--- varasm.c(revision 193798)
+++ varasm.c(working copy)
@@ -1113,7 +1113,7 @@
   if (lookup_attribute (alias, DECL_ATTRIBUTES (decl)))
 return false;

-  return true;
+  return targetm.use_blocks_for_decl_p (decl);
 }

 /* Create the DECL_RTL for a VAR_DECL or FUNCTION_DECL.  DECL should


Re: [tsan] Instrument atomics

2012-11-26 Thread Xinliang David Li
On Mon, Nov 26, 2012 at 12:35 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Nov 26, 2012 at 12:17:44PM +0400, Dmitry Vyukov wrote:
 On Sat, Nov 24, 2012 at 12:58 PM, Dmitry Vyukov dvyu...@google.com wrote:
  Ok.  A slight problem then is that where the tsan pass sits right now, 
  there
  is no easy way to find out if the builtin call will be expanded inline or
  not, so (similar for asan), if we instrument them in the pass, it might 
  be
  instrumented twice at runtime if the builtin is expanded as a library 
  call
  (once the added instrumentation for the builtin, once in the intercepted
  library call).  That isn't wrong, just might need slightly more resources
  than if we ensured we only instrument the builtin if it isn't expanded
  inline.
 
 
  Should inlining of those functions be disabled as if -fno-builtins is 
  specified?
 
  Yes, it sounds reasonable. Performance characteristics under tsan
  differ significantly, so most likely we don't care.


 Do we still need range access functions then?

 Yes.  I think whether to instrument builtins inline or not should be a user
 decision

Why is that? Most users probably are not aware builtins can be inlined
or care about it.

, -fsanitize-thread and either nothinng or -fno-builtin.  Implying
 -fno-builtin might penalize some code unnecessarily (remember, builtin
 handling is not only about expanding some builtins inline, but also about
 folding them if they are used with constant arguments, performing all sorts
 of optimizations etc.,

Yes, in some cases, it can incur a little runtime overhead as

int i;
memset(i, 1, 4);

but I think compared with tsan's runtime overhead, this is almost negligible.


 all of which are ok, just we don't have a knob to
 disable inline expansion of some builtins (which, some are certainly ok,
 e.g. those that don't have a library counterpart).

There is an internal interface to call:

disable_builtin_function (...) -- tsan option process can call it
selectively for the stringop related builtins.

 Also note that many programs use __builtin_* calls directly and then you
 can't disable recognizing them as builtins.

Those calls won't be affected, I think. Compiler can also directly
generate builtin calls. However that means the range checking is still
needed.

David



 Jakub


Re: [PATCH] Cleanup last_location and update input_location in ipa_prop

2012-11-26 Thread Dehao Chen
On Mon, Nov 26, 2012 at 8:10 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Mon, Nov 26, 2012 at 4:54 PM, Dehao Chen de...@google.com wrote:
 On Mon, Nov 26, 2012 at 7:28 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Nov 8, 2012 at 6:39 PM, Dehao Chen de...@google.com wrote:
 ping...

 The emit-rtl.c hunk is ok.  I'm questioning the ipa-prop.c hunk though - 
 what
 looks at input_location (nothing outside of the frontend should, really).

 ipa_modify_call_arguments invokes force_gimple_operand_gsi, which uses
 frontend routines to gimplify expr and uses input_location.

 Can you be more specific?  That's the place that needs fixing - there are a 
 lot
 more force_gimple_operand callers.

You are right, the patch is updated as follows:

Index: gcc/ipa-prop.c
===
--- gcc/ipa-prop.c (revision 193203)
+++ gcc/ipa-prop.c (working copy)
@@ -2870,7 +2870,8 @@ ipa_modify_call_arguments (struct cgraph_edge *cs,

   gcc_checking_assert (adj-offset % BITS_PER_UNIT == 0);
   base = gimple_call_arg (stmt, adj-base_index);
-  loc = EXPR_LOCATION (base);
+  loc = DECL_P (base) ? DECL_SOURCE_LOCATION (base) :
+ EXPR_LOCATION (base);

   if (TREE_CODE (base) != ADDR_EXPR
POINTER_TYPE_P (TREE_TYPE (base)))

Thanks,
Dehao


 Richard.

 Thanks,
 Dehao


 Thanks,
 Richard.


 On Mon, Nov 5, 2012 at 5:20 PM, Dehao Chen de...@google.com wrote:

 Hi,

 This is a patch to do some obvious cleanup and setting correct
 input_location in ipa_prop (because it invokes gimplification
 routines).

 Bootstrapped and passed gcc regression tests.

 Is it okay for trunk?

 Thanks,
 Dehao

 gcc/ChangeLog:
 2010-11-05  Dehao Chen  de...@google.com

 * ipa-prop.c (ipa_modify_call_arguments): Set input_location so 
 that
 gimplification routines can have right location.
 * emit-rtl.c (last_location): Remove unused variable.


Re: PR 55438: Incorrect use of BIGGEST_ALIGNMENT

2012-11-26 Thread Eric Botcazou
 In http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00257.html I said:
 
   get_best_mode has various checks to decide what counts as an acceptable
   bitfield mode.  It actually has two copies of them, with slightly
 different alignment checks:
 
 MIN (unit, BIGGEST_ALIGNMENT)  align
 
   vs.
 
 unit = MIN (align, BIGGEST_ALIGNMENT)
 
   The second looks more correct, since we can't necessarily guarantee
   larger alignments than BIGGEST_ALIGNMENT in all cases.
 
 PR 55438 shows why I was wrong.  BIGGEST_ALIGNMENT is not (as I thought)
 the biggest supported alignment, but the:
 
   Biggest alignment that any data type can require on this machine, in
   bits.  Note that this is not the biggest alignment that is supported,
   just the biggest alignment that, when violated, may cause a fault.
 
 So it's perfectly possible for BIGGEST_ALIGNMENT to be 8 on 32-bit machines.

That means that the Sequent check was flawed, doesn't it?  It also seems that 
the entire business of alignment with size comparisons is dubious.

 As things stand, my change causes get_best_mode to reject anything larger
 than a byte for that combination, which causes infinite recursion between
 store_fixed_bit_field and store_split_bit_field.
 
 There are two problems.  First:
 
   if (!bitregion_end_)
 {
   /* We can assume that any aligned chunk of ALIGN_ bits that overlaps
the bitfield is mapped and won't trap.  */
   bitregion_end_ = bitpos + bitsize + align_ - 1;
   bitregion_end_ -= bitregion_end_ % align_ + 1;
 }
 
 is too conservative if the aligment is capped to BIGGEST_ALIGNMENT.
 We should be able to guarantee that word-aligned memory is mapped
 in at least word-sized chunks.  (If you think now is the time to
 add a MIN_PAGE_SIZE macro, perhaps with MAX (BITS_PER_WORD,
 BIGGEST_ALIGNMENT) as its default, I can do that instead.)

So this will fix the Sequent check with a conservative cap of BITS_PER_WORD.
That's reasonable, I think.

 Also, in cases like these, the supposedly conservative:
 
 GET_MODE_BITSIZE (mode) = align
 
 doesn't preserve the cap in the original:
 
 MIN (unit, BIGGEST_ALIGNMENT)  align
 
 Fixed by using GET_MODE_ALIGNMENT instead.

Note that the comment just above needs to be adjusted then.  What about the 
similar check in next_mode?

  /* Stop if the mode requires too much alignment.  */
  if (unit  align_  SLOW_UNALIGNED_ACCESS (mode_, align_))
break;

It seems to me that we should change it as well.

-- 
Eric Botcazou


Re: [patch] reorg.c janitor patch 2: handle DEBUG_INSN

2012-11-26 Thread Eric Botcazou
 If that is true, then why are there so many post var-tracking passes
 using NONDEBUG_INSN_P and/or looking for the DEBUG_INSN code? See e.g.
 shorten_branches, reorg.c, various machine reorgs, etc.

They are very likely overzealous.

 For example from reorg.c:redundant_insn() in the loop /* Scan
 backwards looking for a match.  */:
 
 151731  nemet   if (!NONDEBUG_INSN_P (trial))
 99 kenner   continue;
 151731  nemet   --insns_to_search;
 
 This was introduced to fix PR41349. Apparently DEBUG_INSN live on
 beyond var-tracking...?

I think that the fix for this PR was really about NOTEs and not DEBUG_INSNs.

I'm going to test the following patchlet on the SPARC:

Index: reorg.c
===
--- reorg.c (revision 193748)
+++ reorg.c (working copy)
@@ -1628,7 +1628,7 @@ redundant_insn (rtx insn, rtx target, rt
   if (LABEL_P (trial))
return 0;
 
-  if (!NONDEBUG_INSN_P (trial))
+  if (!INSN_P (trial))
continue;
   --insns_to_search;
 
@@ -1731,7 +1731,7 @@ redundant_insn (rtx insn, rtx target, rt
trial  !LABEL_P (trial)  insns_to_search  0;
trial = PREV_INSN (trial))
 {
-  if (!NONDEBUG_INSN_P (trial))
+  if (!INSN_P (trial))
continue;
   --insns_to_search;
 


-- 
Eric Botcazou


Re: [tsan] Instrument atomics

2012-11-26 Thread Dmitry Vyukov
On Mon, Nov 26, 2012 at 9:07 PM, Xinliang David Li davi...@google.com wrote:
  Ok.  A slight problem then is that where the tsan pass sits right now, 
  there
  is no easy way to find out if the builtin call will be expanded inline 
  or
  not, so (similar for asan), if we instrument them in the pass, it might 
  be
  instrumented twice at runtime if the builtin is expanded as a library 
  call
  (once the added instrumentation for the builtin, once in the intercepted
  library call).  That isn't wrong, just might need slightly more 
  resources
  than if we ensured we only instrument the builtin if it isn't expanded
  inline.
 
 
  Should inlining of those functions be disabled as if -fno-builtins is 
  specified?
 
  Yes, it sounds reasonable. Performance characteristics under tsan
  differ significantly, so most likely we don't care.


 Do we still need range access functions then?

 Yes.  I think whether to instrument builtins inline or not should be a user
 decision

 Why is that? Most users probably are not aware builtins can be inlined
 or care about it.


I agree with David. Most likely it has negligible overhead under tsan.
Even if gcc provides ability to choose between builtins and library
functions, it does not have to provide it under tsan (which is a very
special mode).
On the other hand, I don't know what complexity is involved on gcc
side. As for runtime I can easily implement it, it's just a matter of
writing 2 thin wrappers for existing functionality.



, -fsanitize-thread and either nothinng or -fno-builtin.  Implying
 -fno-builtin might penalize some code unnecessarily (remember, builtin
 handling is not only about expanding some builtins inline, but also about
 folding them if they are used with constant arguments, performing all sorts
 of optimizations etc.,

 Yes, in some cases, it can incur a little runtime overhead as

 int i;
 memset(i, 1, 4);

 but I think compared with tsan's runtime overhead, this is almost negligible.


 all of which are ok, just we don't have a knob to
 disable inline expansion of some builtins (which, some are certainly ok,
 e.g. those that don't have a library counterpart).

 There is an internal interface to call:

 disable_builtin_function (...) -- tsan option process can call it
 selectively for the stringop related builtins.

 Also note that many programs use __builtin_* calls directly and then you
 can't disable recognizing them as builtins.

 Those calls won't be affected, I think. Compiler can also directly
 generate builtin calls. However that means the range checking is still
 needed.

 David



 Jakub


patch to fix a test in PR55277

2012-11-26 Thread Vladimir Makarov

  The following patch fixes the second test case in

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55277

  As for the first case, I don' think it will be fixed soon as  I have 
more urgent PRs.


  The patch was successfully tested and bootstrapped on x86/x86-64.

  Committed as rev. 193824.

2012-11-26  Vladimir Makarov  vmaka...@redhat.com

PR target/55277
* lra-constraints.c (in_class_p): Check reg class contents too.

2012-11-26  Vladimir Makarov  vmaka...@redhat.com

PR target/55277
* gcc.target/i386/pr55227.c: New test.

Index: lra-constraints.c
===
--- lra-constraints.c   (revision 193755)
+++ lra-constraints.c   (working copy)
@@ -293,7 +293,9 @@ in_class_p (rtx reg, enum reg_class cl,
  if (nregs == 1)
return true;
  for (j = 0; j  nregs; j++)
-   if (TEST_HARD_REG_BIT (lra_no_alloc_regs, hard_regno + j))
+   if (TEST_HARD_REG_BIT (lra_no_alloc_regs, hard_regno + j)
+   || ! TEST_HARD_REG_BIT (reg_class_contents[common_class],
+   hard_regno + j))
  break;
  if (j = nregs)
return true;
Index: testsuite/gcc.target/i386/pr55277.c
===
--- testsuite/gcc.target/i386/pr55277.c (revision 0)
+++ testsuite/gcc.target/i386/pr55277.c (working copy)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options -O1 } */
+
+int a, c;
+
+void f(long long p)
+{
+  long long b;
+
+  if(b)
+b = p ? : 0;
+
+  for (; p; p++)
+p *= a  (c = p *= !a  2);
+
+  a = b += !(b  3740917449u);
+}


[PING^2] PR 54805: __gthread_tsd* in vxlib-tls.c

2012-11-26 Thread rbmj

On 11/13/2012 10:22 PM, rbmj wrote:

On 11/5/2012 12:57 PM, rbmj wrote:

This removes warnings about implicit declarations and fixes one of the
function calls in vxlib-tls.c for vxworks targets.

I got the old prototypes from
http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01314.html

See bug for further details.

Someone please comment or commit :)



Ping: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00406.html

Robert Mason




Ping^2


Re: PR 55438: Incorrect use of BIGGEST_ALIGNMENT

2012-11-26 Thread Richard Sandiford
Eric Botcazou ebotca...@adacore.com writes:
 In http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00257.html I said:
 
   get_best_mode has various checks to decide what counts as an acceptable
   bitfield mode.  It actually has two copies of them, with slightly
 different alignment checks:
 
 MIN (unit, BIGGEST_ALIGNMENT)  align
 
   vs.
 
 unit = MIN (align, BIGGEST_ALIGNMENT)
 
   The second looks more correct, since we can't necessarily guarantee
   larger alignments than BIGGEST_ALIGNMENT in all cases.
 
 PR 55438 shows why I was wrong.  BIGGEST_ALIGNMENT is not (as I thought)
 the biggest supported alignment, but the:
 
   Biggest alignment that any data type can require on this machine, in
   bits.  Note that this is not the biggest alignment that is supported,
   just the biggest alignment that, when violated, may cause a fault.
 
 So it's perfectly possible for BIGGEST_ALIGNMENT to be 8 on 32-bit machines.

 That means that the Sequent check was flawed, doesn't it?  It also seems that 
 the entire business of alignment with size comparisons is dubious.

Yeah, I suppose the Sequent test didn't really have any effect on
BIGGEST_ALIGNMENT == BITS_PER_UNIT targets.

 Also, in cases like these, the supposedly conservative:
 
GET_MODE_BITSIZE (mode) = align
 
 doesn't preserve the cap in the original:
 
 MIN (unit, BIGGEST_ALIGNMENT)  align
 
 Fixed by using GET_MODE_ALIGNMENT instead.

 Note that the comment just above needs to be adjusted then.

OK, how about:

 /* ??? For historical reasons, reject modes that would normally
receive greater alignment, even if unaligned accesses are
acceptable.  This has both advantages and disadvantages.

 What about the similar check in next_mode?

   /* Stop if the mode requires too much alignment.  */
   if (unit  align_  SLOW_UNALIGNED_ACCESS (mode_, align_))
   break;

 It seems to me that we should change it as well.

I think here we really do want unit (i.e. the GET_MODE_BITSIZE).
We're dividing the bitfield into unit-sized chunks and want to know
whether those chunks are aligned or not.  If they aren't aligned,
we need to know whether unaligned accesses are OK.

Richard


Re: [patch] reorg.c janitor patch 2: handle DEBUG_INSN

2012-11-26 Thread Richard Sandiford
Eric Botcazou ebotca...@adacore.com writes:
 If that is true, then why are there so many post var-tracking passes
 using NONDEBUG_INSN_P and/or looking for the DEBUG_INSN code? See e.g.
 shorten_branches, reorg.c, various machine reorgs, etc.

 They are very likely overzealous.

 For example from reorg.c:redundant_insn() in the loop /* Scan
 backwards looking for a match.  */:
 
 151731  nemet   if (!NONDEBUG_INSN_P (trial))
 99 kenner   continue;
 151731  nemet   --insns_to_search;
 
 This was introduced to fix PR41349. Apparently DEBUG_INSN live on
 beyond var-tracking...?

 I think that the fix for this PR was really about NOTEs and not DEBUG_INSNs.

 I'm going to test the following patchlet on the SPARC:

Steve hit a case where running vartracking before dbr_schedule caused a
bootstrap failure on MIPS.  On targets without delay slots, that kind of
bug has been fixed by:

/* Variable tracking should be run after all optimizations which
   change order of insns.  It also needs a valid CFG.  */
#undef TARGET_DELAY_VARTRACK
#define TARGET_DELAY_VARTRACK true

But obviously we can't do that with dbr_schedule, since it changes
the order of insns but runs with the CFG pulled down.

In the long term it would be good to replace dbr_schedule altogether,
but in the medium term I thought we might want to update it so that
it can run before vartracking.

Richard


[RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-26 Thread Ulrich Weigand
Hello,

I noticed what appears to be a long-standing bug in generating .dwarf_frame
sections with GCC on Linux on PowerPC.

It had been my understanding that .dwarf_frame is supposed to differ from
.eh_frame on PowerPC w.r.t. register numbers: .eh_frame should use GCC
internal numbers, while .dwarf_frame should use the DWARF register numbers
documented in the PowerPC ELF ABI.  However, in actual fact, .dwarf_frame
does not use the DWARF numbers; and it does not use the GCC numbers either,
but a weird mixture: it uses GCC numbers for everything except for the
condition code register, for which it uses the DWARF number (64).

Doing a bit of code archaeology, it seems this behaviour goes back to a
patch checked in by Geoff Keating just before GCC 4.2.  (And indeed in
GCC 4.1 we actually have DWARF numbers in .dwarf_frame.):

use correct DWARF register numbers on rs6000 non-ELF ports
http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00766.html

This patch moves the definition of the two macros DBX_REGISTER_NUMBER
and DWARF2_FRAME_REG_OUT from rs6000/sysv4.h to rs6000/rs6000.h:

  #define DBX_REGISTER_NUMBER(REGNO) rs6000_dbx_register_number (REGNO)

  /* Map register numbers held in the call frame info that gcc has
 collected using DWARF_FRAME_REGNUM to those that should be output in
 .debug_frame and .eh_frame.  We continue to use gcc hard reg numbers
 for .eh_frame, but use the numbers mandated by the various ABIs for
 .debug_frame.  rs6000_emit_prologue has translated any combination of
 CR2, CR3, CR4 saves to a save of CR2.  The actual code emitted saves
 the whole of CR, so we map CR2_REGNO to the DWARF reg for CR.  */
  #define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH)   \
((FOR_EH) ? (REGNO) \
 : (REGNO) == CR2_REGNO ? 64\
 : DBX_REGISTER_NUMBER (REGNO))

The intention apparently was to make Darwin use the same numbers as
Linux.  However, the actual effect was to make Darwin use the same
numbers Linux used to use before, while at the same time breaking
the numbers used by Linux.

The problem is that target macro headers were included in the
following sequence:
   rs6000/rs6000.h
   svr4.h
   rs6000/sysv4.h
and the common svr4.h header contained a line:
  #undef DBX_REGISTER_NUMBER

When DBX_REGISTER_NUMBER was defined in rs6000/sysv4.h, this didn't
matter.  But now that the definition was moved to rs6000/rs6000.h,
it was in effect undone by the #undef in svr4.h, transforming the
definition of DWARF2_FRAME_REG_OUT to be:
  #define DWARF2_FRAME_REG_OUT(REGNO, FOR_EH)   \
((FOR_EH) ? (REGNO) \
 : (REGNO) == CR2_REGNO ? 64\   
 : (REGNO))
which explains the observed behaviour of GCC since 4.2.


Since then, a couple of changes were done to the sources, but all
preserved this behaviour.  Most notably, Joseph Myers eliminated
the common sysv4.h header, but moved the #undef to rs6000/sysv4.h
instead:

svr4.h avoidance: rs6000/powerpc
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01264.html


Also, with current GCC and binutils, .dwarf_frame is usually not
generated by GCC directly, but via .cfi_... directives interpreted
by gas.  This was implemented by a patch by Jakub Jelinek:

[PATCH] Add support for .cfi_sections - way to emit also .debug_frame from 
.cfi_* directives
http://sourceware.org/ml/binutils/2009-10/msg00028.html

which notes:

 On ppc this is uglified by the need to translate register numbers - as
 .eh_frame uses different numbering of registers than .debug_frame.

In the patch as posted to the mailing list, this translation is here:

  /* ppc uses different register numbers between .eh_frame and .debug_frame.
 This macro translates the .eh_frame register numbers to .debug_frame
 register numbers.  */
  #define md_reg_eh_frame_to_debug_frame(regno) \
((regno) == 70 ? 64 /* cr2 */   \
 : (regno) == 64 ? 100 /* mq */ \
 : (regno) == 65 ? 108 /* lr */ \
 : (regno) == 66 ? 109 /* ctr */\
 : (regno) = 68  (regno) = 75 ? (regno) + 86 - 68 /* crN */ \
 : (regno) == 76 ? 101 /* xer */\
 : (regno) = 77  (regno) = 108 ? (regno) + 1124 - 77 /* vrN */  \
 : (regno) == 109 ? 356 /* vrsave */\
 : (regno) == 110 ? 67 /* vscr */   \
 : (regno) == 111 ? 99 /* spe_acc */\
 : (regno) == 112 ? 612 /* spefscr */   \
 : (regno))

But in the patch as actually checked in (tc-ppc.h rev. 1.41), we have:

  /* ppc uses different register numbers between .eh_frame and .debug_frame.
 This macro translates the .eh_frame register numbers to .debug_frame
 register numbers.  */
  #define md_reg_eh_frame_to_debug_frame(regno) \
((regno) == 

Re: [patch] reorg.c janitor patch 2: handle DEBUG_INSN

2012-11-26 Thread Jakub Jelinek
On Mon, Nov 26, 2012 at 07:08:41PM +, Richard Sandiford wrote:
 In the long term it would be good to replace dbr_schedule altogether,
 but in the medium term I thought we might want to update it so that
 it can run before vartracking.

var-tracking definitely isn't prepared to handle dbr_schedule created
sequences, I guess it would be much more work than just fixing whatever is
needed on the dbr side that doesn't work right now (if anything).

Jakub


Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-26 Thread David Edelsohn
On Mon, Nov 26, 2012 at 2:10 PM, Ulrich Weigand uweig...@de.ibm.com wrote:

 So I'm wondering where to go from here.  I guess we could:

 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
by removing the #undef DBX_REGISTER_NUMBER and changing gas's
md_reg_eh_frame_to_debug_frame to the original implementation from
Jakub's patch.  That would make GDB work well on new files, but
there are a large number of binaries out there where we continue
to have the same behaviour as today ...

 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
.dwarf_frame, except for the condition code register.  This would
break debugging of files built with GCC 4.0 and 4.1 unless we
want to add a special hack for that.

 3. Like 2., but remove the condition code hack: simply use identical
numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
like other Linux platforms in that respect.

 Thoughts?

I vote for (3).

Thanks, David


Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-26 Thread Mark Kettenis
 Date: Mon, 26 Nov 2012 20:10:06 +0100 (CET)
 From: Ulrich Weigand uweig...@de.ibm.com
 
 Hello,
 
 I noticed what appears to be a long-standing bug in generating .dwarf_frame
 sections with GCC on Linux on PowerPC.
 
 ...
 
 So I'm wondering where to go from here.  I guess we could:
 
 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
by removing the #undef DBX_REGISTER_NUMBER and changing gas's
md_reg_eh_frame_to_debug_frame to the original implementation from
Jakub's patch.  That would make GDB work well on new files, but
there are a large number of binaries out there where we continue
to have the same behaviour as today ...
 
 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
.dwarf_frame, except for the condition code register.  This would
break debugging of files built with GCC 4.0 and 4.1 unless we
want to add a special hack for that.
 
 3. Like 2., but remove the condition code hack: simply use identical
numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
like other Linux platforms in that respect.
 
 Thoughts?

What do other compilers (in particular XLC) do?  From a GDB standpoint
it would be a major PITA if different compilers would use different
encodings for .dwarf_frame.


Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2012-11-26 Thread Teresa Johnson
Are you sure you have all my changes applied? I applied the 4 patches
attached to PR55121 into my trunk checkout that has my fixes, and to a
pristine trunk checkout. I configured and built both for
--target=arm-none-linux-gnueabi, and built using your options, .i file
and gcda file. I can reproduce the failure using the pristine trunk
with your patches but not with my fixed trunk + your patches. (I just
updated to head to pickup recent changes and get the same result. The
vec changes required some manual changes to the patch, which I will
resend shortly.)

Without my fixes:

$ ~/extra/gcc_trunk_3_arm-eabi/gcc/cc1 -fpreproce
ssed eval.i -quiet -dumpbase eval.c -march=armv7-a -mtune=cortex-a9
-mthumb -mfpu=neon -mvectorize-with-neon-quad -mfloat-abi=softfp
-mtls-dialect=gnu -auxbase-strip eval.o -g -O3 -version -fprofile-use
-fno-common -o eval.s -freorder-blocks-and-partition
GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
2.4.2-p1, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
2.4.2-p1, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: d19cc60a2f07de08237a8488bb35cd1a
eval.c: In function ‘Ge’:
eval.c:792:1: internal compiler error: in df_compact_blocks, at df-core.c:1560
 }
 ^
0x622f71 df_compact_blocks()
../../gcc_trunk_3/gcc/df-core.c:1560
0x5cfcb5 compact_blocks()
../../gcc_trunk_3/gcc/cfg.c:162
0xc9dce0 reorder_basic_blocks
../../gcc_trunk_3/gcc/bb-reorder.c:2154
0xc9dce0 rest_of_handle_reorder_blocks
../../gcc_trunk_3/gcc/bb-reorder.c:2219
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.


With my fixes:

$ ~/extra/gcc_trunk_4_arm-eabi/gcc/cc1 -fpreproce
ssed eval.i -quiet -dumpbase eval.c -march=armv7-a -mtune=cortex-a9
-mthumb -mfpu=neon -mvectorize-with-neon-quad -mfloat-abi=softfp
-mtls-dialect=gnu -auxbase-strip eval.o -g -O3 -version -fprofile-use
-fno-common -o eval.s -freorder-blocks-and-partition
GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
2.4.2-p1, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
2.4.2-p1, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 45b468efa7c981f9afb44c4dac2424f3


Thanks,
Teresa

On Mon, Nov 26, 2012 at 8:25 AM, Christophe Lyon
christophe.l...@linaro.org wrote:
 Hi,

 I have tested your patch on Spec2000 on ARM, and I can still see
 several failures caused by:
 error: fallthru edge crosses section boundary, including the case
 described in PR55121.

 On 26 November 2012 16:55, Teresa Johnson tejohn...@google.com wrote:
 Ping.
 Teresa

 On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 Revised patch that fixes failures encountered when enabling
 -freorder-blocks-and-partition, including the failure reported in PR 53743.

 This includes new verification code to ensure no cold blocks dominate hot
 blocks contributed by Steven Bosscher.

 I attempted to make the handling of partition updates through the 
 optimization
 passes much more consistent, removing a number of partial fixes in the code
 stream in the process. The code to fixup partitions (including the 
 BB_PARTITION
 assignement, region crossing jump notes, and switch text section notes) is
 now handled in a few centralized locations. For example, inside
 rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that 
 callers
 don't need to attempt the fixup themselves.

 For optimization passes that make adjustments to the cfg while in cfg layout
 mode that are not easy to fix up incrementally, the new routine
 fixup_partitions handles the cleanup globally. This does require calculation
 of the dominance relation, however, as far as I can tell the routines which
 now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
 are invoked typically once (or a small number of times in the case of
 try_optimize_cfg) per optimization pass. Additionally, I compared the
 -ftime-report output for some large fdo compilations and saw only minimal
 increases in the dominance computation times, which were only a tiny percent
 of the overall compile time.

 Additionally, I added a flag to the rtl_data structure to indicate whether
 any partitioning was actually performed, so that optimizations which were
 conservatively disabled whenever

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)

2012-11-26 Thread Jack Howarth
On Mon, Nov 26, 2012 at 12:19:55PM -0800, Teresa Johnson wrote:
 Are you sure you have all my changes applied? I applied the 4 patches
 attached to PR55121 into my trunk checkout that has my fixes, and to a
 pristine trunk checkout. I configured and built both for
 --target=arm-none-linux-gnueabi, and built using your options, .i file
 and gcda file. I can reproduce the failure using the pristine trunk
 with your patches but not with my fixed trunk + your patches. (I just
 updated to head to pickup recent changes and get the same result. The
 vec changes required some manual changes to the patch, which I will
 resend shortly.)

Teresa,
Your mailer seems to have corrupted the posted patch with stray
=3D characters and line breaks. Can you repost a copy as an attachment
to the list?
 Jack

 
 Without my fixes:
 
 $ ~/extra/gcc_trunk_3_arm-eabi/gcc/cc1 -fpreproce
 ssed eval.i -quiet -dumpbase eval.c -march=armv7-a -mtune=cortex-a9
 -mthumb -mfpu=neon -mvectorize-with-neon-quad -mfloat-abi=softfp
 -mtls-dialect=gnu -auxbase-strip eval.o -g -O3 -version -fprofile-use
 -fno-common -o eval.s -freorder-blocks-and-partition
 GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
 compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
 2.4.2-p1, MPC version 0.8.1
 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
 compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
 2.4.2-p1, MPC version 0.8.1
 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 Compiler executable checksum: d19cc60a2f07de08237a8488bb35cd1a
 eval.c: In function ‘Ge’:
 eval.c:792:1: internal compiler error: in df_compact_blocks, at df-core.c:1560
  }
  ^
 0x622f71 df_compact_blocks()
 ../../gcc_trunk_3/gcc/df-core.c:1560
 0x5cfcb5 compact_blocks()
 ../../gcc_trunk_3/gcc/cfg.c:162
 0xc9dce0 reorder_basic_blocks
 ../../gcc_trunk_3/gcc/bb-reorder.c:2154
 0xc9dce0 rest_of_handle_reorder_blocks
 ../../gcc_trunk_3/gcc/bb-reorder.c:2219
 Please submit a full bug report,
 with preprocessed source if appropriate.
 Please include the complete backtrace with any bug report.
 See http://gcc.gnu.org/bugs.html for instructions.
 
 
 With my fixes:
 
 $ ~/extra/gcc_trunk_4_arm-eabi/gcc/cc1 -fpreproce
 ssed eval.i -quiet -dumpbase eval.c -march=armv7-a -mtune=cortex-a9
 -mthumb -mfpu=neon -mvectorize-with-neon-quad -mfloat-abi=softfp
 -mtls-dialect=gnu -auxbase-strip eval.o -g -O3 -version -fprofile-use
 -fno-common -o eval.s -freorder-blocks-and-partition
 GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
 compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
 2.4.2-p1, MPC version 0.8.1
 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 GNU C (GCC) version 4.8.0 20121126 (experimental) (arm-none-linux-gnueabi)
 compiled by GNU C version 4.4.3, GMP version 4.3.2, MPFR version
 2.4.2-p1, MPC version 0.8.1
 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 Compiler executable checksum: 45b468efa7c981f9afb44c4dac2424f3
 
 
 Thanks,
 Teresa
 
 On Mon, Nov 26, 2012 at 8:25 AM, Christophe Lyon
 christophe.l...@linaro.org wrote:
  Hi,
 
  I have tested your patch on Spec2000 on ARM, and I can still see
  several failures caused by:
  error: fallthru edge crosses section boundary, including the case
  described in PR55121.
 
  On 26 November 2012 16:55, Teresa Johnson tejohn...@google.com wrote:
  Ping.
  Teresa
 
  On Thu, Nov 15, 2012 at 12:10 PM, Teresa Johnson tejohn...@google.com 
  wrote:
  Revised patch that fixes failures encountered when enabling
  -freorder-blocks-and-partition, including the failure reported in PR 
  53743.
 
  This includes new verification code to ensure no cold blocks dominate hot
  blocks contributed by Steven Bosscher.
 
  I attempted to make the handling of partition updates through the 
  optimization
  passes much more consistent, removing a number of partial fixes in the 
  code
  stream in the process. The code to fixup partitions (including the 
  BB_PARTITION
  assignement, region crossing jump notes, and switch text section notes) is
  now handled in a few centralized locations. For example, inside
  rtl_redirect_edge_and_branch and force_nonfallthru_and_redirect, so that 
  callers
  don't need to attempt the fixup themselves.
 
  For optimization passes that make adjustments to the cfg while in cfg 
  layout
  mode that are not easy to fix up incrementally, the new routine
  fixup_partitions handles the cleanup globally. This does require 
  calculation
  of the dominance relation, however, as far as I can tell the routines 
  which
  now invoke this global fixup (try_optimize_cfg and commit_edge_insertions)
  are invoked typically once (or a small number of times in the case of
  try_optimize_cfg) per optimization pass. Additionally, I compared the
  -ftime-report output

Re: PR 55438: Incorrect use of BIGGEST_ALIGNMENT

2012-11-26 Thread Eric Botcazou
 OK, how about:
 
/* ??? For historical reasons, reject modes that would normally
   receive greater alignment, even if unaligned accesses are
   acceptable.  This has both advantages and disadvantages.

Fine with me.

 I think here we really do want unit (i.e. the GET_MODE_BITSIZE).
 We're dividing the bitfield into unit-sized chunks and want to know
 whether those chunks are aligned or not.  If they aren't aligned,
 we need to know whether unaligned accesses are OK.

We could conceivably have an aligned mode with bitsize 32 and alignment 16 
(i.e. STRICT_ALIGNMENT 1 and BIGGEST_ALIGNMENT 16), meaning that we can use 
16-bit aligned 32-bit chunks in this mode.  Why would the bitsize matter 
(assuming that it's a multiple of the alignment)?

-- 
Eric Botcazou


PATCH: lto/55474: global-buffer-overflow in lto-wrapper.c

2012-11-26 Thread H.J. Lu
Hi,

OPT_SPECIAL_unknown, OPT_SPECIAL_ignore, OPT_SPECIAL_program_name and
OPT_SPECIAL_input_file are special options, which aren't in cl_options.
This patch avoids

if (!(cl_options[foption-opt_index].flags  CL_TARGET))

on them.  This patch skips them.  OK to install?

Thanks.


H.J.
---
2012-11-26  H.J. Lu  hongjiu...@intel.com

PR lto/55474
* lto-wrapper.c (merge_and_complain): Handle
OPT_SPECIAL_unknown, OPT_SPECIAL_ignore,
OPT_SPECIAL_program_name and OPT_SPECIAL_input_file.

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 1f4d212..24de743 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -393,6 +393,12 @@ merge_and_complain (struct cl_decoded_option 
**decoded_options,
   struct cl_decoded_option *foption = fdecoded_options[i];
   switch (foption-opt_index)
{
+   case OPT_SPECIAL_unknown:
+   case OPT_SPECIAL_ignore:
+   case OPT_SPECIAL_program_name:
+   case OPT_SPECIAL_input_file:
+ break;
+
default:
  if (!(cl_options[foption-opt_index].flags  CL_TARGET))
break;


[Patch, Fortran] PR55475 - fix invalid reads with show_locus

2012-11-26 Thread Tobias Burnus
As found with -fsanitize=address by HJ, but it also shows up with 
valgrind. The fix for the PR is the change in scanner.c; I think the 
patch is rather obvious.


The change in error.c is due to:

  if (c1 == c2)
c2 += 1;

which could lead to an out-of-bounds condition is c1 is already at the 
last character - then one exceeds the bound for c2.


Build and tested on x86-64-linux with no new failures.*
OK for the trunk?

Tobias

* I get: FAIL for gfortran.dg/lto/pr45586, 
gfortran.dg/realloc_on_assign_5.f03 and gfortran.dg/reassoc_4.f and 
XPASS for gfortran.dg/do_1.f90.
2012-11-26  Tobias Burnus  bur...@net-b.de

	PR fortran/55475
	* scanner.c (gfc_next_char_literal): Fix setting locus
	to free_line_length for the error message.
	* error.c (show_locus): Fix potential out-of-bounds
	read.

diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 4b06156..611540c 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -387,7 +387,7 @@ show_locus (locus *loc, int c1, int c2)
   cmax -= offset;
 
   p = (lb-line[offset]);
-  for (i = 0; i = cmax; i++)
+  for (i = 0; i  cmax; i++)
 {
   int spaces, j;
   spaces = gfc_widechar_display_length (*p++);
@@ -401,6 +401,11 @@ show_locus (locus *loc, int c1, int c2)
 	error_char (' ');
 }
 
+  if (i == c1)
+error_char ('1');
+  else if (i == c2)
+error_char ('2');
+
   error_char ('\n');
 
 }
diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index e0556a9..765c0f9 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -1068,10 +1068,12 @@ restart:
 	   gfc_current_locus.lb-truncated)
 	{
 	  int maxlen = gfc_option.free_line_length;
+	  gfc_char_t *current_nextc = gfc_current_locus.nextc;
+
 	  gfc_current_locus.lb-truncated = 0;
-	  gfc_current_locus.nextc += maxlen;
+	  gfc_current_locus.nextc =  gfc_current_locus.lb-line + maxlen;
 	  gfc_warning_now (Line truncated at %L, gfc_current_locus);
-	  gfc_current_locus.nextc -= maxlen;
+	  gfc_current_locus.nextc = current_nextc;
 	}
 
   if (c != '')


Re: r193821 - in /trunk/gcc: ChangeLog common.opt d...

2012-11-26 Thread Hans-Peter Nilsson
I don't see this posted on gcc-patches@, but...

On Mon, 26 Nov 2012, ste...@gcc.gnu.org wrote:
 testsuite/
   * testsuite/gcc.dg/20050811-1.c: Change -dv option to -graph option
   to -fdump-rtl-all.
   * testsuite/gcc.dg/pr37858.c: Remove -dv option.

...looks like you missed:

Executing on host: /tmp/hpautotest-gcc0/cris-elf/gccobj/gcc/xgcc
-B/tmp/hpautotest-gcc0/cris-elf/gccobj/gcc/
/tmp/hpautotest-gcc0/gcc/gcc/testsuite/gcc.dg/20050811-2.c
-fno-diagnostics-show-caret   -O2 -dv -fdump-rtl-postreload -S
-isystem
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/./newlib/targ-include
-isystem /tmp/hpautotest-gcc0/gcc/newlib/libc/include  -o
20050811-2.s(timeout = 300)
cc1: warning: unrecognized gcc debugging option: v [enabled by
default]
output is:
cc1: warning: unrecognized gcc debugging option: v [enabled by
default]

FAIL: gcc.dg/20050811-2.c (test for excess errors)

brgds, H-P


[PATCH] Fix the bug of comparing goto_locus to UNKNOWN_LOCATION

2012-11-26 Thread Dehao Chen
Hi,

This patch fixes a bug of comparing goto_locus to UNKNOWN_LOCATION
when combining basic blocks.

Bootstrapped and and passed gcc regression tests.

Is it okay for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2012-11-26  Dehao Chen  de...@google.com

* cfgrtl.c (rtl_merge_blocks): Check with UNKNOWN_LOCATION correctly.
(cfg_layout_merge_blocks): Likewise.

Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c (revision 193828)
+++ gcc/cfgrtl.c (working copy)
@@ -890,7 +890,8 @@ rtl_merge_blocks (basic_block a, basic_block b)
   df_bb_delete (b-index);

   /* If B was a forwarder block, propagate the locus on the edge.  */
-  if (forwarder_p  !EDGE_SUCC (b, 0)-goto_locus)
+  if (forwarder_p
+   LOCATION_LOCUS (EDGE_SUCC (b, 0)-goto_locus) == UNKNOWN_LOCATION)
 EDGE_SUCC (b, 0)-goto_locus = EDGE_SUCC (a, 0)-goto_locus;

   if (dump_file)
@@ -4149,7 +4150,7 @@ cfg_layout_merge_blocks (basic_block a, basic_bloc

   /* If B was a forwarder block, propagate the locus on the edge.  */
   if (forwarder_p
-   LOCATION_LOCUS (EDGE_SUCC (b, 0)-goto_locus) != UNKNOWN_LOCATION)
+   LOCATION_LOCUS (EDGE_SUCC (b, 0)-goto_locus) == UNKNOWN_LOCATION)
 EDGE_SUCC (b, 0)-goto_locus = EDGE_SUCC (a, 0)-goto_locus;

   if (dump_file)


Re: [PATCH] Cleanup last_location and update input_location in ipa_prop

2012-11-26 Thread Dehao Chen
The new patch is attached. Bootstrapped and passed gcc regression test.

Ok for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2010-11-05  Dehao Chen  de...@google.com

* ipa-prop.c (ipa_modify_call_arguments): Set loc correctly.
* emit-rtl.c (last_location): Remove unused variable.

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c (revision 193203)
+++ gcc/emit-rtl.c (working copy)
@@ -5937,7 +5937,7 @@ location_t epilogue_location;
 /* Hold current location information and last location information, so the
datastructures are built lazily only when some instructions in given
place are needed.  */
-static location_t curr_location, last_location;
+static location_t curr_location;

 /* Allocate insn location datastructure.  */
 void
@@ -5945,7 +5945,6 @@ insn_locations_init (void)
 {
   prologue_location = epilogue_location = 0;
   curr_location = UNKNOWN_LOCATION;
-  last_location = UNKNOWN_LOCATION;
 }

 /* At the end of emit stage, clear current location.  */
Index: gcc/ipa-prop.c
===
--- gcc/ipa-prop.c (revision 193203)
+++ gcc/ipa-prop.c (working copy)
@@ -2870,7 +2870,8 @@ ipa_modify_call_arguments (struct cgraph_edge *cs,

   gcc_checking_assert (adj-offset % BITS_PER_UNIT == 0);
   base = gimple_call_arg (stmt, adj-base_index);
-  loc = EXPR_LOCATION (base);
+  loc = DECL_P (base) ? DECL_SOURCE_LOCATION (base)
+  : EXPR_LOCATION (base);

   if (TREE_CODE (base) != ADDR_EXPR
POINTER_TYPE_P (TREE_TYPE (base)))


Re: r193821 - in /trunk/gcc: ChangeLog common.opt d...

2012-11-26 Thread Steven Bosscher
On Mon, Nov 26, 2012 at 11:38 PM, Hans-Peter Nilsson wrote:
 I don't see this posted on gcc-patches@, but...

 On Mon, 26 Nov 2012,  wrote:
 testsuite/
   * testsuite/gcc.dg/20050811-1.c: Change -dv option to -graph option
   to -fdump-rtl-all.
   * testsuite/gcc.dg/pr37858.c: Remove -dv option.

 ...looks like you missed:

 Executing on host: /tmp/hpautotest-gcc0/cris-elf/gccobj/gcc/xgcc
 -B/tmp/hpautotest-gcc0/cris-elf/gccobj/gcc/
 /tmp/hpautotest-gcc0/gcc/gcc/testsuite/gcc.dg/20050811-2.c
 -fno-diagnostics-show-caret   -O2 -dv -fdump-rtl-postreload -S
 -isystem
 /tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/./newlib/targ-include
 -isystem /tmp/hpautotest-gcc0/gcc/newlib/libc/include  -o
 20050811-2.s(timeout = 300)
 cc1: warning: unrecognized gcc debugging option: v [enabled by
 default]
 output is:
 cc1: warning: unrecognized gcc debugging option: v [enabled by
 default]

 FAIL: gcc.dg/20050811-2.c (test for excess errors)

 brgds, H-P

Yup, sorry. Fixed now.

Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 193829)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2012-11-26  Steven Bosscher  
+
+   * gcc.dg/20050811-2.c: Change -dv option to -graph option.
+
 2012-11-26  Georg-Johann Lay  

* gcc.dg/54455.c: Require scheduling.
Index: testsuite/gcc.dg/20050811-2.c
===
--- testsuite/gcc.dg/20050811-2.c   (revision 193829)
+++ testsuite/gcc.dg/20050811-2.c   (working copy)
@@ -1,6 +1,6 @@
-/* Test whether -dov doesn't crash.  */
+/* Test whether graph dumping doesn't crash.  */
 /* { dg-do compile } */
-/* { dg-options -O2 -dv -fdump-rtl-postreload } */
+/* { dg-options -O2 -fdump-rtl-postreload-graph } */

 int foo (void)
 {


[patch] RFA: slim RTL printing cleanups

2012-11-26 Thread Steven Bosscher
Hello,

This patch performs some necessary TLC on slim RTL printing in sched-vis.c:

* Make it independent of the scheduler. Actually it already was,
mostly. This patch completes the job.

* Harmonize dumping templates for INSN_UID.

* Always print the pattern of a CALL_INSN.

* Don't print jump for sched dumps only.

* Harmonize function names:
  - print_* print to a char * buffer
  - dump_* print to a given FILE *
  - debug_* print to stderr.

Some of the ideas come from Richard S.'s
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00284.html.

The next patch addresses a bigger problem: The print_* variants should
use something better than a fixed-size buffer. I plan to use
pretty-print. An obstack would also work; but I'd like to keep one
layer of abstraction between the file output and the printers, so that
users can post-process the string if necessary (the case I'm thinking
of, is the graph dumpers, which need to escape some characters before
writing to a file). Any thoughts on this from anyone?

Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk?
(Yes, I know it's stage3, but look at it this way: the patch does no
harm, and good dumps help for debugging real bugs ;-)

Ciao!
Steven


print-rtl-slim_1.diff
Description: Binary data


[patch] fix libstdc++/55463 calling mem_fn with rvalues

2012-11-26 Thread Jonathan Wakely
PR libstdc++/55463
* include/std/functional (_Mem_fn): Handle rvalue objects. Add
noexcept-specifications.
* testsuite/20_util/function_objects/mem_fn/55463.cc: New.
* testsuite/20_util/bind/ref_neg.cc: Adjust dg-error line numbers.

Tested x86_64-linux, committed to trunk.
commit 4ddc9752bc93b233efcff9273b0fe4e488a783b8
Author: Jonathan Wakely jwakely@gmail.com
Date:   Sun Nov 25 21:56:23 2012 +

PR libstdc++/55463
* include/std/functional (_Mem_fn): Handle rvalue objects. Add
noexcept-specifications.
* testsuite/20_util/function_objects/mem_fn/55463.cc: New.
* testsuite/20_util/bind/ref_neg.cc: Adjust dg-error line numbers.

diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index 4d6e6a8..1a98127 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -65,7 +65,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 class _Mem_fn;
   templatetypename _Tp, typename _Class
 _Mem_fn_Tp _Class::*
-mem_fn(_Tp _Class::*);
+mem_fn(_Tp _Class::*) noexcept;
 
 _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
@@ -528,13 +528,16 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __object, const volatile _Class *,
+   _M_call(_Tp __object, const volatile _Class *,
_ArgTypes... __args) const
-   { return (__object.*__pmf)(std::forward_ArgTypes(__args)...); }
+   {
+ return (std::forward_Tp(__object).*__pmf)
+   (std::forward_ArgTypes(__args)...);
+   }
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
+   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
{ return ((*__ptr).*__pmf)(std::forward_ArgTypes(__args)...); }
 
 public:
@@ -555,12 +558,17 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
   // Handle smart pointers, references and pointers to derived
   templatetypename _Tp
_Res
-   operator()(_Tp __object, _ArgTypes... __args) const
+   operator()(_Tp __object, _ArgTypes... __args) const
{
- return _M_call(__object, __object,
+ return _M_call(std::forward_Tp(__object), __object,
  std::forward_ArgTypes(__args)...);
}
 
+  templatetypename _Tp
+   _Res
+   operator()(reference_wrapper_Tp __ref, _ArgTypes... __args) const
+   { return operator()(__ref.get(), std::forward_ArgTypes(__args)...); }
+
 private:
   _Functor __pmf;
 };
@@ -575,13 +583,16 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __object, const volatile _Class *,
+   _M_call(_Tp __object, const volatile _Class *,
_ArgTypes... __args) const
-   { return (__object.*__pmf)(std::forward_ArgTypes(__args)...); }
+   {
+ return (std::forward_Tp(__object).*__pmf)
+   (std::forward_ArgTypes(__args)...);
+   }
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
+   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
{ return ((*__ptr).*__pmf)(std::forward_ArgTypes(__args)...); }
 
 public:
@@ -601,12 +612,17 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
   // Handle smart pointers, references and pointers to derived
   templatetypename _Tp
-   _Res operator()(_Tp __object, _ArgTypes... __args) const
+   _Res operator()(_Tp __object, _ArgTypes... __args) const
{
- return _M_call(__object, __object,
+ return _M_call(std::forward_Tp(__object), __object,
  std::forward_ArgTypes(__args)...);
}
 
+  templatetypename _Tp
+   _Res
+   operator()(reference_wrapper_Tp __ref, _ArgTypes... __args) const
+   { return operator()(__ref.get(), std::forward_ArgTypes(__args)...); }
+
 private:
   _Functor __pmf;
 };
@@ -621,13 +637,16 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __object, const volatile _Class *,
+   _M_call(_Tp __object, const volatile _Class *,
_ArgTypes... __args) const
-   { return (__object.*__pmf)(std::forward_ArgTypes(__args)...); }
+   {
+ return (std::forward_Tp(__object).*__pmf)
+   (std::forward_ArgTypes(__args)...);
+   }
 
   templatetypename _Tp
_Res
-   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
+   _M_call(_Tp __ptr, const volatile void *, _ArgTypes... __args) const
{ return ((*__ptr).*__pmf)(std::forward_ArgTypes(__args)...); }
 
 public:
@@ -648,12 +667,17 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
   // Handle smart pointers, references and pointers to derived
   templatetypename _Tp
_Res
-   operator()(_Tp __object, _ArgTypes... 

[patch] tweak std::bind constraint

2012-11-26 Thread Jonathan Wakely
* include/std/functional (__is_socketlike): Change from class
template to alias template.

Tested x86_64-linux, committed to trunk.
commit 634c004b52c86148e1719e20744504816374dfdb
Author: Jonathan Wakely jwakely@gmail.com
Date:   Sun Nov 25 22:01:28 2012 +

* include/std/functional (__is_socketlike): Change from class
template to alias template.

diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index 1a98127..0d8fbd6 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1503,14 +1503,8 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
   // Trait type used to remove std::bind() from overload set via SFINAE
   // when first argument has integer type, so that std::bind() will
   // not be a better match than ::bind() from the BSD Sockets API.
-  templatetypename _Tp
-class __is_socketlike
-{
-  typedef typename decay_Tp::type _Tp2;
-public:
-  static const bool value =
-   is_integral_Tp2::value || is_enum_Tp2::value;
-};
+  templatetypename _Tp, typename _Tp2 = typename decay_Tp::type
+using __is_socketlike = __or_is_integral_Tp2, is_enum_Tp2;
 
   templatebool _SocketLike, typename _Func, typename... _BoundArgs
 struct _Bind_helper




Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-11-26 Thread Kenneth Zadeck

Richard,

I spent a good part of the afternoon talking to Mike about this.  He is 
on the c++ standards committee and is a much more seasoned c++ 
programmer than I am.


He convinced me that with a large amount of engineering and c++ 
foolishness that it was indeed possible to get your proposal to 
POSSIBLY work as well as what we did.


But now the question is why would any want to do this?

At the very least you are talking about instantiating two instances of 
wide-ints, one for the stack allocated uses and one for the places where 
we just move a pointer from the tree or the rtx. Then you are talking 
about creating connectors so that the stack allocated functions can take 
parameters of pointer version and visa versa.


Then there is the issue that rather than just saying that something is a 
wide int, that the programmer is going to have to track it's origin.   
In particular,  where in the code right now i say.


wide_int foo = wide_int::from_rtx (r1);
wide_int bar = wide_int::from_rtx (r2) + foo;

now i would have to say

wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;

then when i want to call some function using a wide_int ref that 
function now must be either overloaded to take both or i have to choose 
one of the two instantiations (presumably based on which is going to be 
more common) and just have the compiler fix up everything (which it is 
likely to do).


And so what is the payoff:
1) No one except the c++ elite is going to understand the code. The rest 
of the community will hate me and curse the ground that i walk on.
2) I will end up with a version of wide-int that can be used as a medium 
life container (where i define medium life as not allowed to survive a 
gc since they will contain pointers into rtxes and trees.)
3) An no clients that actually wanted to do this!!I could use as an 
example one of your favorite passes, tree-vrp.   The current double-int 
could have been a medium lifetime container since it has a smaller 
footprint, but in fact tree-vrp converts those double-ints back into 
trees for medium storage.   Why, because it needs the other fields of a 
tree-cst to store the entire state.  Wide-ints also suffer this 
problem.  their only state are the data, and the three length fields.   
They have no type and none of the other tree info so the most obvious 
client for a medium lifetime object is really not going to be a good 
match even if you solve the storage problem.


The fact is that wide-ints are an excellent short term storage class 
that can be very quickly converted into our two long term storage 
classes.  Your proposal is requires a lot of work, will not be easy to 
use and as far as i can see has no payoff on the horizon.   It could be 
that there could be future clients for a medium lifetime value, but 
asking for this with no clients in hand is really beyond the scope of a 
reasonable review.


I remind you that the purpose of these patches is to solve problems that 
exist in the current compiler that we have papered over for years.   If 
someone needs wide-ints in some way that is not foreseen then they can 
change it.


kenny

On 11/26/2012 11:30 AM, Richard Biener wrote:

On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 11/26/2012 10:03 AM, Richard Biener wrote:

On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck zad...@naturalbridge.com
wrote:

On 11/04/2012 11:54 AM, Richard Biener wrote:

On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Kenneth Zadeck zad...@naturalbridge.com writes:

I would like you to respond to at least point 1 of this email.   In it
there is code from the rtl level that was written twice, once for the
case when the size of the mode is less than the size of a HWI and once
for the case where the size of the mode is less that 2 HWIs.

my patch changes this to one instance of the code that works no matter
how large the data passed to it is.

you have made a specific requirement for wide int to be a template
that
can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
I
would like to know how this particular fragment is to be rewritten in
this model?   It seems that I would have to retain the structure where
there is one version of the code for each size that the template is
instantiated.

I think richi's argument was that wide_int should be split into two.
There should be a bare-metal class that just has a length and HWIs,
and the main wide_int class should be an extension on top of that
that does things to a bit precision instead.  Presumably with some
template magic so that the length (number of HWIs) is a constant for:

 typedef foo2 double_int;

and a variable for wide_int (because in wide_int the length would be
the number of significant HWIs rather than the size of the underlying
array).  wide_int would also record the precision and apply it after
the full HWI operation.

So the wide_int 

[PATCH, libgcc, ARM] __gnu_f2h_internal inaccuracy

2012-11-26 Thread John Tytgat
__gnu_f2h_internal in libgcc converts single-precision floating-point value
to half-precision value for ARM.  I noticed there is a slight inaccuracy
for floating-point values 2^-25 + epsilon (with epsilon  0 and  2^-26).
Those should all be converted to 2^-24 in half-precision.

And this because the mask used to implement the even-odd rounding for
aexp = -25 is wrong.  Currently we have:

  /* Decimal point between bits 22 and 23.  */
  mantissa |= 0x0080;
  if (aexp  -14)
{
  mask = 0x007f;
  if (aexp  -25)
aexp = -26;
  else if (aexp != -25)
mask = 24 + aexp;
}
  else
mask = 0x1fff;

But when aexp is 25 the mask should be 0xff instead of 0x7f as the
decimal dot in half-precision will be between bit 24 and 23 of the
above mentioned mantissa.  Cfr. the even-odd rounding done:

  /* Round.  */
  if (mantissa  mask)
{
  increment = (mask + 1)  1;
  if ((mantissa  mask) == increment)
increment = mantissa  (increment  1);
  mantissa += increment;
  if (mantissa = 0x0100)
{
  mantissa = 1;
  aexp++;
}
}

Attached patch solves this problem.  I've left out the clamping of
aexp to -26 for values less than -25 as this it not necessary.  After
the even-odd rounding all aexp values less than -24 will result in +0. or
-0. anyway.

John Tytgat  j...@bass-software.com

* config/arm/fp16.c (__gnu_f2h_internal): Fix inaccuracy.

I've got a copyright assignment but no write access.

John Tytgat.
-- 
John Tytgat BASS
j...@bass-software.com


Re: [PATCH, libgcc, ARM] __gnu_f2h_internal inaccuracy

2012-11-26 Thread John Tytgat
In message ab11eef452...@hobbes.bass-software.com
  John Tytgat j...@bass-software.com wrote:

 [...] Attached patch solves this problem. [...]

This time for real.

John Tytgat.
-- 
John Tytgat, in his comfy chair at home
john.tyt...@aaug.net
Index: libgcc/config/arm/fp16.c
===
--- libgcc/config/arm/fp16.c(revision 193830)
+++ libgcc/config/arm/fp16.c(working copy)
@@ -47,11 +47,9 @@
   mantissa |= 0x0080;
   if (aexp  -14)
 {
-  mask = 0x007f;
-  if (aexp  -25)
-   aexp = -26;
-  else if (aexp != -25)
-   mask = 24 + aexp;
+  mask = 0x00ff;
+  if (aexp = -25)
+mask = 25 + aexp;
 }
   else
 mask = 0x1fff;


RE: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

2012-11-26 Thread Joey Ye
Ping^2

 -Original Message-
 From: Joey Ye
 Sent: Tuesday, November 20, 2012 10:09
 To: gcc-patches@gcc.gnu.org
 Cc: Joey Ye
 Subject: RE: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c
 
 Ping, as Joseph Prostko is saying that this patch shall solve the same
 problem he's facing.
 
  -Original Message-
  From: Joey Ye [mailto:joey...@arm.com]
  Sent: Friday, September 21, 2012 15:42
  To: gcc-patches@gcc.gnu.org
  Subject: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c
 
  Current crtstuff.c checks if JCR_SECTION_NAME is defined to decide
  whether
  do work for JCR. However, defaults.h always defines JCR_SECTION_NAME:
 
  #ifndef JCR_SECTION_NAME
  #define JCR_SECTION_NAME .jcr
  #endif
 
  So it is impossible to disable JCR related code in crtbegin.o, which
  can
  save some bytes for every applications that doesn't need java.
 
  This patch revise the check of JCR_SECTION_NAME to
  TARGET_USE_JCR_SECTION.
  By defining latter to zero disable JCR in crtstuff. This change
 doesn't
  impact logic of any target given following defines in defaults.h:
 
  #ifndef TARGET_USE_JCR_SECTION
  #ifdef JCR_SECTION_NAME
  #define TARGET_USE_JCR_SECTION 1
  #else
  #define TARGET_USE_JCR_SECTION 0
  #endif
  #endif
 
  Again, this patch doesn't impact libgcc on any target, unless
  TARGET_USE_JCR_SECTION is explicitly defined to 0 with make
  CFLAGS_FOR_TARGET=-DTARGET_USE_JCR_SECTION=0. AIX defines
  TARGET_USE_JCR_SECTION to 0, but it has no crtbegin/end stuff. So also
  no impact.
 
  OK to trunk?
 
  2012-09-21  Joey Ye  joey...@arm.com
 
  * crtstuff.c: Check TARGET_USE_JCR_SECTION.
 
  Index: libgcc/crtstuff.c
  ===
  --- libgcc/crtstuff.c   (revision 190556)
  +++ libgcc/crtstuff.c   (working copy)
  @@ -256,13 +256,13 @@
= { };
   #endif /* USE_EH_FRAME_REGISTRY */
 
  -#ifdef JCR_SECTION_NAME
  +#if TARGET_USE_JCR_SECTION  defined (JCR_SECTION_NAME)
   /* Stick a label at the beginning of the java class registration info
  so we can register them properly.  */
   STATIC void *__JCR_LIST__[]
 __attribute__ ((used, section(JCR_SECTION_NAME),
  aligned(sizeof(void*
 = { };
  -#endif /* JCR_SECTION_NAME */
  +#endif /* TARGET_USE_JCR_SECTION  JCR_SECTION_NAME */
 
   #if USE_TM_CLONE_REGISTRY
   STATIC func_ptr __TMC_LIST__[]
  @@ -438,7 +438,7 @@
   #endif
 
   #if defined(USE_EH_FRAME_REGISTRY) \
  -|| defined(JCR_SECTION_NAME) \
  +|| defined(TARGET_USE_JCR_SECTION) \
   || defined(USE_TM_CLONE_REGISTRY)
   /* Stick a call to __register_frame_info into the .init section.  For
  some
  reason calls with no arguments work more reliably in .init, so
  stick the
  @@ -461,7 +461,7 @@
   #endif /* CRT_GET_RFIB_DATA */
   #endif /* USE_EH_FRAME_REGISTRY */
 
  -#ifdef JCR_SECTION_NAME
  +#if TARGET_USE_JCR_SECTION
 if (__JCR_LIST__[0])
   {
 void (*register_classes) (void *) = _Jv_RegisterClasses;
  @@ -469,7 +469,7 @@
 if (register_classes)
  register_classes (__JCR_LIST__);
   }
  -#endif /* JCR_SECTION_NAME */
  +#endif /* TARGET_USE_JCR_SECTION */
 
   #if USE_TM_CLONE_REGISTRY
 register_tm_clones ();
  @@ -483,7 +483,7 @@
 __attribute__ ((__used__, section(.init_array),
  aligned(sizeof(func_ptr
 = { frame_dummy };
   #endif /* !defined(INIT_SECTION_ASM_OP) */
  -#endif /* USE_EH_FRAME_REGISTRY || JCR_SECTION_NAME ||
  USE_TM_CLONE_REGISTRY */
  +#endif /* USE_EH_FRAME_REGISTRY || TARGET_USE_JCR_SECTION ||
  USE_TM_CLONE_REGISTRY */
 
   #else  /* OBJECT_FORMAT_ELF */
 
  @@ -551,7 +551,7 @@
   }
 
   #if defined(USE_EH_FRAME_REGISTRY) \
  -|| defined(JCR_SECTION_NAME) \
  +|| defined(TARGET_USE_JCR_SECTION) \
   || defined(USE_TM_CLONE_REGISTRY)
   /* A helper function for __do_global_ctors, which is in crtend.o.
  Here
  in crtbegin.o, we can reference a couple of symbols not visible
  there.
  @@ -566,7 +566,7 @@
   __register_frame_info (__EH_FRAME_BEGIN__, object);
   #endif
 
  -#ifdef JCR_SECTION_NAME
  +#if TARGET_USE_JCR_SECTION
 if (__JCR_LIST__[0])
   {
 void (*register_classes) (void *) = _Jv_RegisterClasses;
  @@ -580,7 +580,7 @@
 register_tm_clones ();
   #endif /* USE_TM_CLONE_REGISTRY */
   }
  -#endif /* USE_EH_FRAME_REGISTRY || JCR_SECTION_NAME ||
  USE_TM_CLONE_REGISTRY */
  +#endif /* USE_EH_FRAME_REGISTRY || TARGET_USE_JCR_SECTION ||
  USE_TM_CLONE_REGISTRY */
 
   #else /* ! INIT_SECTION_ASM_OP  ! HAS_INIT_SECTION */
   #error What are you doing with crtstuff.c, then?
  @@ -656,13 +656,13 @@
= { 0 };
   #endif /* EH_FRAME_SECTION_NAME */
 
  -#ifdef JCR_SECTION_NAME
  +#if TARGET_USE_JCR_SECTION  defined (JCR_SECTION_NAME)
   /* Null terminate the .jcr section array.  */
   STATIC void *__JCR_END__[1]
  __attribute__ ((used, section(JCR_SECTION_NAME),
 aligned(sizeof(void *
  = { 0 };
  -#endif /* JCR_SECTION_NAME 

Re: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

2012-11-26 Thread H.J. Lu
On Mon, Nov 26, 2012 at 7:19 PM, Joey Ye joey...@arm.com wrote:
 Ping^2

 -Original Message-
 From: Joey Ye
 Sent: Tuesday, November 20, 2012 10:09
 To: gcc-patches@gcc.gnu.org
 Cc: Joey Ye
 Subject: RE: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

 Ping, as Joseph Prostko is saying that this patch shall solve the same
 problem he's facing.

  -Original Message-
  From: Joey Ye [mailto:joey...@arm.com]
  Sent: Friday, September 21, 2012 15:42
  To: gcc-patches@gcc.gnu.org
  Subject: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c
 
  Current crtstuff.c checks if JCR_SECTION_NAME is defined to decide
  whether
  do work for JCR. However, defaults.h always defines JCR_SECTION_NAME:
 
  #ifndef JCR_SECTION_NAME
  #define JCR_SECTION_NAME .jcr
  #endif
 
  So it is impossible to disable JCR related code in crtbegin.o, which
  can
  save some bytes for every applications that doesn't need java.
 
  This patch revise the check of JCR_SECTION_NAME to
  TARGET_USE_JCR_SECTION.
  By defining latter to zero disable JCR in crtstuff. This change
 doesn't
  impact logic of any target given following defines in defaults.h:
 
  #ifndef TARGET_USE_JCR_SECTION
  #ifdef JCR_SECTION_NAME
  #define TARGET_USE_JCR_SECTION 1
  #else
  #define TARGET_USE_JCR_SECTION 0
  #endif
  #endif
 
  Again, this patch doesn't impact libgcc on any target, unless
  TARGET_USE_JCR_SECTION is explicitly defined to 0 with make
  CFLAGS_FOR_TARGET=-DTARGET_USE_JCR_SECTION=0. AIX defines
  TARGET_USE_JCR_SECTION to 0, but it has no crtbegin/end stuff. So also
  no impact.
 
  OK to trunk?
 
  2012-09-21  Joey Ye  joey...@arm.com
 
  * crtstuff.c: Check TARGET_USE_JCR_SECTION.
 

Since we have --enable-languages=.., can't we disable
JCR_SECTION section if java isn't enabled?

-- 
H.J.


Re: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

2012-11-26 Thread Joseph Prostko
On Mon, Nov 26, 2012 at 10:19 PM, Joey Ye joey...@arm.com wrote:
 Ping^2

 -Original Message-
 From: Joey Ye
 Sent: Tuesday, November 20, 2012 10:09
 To: gcc-patches@gcc.gnu.org
 Cc: Joey Ye
 Subject: RE: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

 Ping, as Joseph Prostko is saying that this patch shall solve the same
 problem he's facing.

Indeed.  I am working on updating Haiku's GCC port to GCC 4.7.2, and
in the past we always ended up editing defaults.h to not define
JCR_SECTION_NAME as a way to get around this issue.  Basically, in
Haiku, we don't want _Jv_RegisterClasses to end up in crtbegin.o so we
patched things in what I consider an unsatisfactory manner.  I hope to
get Haiku's GCC changes upstreamed at some point, and I admit our
current fix wouldn't stand any chance at being upstreamed compared to
Joey's suggested fix.

- joe


Re: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

2012-11-26 Thread Joseph Prostko
On Mon, Nov 26, 2012 at 11:55 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Nov 26, 2012 at 7:19 PM, Joey Ye joey...@arm.com wrote:
 Ping^2

 -Original Message-
 From: Joey Ye
 Sent: Tuesday, November 20, 2012 10:09
 To: gcc-patches@gcc.gnu.org
 Cc: Joey Ye
 Subject: RE: [PATCH, libgcc] Make possible to disable JCR in crtstuff.c

 Since we have --enable-languages=.., can't we disable
 JCR_SECTION section if java isn't enabled?

I suppose that is certainly another option.  My main goal is to not
have _Jv_RegisterClasses show up in crtbegin.o, as it simply isn't
needed or desired in the case of Haiku.  Right now we only enable C
and C++, and it simply makes no sense to have anything Java-related
show up in our crtbegin.o.

- joe