Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Jeff Law

On 04/09/2012 11:17 PM, Adam Conrad wrote:


Like I said, then, you didn't actually read or understand why proposing
multilib paths doesn't work.  You realize conceptually, I hope, that
there's no guarantee of uniqueness in lib/lib64/lib32/libsf/libhf once
you cross the base CPU architecture boundary, right?
But what you haven't done is make a case for why anyone should care 
about this problem.




Sure, I said that /libhf/ld-linux.so.3 would *accidentally* work for
us right now, due to sheer luck, and you're running with that as saying
that we clearly have no problem here worth solving.  When the next
architecture clashes with linkers on another (hint: it almost certainly
will), do we get to argue about this all over again in six months,
instead of codifying a new and saner practice now?
My understanding of that architecture is that it's being handled as 
completely different from it's prior implementations.  ie, the toolchain 
and other things are treating it as an entirely separate architecture 
even though there is some common lineage to prior implementations.


If the tools are treating this upcoming architecture as a separate and 
distinct architecture rather than as a variant of a prior architecture, 
then why do we have to worry about conflicting in the filesystem space?


And just to be clear, I'm not taking sides, merely pointing out that you 
haven't made the case in this forum in a way that folks understand why 
this is an important problem.


Jeff


Re: Ada testcase CR line endings

2012-04-10 Thread Arnaud Charlet
 So, I'd like to change all the ada testcases to use normal unix line endings.
 
 testsuite/gnat.dg/taft_type2_pkg.ads
 
 is an example if one such file, any objections?

No objections.


Re: Ada testcase CR line endings

2012-04-10 Thread Mike Stump
On Apr 9, 2012, at 11:24 PM, Arnaud Charlet wrote:
 So, I'd like to change all the ada testcases to use normal unix line endings.
 
 testsuite/gnat.dg/taft_type2_pkg.ads
 
 is an example if one such file, any objections?
 
 No objections.

Applied to trunk and 4.7.x.  Thanks.

Index: gnat.dg/array19.adb
===
--- gnat.dg/array19.adb (revision 186264)
+++ gnat.dg/array19.adb (revision 186265)
@@ -1,34 +1,34 @@
--- { dg-do compile }
-
-package body Array19 is
-
-   function N return Integer is
-   begin
-  return 1;
-   end;
-
-   type Array_Type is array (1 .. N) of Float;
-
-   type Enum is (One, Two);
-
-   type Rec (D : Enum := Enum'First) is record
-  case D is
- when One = null;
- when Two = A : Array_Type;
-  end case;
-   end record;
-
-   procedure Proc is
-
-  R : Rec;
-
-  function F return Array_Type is
-  begin
- return (others = 0.0);
-  end F;
-
-   begin
-  R.A := F;
-   end;
-
-end Array19;
+-- { dg-do compile }
+
+package body Array19 is
+
+   function N return Integer is
+   begin
+  return 1;
+   end;
+
+   type Array_Type is array (1 .. N) of Float;
+
+   type Enum is (One, Two);
+
+   type Rec (D : Enum := Enum'First) is record
+  case D is
+ when One = null;
+ when Two = A : Array_Type;
+  end case;
+   end record;
+
+   procedure Proc is
+
+  R : Rec;
+
+  function F return Array_Type is
+  begin
+ return (others = 0.0);
+  end F;
+
+   begin
+  R.A := F;
+   end;
+
+end Array19;
Index: gnat.dg/array19.ads
===
--- gnat.dg/array19.ads (revision 186264)
+++ gnat.dg/array19.ads (revision 186265)
@@ -1,5 +1,5 @@
-package Array19 is
-
-   procedure Proc;
-
-end Array19;
+package Array19 is
+
+   procedure Proc;
+
+end Array19;
Index: gnat.dg/aggr11.adb
===
--- gnat.dg/aggr11.adb  (revision 186264)
+++ gnat.dg/aggr11.adb  (revision 186265)
@@ -1,17 +1,17 @@
--- { dg-do compile }
--- { dg-options -O }
-
-with Aggr11_Pkg; use Aggr11_Pkg;
-
-procedure Aggr11 is
-
-  A : Arr := ((1 = (Kind  = No_Error, B = True),
-   2 = (Kind = Error),
-   3 = (Kind = Error),
-   4 = (Kind  = No_Error, B = True),
-   5 = (Kind  = No_Error, B = True),
-   6 = (Kind  = No_Error, B = True)));
-
-begin
-   null;
-end;
+-- { dg-do compile }
+-- { dg-options -O }
+
+with Aggr11_Pkg; use Aggr11_Pkg;
+
+procedure Aggr11 is
+
+  A : Arr := ((1 = (Kind  = No_Error, B = True),
+   2 = (Kind = Error),
+   3 = (Kind = Error),
+   4 = (Kind  = No_Error, B = True),
+   5 = (Kind  = No_Error, B = True),
+   6 = (Kind  = No_Error, B = True)));
+
+begin
+   null;
+end;
Index: gnat.dg/discr6.adb
===
--- gnat.dg/discr6.adb  (revision 186264)
+++ gnat.dg/discr6.adb  (revision 186265)
@@ -1,33 +1,33 @@
--- { dg-do compile }
--- { dg-options -gnatdm -gnatws }
-
-with Discr6_Pkg;
-
-procedure Discr6 is
-
-  type T_Bit is range 0..1;
-  type T_Entier_16 is range -2**15 .. 2**15-1;
-
-  package My_Q is new Discr6_Pkg(T_Entier_16);
-
-  type T_Valeur is (BIT, Entier_16);
-
-  type R(D : T_Valeur) is record
-case D is
-  when BIT = V_BIT : T_Bit;
-  when Entier_16 = V_E16 : T_Entier_16;
-end case;
-  end record;
-  for R use record
-V_BIT at 0 range 0..7;
-V_E16 at 0 range 0..15;
-D at 8 range 0..7;
-  end record;
-  for R'size use 128;
-
-  A : R(Entier_16);
-  I : Integer;
-
-begin
-  I := My_Q.X(A.V_E16);
-end;
+-- { dg-do compile }
+-- { dg-options -gnatdm -gnatws }
+
+with Discr6_Pkg;
+
+procedure Discr6 is
+
+  type T_Bit is range 0..1;
+  type T_Entier_16 is range -2**15 .. 2**15-1;
+
+  package My_Q is new Discr6_Pkg(T_Entier_16);
+
+  type T_Valeur is (BIT, Entier_16);
+
+  type R(D : T_Valeur) is record
+case D is
+  when BIT = V_BIT : T_Bit;
+  when Entier_16 = V_E16 : T_Entier_16;
+end case;
+  end record;
+  for R use record
+V_BIT at 0 range 0..7;
+V_E16 at 0 range 0..15;
+D at 8 range 0..7;
+  end record;
+  for R'size use 128;
+
+  A : R(Entier_16);
+  I : Integer;
+
+begin
+  I := My_Q.X(A.V_E16);
+end;
Index: gnat.dg/aggr11_pkg.ads
===
--- gnat.dg/aggr11_pkg.ads  (revision 186264)
+++ gnat.dg/aggr11_pkg.ads  (revision 186265)
@@ -1,14 +1,14 @@
-package Aggr11_Pkg is
-
-   type Error_Type is (No_Error, Error);
-
-   type Rec (Kind : Error_Type := No_Error) is record
- case Kind is
-   when Error = null;
-   when others = B : Boolean;
- end case;
-   end record;
-
-   type Arr is array (1..6) of Rec;
-
-end Aggr11_Pkg;
+package Aggr11_Pkg is
+
+   type Error_Type is (No_Error, Error);
+
+   type 

Re: [Patch, Fortran, F03] PR52909: Procedure pointers not private to modules

2012-04-10 Thread Tobias Burnus
No patch review - but and answer to a question and a comment.

Janus Weil wrote:
 I am aware that it will break the ABI, but only for programs involving
 procedure pointers (which still is a 'relatively' new feature,
 supported since gfortran 4.4).

 Btw, speaking of ABI breaking: What are the chances of the array
 descriptor update and ABI cleanup happening for the 4.8 release? I
 know such an ABI breaking has been planned for some time, but I
 haven't followed the gfortran mailing list in detail during the last
 weeks and months, so I'm not sure what the current status is.

Regarding Fortran-dev (which contains the new descriptor): On the branch,
the new dimension triplet (lbound, extent, sm [stride multiplier in bytes])
is used - instead of the old (lbound, ubound, stride [in # of elements]);
there are still about 20 failures.

The next step is to reduce them to zero. Half of them are due to c_f_pointer;
I have a draft patch for it [moving it from the library to the compiler], but
it still fails in an odd way. The other issues haven't been debugged.
Additionally, one needs to do some optimizations - Thomas' patch is the first
step into that direction.

We then also have to do further changes to the other fields of the descriptor.
I have no idea whether it will be ready until the 4.8 release. It is doable,
but quite a lot of work needs to be done. Especially as we want to do some
additional cleanup. However, there are also other gfortran projects which have
to be worked on.

My personal guess is that the array descriptor will be mostly but not
completely ready by the end of Stage 1 and thus be deferred another year.
However, as the saying goes: Prediction is very difficult, especially about
the future. (Possibly by Niels Bohr, who might have taken it from the 
Danish poet Piet Hein, though the source of the quote is disputed.)

 * * *

Regarding ABI breakage:

I strongly propose that we - starting from 4.8 or starting from the
array-descriptor ABI breakage - keep backward compatibility for the .mod files.
That is, if the mod version is bumped, the compiler shall be able to read the
previous version.

And unless the ABI is serverely remodelled as with the array descriptor: We
should try to avoid ABI breakage and if it is needed as for the procedure 
pointer,
we should do the same as GCC's C++ compiler:

   -fabi-version=n
   -fabi-version=n
   Use version n of the C++ ABI.  Version 2 is the ver-
   sion of the C++ ABI that first appeared in G++ 3.4.
   Version 1 is the version of the C++ ABI that first
   appeared in G++ 3.2.  Version 0 will always be the
   version that conforms most closely to the C++ ABI
   specification.  Therefore, the ABI obtained using ver-
   sion 0 will change as ABI bugs are fixed.

   The default is version 2.

   Version 3 corrects an error in mangling a constant
   address as a template argument.
[...]
   See also -Wabi.


Regarding:
 I am aware that it will break the ABI, but only for programs involving
 procedure pointers (which still is a 'relatively' new feature,
 supported since gfortran 4.4).

That's already supported since 3 years - and there are several programs out
there which use it, given that is was somewhat stable. gfortran escaped some
issues by always bumping the .mod version (due to new features), which made it
less likely that a user could run into issues. We also fixed some very special
corner cases - in particularly with truely new features. But I don't think
that proc-pointers count in this regard: They are too widely supported by old
gfortran versions and by other compilers and proc-pointers as module variable
is not really a corner case.


Hence, we should really think about -fabi-version=n (and .mod compatibility).
Unless, we are positive that we will break the ABI for the array descriptor in
4.8, I am in favour of adding -fabi-version=n already for the proc-pointer
patch.

Comments?

Tobias


Re: [SMS] Support new loop pattern

2012-04-10 Thread Andrey Belevantsev

Hello Ayal,

First of all, thanks for your feedback.  Now to your questions:

On 31.03.2012 3:20, Ayal Zaks wrote:

Roman, Andrey,

Sorry for the delayed response.

It would indeed be good to have SMS apply to more loop patterns, still
within the realm of *countable* loops. SMS was originally designed to
handle doloops, with a specific pattern controlling the loop, easily
identified and separable from the loop's body. The newly proposed
change to support new loop patterns is pretty invasive and sizable,
taking place entirely within modulo-sched.c. The main issue I've been
considering, is whether it would be possible instead to transform the
new loop patterns we want SMS to handle, into doloops (potentially
introducing additional induction variables to feed other uses), and
then feed the resulting loop into SMS as is? In other words, could you
fold it into doloop.c? And if so, will doing so introduce significant
overheads?


Let me perhaps explain better.  The patch itself is one core patch (this 
thread) adding the new functionality on detecting more complex loop 
patterns and the three fixes to SMS found while working on the main patch 
(the fixes are in the mails pinged at the very end of this message).  The 
three fixes are worthwhile to commit separately anyways, they are splitted 
up from the main patch for this purpose, so I would suggest to consider 
them in any case.


For the main patch, its core is as small as we could get.  It stays with 
the countable loops as for the cases where we could get overflow behavior 
or infinite loops we bail out early.  We handle only a case of simple 
same-step affine counters.  The main reason why we add support to SMS and 
not to the doloop pass are is when we do not pipeline a loop newly 
transformed to the doloop form, this loop actually slows down on the 
platforms not having a true doloop pattern.  One has to undo the doloop 
form and to get back to the original loop form to avoid this, which seems 
rather strange.  Also, the separate decrement insn that changes the 
induction variable is better be scheduled to get more precise schedule. 
And yes, I believe that making an extra induction variable just to have the 
control part without uses in the loop core will be unnecessary overhead.


Thus, I believe that if we do want SMS to handle more complex loop, then it 
is inevitable that SMS itself would be somewhat more complex.  I would 
welcome your suggestions to make the patch more clear.  One way I see is 
that the function for getting the condition of the new loop form can be 
moved to the generic RTL loop code given the agreement of other RTL 
maintainers.  Also, some new helpers can be introduced for handling this 
specific loop forms.  But it seems that the distinction between 
doloop/non-doloop loops has to stay in the code.


Yours,
Andrey


2012/3/29 Andrey Belevantseva...@ispras.ru:

Hello,

I'd like to ping again those SMS patches once we're back to Stage 1.

Ayal, maybe it would remove some burden for you if you'd review the general
SMS functionality of those patches, and we'd ask RTL folks to look at the
pieces related to RTL pattern matching and generation?



It definitely would ... especially in light of the above issue.
Thanks (for your patches, patience, pings..),
Ayal.




Yours,
Andrey


On 10.02.2012 16:15, Roman Zhuykov wrote:


Ping.
Ayal, please review this patch and these three patches too:
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01800.html

--
Roman Zhuykov
zhr...@ispras.ru




Re: [patch] Fix cygwin ada install [was Re: Yet another issue with gcc current trunk with ada on cygwin]

2012-04-10 Thread Arnaud Charlet
 Back on this! It turn out that this breaks the shared Ada runtime.
 Indeed, exported variables on Ada packages in a DLLs are only accessible
 when linking against DLL (thanks to runtime pseudo reloc).
 
 With the patch applied it is not possible to build any Ada application
 using the shared runtime. This is a very serious problem, the patch
 should be reverted.

Indeed, that's a blocking issue.

Pascal, could you take care of reverting this change (at least the parts
that are intrusve)? Thanks in advance.

Arno


Re: [Patch ARM/ configury] Add fall-back check for gnu_unique_object

2012-04-10 Thread Ramana Radhakrishnan
 The patch with correct configure output is ok.

 Thanks - this is what I committed.

Is this something that can be considered for backporting to release
branches ? This patch technically doesn't fix a regression but brings
in line behaviour of the normal bootstrap for %gnu_unique_object ?

http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01046.html


regards,
Ramana


 Ramana





 2012-03-15  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

        * config.gcc (target_type_format_char): New. Document it. Set it for
        arm*-*-* .
        * configure.ac (gnu_unique_option): Use target_type_format_char in 
 test.
        Comment rationale.
        * configure: Regenerate .



 Ramana



 Paolo


Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Konstantinos Margaritis
On Tue, 10 Apr 2012 07:36:07 +0200
Jakub Jelinek ja...@redhat.com wrote:
 We really want consistency about the dynamic linker names etc. across
 different targets and sneaking silently multiarch paths on one architecture
 would make it inconsistent with all the others.  So, please just use
 /libhf/ld-linux.so.3.

I personally find /libhf extremely ugly. If having a second path is a problem, 
how about using the triplet in the filename? Like:

/lib/ld-linux-arm-linux-gnueabihf.so.3

?

Unique per arch and not multiarched. And AFAIK, Linux can handle long filenames 
just fine...

Konstantinos


Re: [Libiberty - V2]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-10 Thread Tristan Gingold

On Apr 6, 2012, at 12:26 AM, Ian Lance Taylor wrote:

 Tristan Gingold ging...@adacore.com writes:
 
 gcc/
 2012-04-05  Tristan Gingold  ging...@adacore.com
 
  * gengtype.c (main): Make uintptr_t a known type.
 
 include/
 2012-04-05  Tristan Gingold  ging...@adacore.com
 
  * splay-tree.h: Conditionnaly includes stdint.h and inttypes.h
  (libi_uhostptr_t, libi_shostptr_t): Remove, replaced by uintptr_t.
 
 This is OK.

Thanks, committed.

Tristan.



Re: Fix partitioning of aliases

2012-04-10 Thread Richard Guenther
On Mon, 9 Apr 2012, Jan Hubicka wrote:

 Hi,
 this patch fixes several different ICEs related to handling aliases in WHOPR
 partitioning.  It took me over week debug this, but when variable alias
 is added to a boundary and its destination is not added, we get queue of
 unforutnate events where the destinatoin gets analyzed and added at ltrans 
 time
 resulting in interesting miscompilation seen at Mozilla with some vtables.
 The problem is that constructor won't get streamed when the declaration is
 not in varpool at partitioning time and thus once the variable is re-added
 it has zero constructor.
 Of course the problem manifests itself in various weird ways depending
 on ordering of linker command maing it very difficult to reduce anything.
 
 While working on this I also noticed that PR 52634 is about related problem
 where aliases are incorectly partitioned into multiple partitions.
 The patch also fixes the varpool ICEs mentioned in the other two PRs.
 I failed to produce testcase version of PR52722 testcase, since it does not
 link now either, but it won't ICE.
 
 I will commit the patch and wait for some time, but I would like to backport
 it to 4.7, since it solves quite nasty misoptimization problem.

Yeah, it looks fine to me.

 At mainline after this patch i would like to follow with series of cleanups
 and API changes I have in queue for symtab work.

Thanks,
Richard.

 Honza
 
   PR lto/52722
   PR lto/51765
   PR lto/52634
   * lto-cgraph.c (compute_ltrans_boundary): When alias is in the boundary,
   add its target too.
   * lto.c (add_references_to_partition): Add also aliased nodes.
   (add_cgraph_node_to_partition,
   add_varpool_node_to_partition): Work on nodes, not functions/variables;
   when adding alias, add also the aliased object.
   * gcc.dg/lto/pr52634_1.c: New testcase.
   * gcc.dg/lto/pr52634_0.c: New testcase.
 Index: lto-cgraph.c
 ===
 *** lto-cgraph.c  (revision 185767)
 --- lto-cgraph.c  (working copy)
 *** compute_ltrans_boundary (struct lto_out_
 *** 799,804 
 --- 799,806 
 lto_set_varpool_encoder_encode_initializer (varpool_encoder, vnode);
 add_references (encoder, varpool_encoder, vnode-ref_list);
   }
 +   else if (vnode-alias || vnode-alias_of)
 + add_references (encoder, varpool_encoder, vnode-ref_list);
   }
   
 /* Go over all the nodes again to include callees that are not in
 Index: lto/lto.c
 ===
 *** lto/lto.c (revision 185767)
 --- lto/lto.c (working copy)
 *** free_ltrans_partitions (void)
 *** 1444,1450 
 VEC_free (ltrans_partition, heap, ltrans_partitions);
   }
   
 ! /* See all references that go to comdat objects and bring them into 
 partition too.  */
   static void
   add_references_to_partition (ltrans_partition part, struct ipa_ref_list 
 *refs)
   {
 --- 1444,1451 
 VEC_free (ltrans_partition, heap, ltrans_partitions);
   }
   
 ! /* See all references that go to comdat objects and bring them into 
 partition too.
 !Also see all aliases of the newly added entry and bring them, too.  */
   static void
   add_references_to_partition (ltrans_partition part, struct ipa_ref_list 
 *refs)
   {
 *** add_references_to_partition (ltrans_part
 *** 1453,1467 
 for (i = 0; ipa_ref_list_reference_iterate (refs, i, ref); i++)
   {
 if (ref-refered_type == IPA_REF_CGRAPH
 !DECL_COMDAT (cgraph_function_node (ipa_ref_node (ref), NULL)-decl)
  !cgraph_node_in_set_p (ipa_ref_node (ref), part-cgraph_set))
   add_cgraph_node_to_partition (part, ipa_ref_node (ref));
 else
   if (ref-refered_type == IPA_REF_VARPOOL
 !  DECL_COMDAT (ipa_ref_varpool_node (ref)-decl)
 !  !varpool_node_in_set_p (ipa_ref_varpool_node (ref), 
 part-varpool_set))
 add_varpool_node_to_partition (part, ipa_ref_varpool_node (ref));
   }
   }
   
   /* Worker for add_cgraph_node_to_partition.  */
 --- 1454,1498 
 for (i = 0; ipa_ref_list_reference_iterate (refs, i, ref); i++)
   {
 if (ref-refered_type == IPA_REF_CGRAPH
 !(DECL_COMDAT (cgraph_function_node (ipa_ref_node (ref),
 !NULL)-decl)
 !   || (ref-use == IPA_REF_ALIAS
 !lookup_attribute
 !(weakref, DECL_ATTRIBUTES (ipa_ref_node (ref)-decl
  !cgraph_node_in_set_p (ipa_ref_node (ref), part-cgraph_set))
   add_cgraph_node_to_partition (part, ipa_ref_node (ref));
 else
   if (ref-refered_type == IPA_REF_VARPOOL
 !  (DECL_COMDAT (ipa_ref_varpool_node (ref)-decl)
 ! || (ref-use == IPA_REF_ALIAS
 !  lookup_attribute
 !  (weakref,
 !   DECL_ATTRIBUTES (ipa_ref_varpool_node (ref)-decl
 ! 

Re: [patch] Add generic __builtin_bswap16 support

2012-04-10 Thread Richard Guenther
On Mon, Apr 9, 2012 at 11:43 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Hi,

 this adds generic support for __builtin_bswap16 (only PowerPC has it for now).
 It is mapped to the bswap optab in HImode, whose implementation is as follows:
  - if a bswaphi2 pattern is present (PowerPC), it is directly used; or else
  - if a rotlhi2/rotrhi2 pattern is present (x86), it is directly used, or else
  - if ashlxx2  lshrxx2 are present (SPARC), they are used to open-code the
 operation, or else,
  - if a bswapsi2 pattern is present, it is used (with a final shift), 
 otherwise
  - the bswapsi2 libcall is used (with a final shift).

 Since it is expected that most architectures will fall into one of the first 3
 cases, no __bswaphi2 symbol is added to libgcc.  And pass_optimize_bswap isn't
 modified to recognize the builtin either, as this seems overkill to me.

 Tested on x86, x86-64 and PowerPC Linux, OK for the mainline?

Ok.

Thanks,
Richard.


 2012-04-09  Eric Botcazou  ebotca...@adacore.com

        PR target/52624
        * doc/extend.texi (Other Builtins): Document __builtin_bswap16.
        (PowerPC AltiVec/VSX Built-in Functions): Remove it.
        * builtin-types.def (BT_UINT16): New primitive type.
        (BT_FN_UINT16_UINT16): New function type.
        * builtins.def (BUILT_IN_BSWAP16): New.
        * builtins.c (expand_builtin_bswap): Add TARGET_MODE argument.
        (expand_builtin) BUILT_IN_BSWAP16: New case.  Pass TARGET_MODE to
        expand_builtin_bswap.
        * optabs.c (expand_unop): Deal with bswap in HImode specially.  Add
        missing bits for bswap to libcall code.
        * tree.c (build_common_tree_nodes): Build uint16_type_node.
        * tree.h (enum tree_index): Add TI_UINT16_TYPE.
        (uint16_type_node): New define.
        * config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_BSWAP_HI): Delete.
        * config/rs6000/rs6000.c (rs6000_expand_builtin): Remove handling of
        above builtin.
        (rs6000_init_builtins): Likewise.
        * config/rs6000/rs6000.md (bswaphi2): Add TARGET_POWERPC predicate.


 2012-04-09  Eric Botcazou  ebotca...@adacore.com

 c-family/
        * c-common.h (uint16_type_node): Rename into...
        (c_uint16_type_node): ...this.
        * c-common.c (c_common_nodes_and_builtins): Adjust for above renaming.
        * c-cppbuiltin.c (builtin_define_stdint_macros): Likewise.


 2012-04-09  Eric Botcazou  ebotca...@adacore.com

 testsuite/
        * gcc.dg/builtin-bswap-1.c: Test __builtin_bswap16  __builtin_bswap64.
        * gcc.dg/builtin-bswap-4.c: Test __builtin_bswap16.
        * gcc.dg/builtin-bswap-5.c: Likewise.
        * gcc.target/i386/builtin-bswap-4.c: New test.


 --
 Eric Botcazou


Re: [PATCH] Fix PR52614

2012-04-10 Thread Dominique Dhumieres
 I don't know enough about Fortran to know whether the same issues arise
 there.  Perhaps in Fortran a common symbol is always a common symbol and
 can never be a defined symbol.  If that is the case then for Fortran I
 think it would be safe to change the alignment of the common symbol.  Of
 course we would need to have some way to indicate that in the
 middle-end--DECL_ALWAYS_COMMON or something.

I don't know if this answers the question, but I have run the gfortran 
test suite with -fno-common and I have seen only one failure:

[macbook] lin/test% gfc 
/opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init.f90 
/opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c 
-fno-common -save-temps
ld: duplicate symbol _i in global_vars_f90_init_driver.o and 
global_vars_f90_init.o
collect2: error: ld returned 1 exit status

AFAICT, COMMON is only lightly tested in the gfortran test suite and mostly
for errors or warnings.  However I find surprising that
gfortran.dg/bind_c_coms.f90 + gfortran.dg/bind_c_coms_driver.c works:

[macbook] lin/test% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms.f90 
/opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms_driver.c -fno-common
[macbook] lin/test% a.out 
[macbook] lin/test% 

Dominique


Re: [PATCH] Fix PR middle-end/52894

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 2:45 AM, John David Anglin
d...@hiauly1.hia.nrc.ca wrote:
 The current 4.5, 4.6 and 4.7 branches are now seriously broken for all
 PA target due to the fix applied for PR middle-end/52640, an optimization
 fix.  This is because the PA backend defers output of function descriptors
 and externals (hpux) using TARGET_ASM_FILE_END.  As a result, 
 assemble_external
 can be called after after varasm.c processes it's list of pending externals
 and the vector used for this list deleted.

 The attached change fixes this problem.  The change has been tested
 on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11 with no observed
 regressions.  I believe it to be the simplest way to fix these release
 branches although it may not be optimal (externals are doublely delayed
 on PA hpux).

 Ok for the 4.5, 4.6 and 4.7 branches?  The fix for PR middle-end/52640
 was not applied to the trunk.

I can't immediately see how your description of the list of pending externals
and the vector is deleted.  pa.c keeps its own vector which references the
decls and the only issue I see is that if you call assemble_external after
processing externals you ICE because the pointer-set is not initialized?

Why does the pa backend end up calling assemble_external at final time?

Richard.

 Dave
 --
 J. David Anglin                                  dave.ang...@nrc-cnrc.gc.ca
 National Research Council of Canada              (613) 990-0752 (FAX: 
 952-6602)

 2012-04-09  John David Anglin  dave.ang...@nrc-cnrc.gc.ca

        PR middle-end/52894
        * varasm.c (process_pending_assemble_externals): Set
        pending_assemble_externals_processed true.
        (assemble_external): Call assemble_external_real if the pending
        assemble externals have been processed.

 Index: varasm.c
 ===
 --- varasm.c    (revision 186213)
 +++ varasm.c    (working copy)
 @@ -2108,6 +2108,11 @@
    the entire pending_assemble_externals list.  See assemble_external().  */
  static struct pointer_set_t *pending_assemble_externals_set;

 +/* Some targets delay some output to final using TARGET_ASM_FILE_END.
 +   As a result, assemble_external can be called after the list of externals
 +   is processed and the pointer set destroyed.  */
 +static bool pending_assemble_externals_processed;
 +
  #ifdef ASM_OUTPUT_EXTERNAL
  /* True if DECL is a function decl for which no out-of-line copy exists.
    It is assumed that DECL's assembler name has been set.  */
 @@ -2160,6 +2165,7 @@
     assemble_external_real (TREE_VALUE (list));

   pending_assemble_externals = 0;
 +  pending_assemble_externals_processed = true;
   pointer_set_destroy (pending_assemble_externals_set);
  #endif
  }
 @@ -2201,6 +2207,12 @@
     weak_decls = tree_cons (NULL, decl, weak_decls);

  #ifdef ASM_OUTPUT_EXTERNAL
 +  if (pending_assemble_externals_processed)
 +    {
 +      assemble_external_real (decl);
 +      return;
 +    }
 +
   if (! pointer_set_insert (pending_assemble_externals_set, decl))
     pending_assemble_externals = tree_cons (NULL, decl,
                                            pending_assemble_externals);


Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 12:41 PM, Kai Tietz ktiet...@googlemail.com wrote:
 Hello,

 recent changes to cfgloop have caused an ICE on bootstrapping
 libstdc++ for mingw targets.  I assume same ICE happens for cygwin
 hosted gcc bootstrap, too. ICE happens on compiling of
 gcc/libstdc++-v3/libsupc++/eh_alloc.cc

 Code which produces seg-fault is in macro:

      DEF_VEC_P (loop_p);

 backtrace:

 #0  add_bb_to_loop (bb=0x7ef68ca0, loop=0x0) at ../../gcc/gcc/cfgloop.h:88

loop is NULL, so the issue is that mingw somewhere generates a basic-block
without a loop father.

 #1  0x00a5e3d7 in split_block (bb=0x7ef68bc8, i=0x7eb6ea30)

for this block (bb).

Find out where it is generated and fix that.

Richard.

    at ../../gcc/gcc/cfghooks.c:450
 #2  0x00abaf49 in find_many_sub_basic_blocks (blocks=0x442f178)
    at ../../gcc/gcc/cfgbuild.c:474
 #3  0x008a280f in break_superblocks () at ../../gcc/gcc/cfglayout.c:1345
 #4  0x007293f9 in finish_eh_generation () at ../../gcc/gcc/except.c:1431
 #5  0x00dc06f4 in gimple_expand_cfg () at ../../gcc/gcc/cfgexpand.c:4652
 #6  0x008d4afc in execute_one_pass (pass=0x1143bc0)
    at ../../gcc/gcc/passes.c:2079
 #7  0x008d4cc4 in execute_pass_list (pass=0x4350238)
    at ../../gcc/gcc/passes.c:2134
 #8  0x00a8b1c4 in tree_rest_of_compilation (fndecl=0x7eca1d00)
    at ../../gcc/gcc/tree-optimize.c:422
 #9  0x008da273 in cgraph_expand_function (node=0x7ecd7578)
    at ../../gcc/gcc/cgraphunit.c:1784
 #10 0x008dc2d9 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1851
 #11 0x008dcb2e in cgraph_finalize_compilation_unit ()
    at ../../gcc/gcc/cgraphunit.c:2628
 #12 0x00510c5b in cp_write_global_declarations ()
    at ../../gcc/gcc/cp/decl2.c:4077
 #13 0x008e6452 in toplev_main (argc=30, argv=0x43497d0)
    at ../../gcc/gcc/toplev.c:572
 #14 0x006a104a in main (argc=30, argv=0x43497d0) at ../../gcc/gcc/main.c:36


Re: [PATCH] Fix PR52614

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 1:00 PM, Dominique Dhumieres domi...@lps.ens.fr wrote:
 I don't know enough about Fortran to know whether the same issues arise
 there.  Perhaps in Fortran a common symbol is always a common symbol and
 can never be a defined symbol.  If that is the case then for Fortran I
 think it would be safe to change the alignment of the common symbol.  Of
 course we would need to have some way to indicate that in the
 middle-end--DECL_ALWAYS_COMMON or something.

 I don't know if this answers the question, but I have run the gfortran
 test suite with -fno-common and I have seen only one failure:

 [macbook] lin/test% gfc 
 /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init.f90 
 /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c 
 -fno-common -save-temps
 ld: duplicate symbol _i in global_vars_f90_init_driver.o and 
 global_vars_f90_init.o
 collect2: error: ld returned 1 exit status

 AFAICT, COMMON is only lightly tested in the gfortran test suite and mostly
 for errors or warnings.  However I find surprising that
 gfortran.dg/bind_c_coms.f90 + gfortran.dg/bind_c_coms_driver.c works:

-fno-common seems to be ignored by the fortran frontend, thus,

subroutine foo
  common/BAR/ x
  real(8) x
end

BAR is still a common decl with -fno-common.  For your testcase only
the C compiled parts end up not using commons.

Thus the vectorizer would fail to promote the alignment of BAR regardless
of -f[no-]common.  GFortran OTOH does

.comm   bar_,8,16

for the above, thus aligns it to 16 already (to 32 with -mavx,
probably for vectorization).
So we should be fine.

Richard.

 [macbook] lin/test% gfc 
 /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms.f90 
 /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms_driver.c -fno-common
 [macbook] lin/test% a.out
 [macbook] lin/test%

 Dominique


[PATCH] Move tree_rest_of_compilation

2012-04-10 Thread Richard Guenther

To cgraphunit.c, where its only caller (should) reside.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-04-10  Richard Guenther  rguent...@suse.de

* toplev.h (tree_rest_of_compilation): Remove.
* tree-optimize.c (tree_rest_of_compilation): Likewise.
* cgraph.h (tree_rest_of_compilation): Declare.
* tree-optimize.c (tree_rest_of_compilation): Move ...
* cgraphunit.c (tree_rest_of_compilation): ... here.
* cgraph.c (cgraph_add_new_function): Adjust.
* Makefile.in (tree-optimize.o): Adjust.
(cgraphunit.o): Likewise.

Index: trunk/gcc/cgraph.h
===
*** trunk.orig/gcc/cgraph.h 2012-04-10 10:19:22.0 +0200
--- trunk/gcc/cgraph.h  2012-04-10 10:20:45.216721264 +0200
*** void cgraph_mark_if_needed (tree);
*** 565,570 
--- 565,571 
  void cgraph_analyze_function (struct cgraph_node *);
  void cgraph_finalize_compilation_unit (void);
  void cgraph_optimize (void);
+ void tree_rest_of_compilation (struct cgraph_node *);
  void cgraph_mark_needed_node (struct cgraph_node *);
  void cgraph_mark_address_taken_node (struct cgraph_node *);
  void cgraph_mark_reachable_node (struct cgraph_node *);
Index: trunk/gcc/cgraphunit.c
===
*** trunk.orig/gcc/cgraphunit.c 2012-04-10 10:20:00.0 +0200
--- trunk/gcc/cgraphunit.c  2012-04-10 10:57:26.947589800 +0200
*** along with GCC; see the file COPYING3.
*** 111,116 
--- 111,117 
  #include coretypes.h
  #include tm.h
  #include tree.h
+ #include output.h
  #include rtl.h
  #include tree-flow.h
  #include tree-inline.h
*** along with GCC; see the file COPYING3.
*** 141,146 
--- 142,148 
  #include ipa-inline.h
  #include ipa-utils.h
  #include lto-streamer.h
+ #include regset.h /* FIXME: For reg_obstack.  */
  
  static void cgraph_expand_all_functions (void);
  static void cgraph_mark_functions_to_output (void);
*** assemble_thunks_and_aliases (struct cgra
*** 1768,1773 
--- 1770,1863 
}
  }
  
+ /* For functions-as-trees languages, this performs all optimization and
+compilation for FNDECL.  */
+ 
+ void
+ tree_rest_of_compilation (struct cgraph_node *node)
+ {
+   tree fndecl = node-decl;
+   location_t saved_loc;
+ 
+   timevar_push (TV_REST_OF_COMPILATION);
+ 
+   gcc_assert (cgraph_global_info_ready);
+ 
+   /* Initialize the default bitmap obstack.  */
+   bitmap_obstack_initialize (NULL);
+ 
+   /* Initialize the RTL code for the function.  */
+   current_function_decl = fndecl;
+   saved_loc = input_location;
+   input_location = DECL_SOURCE_LOCATION (fndecl);
+   init_function_start (fndecl);
+ 
+   gimple_register_cfg_hooks ();
+ 
+   bitmap_obstack_initialize (reg_obstack); /* FIXME, only at RTL generation*/
+ 
+   execute_all_ipa_transforms ();
+ 
+   /* Perform all tree transforms and optimizations.  */
+ 
+   /* Signal the start of passes.  */
+   invoke_plugin_callbacks (PLUGIN_ALL_PASSES_START, NULL);
+ 
+   execute_pass_list (all_passes);
+ 
+   /* Signal the end of passes.  */
+   invoke_plugin_callbacks (PLUGIN_ALL_PASSES_END, NULL);
+ 
+   bitmap_obstack_release (reg_obstack);
+ 
+   /* Release the default bitmap obstack.  */
+   bitmap_obstack_release (NULL);
+ 
+   set_cfun (NULL);
+ 
+   /* If requested, warn about function definitions where the function will
+  return a value (usually of some struct or union type) which itself will
+  take up a lot of stack space.  */
+   if (warn_larger_than  !DECL_EXTERNAL (fndecl)  TREE_TYPE (fndecl))
+ {
+   tree ret_type = TREE_TYPE (TREE_TYPE (fndecl));
+ 
+   if (ret_type  TYPE_SIZE_UNIT (ret_type)
+  TREE_CODE (TYPE_SIZE_UNIT (ret_type)) == INTEGER_CST
+  0  compare_tree_int (TYPE_SIZE_UNIT (ret_type),
+  larger_than_size))
+   {
+ unsigned int size_as_int
+   = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ret_type));
+ 
+ if (compare_tree_int (TYPE_SIZE_UNIT (ret_type), size_as_int) == 0)
+   warning (OPT_Wlarger_than_, size of return value of %q+D is %u 
bytes,
+  fndecl, size_as_int);
+ else
+   warning (OPT_Wlarger_than_, size of return value of %q+D is larger 
than %wd bytes,
+  fndecl, larger_than_size);
+   }
+ }
+ 
+   gimple_set_body (fndecl, NULL);
+   if (DECL_STRUCT_FUNCTION (fndecl) == 0
+!cgraph_get_node (fndecl)-origin)
+ {
+   /* Stop pointing to the local nodes about to be freed.
+But DECL_INITIAL must remain nonzero so we know this
+was an actual function definition.
+For a nested function, this is done in c_pop_function_context.
+If rest_of_compilation set this to 0, leave it 0.  */
+   if (DECL_INITIAL (fndecl) != 0)
+   DECL_INITIAL (fndecl) = 

[Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings

2012-04-10 Thread JonY
Hi,

Patch OK?

ChangeLog:
tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings.

Index: tree-parloops.c
===
--- tree-parloops.c (revision 186243)
+++ tree-parloops.c (working copy)
@@ -723,13 +723,15 @@
   FOR_EACH_VEC_ELT (basic_block, body, i, bb)
 if (bb != entry_bb  bb != exit_bb)
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi))
-   if (is_gimple_debug (gsi_stmt (gsi)))
- {
-   if (gimple_debug_bind_p (gsi_stmt (gsi)))
- has_debug_stmt = true;
- }
-   else
- eliminate_local_variables_stmt (entry, gsi, decl_address);
+{
+ if (is_gimple_debug (gsi_stmt (gsi)))
+   {
+ if (gimple_debug_bind_p (gsi_stmt (gsi)))
+   has_debug_stmt = true;
+   }
+ else
+   eliminate_local_variables_stmt (entry, gsi, decl_address);
+}

   if (has_debug_stmt)
 FOR_EACH_VEC_ELT (basic_block, body, i, bb)



signature.asc
Description: OpenPGP digital signature


Re: Ada testcase CR line endings

2012-04-10 Thread Robert Dewar

On 4/10/2012 1:35 AM, Mike Stump wrote:

So, I'd like to change all the ada testcases to use normal unix line endings.

testsuite/gnat.dg/taft_type2_pkg.ads

is an example if one such file, any objections?


As long as the test is not about line endings this seems fine.


Re: [PATCH] Fix PR middle-end/52894

2012-04-10 Thread Dave Anglin

On 10-Apr-12, at 7:06 AM, Richard Guenther wrote:

I can't immediately see how your description of the list of pending  
externals
and the vector is deleted.  pa.c keeps its own vector which  
references the
decls and the only issue I see is that if you call assemble_external  
after
processing externals you ICE because the pointer-set is not  
initialized?


Why does the pa backend end up calling assemble_external at final  
time?


The backend calls assemble_integer to output function descriptors at  
final

time.  This indirectly calls assemble_external from output_addr_const.
This can be seen in comment #4 in the PR.  This occurs after the pending
externals are processed.

We have to ensure that we only output function descriptors that are  
referenced,

and we can only determine this at final.

As a result, assemble_external is called after the VEC has been deleted.
The compiler doesn't ICE.  It goes into an infinite loop when a call to
pointer_set_insert tries to extend the deleted VEC.  It does this even  
if the

pointer is already in the VEC hash.

Previously, the storage wasn't deleted.  I suspect the ineffective  
calls on hpux
weren't noticed because assemble_external had already been called for  
the

function descriptors earlier, so the list in the backend was complete.

The problem was first seen with the Linux target.  There is no backend
list for this target because it doesn't use or need assemble_external.

I personally think it was wrong to add the deferral in  
assemble_external,

but that's another issue.

Dave.


[patch, fortran] Trim spaces on list-directed reads

2012-04-10 Thread Thomas Koenig

Hello world,

this patch effectively trims the spaces from the string on
list-directed reads.  This avoids the large overhead on
processing these spaces when reading from long lines.

I didn't do this for internal units which are arrays because
this would need a separate calculation for each record, and
would mean a major change to the way the code is structured.

The running time for the test case from PR 50673 is reduced
from ~8 s to 0.1 s by this patch.

Regression-tested.  OK for trunk?

Thomas

2012-04-09  Thomas Koenig  tkoe...@gcc.gnu.org

PR libfortran/38199
PR libfortran/50673
* intrinsics/string_intriniscs_inc.c (string_len_trim):
Remove prototypes for string_len_trim and move to...
* libgfortran.h (string_len_trim): ... here and
(string_len_trim_char4): ...here.
* io/unit.c: For non-array internal arrays where we do reading,
adjust the record length to the last non-blank character.
* io/unix.c:  Fix typo.
Index: intrinsics/string_intrinsics_inc.c
===
--- intrinsics/string_intrinsics_inc.c	(Revision 186190)
+++ intrinsics/string_intrinsics_inc.c	(Arbeitskopie)
@@ -44,9 +44,6 @@ extern void concat_string (gfc_charlen_type, CHART
 			   gfc_charlen_type, const CHARTYPE *);
 export_proto(concat_string);
 
-extern gfc_charlen_type string_len_trim (gfc_charlen_type, const CHARTYPE *);
-export_proto(string_len_trim);
-
 extern void adjustl (CHARTYPE *, gfc_charlen_type, const CHARTYPE *);
 export_proto(adjustl);
 
Index: libgfortran.h
===
--- libgfortran.h	(Revision 186190)
+++ libgfortran.h	(Arbeitskopie)
@@ -791,6 +791,13 @@ internal_proto(fstrcpy);
 extern gfc_charlen_type cf_strcpy (char *, gfc_charlen_type, const char *);
 internal_proto(cf_strcpy);
 
+extern gfc_charlen_type string_len_trim (gfc_charlen_type, const char *);
+export_proto(string_len_trim);
+
+extern gfc_charlen_type string_len_trim_char4 (gfc_charlen_type,
+	   const gfc_char4_t *);
+export_proto(string_len_trim_char4);
+
 /* io/intrinsics.c */
 
 extern void flush_all_units (void);
Index: io/unit.c
===
--- io/unit.c	(Revision 186190)
+++ io/unit.c	(Arbeitskopie)
@@ -397,7 +397,7 @@ get_internal_unit (st_parameter_dt *dtp)
   __gthread_mutex_lock (iunit-lock);
 
   iunit-recl = dtp-internal_unit_len;
-  
+
   /* For internal units we set the unit number to -1.
  Otherwise internal units can be mistaken for a pre-connected unit or
  some other file I/O unit.  */
@@ -415,6 +415,26 @@ get_internal_unit (st_parameter_dt *dtp)
 
   start_record *= iunit-recl;
 }
+  else
+{
+  /* If we are not processing an array, adjust the unit record length not
+	 to include trailing blanks for list-formatted reads.  */
+  if (dtp-u.p.mode == READING  dtp-format == NULL)
+	{
+	  if (dtp-common.unit == 0)
+	{
+	  dtp-internal_unit_len =
+		string_len_trim (dtp-internal_unit_len, dtp-internal_unit);
+	  iunit-recl = dtp-internal_unit_len;
+	}
+	  else
+	{
+	  dtp-internal_unit_len =
+		string_len_trim_char4 (dtp-internal_unit_len, dtp-internal_unit);
+	  iunit-recl = dtp-internal_unit_len;
+	}
+	}
+}
 
   /* Set initial values for unit parameters.  */
   if (dtp-common.unit)
Index: io/unix.c
===
--- io/unix.c	(Revision 186190)
+++ io/unix.c	(Arbeitskopie)
@@ -736,7 +736,7 @@ mem_alloc_w4 (stream * strm, int * len)
 }
 
 
-/* Stream read function for character(kine=1) internal units.  */
+/* Stream read function for character(kind=1) internal units.  */
 
 static ssize_t
 mem_read (stream * s, void * buf, ssize_t nbytes)


Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 2:15 PM, JonY jo...@users.sourceforge.net wrote:
 Hi,

 Patch OK?

What kind of warning?

 ChangeLog:
 tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings.

 Index: tree-parloops.c
 ===
 --- tree-parloops.c     (revision 186243)
 +++ tree-parloops.c     (working copy)
 @@ -723,13 +723,15 @@
   FOR_EACH_VEC_ELT (basic_block, body, i, bb)
     if (bb != entry_bb  bb != exit_bb)
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi))
 -       if (is_gimple_debug (gsi_stmt (gsi)))
 -         {
 -           if (gimple_debug_bind_p (gsi_stmt (gsi)))
 -             has_debug_stmt = true;
 -         }
 -       else
 -         eliminate_local_variables_stmt (entry, gsi, decl_address);
 +        {
 +         if (is_gimple_debug (gsi_stmt (gsi)))
 +           {
 +             if (gimple_debug_bind_p (gsi_stmt (gsi)))
 +               has_debug_stmt = true;
 +           }
 +         else
 +           eliminate_local_variables_stmt (entry, gsi, decl_address);
 +        }

   if (has_debug_stmt)
     FOR_EACH_VEC_ELT (basic_block, body, i, bb)



Re: [PATCH, i386, Android] -mandroid support for i386 target

2012-04-10 Thread Ilya Enkovich
Ping.

On Apr 4, 2012 at 1:02, Ilya Enkovich enkovich@gmail.com wrote:
 On Tue, Apr 3, 2012 at 3:49 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On 3/04/2012, at 2:16 AM, Ilya Enkovich wrote:


 The point is that one can build a toolchain for i686-linux-gnu that will 
 support both 32-bit and 64-bit multilibs.  The 32-bit multilib will be 
 used by default, and compilation for 64-bit user-space can be requested 
 with -m64 option.  Even though Android is not supported for -m64, such a 
 toolchain can support Android as a additional -m32 -mandroid multilib. 
  I.e., the toolchain will have three multilibs in total: -m32 
 (default), -m64 and -m32 -mandroid.  I386/linux64.h will be picked 
 up for such a toolchain, even though by default it would compile for 
 32-bit target.  Does this clear up things?


 I think I see your point. And it seems to make all this work I'll also
 have to rename i386/gnu-user.h into i386/gnu-user32.h and create
 i386/gnu-user.h with common definitions to be included by
 gnu-user[32|64].h. Otherwise I wont be able to use some definitions
 (i.e. GNU_USER_TARGET_LINK_SPEC and GNU_USER_TARGET_MATHFILE_SPEC) in
 linux64.h. Right?

 It's simpler that you think.  The target headers ($tm_file in config.gcc 
 -- gnu-user.h, linux*.h, etc. in this case) are all included into tm.h, 
 which serves as proxy to all those headers.  All definitions made in 
 preceding headers are available in subsequent headers.  So, given that 
 i386/gnu-user*.h precedes i386/linux*.h in config.gcc's $tm_file, you only 
 need to touch linux*.h.

 Thanks,

 --
 Maxim Kuvyrkov
 CodeSourcery / Mentor Graphics



 Hi,

 I prepared version with common linux.h and gnu-user.h. Does it look OK?

 Bootstrapped and checked on linux-x86_64.


 There are so many duplicates in gnu-user64.h and gnu-user32.h.
 Please move all of them to gnu-user.h.

 Thanks.

 --
 H.J.

 Hi,

 Here is a new version with all gnu-user32.h and gnu-user64.h common
 definitions moved to gnu-user.h.

 Bootstrapped and checked on linux-x86_64.

 Thanks,
 Ilya
 ---
 2012-04-04  Enkovich Ilya  ilya.enkov...@intel.com

        * config/i386/linux.h: Renamed to ...
        * config/i386/linux32.h: ... this.

        * config/i386/gnu-user.h: Renamed to ...
        * config/i386/gnu-user32.h: ... this.
        (CPP_SPEC): Removed.
        (CC1_SPEC): Removed.
        (ENDFILE_SPEC): Removed.
        (DEFAULT_PCC_STRUCT_RETURN): Removed.
        (TARGET_TLS_DIRECT_SEG_REFS_DEFAULT): Removed.
        (TARGET_OS_CPP_BUILTINS): Removed.
        (LIBGCC2_HAS_TF_MODE): Removed.
        (LIBGCC2_TF_CEXT): Removed.
        (TF_SIZE): Removed.
        (TARGET_ASM_FILE_END): Removed.
        (STACK_CHECK_MOVING_SP): Removed.
        (STACK_CHECK_STATIC_BUILTIN): Removed.
        (TARGET_THREAD_SSP_OFFSET): Removed.
        (TARGET_CAN_SPLIT_STACK): Removed.
        (TARGET_THREAD_SPLIT_STACK_OFFSET): Removed.
        (GNU_USER_TARGET_LINK_SPEC): New.
        (LINK_SPEC): Use GNU_USER_TARGET_LINK_SPEC.

        * config.gcc: Rename i386/linux.h to i386/linux32.h and
        i386/gnu-user.h to i386/gnu-user32.h.

        * config/i386/linux.h: New.

        * config/i386/gnu-user.h: New.

        * config/i386/linux64.h: Include i386/linux.h.

        * config/i386/gnu-user64.h: Include i386/gnu-user.h
        (CPP_SPEC): Removed.
        (CC1_SPEC): Removed.
        (ENDFILE_SPEC): Removed.
        (DEFAULT_PCC_STRUCT_RETURN): Removed.
        (TARGET_TLS_DIRECT_SEG_REFS_DEFAULT): Removed.
        (TARGET_OS_CPP_BUILTINS): Removed.
        (LIBGCC2_HAS_TF_MODE): Removed.
        (LIBGCC2_TF_CEXT): Removed.
        (TF_SIZE): Removed.
        (TARGET_ASM_FILE_END): Removed.
        (STACK_CHECK_MOVING_SP): Removed.
        (STACK_CHECK_STATIC_BUILTIN): Removed.
        (TARGET_THREAD_SSP_OFFSET): Removed.
        (TARGET_CAN_SPLIT_STACK): Removed.
        (TARGET_THREAD_SPLIT_STACK_OFFSET): Removed.
        (GNU_USER_TARGET_LINK_SPEC): New.
        (LINK_SPEC): Use GNU_USER_TARGET_LINK_SPEC.


Re: [PATCH] Fix PR middle-end/52894

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 2:28 PM, Dave Anglin dave.ang...@bell.net wrote:
 On 10-Apr-12, at 7:06 AM, Richard Guenther wrote:

 I can't immediately see how your description of the list of pending
 externals
 and the vector is deleted.  pa.c keeps its own vector which references
 the
 decls and the only issue I see is that if you call assemble_external after
 processing externals you ICE because the pointer-set is not initialized?

 Why does the pa backend end up calling assemble_external at final time?


 The backend calls assemble_integer to output function descriptors at final
 time.  This indirectly calls assemble_external from output_addr_const.
 This can be seen in comment #4 in the PR.  This occurs after the pending
 externals are processed.

 We have to ensure that we only output function descriptors that are
 referenced,
 and we can only determine this at final.

You mean function destcriptors for functions that are output?  You already
know this at cgraph clone materialization time.

 As a result, assemble_external is called after the VEC has been deleted.
 The compiler doesn't ICE.  It goes into an infinite loop when a call to
 pointer_set_insert tries to extend the deleted VEC.  It does this even if
 the
 pointer is already in the VEC hash.

Isn't the bug then that the backend deletes the VEC too early?

 Previously, the storage wasn't deleted.  I suspect the ineffective calls on
 hpux
 weren't noticed because assemble_external had already been called for the
 function descriptors earlier, so the list in the backend was complete.

 The problem was first seen with the Linux target.  There is no backend
 list for this target because it doesn't use or need assemble_external.

 I personally think it was wrong to add the deferral in assemble_external,
 but that's another issue.

I think the best thing would be to revert the offending patch on old branches
(4.6 and 4.5) and see if the deferal can be fixed properly - though I didn't
look at the pa issue in detail.  Maybe Steven can do this.

Richard.

 Dave.


Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop

2012-04-10 Thread Kai Tietz
Hi,

issue was that TARGET_EXCEPT_UNWIND_INFO wasn't set for i386
architectures.  As mingw targets are using SjLj, it is necessary that
this hook is present.

ChangeLog

2012-04-10  Kai Tietz  kti...@redhat.com

PR c++/52918
* common/config/i386/i386-common.c (ix86_except_unwind_info):
Add target-hook for supporting SjLj.

Tested for x86_64-w64-mingw32, and for x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai


Index: common/config/i386/i386-common.c
===
--- common/config/i386/i386-common.c(revision 186264)
+++ common/config/i386/i386-common.c(working copy)
@@ -667,6 +667,26 @@
   return ret;
 }

+/* Implement TARGET_EXCEPT_UNWIND_INFO.  */
+
+static enum unwind_info_type
+ix86_except_unwind_info (struct gcc_options *opts)
+{
+  /* Honor the --enable-sjlj-exceptions configure switch.  */
+#ifdef CONFIG_SJLJ_EXCEPTIONS
+  if (CONFIG_SJLJ_EXCEPTIONS)
+return UI_SJLJ;
+#endif
+
+  /* For simplicity elsewhere in this file, indicate that all unwind
+ info is disabled if we're not emitting unwind tables.  */
+  if (!opts-x_flag_exceptions  !opts-x_flag_unwind_tables)
+return UI_NONE;
+
+  return UI_TARGET;
+}
+
+
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS\
   (TARGET_DEFAULT  \
@@ -684,4 +704,7 @@
 #undef TARGET_SUPPORTS_SPLIT_STACK
 #define TARGET_SUPPORTS_SPLIT_STACK ix86_supports_split_stack

+#undef TARGET_EXCEPT_UNWIND_INFO
+#define TARGET_EXCEPT_UNWIND_INFO  ix86_except_unwind_info
+
 struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;


Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.

2012-04-10 Thread Ramana Radhakrishnan
On 28 March 2012 11:13, Richard Guenther richard.guent...@gmail.com wrote:
 On Wed, Mar 28, 2012 at 11:57 AM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, Mar 27, 2012 at 3:17 PM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:
 And the patch is now attached 

 This does not look like it would compile on any other target.

 Looks like the macros are pre-existing in rtl.h.  With that the ivopts change
 is ok.  I'll let arm folks decide over the arm specific bits.

Thanks for the approval on the ivopts stuff -which I haven't changed .
I have revised the ARM backend specific changes to tweak costs to
prevent auto-inc-dec from gratuitously adding more moves for the
pre/post_modify_disp variety of instructions and to disable these for
versions of the architecture which don't have support for LDRD.

I would like another set of eyes on the backend specific changes - I
am currently regression testing this final version on FSF trunk.

2012-04-10  Ramana Radhakrishnan  ramana.radhakrish...@linaro.org

* tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost):
Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with
USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately.
* config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New.
(USE_LOAD_POST_INCREMENT): Define.
(USE_LOAD_PRE_INCREMENT): Define.
(USE_LOAD_POST_DECREMENT): Define.
(USE_LOAD_PRE_DECREMENT): Define.
(USE_STORE_PRE_DECREMENT): Define.
(USE_STORE_PRE_INCREMENT): Define.
(USE_STORE_POST_DECREMENT): Define.
(USE_STORE_POST_INCREMENT): Define.
(arm_auto_incmodes): Add enumeration.
* config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare.
* config/arm/arm.c (arm_autoinc_modes_ok_p): Define.
(arm_rtx_costs_1): Adjust costs for
auto-inc modes and pre / post modify in floating point mode.
(arm_size_rtx_costs): Likewise.


regards,
Ramana



 Richard.

 Richard.

 Ramana
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 900d09a..f9cb75a 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -248,4 +248,6 @@ extern int vfp3_const_double_for_fract_bits (rtx);
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 extern bool arm_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
 
+extern bool arm_autoinc_modes_ok_p (enum machine_mode, enum arm_auto_incmodes);
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5522fc1..6bc5aa9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7121,6 +7121,19 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, 
bool speed)
   /* Memory costs quite a lot for the first word, but subsequent words
 load at the equivalent of a single insn each.  */
   *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode));
+
+  /* If we have hard float or there is no support for ldrd 
+and strd there is no point in allowing post_dec, 
+pre_inc and pre/post_modify_disp to have the same cost
+for memory accesses in floating point modes.  */
+  if ((TARGET_HARD_FLOAT
+  || !TARGET_LDRD)
+  (FLOAT_MODE_P (mode) 
+ (GET_CODE (XEXP (x, 0)) == POST_DEC
+  || GET_CODE (XEXP (x, 0)) == PRE_INC
+  || GET_CODE (XEXP (x, 0)) == PRE_MODIFY
+  || GET_CODE (XEXP (x, 0)) == POST_MODIFY)))
+   *total += COSTS_N_INSNS (2);
   return true;
 
 case DIV:
@@ -7831,6 +7844,20 @@ arm_size_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
*total = COSTS_N_INSNS (2);
   else
*total = COSTS_N_INSNS (ARM_NUM_REGS (mode));
+
+  /* If we have hard float or there is no support for ldrd 
+and strd there is no point in allowing post_dec, 
+pre_inc and pre/post_modify_disp to have the same cost
+for memory accesses in floating point modes.  */  
+  if ((TARGET_HARD_FLOAT
+  || !TARGET_LDRD)
+  (FLOAT_MODE_P (mode) 
+ (GET_CODE (XEXP (x, 0)) == POST_DEC
+  || GET_CODE (XEXP (x, 0)) == PRE_INC
+  || GET_CODE (XEXP (x, 0)) == PRE_MODIFY
+  || GET_CODE (XEXP (x, 0)) == POST_MODIFY)))
+   *total = COSTS_N_INSNS (2);
+  
   return true;
 
 case DIV:
@@ -25680,5 +25707,51 @@ arm_vectorize_vec_perm_const_ok (enum machine_mode 
vmode,
   return ret;
 }
 
-
+bool
+arm_autoinc_modes_ok_p (enum machine_mode mode, enum arm_auto_incmodes code)
+{
+  /* If we are soft float and we do not have ldrd 
+ then all auto increment forms are ok.  */
+  if (TARGET_SOFT_FLOAT  (TARGET_LDRD || GET_MODE_SIZE (mode) = 4))
+return true;
+
+  switch (code)
+{
+  /* Post increment and Pre Decrement are supported for all
+instruction forms except for vector forms.  */
+case 

Re: [PATCH][ARM] NEON DImode immediate constants

2012-04-10 Thread Andrew Stubbs

Ping.

On 30/03/12 12:15, Andrew Stubbs wrote:

On 28/02/12 16:20, Andrew Stubbs wrote:

Hi all,

This patch implements 64-bit immediate constant loads in NEON.

The current state is that you can load const_vector, but not const_int.
This is clearly not ideal. The result is a constant pool entry when it's
not necessary.

The patch disables the movdi_vfp patterns for loading DImode values, if
the operand is const_int and NEON is enabled, and extends the neon_mov
pattern to include DImode const_int, as well as the const_vector
operands. I've modified neon_valid_immediate only enough to accept
const_int input - the logic remains untouched.


That patch failed to bootstrap successfully, but this updated patch
bootstraps and tests with no regressions.

OK?

Andrew




Re: [PING] allowing fwprop to propagate subregs

2012-04-10 Thread Richard Sandiford
Ulrich Weigand uweig...@de.ibm.com writes:
 http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01269.html

I don't like approving patches I had a hand in, but since Paolo
was happy, and since no-one else seems to care... OK.  Thanks for
getting this working.

Richard


Re: [SH] PR 50751 - add HImode displacement addressing support

2012-04-10 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 The attached patch adds HImode addressing support.
 Tested against rev. 186243 with
 sudo make -k check RUNTESTFLAGS=--target_board=sh-sim
 \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,
 -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}
 
 and no new failures.
 Test cases will follow soon.

The patch is OK for trunk.

BTW, do you have the numbers of CSiBE with this?

Regards,
kaz


rfa: vectorize strided loads [2/2] [PR 18437]

2012-04-10 Thread Michael Matz
Hi,

and this implements generally strided loads where the stride is a 
loop-invariant (constant or ssa-name).  We only do so if the load can't be 
handled by interleaving groups.  The implementation is fairly straight 
forward:

 for (i = 0; i  n; i += stride)
   ... = array[i];

is transformed into:

 for (j = 0; ; j += VF*stride)
   tmp1 = array[j];
   tmp2 = array[j + stride];
   ...
   vectemp = {tmp1, tmp2, ...}

(of course variously adjusted for component number)

The nice thing is that by such separate loads we don't need to care for 
alignment (if the old access was aligned, so will be the new as the access 
size doesn't change).

This is one more step in vectorizing the same loops as ICC.  polyhedron 
has such a case in rnflow, where it helps quite much:

Without patch:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
Name(secs) (bytes)(secs) Repeats   Err %
   -   ---  --   --- ---  --
  ac  3.71 3960337 10.06   2  0.0368
  aermod 75.30 5594185 19.30   5  0.7102
 air 10.30 483  6.59   2  0.0653
capacita  7.23 4185227 57.25   2  0.0968
 channel  2.05 4658532  2.70   5  4.4701
   doduc 14.55 4237850 23.31   2  0.0768
 fatigue  6.64 4127554  7.48   5  0.3063
 gas_dyn 13.25 4113557  2.99   5  4.5063
  induct  7.23 4338356  8.85   5  0.9927
   linpk  1.24 3927350 10.37   5  5.1174
mdbx  5.51 4053841 12.95   5  0.0956
  nf 10.69 4062276 11.44   5  0.2727
 protein 33.04 5011924 39.53   5  0.2328
  rnflow 25.35 4238651 31.78   2  0.0673
test_fpu 12.24 4184939  9.00   5  0.1157
tfft  1.15 3976710  3.95   3  0.0736

Geometric Mean Execution Time =  11.21 seconds

With patch:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
Name(secs) (bytes)(secs) Repeats   Err %
   -   ---  --   --- ---  --
  ac  3.91 3960337 10.34   5  0.3661
  aermod 76.44 5598281 19.03   5  1.3394
 air 11.52 483  6.75   5  0.9347
capacita  7.78 4189323 56.33   2  0.0976
 channel  2.14 4658532  2.59   5  0.6640
   doduc 14.61 4237850 23.41   5  0.2974
 fatigue  6.62 4127554  7.44   5  0.1082
 gas_dyn 13.14 4113557  2.82   5  0.5253
  induct  7.26 4338356  8.49   2  0.0082
   linpk  1.23 3927350  9.78   2  0.0705
mdbx  5.47 4053841 12.90   2  0.0601
  nf 10.67 4062276 11.33   2  0.0004
 protein 32.81 5011924 39.48   2  0.0893
  rnflow 26.57 4246843 26.70   5  0.0915
test_fpu 13.29 4193131  8.82   5  0.2136
tfft  1.14 3976710  3.95   5  0.1753

Geometric Mean Execution Time =  10.95 seconds

I.e. for rnflow from 31.78 to 26.70 seconds, nearly 20% better.

Regstrapped together with [1/2] on x86_64-linux, all langs+Ada, no 
regressions.  Okay for trunk?


Ciao,
Michael.

PR tree-optimization/18437

* tree-vectorizer.h (_stmt_vec_info.stride_load_p): New member.
(STMT_VINFO_STRIDE_LOAD_P): New accessor.
(vect_check_strided_load): Declare.
* tree-vect-data-refs.c (vect_check_strided_load): New function.
(vect_analyze_data_refs): Use it to accept strided loads.
* tree-vect-stmts.c (vectorizable_load): Ditto and handle them.

testsuite/
* gfortran.dg/vect/rnflow-trs2a2.f90: New test.

Index: tree-vectorizer.h
===
--- tree-vectorizer.h.orig  2012-04-10 13:19:18.0 +0200
+++ tree-vectorizer.h   2012-04-10 13:21:19.0 +0200
@@ -545,6 +545,7 @@ typedef struct _stmt_vec_info {
 
   /* For loads only, true if this is a gather load.  */
   bool gather_p;
+  bool stride_load_p;
 } *stmt_vec_info;
 
 /* Access Functions.  */
@@ -559,6 +560,7 @@ typedef struct _stmt_vec_info {
 #define STMT_VINFO_VECTORIZABLE(S) (S)-vectorizable
 #define STMT_VINFO_DATA_REF(S) (S)-data_ref_info
 #define STMT_VINFO_GATHER_P(S)(S)-gather_p
+#define STMT_VINFO_STRIDE_LOAD_P(S)   (S)-stride_load_p
 
 #define STMT_VINFO_DR_BASE_ADDRESS(S)  (S)-dr_base_address
 #define STMT_VINFO_DR_INIT(S)  (S)-dr_init
@@ -875,6 +877,7 @@ extern bool vect_analyze_data_ref_access
 extern bool vect_prune_runtime_alias_test_list (loop_vec_info);
 extern tree vect_check_gather (gimple, loop_vec_info, tree *, tree *,

Re: Symbol table 2/many: break out partitioning code out of lto.c

2012-04-10 Thread Diego Novillo

On 4/10/12 9:32 AM, Jan Hubicka wrote:

Hi,
LTO partitioning logic is one of places that are really symmetric accross most 
types
of symbols; also the implementation is convoluted now by split in between lto.c 
that
computes cgraph/varpool node sets specifying partitions and lto-cgraph.c that
computes a boundaries (i.e. what functions/vars are referenced from given unit).
There is a lot of logic duplication in between the promotion done in lto.c and 
boundary
logic + there are subtle differences causing a lot of pain.

The plan is to reorg the code to new API and unify it, so the partitioning code
produce boundaries on the run and represents every of the partition by the 
encoder,
not by sets.


Excellent.


* lto.c: Update copyright; remove params.h, ipa-inline.h
and ipa-utils.h inlines; inline lto-partition.h
(ltrans_partition_def, ltrans_partition, add_cgraph_node_to_partition,
add_varpool_node_to_partition, new_partition, free_ltrans_partitions,
add_references_to_partition, add_cgraph_node_to_partition_1,
add_cgraph_node_to_partition, add_varpool_node_to_partition,
undo_partition, partition_cgraph_node_p, partition_varpool_node_p,
lto_1_to_1_map, node_cmp, varpool_node_cmp, lto_balanced_map,
promote_var, promote_fn, lto_promote_cross_file_statics): move to...
* lto-partition.c: ... here; new file.
* lto-partition.h: New file.
* Make-lang.in (lto.o): Update dependencies.
(lto-partition.o): New.


OK, thanks.


Diego.


Re: Symbol table 1/many: symtab_nodes

2012-04-10 Thread Richard Guenther
On Tue, 10 Apr 2012, Jan Hubicka wrote:

 Hi,
 this is very basis of the symbol table work.  The basic idea is:
 
 1) give cgraph node and varpool node common base (symtab node)
 2) Move common data there: that is type, declaration, visibility flags,
force_output, ipa references and other stuff.
 3) Introduce symtab.c for basic symbol table manipulation that in first run
is represented as simple linked list of cgraph/varpool node entries.
 4) Update ipa-reference API for the new class structure
 5) Significantly clanup/rework old cgraph reachability and construction code
that has grown up very convoluted to meet non-unit-at-a-time needs.
Merge most of logic done on varpool and cgraph into common grounds
 6) reorg WHOPR partitioning code
 7) Add new types of symbol table entries - labels, aliases and possibly
constant pools and symbols defined by asm statements so existing WHOPR bugs
have chance to be solved.
 8) Add actual symbol name hash - this is trickier than it sounds given that
GCC currently don't really know the symbol names on non-LTO targets.
I plan to simply take the existing lto-symtab logic and slowly require
targtets to provide hook to translate ASSEMBLER_NAME into symbol name.
It would be nice to get rid of ASSEMBLER_NAME and replace it by SYMBOL_NAME
+ set of flags, but this is very hard to do since assembler name goes
everywhere into target machinery.  I am not really volunteering to do that
at this moment.
 9) Reorg most of lto-symtab to work on symtab keeping the prevailing and
other stuff there.
 
 I would like to get most of the changes into mainline for this stage1.  In 
 worst
 case at least 1-6.
 
 I prototytped all except for lto-symtab reorg, but I would like to re-do
 everything incrementally for mainline with cleanups on the way.  Many
 of things can be significantly simplified over current implementation
 and I hope it will make it easier to follow the changes and get me some useful
 feedback.  It is kind of second chance for cleaner cgraph API and it would
 be nice to get it mostly right.
 
 This patch takes the very first step by adding the base object.  It implements
 inheritance in C via cgraph/varpool accessors. C++ would make sense here, but 
 I
 think I still can't because of gengtype.  Once C++ is not only allowed, but
 also useful, I don't think changing the syntactical shugar to actual classes
 would be a significant project.

I agree.  I hope we get to 8), as lto-symtab requires that ;)

This is ok.

Thanks for finally pushing this,
Richard.

 Honza
 
   * cgraph.h: Remove misledaing comment on ipa-ref.h.
   (symtab_type): New enum.
   (symtab_node): New structure.
   (cgraph_node, varpool_node): Add symbol base type.
   (cgraph, varpool): New accestor functions.
   * cgraph.c (cgraph_create_node_1): Set symbol type.
   * varpool.c (varpool_node): Set symbol type.
 Index: cgraph.h
 ===
 --- cgraph.h  (revision 186252)
 +++ cgraph.h  (working copy)
 @@ -27,7 +27,23 @@ along with GCC; see the file COPYING3.
  #include tree.h
  #include basic-block.h
  #include function.h
 -#include ipa-ref.h /* FIXME: inappropriate dependency of cgraph on IPA.  */
 +#include ipa-ref.h
 +
 +/* Symbol table consists of functions and variables.
 +   TODO: add labels, constant pool and aliases.  */
 +enum symtab_type
 +{
 +  SYMTAB_FUNCTION,
 +  SYMTAB_VARIABLE
 +};
 +
 +/* Base of all entries in the symbol table.
 +   The symtab_node is inherited by cgraph and varpol nodes.  */
 +struct GTY(()) symtab_node
 +{
 +  /* Type of the symbol.  */
 +  enum symtab_type type;
 +};
  
  enum availability
  {
 @@ -150,6 +166,7 @@ struct GTY(()) cgraph_clone_info
 Each function decl has assigned cgraph_node listing callees and callers.  
 */
  
  struct GTY((chain_next (%h.next), chain_prev (%h.previous))) cgraph_node 
 {
 +  struct symtab_node symbol;
tree decl;
struct cgraph_edge *callees;
struct cgraph_edge *callers;
 @@ -387,6 +404,7 @@ DEF_VEC_ALLOC_P(cgraph_edge_p,heap);
 Each static variable decl has assigned varpool_node.  */
  
  struct GTY((chain_next (%h.next), chain_prev (%h.prev))) varpool_node {
 +  struct symtab_node symbol;
tree decl;
/* For aliases points to declaration DECL is alias of.  */
tree alias_of;
 @@ -688,6 +706,23 @@ void varpool_add_new_variable (tree);
  #define FOR_EACH_STATIC_VARIABLE(node) \
 for ((node) = varpool_nodes_queue; (node); (node) = (node)-next_needed)
  
 +/* Return callgraph node for given symbol and check it is a function. */
 +static inline struct cgraph_node *
 +cgraph (struct symtab_node *node)
 +{
 +  gcc_checking_assert (node-type == SYMTAB_FUNCTION);
 +  return (struct cgraph_node *)node;
 +}
 +
 +/* Return varpool node for given symbol and check it is a variable.  */
 +static inline struct varpool_node *
 +varpool (struct symtab_node *node)
 +{
 +  

Re: [PATCH] Fix PR middle-end/52894

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 3:36 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Tue, Apr 10, 2012 at 2:44 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, Apr 10, 2012 at 2:28 PM, Dave Anglin dave.ang...@bell.net wrote:
 On 10-Apr-12, at 7:06 AM, Richard Guenther wrote:

 I can't immediately see how your description of the list of pending
 externals
 and the vector is deleted.  pa.c keeps its own vector which references
 the
 decls and the only issue I see is that if you call assemble_external after
 processing externals you ICE because the pointer-set is not initialized?

 Why does the pa backend end up calling assemble_external at final time?


 The backend calls assemble_integer to output function descriptors at final
 time.  This indirectly calls assemble_external from output_addr_const.
 This can be seen in comment #4 in the PR.  This occurs after the pending
 externals are processed.

 We have to ensure that we only output function descriptors that are
 referenced,
 and we can only determine this at final.

 You mean function destcriptors for functions that are output?  You already
 know this at cgraph clone materialization time.

 As a result, assemble_external is called after the VEC has been deleted.
 The compiler doesn't ICE.  It goes into an infinite loop when a call to
 pointer_set_insert tries to extend the deleted VEC.  It does this even if
 the
 pointer is already in the VEC hash.

 Isn't the bug then that the backend deletes the VEC too early?

 Previously, the storage wasn't deleted.  I suspect the ineffective calls on
 hpux
 weren't noticed because assemble_external had already been called for the
 function descriptors earlier, so the list in the backend was complete.

 The problem was first seen with the Linux target.  There is no backend
 list for this target because it doesn't use or need assemble_external.

 I personally think it was wrong to add the deferral in assemble_external,
 but that's another issue.

 I think the best thing would be to revert the offending patch on old branches
 (4.6 and 4.5) and see if the deferral can be fixed properly - though I didn't
 look at the pa issue in detail.  Maybe Steven can do this.

 It seems to me that the patch Dave posted is fine. It basically
 reverts to the old situation (i.e. before the
 pending_assemble_external stuff was added) for all back ends.

Well, ok then.

Thanks,
Richard.

 Ciao!
 Steven


Re: [SH] PR 50751 - add HImode displacement addressing support

2012-04-10 Thread Oleg Endo
On Tue, 2012-04-10 at 22:42 +0900, Kaz Kojima wrote:
 Oleg Endo oleg.e...@t-online.de wrote:
  The attached patch adds HImode addressing support.
  Tested against rev. 186243 with
  sudo make -k check RUNTESTFLAGS=--target_board=sh-sim
  \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,
  -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}
  
  and no new failures.
  Test cases will follow soon.
 
 The patch is OK for trunk.
 
 BTW, do you have the numbers of CSiBE with this?
 

Only for -m4-single -ml -O2 -mpretend-cmove so far.
Not so spectacular :T
I'll also do a comparison of more variants to see if something went
really bad.  It's a bit difficult to isolate the degradations because
there's quite some code reordering happening after the patch... 

Cheers,
Oleg



OpenTCP-1.0.4
  arp 2089 - 2061 -28 / -1.340354 %
  bootp/bootp  740 - 704  -36 / -4.864865 %
  demo/main_demo   372 - 368   -4 / -1.075269 %
  demo/tcp_client_demo 396 - 396 
  demo/tcp_server_demo 468 - 468 
  demo/udp_demo268 - 268 
  dhcp/dhcpc  1588 - 1588
  dns/dns 1340 - 1328 -12 / -0.895522 %
  ethernet1240 - 1228 -12 / -0.967742 %
  http/http_server1784 - 1764 -20 / -1.121076 %
  http/https_callbacks 656 - 656 
  icmp 348 - 352   +4 / +1.149425 %
  ip  1780 - 1668-112 / -6.292135 %
  pop3/pop3_client3796 - 3728 -68 / -1.791359 %
  pop3/pop3c_callbacks  28 - 28  
  smtp/smtp_client2284 - 2212 -72 / -3.152364 %
  smtp/smtpc_callbacks  32 - 32  
  system   980 - 980 
  tcp 4966 - 4970  +4 / +0.080548 %
  tftp/tftps  1024 - 920 -104 / -10.156250 %
  timers   228 - 228 
  udp 1362 - 1322 -40 / -2.936858 %

 total: 27769 - 27269  -500 / -1.800569 %


bzip2-1.0.2
  blocksort   7262 - 7258  -4 / -0.055081 %
  bzip2  17288 - 17248-40 / -0.231374 %
  bzip2recover3868 - 3868
  bzlib  10668 - 10636-32 / -0.299963 %
  compress   14208 - 14192-16 / -0.112613 %
  crctable1024 - 1024
  decompress  8380 - 8380
  huffman 1436 - 1436
  randtable   2048 - 2048

 total: 66182 - 66090  -92 / -0.139011 %


cg_compiler_opensrc
  atom3728 - 3728
  binding 1752 - 1728 -24 / -1.369863 %
  cgcmain 2388 - 2388
  cgstruct 128 - 128 
  check   3300 - 3300
  compile14052 - 14052   
  constfold   3996 - 3996
  cpp 6776 - 6768  -8 / -0.118064 %
  generic_hal 2516 - 2516
  hal 2720 - 2720
  inline  2724 - 2724
  memory   548 - 548 
  parser 15796 - 15796   
  printutils 11888 - 11888   
  scanner 6716 - 6672 -44 / -0.655152 %
  semantic4364 - 4364
  stdlib  3224 - 3224
  support24328 - 24332 +4 / +0.016442 %
  support_iter1832 - 1832
  symbols 6324 - 6312 -12 / -0.189753 %
  tokenize 624 - 624 
  tokens  3112 - 3120  +8 / +0.257069 %

 total:122836 - 122760 -76 / -0.061871 %


compiler
  cg 2868 - 2868
  main   2600 - 2600
  parser 4524 - 4528  +4 / +0.088417 %
  scanner3788 - 3788
  vam5264 - 5264
  vas6292 - 6300  +8 / +0.127146 %

 total: 25336 - 25348  +12 / +0.047363 %


flex-2.5.31
  buf  1100 - 1100
  ccl  1220 - 1220
  dfa  6952 - 6952
  ecs   612 - 612 
  filter   2360 - 2360
  gen 24512 - 24488-24 / -0.097911 %
  libmain28 - 28  
  libyywrap   4 - 4   
  main19804 - 19792-12 / -0.060594 %
  misc 6480 - 6480
  nfa  3812 - 3812
  options  2768 - 2768
  parse9524 - 9528  +4 / +0.041999 %
  regex 596 - 596 
  scan47012 - 47004 -8 / -0.017017 %
  scanopt  3672 - 3672
  skel96652 - 96652   
  sym  1012 - 1012
  tables   2080 - 2064 -16 / -0.769231 %
  tables_shared  28 - 28  
  tblcmp   3764 - 3764
  yylex1292 - 1292

 total:235284 - 235228 -56 / -0.023801 %


jikespg-1.3
  src/ctabs  49912 - 49876-36 / -0.072127 %
  src/globals  288 - 288 
  src/lpgparse   47340 - 47036   -304 / -0.642163 %
  src/lpgutil 5152 - 5120 -32 / -0.621118 %
  src/main5440 - 5440
  src/mkfirst

Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Kirill Yukhin

 Yeah.  And you don't need to change the FEs in any way, all that is needed
 is to change the middle-end/expansion (builtins.c - e.g. get_memmodel)
 and the backend (plus predefine the macros in the backend).

Jakub

Hi Jakub,
Attached patch implements HLE support for __atomic_compare_exchange_n.

So, to emit HLE prefix, it is possible to do:
int
foo2 (int *p, int oldv, int newv)
{
  __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE |
__ATOMIC_USE_HLE, __ATOMIC_ACQUIRE);
  return oldv;
}
Which will generate:
...
lock xacquire cmpxchgl  %esi, (%rcx)
...

Comments?

PS: No tests and TARGET_HLE defined yet.

Thanks, K


hle-rfc-2.gcc.patch
Description: Binary data


Re: [SH] PR 50751 - add HImode displacement addressing support

2012-04-10 Thread Nathan Froyd
- Original Message -
  BTW, do you have the numbers of CSiBE with this?
 
 Only for -m4-single -ml -O2 -mpretend-cmove so far.
 Not so spectacular :T
 I'll also do a comparison of more variants to see if something went
 really bad.  It's a bit difficult to isolate the degradations because
 there's quite some code reordering happening after the patch...

Are you looking at execution time or code size?  If the latter, you could try 
disabling scheduling for comparison purposes (-fno-schedule-insns 
-fno-schedule-insns2).  Even if you're looking at execution time, disabling 
scheduling might make it easier to see where things are going wrong in the 
patched code.

-Nathan


Re: RFC reminder: an alternative -fsched-pressure implementation

2012-04-10 Thread Ramana Radhakrishnan
 The condition I orignally set myself was that this patch should only
 go in if it becomes the default on at least one architecture,
 specifically ARM.  Ulrich tells me that Linaro have now made it
 the default for ARM in their GCC 4.7 release, so hopefully Ramana
 would be OK with doing the same in upstream 4.8.

Before I approve that I'd like to hopefully see some numbers across
the M-class cores as well so that we make sure that side of the
architecture isn't missed out (I don't expect it to make a huge
difference there but it's still worth checking) - I'm quite happy for
it to be default on A8 and A9 based on the evidence I've seen so far.

regards,
Ramana


Re: [SH] PR 50751 - add HImode displacement addressing support

2012-04-10 Thread Oleg Endo
On Tue, 2012-04-10 at 07:14 -0700, Nathan Froyd wrote:
 - Original Message -
   BTW, do you have the numbers of CSiBE with this?
  
  Only for -m4-single -ml -O2 -mpretend-cmove so far.
  Not so spectacular :T
  I'll also do a comparison of more variants to see if something went
  really bad.  It's a bit difficult to isolate the degradations because
  there's quite some code reordering happening after the patch...
 
 Are you looking at execution time or code size?  If the latter, you
 could try disabling scheduling for comparison purposes
 (-fno-schedule-insns -fno-schedule-insns2).  Even if you're looking at
 execution time, disabling scheduling might make it easier to see where
 things are going wrong in the patched code.

Code size only.  Thanks for the hint, will try it out.

Cheers,
Oleg



Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Jakub Jelinek
On Tue, Apr 10, 2012 at 06:12:08PM +0400, Kirill Yukhin wrote:
 Attached patch implements HLE support for __atomic_compare_exchange_n.

The target hook is definitely not appropriate, just define it in
ix86_target_macros in i386-c.c instead or so.

Jakub


Re: rfa: vectorize strided loads [2/2] [PR 18437]

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 3:46 PM, Michael Matz m...@suse.de wrote:
 Hi,

 and this implements generally strided loads where the stride is a
 loop-invariant (constant or ssa-name).  We only do so if the load can't be
 handled by interleaving groups.  The implementation is fairly straight
 forward:

         for (i = 0; i  n; i += stride)
           ... = array[i];

 is transformed into:

         for (j = 0; ; j += VF*stride)
           tmp1 = array[j];
           tmp2 = array[j + stride];
           ...
           vectemp = {tmp1, tmp2, ...}

 (of course variously adjusted for component number)

 The nice thing is that by such separate loads we don't need to care for
 alignment (if the old access was aligned, so will be the new as the access
 size doesn't change).

 This is one more step in vectorizing the same loops as ICC.  polyhedron
 has such a case in rnflow, where it helps quite much:

 Without patch:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   -   ---  --   --- ---  --
          ac      3.71     3960337     10.06       2  0.0368
      aermod     75.30     5594185     19.30       5  0.7102
         air     10.30     483      6.59       2  0.0653
    capacita      7.23     4185227     57.25       2  0.0968
     channel      2.05     4658532      2.70       5  4.4701
       doduc     14.55     4237850     23.31       2  0.0768
     fatigue      6.64     4127554      7.48       5  0.3063
     gas_dyn     13.25     4113557      2.99       5  4.5063
      induct      7.23     4338356      8.85       5  0.9927
       linpk      1.24     3927350     10.37       5  5.1174
        mdbx      5.51     4053841     12.95       5  0.0956
          nf     10.69     4062276     11.44       5  0.2727
     protein     33.04     5011924     39.53       5  0.2328
      rnflow     25.35     4238651     31.78       2  0.0673
    test_fpu     12.24     4184939      9.00       5  0.1157
        tfft      1.15     3976710      3.95       3  0.0736

 Geometric Mean Execution Time =      11.21 seconds

 With patch:

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   -   ---  --   --- ---  --
          ac      3.91     3960337     10.34       5  0.3661
      aermod     76.44     5598281     19.03       5  1.3394
         air     11.52     483      6.75       5  0.9347
    capacita      7.78     4189323     56.33       2  0.0976
     channel      2.14     4658532      2.59       5  0.6640
       doduc     14.61     4237850     23.41       5  0.2974
     fatigue      6.62     4127554      7.44       5  0.1082
     gas_dyn     13.14     4113557      2.82       5  0.5253
      induct      7.26     4338356      8.49       2  0.0082
       linpk      1.23     3927350      9.78       2  0.0705
        mdbx      5.47     4053841     12.90       2  0.0601
          nf     10.67     4062276     11.33       2  0.0004
     protein     32.81     5011924     39.48       2  0.0893
      rnflow     26.57     4246843     26.70       5  0.0915
    test_fpu     13.29     4193131      8.82       5  0.2136
        tfft      1.14     3976710      3.95       5  0.1753

 Geometric Mean Execution Time =      10.95 seconds

 I.e. for rnflow from 31.78 to 26.70 seconds, nearly 20% better.

 Regstrapped together with [1/2] on x86_64-linux, all langs+Ada, no
 regressions.  Okay for trunk?

Ok with ...


 Ciao,
 Michael.

        PR tree-optimization/18437

        * tree-vectorizer.h (_stmt_vec_info.stride_load_p): New member.
        (STMT_VINFO_STRIDE_LOAD_P): New accessor.
        (vect_check_strided_load): Declare.
        * tree-vect-data-refs.c (vect_check_strided_load): New function.
        (vect_analyze_data_refs): Use it to accept strided loads.
        * tree-vect-stmts.c (vectorizable_load): Ditto and handle them.

 testsuite/
        * gfortran.dg/vect/rnflow-trs2a2.f90: New test.

 Index: tree-vectorizer.h
 ===
 --- tree-vectorizer.h.orig      2012-04-10 13:19:18.0 +0200
 +++ tree-vectorizer.h   2012-04-10 13:21:19.0 +0200
 @@ -545,6 +545,7 @@ typedef struct _stmt_vec_info {

   /* For loads only, true if this is a gather load.  */
   bool gather_p;
 +  bool stride_load_p;
  } *stmt_vec_info;

  /* Access Functions.  */
 @@ -559,6 +560,7 @@ typedef struct _stmt_vec_info {
  #define STMT_VINFO_VECTORIZABLE(S)         (S)-vectorizable
  #define STMT_VINFO_DATA_REF(S)             (S)-data_ref_info
  #define STMT_VINFO_GATHER_P(S)            (S)-gather_p
 +#define STMT_VINFO_STRIDE_LOAD_P(S)       (S)-stride_load_p

  #define STMT_VINFO_DR_BASE_ADDRESS(S)      (S)-dr_base_address
  #define STMT_VINFO_DR_INIT(S)              (S)-dr_init
 @@ -875,6 +877,7 @@ extern bool vect_analyze_data_ref_access
  extern bool vect_prune_runtime_alias_test_list 

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Dennis Gilmore
On Tue, 10 Apr 2012 12:18:51 +0300
Konstantinos Margaritis konstantinos.margari...@linaro.org wrote:

 On Tue, 10 Apr 2012 07:36:07 +0200
 Jakub Jelinek ja...@redhat.com wrote:
  We really want consistency about the dynamic linker names etc.
  across different targets and sneaking silently multiarch paths on
  one architecture would make it inconsistent with all the others.
  So, please just use /libhf/ld-linux.so.3.
 
 I personally find /libhf extremely ugly. If having a second path is a
 problem, how about using the triplet in the filename? Like:
 
 /lib/ld-linux-arm-linux-gnueabihf.so.3
 
 ?
 
 Unique per arch and not multiarched. And AFAIK, Linux can handle long
 filenames just fine...

every distro uses a unique triplet, by putting the triplet in there you
then need to get all distros to change to using the same triplets. I
personally prefer /libhfp rather than /libhf but I am ok with using
either. Any change from /lib  would need us to do a mass rebuild.
because while not 100% needed I would rather keep libraries with the
linker. the changes to rpm to support it would be somewhat minimal. we
have stated in Fedora though that we have no intention to support mixing
hfp and sfp on the same system.  we really do need to ensure consensus
for arm64 which I think should be /lib64 for 64 bit arch consistency.


Dennis


Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 3:06 PM, JonY jo...@users.sourceforge.net wrote:
 On 4/10/2012 20:37, Richard Guenther wrote:
 On Tue, Apr 10, 2012 at 2:15 PM, JonY wrote:
 Hi,

 Patch OK?

 What kind of warning?


 Oops, I forgot to mention gcc was complaining about missing braces.

Not for me.  And I fail to see why it should.

Richard.

 ChangeLog:
 tree-parloops.c (eliminate_local_variables): Add braces to suppress 
 warnings.

 Index: tree-parloops.c
 ===
 --- tree-parloops.c     (revision 186243)
 +++ tree-parloops.c     (working copy)
 @@ -723,13 +723,15 @@
   FOR_EACH_VEC_ELT (basic_block, body, i, bb)
     if (bb != entry_bb  bb != exit_bb)
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi))
 -       if (is_gimple_debug (gsi_stmt (gsi)))
 -         {
 -           if (gimple_debug_bind_p (gsi_stmt (gsi)))
 -             has_debug_stmt = true;
 -         }
 -       else
 -         eliminate_local_variables_stmt (entry, gsi, decl_address);
 +        {
 +         if (is_gimple_debug (gsi_stmt (gsi)))
 +           {
 +             if (gimple_debug_bind_p (gsi_stmt (gsi)))
 +               has_debug_stmt = true;
 +           }
 +         else
 +           eliminate_local_variables_stmt (entry, gsi, decl_address);
 +        }

   if (has_debug_stmt)
     FOR_EACH_VEC_ELT (basic_block, body, i, bb)






Re: [C11-atomic] [patch] gimple atomic statements

2012-04-10 Thread Richard Guenther
On Tue, Apr 10, 2012 at 3:14 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Richard Guenther richard.guent...@gmail.com writes:
 On Fri, Apr 6, 2012 at 10:13 AM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Richard Guenther richard.guent...@gmail.com writes:
 They can affect shared memory in some ways like a call, but don't have 
 many
 of the other attributes of call.  They are really more like an assignment 
 or
 other operation with arbitrary shared memory side effects.  I do hope to 
 be
 able to teach the optimizers the directionality of the memory model
 restrictions.  ie, ACQUIRE is only a barrier to hoisting shared memory
 code...  stores can be moved downward past this mode. RELEASE is only a
 barrier to sinking code.   RELAXED is no barrier at all to code motion.  
 In
 fact, a relaxed store is barely different than a real store... but there 
 is
 a slight difference so we can't make it a normal store :-P.

 By teaching the other parts of the compiler about a GIMPLE_ ATOMIC, we 
 could
 hopefully lessen their impact eventually.

 Ok.  I suppose having a GIMPLE_ATOMIC is fine then.

 Just for my own education really, but: does this mean that there'd
 be unnecessary pessimisation in representing the thing as a call?

 No, there are not.  They are more pessimized than GIMPLE_ASSIGNs
 (unless you handle them specially in a few places).  But the same is
 true for GIMPLE_ATOMIC.  The question is one of convenience as
 far as I understand.  In general I would like to avoid new GIMPLE codes,
 especially random ones.  You can do everything with builtins or
 internal functions just fine.

 The interleaved load/store internal fns are really assignments too,
 so if calls aren't right for that kind of operation, maybe we need
 to replace the internal fns with something else.  Or at least come
 up with some new call properties.

 What missed optimizations do you see?

 None. :-)  But...

 Which is a roundabout way of wondering what the main difficulties
 would be in attaching things like directionality to a call.

 Directionality?

 [See above.]

 ...I was asking in the context quoted above, which seemed to be the bit
 that convinced you GIMPLE_ATOMIC would be OK after all.  And the two
 main reasons in the bit quoted above seemed to be that GIMPLE_ATOMIC
 was more like GIMPLE_ASSIGN (which is true of the current internal
 fns too, and was why I was interested) and that we wanted to add the
 directionality of the memory model (which seemed at face value like
 something that could be attached to a call).

 Not arguing for anything here, just an onlooker wanting to understand. :-)

 (BTW, it sounds like restricting memory accesses to GIMPLE_ASSIGN
 might cause trouble for the interleave load/store stuff too.)

 Well.  In the end my plan was to have a GIMPLE_LOAD and GIMPLE_STORE
 stmt, where the load would load to an SSA name and the store would store from
 a constant or an SSA name.  Advantages would be simplified data-flow analysis
 and things like aggregate copy propagation and value-numbering for free.  It
 would also be easier to attach special data / properties to the now
 single load / store
 sites (where in calls you can have an arbitrary number of loads at
 least).  Whatever
 special property the interleaved load/store has would be stored there, too.  
 The
 idea was to be able to fold most of the implicit information about
 loads/stores that
 are in the MEM_REF /COMPONENT_REF / ARRAY_REF trees into proper gimple
 level information, like having an array of index, stride tuples for
 (multidimensional)
 array accesses.  Think of it like a place where we can properly embed the
 MEM_ATTRs we have on the RTL side for example.  That's much easier if
 there were loads/stores only in very defined places.

 Ah, OK, so there'd still be a single gimple stmt (GIMPLE_LOAD or 
 GIMPLE_STORE),
 and that stmt would do the interleaving too?

Well, whatever we'd like to add (part of the atomic stuff would fit here, too,
just not the operation part like increment).

Richard.

  Sounds good.  I was worried at
 first that we'd have two separate stmts (e.g. a load and an interleave)
 which was what the internal fns were supposed to avoid.



 Thanks,
 Richard


Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-10 Thread H.J. Lu
On Thu, Apr 5, 2012 at 8:57 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Apr 5, 2012 at 8:36 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Thu, Apr 5, 2012 at 3:28 PM, H.J. Lu hjl.to...@gmail.com wrote:

 Looking at how other targets implement this check, I don't think that
 this is a problem at all. This issue only shows on a non-bootstrapped
 build. A full bootstrap will use correct address.

 The other place where it shows up is cross compilers but who is going
 to use a 3.2 compiler with GCC 4.8 anyways?


 FWIW, I have no problem with checking __LP64__ only. I just want to
 mention this potential issue.


 Here is a different patch.  It checks sizeof (void *) for those hosts
 where it is appropriate.  Does it look OK?

 Looks OK to me, but you need a GWP reviewer to OK this change now...


 Here is the patch with ChangeLog.  Tested on Linux/x32.  Jakub,
 is this OK for trunk?


Hi Richard,

Is this patch:

http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00303.html

OK for trunk?

Thanks.


-- 
H.J.


Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread H.J. Lu
On Tue, Apr 10, 2012 at 7:12 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:

 Yeah.  And you don't need to change the FEs in any way, all that is needed
 is to change the middle-end/expansion (builtins.c - e.g. get_memmodel)
 and the backend (plus predefine the macros in the backend).

        Jakub

 Hi Jakub,
 Attached patch implements HLE support for __atomic_compare_exchange_n.

 So, to emit HLE prefix, it is possible to do:
 int
 foo2 (int *p, int oldv, int newv)
 {
  __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE |
 __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE);
  return oldv;
 }

This is wrong since HLE ACQUIRE/RELEASE has nothing to do with
C++ atomic acquire/release. You can have HLE RELEASE with C++
atomic acquire.


-- 
H.J.


Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Jakub Jelinek
On Tue, Apr 10, 2012 at 07:42:53AM -0700, H.J. Lu wrote:
  Attached patch implements HLE support for __atomic_compare_exchange_n.
 
  So, to emit HLE prefix, it is possible to do:
  int
  foo2 (int *p, int oldv, int newv)
  {
   __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE |
  __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE);
   return oldv;
  }
 
 This is wrong since HLE ACQUIRE/RELEASE has nothing to do with
 C++ atomic acquire/release. You can have HLE RELEASE with C++
 atomic acquire.

Yes, of course, there should be two bits for HLE rather than one.

Jakub


Re: [RFC] Unconditionally clean up CFG before emitting prologue

2012-04-10 Thread Richard Guenther
On Mon, Apr 9, 2012 at 6:11 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Isn't the gimple cfg-cleanup we run post optimization (right before
 expansion) not enough?  Or the cfg-cleanup we perform right after expansion
 now? At least if the branches are really caused by the gimplification
 process I would expect things to be cleaned up at this point, no?

 At least the patch has a non-null (albeit minor) effect, which pertains mainly
 to jump threading (typical diff attached, Ada code at -O0 on x86-64).

So it's trivial forwarders that are removed.  I wonder why cfgcleanup on the
tree level does not do this, or why it does that on the RTL level.  Are the
blocks becoming empty only during/after register allocation?

Richard.

 --
 Eric Botcazou


Re: RFC: PR rtl-optimization/52876: Sign extend 32 to 64bit then clear upper 32bits fails O1 or higher

2012-04-10 Thread H.J. Lu
On Thu, Apr 5, 2012 at 11:48 AM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 CSE turns (reg:DI 64)

 (insn 6 3 7 2 (set (reg:DI 64)
        (sign_extend:DI (subreg/u:SI (reg/v/f:DI 63 [ addr ]) 0))) x.i:6 122
 {*extendsidi2_rex64} (nil))

 into (reg/f:DI 64 [ addr ]).  But nonzero_bits1 in rtlanal.c has

 #if defined(POINTERS_EXTEND_UNSIGNED)  !defined(HAVE_ptr_extend)
      /* If pointers extend unsigned and this is a pointer in Pmode, say that
         all the bits above ptr_mode are known to be zero.  */
      /* As we do not know which address space the pointer is refering to,
         we can do this only if the target does not support different pointer
         or address modes depending on the address space.  */
      if (target_default_pointer_address_modes_p ()
           POINTERS_EXTEND_UNSIGNED  GET_MODE (x) == Pmode
           REG_POINTER (x))
        nonzero = GET_MODE_MASK (ptr_mode);
 #endif

 Setting REG_POINTER with incompatible pointer sign extension is wrong.
 This patch adds a bool parameter to set_reg_attrs_from_value to
 indicate if a register can be marked as a pointer.  Does it look OK?

 Thanks.


 H.J.
 ---
 gcc/

 2012-04-05  H.J. Lu  hongjiu...@intel.com

        PR rtl-optimization/52876
        * emit-rtl.c (set_reg_attrs_from_value): Add a bool parameter
        to indicate if a register can be a pointer.
        (gen_reg_rtx_and_attrs): Pass true to set_reg_attrs_from_value.
        (set_reg_attrs_for_parm): Likewise.

        * emit-rtl.h (set_reg_attrs_from_value): Add a bool parameter.

        * reginfo.c (reg_scan_mark_refs): Pass false to
        set_reg_attrs_from_value with incompatible pointer sign
        extension.

 gcc/testsuite

 2012-04-05  H.J. Lu  hongjiu...@intel.com

        PR rtl-optimization/52876
        * gcc.target/i386/pr52876.c: New.

 diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
 index 8d7d441..791ff5c 100644
 --- a/gcc/emit-rtl.c
 +++ b/gcc/emit-rtl.c
 @@ -967,7 +967,7 @@ adjust_reg_mode (rtx reg, enum machine_mode mode)
    have different modes, REG is a (possibly paradoxical) lowpart of X.  */

  void
 -set_reg_attrs_from_value (rtx reg, rtx x)
 +set_reg_attrs_from_value (rtx reg, rtx x, bool can_be_reg_pointer)
  {
   int offset;

 @@ -983,14 +983,14 @@ set_reg_attrs_from_value (rtx reg, rtx x)
       if (MEM_OFFSET_KNOWN_P (x))
        REG_ATTRS (reg) = get_reg_attrs (MEM_EXPR (x),
                                         MEM_OFFSET (x) + offset);
 -      if (MEM_POINTER (x))
 +      if (can_be_reg_pointer  MEM_POINTER (x))
        mark_reg_pointer (reg, 0);
     }
   else if (REG_P (x))
     {
       if (REG_ATTRS (x))
        update_reg_offset (reg, x, offset);
 -      if (REG_POINTER (x))
 +      if (can_be_reg_pointer  REG_POINTER (x))
        mark_reg_pointer (reg, REGNO_POINTER_ALIGN (REGNO (x)));
     }
  }
 @@ -1002,7 +1002,7 @@ rtx
  gen_reg_rtx_and_attrs (rtx x)
  {
   rtx reg = gen_reg_rtx (GET_MODE (x));
 -  set_reg_attrs_from_value (reg, x);
 +  set_reg_attrs_from_value (reg, x, true);
   return reg;
  }

 @@ -1013,7 +1013,7 @@ void
  set_reg_attrs_for_parm (rtx parm_rtx, rtx mem)
  {
   if (REG_P (parm_rtx))
 -    set_reg_attrs_from_value (parm_rtx, mem);
 +    set_reg_attrs_from_value (parm_rtx, mem, true);
   else if (GET_CODE (parm_rtx) == PARALLEL)
     {
       /* Check for a NULL entry in the first slot, used to indicate that the
 diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h
 index bc91193..2f259ef 100644
 --- a/gcc/emit-rtl.h
 +++ b/gcc/emit-rtl.h
 @@ -63,7 +63,7 @@ extern rtx copy_insn_1 (rtx);
  extern rtx copy_insn (rtx);
  extern rtx gen_int_mode (HOST_WIDE_INT, enum machine_mode);
  extern rtx emit_copy_of_insn_after (rtx, rtx);
 -extern void set_reg_attrs_from_value (rtx, rtx);
 +extern void set_reg_attrs_from_value (rtx, rtx, bool);
  extern void set_reg_attrs_for_parm (rtx, rtx);
  extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
  extern void adjust_reg_mode (rtx, enum machine_mode);
 diff --git a/gcc/reginfo.c b/gcc/reginfo.c
 index 6353126..585d7a8 100644
 --- a/gcc/reginfo.c
 +++ b/gcc/reginfo.c
 @@ -1224,14 +1224,24 @@ reg_scan_mark_refs (rtx x, rtx insn)
       if (REG_P (dest)  !REG_ATTRS (dest))
        {
          rtx src = SET_SRC (x);
 +         bool can_be_reg_pointer = true;

          while (GET_CODE (src) == SIGN_EXTEND
                 || GET_CODE (src) == ZERO_EXTEND
                 || GET_CODE (src) == TRUNCATE
                 || (GET_CODE (src) == SUBREG  subreg_lowpart_p (src)))
 -           src = XEXP (src, 0);
 +           {
 +#if defined(POINTERS_EXTEND_UNSIGNED)  !defined(HAVE_ptr_extend)
 +             if ((GET_CODE (src) == SIGN_EXTEND
 +                   POINTERS_EXTEND_UNSIGNED)
 +                 || (GET_CODE (src) != SIGN_EXTEND
 +                      ! POINTERS_EXTEND_UNSIGNED))
 +               can_be_reg_pointer = false;
 +#endif
 +             src = XEXP (src, 0);
 +           }

 -         set_reg_attrs_from_value (dest, src);
 +         set_reg_attrs_from_value (dest, 

[PATCH] Fix PR52912, more loop preserving and jump-threading fallout

2012-04-10 Thread Richard Guenther

This properly tells the CFG manipulation of the case where we thread
through a loop exit - the duplicated header block will end up in the
outer loop and the original loop structure will be preserved.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-04-10  Richard Guenther  rguent...@suse.de

PR tree-optimization/52912
* tree-ssa-threadupdate.c (thread_block): Tell the cfg
manipulation code we are threading through a loop header
to an exit destination.

* gcc.dg/torture/pr52912.c: New testcase.

Index: gcc/tree-ssa-threadupdate.c
===
*** gcc/tree-ssa-threadupdate.c (revision 186274)
--- gcc/tree-ssa-threadupdate.c (working copy)
*** thread_block (basic_block bb, bool noloo
*** 661,666 
--- 661,673 
/* We do not update dominance info.  */
free_dominance_info (CDI_DOMINATORS);
  
+   /* We know we only thread through the loop header to loop exits.
+  Let the basic block duplication hook know we are not creating
+  a multiple entry loop.  */
+   if (noloop_only
+bb == bb-loop_father-header)
+ set_loop_copy (bb-loop_father, loop_outer (bb-loop_father));
+ 
/* Now create duplicates of BB.
  
   Note that for a block with a high outgoing degree we can waste
*** thread_block (basic_block bb, bool noloo
*** 692,697 
--- 699,708 
htab_delete (redirection_data);
redirection_data = NULL;
  
+   if (noloop_only
+bb == bb-loop_father-header)
+ set_loop_copy (bb-loop_father, NULL);
+ 
/* Indicate to our caller whether or not any jumps were threaded.  */
return local_info.jumps_threaded;
  }
Index: gcc/testsuite/gcc.dg/torture/pr52912.c
===
*** gcc/testsuite/gcc.dg/torture/pr52912.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr52912.c  (revision 0)
***
*** 0 
--- 1,18 
+ /* { dg-do compile } */
+ 
+ int a, b, c;
+ static int
+ fn1 (p1)
+ {
+ lbl_549:
+   if (p1)
+ goto lbl_549;
+   return 0;
+ }
+ 
+ void
+ fn2 ()
+ {
+   b = (c  a)  fn1 (c) = c;
+ }
+ 


Patches to enable -ftrack-macro-expansion by default

2012-04-10 Thread Dodji Seketeli
Hello,

I am proposing a series of patches which is supposed to address the
remaining issues (I am aware of) preventing us from enabling the
-ftrack-macro-expansion by default.

The idea is to address each issue I notice in the course of trying to
bootstrap the compiler and running the tests with
-ftrack-macro-expansion enabled.

Beside the fixes, I ended up disabling the -ftrack-macro-expansion for
many test cases (sometimes globally in the dg-*.exp files, or on a
case by case basis), because that option changes the compiler output
and so requires that I either adapt the test case or disable the
option.  For other tests, I chose to adapt the test case.

You will find the patch series as a follow-up of this message.

-- 
Dodji


Guard use of modulo in cshift (speedup protein)

2012-04-10 Thread Michael Matz
Hi,

this patch speeds up polyhedrons protein on Bulldozer quite a bit.  The 
things is that in this testcase cshift is called with a very short length
(=3) and that the shift amount always is less than the length.  
Surprisingly the division instruction takes up considerable amount of 
time, so much that it makes sense to guard it, when the shift is in bound.

Here's some oprofile of _gfortrani_cshift0_i4 (total 31020 cycles):

23  0.0032 :   caf00:   idiv   %r13
 13863  1.9055 :   caf03:   lea(%rdx,%r13,1),%r12

I.e. despite the memory shuffling one third of the cshift cycles are that 
division.  With the patch the time for protein drops from 0m21.367s to 
0m20.547s on this Bulldozer machine.  I've checked that it has no adverse 
effect on older AMD or Intel cores (0:44.30elapsed vs 0:44.00elapsed, 
still an improvement).

Regstrapped on x86_64-linux.  Okay for trunk?


Ciao,
Michael.

* m4/cshift0.m4 (cshift0_'rtype_code`): Guard use of modulo.

* generated/cshift0_c10.c: Regenerated.
* generated/cshift0_c16.c: Regenerated.
* generated/cshift0_c4.c: Regenerated.
* generated/cshift0_c8.c: Regenerated.
* generated/cshift0_i16.c: Regenerated.
* generated/cshift0_i1.c: Regenerated.
* generated/cshift0_i2.c: Regenerated.
* generated/cshift0_i4.c: Regenerated.
* generated/cshift0_i8.c: Regenerated.
* generated/cshift0_r10.c: Regenerated.
* generated/cshift0_r16.c: Regenerated.
* generated/cshift0_r4.c: Regenerated.
* generated/cshift0_r8.c: Regenerated.

Index: m4/cshift0.m4
===
--- m4/cshift0.m4   (revision 186272)
+++ m4/cshift0.m4   (working copy)
@@ -98,9 +98,13 @@ cshift0_'rtype_code` ('rtype` *ret, cons
   rptr = ret-base_addr;
   sptr = array-base_addr;
 
-  shift = len == 0 ? 0 : shift % (ptrdiff_t)len;
-  if (shift  0)
-shift += len;
+  /* Avoid the costly modulo for trivially in-bound shifts.  */
+  if (shift  0 || shift = len)
+{
+  shift = len == 0 ? 0 : shift % (ptrdiff_t)len;
+  if (shift  0)
+   shift += len;
+}
 
   while (rptr)
 {


[PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion

2012-04-10 Thread Dodji Seketeli
Hello,

cpp_sys_macro_p crashes when -ftrack-macro-expansion is on.  The issue
can be reproduced by running the tests:

runtest --tool gcc --tool_opts=-ftrack-macro-expansion cpp.exp=sysmac1.c
runtest --tool gcc --tool_opts=-ftrack-macro-expansion cpp.exp=sysmac2.c

This is because it just doesn't support that mode.  Fixed thus.
Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

Note that the bootstrap with -ftrack-macro-expansion turned on
exhibits other separate issues that are addressed in subsequent
patches.  This patch just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

libcpp/

* macro.c (cpp_sys_macro_p):  Support -ftrack-macro-expansion.
---
 libcpp/macro.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/libcpp/macro.c b/libcpp/macro.c
index 54de3e3..58a722c 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -2436,7 +2436,15 @@ cpp_get_token_with_location (cpp_reader *pfile, 
source_location *loc)
 int
 cpp_sys_macro_p (cpp_reader *pfile)
 {
-  cpp_hashnode *node = pfile-context-c.macro;
+  cpp_hashnode *node = NULL;
+
+  if (CPP_OPTION (pfile, track_macro_expansion))
+{
+  if (pfile-context-c.mc)
+   node = pfile-context-c.mc-macro_node;
+}
+  else
+node = pfile-context-c.macro;
 
   return node  node-value.macro  node-value.macro-syshdr;
 }
-- 
Dodji


[PATCH] Fix PR52881, more loop preserving fallout with RTL optimizers

2012-04-10 Thread Richard Guenther

RTL optimizers perform CFG adjustments themselves rather than relying
on cleanup_cfg (as the gimple level does).  This makes it necessary
to handle things all-over-the-place.  This case is if-conversion
performing forwarder block removal (well, it calls it speculation),
removing empty loop latches.  That's unwanted, thus the following
patch makes sure we do not do this.

I'm sure more RTL optimiziation fallout will pop up - and I wonder
if we should simply avoid modifying the CFG all over the place and
instead do that in cleanup_cfg.  Thus, in the if-conversion case,
simply do the speculation and leave the empty forwarder blocks
to be cleaned up by cleanup_cfg (or for CSE to leave unconditional
jumps conditional, as if (0/1), for example).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-04-10  Richard Guenther  rguent...@suse.de

PR rtl-optimization/52881
* ifcvt.c (find_if_case_2): Avoid speculating loop latches.

* gcc.dg/torture/pr52881.c: New testcase.
* gcc.dg/torture/pr52913.c: Likewise.

Index: gcc/ifcvt.c
===
*** gcc/ifcvt.c (revision 186274)
--- gcc/ifcvt.c (working copy)
*** find_if_case_2 (basic_block test_bb, edg
*** 3927,3932 
--- 3927,3937 
edge else_succ;
int then_prob, else_prob;
  
+   /* We do not want to speculate (empty) loop latches.  */
+   if (current_loops
+else_bb-loop_father-latch == else_bb)
+ return FALSE;
+ 
/* If we are partitioning hot/cold basic blocks, we don't want to
   mess up unconditional or indirect jumps that cross between hot
   and cold sections.
Index: gcc/testsuite/gcc.dg/torture/pr52881.c
===
*** gcc/testsuite/gcc.dg/torture/pr52881.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr52881.c  (revision 0)
***
*** 0 
--- 1,35 
+ /* { dg-do compile } */
+ 
+ int a, b, c, d, e, f, h, i, j, k, l, m, n, o;
+ static int g;
+ int
+ fn1 () {
+ for (;; ++f)
+   if (e)
+   break;
+ return 0;
+ }
+ unsigned char fn2 ();
+ void
+ fn3 () {
+ lbl_220:
+ if (j) {
+ lbl_221:
+   l = (g || b) = fn1 ();
+   for (;;) {
+   g = 0;
+   fn2 ();
+   if (k)
+ goto lbl_220;
+   break;
+   }
+   if (l)
+ goto lbl_221;
+ }
+ }
+ unsigned char
+ fn2 () {
+ o = d ? 0 : c;
+ h = m | a % o != n;
+ return i;
+ }
Index: gcc/testsuite/gcc.dg/torture/pr52913.c
===
--- gcc/testsuite/gcc.dg/torture/pr52913.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr52913.c  (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e;
+void
+fn1 ()
+{
+lbl_101:
+  e = 0;
+lbl_274:
+  for (c = 0; c  1; c = a)
+if (d)
+  if (b)
+   goto lbl_101;
+  else
+   break;
+  d = 1;
+  goto lbl_274;
+}


[PATCH 02/11] Fix token pasting with -ftrack-macro-expansion

2012-04-10 Thread Dodji Seketeli
This patch makes token pasting work with -ftrack-macro-expansion
turned on.  It improves some pasting related tests of the gcc.dg/cpp
sub-directory.

Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

Note that the bootstrap with -ftrack-macro-expansion exhibits other
separate issues that are addressed in subsequent patches.  This patch
just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

libcpp/

* macro.c (paste_all_tokens): Put the token resulting from pasting
into an extended token context with -ftrack-macro-location is in
effect.

gcc/testsuite/

* gcc.dg/cpp/paste17.c: New test case for
-ftrack-macro-expansion=2 mode only.
* gcc.dg/cpp/macro-exp-tracking-5.c: Likewise.
---
 gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c |   18 ++
 gcc/testsuite/gcc.dg/cpp/paste17.c  |8 ++
 libcpp/macro.c  |   28 ++-
 3 files changed, 53 insertions(+), 1 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/paste17.c

diff --git a/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c 
b/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c
new file mode 100644
index 000..7933660
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c
@@ -0,0 +1,18 @@
+/*
+  { dg-options -fshow-column -ftrack-macro-expansion }
+  { dg-do compile }
+ */
+
+#define PASTED var ## iable /* { dg-error undeclared } */
+#define call_foo(p1, p2) \
+  foo (p1,  \
+   p2);  /*  { dg-message in expansion of macro } */
+
+void foo(int, char);
+
+void
+bar()
+{
+  call_foo(1,PASTED); /* { dg-message expanded from here } */
+}
+
diff --git a/gcc/testsuite/gcc.dg/cpp/paste17.c 
b/gcc/testsuite/gcc.dg/cpp/paste17.c
new file mode 100644
index 000..9c6506f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/paste17.c
@@ -0,0 +1,8 @@
+ /* { dg-options -ftrack-macro-expansion=2 } */
+/* { dg-do preprocess } */
+
+#define do_paste 1.0e ## -1
+
+do_paste
+
+/* { dg-final {scan-file paste17.i 1.0e- 1 } }*/
diff --git a/libcpp/macro.c b/libcpp/macro.c
index 58a722c..e0bfc31 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -611,6 +611,21 @@ paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs)
 {
   const cpp_token *rhs = NULL;
   cpp_context *context = pfile-context;
+  source_location virt_loc = 0;
+
+  /* We must have been called on a token that appears at the left
+ hand side of a ## operator.  */
+  if (!(lhs-flags  PASTE_LEFT))
+abort ();
+
+  if (context-tokens_kind == TOKENS_KIND_EXTENDED)
+/* The caller must have called consume_next_token_from_context
+   right before calling us.  That has incremented the pointer to
+   the current virtual location.  So it now points to the location
+   of the token that comes right after *LHS.  We want the
+   resulting pasted token to have the location of the current
+   *LHS, though.  */
+virt_loc = *(context-c.mc-cur_virt_loc - 1);
 
   do
 {
@@ -650,7 +665,18 @@ paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs)
   while (rhs-flags  PASTE_LEFT);
 
   /* Put the resulting token in its own context.  */
-  _cpp_push_token_context (pfile, NULL, lhs, 1);
+  if (context-tokens_kind == TOKENS_KIND_EXTENDED)
+{
+  source_location *virt_locs = NULL;
+  _cpp_buff *token_buf = tokens_buff_new (pfile, 1, virt_locs);
+  tokens_buff_add_token (token_buf, virt_locs, lhs,
+virt_loc, 0, NULL, 0);
+  push_extended_tokens_context (pfile, context-c.mc-macro_node,
+   token_buf, virt_locs,
+   (const cpp_token **)token_buf-base, 1);
+}
+  else
+_cpp_push_token_context (pfile, NULL, lhs, 1);
 }
 
 /* Returns TRUE if the number of arguments ARGC supplied in an
-- 
1.7.6.5



-- 
Dodji


[PATCH 03/11] Fix PCH crash on GTYed pointer-to-scalar field of a

2012-04-10 Thread Dodji Seketeli
Disclaimer: I am sorry for the length of this message.

In essence, it's just a one liner fix in gengtype.c.  I felt the need to
explain extensively what I think I understood because it didn't seem
obvious to me.

So, when -ftrack-macro-expansion is activated, the PCH generation
machinery can crash in gt_pch_save when it's about to relocate the
pointer for the line_maps::info_macro::maps[i]::d.macro.macro_locations
member.

The call that crashes (in ggc-common.c) is:

state.ptrs[i]-note_ptr_fn (state.ptrs[i]-obj,
state.ptrs[i]-note_ptr_cookie,
relocate_ptrs, state);

The -note_ptr_fn called in this case is the gengtype-generated
gt_pch_p_9line_maps function.  It crashes because the second argument
passed to it is a pointer to struct line_map, instead of being a
pointer to struct line_maps (extra 's') like what the function
expects.

You can see the crash for the test case:

runtest --tool g++ --tool_opts=-ftrack-macro-expansion pch.exp=system-1.C

I believe it's because a part of the code of gt_pch_nx_line_maps
(generated as part of gtype-desc.c by gengtype) is not correct.  Note
that this gt_pch_nx_line_maps function is called from gt_pch_save in
the snippet:

  for (rt = gt_ggc_rtab; *rt; rt++)
for (rti = *rt; rti-base != NULL; rti++)
  for (i = 0; i  rti-nelt; i++)
(*rti-pchw)(*(void **)((char *)rti-base + rti-stride * i));

So, in that gt_pch_nx_line_maps, in the branch that starts with the
code:

  if ((*x).info_macro.maps != NULL) {
size_t i3;
for (i3 = 0; i3 != (size_t)(((*x).info_macro).used); i3++) {
  switch (((*x).info_macro.maps[i3]).reason == LC_ENTER_MACRO)

we have the code:

gt_pch_note_object ((*x).info_macro.maps[i3].d.macro.macro_locations,
(*x).info_macro.maps,
 gt_pch_p_9line_maps,
  gt_types_enum_last);

This last snippet registers gt_pch_p_9line_maps to be called on the
object pointed by (*x).info_macro.maps[i3].d.macro.macro_locations (as
a first argument), with (*x).info_macro.maps as its second argument.

Note that (*x).info_macro.maps is of type struct line_map*, while 'x'
is of type struct line_maps* - beware, there is an 's' at the end of
the latter.

The problem is that gt_pch_p_9line_maps requires that its second
argument be an instance of _struct line_maps_, not struct line_map.
So later when gt_pch_p_9line_maps is called, it just crashes.

More generally, these gt_pch_p_xxx functions seem to require that
their second argument be an instance of the xxx in question.  And that
invariant is violated by the snippet of code above.

The invariant seems to be violated only for the case where a GTYed
structure (possibly embedded in another GTYed structure) contains a
pointer to a scalar (that is not a string) which memory is ggc/GTY
managed, like the line_map_macro::macro_locations field.  And this
only happens for PCH generation.

Looking at gengtype.c, it seems like write_types_process_field can be
fooled in that case.  It expects that the expression d-prev_val[3]
contains the name of the second argument of the gt_pch_p_xxx (which is
generically referenced by wtd-subfield_marker_routine there).  That
expression can resolve to either x, as we would like it to be, but
can also resolve to another arbitrary name for e.g, the case of a
pointer-to-struct used as a root).

This patch simply forces the second argument of gt_pch_p_xxx to be 'x'
even in the case of a member that is a pointer to a scalar.

As a result, here is the the diff the new generated gtype-desc.c file:

@@ -5234,7 +5234,7 @@ gt_pch_nx_line_maps (void *x_p)
 size_t i2;
 for (i2 = 0; i2 != (size_t)(2 * 
((*x).info_ordinary.maps[i0].d.macro).n_tokens); i2++) {
 }
-gt_pch_note_object 
((*x).info_ordinary.maps[i0].d.macro.macro_locations, (*x).info_ordinary.maps, 
gt_pch_p_9line_maps, gt_types_enum_last);
+gt_pch_note_object 
((*x).info_ordinary.maps[i0].d.macro.macro_locations, x, gt_pch_p_9line_maps, 
gt_types_enum_last);
   }
   break;
 default:
@@ -5261,7 +5261,7 @@ gt_pch_nx_line_maps (void *x_p)
 size_t i5;
 for (i5 = 0; i5 != (size_t)(2 * 
((*x).info_macro.maps[i3].d.macro).n_tokens); i5++) {
 }
-gt_pch_note_object 
((*x).info_macro.maps[i3].d.macro.macro_locations, (*x).info_macro.maps, 
gt_pch_p_9line_maps, gt_types_enum_last);
+gt_pch_note_object 
((*x).info_macro.maps[i3].d.macro.macro_locations, x, gt_pch_p_9line_maps, 
gt_types_enum_last);
   }
   break;
 default:
@@ -9366,7 +9366,7 @@ gt_pch_na_regno_reg_rtx (ATTRIBUTE_UNUSED void *x_p)
 for (i1 = 0; i1 != (size_t)(crtl-emit.x_reg_rtx_no); i1++) {
   gt_pch_n_7rtx_def (regno_reg_rtx[i1]);
 }
-gt_pch_note_object (regno_reg_rtx, 

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Konstantinos Margaritis
On Tue, 10 Apr 2012 09:32:22 -0500
Dennis Gilmore den...@gilmore.net.au wrote: 
 every distro uses a unique triplet, by putting the triplet in there you
 then need to get all distros to change to using the same triplets. I
 personally prefer /libhfp rather than /libhf but I am ok with using
 either. Any change from /lib  would need us to do a mass rebuild.
 because while not 100% needed I would rather keep libraries with the
 linker. the changes to rpm to support it would be somewhat minimal. we
 have stated in Fedora though that we have no intention to support mixing
 hfp and sfp on the same system.  we really do need to ensure consensus
 for arm64 which I think should be /lib64 for 64 bit arch consistency.

Ok, I respect that, what about using the actual ABI name, ie aapcs-vfp? or 
something that includes both the architecture and the eabi, (arm-hardfloat, 
armhf, armhfp, etc), but *in* the filename (excluding the case of using a 
separate directory as it's not too popular).

Also, I'm not suggesting changing the triplet or anything, just deciding on a 
unique name for the triplet. Debian's argument is that the default multilib 
solution is not future-proof and we would prefer something that is more unique. 

Regarding /lib64 for aarch64, that's an entirely different discussion, though I 
do agree it should be also be resolved sooner rather than later as well.

Regards

Konstantinos


Re: Guard use of modulo in cshift (speedup protein)

2012-04-10 Thread Steven Bosscher
On Tue, Apr 10, 2012 at 4:53 PM, Michael Matz m...@suse.de wrote:
 Hi,

 this patch speeds up polyhedrons protein on Bulldozer quite a bit.  The
 things is that in this testcase cshift is called with a very short length
 (=3) and that the shift amount always is less than the length.
 Surprisingly the division instruction takes up considerable amount of
 time, so much that it makes sense to guard it, when the shift is in bound.

 Here's some oprofile of _gfortrani_cshift0_i4 (total 31020 cycles):

    23  0.0032 :   caf00:       idiv   %r13
  13863  1.9055 :   caf03:       lea    (%rdx,%r13,1),%r12

 I.e. despite the memory shuffling one third of the cshift cycles are that
 division.  With the patch the time for protein drops from 0m21.367s to
 0m20.547s on this Bulldozer machine.  I've checked that it has no adverse
 effect on older AMD or Intel cores (0:44.30elapsed vs 0:44.00elapsed,
 still an improvement).

 Regstrapped on x86_64-linux.  Okay for trunk?


 Ciao,
 Michael.

        * m4/cshift0.m4 (cshift0_'rtype_code`): Guard use of modulo.

        * generated/cshift0_c10.c: Regenerated.
        * generated/cshift0_c16.c: Regenerated.
        * generated/cshift0_c4.c: Regenerated.
        * generated/cshift0_c8.c: Regenerated.
        * generated/cshift0_i16.c: Regenerated.
        * generated/cshift0_i1.c: Regenerated.
        * generated/cshift0_i2.c: Regenerated.
        * generated/cshift0_i4.c: Regenerated.
        * generated/cshift0_i8.c: Regenerated.
        * generated/cshift0_r10.c: Regenerated.
        * generated/cshift0_r16.c: Regenerated.
        * generated/cshift0_r4.c: Regenerated.
        * generated/cshift0_r8.c: Regenerated.

 Index: m4/cshift0.m4
 ===
 --- m4/cshift0.m4       (revision 186272)
 +++ m4/cshift0.m4       (working copy)
 @@ -98,9 +98,13 @@ cshift0_'rtype_code` ('rtype` *ret, cons
   rptr = ret-base_addr;
   sptr = array-base_addr;

 -  shift = len == 0 ? 0 : shift % (ptrdiff_t)len;
 -  if (shift  0)
 -    shift += len;
 +  /* Avoid the costly modulo for trivially in-bound shifts.  */
 +  if (shift  0 || shift = len)
 +    {
 +      shift = len == 0 ? 0 : shift % (ptrdiff_t)len;
 +      if (shift  0)
 +       shift += len;
 +    }

   while (rptr)
     {

This is OK.

Do you think it would be worthwhile to do this transformation in the
middle end too, based on profile information for values? IIRC
value-prof handles constant divmod but not ranges for modulo
operations.

Steven


Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop

2012-04-10 Thread Kai Tietz
Sorry, for the noise.  This patch just activates dw2 and therefore
avoids the ICE.

Continue on that.

Kai


move warning_if_unused_value to c-common.c

2012-04-10 Thread Manuel López-Ibáñez
It is a C-family FE function, not sure how it ended up in stmt.c.

OK to commit?

Cheers,

Manuel.

Bootstrapped and regression tested with languages=all,ada,obj-c++

2012-04-10  Manuel López-Ibáñez  m...@gcc.gnu.org

* c-family/c-common.c (warn_if_unused_valuer): Move definition to here.
* tree.h (warn_if_unused_value): Move declaration from here.
* c-family/c-common.h (warn_if_unused_value): Move declaration to here.
* cp/cvt.c (convert_to_void): Update comment.
* stmt.c (warn_if_unused_value): Move definition from here.


warning_if_unused.diff
Description: Binary data


Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Steve McIntyre
On Tue, Apr 10, 2012 at 09:32:22AM -0500, Dennis Gilmore wrote:
On Tue, 10 Apr 2012 12:18:51 +0300
Konstantinos Margaritis konstantinos.margari...@linaro.org wrote:

 On Tue, 10 Apr 2012 07:36:07 +0200
 Jakub Jelinek ja...@redhat.com wrote:
  We really want consistency about the dynamic linker names etc.
  across different targets and sneaking silently multiarch paths on
  one architecture would make it inconsistent with all the others.
  So, please just use /libhf/ld-linux.so.3.
 
 I personally find /libhf extremely ugly. If having a second path is a
 problem, how about using the triplet in the filename? Like:
 
 /lib/ld-linux-arm-linux-gnueabihf.so.3
 
 ?
 
 Unique per arch and not multiarched. And AFAIK, Linux can handle long
 filenames just fine...

every distro uses a unique triplet, by putting the triplet in there you
then need to get all distros to change to using the same triplets.

Aargh. Again, use of a standard triplet for arm hard-float was agreed
by all parties at the Plumbers' meeting last September. For exactly
this reason. Now it seems that a number of people have totally ignored
that consensus for the last six months.

I personally prefer /libhfp rather than /libhf but I am ok with using
either. Any change from /lib would need us to do a mass
rebuild. because while not 100% needed I would rather keep libraries
with the linker. the changes to rpm to support it would be somewhat
minimal. we have stated in Fedora though that we have no intention to
support mixing hfp and sfp on the same system.  we really do need to
ensure consensus for arm64 which I think should be /lib64 for 64 bit
arch consistency.

Precisely who in Redhat/Fedora land has the power to make decisions in
this area? We've been clearly wasting our time thus far trying to
negotiate agreements about the hard-float platform.

Cheers,
-- 
Steve McIntyresteve.mcint...@linaro.org
http://www.linaro.org/ Linaro.org | Open source software for ARM SoCs



Re: Guard use of modulo in cshift (speedup protein)

2012-04-10 Thread Michael Matz
Hi,

On Tue, 10 Apr 2012, Steven Bosscher wrote:

 This is OK.

r186283.

 Do you think it would be worthwhile to do this transformation in the 
 middle end too, based on profile information for values?

I'd think so, but it probably requires a new profiler that counts for how 
often 0 = A = B for every A % B.  Just profiling the range of values 
might be misleading (because A = N and B = M and N = M doesn't imply 
that A = B often holds).

But it would possibly be an interesting experiment already to do such 
transformation generally (without profiling) and see what it gives on some 
benchmarks.  Just to get a feel what's on the plate.

 IIRC value-prof 
 handles constant divmod but not ranges for modulo operations.


Ciao,
Michael.


Re: [PATCH] Fix PR52881, more loop preserving fallout with RTL optimizers

2012-04-10 Thread Steven Bosscher
On Tue, Apr 10, 2012 at 4:56 PM, Richard Guenther rguent...@suse.de wrote:
 I'm sure more RTL optimiziation fallout will pop up - and I wonder
 if we should simply avoid modifying the CFG all over the place and
 instead do that in cleanup_cfg.  Thus, in the if-conversion case,
 simply do the speculation and leave the empty forwarder blocks
 to be cleaned up by cleanup_cfg (or for CSE to leave unconditional
 jumps conditional, as if (0/1), for example).

The problem with that could be that if(0|1) is not a valid instruction
for every machine. You may need to set a register (pseudo, or hard
reg) that may not be available (e.g. if it is live).

Ciao!
Steven


Re: rfa: vectorize strided loads [2/2] [PR 18437]

2012-04-10 Thread Michael Matz
Hi,

On Tue, 10 Apr 2012, Richard Guenther wrote:

  +         vec_inv = build_constructor (vectype, v);
  +         new_temp = vect_init_vector (stmt, vec_inv, vectype, gsi);
  +         new_stmt = SSA_NAME_DEF_STMT (new_temp);
  +         mark_symbols_for_renaming (new_stmt);
 
 This should not be necessary - in fact please manually set the proper
 VUSE here.

I don't see how I could know the right VUSE.  It might not even exist yet.  
Reusing the existing vuse from the load that is replaced isn't correct.  
Consider that there are (unrelated) stores or calls before the load, and 
those are vectorized.  When updating their operands they'll get a new 
vdef, and I'd need to use _that_ one:

original:
  #vdef MEM_1
  scalar_store =   (this insn will be removed by vectorizable_store)
  #vuse MEM_1
  ... = A[i]

---

  #vdef MEM   (note, no ssaname yet)
  vect_store = ...
  #vuse MEM_1
  ... = A[i]  ( will be later removed as unused LHS)
  #vuse XXX
  ... = A[i]
  #vuse XXX
  ... = A[i+stride]

So, which SSA name to use for XXX, when the vect_store doesn't yet know 
(which it doesn't).

Tracking (and generating) proper VDEF SSA names is possible (we only 
vectorize a single BB, and that in stmt order), but that requires some 
rework of the vectorizer main loops, which should be a separate patch.
Agreed?


Ciao,
Michael.

Re: move warning_if_unused_value to c-common.c

2012-04-10 Thread Gabriel Dos Reis
On Tue, Apr 10, 2012 at 10:25 AM, Manuel López-Ibáñez
lopeziba...@gmail.com wrote:
 It is a C-family FE function, not sure how it ended up in stmt.c.

 OK to commit?

OK.


Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Andi Kleen
H.J. Lu hjl.to...@gmail.com writes:

 So, to emit HLE prefix, it is possible to do:
 int
 foo2 (int *p, int oldv, int newv)
 {
  __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE |
 __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE);
  return oldv;
 }

 This is wrong since HLE ACQUIRE/RELEASE has nothing to do with
 C++ atomic acquire/release. You can have HLE RELEASE with C++
 atomic acquire.

It makes sense to combine the two. On x86 C++ atomic acquire/release
means the compiler cannot move references outside. For HLE
we really want the same, otherwise some of the memory references
inside the transaction may not be transactional.

So I think HLE_ACQUIRE should imply C++ acquire
and HLE_RELEASE imply C++ release.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: RFC reminder: an alternative -fsched-pressure implementation

2012-04-10 Thread Vladimir Makarov

On 04/10/2012 09:35 AM, Richard Sandiford wrote:

Hi Vlad,

Back in Decemember, when we were still very much in stage 3, I sent
an RFC about an alternative implementation of -fsched-pressure.
Just wanted to send a reminder now that we're in the proper stage:

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01684.html

Ulrich has benchmarked it on ARM, S/390 and Power7 (thanks), and got
reasonable results.  (I mentioned bad Power 7 results in that message,
because of the way the VSX_REGS class is handled.  Ulrich's results
are without -mvsx though.)

The condition I orignally set myself was that this patch should only
go in if it becomes the default on at least one architecture,
specifically ARM.  Ulrich tells me that Linaro have now made it
the default for ARM in their GCC 4.7 release, so hopefully Ramana
would be OK with doing the same in upstream 4.8.
Sorry for forgetting about this and thanks for the remainder.  I'll 
start looking at this patch thoroughly and reviewing it on this week 
(may be I'll do some benchmarks too).

I realise the whole thing is probably more complicated and ad-hoc
than you'd like.  Saying it can't go in is a perfectly acceptable
answer IMO.

The first impression I got for the patch when you submitted it, that it 
probably should go in.


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Manuel López-Ibáñez
On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote:
 On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote:

 * It uses  the default cutoff as max_width, whatever it is (as
 controlled by -fmessage-length).
 * It uses the pretty-printer. The text cannot (should not) wrap
 because we still print only max_width chars at most.


 Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want
 to use COLUMNS to limit how much of the source we print.

Like this?

2012-04-05  Manuel López-Ibáñez  m...@gcc.gnu.org

PR 24985
libstdc++-v3/
* testsuite/lib/prune.exp: Handle caret.
libmudflap/
* testsuite/lib/libmudflap.exp: Handle caret.
gcc/
* diagnostic.h (show_caret): Declare.
(caret_max_width): Declare.
(diagnostic_show_locus): Declare.
(diagnostic_set_caret_max_width): Declare.
* diagnostic.c (diagnostic_initialize): Initialize to false.
(diagnostic_show_locus): New.
(diagnostic_report_diagnostic): Call it.
(getenv_columns): New.
(diagnostic_set_caret_max_width): New.
* input.c (read_line): New.
(location_get_source_line): New.
* input.h (location_get_source_line): Declare.
* toplev.c (general_init): Initialize show_caret from options.
* dwarf2out.c (gen_producer_string): Handle fdiagnostics-show-caret.
* opts.c (common_handle_option): Likewise.
* pretty-print.h (pp_get_prefix): New.
(pp_base_get_prefix): New.
* common.opt (fdiagnostics-show-caret): New option.
* doc/invoke.texi (fdiagnostics-show-caret): Document it.
testsuite/
* lib/prune.exp: Add -fno-diagnostics-show-caret.


caret-diagnostics-20120410.diff
Description: Binary data


Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Michael Edwards
FWIW, my use case for multiarch is not sharing the root filesystem 
among multiple systems.  It's sharing the non-lib namespace (/etc, 
/bin, data) among multiple instruction sets / ABI variants on the same 
system.  I don't need (/usr)?/s?bin to be decorated with a triplet, 
because the kernel picks a fresh ld.so variant at the execve() boundary; 
I am happy to mix ARM and x86 binaries (and Python and shell scripts) in 
/bin, and let the kernel (and binfmt_misc + qemu) sort it out.  I only 
need (/usr)?/lib to be disambiguated *at runtime* because ld.so is not 
as smart as the kernel.  (It's not just ld.so, of course; module/plugin 
loaders for everything from Python to Firefox have the same problem, and 
if they don't have the triplet in there somewhere then multiarch breaks 
them.)


As long as the kernel can find the right ld.so and each ld.so can find 
its own ld.so.conf, I don't really care where the libraries are put at 
runtime, as long as I can make it different for each ISA/ABI.  However, 
I do care how much autoconf / pkg-config / debhelper misery I have to go 
through each time I need to pull in another library dependency.  
Upstream build machinery can usually accommodate 
/just/about/anything/lib.  Trailing components like lib32, libhf, or 
lib-gnu-autoconf-triplet are not as consistently trivial.


Personally, I would like for all shared objects to live in 
(/usr)?/gnu-autoconf-triplet(-extranoise)?/lib, and for the kernel to 
take responsibility for pointing (/usr)?/lib at an overlay mount 
containing whatever makes sense for the currently running binary, a la 
/proc/self.  That way I can grandfather in binaries with ABI-ignorant 
hard-coded library paths, and still handle ISA variants.  The 
extranoise might be neon, or ssse3, or android (so that 
non-Android binaries on the same system don't see Android-specific 
libraries with stupidly generic names like libui.so).  And the overlay 
mount is so that I can, if I choose, build the vast majority of my 
system without NEON instructions (and thus not take the overhead of VFP 
context save/restore during timeslices that don't use actual floating 
point) and still use a subset of those libraries from NEON-anointed 
binaries (assuming I define some sensible way for the kernel to make 
that distinction).


That isn't necessarily the right solution, of course -- either at a 
technical level or in the light of the LSB process and other 
inter-distro politics.  But maybe it's at least a more plausible use 
case for 2012 than NFS-mounting /usr/local on a mix of sun4c, sun4u, and 
IRIX workstations.  (That never did work quite right ...)


Cheers,
- Michael



Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Uros Bizjak
On Tue, Apr 10, 2012 at 4:20 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Apr 10, 2012 at 06:12:08PM +0400, Kirill Yukhin wrote:
 Attached patch implements HLE support for __atomic_compare_exchange_n.

 The target hook is definitely not appropriate, just define it in
 ix86_target_macros in i386-c.c instead or so.

Also, I think it is better to pass operand that holds the model
constant to the final insn and conditionally output xacquire/xrelease
based on INTVAL of this operand.

Something like in attached patch, but probably with a helper function in i386.c.

Uros.
Index: config/i386/sync.md
===
--- config/i386/sync.md (revision 186282)
+++ config/i386/sync.md (working copy)
@@ -315,8 +315,9 @@
(match_operand:SI 7 const_int_operand)]   ;; failure model
   TARGET_CMPXCHG
 {
-  emit_insn (gen_atomic_compare_and_swap_singlemode
-(operands[1], operands[2], operands[3], operands[4]));
+  emit_insn
+   (gen_atomic_compare_and_swap_singlemode
+(operands[1], operands[2], operands[3], operands[4], operands[6]));
   ix86_expand_setcc (operands[0], EQ, gen_rtx_REG (CCZmode, FLAGS_REG),
 const0_rtx);
   DONE;
@@ -344,8 +345,9 @@
 {
   if (MODEmode == DImode  TARGET_64BIT)
 {
-  emit_insn (gen_atomic_compare_and_swap_singledi
-(operands[1], operands[2], operands[3], operands[4]));
+  emit_insn
+   (gen_atomic_compare_and_swap_singledi
+   (operands[1], operands[2], operands[3], operands[4], operands[6]));
 }
   else
 {
@@ -370,7 +372,7 @@
mem = replace_equiv_address (mem, force_reg (Pmode, XEXP (mem, 0)));
 
   emit_insn (gen_atomic_compare_and_swap_doublemode
-(lo_o, hi_o, mem, lo_e, hi_e, lo_n, hi_n));
+(lo_o, hi_o, mem, lo_e, hi_e, lo_n, hi_n, operands[6]));
 }
   ix86_expand_setcc (operands[0], EQ, gen_rtx_REG (CCZmode, FLAGS_REG),
 const0_rtx);
@@ -382,15 +384,24 @@
(unspec_volatile:SWI
  [(match_operand:SWI 1 memory_operand +m)
   (match_operand:SWI 2 register_operand 0)
-  (match_operand:SWI 3 register_operand r)]
+  (match_operand:SWI 3 register_operand r)
+  (match_operand:SI 4 const_int_operand)]
  UNSPECV_CMPXCHG_1))
(set (match_dup 1)
(unspec_volatile:SWI [(const_int 0)] UNSPECV_CMPXCHG_2))
(set (reg:CCZ FLAGS_REG)
 (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG_3))]
   TARGET_CMPXCHG
-  lock{%;} cmpxchg{imodesuffix}\t{%3, %1|%1, %3})
+{
+  static char buf[128];
+  const char *hle
+= (INTVAL (operands[4])  8) ? xacquire  : ;
 
+  snprintf (buf, sizeof (buf),
+   lock{%;} %scmpxchg{imodesuffix}\t{%3, %1|%1, %3}, hle);
+  return buf;
+})
+
 ;; For double-word compare and swap, we are obliged to play tricks with
 ;; the input newval (op5:op6) because the Intel register numbering does
 ;; not match the gcc register numbering, so the pair must be CX:BX.
@@ -403,7 +414,8 @@
   (match_operand:DCASHMODE 3 register_operand 0)
   (match_operand:DCASHMODE 4 register_operand 1)
   (match_operand:DCASHMODE 5 register_operand b)
-  (match_operand:DCASHMODE 6 register_operand c)]
+  (match_operand:DCASHMODE 6 register_operand c)
+  (match_operand:SI 7 const_int_operand)]
  UNSPECV_CMPXCHG_1))
(set (match_operand:DCASHMODE 1 register_operand =d)
(unspec_volatile:DCASHMODE [(const_int 0)] UNSPECV_CMPXCHG_2))
@@ -412,8 +424,16 @@
(set (reg:CCZ FLAGS_REG)
 (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG_4))]
   
-  lock{%;} cmpxchgdoublemodesuffixb\t%2)
+{
+  static char buf[128];
+  const char *hle
+= (INTVAL (operands[7])  8) ? xacquire  : ;
 
+  snprintf (buf, sizeof (buf),
+   lock{%;} %scmpxchgdoublemodesuffixb\t%2, hle);
+  return buf;
+})
+
 ;; Theoretically we'd like to use constraint r (any reg) for op5,
 ;; but that includes ecx.  If op5 and op6 are the same (like when
 ;; the input is -1LL) GCC might chose to allocate op5 to ecx, like


Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop

2012-04-10 Thread Kai Tietz
Hi,

issue was caused by the patch for PR middle-end/52772, which just
handles dw2.  This patch makes sure that for sjlj we add new generated
bb's for dispatching table to loop, too.

ChangeLog

2012-04-10  Kai Tietz  kti...@redhat.com

PR c++/52924
* except.c (sjlj_emit_dispatch_table): Add bb to loop,
as done for dwarf2.

Bootstrapped for x64_64-w64-mingw32, and x86_64-unknown-linux-gnu.
Later for all languages including Ada and Obj-C++.

Ok for apply?

Regards,
Kai

Index: except.c
===
--- except.c(revision 186288)
+++ except.c(working copy)
@@ -1344,6 +1344,16 @@
e = make_edge (bb, bb-next_bb, EDGE_FALLTHRU);
e-count = bb-count;
e-probability = REG_BR_PROB_BASE;
+   if (current_loops)
+ {
+   struct loop *loop = bb-next_bb-loop_father;
+   /* If we created a pre-header block, add the new block to the
+  outer loop, otherwise to the loop itself.  */
+   if (bb-next_bb == loop-header)
+ add_bb_to_loop (bb, loop_outer (loop));
+   else
+ add_bb_to_loop (bb, loop);
+ }

disp_index++;
   }
@@ -1365,6 +1375,17 @@
   e-count = bb-count;
   e-probability = REG_BR_PROB_BASE;
 }
+
+  if (current_loops)
+{
+  struct loop *loop = bb-next_bb-loop_father;
+  /* If we created a pre-header block, add the new block to the
+outer loop, otherwise to the loop itself.  */
+  if (bb-next_bb == loop-header)
+   add_bb_to_loop (bb, loop_outer (loop));
+  else
+   add_bb_to_loop (bb, loop);
+}
 }

 static void


Re: [PATCH, rs6000] Fix PR16458, eliminate redudant compares

2012-04-10 Thread Peter Bergner
On Mon, 2012-01-30 at 10:46 +0100, Richard Guenther wrote:
 On Fri, Jan 27, 2012 at 5:56 PM, Peter Bergner berg...@vnet.ibm.com wrote:
  This patch fixes PR16458 by using the type expression attached to a reg
  rtx to detect its signedness and generating unsigned compares when
  appropriate.  However, we continue to use signed compares for the
  special case of when we compare an unsigned reg against the constant 0.
  This is because signed and unsigned compares give the same results
  for equality and (unsigned)x  0) is equivalent to x != 0.
  Using a signed compare in this special case allows us to continue to
  generate record form instructions (ie, instructions that implicitly
  set cr0).
[snip]
 
 This does not sound suitable for stage4.  rs6000_unsigned_reg_p
 looks suspiciously non-rs6000 specific, and if we really can rely
 on the way it is implemented should find its way to general RTL
 helper routines.

Now that we're in stage1 again, I'm attaching an updated patch that moves
rs6000_unsigned_reg_p into an arch independent RTL file like you wanted.
I have also attached a couple of patch hunks from Michael that sets register
attributes in a couple of cases we didn't before and that allows my patch
to coalesce the compares in pr16458-4.c.  The description of that issue is
located in:

http://gcc.gnu.org/ml/gcc/2012-01/msg00339.html


The updated and expanded patch passes bootstrap and regtesting with no
regessions on powerpc64-linux.  Ok for mainline now?


Peter


2012-mm-dd  Peter Bergner  berg...@vnet.ibm.com
Michael Matz  m...@suse.de

gcc/
PR target/16458
* rtlanal.c (unsigned_reg_p): New function.
* rtl.h (unsigned_reg_p): Prototype it.
* config/rs6000/rs6000.c (rs6000_generate_compare): Use it.
Update comment.
* expr.c (expand_expr_real_1): Set register attributes.
* stmt.c (expand_case): Likewise.

gcc/testsuite/
PR target/16458
* gcc.target/powerpc/pr16458-1.c: New test.
* gcc.target/powerpc/pr16458-2.c: Likewise.
* gcc.target/powerpc/pr16458-3.c: Likewise.
* gcc.target/powerpc/pr16458-4.c: Likewise.


Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 186244)
+++ gcc/rtlanal.c   (working copy)
@@ -636,6 +636,25 @@ count_occurrences (const_rtx x, const_rt
 }
 
 
+/* Return TRUE if OP is a register or subreg of a register that
+   holds an unsigned quantity.  Otherwise, return FALSE.  */
+
+bool
+unsigned_reg_p (rtx op)
+{
+  if (REG_P (op)
+   REG_EXPR (op)
+   TYPE_UNSIGNED (TREE_TYPE (REG_EXPR (op
+return true;
+
+  if (GET_CODE (op) == SUBREG
+   SUBREG_PROMOTED_UNSIGNED_P (op))
+return true;
+
+  return false;
+}
+
+
 /* Nonzero if register REG appears somewhere within IN.
Also works if REG is not a register; in this case it checks
for a subexpression of IN that is Lisp equal to REG.  */
Index: gcc/rtl.h
===
--- gcc/rtl.h   (revision 186244)
+++ gcc/rtl.h   (working copy)
@@ -1909,6 +1909,7 @@ extern HOST_WIDE_INT get_integer_term (c
 extern rtx get_related_value (const_rtx);
 extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT);
 extern void split_const (rtx, rtx *, rtx *);
+extern bool unsigned_reg_p (rtx);
 extern int reg_mentioned_p (const_rtx, const_rtx);
 extern int count_occurrences (const_rtx, const_rtx, int);
 extern int reg_referenced_p (const_rtx, const_rtx);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 186244)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15570,14 +15570,11 @@ rs6000_generate_compare (rtx cmp, enum m
   || code == GEU || code == LEU)
 comp_mode = CCUNSmode;
   else if ((code == EQ || code == NE)
-   GET_CODE (op0) == SUBREG
-   GET_CODE (op1) == SUBREG
-   SUBREG_PROMOTED_UNSIGNED_P (op0)
-   SUBREG_PROMOTED_UNSIGNED_P (op1))
+   unsigned_reg_p (op0)
+   (unsigned_reg_p (op1)
+  || (CONST_INT_P (op1)  INTVAL (op1) != 0)))
 /* These are unsigned values, perhaps there will be a later
-   ordering compare that can be shared with this one.
-   Unfortunately we cannot detect the signedness of the operands
-   for non-subregs.  */
+   ordering compare that can be shared with this one.  */
 comp_mode = CCUNSmode;
   else
 comp_mode = CCmode;
Index: gcc/expr.c
===
--- gcc/expr.c  (revision 186244)
+++ gcc/expr.c  (working copy)
@@ -9015,8 +9015,13 @@ expand_expr_real_1 (tree exp, rtx target
   stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp)))
g = SSA_NAME_DEF_STMT (exp);
   if (g)
-   return expand_expr_real (gimple_assign_rhs_to_tree (g), target, tmode,
-

Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Uros Bizjak
Hello!

  This is wrong since HLE ACQUIRE/RELEASE has nothing to do with
  C++ atomic acquire/release. You can have HLE RELEASE with C++
  atomic acquire.

 It makes sense to combine the two. On x86 C++ atomic acquire/release
 means the compiler cannot move references outside. For HLE
 we really want the same, otherwise some of the memory references
 inside the transaction may not be transactional.

 So I think HLE_ACQUIRE should imply C++ acquire
 and HLE_RELEASE imply C++ release.

In this case, can we reverse this sentence and just emit lock
xacquire for MEMMODEL_ACQUIRE and lock xrelease for
MEMMODEL_RELEASE ? Do we need separate HLE_* defines or can we somehow
recycle existing C++11 memmodel defines?

Uros.


Re: Added better handling of outdated profile data. (issue 5989046)

2012-04-10 Thread asharif

On 2012/04/05 18:53:28, asharif wrote:

Jan, please take a look and provide some feedback.



Thanks,


Ping?

http://codereview.appspot.com/5989046/


Re: [patch] Fix PR52822 (stable_partition move-assigns object to itself) in trunk, 4.7, and 4.6

2012-04-10 Thread Paolo Carlini
.. ah: remember to add PR libstdc++/52822 at the beginning of the 
ChangeLog entry, otherwise Bugzilla will not pick your commit!


Thanks,
Paolo.


[PATCH] Atom: Enabling unroll at O2 optimization level

2012-04-10 Thread Igor Zamyatin
Hi All!

Here is a patch that enables unroll at O2 for Atom.

This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32
bits) with quite moderate code size increase (~5% for EEMBC2.0, 32
bits).

Tested for i386 and x86-64, ok for trunk?

Thanks,
Igor

ChangeLog:

2012-04-10  Yakovlev Vladimir  vladimir.b.yakov...@intel.com

       * gcc/config/i386/i386.c (check_imull): New routine.
       (ix86_loop_unroll_adjust): New target hook.
       (ix86_option_override_internal): Enable unrolling on Atom at -O2.
       (TARGET_LOOP_UNROLL_ADJUST): New define.


unroll.patch
Description: Binary data


Re: [PATCH, rs6000] Fix PR16458, eliminate redudant compares

2012-04-10 Thread David Edelsohn
On Tue, Apr 10, 2012 at 1:36 PM, Peter Bergner berg...@vnet.ibm.com wrote:

 2012-mm-dd  Peter Bergner  berg...@vnet.ibm.com
            Michael Matz  m...@suse.de

 gcc/
        PR target/16458
        * rtlanal.c (unsigned_reg_p): New function.
        * rtl.h (unsigned_reg_p): Prototype it.
        * config/rs6000/rs6000.c (rs6000_generate_compare): Use it.
        Update comment.
        * expr.c (expand_expr_real_1): Set register attributes.
        * stmt.c (expand_case): Likewise.

 gcc/testsuite/
        PR target/16458
        * gcc.target/powerpc/pr16458-1.c: New test.
        * gcc.target/powerpc/pr16458-2.c: Likewise.
        * gcc.target/powerpc/pr16458-3.c: Likewise.
        * gcc.target/powerpc/pr16458-4.c: Likewise.

The rs6000 and testsuite parts of the patch are okay with me.

Thanks, David


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Gabriel Dos Reis
On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez
lopeziba...@gmail.com wrote:
 On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote:
 On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote:

 * It uses  the default cutoff as max_width, whatever it is (as
 controlled by -fmessage-length).
 * It uses the pretty-printer. The text cannot (should not) wrap
 because we still print only max_width chars at most.


 Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want
 to use COLUMNS to limit how much of the source we print.

 Like this?

There is a novelty in this new version that I don't think we discussed
before: automatic expansion of tabs to 8 hard space characters.  That
number should not be hardcoded as there is no reason to believe a tab
character always expands to 8 space characters.  You should check
the environment first; if not present the default expansion number should
be a symbolic constant as opposed to a magic number sprinkled all over the
places.  I would also encourage you to introduce more abstraction to
reduce the size of diagnostic_show_locus.


follow up patch to clean up TODO_dump_func

2012-04-10 Thread Xinliang David Li
Hi Richard, this is a follow up patch for more cleanups.

Bootstrap and tested on x86-64/linux.

Ok for trunk?

thanks,

David


todo_dump.p
Description: Binary data


cg
Description: Binary data


Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings

2012-04-10 Thread NightStrike
On Tue, Apr 10, 2012 at 10:38 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Tue, Apr 10, 2012 at 3:06 PM, JonY jo...@users.sourceforge.net wrote:
 On 4/10/2012 20:37, Richard Guenther wrote:
 On Tue, Apr 10, 2012 at 2:15 PM, JonY wrote:
 Hi,

 Patch OK?

 What kind of warning?


 Oops, I forgot to mention gcc was complaining about missing braces.

 Not for me.  And I fail to see why it should.

This is the warning:

../../../../build/gcc/src/gcc/tree-parloops.c:724: warning: suggest
explicit braces to avoid ambiguous `else'


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Jason Merrill

On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote:

+  max_width = context-caret_max_width;
+  if (max_width= 0)
+max_width = INT_MAX;


I don't think we need the test here; diagnostic_set_caret_max_width 
should make sure caret_max_width is set sensibly.  Otherwise, yes, thanks.


On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote:

There is a novelty in this new version that I don't think we discussed
before: automatic expansion of tabs to 8 hard space characters.  That
number should not be hardcoded as there is no reason to believe a tab
character always expands to 8 space characters.  You should check
the environment first; if not present the default expansion number should
be a symbolic constant as opposed to a magic number sprinkled all over the
places.


Hmm.  I don't know if there's any reliable way to query tab stops; the 
it termcap/terminfo capability tells you what it was originally 
(presumably 8), but it might have changed.


Jason


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Manuel López-Ibáñez
On 10 April 2012 20:52, Gabriel Dos Reis g...@integrable-solutions.net wrote:
 On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez
 lopeziba...@gmail.com wrote:
 On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote:
 On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote:

 * It uses  the default cutoff as max_width, whatever it is (as
 controlled by -fmessage-length).
 * It uses the pretty-printer. The text cannot (should not) wrap
 because we still print only max_width chars at most.


 Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want
 to use COLUMNS to limit how much of the source we print.

 Like this?

 There is a novelty in this new version that I don't think we discussed
 before: automatic expansion of tabs to 8 hard space characters.  That
 number should not be hardcoded as there is no reason to believe a tab
 character always expands to 8 space characters.  You should check
 the environment first; if not present the default expansion number should
 be a symbolic constant as opposed to a magic number sprinkled all over the
 places.  I would also encourage you to introduce more abstraction to
 reduce the size of diagnostic_show_locus.

There is no novelty, it has been there from the beginning, and now I
realize it was a mistake to do the extra work to implement this.

The GCS says: Calculate column numbers assuming that space and all
ASCII printing characters have equal width, and assuming tab stops
every 8 columns
http://www.gnu.org/prep/standards/standards.html#Errors

So, in fact, libcpp is buggy for not implementing this (as can be seen
in emacs). If/When libcpp is fixed, the column info will be correct
for tabs. And then, one may care about printing tabs as anything
different from a single space. But until then, a single space is a
good as anything. In fact this is what is done in c-ppoutput.c.

As can be seen below, this simplifies a lot the function, so I
appreciate the suggestion. If you have suggestions for removing other
code, I will be happy to implement them. The less code, the less
potential for bugs.

Cheers,

Manuel.

+/* Print the physical source line corresponding to the location of
+   this diagnostics, and a caret indicating the precise column.  */
+void
+diagnostic_show_locus (diagnostic_context * context,
+  const diagnostic_info *diagnostic)
+{
+  const char *line;
+  char *buffer;
+  expanded_location s;
+  int real_width;
+  int max_width;
+  const char *saved_prefix;
+  int right_margin = 10;
+
+  if (!context-show_caret
+  || diagnostic-location = BUILTINS_LOCATION)
+return;
+
+  s = expand_location(diagnostic-location);
+  line = location_get_source_line (s);
+  if (line == NULL)
+return;
+
+  real_width = strlen (line);
+
+  max_width = context-caret_max_width;
+  if (max_width = 0)
+max_width = INT_MAX;
+
+  /* If the line is longer than max_width and the column is too close
+ to max_width, skip part of the line until it is not so close.  */
+  right_margin = MIN(real_width - s.column, right_margin);
+  right_margin = max_width - right_margin;
+  if (real_width = max_width  s.column  right_margin)
+{
+  line += s.column - right_margin;
+  s.column = right_margin;
+}
+
+  pp_newline (context-printer);
+  saved_prefix = pp_get_prefix (context-printer);
+  pp_set_prefix (context-printer, NULL);
+  pp_character (context-printer, ' ');
+  while (max_width  0  *line != '\0')
+{
+  char c = *line == '\t' ? ' ' : *line;
+  pp_character (context-printer, c);
+  max_width--;
+  line++;
+}
+  pp_newline (context-printer);
+  /* pp_printf does not implement %*c.  */
+  buffer = XALLOCAVEC (char, s.column + 3);
+  snprintf (buffer, s.column + 3,  %*c, s.column, '^');
+  pp_string (context-printer, buffer);
+  pp_set_prefix (context-printer, saved_prefix);
+}
+


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Manuel López-Ibáñez
On 10 April 2012 21:31, Jason Merrill ja...@redhat.com wrote:
 On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote:

 +  max_width = context-caret_max_width;
 +  if (max_width= 0)
 +    max_width = INT_MAX;


 I don't think we need the test here; diagnostic_set_caret_max_width should
 make sure caret_max_width is set sensibly.  Otherwise, yes, thanks.

It was just a bit of defensive programming, since nothing stops anyone
to set the value directly. But I will remove it.

 On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote:

 There is a novelty in this new version that I don't think we discussed
 before: automatic expansion of tabs to 8 hard space characters.  That
 number should not be hardcoded as there is no reason to believe a tab
 character always expands to 8 space characters.  You should check
 the environment first; if not present the default expansion number should
 be a symbolic constant as opposed to a magic number sprinkled all over the
 places.


 Hmm.  I don't know if there's any reliable way to query tab stops; the it
 termcap/terminfo capability tells you what it was originally (presumably 8),
 but it might have changed.

No idea either, but it is in fact easier to print tabs a single
spaces. This simplifies the code a lot, as pointed by Gabriel. So if
you also prefer the simpler version, I am fine with committing that
one (it also saves space in the output).

Cheers,

Manuel.


[PATCH 04/11] Fix expansion point loc for macro-like tokens

2012-04-10 Thread Dodji Seketeli
Consider the test case gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c.
Its interesting part is:

#define A(x) vari x /* line 7.  */
#define vari(x)
#define B , varj
int A(B) ;  /* line 10.  */

In its initial version, this test was being pre-processed as:

# 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
# 1 build/gcc//
# 1 command-line
# 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
# 10 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
int
# 7 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
 vari

, varj ;

Note how int and vari are on separate lines, whereas int and
, varj are on the same line.

This looks like a bug to me, even independantly from the macro
location tracking work.

With macro location tracking turned on, the preprocessed output
becomes:

# 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
# 1 command-line
# 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
# 10 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
int vari , varj ;

Which, IMO, is what we'd expect.

This is due to an unexpected side effect of enter_macro_context when
passed a token that might look like a function-like macro at first
sight, but that it eventually considers to not be a macro after all.

This is the case for the vari token which looks like a macro when it
is first lexed, but is eventually considered to be a normal token by
enter_macro_context because it's not used as a function-like macro
invocation.

In that case, besides returning NULL, enter_macro_context sets
pfile-context-c.macro to NULL, making cpp_get_token_1 forget to set
the location of the vari to the expansion point of A.

Please look at the extensive comments I have added to cpp_get_token_1
to understand the details.

The easy fix here is to speculatively set the location of the returned
token before calling enter_macro_context.

Tested on x86_64-unknown-linux-gnu against trunk.  Now this test has
the same output with and without tracking locations accross macro
expansions.

Note that the bootstrap with -ftrack-macro-expansion exhibits other
separate issues that are addressed in subsequent patches.  This patch
just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

libcpp/
* macro.c (enter_macro_context): Update comment.
(cpp_get_token_1): Set virtual location before calling
enter_macro_context.

gcc/testsuite/

* gcc.dg/debug/dwarf2/pr41445-5.c: Adjust.
* gcc.dg/debug/dwarf2/pr41445-6.c: Likewise.

---
 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c |5 ++-
 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c |5 ++-
 libcpp/macro.c|   43 +++-
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
index 03af604..d21acd5 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c
@@ -9,6 +9,9 @@
 #define B , varj
 int A(B) ;
 
-/* { dg-final { scan-assembler 
DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0x)?7\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line
 } } */
+/*  We want to check that both vari and varj have the same line
+number.  */
+
+/* { dg-final { scan-assembler 
DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line
 } } */
 /* { dg-final { scan-assembler 
DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\varj\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line
 } } */
 /* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c 
b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c
index 8aa37d1..d6d79cc 100644
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c
@@ -4,5 +4,8 @@
 
 #include pr41445-5.c
 
-/* { dg-final { scan-assembler 
DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0x)?7\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line
 } } */
+/*  We want to check that both vari and varj have the same line
+number.  */
+
+/* { dg-final { scan-assembler 
DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)?\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line
 } } */
 /* { dg-final { scan-assembler 

Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Andi Kleen
 In this case, can we reverse this sentence and just emit lock
 xacquire for MEMMODEL_ACQUIRE and lock xrelease for
 MEMMODEL_RELEASE ? Do we need separate HLE_* defines or can we somehow
 recycle existing C++11 memmodel defines?

No you absolutely can't. Transactions are quite different from a normal
lock. There can be good reasons to have locks that never speculates
(e.g. if they do some operation that always aborts)

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.


[PATCH 05/11] Make expand_location resolve to locus in main source file

2012-04-10 Thread Dodji Seketeli
Apparently, quite some places in the compiler (like the C/C++
preprocessor, the debug info machinery) expect expand_location to
resolve to locations that are in the main source file, even if the
token at stake comes from a macro that was defined in a header
somewhere.  Turning on -ftrack-macro-expansion by default was
triggering a lot of failures (not necessarily related to diagnostics)
because expand_location resolves to spelling locations instead.

So I have changed expand_location to honour the initial expectation.

In addition, I came up with the new expand_location_to_spelling_point
used in diagnostic_build_prefix because the diagnostic system, on the
other hand, wants to point to the location of the token where it was
spelled, and then display the error context involving all the macro
whose expansion led to that spelling point - if we are in the context
of a macro expansion there.

This seems to me like a reasonable balance.

Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk and
whitnessed that a lot more tests were PASSing.

Note that the bootstrap with -ftrack-macro-expansion exhibits other
separate issues that are addressed in subsequent patches.  This patch
just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

gcc/

* input.c (expand_location_1): New.  Takes a parameter to choose
whether to resolve the location to spelling or expansion point.
Was factorized from ...
(expand_location): ... here.
(expand_location_to_spelling_point): New.  Implemented in terms of
expand_location_1.
* diagnostic.c (diagnostic_build_prefix): Use the new
expand_location_to_spelling_point instead of expand_location.
---
 gcc/diagnostic.c |2 +-
 gcc/input.c  |   39 ++-
 gcc/input.h  |9 +
 3 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index a68d6ce..337328c 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -183,7 +183,7 @@ diagnostic_build_prefix (diagnostic_context *context,
 must-not-happen
   };
   const char *text = _(diagnostic_kind_text[diagnostic-kind]);
-  expanded_location s = expand_location (diagnostic-location);
+  expanded_location s = expand_location_to_spelling_point 
(diagnostic-location);
   if (diagnostic-override_column)
 s.column = diagnostic-override_column;
   gcc_assert (diagnostic-kind  DK_LAST_DIAGNOSTIC_KIND);
diff --git a/gcc/input.c b/gcc/input.c
index 4077f9e..dcd348b 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -32,16 +32,22 @@ struct line_maps *line_table;
 
 /* Expand the source location LOC into a human readable location.  If
LOC resolves to a builtin location, the file name of the readable
-   location is set to the string built-in.  */
-
-expanded_location
-expand_location (source_location loc)
+   location is set to the string built-in. If EXPANSION_POINT_P is
+   TRUE and LOC is virtual, then it is resolved to the expansion
+   point of the involved macro.  Otherwise, it is resolved to the
+   spelling location of the token.  */
+
+static expanded_location
+expand_location_1 (source_location loc,
+  bool expansion_point_p)
 {
   expanded_location xloc;
   const struct line_map *map;
 
   loc = linemap_resolve_location (line_table, loc,
- LRK_SPELLING_LOCATION, map);
+ expansion_point_p
+ ? LRK_MACRO_EXPANSION_POINT
+ : LRK_SPELLING_LOCATION, map);
   xloc = linemap_expand_location (line_table, map, loc);
 
   if (loc = BUILTINS_LOCATION)
@@ -50,6 +56,29 @@ expand_location (source_location loc)
   return xloc;
 }
 
+/* Expand the source location LOC into a human readable location.  If
+   LOC is virtual, it resolves to the expansion point of the involved
+   macro.  If LOC resolves to a builtin location, the file name of the
+   readable location is set to the string built-in.  */
+
+expanded_location
+expand_location (source_location loc)
+{
+  return expand_location_1 (loc, /*expansion_point_p=*/true);
+}
+
+/* Expand the source location LOC into a human readable location.  If
+   LOC is virtual, it resolves to the expansion location of the
+   relevant macro.  If LOC resolves to a builtin location, the file
+   name of the readable location is set to the string
+   built-in.  */
+
+expanded_location
+expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_1 (loc, /*expansion_piont_p=*/false);
+}
+
 #define ONE_K 1024
 #define ONE_M (ONE_K * ONE_K)
 
diff --git a/gcc/input.h b/gcc/input.h
index f2f3513..5838153 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -38,6 +38,7 @@ extern char builtins_location_check[(BUILTINS_LOCATION
  RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location 

[PATCH 06/11] Strip built-in loc from displayed expansion context

2012-04-10 Thread Dodji Seketeli
Now that diagnostics for tokens coming from macro expansions point to
the spelling location of the relevant token (and then displays the
context of the expansion), some ugly (not so seldom) corner cases can
happen.

When the relevant token is a built-in token (which means the location
of that token is BUILTINS_LOCATION) the location prefix displayed to
the user in the diagnostic line is the built-in:0:0 string.  For
instance:

built-in:0:0: warning: conversion to 'float' alters 'int' constant value

For the user, I think this is surprising and useless.

A more user-friendly approach would be to refer to the first location
that (in the reported macro expansion context) is for a location in
real source code, like what is shown in the new test case
gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C accompanying
this patch.

To do this, I am making the line-map module provide a new
linemap_unwind_to_first_non_reserved_loc function that resolves a
virtual location to the first spelling location that is in real source
code.

I am then using that facility in the diagnostics printing module and
in the macro unwinder to avoid printing diagnostics lines that refer
to the locations for built-ins or more generally for reserved
locations.  Note that when I start the dance of skipping a built-in
location I also skip locations that are in system headers, because it
turned out that a lot of those built-ins are actually used in system
headers (e.g, #define INT_MAX __INT_MAX__ where __INT_MAX__ is a
built-in).

Besides the user-friendliness gain, this patch allows a number of
regression tests to PASS unchanged with and without
-ftrack-macro-expansion.

Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

Note that the bootstrap with -ftrack-macro-expansion exhibits other
separate issues that are addressed in subsequent patches.  This patch
just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

libcpp/

* include/line-map.h (linemap_unwind_toward_expansion): Fix typo
in comment.
(linemap_unwind_to_first_non_reserved_loc): Declare new function.
* line-map.c (linemap_unwind_to_first_non_reserved_loc): Define
new function.

gcc/

* input.c (expand_location_1): When expanding to spelling location
in a context of a macro expansion, skip reserved system header
locations.  Update comments.  * tree-diagnostic.c
(maybe_unwind_expanded_macro_loc): Likewise.

gcc/testsuite/

* g++.dg/warn/Wconversion-real-integer2.C: New test.
* g++.dg/warn/Wconversion-real-integer-3.C: Likewise.
* g++.dg/warn/conversion-real-integer-3.h: New header used by the
new test above.
---
 gcc/input.c|   38 ---
 .../g++.dg/warn/Wconversion-real-integer-3.C   |   20 
 .../g++.dg/warn/Wconversion-real-integer2.C|   33 +
 .../g++.dg/warn/conversion-real-integer-3.h|3 +
 gcc/tree-diagnostic.c  |   12 +
 libcpp/include/line-map.h  |   20 -
 libcpp/line-map.c  |   49 
 7 files changed, 167 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wconversion-real-integer-3.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C
 create mode 100644 gcc/testsuite/g++.dg/warn/conversion-real-integer-3.h

diff --git a/gcc/input.c b/gcc/input.c
index dcd348b..8edf05b 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -35,7 +35,14 @@ struct line_maps *line_table;
location is set to the string built-in. If EXPANSION_POINT_P is
TRUE and LOC is virtual, then it is resolved to the expansion
point of the involved macro.  Otherwise, it is resolved to the
-   spelling location of the token.  */
+   spelling location of the token.
+
+   When resolving to the spelling location of the token, if the
+   resulting location is for a built-in location (that is, it has no
+   associated line/column) in the context of a macro expansion, the
+   returned location is the first one (while unwinding the macro
+   location towards its expansion point) that is in real source
+   code.  */
 
 static expanded_location
 expand_location_1 (source_location loc,
@@ -43,12 +50,29 @@ expand_location_1 (source_location loc,
 {
   expanded_location xloc;
   const struct line_map *map;
-
-  loc = linemap_resolve_location (line_table, loc,
- expansion_point_p
- ? LRK_MACRO_EXPANSION_POINT
- : LRK_SPELLING_LOCATION, map);
-  xloc = linemap_expand_location (line_table, map, loc);
+  enum location_resolution_kind lrk = LRK_MACRO_EXPANSION_POINT;
+
+  memset (xloc, 0, sizeof (xloc));
+
+  if (loc = RESERVED_LOCATION_COUNT)
+{
+  if (!expansion_point_p)
+   {
+ 

Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread H.J. Lu
On Tue, Apr 10, 2012 at 9:34 AM, Andi Kleen a...@firstfloor.org wrote:
 H.J. Lu hjl.to...@gmail.com writes:

 So, to emit HLE prefix, it is possible to do:
 int
 foo2 (int *p, int oldv, int newv)
 {
  __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE |
 __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE);
  return oldv;
 }

 This is wrong since HLE ACQUIRE/RELEASE has nothing to do with
 C++ atomic acquire/release. You can have HLE RELEASE with C++
 atomic acquire.

 It makes sense to combine the two. On x86 C++ atomic acquire/release
 means the compiler cannot move references outside. For HLE
 we really want the same, otherwise some of the memory references
 inside the transaction may not be transactional.

 So I think HLE_ACQUIRE should imply C++ acquire
 and HLE_RELEASE imply C++ release.


If it is the case, can we generate HLE RELEASE/ACQUIRE
prefix automatically for C++ atomics via -mhle command line option.
Then you don't need to modify the source codes to enable HLE support.

-- 
H.J.


Re: [i386, patch, RFC] HLE support in GCC

2012-04-10 Thread Andi Kleen
 If it is the case, can we generate HLE RELEASE/ACQUIRE
 prefix automatically for C++ atomics via -mhle command line option.
 Then you don't need to modify the source codes to enable HLE support.

No, for HLE someone needs to decide whether HLE is beneficial for a given
lock.  There are cases where it is not or even wrong (e.g. if there is no
matching unlock)

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: [C++ Patch] for c++/52465

2012-04-10 Thread Fabien Chêne
2012/4/7 Jason Merrill ja...@redhat.com:
 On 04/07/2012 11:37 AM, Fabien Chêne wrote:

 Perhaps it is more correct like that, in cp_parser_set_decl_spec_type ?

 Even that seems late.  Why not just return the target decl from
 cp_parser_class_name?

Ah yes, that's slightly better.
(I've kept the NULL check in strip_using_decl, it seems safer to me.
Just tell me if you prefer not)
Tested x86_64-unkown-linux-gnu. OK for trunk and 4.7 ?

gcc/testsuite/ChangeLog

2012-03-08  Fabien Chêne  fab...@gcc.gnu.org

   PR c++/52465
   * g++.dg/lookup/using52.C: New.

gcc/cp/ChangeLog

2012-03-08  Fabien Chêne  fab...@gcc.gnu.org

   PR c++/52465
   * decl.c (grokdeclarator): Call strip_using_decl.
   * parser.c (cp_parser_class_name): Call strip_using_decl and
   perform some checks on the target decl.
   * name-lookup.c (strip_using_decl): Returns NULL_TREE if the decl
   to be stripped is NULL_TREE.
   (qualify_lookup): Call strip_using_decl and perform some checks on
   the target decl.


-- 
Fabien


52465_3.patch
Description: Binary data


Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Carlos O'Donell
On Tue, Apr 3, 2012 at 6:56 PM, Joseph S. Myers jos...@codesourcery.com wrote:
 (e) Existing practice for cases that do use different dynamic linkers is
 to use a separate library directory, not just dynamic linker name, as in
 lib32 and lib64 for MIPS or libx32 for x32; it's certainly a lot easier to
 make two sets of libraries work in parallel if you have separate library
 directories like that.  So it would seem more appropriate to define a
 directory libhf for ARM (meaning you need a binutils patch as well to
 handle that directory, I think), and these different Debian-style names
 could be implemented separately in a multiarch patch if someone submits
 one that properly accounts for my review comments on previous patch
 versions (failure to produce such a fixed patch being why Debian multiarch
 directory support has not got into GCC so far).

The thread doesn't seem to be wrapping up, instead it appears to go in
circles :-)

As a glibc maintainer, when it comes to this issue, I am guided by:

(1) This is a distribution problem and the solution needs to come from
a consensus among the distributions.

(2) The gcc/glibc community should listen to the distributions.

The distributions have the most experience when it comes to
whole-system issues. I certainly don't have that experience.
Unfortunately *I* give the distributions a B- or C+ for communication.
Please make it easy for me to give you an A. It is exceedingly
difficult for me to review solutions that span multiple patches,
emails, mailing lists, and communities. The best way to avoid
rehashing old problems is to document the design and get sign off from
the interested parties.

If I see uncoordinated and conflicting responses from the
distributions... I get worried.

Is there a proposal on a wiki that begins to summarize the suggestions
and solution?

Cheers,
Carlos.


[PATCH 07/11] Fix -Wuninitialized for -ftrack-macro-expansion

2012-04-10 Thread Dodji Seketeli
Besides the warning emitted by warn_uninit, this function wants
to hint the user at where the uninitialized variable was declared, for
cases where the declaration location is outside the current function.

Now that expand_location expands to the location that is in the main
source file (even for -ftrack-macro-expansion) the hinting part was
not working well for cases where the variable is declared in a macro
(outside the function), which is then expanded in the function.

So I had to adjust warn_uninit a little bit to make it consider the
spelling location of the variable declaration.

I have fixed the test gcc.dg/cpp/pragma-diagnostic-2.c on which I
believe gcc shouldn't emit any error.

Here is the new output on that test:

=~=
gcc.dg/cpp/pragma-diagnostic-2.c: In function 'g':
gcc.dg/cpp/pragma-diagnostic-2.c:10:5: warning: 'a' is used uninitialized in 
this function [-Wuninitialized]
gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: 'a' was declared here
gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: in expansion of macro 
'CODE_WITH_WARNING'
gcc.dg/cpp/pragma-diagnostic-2.c:17:3: note: expanded from here
gcc.dg/cpp/pragma-diagnostic-2.c: In function 'h':
gcc.dg/cpp/pragma-diagnostic-2.c:10:5: warning: 'a' is used uninitialized in 
this function [-Wuninitialized]
gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: 'a' was declared here
gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: in expansion of macro 
'CODE_WITH_WARNING'
gcc.dg/cpp/pragma-diagnostic-2.c:27:3: note: expanded from here
=~=

Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

Note that the bootstrap with -ftrack-macro-expansion turned on
exhibits other separate issues that are addressed in subsequent
patches.  This patch just fixes one class of problems.

The patch does pass bootstrap with -ftrack-macro-expansion turned off,
though.

gcc/
* tree-ssa.c (warn_uninit): Use the spelling location of the
variable declaration.  Use linemap_location_before_p for source
locations.

gcc/testsuite/

* gcc.dg/cpp/pragma-diagnostic-2.c:  Fix this.
---
 gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c |   10 ++
 gcc/tree-ssa.c |   15 +++
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c 
b/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c
index 7ab95b0..57f3f01 100644
--- a/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c
+++ b/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c
@@ -6,8 +6,8 @@
 void f (unsigned);
 
 #define CODE_WITH_WARNING \
-  int a; /* { dg-message expansion|declared here } */  \
-  f (a) /* { dg-message expansion } */
+  int a; /* { dg-message was declared here } */   \
+  f (a) /* { dg-warning used uninitialized } */
 
 #pragma GCC diagnostic ignored -Wuninitialized
 
@@ -26,9 +26,3 @@ h (void)
 {
   CODE_WITH_WARNING;   /* { dg-message expanded } */
 }
-
-/*
-  { dg-message some warnings being treated as errors  {target *-*-*} 0 }
-*/
-
-/* { dg-error uninitialized  { target *-*-* } { 10 } } */
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 08f908f..977f0f7 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1584,7 +1584,7 @@ warn_uninit (enum opt_code wc, tree t,
 tree expr, tree var, const char *gmsgid, void *data)
 {
   gimple context = (gimple) data;
-  location_t location;
+  location_t location, cfun_loc;
   expanded_location xloc, floc;
 
   if (!ssa_undefined_value_p (t))
@@ -1602,8 +1602,12 @@ warn_uninit (enum opt_code wc, tree t,
   location = (context != NULL  gimple_has_location (context))
 ? gimple_location (context)
 : DECL_SOURCE_LOCATION (var);
+  location = linemap_resolve_location (line_table, location,
+  LRK_SPELLING_LOCATION,
+  NULL);
+  cfun_loc = DECL_SOURCE_LOCATION (cfun-decl);
   xloc = expand_location (location);
-  floc = expand_location (DECL_SOURCE_LOCATION (cfun-decl));
+  floc = expand_location (cfun_loc);
   if (warning_at (location, wc, gmsgid, expr))
 {
   TREE_NO_WARNING (expr) = 1;
@@ -1611,8 +1615,11 @@ warn_uninit (enum opt_code wc, tree t,
   if (location == DECL_SOURCE_LOCATION (var))
return;
   if (xloc.file != floc.file
- || xloc.line  floc.line
- || xloc.line  LOCATION_LINE (cfun-function_end_locus))
+ || linemap_location_before_p (line_table,
+   location, cfun_loc)
+ || linemap_location_before_p (line_table,
+   cfun-function_end_locus,
+   location))
inform (DECL_SOURCE_LOCATION (var), %qD was declared here, var);
 }
 }
-- 
Dodji


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Manuel López-Ibáñez
On 10 April 2012 21:41, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 On 10 April 2012 21:31, Jason Merrill ja...@redhat.com wrote:
 On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote:

 +  max_width = context-caret_max_width;
 +  if (max_width= 0)
 +    max_width = INT_MAX;


 I don't think we need the test here; diagnostic_set_caret_max_width should
 make sure caret_max_width is set sensibly.  Otherwise, yes, thanks.

 It was just a bit of defensive programming, since nothing stops anyone
 to set the value directly. But I will remove it.

 On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote:

 There is a novelty in this new version that I don't think we discussed
 before: automatic expansion of tabs to 8 hard space characters.  That
 number should not be hardcoded as there is no reason to believe a tab
 character always expands to 8 space characters.  You should check
 the environment first; if not present the default expansion number should
 be a symbolic constant as opposed to a magic number sprinkled all over the
 places.


 Hmm.  I don't know if there's any reliable way to query tab stops; the it
 termcap/terminfo capability tells you what it was originally (presumably 8),
 but it might have changed.

 No idea either, but it is in fact easier to print tabs a single
 spaces. This simplifies the code a lot, as pointed by Gabriel. So if
 you also prefer the simpler version, I am fine with committing that
 one (it also saves space in the output).

A new version of the patch. It removes the special handling of tabs.
And abstracts out the adjusting of the line as Gabriel suggested. I am
not sure what else could be abstracted out. The function is much
shorter than other functions in diagnostic.c.

This patch builds and I am running a full boostrap+check now.

OK if it passes?

Cheers,

Manuel.


2012-04-05  Manuel López-Ibáñez  m...@gcc.gnu.org

PR 24985
libstdc++-v3/
* testsuite/lib/prune.exp: Handle caret.
libmudflap/
* testsuite/lib/libmudflap.exp: Handle caret.
gcc/
* diagnostic.h (show_caret): Declare.
(caret_max_width): Declare.
(diagnostic_show_locus): Declare.
* diagnostic.c (diagnostic_initialize): Initialize to false.
(diagnostic_show_locus): New.
(diagnostic_report_diagnostic): Call it.
(getenv_columns): New.
(adjust_line): New.
(diagnostic_set_caret_max_width): New.
* input.c (read_line): New.
(location_get_source_line): New.
* input.h (location_get_source_line): Declare.
* toplev.c (general_init): Initialize show_caret from options.
* dwarf2out.c (gen_producer_string): Handle fdiagnostics-show-caret.
* opts.c (common_handle_option): Likewise.
* pretty-print.h (pp_get_prefix): New.
(pp_base_get_prefix): New.
* common.opt (fdiagnostics-show-caret): New option.
* doc/invoke.texi (fdiagnostics-show-caret): Document it.
testsuite/
* lib/prune.exp: Add -fno-diagnostics-show-caret.


caret-diagnostics-20120411.diff
Description: Binary data


Re: Ada testcase CR line endings

2012-04-10 Thread Mike Stump
On Apr 10, 2012, at 5:24 AM, Robert Dewar wrote:
 On 4/10/2012 1:35 AM, Mike Stump wrote:
 So, I'd like to change all the ada testcases to use normal unix line endings.
 
 testsuite/gnat.dg/taft_type2_pkg.ads
 
 is an example if one such file, any objections?
 
 As long as the test is not about line endings this seems fine.

The C testsuite does indeed have such testcases, from the last time I converted 
endings in bulk for the C testsuite.  The Ada testsuite appears to not have any 
such testcase.


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Mike Stump
On Apr 10, 2012, at 11:52 AM, Gabriel Dos Reis wrote:
 On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez
 lopeziba...@gmail.com wrote:
 On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote:
 On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote:
 
 * It uses  the default cutoff as max_width, whatever it is (as
 controlled by -fmessage-length).
 * It uses the pretty-printer. The text cannot (should not) wrap
 because we still print only max_width chars at most.
 
 
 Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want
 to use COLUMNS to limit how much of the source we print.
 
 Like this?
 
 There is a novelty in this new version that I don't think we discussed
 before: automatic expansion of tabs to 8 hard space characters.  That
 number should not be hardcoded as there is no reason to believe a tab
 character always expands to 8 space characters.  You should check
 the environment first; if not present the default expansion number should
 be a symbolic constant as opposed to a magic number sprinkled all over the
 places.  I would also encourage you to introduce more abstraction to
 reduce the size of diagnostic_show_locus.

Knobs have a cost.  Having 1000s of command line options doesn't really make a 
product better.  Our job is to provide what we must and omit what we can.  You 
argue for a knob that no user has asked for or expressed a need for.  I think 
we should have a much higher bar than this.  The reality is, tabs are 8, 
period.  There was a time went editors managed tabs stops by changing column 
into which a tab would move, this ability taken from devices called typewriters 
which had a little metal bit that you moved around where you wanted tab stops.  
The days of there devices called typewriters are over, we no longer cater to 
them.  Today, editors can manage what the key called TAB does, and its 
semantics are disconnected from the file save format.  This file format can be 
set at 8, while the edit function is indent 2, 4, 8 or as complex as what emacs 
does with c-mode.  It is a bad idea to have a file format tab is other than 8, 
and we should encourage all but legacy people to convert to 8.  We do this by 
having a wiki entry and explains the use of pr, and how to set the file save 
format to 8, if we need to.  For legacy people, they don't _want_ a new 
compiler, pretty much by definition.  So, I'd argue for a symbol constant, but, 
a constant nonetheless.  If a large user _had_ serious needs for something 
other than 8, _they_ can change the compiler to:

  #define TAB_COLS (getenv(TABS) ? atoi (genenv(TABS)) : 8)

if they want.  I don't think we should have a knob for this.


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Gabriel Dos Reis
On Tue, Apr 10, 2012 at 5:23 PM, Mike Stump mikest...@comcast.net wrote:
 On Apr 10, 2012, at 11:52 AM, Gabriel Dos Reis wrote:
 On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez
 lopeziba...@gmail.com wrote:
 On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote:
 On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote:

 * It uses  the default cutoff as max_width, whatever it is (as
 controlled by -fmessage-length).
 * It uses the pretty-printer. The text cannot (should not) wrap
 because we still print only max_width chars at most.


 Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want
 to use COLUMNS to limit how much of the source we print.

 Like this?

 There is a novelty in this new version that I don't think we discussed
 before: automatic expansion of tabs to 8 hard space characters.  That
 number should not be hardcoded as there is no reason to believe a tab
 character always expands to 8 space characters.  You should check
 the environment first; if not present the default expansion number should
 be a symbolic constant as opposed to a magic number sprinkled all over the
 places.  I would also encourage you to introduce more abstraction to
 reduce the size of diagnostic_show_locus.

 Knobs have a cost.  Having 1000s of command line options doesn't really make 
 a product better.  Our job is to provide what we must and omit what we can.  
 You argue for a knob that no user has asked for or expressed a need for.  I 
 think we should have a much higher bar than this.  The reality is, tabs are 
 8, period.  There was a time went editors managed tabs stops by changing 
 column into which a tab would move, this ability taken from devices called 
 typewriters which had a little metal bit that you moved around where you 
 wanted tab stops.  The days of there devices called typewriters are over, we 
 no longer cater to them.  Today, editors can manage what the key called TAB 
 does, and its semantics are disconnected from the file save format.  This 
 file format can be set at 8, while the edit function is indent 2, 4, 8 or as 
 complex as what emacs does with c-mode.  It is a bad idea to have a file 
 format tab is other than 8, and we should encourage all but legacy people to 
 convert to 8.  We do this by having a wiki entry and explains the use of pr, 
 and how to set the file save format to 8, if we need to.  For legacy people, 
 they don't _want_ a new compiler, pretty much by definition.  So, I'd argue 
 for a symbol constant, but, a constant nonetheless.  If a large user _had_ 
 serious needs for something other than 8, _they_ can change the compiler to:

  #define TAB_COLS (getenv(TABS) ? atoi (genenv(TABS)) : 8)

 if they want.  I don't think we should have a knob for this.

This is fully mystifying to me and I still do not understand what it
is all about after reading it 8 times.

- Gaby


Re: execute permissions in ada

2012-04-10 Thread Geert Bosch


On Apr 10, 2012, at 1:45, Mike Stump mikest...@comcast.net wrote:

 I assume that was a friendly, please feel free to fix *.ads as well.

Yes, sorry for the terse email. I wasn't quite sure if your message implied 
there was only an issue with *.adb or not and wasn't in a position to check at 
that time. 

  -Geert


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Mike Stump
On Apr 10, 2012, at 4:37 PM, Gabriel Dos Reis wrote:
 You should check the environment first


 I don't think we should have a knob for this.

 This is fully mystifying to me and I still do not understand what it is all 
 about after reading it 8 times.

Let me rephrase, I disagree, we should not check the environment first.

I assume, you meant checking the environment, git getenv, for some named 
variable to control the expansion of tabs.  Yes?


Re: [PATCH] Caret diagnostics

2012-04-10 Thread Jason Merrill

On 04/10/2012 04:33 PM, Manuel López-Ibáñez wrote:

A new version of the patch. It removes the special handling of tabs.
And abstracts out the adjusting of the line as Gabriel suggested. I am
not sure what else could be abstracted out. The function is much
shorter than other functions in diagnostic.c.

This patch builds and I am running a full boostrap+check now.

OK if it passes?


OK.

Jason



Re: [C++ Patch] for c++/52465

2012-04-10 Thread Jason Merrill

On 04/10/2012 04:23 PM, Fabien Chêne wrote:

Ah yes, that's slightly better.
(I've kept the NULL check in strip_using_decl, it seems safer to me.
Just tell me if you prefer not)
Tested x86_64-unkown-linux-gnu. OK for trunk and 4.7 ?

2012-03-08  Fabien Chênefab...@gcc.gnu.org

PR c++/52465
* decl.c (grokdeclarator): Call strip_using_decl.
* parser.c (cp_parser_class_name): Call strip_using_decl and
perform some checks on the target decl.
* name-lookup.c (strip_using_decl): Returns NULL_TREE if the decl
to be stripped is NULL_TREE.
(qualify_lookup): Call strip_using_decl and perform some checks on
the target decl.


Your ChangeLog needs to be adjusted.  :)

The patch is OK, thanks!

Jason



Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Mike Frysinger
On Tuesday 10 April 2012 01:17:36 Adam Conrad wrote:
 On Tue, Apr 10, 2012 at 12:01:57AM -0400, Mike Frysinger wrote:
  On Monday 09 April 2012 19:31:40 Adam Conrad wrote:
   I realize that most people can't see past their own use case to
   understand why a unique location for linkers is helpful, useful, and
   important for some other people's use cases, but you either didn't
   read or chose to ignore why using multilib paths just plain doesn't
   scale past a single base architecture, and why that's a problem for
   people who aren't you.
  
  and as already stated, the proposed paths here, free of multiarch
  subpaths, satisfy the requirements that you've put forth
 
 Like I said, then, you didn't actually read or understand why proposing
 multilib paths doesn't work.  You realize conceptually, I hope, that
 there's no guarantee of uniqueness in lib/lib64/lib32/libsf/libhf once
 you cross the base CPU architecture boundary, right?

i don't see this as a problem

 Sure, I said that /libhf/ld-linux.so.3 would *accidentally* work for
 us right now, due to sheer luck, and you're running with that as saying
 that we clearly have no problem here worth solving.

my point was: it works today and has no clashes.  that satisfies the omg we 
have to ship something $now and satisfies the requirement that only the Debian 
people are putting forth (and has already been violated by many targets): the 
ldso must be unique across all arches/multilibs.

 When the next architecture clashes with linkers on another (hint: it almost
 certainly will), do we get to argue about this all over again in six months,
 instead of codifying a new and saner practice now?

i don't buy that it'll happen that soon (since ldso's don't get generated 
quickly), but that is certainly plenty of time for the Debian project to 
attempt to convince everyone else that multiarch isn't a waste of time.  and 
does so without holding up moving forward with a unified arm hardfloat abi.
-mike


signature.asc
Description: This is a digitally signed message part.


Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-10 Thread Mike Frysinger
On Tuesday 10 April 2012 12:46:49 Michael Edwards wrote:
 That way I can grandfather in binaries with ABI-ignorant
 hard-coded library paths, and still handle ISA variants.  The
 extranoise might be neon, or ssse3

aren't ISA variants handled already by glibc ?  that's what the hwcaps stuff 
does -- you can put optimized versions in ISA-specific subdirs of the normal 
lib paths.  glibc will look for those first before falling back to the common 
libs.
-mike


signature.asc
Description: This is a digitally signed message part.


  1   2   >