date:20120927

[PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-09-27 Thread Gopalasubramanian, Ganesh

Hi All,

This is a fix for PR 51109.

There are three changes

1.  Microcoded instructions are considered as single issue instructions and 
are therefore issued to a separate execution unit.
2.  The multiplier unit is attached to execution unit 1 (ieu1). Since ieu 
is handled as a separate automaton in the patch, separate mult automaton is 
not required.
3.  The integer execution units (2AGUs and 2EXs) are now decoupled. Now, 
they are described as separate automatons.
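
As a rough illustration of why splitting helps, here is a back-of-the-envelope sketch. It is not how genautomata actually counts states, and the function names are hypothetical. It assumes each CPU unit is simply busy or free in a given cycle and ignores multi-cycle reservations: one automaton over n units may then have to distinguish up to 2^n reservation patterns, while independent automata only contribute the sum of their own pattern counts.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on per-cycle reservation patterns when all unit groups
   share ONE automaton: every busy/free combination of all units is a
   distinct pattern, so the group sizes add up in the exponent.  */
unsigned long
combined_patterns (const unsigned *units_per_group, size_t ngroups)
{
  unsigned total_units = 0;
  for (size_t i = 0; i < ngroups; i++)
    total_units += units_per_group[i];
  assert (total_units < 8 * sizeof (unsigned long));
  return 1UL << total_units;
}

/* The same bound when each group gets its OWN automaton: the groups
   are tracked independently, so their pattern counts are summed.  */
unsigned long
split_patterns (const unsigned *units_per_group, size_t ngroups)
{
  unsigned long sum = 0;
  for (size_t i = 0; i < ngroups; i++)
    {
      assert (units_per_group[i] < 8 * sizeof (unsigned long));
      sum += 1UL << units_per_group[i];
    }
  return sum;
}
```

Taking the two IEUs and the two AGUs as groups of size 2 each, the sketch gives 16 patterns for one shared automaton and 8 for two separate ones. The real numbers depend on the reservations in the .md file, which this model does not look at.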

Is it OK for upstream?

Regards
Ganesh

2012-09-27  Ganesh Gopalasubramanian  ganesh.gopalasubraman...@amd.com

PR 51109
* gcc/config/i386/bdver1.md (bdver1_int): Automaton has been 
split to reduce state transitions.
   
Index: gcc/config/i386/bdver1.md
===
--- gcc/config/i386/bdver1.md   (revision 191658)
+++ gcc/config/i386/bdver1.md   (working copy)
@@ -36,7 +36,7 @@
 (define_attr bdver1_decode direct,vector,double
   (const_string direct))

-(define_automaton bdver1,bdver1_int,bdver1_load,bdver1_mult,bdver1_fp)
+(define_automaton bdver1,bdver1_ieu,bdver1_load,bdver1_fp,bdver1_agu)

 (define_cpu_unit bdver1-decode0 bdver1)
 (define_cpu_unit bdver1-decode1 bdver1)
@@ -71,16 +71,14 @@
 | (nothing,(bdver1-decode1 + 
bdver1-decode2


-(define_cpu_unit bdver1-ieu0 bdver1_int)
-(define_cpu_unit bdver1-ieu1 bdver1_int)
+(define_cpu_unit bdver1-ieu0 bdver1_ieu)
+(define_cpu_unit bdver1-ieu1 bdver1_ieu)
 (define_reservation bdver1-ieu (bdver1-ieu0 | bdver1-ieu1))

-(define_cpu_unit bdver1-agu0 bdver1_int)
-(define_cpu_unit bdver1-agu1 bdver1_int)
+(define_cpu_unit bdver1-agu0 bdver1_agu)
+(define_cpu_unit bdver1-agu1 bdver1_agu)
 (define_reservation bdver1-agu (bdver1-agu0 | bdver1-agu1))

-(define_cpu_unit bdver1-mult bdver1_mult)
-
 (define_cpu_unit bdver1-load0 bdver1_load)
 (define_cpu_unit bdver1-load1 bdver1_load)
 (define_reservation bdver1-load bdver1-agu,
@@ -93,6 +91,12 @@
 ;; 128bit SSE instructions issue two stores at once.
 (define_reservation bdver1-store2 (bdver1-load0 + bdver1-load1))

+;; vectorpath (microcoded) instructions are single issue instructions.
+;; So, they occupy all the integer units.
+(define_reservation bdver1-ivector bdver1-ieu0+bdver1-ieu1+
+  bdver1-agu0+bdver1-agu1+
+  bdver1-load0+bdver1-load1)
+
 ;; The FP operations start to execute at stage 12 in the pipeline, while
 ;; integer operations start to execute at stage 9 for athlon and 11 for K8
 ;; Compensate the difference for athlon because it results in significantly
@@ -125,7 +129,7 @@
 (define_insn_reservation bdver1_call 0
 (and (eq_attr cpu bdver1,bdver2)
  (eq_attr type call,callv))
-bdver1-double,bdver1-agu,bdver1-ieu)
+bdver1-double,bdver1-agu)
 ;; PUSH mem is double path.
 (define_insn_reservation bdver1_push 1
 (and (eq_attr cpu bdver1,bdver2)
@@ -135,17 +139,17 @@
 (define_insn_reservation bdver1_pop 1
 (and (eq_attr cpu bdver1,bdver2)
  (eq_attr type pop))
-bdver1-direct,(bdver1-ieu+bdver1-load))
+bdver1-direct,bdver1-ivector)
 ;; LEAVE no latency info so far, assume same with amdfam10.
 (define_insn_reservation bdver1_leave 3
 (and (eq_attr cpu bdver1,bdver2)
  (eq_attr type leave))
-bdver1-vector,(bdver1-ieu+bdver1-load))
+bdver1-vector,bdver1-ivector)
 ;; LEA executes in AGU unit with 1 cycle latency on BDVER1.
 (define_insn_reservation bdver1_lea 1
 (and (eq_attr cpu bdver1,bdver2)
  (eq_attr type lea))
-bdver1-direct,bdver1-agu,nothing)
+bdver1-direct,bdver1-agu)

 ;; MUL executes in special multiplier unit attached to IEU1.
 (define_insn_reservation bdver1_imul_DI 6
@@ -153,23 +157,23 @@
  (and (eq_attr type imul)
   (and (eq_attr mode DI)
(eq_attr memory none,unknown
-bdver1-direct1,bdver1-ieu1,bdver1-mult,nothing,bdver1-ieu1)
+bdver1-direct1,bdver1-ieu1)
 (define_insn_reservation bdver1_imul 4
 (and (eq_attr cpu bdver1,bdver2)
  (and (eq_attr type imul)
   (eq_attr memory none,unknown)))
-bdver1-direct1,bdver1-ieu1,bdver1-mult,bdver1-ieu1)
+bdver1-direct1,bdver1-ieu1)
 (define_insn_reservation bdver1_imul_mem_DI 10
 (and (eq_attr cpu bdver1,bdver2)
  (and (eq_attr type imul)

Re: [PR54551] global dead debug pseudo tracking in fast-dce

2012-09-27 Thread Jakub Jelinek

On Tue, Sep 25, 2012 at 07:21:04PM -0300, Alexandre Oliva wrote:
 On Sep 25, 2012, Jakub Jelinek ja...@redhat.com wrote:
 
  (the other alternative would be to use mode in the hash function etc.,
  but if usually the same pseudo has the same mode everywhere, then the above
  should be good enough).
 
 AFAIK each pseudo is referenced everywhere using the same RTX; if so, it
 follows that it has the same mode in all uses.

Ok, leave the mode check out then.  But still checking the result of the
function is IMHO desirable.

  I believe the coding conventions ask to put the inlines outside of the
  class body, see e.g. coverage.c.
 
 I wasn't sure about one-liners; hash-table.h itself has inline
 one-liners, one of which I used as the basis for the descriptor.  That
 said, the braces were not in separate lines.

Seems the C++ coding conventions are unfinished and vague and the codebase
is growing different styles in different spots :(.

 I'm going on a trip tomorrow morning, and I'll only return on Friday
 evening.  I'll have a look at the C++ coding conventions and the other
 issues you brought up when I return.  However, if you'd rather have the
 fix in before that, I won't mind if you pick it up from where I left.

I can wait, after all the related msg00711.html patch hasn't been reviewed
yet anyway.

Jakub

Re: [CPP] Add pragmas for emitting diagnostics

2012-09-27 Thread Florian Weimer


On 09/26/2012 10:19 PM, Tom Tromey wrote:

>>>>> "Florian" == Florian Weimer <fwei...@redhat.com> writes:


Florian> This patch adds support for #pragma GCC warning and #pragma GCC
Florian> error. These pragmas can be used from preprocessor macros,
Florian> unlike the existing #warning and #error directives.  Library
Florian> authors can use these pragmas to add deprecation warnings to
Florian> macros they define.
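
A minimal sketch of the use case described above, assuming a compiler that implements #pragma GCC warning as proposed in this patch. The macro names OLD_MAX and NEW_MAX and the function larger_of are hypothetical and only serve the illustration.

```c
/* Replacement macro that the hypothetical library wants users to adopt.  */
#define NEW_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Deprecated macro.  _Pragma lets the diagnostic travel inside the
   macro body, which a #warning directive cannot do.  */
#define OLD_MAX(a, b) \
  _Pragma ("GCC warning \"OLD_MAX is deprecated, use NEW_MAX\"") \
  NEW_MAX (a, b)

int
larger_of (int x, int y)
{
  /* Expanding OLD_MAX here is expected to emit the warning at this
     use site; the generated code is the same as for NEW_MAX.  */
  return OLD_MAX (x, y);
}
```

The warning is tied to each expansion of the macro rather than to the header that defines it, which is the point of using a pragma instead of a directive.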

I'm not sure if my libcpp review powers extend to an extension like
this.

It seems reasonable to me though.


Thanks, I'll wait until next week for further feedback.


Florian> Index: gcc/doc/cpp.texi
[...]
Florian> +contained in the pragma must be a single string literal.  Similary,

Typo, "similarly" -- missing "l".


Thanks, fixed in my copy.


Florian> +@code{#pragma GCC error "message"} issues an error message.  Unlike
Florian> +the @samp{#warning} and @samp{#error} directives provided by
Florian> +compilers, these pragmas can be embedded in preprocessor macros using

I would just remove "provided by compilers".


Yes, it's a bit awkward, and I've removed it.  I wanted to stress that 
these directives aren't part of the preprocessor, to avoid confusion.



Florian> +  cpp_error (pfile, CPP_DL_ERROR, "invalid #pragma GCC %s directive",

It seems to me that the '#pragma GCC %s' part should have quotes around
it.


Good.  I don't think we've got the magic quotes inside libgcc, so I'm 
just going with this, following other examples in the file:


+  cpp_error (pfile, CPP_DL_ERROR, "invalid \"#pragma GCC %s\" directive",
+error ? "error" : "warning");

--
Florian Weimer / Red Hat Product Security Team

[PATCH] Fix a typo in gcov.texi

2012-09-27 Thread Marek Polacek

Instead of -profile-dir, we want -fprofile-dir, I'm afraid.  Ok?

2012-09-27  Marek Polacek  pola...@redhat.com

* doc/gcov.texi (Gcov Data Files): Fix a typo.

--- gcc/doc/gcov.texi.mp	2012-09-27 11:55:45.658201583 +0200
+++ gcc/doc/gcov.texi   2012-09-27 11:56:05.335252754 +0200
@@ -555,7 +555,7 @@ file suffix with either @file{.gcno}, or
 contain coverage and profile data stored in a platform-independent format.
 The @file{.gcno} files are placed in the same directory as the object
 file.  By default, the @file{.gcda} files are also stored in the same
-directory as the object file, but the GCC @option{-profile-dir} option
+directory as the object file, but the GCC @option{-fprofile-dir} option
 may be used to store the @file{.gcda} files in a separate directory.
 
 The @file{.gcno} notes file is generated when the source file is compiled

Marek

Re: [PATCH] Fix a typo in gcov.texi

2012-09-27 Thread Jakub Jelinek

On Thu, Sep 27, 2012 at 11:59:38AM +0200, Marek Polacek wrote:
 Instead of -profile-dir, we want -fprofile-dir, I'm afraid.  Ok?
 
 2012-09-27  Marek Polacek  pola...@redhat.com
 
   * doc/gcov.texi (Gcov Data Files): Fix a typo.

Ok.  --profile-dir=/tmp/ also works, but not -profile-dir=/tmp/.

 --- gcc/doc/gcov.texi.mp  2012-09-27 11:55:45.658201583 +0200
 +++ gcc/doc/gcov.texi 2012-09-27 11:56:05.335252754 +0200
 @@ -555,7 +555,7 @@ file suffix with either @file{.gcno}, or
  contain coverage and profile data stored in a platform-independent format.
  The @file{.gcno} files are placed in the same directory as the object
  file.  By default, the @file{.gcda} files are also stored in the same
 -directory as the object file, but the GCC @option{-profile-dir} option
 +directory as the object file, but the GCC @option{-fprofile-dir} option
  may be used to store the @file{.gcda} files in a separate directory.
  
  The @file{.gcno} notes file is generated when the source file is compiled
 
   Marek

Jakub

[C++ PATCH] Nit fix in build_new_1

2012-09-27 Thread Jakub Jelinek

Hi!

All INTEGER_CSTs are TREE_CONSTANT, I don't see the point
in testing that.  Ok for trunk?

2012-09-27  Jakub Jelinek  ja...@redhat.com

* init.c (build_new_1): Don't test TREE_CONSTANT
of INTEGER_CST.

--- gcc/cp/init.c.jj	2012-09-25 11:59:43.0 +0200
+++ gcc/cp/init.c   2012-09-27 12:42:32.382457943 +0200
@@ -2235,8 +2235,7 @@ build_new_1 (VEC(tree,gc) **placement, t
 {
   tree inner_nelts = array_type_nelts_top (elt_type);
   tree inner_nelts_cst = maybe_constant_value (inner_nelts);
-  if (TREE_CONSTANT (inner_nelts_cst)
-  && TREE_CODE (inner_nelts_cst) == INTEGER_CST)
+  if (TREE_CODE (inner_nelts_cst) == INTEGER_CST)
{
  bool overflow;
  double_int result = TREE_INT_CST (inner_nelts_cst)

Jakub

Re: [C++ PATCH] Nit fix in build_new_1

2012-09-27 Thread Richard Guenther

On Thu, Sep 27, 2012 at 12:44 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 All INTEGER_CSTs are TREE_CONSTANT, I don't see the point
 in testing that.  Ok for trunk?

Ok.

Thanks,
Richard.

 2012-09-27  Jakub Jelinek  ja...@redhat.com

 * init.c (build_new_1): Don't test TREE_CONSTANT
 of INTEGER_CST.

 --- gcc/cp/init.c.jj2012-09-25 11:59:43.0 +0200
 +++ gcc/cp/init.c   2012-09-27 12:42:32.382457943 +0200
 @@ -2235,8 +2235,7 @@ build_new_1 (VEC(tree,gc) **placement, t
  {
tree inner_nelts = array_type_nelts_top (elt_type);
tree inner_nelts_cst = maybe_constant_value (inner_nelts);
 -  if (TREE_CONSTANT (inner_nelts_cst)
 -  && TREE_CODE (inner_nelts_cst) == INTEGER_CST)
 +  if (TREE_CODE (inner_nelts_cst) == INTEGER_CST)
 {
   bool overflow;
   double_int result = TREE_INT_CST (inner_nelts_cst)

 Jakub

[C++ Patch / RFC] PR 51422

2012-09-27 Thread Paolo Carlini


Hi,

almost forgot that a few weeks ago I spent some time on this PR...

The issue is simple: in these repeated error conditions we ICE on the 
gcc_assert in is_normal_capture_proxy: decl is a VAR_DECL with an 
error_mark_node as TREE_TYPE.


Then checking error_operand_p (decl) in is_capture_proxy solves the 
problem but now the question is: do we have reasons to believe that such 
VAR_DECLs should never ever reach is_normal_capture_proxy? Otherwise 
robustifying a predicate like this seems a good idea to me. Patch passes 
testing on x86_64-linux of course.


Thanks!
Paolo.

///
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 190666)
+++ cp/semantics.c  (working copy)
@@ -8929,6 +8929,9 @@ capture_decltype (tree decl)
 bool
 is_capture_proxy (tree decl)
 {
+  if (error_operand_p (decl))
+return false;
+
   return (TREE_CODE (decl) == VAR_DECL
   && DECL_HAS_VALUE_EXPR_P (decl)
   && !DECL_ANON_UNION_VAR_P (decl)
Index: testsuite/g++.dg/cpp0x/lambda/lambda-ice8.C
===
--- testsuite/g++.dg/cpp0x/lambda/lambda-ice8.C (revision 0)
+++ testsuite/g++.dg/cpp0x/lambda/lambda-ice8.C (revision 0)
@@ -0,0 +1,10 @@
+// PR c++/51422
+// { dg-do compile { target c++11 } }
+
+template<typename> struct A {};
+
+void foo()
+{
+  [i] { A<decltype(i)>(); }; // { dg-error "not declared|invalid" }
+  [i] { A<decltype(i)>(); }; // { dg-error "invalid" }
+}

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-09-27 Thread Sharad Singhai

Thanks for the review. A couple of comments inline:

 Some minor issues:

 * c/c-decl.c (c_write_global_declarations): Use different method to
 determine if the dump has been initialized.
 * cp/decl2.c (cp_write_global_declarations): Ditto.
 * testsuite/gcc.target/i386/vect-double-1.c: Fix test.

 these subdirs all have their separate ChangeLog entry from where the
 directory name is omitted.

 Index: tree-dump.c
 ===
 --- tree-dump.c (revision 191490)
 +++ tree-dump.c (working copy)
 @@ -24,9 +24,11 @@ along with GCC; see the file COPYING3.  If not see
  #include "coretypes.h"
  #include "tm.h"
  #include "tree.h"
 +#include "gimple-pretty-print.h"
  #include "splay-tree.h"
  #include "filenames.h"
  #include "diagnostic-core.h"
 +#include "rtl.h"

 what do you need gimple-pretty-print.h and rtl.h for?

 +
 +extern void dump_bb (FILE *, basic_block, int, int);
 +

 that should be declared in some header

 +/* Dump gimple statement GS with SPC indentation spaces and
 +   EXTRA_DUMP_FLAGS on the dump streams if DUMP_KIND is enabled.  */
 +
 +void
 +dump_gimple_stmt (int dump_kind, int extra_dump_flags, gimple gs, int spc)
 +{

 the gimple stuff really belongs in to gimple-pretty-print.c

This dump_gimple_stmt () is just a dispatcher, which uses internal
data structure such as dump streams/flags. If I move it into
gimple-pretty-print.c, then I would have to export those
streams/flags. I was hoping to avoid it by keeping all dump_* ()
methods together in dumpfile.c (earlier in tree-dump.c). Thus, later
one could just make dump_file/dump_flags static when all the passes
have converted to this scheme.
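
The dispatcher idea can be sketched roughly as below. This is a hypothetical, simplified model and not the code from the patch: the sk_ names, the two dump kinds, and the two-stream setup are assumptions made for the illustration. The point it shows is that the streams and their enable masks stay private to one file, so callers only pass a dump kind.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical dump kinds; the patch defines its own flag values.  */
enum { SK_OPTIMIZED = 1 << 0, SK_MISSED = 1 << 1 };

/* Streams and enable masks are file-local, so no caller can touch
   them directly.  */
static FILE *sk_pass_stream;   /* per-pass dump stream, may be NULL */
static FILE *sk_alt_stream;    /* alternate stream, may be NULL */
static int sk_pass_kinds;      /* kinds enabled on sk_pass_stream */
static int sk_alt_kinds;       /* kinds enabled on sk_alt_stream */

void
sk_dump_configure (FILE *pass, int pass_kinds, FILE *alt, int alt_kinds)
{
  sk_pass_stream = pass;
  sk_pass_kinds = pass_kinds;
  sk_alt_stream = alt;
  sk_alt_kinds = alt_kinds;
}

/* Print to every stream on which DUMP_KIND is enabled and return the
   number of streams written to.  */
int
sk_dump_printf (int dump_kind, const char *format, ...)
{
  int written = 0;
  va_list ap;

  if (sk_pass_stream && (sk_pass_kinds & dump_kind))
    {
      va_start (ap, format);
      vfprintf (sk_pass_stream, format, ap);
      va_end (ap);
      written++;
    }
  if (sk_alt_stream && (sk_alt_kinds & dump_kind))
    {
      va_start (ap, format);
      vfprintf (sk_alt_stream, format, ap);
      va_end (ap);
      written++;
    }
  return written;
}
```

With this shape, moving a dispatcher into a pretty-printer file would mean exporting the static stream variables, which is the coupling the reply argues against.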


 (parts of tree-dump.c should be moved to a new file dumpfile.c)

 +/* Dump tree T using EXTRA_DUMP_FLAGS on dump streams if DUMP_KIND is
 +   enabled.  */
 +
 +void
 +dump_generic_expr (int dump_kind, int extra_dump_flags, tree t)
 +{

 belongs to tree-pretty-print.c (to where the routines are it calls)

This is again a dispatcher for dump_generic_expr () which writes to
the appropriate stream depending upon dump_kind.


 +int
 +dump_start (int phase, int *flag_ptr)
 +{

 perfect candidate for dumpfile.c

 You can do this re-shuffling as a followup, but please try not to include rtl.h
 or gimple-pretty-print.h from tree-dump.c.  Thus, do any re-shuffling required
 by that now.  tree-dump.c should only know about dumping 'tree'.

Okay, I have moved relevant methods into dumpfile.c.


 Index: tree-dump.h
 ===
 --- tree-dump.h (revision 191490)
 +++ tree-dump.h (working copy)
 @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
  #ifndef GCC_TREE_DUMP_H
  #define GCC_TREE_DUMP_H

 +#include "input.h"

 probably no longer required.

 Index: dumpfile.h
 ===
 --- dumpfile.h  (revision 191490)
 +++ dumpfile.h  (working copy)
 @@ -22,6 +22,9 @@ along with GCC; see the file COPYING3.  If not see
  #ifndef GCC_DUMPFILE_H
  #define GCC_DUMPFILE_H 1

 +#include "coretypes.h"
 +#include "input.h"

 likewise for input.h.

 Index: testsuite/gcc.target/i386/vect-double-1.c
 ===
 --- testsuite/gcc.target/i386/vect-double-1.c   (revision 191490)
 +++ testsuite/gcc.target/i386/vect-double-1.c   (working copy)
 @@ -32,5 +32,5 @@ sse2_test (void)
  }
  }

 -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 +/* { dg-final { scan-tree-dump-times "Vectorized loops: 1" 1 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */

 I am sure you need a gazillion more testsuite adjustments?  Thus, did you
 really test the patch by a bootstrap and a toplevel make -k check for
 regressions?

 Index: opts.c
 ===
 --- opts.c  (revision 191490)
 +++ opts.c  (working copy)
 @@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "system.h"
  #include "intl.h"
  #include "coretypes.h"
 +#include "dumpfile.h"

 I don't see that you add a use for this.  Please double-check all your include
 file changes.

 Index: gimple-pretty-print.c
 ===
 --- gimple-pretty-print.c   (revision 191490)
 +++ gimple-pretty-print.c   (working copy)
 @@ -69,7 +69,7 @@ maybe_init_pretty_print (FILE *file)
  }
 ...
 Index: gimple-pretty-print.h
 ===
 --- gimple-pretty-print.h   (revision 191490)
 +++ gimple-pretty-print.h   (working copy)
 @@ -31,6 +31,6 @@ extern void debug_gimple_seq (gimple_seq);
  extern void print_gimple_seq (FILE *, gimple_seq, int, int);
  extern void print_gimple_stmt (FILE *, gimple, int, int);
  extern void print_gimple_expr (FILE *, gimple, int, int);
 -extern void dump_gimple_stmt (pretty_printer *, gimple, int, int);

Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-09-27 Thread Michael Matz

Hi,

On Wed, 26 Sep 2012, Lawrence Crowl wrote:

  A lower-case type name indicates to me a non-changing type,
  i.e. nothing that depends on a template.  In C we only had
  such types so we used lower-case names everywhere.  With C++
  and templates I think we should start using upper case for some
  very specific use cases, like first letter of dependent types.
 
 How would you distinguish them from template parameter names,
 which by convention have an upper case first letter?

I wouldn't.  If the distinction becomes so important that authors need to 
see the speciality immediately by having a different convention how to 
spell names, then I think we did something wrong, and we should simplify 
the code.

 What about non-type dependent names?

I'm not sure what you're asking.  Let's make an example:

template <typename T>
struct D : B<T>
{
  typedef typename B<T>::E E; // element_type
  E getme (int index);
};

In fact, as B<T>::E would probably be defined like "typedef typename T E", 
I would even have no issue to call the above E also T.  The distinction 
between the template arg name and the typedef would be blurred, and I say, 
so what; one is a typedef of the other and hence mostly equivalent for 
practical purposes.  (And if they aren't, then again, we did something too 
complicated with the switch to C++).

 The advantage to following them is that they will surprise no one.

They will surprise everyone used to different conventions, for instance 
Qt, so that's not a reason.

 Do you have an alternate suggestion, one that does not confuse template 
 parameters and dependent names?

Upper last character?  Just kidding :)  Too many detailed rules for 
conventions are the death of them, use rules of thumb; mine would be 
"somehow depends on template args -> has upper character in name", where 
"somehow depends on" includes "is a".


Ciao,
Michael.

Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-09-27 Thread Gabriel Dos Reis

On Thu, Sep 27, 2012 at 7:12 AM, Michael Matz m...@suse.de wrote:

 (And if they aren't, then again, we did something too
 complicated with the switch to C++).

or we are doing something by insisting not to use
standard notation.

Re: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-09-27 Thread Uros Bizjak

On Thu, Sep 27, 2012 at 10:30 AM, Gopalasubramanian, Ganesh
ganesh.gopalasubraman...@amd.com wrote:

 This is a fix for PR 51109.

 There are three changes

 1.  Microcoded instructions are considered as single issue instructions 
 and are therefore issued to a separate execution unit.
 2.  The multiplier unit is attached to execution unit 1 (ieu1). Since ieu 
 is handled as a separate automaton in the patch, separate mult automaton is 
 not required.
 3.  The integer execution units (2AGUs and 2EXs) are now decoupled. Now, 
 they are described as separate automatons.

 Is it OK for upstream?

 Regards
 Ganesh

 2012-09-27  Ganesh Gopalasubramanian  ganesh.gopalasubraman...@amd.com

 PR 51109
 * gcc/config/i386/bdver1.md (bdver1_int): Automaton has been
 split to reduce state transitions.

OK for mainline, if tested according to [1].

[1] http://gcc.gnu.org/contribute.html#testing

Thanks,
Uros.

[v3] libstdc++/54727

2012-09-27 Thread Paolo Carlini


Hi,

unbreak Mozilla build. Tested x86_64-linux.

Thanks,
Paolo.

/
2012-09-27  Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/54727
* config/cpu/i486/opt/bits/opt_random.h: Avoid UINT64_C.
Index: config/cpu/i486/opt/bits/opt_random.h
===
--- config/cpu/i486/opt/bits/opt_random.h   (revision 191799)
+++ config/cpu/i486/opt/bits/opt_random.h   (working copy)
@@ -64,7 +64,7 @@
  return;
  }
 
-   constexpr uint64_t __maskval = UINT64_C(0xf);
+   constexpr uint64_t __maskval = 0xfull;
static const __m128i __mask = _mm_set1_epi64x(__maskval);
static const __m128i __two = _mm_set1_epi64x(0x4000ull);
static const __m128d __three = _mm_set1_pd(3.0);

[PATCH] Correct handling of gcc-[ar|nm|ranlib] exit codes

2012-09-27 Thread Meador Inge

Hi All,

The gcc-[ar|nm|ranlib] LTO utils use 'pex_one' to spawn the wrapped binutils
program.  However, currently it is blindly returning the value of the 'err'
parameter for the exit code.  According to the documentation [1] 'err' is only
set for an error return and 'status' is only set for a successful return.

This patch fixes the bug by appropriately checking the returned status
and extracting the exit code when needed.  Tested on GNU/Linux and Windows.
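
The status decoding described above can be sketched on its own, without libiberty. This is a minimal POSIX-only illustration, not the patch itself: it uses fork and waitpid instead of pex_one, and the function names are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait-style status word to an exit code for the wrapper.
   Only a normal exit carries a meaningful code; a child killed by a
   signal is reported as a generic failure.  */
int
exit_code_from_status (int status)
{
  if (WIFEXITED (status))
    return WEXITSTATUS (status);
  return EXIT_FAILURE;
}

/* Hypothetical helper: fork a child that exits with CODE, wait for
   it, and return the decoded exit code.  */
int
run_child_exiting_with (int code)
{
  int status = 0;
  pid_t pid = fork ();

  if (pid < 0)
    return EXIT_FAILURE;          /* could not create the child */
  if (pid == 0)
    _exit (code);                 /* child: exit immediately */
  if (waitpid (pid, &status, 0) < 0)
    return EXIT_FAILURE;          /* could not collect the status */
  return exit_code_from_status (status);
}
```

Returning the raw status word, or an errno value that was never set, would hand the caller a number that does not match what the wrapped tool reported, which is the bug the patch addresses.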

OK?

2012-09-27  Meador Inge  mead...@codesourcery.com

* gcc-ar.c (main): Handle the returning of the sub-process error
code correctly.

[1] http://gcc.gnu.org/onlinedocs/libiberty/Functions.html#Functions

Index: gcc/gcc-ar.c
===
--- gcc/gcc-ar.c(revision 191792)
+++ gcc/gcc-ar.c(working copy)
@@ -42,6 +42,7 @@
   const char *err_msg;
   const char **nargv;
   bool is_ar = !strcmp (PERSONALITY, "ar");
+  int exit_code = FATAL_EXIT_CODE;
 
   exe_name = PERSONALITY;
 #ifdef CROSS_DIRECTORY_STRUCTURE
@@ -96,6 +97,20 @@
 NULL,NULL,  &status, &err);
   if (err_msg) 
     fprintf(stderr, "Error running %s: %s\n", exe_name, err_msg);
+  else if (status)
+{
+  if (WIFSIGNALED (status))
+   {
+ int sig = WTERMSIG (status);
+ fprintf (stderr, "%s terminated with signal %d [%s]%s\n",
+  exe_name, sig, strsignal(sig),
+  WCOREDUMP(status) ? ", core dumped" : "");
+   }
+  else if (WIFEXITED (status))
+   exit_code = WEXITSTATUS (status);
+}
+  else
+exit_code = SUCCESS_EXIT_CODE;
 
-  return err;
+  return exit_code;
 }

Re: [PATCH, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Uros Bizjak

On Wed, Sep 26, 2012 at 11:22 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
 a good transformation, but why do we need to handle as special
 the case where the subreg is itself the operand of a plus or minus?
 I think it should happen regardless of where the subreg occurs.

 Don't we need to restrict this to the low part though?

I have tried this approach with attached patch.  Unfortunately,
although it survived bootstrap without libjava on x86_64, it failed
building libjava with:

/home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1299:0:
error: insn does not satisfy its constraints:
   }
 ^
(insn 237 398 399 7 (set (reg:SI 1 dx [125])
(plus:SI (subreg:SI (mult:DI (reg:DI 1 dx [orig:72 D.78627 ] [72])
(const_int 2 [0x2])) 0)
(reg:SI 5 di)))
/home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1271
240 {*leasi}
 (expr_list:REG_DEAD (reg:DI 5 di)
(nil)))

Original RTX was (subreg:SI (plus:DI (mult:DI (...) reg:DI))), which
is valid RTX pattern for lea insn, the above is not.

Due to these problems, I think the safer approach is to limit the
transformation to (plus:SI (subreg:SI (plus:DI (...) 0)) RTXes, as was
the case with original patch. This approach would fix a specific
problem where simplify_plus_minus is not able to simplify the combined
RTX at combine time. Please note, that combined RTXes are always
checked for correctness at combine pass.
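
The arithmetic identity behind the transformation can be checked in plain C. The sketch below only models the values, not RTL: it assumes a little-endian style low-part subreg, so taking (subreg:SI ... 0) of a DImode value corresponds to truncating a 64-bit integer to 32 bits. The function names are hypothetical.

```c
#include <stdint.h>

/* Model of (subreg:SI (plus:DI x y) 0): add in 64 bits, then keep
   the low 32 bits.  */
uint32_t
low_of_wide_add (uint64_t x, uint64_t y)
{
  return (uint32_t) (x + y);
}

/* Model of (plus:SI (subreg:SI x 0) (subreg:SI y 0)): truncate both
   operands first, then add with 32-bit wraparound.  */
uint32_t
narrow_add_of_lows (uint64_t x, uint64_t y)
{
  return (uint32_t) x + (uint32_t) y;
}

/* The same pair for MINUS.  */
uint32_t
low_of_wide_sub (uint64_t x, uint64_t y)
{
  return (uint32_t) (x - y);
}

uint32_t
narrow_sub_of_lows (uint64_t x, uint64_t y)
{
  return (uint32_t) x - (uint32_t) y;
}
```

Both forms agree because the low 32 bits of a sum or difference depend only on the low 32 bits of the operands. Whether the rewritten RTX still matches an insn pattern is a separate question, which is what the libjava failure above is about.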

Uros.
Index: simplify-rtx.c
===
--- simplify-rtx.c  (revision 191796)
+++ simplify-rtx.c  (working copy)
@@ -5689,6 +5688,21 @@ simplify_subreg (enum machine_mode outermode, rtx
return CONST0_RTX (outermode);
 }
 
+  /* Simplify (subreg:SI (plus:DI (x:DI) (y:DI)) 0)
+ to (plus:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)), where
+ the outer subreg is effectively a truncation to the original mode.  */
+  if ((GET_CODE (op) == PLUS
+   || GET_CODE (op) == MINUS)
+   && SCALAR_INT_MODE_P (outermode)
+   && SCALAR_INT_MODE_P (innermode)
+   && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
+   && subreg_lsb_1 (outermode, innermode, byte) == 0)
+return simplify_gen_binary (GET_CODE (op), outermode,
+   simplify_gen_subreg (outermode, XEXP (op, 0),
+innermode, 0),
+   simplify_gen_subreg (outermode, XEXP (op, 1),
+innermode, 0));
+  
   /* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
  to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
  the outer subreg is effectively a truncation to the original mode.  */

[PATCH] Shrink symtab_node_base

2012-09-27 Thread Richard Guenther


The following patch shrinks symtab_node_base from 104 bytes to 88 bytes
(on x86_64) by re-ordering and packing fields.
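
The effect of re-ordering and packing can be seen on a much smaller, hypothetical pair of structs. These are not the real symtab_node_base layouts; the field names are borrowed loosely and the exact sizes depend on the target ABI.

```c
#include <stddef.h>

/* Loose layout: small fields are interleaved with pointers, so the
   compiler has to insert alignment padding, and the flag bits sit in
   a storage unit of their own at the end.  */
struct loose_node
{
  int type;
  void *decl;
  void *same_group;
  int order;
  int resolution;
  unsigned address_taken : 1;
  unsigned force_output : 1;
};

/* Packed layout: the two small enumeration-like fields become 8-bit
   bitfields and share one storage unit with the flag bits; the int
   follows directly, and the pointers come last.  */
struct packed_node
{
  unsigned type : 8;
  unsigned resolution : 8;
  unsigned address_taken : 1;
  unsigned force_output : 1;
  int order;
  void *decl;
  void *same_group;
};

/* Number of bytes the packed layout saves on the current target.  */
size_t
bytes_saved (void)
{
  return sizeof (struct loose_node) - sizeof (struct packed_node);
}
```

The saving comes from two sources: fewer storage units for the small fields, and less padding in front of the pointer members.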

Bootstrap & regtest pending.

Richard.

2012-09-27  Richard Guenther  rguent...@suse.de

* cgraph.h (symtab_node_base): Re-order and pack fields.

Index: gcc/cgraph.h
===
--- gcc/cgraph.h(revision 191798)
+++ gcc/cgraph.h(working copy)
@@ -43,14 +43,37 @@ enum symtab_type
 struct GTY(()) symtab_node_base
 {
   /* Type of the symbol.  */
-  enum symtab_type type;
+  ENUM_BITFIELD (symtab_type) type : 8;
+
+  /* The symbols resolution.  */
+  ENUM_BITFIELD (ld_plugin_symbol_resolution) resolution : 8;
+
+  /* Set when function has address taken.
+ In current implementation it imply needed flag. */
+  unsigned address_taken : 1;
+  /* Set when variable is used from other LTRANS partition.  */
+  unsigned used_from_other_partition : 1;
+  /* Set when function is available in the other LTRANS partition.  
+ During WPA output it is used to mark nodes that are present in
+ multiple partitions.  */
+  unsigned in_other_partition : 1;
+  /* Set when function is visible by other units.  */
+  unsigned externally_visible : 1;
+  /* Needed variables might become dead by optimization.  This flag
+ forces the variable to be output even if it appears dead otherwise.  */
+  unsigned force_output : 1;
+
+  /* Ordering of all symtab entries.  */
+  int order;
+
   tree decl;
+
+  /* Vectors of referring and referenced entities.  */
   struct ipa_ref_list ref_list;
+
   /* Circular list of nodes in the same comdat group if non-NULL.  */
   symtab_node same_comdat_group;
-  /* Ordering of all symtab entries.  */
-  int order;
-  enum ld_plugin_symbol_resolution resolution;
+
   /* File stream where this node is being written to.  */
   struct lto_file_decl_data * lto_file_data;
 
@@ -65,21 +88,6 @@ struct GTY(()) symtab_node_base
   symtab_node previous_sharing_asm_name;
 
   PTR GTY ((skip)) aux;
-
-  /* Set when function has address taken.
- In current implementation it imply needed flag. */
-  unsigned address_taken : 1;
-  /* Set when variable is used from other LTRANS partition.  */
-  unsigned used_from_other_partition : 1;
-  /* Set when function is available in the other LTRANS partition.  
- During WPA output it is used to mark nodes that are present in
- multiple partitions.  */
-  unsigned in_other_partition : 1;
-  /* Set when function is visible by other units.  */
-  unsigned externally_visible : 1;
-  /* Needed variables might become dead by optimization.  This flag
- forces the variable to be output even if it appears dead otherwise.  */
-  unsigned force_output : 1;
 };
 
 enum availability

Re: [PATCH] Correct handling of gcc-[ar|nm|ranlib] exit codes

2012-09-27 Thread Richard Guenther

On Thu, Sep 27, 2012 at 3:01 PM, Meador Inge mead...@codesourcery.com wrote:
 Hi All,

 The gcc-[ar|nm|ranlib] LTO utils use 'pex_one' to spawn the wrapped binutils
 program.  However, currently it is blindly returning the value of the 'err'
 parameter for the exit code.  According to the documentation [1] 'err' is only
 set for an error return and 'status' is only set for a successful return.

 This patch fixes the bug by appropriately checking the returned status
 and extracting the exit code when needed.  Tested on GNU/Linux and Windows.

 OK?

Ok.  Also for the branches (where applicable).

Thanks,
Richard.

 2012-09-27  Meador Inge  mead...@codesourcery.com

 * gcc-ar.c (main): Handle the returning of the sub-process error
 code correctly.

 [1] http://gcc.gnu.org/onlinedocs/libiberty/Functions.html#Functions

 Index: gcc/gcc-ar.c
 ===
 --- gcc/gcc-ar.c(revision 191792)
 +++ gcc/gcc-ar.c(working copy)
 @@ -42,6 +42,7 @@
const char *err_msg;
const char **nargv;
bool is_ar = !strcmp (PERSONALITY, "ar");
 +  int exit_code = FATAL_EXIT_CODE;

exe_name = PERSONALITY;
  #ifdef CROSS_DIRECTORY_STRUCTURE
 @@ -96,6 +97,20 @@
  NULL,NULL,  &status, &err);
if (err_msg)
  fprintf(stderr, "Error running %s: %s\n", exe_name, err_msg);
 +  else if (status)
 +{
 +  if (WIFSIGNALED (status))
 +   {
 + int sig = WTERMSIG (status);
 + fprintf (stderr, "%s terminated with signal %d [%s]%s\n",
 +  exe_name, sig, strsignal(sig),
 +  WCOREDUMP(status) ? ", core dumped" : "");
 +   }
 +  else if (WIFEXITED (status))
 +   exit_code = WEXITSTATUS (status);
 +}
 +  else
 +exit_code = SUCCESS_EXIT_CODE;

 -  return err;
 +  return exit_code;
  }

Re: [PATCH, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Richard Sandiford

Uros Bizjak ubiz...@gmail.com writes:
 On Wed, Sep 26, 2012 at 11:22 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
 a good transformation, but why do we need to handle as special
 the case where the subreg is itself the operand of a plus or minus?
 I think it should happen regardless of where the subreg occurs.

 Don't we need to restrict this to the low part though?

 I have tried this approach with attached patch.  Unfortunately,
 although it survived bootstrap without libjava on x86_64, it failed
 building libjava with:

 /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1299:0:
 error: insn does not satisfy its constraints:
}
  ^
 (insn 237 398 399 7 (set (reg:SI 1 dx [125])
 (plus:SI (subreg:SI (mult:DI (reg:DI 1 dx [orig:72 D.78627 ] [72])
 (const_int 2 [0x2])) 0)
 (reg:SI 5 di)))
 /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1271
 240 {*leasi}
  (expr_list:REG_DEAD (reg:DI 5 di)
 (nil)))

 Original RTX was (subreg:SI (plus:DI (mult:DI (...) reg:DI))), which
 is valid RTX pattern for lea insn, the above is not.

 Due to these problems, I think the safer approach is to limit the
 transformation to (plus:SI (subreg:SI (plus:DI (...) 0)) RTXes, as was
 the case with original patch. This approach would fix a specific
 problem where simplify_plus_minus is not able to simplify the combined
 RTX at combine time. Please note, that combined RTXes are always
 checked for correctness at combine pass.

I think instead the (subreg (plus ...)) handling should be applied
to (subreg (mult ...)) too.  IMO the correct form of the above address
ought to be:

(set (reg:SI 1 dx [125])
 (plus:SI (mult:SI (reg:SI 1 dx [orig:72 D.78627 ] [72])
   (const_int 2 [0x2]))
  (reg:SI 5 di))

Richard

[PATCH] Prefer to use v?{and,or,xor}p[sd] for float vectors (PR target/54716)

2012-09-27 Thread Jakub Jelinek

Hi!

As discussed in the PR, the only way to request a vector float/double
logical operation in C/C++ code without intrinsics is by casting to integer
vectors temporarily, but we then generate v?p{and,or,xor} instead of *p[sd].

The following patch changes that when either both operands of a
vector integer and/or/xor are SUBREGs of the same vector float/double mode,
or one is a SUBREG and the other is a CONST_VECTOR.
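For illustration, here is a minimal sketch of the kind of source this is
about, assuming GCC's vector extensions.  The names v4sf, v4si and
clear_sign_bits are made up for the example and do not come from the PR.
The casts only reinterpret bits, so the AND is nominally an integer vector
operation applied to float data.

```c
#include <assert.h>
#include <stdint.h>

/* GCC vector extensions: four floats / four 32-bit ints in 16 bytes.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Clear the sign bit of every lane.  The casts reinterpret the bits
   (no value conversion), so the AND is written on integer vectors even
   though the data is really a float vector.  */
static v4sf
clear_sign_bits (v4sf x)
{
  const v4si mask = { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
  return (v4sf) ((v4si) x & mask);
}

/* Usage: every lane comes back as its absolute value.  */
static void
clear_sign_bits_example (void)
{
  v4sf in = { -1.0f, 2.0f, -3.5f, -0.0f };
  v4sf out = clear_sign_bits (in);
  assert (out[0] == 1.0f && out[1] == 2.0f);
  assert (out[2] == 3.5f && out[3] == 0.0f);
}
```

Which instruction the compiler picks for the AND depends on the target
flags and compiler version, so the sketch only shows the source pattern,
not the generated code.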

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-09-27  Jakub Jelinek  ja...@redhat.com

PR target/54716
* config/i386/predicates.md (nonimmediate_or_const_vector_operand):
New predicate.
* config/i386/i386.c (ix86_expand_vector_logical_operator): New
function.
* config/i386/i386-protos.h (ix86_expand_vector_logical_operator): New
prototype.
* config/i386/sse.md (codemode3 VI logic): Use it.

* gcc.target/i386/xorps-sse2.c: Remove xfails.

--- gcc/config/i386/predicates.md.jj2012-09-13 07:54:44.0 +0200
+++ gcc/config/i386/predicates.md   2012-09-27 09:56:54.994873237 +0200
@@ -777,6 +777,12 @@ (define_predicate vector_move_operand
   (ior (match_operand 0 nonimmediate_operand)
(match_operand 0 const0_operand)))
 
+;; Return true when OP is either nonimmediate operand, or any
+;; CONST_VECTOR.
+(define_predicate nonimmediate_or_const_vector_operand
+  (ior (match_operand 0 nonimmediate_operand)
+   (match_code const_vector)))
+
 ;; Return true when OP is nonimmediate or standard SSE constant.
 (define_predicate nonimmediate_or_sse_const_operand
   (match_operand 0 general_operand)
--- gcc/config/i386/i386.c.jj   2012-09-20 09:22:11.0 +0200
+++ gcc/config/i386/i386.c  2012-09-27 10:02:47.725786590 +0200
@@ -16490,6 +16490,82 @@ ix86_expand_binary_operator (enum rtx_co
 emit_move_insn (operands[0], dst);
 }
 
+/* Expand vector logical operation CODE (AND, IOR, XOR) in MODE with
+   the given OPERANDS.  */
+
+void
+ix86_expand_vector_logical_operator (enum rtx_code code, enum machine_mode mode,
+				     rtx operands[])
+{
+  rtx op1 = NULL_RTX, op2 = NULL_RTX;
+  if (GET_CODE (operands[1]) == SUBREG)
+{
+  op1 = operands[1];
+  op2 = operands[2];
+}
+  else if (GET_CODE (operands[2]) == SUBREG)
+{
+  op1 = operands[2];
+  op2 = operands[1];
+}
+  /* Optimize (__m128i) d | (__m128i) e and similar code
+ when d and e are float vectors into float vector logical
+ insn.  In C/C++ without using intrinsics there is no other way
+ to express vector logical operation on float vectors than
+ to cast them temporarily to integer vectors.  */
+  if (op1
+      && !TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL
+      && ((GET_CODE (op2) == SUBREG || GET_CODE (op2) == CONST_VECTOR))
+      && GET_MODE_CLASS (GET_MODE (SUBREG_REG (op1))) == MODE_VECTOR_FLOAT
+      && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op1))) == GET_MODE_SIZE (mode)
+      && SUBREG_BYTE (op1) == 0
+      && (GET_CODE (op2) == CONST_VECTOR
+	  || (GET_MODE (SUBREG_REG (op1)) == GET_MODE (SUBREG_REG (op2))
+	      && SUBREG_BYTE (op2) == 0))
+      && can_create_pseudo_p ())
+{
+  rtx dst;
+  switch (GET_MODE (SUBREG_REG (op1)))
+   {
+   case V4SFmode:
+   case V8SFmode:
+   case V2DFmode:
+   case V4DFmode:
+ dst = gen_reg_rtx (GET_MODE (SUBREG_REG (op1)));
+ if (GET_CODE (op2) == CONST_VECTOR)
+   {
+ op2 = gen_lowpart (GET_MODE (dst), op2);
+ op2 = force_reg (GET_MODE (dst), op2);
+   }
+ else
+   {
+ op1 = operands[1];
+ op2 = SUBREG_REG (operands[2]);
+ if (!nonimmediate_operand (op2, GET_MODE (dst)))
+   op2 = force_reg (GET_MODE (dst), op2);
+   }
+ op1 = SUBREG_REG (op1);
+ if (!nonimmediate_operand (op1, GET_MODE (dst)))
+   op1 = force_reg (GET_MODE (dst), op1);
+ emit_insn (gen_rtx_SET (VOIDmode, dst,
+ gen_rtx_fmt_ee (code, GET_MODE (dst),
+ op1, op2)));
+ emit_move_insn (operands[0], gen_lowpart (mode, dst));
+ return;
+   default:
+ break;
+   }
+}
+  if (!nonimmediate_operand (operands[1], mode))
+operands[1] = force_reg (mode, operands[1]);
+  if (!nonimmediate_operand (operands[2], mode))
+operands[2] = force_reg (mode, operands[2]);
+  ix86_fixup_binary_operands_no_copy (code, mode, operands);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
+ gen_rtx_fmt_ee (code, mode, operands[1],
+ operands[2])));
+}
+
 /* Return TRUE or FALSE depending on whether the binary operator meets the
appropriate constraints.  */
 
--- gcc/config/i386/sse.md.jj   2012-09-14 14:36:44.0 +0200
+++ gcc/config/i386/sse.md  2012-09-27 09:52:47.182318053

Re: [testsuite] gcc.target/arm/unsigned-extend-1.c: omit -march option

2012-09-27 Thread Mike Stump

On Sep 26, 2012, at 4:58 PM, Janis Johnson janis_john...@mentor.com wrote:

 Test gcc.target/arm/unsigned-extend-1.c specifies option -march=armv6,
 which causes compile failures when that option conflicts with other
 test flags, such as for multilibs.  It can also be overridden by other
 test flags.  The test checks that the uxtb instruction is not
 emitted.  Leaving off the option won't hurt when the effective target
 doesn't support the instruction, and will allow the test to be tried
 with other newer processors as well.
 
 OK for trunk?

Ok.

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-09-27 Thread Xinliang David Li

On Thu, Sep 27, 2012 at 4:35 AM, Sharad Singhai sing...@google.com wrote:
 Thanks for the review. A couple of comments inline:

 Some minor issues:

 * c/c-decl.c (c_write_global_declarations): Use different method to
 determine if the dump has been initialized.
 * cp/decl2.c (cp_write_global_declarations): Ditto.
 * testsuite/gcc.target/i386/vect-double-1.c: Fix test.

 these subdirs all have their own ChangeLog, and entries there omit the
 directory name.

 Index: tree-dump.c
 ===
 --- tree-dump.c (revision 191490)
 +++ tree-dump.c (working copy)
 @@ -24,9 +24,11 @@ along with GCC; see the file COPYING3.  If not see
  #include coretypes.h
  #include tm.h
  #include tree.h
 +#include gimple-pretty-print.h
  #include splay-tree.h
  #include filenames.h
  #include diagnostic-core.h
 +#include rtl.h

 what do you need gimple-pretty-print.h and rtl.h for?

 +
 +extern void dump_bb (FILE *, basic_block, int, int);
 +

 that should be declared in some header

 +/* Dump gimple statement GS with SPC indentation spaces and
 +   EXTRA_DUMP_FLAGS on the dump streams if DUMP_KIND is enabled.  */
 +
 +void
 +dump_gimple_stmt (int dump_kind, int extra_dump_flags, gimple gs, int spc)
 +{

 the gimple stuff really belongs in to gimple-pretty-print.c

 This dump_gimple_stmt () is just a dispatcher, which uses internal
 data structure such as dump streams/flags. If I move it into
 gimple-pretty-print.c, then I would have to export those
 streams/flags. I was hoping to avoid it by keeping all dump_* ()
 methods together in dumpfile.c (earlier in tree-dump.c). Thus, later
 one could just make dump_file/dump_flags static when all the passes
 have converted to this scheme.


You can make the flags/streams global but only expose them via inline
accessors in the header file.
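
A minimal single-file sketch of that idea, under stated assumptions: the
names dump_stream_, dump_flags_, get_dump_stream, dump_kind_enabled_p and
the DUMP_KIND_* values are hypothetical and are not taken from the patch.
In a real tree the two variables would be defined in dumpfile.c and the
inline functions would sit in the header.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical dump kinds, for illustration only.  */
enum { DUMP_KIND_NOTE = 1, DUMP_KIND_MISSED = 2 };

/* Would be defined once in dumpfile.c and declared extern in the header.
   The trailing underscore marks them as "do not touch directly".  */
FILE *dump_stream_;
int dump_flags_;

/* Would live in the header: read-only views of the global state.  */
static inline FILE *
get_dump_stream (void)
{
  return dump_stream_;
}

static inline int
dump_kind_enabled_p (int dump_kind)
{
  return (dump_flags_ & dump_kind) != 0;
}

/* Usage: callers query the state without naming the globals.  */
static void
dump_accessors_example (void)
{
  dump_stream_ = stderr;
  dump_flags_ = DUMP_KIND_MISSED;
  assert (get_dump_stream () == stderr);
  assert (dump_kind_enabled_p (DUMP_KIND_MISSED));
  assert (!dump_kind_enabled_p (DUMP_KIND_NOTE));
}
```

The globals stay linkable from other files, but code that goes through
the accessors keeps working if the storage is later made private.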

David


 (parts of tree-dump.c should be moved to a new file dumpfile.c)

 +/* Dump tree T using EXTRA_DUMP_FLAGS on dump streams if DUMP_KIND is
 +   enabled.  */
 +
 +void
 +dump_generic_expr (int dump_kind, int extra_dump_flags, tree t)
 +{

 belongs to tree-pretty-print.c (to where the routines are it calls)

 This is again a dispatcher for dump_generic_expr () which writes to
 the appropriate stream depending upon dump_kind.


 +int
 +dump_start (int phase, int *flag_ptr)
 +{

 perfect candidate for dumpfile.c

 You can do this re-shuffling as followup, but please try to not include rtl.h
 or gimple-pretty-print.h from tree-dump.c.  Thus re-shuffling required by
 that do now.  tree-dump.c should only know about dumping 'tree'.

 Okay, I have moved relevant methods into dumpfile.c.


 Index: tree-dump.h
 ===
 --- tree-dump.h (revision 191490)
 +++ tree-dump.h (working copy)
 @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
  #ifndef GCC_TREE_DUMP_H
  #define GCC_TREE_DUMP_H

 +#include input.h

 probably no longer required.

 Index: dumpfile.h
 ===
 --- dumpfile.h  (revision 191490)
 +++ dumpfile.h  (working copy)
 @@ -22,6 +22,9 @@ along with GCC; see the file COPYING3.  If not see
  #ifndef GCC_DUMPFILE_H
  #define GCC_DUMPFILE_H 1

 +#include coretypes.h
 +#include input.h

 likewise for input.h.

 Index: testsuite/gcc.target/i386/vect-double-1.c
 ===
 --- testsuite/gcc.target/i386/vect-double-1.c   (revision 191490)
 +++ testsuite/gcc.target/i386/vect-double-1.c   (working copy)
 @@ -32,5 +32,5 @@ sse2_test (void)
  }
  }

 -/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect } } */
 +/* { dg-final { scan-tree-dump-times Vectorized loops: 1 1 vect } } */
  /* { dg-final { cleanup-tree-dump vect } } */

 I am sure you need a gazillion more testsuite adjustments?  Thus, did you
 really test the patch by a bootstrap and a toplevel make -k check for
 regressions?

 Index: opts.c
 ===
 --- opts.c  (revision 191490)
 +++ opts.c  (working copy)
 @@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
  #include system.h
  #include intl.h
  #include coretypes.h
 +#include dumpfile.h

 I don't see that you add a use for this.  Please double-check all your
 include file changes.

 Index: gimple-pretty-print.c
 ===
 --- gimple-pretty-print.c   (revision 191490)
 +++ gimple-pretty-print.c   (working copy)
 @@ -69,7 +69,7 @@ maybe_init_pretty_print (FILE *file)
  }
 ...
 Index: gimple-pretty-print.h
 ===
 --- gimple-pretty-print.h   (revision 191490)
 +++ gimple-pretty-print.h   (working copy)
 @@ -31,6 +31,6 @@ extern void debug_gimple_seq (gimple_seq);
  extern void print_gimple_seq (FILE *, gimple_seq,

[Patch,avr]: Ad PR rtl-optimization/52543: Undo the MEM-UNSPEC hack

2012-09-27 Thread Georg-Johann Lay

PR52543 required representing loads from non-generic address spaces as UNSPEC
instead of as MEM to avoid gross code bloat.

http://gcc.gnu.org/PR52543

lower-subreg's cost model is still broken: It assumes that any loads from MEM
are from the generic address space and does not care for address spaces in its
cost model.

This patch undoes the changes from SVN r185605

http://gcc.gnu.org/viewcvs?view=revision&revision=185605

and installs a different but less intrusive hack around PR52543:

targetm.mode_dependent_address_p has an address space parameter so that the
backend can pretend all non-generic addresses are mode-dependent.

This keeps lower-subreg.c from splitting the loads, so the loads can be
represented as MEM and there is no longer any need to represent them as
UNSPECs.

This patch is still not an optimal solution but the code is much closer to a
clean solution now.

Ok for trunk?

Johann


PR rtl-optimization/52543
* config/avr/avr.c (avr_mode_dependent_address_p): Return true for
all non-generic address spaces.
(TARGET_SECONDARY_RELOAD): New hook define to...
(avr_secondary_reload): ...this new static function.
* config/avr/avr.md (reload_inmode): New insns.

Undo r185605 (mostly):
* config/avr/avr-protos.h (avr_load_lpm): Remove.
* config/avr/avr.c (avr_load_libgcc_p): Don't restrict to __flash loads.
(avr_out_lpm): Also handle loads > 1 byte.
(avr_load_lpm): Remove.
(avr_find_unused_d_reg): New static function.
(avr_out_lpm_no_lpmx): New static function.
(adjust_insn_length): Remove ADJUST_LEN_LOAD_LPM.
* config/avr/avr.md (unspec): Remove UNSPEC_LPM.
(load_mode_libgcc): Use MEM instead of UNSPEC_LPM.
(load_mode, load_mode_clobber): Remove.
(movmode): For multi-byte move from non-generic
16-bit address spaces: Expand to *movmode again.
(loadmode_libgcc): New expander.
(split-lpmx): Remove split.

Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 191761)
+++ config/avr/avr.md	(working copy)
@@ -63,7 +63,6 @@ (define_c_enum unspec
   [UNSPEC_STRLEN
UNSPEC_MOVMEM
UNSPEC_INDEX_JMP
-   UNSPEC_LPM
UNSPEC_FMUL
UNSPEC_FMULS
UNSPEC_FMULSU
@@ -142,7 +141,7 @@ (define_attr adjust_len
tsthi, tstpsi, tstsi, compare, compare64, call,
mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32,
ufract, sfract,
-   xload, movmem, load_lpm,
+   xload, movmem,
ashlqi, ashrqi, lshrqi,
ashlhi, ashrhi, lshrhi,
ashlsi, ashrsi, lshrsi,
@@ -393,60 +392,57 @@ (define_split
 ;;
 ;; Move stuff around
 
-;; Represent a load from __flash that needs libgcc support as UNSPEC.
-;; This is legal because we read from non-changing memory.
-;; For rationale see the FIXME below.
-
-;; load_psi_libgcc
-;; load_si_libgcc
-;; load_sf_libgcc
-(define_insn load_mode_libgcc
-  [(set (reg:MOVMODE 22)
-(unspec:MOVMODE [(reg:HI REG_Z)]
-UNSPEC_LPM))]
-  
+;; Secondary input reload from non-generic 16-bit address spaces
+(define_insn reload_inmode
+  [(set (match_operand:MOVMODE 0 register_operand   =r)
+(match_operand:MOVMODE 1 memory_operand  m))
+   (clobber (match_operand:QI 2 d_register_operand  =d))]
+  MEM_P (operands[1])
+    && !ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (operands[1]))
   {
-rtx n_bytes = GEN_INT (GET_MODE_SIZE (MODEmode));
-output_asm_insn (%~call __load_%0, n_bytes);
-return ;
-  }
-  [(set_attr type xcall)
-   (set_attr cc clobber)])
-
-
-;; Similar for inline reads from flash.  We use UNSPEC instead
-;; of MEM for the same reason as above: PR52543.
-;; $1 contains the memory segment.
-
-(define_insn load_mode
-  [(set (match_operand:MOVMODE 0 register_operand =r)
-(unspec:MOVMODE [(reg:HI REG_Z)
- (match_operand:QI 1 reg_or_0_operand rL)]
-UNSPEC_LPM))]
-  (CONST_INT_P (operands[1]) && AVR_HAVE_LPMX)
-   || (REG_P (operands[1]) && AVR_HAVE_ELPMX)
-  {
-return avr_load_lpm (insn, operands, NULL);
+return output_movqi (insn, operands, NULL);
   }
-  [(set_attr adjust_len load_lpm)
+  [(set_attr adjust_len mov8)
(set_attr cc clobber)])
 
 
-;; Similar to above for the complementary situation when there is no [E]LPMx.
-;; Clobber Z in that case.
+;; loadqi_libgcc
+;; loadhi_libgcc
+;; loadpsi_libgcc
+;; loadsi_libgcc
+;; loadsf_libgcc
+(define_expand loadmode_libgcc
+  [(set (match_dup 3)
+(match_dup 2))
+   (set (reg:MOVMODE 22)
+(match_operand:MOVMODE 1 memory_operand ))
+   (set (match_operand:MOVMODE 0 register_operand )
+(reg:MOVMODE 22))]
+  avr_load_libgcc_p (operands[1])
+  {
+operands[3] = gen_rtx_REG (HImode, REG_Z);
+operands[2] = force_operand (XEXP (operands[1], 0),

Re: [PATCH] Prefer to use v?{and,or,xor}p[sd] for float vectors (PR target/54716)

2012-09-27 Thread Richard Henderson

On 09/27/2012 08:24 AM, Jakub Jelinek wrote:
 Hi!
 
 As discussed in the PR, the only way how to request a vector float/double
 logical operation in C/C++ code without intrinsics is by casting to integer
 vectors temporarily, but we then generate v?p{and,or,xor} instead of *p[sd].
 
 The following patch changes that if either both of the operands of
 vector integer and/or/xor are SUBREGs of the same vector float/double mode,
 or one is SUBREG and another one is CONST_VECTOR.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 2012-09-27  Jakub Jelinek  ja...@redhat.com
 
   PR target/54716
   * config/i386/predicates.md (nonimmediate_or_const_vector_operand):
   New predicate.
   * config/i386/i386.c (ix86_expand_vector_logical_operator): New
   function.
   * config/i386/i386-protos.h (ix86_expand_vector_logical_operator): New
   prototype.
   * config/i386/sse.md (codemode3 VI logic): Use it.
 
   * gcc.target/i386/xorps-sse2.c: Remove xfails.

Ok.


r~

Re: [PATCH, AArch64] Handle symbol + offset more effectively

2012-09-27 Thread Marcus Shawcroft


On 25/09/12 14:45, Ian Bolton wrote:

Hi all,

This patch corrects what seemed to be a typo in expand_mov_immediate
in aarch64.c, where we had || instead of an && in our original code.

if (offset != const0_rtx
    && (targetm.cannot_force_const_mem (mode, imm)
        || (can_create_pseudo_p ())))  // <- should have been &&

At any given time, this code treated all input the same: it forced
every non-zero offset into a temporary, so the code in the remainder
of the function never ran.
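
To make the effect of the typo concrete, here is a small boolean model of
the condition.  It is only a sketch: it assumes the fix is exactly the
operator swap described above, and the function and parameter names are
invented for the example (the real code tests RTL operands and target
hooks, not plain booleans).

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as originally written: with || the inner test is true
   whenever pseudos can be created, whatever the first hook says.  */
static bool
force_offset_buggy (bool nonzero_offset, bool cannot_force_mem,
		    bool can_create_pseudo)
{
  return nonzero_offset && (cannot_force_mem || can_create_pseudo);
}

/* Condition with the operator fixed: both inner tests must hold.  */
static bool
force_offset_fixed (bool nonzero_offset, bool cannot_force_mem,
		    bool can_create_pseudo)
{
  return nonzero_offset && (cannot_force_mem && can_create_pseudo);
}

/* Usage: while pseudos can be created, the buggy form fires for every
   non-zero offset; the fixed form also consults the first hook.  */
static void
force_offset_example (void)
{
  assert (force_offset_buggy (true, false, true));
  assert (!force_offset_fixed (true, false, true));
  assert (force_offset_buggy (true, true, true)
	  == force_offset_fixed (true, true, true));
}
```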

In terms of measurable impact, this patch provides a better fix to the
problem I was trying to solve with this patch:

http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02072.html

Almost all credit should go to Richard Henderson for this patch.
It is all his, but for a minor change I made to some predicates which
now become relevant when we execute more of the expand_mov_immediate
function.

My testing showed no regressions for bare-metal or linux.

OK for aarch64-branch and aarch64-4.7-branch?

Cheers,
Ian


2012-09-25  Richard Henderson  r...@redhat.com
	    Ian Bolton  ian.bol...@arm.com

* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Fix a
functional typo and refactor code in switch statement.
* config/aarch64/aarch64.md (add_losym): Handle symbol + offset.
* config/aarch64/predicates.md (aarch64_tls_ie_symref): Match const.
(aarch64_tls_le_symref): Likewise.


OK for aarch64-branch and backport to aarch64-4.7-branch.
/Marcus

Re: add typedef printers to libstdc++

2012-09-27 Thread Tom Tromey

Jonathan> Please go ahead and commit, thanks, Tom!

Thanks.  The needed gdb patches are still pending, so I plan to wait
until those go in before committing to libstdc++.  I hope it will be
next week sometime.

Tom

[C++ Patch] PR 52764

2012-09-27 Thread Paolo Carlini


Hi,

C++11, in 18.4.1/2, is very clear that __STDC_LIMIT_MACROS and
__STDC_CONSTANT_MACROS play no role in C++11 and that the macros are
provided unconditionally.

The patch below implements this requirement in the way recommended on the
audit trail: it changes both stdint-gcc.h and stdint-wrap.h, to cover
targets for which GCC installs stdint.h as well as targets for which it
does not.


Tested x86_64-linux.

Thanks!
Paolo.

///
2012-09-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/52764
* ginclude/stdint-wrap.h: In C++11 if __STDC_HOSTED__ define
__STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS.
* ginclude/stdint-gcc.h: In C++11 unconditionally define
limit and constant macros.

/testsuite
2012-09-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/52764
* g++.dg/cpp0x/stdint.C: New.

/libstdc++-v3
2012-09-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/52764
* include/c_global/cstdint: Remove __STDC_LIMIT_MACROS and
__STDC_CONSTANT_MACROS related macros.
Index: gcc/testsuite/g++.dg/cpp0x/stdint.C
===
--- gcc/testsuite/g++.dg/cpp0x/stdint.C (revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/stdint.C (working copy)
@@ -0,0 +1,135 @@
+// PR c++/52764
+// { dg-require-effective-target stdint_types }
+// { dg-do compile { target c++11 } }
+
+#include <stdint.h>
+
+#ifdef __INT8_TYPE__
+# if (!defined INT8_MAX \
+  || !defined INT8_MIN)
+# error
+# endif
+#endif
+#ifdef __UINT8_TYPE__
+# if !defined UINT8_MAX
+# error
+# endif
+#endif
+#ifdef __INT16_TYPE__
+# if (!defined INT16_MAX \
+  || !defined INT16_MIN)
+# error
+# endif
+#endif
+#ifdef __UINT16_TYPE__
+# if !defined UINT16_MAX
+# error
+# endif
+#endif
+#ifdef __INT32_TYPE__
+# if (!defined INT32_MAX \
+  || !defined INT32_MIN)
+# error
+# endif
+#endif
+#ifdef __UINT32_TYPE__
+# if !defined UINT32_MAX
+# error
+# endif
+#endif
+#ifdef __INT64_TYPE__
+# if (!defined INT64_MAX \
+  || !defined INT64_MIN)
+# error
+# endif
+#endif
+#ifdef __UINT64_TYPE__
+# if !defined UINT64_MAX
+# error
+# endif
+#endif
+
+#if (!defined INT_LEAST8_MAX \
+ || !defined INT_LEAST8_MIN\
+ || !defined UINT_LEAST8_MAX \
+ || !defined INT_LEAST16_MAX \
+ || !defined INT_LEAST16_MIN \
+ || !defined UINT_LEAST16_MAX \
+ || !defined INT_LEAST32_MAX \
+ || !defined INT_LEAST32_MIN \
+ || !defined UINT_LEAST32_MAX \
+ || !defined INT_LEAST64_MAX \
+ || !defined INT_LEAST64_MIN \
+ || !defined UINT_LEAST64_MAX)
+#error
+#endif
+
+#if (!defined INT_FAST8_MAX \
+ || !defined INT_FAST8_MIN \
+ || !defined UINT_FAST8_MAX \
+ || !defined INT_FAST16_MAX\
+ || !defined INT_FAST16_MIN\
+ || !defined UINT_FAST16_MAX \
+ || !defined INT_FAST32_MAX\
+ || !defined INT_FAST32_MIN\
+ || !defined UINT_FAST32_MAX \
+ || !defined INT_FAST64_MAX\
+ || !defined INT_FAST64_MIN\
+ || !defined UINT_FAST64_MAX)
+#error
+#endif
+
+#ifdef __INTPTR_TYPE__
+# if (!defined INTPTR_MAX \
+  || !defined INTPTR_MIN)
+# error
+# endif
+#endif
+#ifdef __UINTPTR_TYPE__
+# if !defined UINTPTR_MAX
+# error
+# endif
+#endif
+
+#if (!defined INTMAX_MAX \
+ || !defined INTMAX_MIN \
+ || !defined UINTMAX_MAX)
+#error
+#endif
+
+#if (!defined PTRDIFF_MAX \
+ || !defined PTRDIFF_MIN)
+#error
+#endif
+
+#if (!defined SIG_ATOMIC_MAX \
+ || !defined SIG_ATOMIC_MIN)
+#error
+#endif
+
+#if !defined SIZE_MAX
+#error
+#endif
+
+#if (!defined WCHAR_MAX \
+ || !defined WCHAR_MIN)
+#error
+#endif
+
+#if (!defined WINT_MAX \
+ || !defined WINT_MIN)
+#error
+#endif
+
+#if (!defined INT8_C \
+ || !defined INT16_C \
+ || !defined INT32_C \
+ || !defined INT64_C \
+ || !defined UINT8_C \
+ || !defined UINT16_C \
+ || !defined UINT32_C \
+ || !defined UINT64_C \
+ || !defined INTMAX_C \
+ || !defined UINTMAX_C)
+#error
+#endif
Index: gcc/ginclude/stdint-wrap.h
===
--- gcc/ginclude/stdint-wrap.h  (revision 191805)
+++ gcc/ginclude/stdint-wrap.h  (working copy)
@@ -1,5 +1,13 @@
 #ifndef _GCC_WRAP_STDINT_H
 #if __STDC_HOSTED__
+# if defined __cplusplus && __cplusplus >= 201103L
+#  undef __STDC_LIMIT_MACROS
+#  define __STDC_LIMIT_MACROS
+# endif
+# if defined __cplusplus && __cplusplus >= 201103L
+#  undef __STDC_CONSTANT_MACROS
+#  define __STDC_CONSTANT_MACROS
+# endif
 # include_next <stdint.h>
 #else
 # include "stdint-gcc.h"
Index: gcc/ginclude/stdint-gcc.h
===
--- gcc/ginclude/stdint-gcc.h   (revision 191805)
+++ gcc/ginclude/stdint-gcc.h   (working copy)
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008-2012 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -91,7 +91,8 @@ typedef

[PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Uros Bizjak

On Thu, Sep 27, 2012 at 4:25 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:

 I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
 a good transformation, but why do we need to handle as special
 the case where the subreg is itself the operand of a plus or minus?
 I think it should happen regardless of where the subreg occurs.

 Don't we need to restrict this to the low part though?

 I have tried this approach with attached patch.  Unfortunately,
 although it survived bootstrap without libjava on x86_64, it failed
 building libjava with:

 /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1299:0:
 error: insn does not satisfy its constraints:
}
  ^
 (insn 237 398 399 7 (set (reg:SI 1 dx [125])
 (plus:SI (subreg:SI (mult:DI (reg:DI 1 dx [orig:72 D.78627 ] [72])
 (const_int 2 [0x2])) 0)
 (reg:SI 5 di)))
 /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1271
 240 {*leasi}
  (expr_list:REG_DEAD (reg:DI 5 di)
 (nil)))

 Original RTX was (subreg:SI (plus:DI (mult:DI (...) reg:DI))), which
 is valid RTX pattern for lea insn, the above is not.

 Due to these problems, I think the safer approach is to limit the
 transformation to (plus:SI (subreg:SI (plus:DI (...) 0)) RTXes, as was
 the case with original patch. This approach would fix a specific
 problem where simplify_plus_minus is not able to simplify the combined
 RTX at combine time. Please note, that combined RTXes are always
 checked for correctness at combine pass.

 I think instead the (subreg (plus ...)) handling should be applied
 to (subreg (mult ...)) too.  IMO the correct form of the above address
 ought to be:

 (set (reg:SI 1 dx [125])
  (plus:SI (mult:SI (reg:SI 1 dx [orig:72 D.78627 ] [72])
(const_int 2 [0x2]))
   (reg:SI 5 di))

Great, this works as expected!

After some off-line discussion with Richard, attached is v2 of the patch.

2012-09-27  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/54457
* simplify-rtx.c (simplify_subreg):
Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).

testsuite/ChangeLog:

2012-09-27  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/54457
* gcc.target/i386/pr54457.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

BTW: I propose that we start with limited selection of opcodes, so x32
autotester will pick and test the patch with SImode addresses.
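
The arithmetic fact the transformation relies on can be checked in plain
C.  This is only a sketch of the identity for 64-bit to 32-bit truncation,
not of the RTL code; the function name low_part_commutes is made up for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if taking the low 32 bits of a 64-bit PLUS, MINUS or MULT gives
   the same value as doing the operation in 32 bits on the low parts of
   the operands.  */
static bool
low_part_commutes (uint64_t x, uint64_t y)
{
  uint32_t xl = (uint32_t) x, yl = (uint32_t) y;

  return (uint32_t) (x + y) == (uint32_t) (xl + yl)
	 && (uint32_t) (x - y) == (uint32_t) (xl - yl)
	 && (uint32_t) (x * y) == (uint32_t) (xl * yl);
}

static void
low_part_commutes_example (void)
{
  /* High bits of the operands never reach the low part of the result.  */
  assert (low_part_commutes (0x100000005ULL, 0xfffffffffffffffeULL));
  assert (low_part_commutes (0xdeadbeefcafebabeULL, 0x123456789abcdef0ULL));
  /* 0x1ffffffff + 1 carries out of the low 32 bits in both forms.  */
  assert ((uint32_t) (0x1ffffffffULL + 1) == 0);
}
```

The identity holds because carries only propagate upwards.  It does not
hold for operations where high bits influence low bits, such as division
or right shifts, which fits with the patch listing only PLUS, MINUS and
MULT.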

OK for mainline?

Uros.
Index: simplify-rtx.c
===
--- simplify-rtx.c  (revision 191808)
+++ simplify-rtx.c  (working copy)
@@ -5689,6 +5689,28 @@ simplify_subreg (enum machine_mode outermode, rtx
return CONST0_RTX (outermode);
 }
 
+  /* Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
+     to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)), where
+     the outer subreg is effectively a truncation to the original mode.  */
+  if ((GET_CODE (op) == PLUS
+       || GET_CODE (op) == MINUS
+       || GET_CODE (op) == MULT)
+      && SCALAR_INT_MODE_P (outermode)
+      && SCALAR_INT_MODE_P (innermode)
+      && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
+      && byte == subreg_lowpart_offset (outermode, innermode))
+{
+  rtx op0 = simplify_gen_subreg (outermode, XEXP (op, 0),
+ innermode, byte);
+  if (op0)
+{
+  rtx op1 = simplify_gen_subreg (outermode, XEXP (op, 1),
+ innermode, byte);
+  if (op1)
+return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
+}
+}
+
   /* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
  to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
  the outer subreg is effectively a truncation to the original mode.  */
Index: testsuite/gcc.target/i386/pr54457.c
===
--- testsuite/gcc.target/i386/pr54457.c (revision 0)
+++ testsuite/gcc.target/i386/pr54457.c (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options "-O2 -mx32 -maddress-mode=short" } */
+
+extern char array[40];
+
+char foo (long long position)
+{
+  return array[position + 1];
+}
+
+/* { dg-final { scan-assembler-not "add\[lq\]?\[^\n\]*1" } } */

Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Paul_Koning


On Sep 27, 2012, at 2:04 PM, Uros Bizjak wrote:

 
 
 
 I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
 a good transformation, but why do we need to handle as special
 the case where the subreg is itself the operand of a plus or minus?
 I think it should happen regardless of where the subreg occurs.
 
 Don't we need to restrict this to the low part though?
 
 ...
 
 After some off-line discussion with Richard, attached is v2 of the patch.
 
 2012-09-27  Uros Bizjak  ubiz...@gmail.com
 
PR rtl-optimization/54457
* simplify-rtx.c (simplify_subreg):
   Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
   to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).
 ...

Is it just specific to DI -> SI, or is it for any larger mode -> smaller mode,
like SI -> HI?

paul

Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Uros Bizjak

On Thu, Sep 27, 2012 at 8:08 PM,  paul_kon...@dell.com wrote:

 I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
 a good transformation, but why do we need to handle as special
 the case where the subreg is itself the operand of a plus or minus?
 I think it should happen regardless of where the subreg occurs.

 Don't we need to restrict this to the low part though?

 ...

 After some off-line discussion with Richard, attached is v2 of the patch.

 2012-09-27  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/54457
* simplify-rtx.c (simplify_subreg):
   Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
   to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).
 ...

 Is it just specific to DI -> SI, or is it for any larger mode -> smaller mode,
 like SI -> HI?

Oh, I just copied the v1 ChangeLog. The patch converts all modes where
size of mode M < size of mode N. Updated ChangeLog reads:

2012-09-27  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/54457
* simplify-rtx.c (simplify_subreg):
Simplify (subreg:M (op:N ((x:N) (y:N)), 0)
to (op:M (subreg:M (x:N) 0) (subreg:M (y:N) 0)), where
the outer subreg is effectively a truncation to the original mode M.

testsuite/ChangeLog:

2012-09-27  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/54457
* gcc.target/i386/pr54457.c: New test.

Uros.

Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Jakub Jelinek

On Thu, Sep 27, 2012 at 08:04:58PM +0200, Uros Bizjak wrote:
 After some off-line discussion with Richard, attached is v2 of the patch.
 
 2012-09-27  Uros Bizjak  ubiz...@gmail.com
 
 PR rtl-optimization/54457
 * simplify-rtx.c (simplify_subreg):
   Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
   to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).

Is that a good idea even for WORD_REGISTER_OPERATIONS targets?

 --- simplify-rtx.c(revision 191808)
 +++ simplify-rtx.c(working copy)
 @@ -5689,6 +5689,28 @@ simplify_subreg (enum machine_mode outermode, rtx
   return CONST0_RTX (outermode);
  }
  
 +  /* Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
 +     to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)), where
 +     the outer subreg is effectively a truncation to the original mode.  */
 +  if ((GET_CODE (op) == PLUS
 +       || GET_CODE (op) == MINUS
 +       || GET_CODE (op) == MULT)
 +      && SCALAR_INT_MODE_P (outermode)
 +      && SCALAR_INT_MODE_P (innermode)
 +      && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
 +      && byte == subreg_lowpart_offset (outermode, innermode))
 +{
 +  rtx op0 = simplify_gen_subreg (outermode, XEXP (op, 0),
 + innermode, byte);
 +  if (op0)
 +{
 +  rtx op1 = simplify_gen_subreg (outermode, XEXP (op, 1),
 + innermode, byte);
 +  if (op1)
 +return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
 +}
 +}
 +
/* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
   to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
   the outer subreg is effectively a truncation to the original mode.  */

Jakub

Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-09-27 Thread Lawrence Crowl

On 9/27/12, Michael Matz m...@suse.de wrote:
 On Wed, 26 Sep 2012, Lawrence Crowl wrote:
   A lower-case type name indicates to me a non-changing type,
   i.e. nothing that depends on a template.  In C we only had
   such types so we used lower-case names everywhere.  With C++
   and templates I think we should start using upper case for some
   very specific use cases, like first letter of dependent types.
 
  How would you distinguish them from template parameter names,
  which by convention have an upper case first letter?

 I wouldn't.  If the distinction becomes so important that authors
 need to see the speciality immediately by having a different
 convention how to spell names, then I think we did something wrong,
 and we should simplify the code.

  What about non-type dependent names?

 I'm not sure what you're asking.  Let's make an example:

 template <typename T>
 struct D : B<T>
 {
   typedef typename B<T>::E E; // element_type
   E getme (int index);
 };

Inside that struct, let's say we have a field of type E.  Do we name
it F or f?

 In fact, as B<T>::E would probably be defined like typedef
 typename T E, I would even have no issue to call the above E
 also T.  The distinction between the template arg name and the
 typedef would be blurred, and I say, so what; one is a typedef
 of the other and hence mostly equivalent for practical purposes.
 (And if they aren't, then again, we did something too complicated
 with the switch to C++).

  The advantage to following them is that they will surprise
  no one.

 They will surprise everyone used to different conventions, for
 instance Qt, so that's not a reason.

Anyone using the standard library will not be surprised if we follow
the conventions of the standard library.  I'd guess that standard
library programmers outnumber Qt programmers by 100 to 1, and that
the Qt programmers who do not know the standard library are a
minority.

  Do you have an alternate suggestion, one that does not confuse
  template parameters and dependent names?

 Upper last character?  Just kidding :)  Too many detailed rules
 for conventions are the death of them, use rules of thumbs,
 my one would be "somehow depends on template args" -> has upper
 character in name, where "somehow depends on" includes "is a".

Ah, but there is a problem.  That typedef name does not necessarily
depend on a template parameter.

It is common practice to have

struct Q
{
  typedef int E;
  E getme (int index);
};

and use it in exactly the same places you would use D<something>.

In fact, one place is in the hash table code we are discussing.
The hash descriptor type may not itself be a template.  I believe
that few of them will actually be templates.

So, if "E" implies "comes from template", the implication is wrong.

If we were to follow C++ standard library conventions, we would call
it value_type.  That would be my preference.  However, if folks
want a shorter name, I'll live with that too.  But as it stands,
the current name is very confusing.
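
To make the naming point concrete, here is a minimal sketch. The descriptor types `int_descriptor` and `pointer_descriptor` and the helper `bucket_for` are hypothetical and are not taken from the hash table patch under discussion. One descriptor is a plain struct and the other is a template, yet generic code reaches the element type the same way through `value_type`, so a name that implies "comes from a template" would mislead for the first one.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical non-template descriptor: the element type is fixed.
struct int_descriptor
{
  typedef int value_type;
  static std::size_t hash (const value_type &v)
  { return static_cast<std::size_t> (v); }
};

// Hypothetical template descriptor: the element type depends on T.
template <typename T>
struct pointer_descriptor
{
  typedef T *value_type;
  static std::size_t hash (const value_type &p)
  { return static_cast<std::size_t> (reinterpret_cast<std::uintptr_t> (p) >> 3); }
};

// Generic user: it neither knows nor cares whether Descriptor is a template.
template <typename Descriptor>
std::size_t bucket_for (const typename Descriptor::value_type &v,
                        std::size_t nbuckets)
{
  return Descriptor::hash (v) % nbuckets;
}
```

A call such as `bucket_for<int_descriptor> (10, 7)` and a call such as `bucket_for<pointer_descriptor<int> > (ptr, 7)` read identically at the use site, which is the situation described above.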

-- 
Lawrence Crowl

Patch committed: Fix crash in libbacktrace if no debug info

2012-09-27 Thread Ian Lance Taylor

When changing the libbacktrace interface to avoid using mutexes, I
missed a spot.  This caused libbacktrace to crash on a binary with no
debug info.  This patch fixes the problem.  Bootstrapped and ran
libbacktrace tests.  Committed to mainline.
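
For readers who do not have the surrounding code at hand, the pattern can be sketched as follows. This is a simplified model, not the libbacktrace source: `bt_state`, `initialize`, `lookup` and `nodebug_lookup` are hypothetical names, and the assumption is that the caller publishes into the shared state whatever the initializer leaves in its out parameter.

```cpp
#include <cstddef>

typedef int (*fileline) (int pc);

// Hypothetical stand-in for the shared backtrace state.
struct bt_state
{
  fileline fileline_fn;
};

// Fallback used when the binary carries no debug info.
int nodebug_lookup (int) { return 0; }

// Initializer in the new style: the result goes through the out parameter.
int initialize (bt_state *, bool has_debug_info, fileline *fileline_fn)
{
  if (!has_debug_info)
    {
      *fileline_fn = nodebug_lookup;   // the fix: set the out parameter
      return 1;
    }
  return 0;   // debug-info path omitted in this sketch
}

// The caller publishes the out parameter into the state in one place.
int lookup (bt_state *state, int pc)
{
  if (state->fileline_fn == NULL)
    {
      fileline fn = NULL;
      if (!initialize (state, false, &fn) || fn == NULL)
        return -1;
      state->fileline_fn = fn;
    }
  return state->fileline_fn (pc);
}
```

Under that assumption, an initializer that wrote only to the state field would leave `fn` null on the no-debug-info path, and the caller would then have nothing valid to publish or call.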

Ian


2012-09-27  Ian Lance Taylor  i...@google.com

PR other/54726
* elf.c (backtrace_initialize): Set *fileln_fn, not
state->fileln_fn.


Index: elf.c
===
--- elf.c	(revision 191810)
+++ elf.c	(working copy)
@@ -634,7 +634,7 @@ backtrace_initialize (struct backtrace_s
 {
   if (!backtrace_close (descriptor, error_callback, data))
 	goto fail;
-  state->fileline_fn = elf_nodebug;
+  *fileline_fn = elf_nodebug;
   state->fileline_data = NULL;
   return 1;
 }

PATCH: [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc

2012-09-27 Thread H.J. Lu

Hi,

This patch backports revision 182385 from trunk to 4.6 branch.  Tested
on Linux/x86-64.  OK to install?

Thanks.


H.J.
--
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index aa94768..ff4b13e 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,11 @@
+2012-09-27  H.J. Lu  hongjiu...@intel.com
+
+   Backport from mainline
+   2011-12-15  Benjamin Kosnik  b...@redhat.com
+
+   * testsuite/22_locale/num_put/put/char/9780-2.cc: Add test for C
+   locale, add sanity checks in case of grouping.
+
 2012-07-22  Jonathan Wakely  jwakely@gmail.com
 
PR libstdc++/53270
diff --git a/libstdc++-v3/testsuite/22_locale/num_put/put/char/9780-2.cc 
b/libstdc++-v3/testsuite/22_locale/num_put/put/char/9780-2.cc
index 7993691..5cf0d04 100644
--- a/libstdc++-v3/testsuite/22_locale/num_put/put/char/9780-2.cc
+++ b/libstdc++-v3/testsuite/22_locale/num_put/put/char/9780-2.cc
@@ -1,7 +1,7 @@
 // { dg-require-namedlocale de_DE }
 // { dg-require-namedlocale es_ES }
 
-// Copyright (C) 2004, 2005, 2009 Free Software Foundation, Inc.
+// Copyright (C) 2004, 2005, 2009, 2011 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -22,23 +22,60 @@
 #include <locale>
 #include <testsuite_hooks.h>
 
-int main()
+// Make sure that formatted output uses the locale in the output stream.
+using namespace std;
+locale l1 = locale("de_DE");
+const num_put<char>& np = use_facet<num_put<char> >(l1);
+const numpunct<char>& npunct = use_facet<numpunct<char> >(l1);
+
+void test01()
 {
-  using namespace std;
+  bool test __attribute__((unused)) = true;
+
+  locale l2 = locale("C");
+  const numpunct<char>& npunct2 = use_facet<numpunct<char> >(l2);
+  char c = npunct2.thousands_sep();
+  string s = npunct2.grouping();
+
+  ostringstream oss;
+  oss.imbue(l2);
+
+  long l = 1234567890;
+  np.put(oss.rdbuf(), oss, ' ', l);
+  string res = oss.str();
+
+  VERIFY( res == "1234567890" );
+}
 
+void test02()
+{
   bool test __attribute__((unused)) = true;
-  locale l1 = locale("de_DE");
+
   locale l2 = locale("es_ES");
-  
-  const num_put<char>& np = use_facet<num_put<char> >(l1);  
+  const numpunct<char>& npunct3 = use_facet<numpunct<char> >(l2);
+  char c = npunct3.thousands_sep();
+  string s = npunct3.grouping();
+
   ostringstream oss;
   oss.imbue(l2);
 
   long l = 1234567890;
-  np.put(oss.rdbuf(), oss, ' ', l); // 1234567890
+  np.put(oss.rdbuf(), oss, ' ', l);
   string res = oss.str();
-  
-  VERIFY( res == "1234567890" );
 
+  if (!s.empty())
+    VERIFY( res == "1.234.567.890" );
+  else
+    VERIFY( res == "1234567890" );
+}
+
+int main()
+{
+  // Sanity check.
+  char c = npunct.thousands_sep();
+  string s = npunct.grouping();
+
+  test01();
+  test02();
   return 0;
 }
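
The behaviour the updated test checks can be reproduced without the named locales it requires. The following is a minimal sketch, assuming a hypothetical facet `dotted_numpunct` and helpers `format_with` and `dotted_locale` that are not part of the testsuite: `num_put` takes the thousands separator and grouping from the locale imbued in the stream, so the same value formats differently depending on that locale.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: '.' as thousands separator, groups of three digits.
struct dotted_numpunct : std::numpunct<char>
{
protected:
  char do_thousands_sep () const override { return '.'; }
  std::string do_grouping () const override { return "\3"; }
};

// Format VALUE through num_put, using the locale imbued in the stream.
std::string format_with (const std::locale &loc, long value)
{
  std::ostringstream oss;
  oss.imbue (loc);
  const std::num_put<char> &np = std::use_facet<std::num_put<char> > (loc);
  np.put (oss.rdbuf (), oss, ' ', value);
  return oss.str ();
}

// The classic "C" locale with only the numpunct facet replaced.
std::locale dotted_locale ()
{
  return std::locale (std::locale::classic (), new dotted_numpunct);
}
```

With the classic locale the grouping string is empty and the digits come out unseparated, which is the case the new test01 covers; with a non-empty grouping the separator appears, which is the branch test02 guards with `s.empty()`.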

Re: PATCH: [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc

2012-09-27 Thread Paolo Carlini


On 09/27/2012 10:33 PM, H.J. Lu wrote:

Hi,

This patch backports revision 182385 from trunk to 4.6 branch.  Tested
on Linux/x86-64.  OK to install?
Ok, thanks. But we are in 2012: let's simply have 2004-2012 as Copyright 
years.


Paolo.

Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-09-27 Thread Richard Sandiford

Jakub Jelinek ja...@redhat.com writes:
 On Thu, Sep 27, 2012 at 08:04:58PM +0200, Uros Bizjak wrote:
 After some off-line discussion with Richard, attached is v2 of the patch.
 
 2012-09-27  Uros Bizjak  ubiz...@gmail.com
 
 PR rtl-optimization/54457
 * simplify-rtx.c (simplify_subreg):
  Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
  to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).

 Is that a good idea even for WORD_REGISTER_OPERATIONS targets?

I think so.  Admittedly it means I'm going to have to change the
mips.md BADDU patterns from:

(define_insn "*baddu_si_el"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (zero_extend:SI
         (subreg:QI
          (plus:SI (match_operand:SI 1 "register_operand" "d")
                   (match_operand:SI 2 "register_operand" "d")) 0)))]
  "ISA_HAS_BADDU && !BYTES_BIG_ENDIAN"
  "baddu\\t%0,%1,%2"
  [(set_attr "alu_type" "add")])

to:

(define_insn "*baddu_si_el"
  [(set (match_operand:SI 0 "register_operand" "=d")
        (zero_extend:SI
         (plus:QI (match_operand:QI 1 "register_operand" "d")
                  (match_operand:QI 2 "register_operand" "d"))))]
  "ISA_HAS_BADDU"
  "baddu\\t%0,%1,%2"
  [(set_attr "alu_type" "add")])

But really, that's a better representation even on MIPS,
not least because it gets rid of the ugly endianness condition.

There will probably be fallout on other targets too.
But the upside is that we get rid of more subregs from .mds.
I think a few of the i386.md define_peephole2s could go too.
E.g. the second two of:

(define_peephole2
  [(set (match_operand:DI 0 "register_operand")
        (zero_extend:DI
          (plus:SI (match_operand:SI 1 "register_operand")
                   (match_operand:SI 2 "nonmemory_operand"))))]
  "TARGET_64BIT && !TARGET_OPT_AGU
   && REGNO (operands[0]) == REGNO (operands[1])
   && peep2_regno_dead_p (0, FLAGS_REG)"
  [(parallel [(set (match_dup 0)
                   (zero_extend:DI (plus:SI (match_dup 1) (match_dup 2))))
              (clobber (reg:CC FLAGS_REG))])])

(define_peephole2
  [(set (match_operand:DI 0 "register_operand")
        (zero_extend:DI
          (plus:SI (match_operand:SI 1 "nonmemory_operand")
                   (match_operand:SI 2 "register_operand"))))]
  "TARGET_64BIT && !TARGET_OPT_AGU
   && REGNO (operands[0]) == REGNO (operands[2])
   && peep2_regno_dead_p (0, FLAGS_REG)"
  [(parallel [(set (match_dup 0)
                   (zero_extend:DI (plus:SI (match_dup 2) (match_dup 1))))
              (clobber (reg:CC FLAGS_REG))])])

(define_peephole2
  [(set (match_operand:DI 0 "register_operand")
        (zero_extend:DI
          (subreg:SI (plus:DI (match_dup 0)
                              (match_operand:DI 1 "nonmemory_operand")) 0)))]
  "TARGET_64BIT && !TARGET_OPT_AGU
   && peep2_regno_dead_p (0, FLAGS_REG)"
  [(parallel [(set (match_dup 0)
                   (zero_extend:DI (plus:SI (match_dup 2) (match_dup 1))))
              (clobber (reg:CC FLAGS_REG))])]
{
  operands[1] = gen_lowpart (SImode, operands[1]);
  operands[2] = gen_lowpart (SImode, operands[0]);
})

(define_peephole2
  [(set (match_operand:DI 0 "register_operand")
        (zero_extend:DI
          (subreg:SI (plus:DI (match_operand:DI 1 "nonmemory_operand")
                              (match_dup 0)) 0)))]
  "TARGET_64BIT && !TARGET_OPT_AGU
   && peep2_regno_dead_p (0, FLAGS_REG)"
  [(parallel [(set (match_dup 0)
                   (zero_extend:DI (plus:SI (match_dup 2) (match_dup 1))))
              (clobber (reg:CC FLAGS_REG))])]
{
  operands[1] = gen_lowpart (SImode, operands[1]);
  operands[2] = gen_lowpart (SImode, operands[0]);
})

where we should now always generate the first two forms.

There's no semantic difference between the two rtxes, and I think
it would be confusing to have different canonical forms on different
targets.  If the caller really wants a word-mode operation on
WORD_REGISTER_OPERATIONS targets, then I think it's asking for
the wrong thing by taking this subreg.
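
The claim that the two rtxes mean the same thing rests on the fact that truncation commutes with addition in modular arithmetic. A small sketch in C++, using fixed-width unsigned types as a stand-in for DImode and SImode (the function names are illustrative only, and the low part is modelled as a plain truncation):

```cpp
#include <cstdint>

// Model of (subreg:SI (plus:DI x y) 0): add in 64 bits, keep the low 32.
std::uint32_t add_then_truncate (std::uint64_t x, std::uint64_t y)
{
  return static_cast<std::uint32_t> (x + y);
}

// Model of (plus:SI (subreg:SI x 0) (subreg:SI y 0)):
// truncate each operand first, then add in 32 bits.
std::uint32_t truncate_then_add (std::uint64_t x, std::uint64_t y)
{
  return static_cast<std::uint32_t> (static_cast<std::uint32_t> (x)
                                     + static_cast<std::uint32_t> (y));
}
```

Both functions agree for every input, because carries only propagate upward and so never affect the low 32 bits. The same reasoning holds for the other operations whose low part depends only on the low parts of the operands, but not for operations such as right shifts or division.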

Richard

Re: PATCH: [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc

2012-09-27 Thread H.J. Lu

On Thu, Sep 27, 2012 at 1:51 PM, Paolo Carlini paolo.carl...@oracle.com wrote:
 On 09/27/2012 10:33 PM, H.J. Lu wrote:

 Hi,

 This patch backports revision 182385 from trunk to 4.6 branch.  Tested
 on Linux/x86-64.  OK to install?

 Ok, thanks. But we are in 2012: let's simply have 2004-2012 as Copyright
 years.

Done and checked in.

-- 
H.J.

[google/4_7] Patch committed: backport the location_block patch from trunk

2012-09-27 Thread Dehao Chen

I have backported the following patches from trunk to google-4_7:

r191494, r191510, r191614, r191669, r191680, r191706, r191747,
r191759, r191779, r191810.

Combine location_t and block into an integer, so that these two
building blocks of debug info are kept consistent during backend
optimizations. A side benefit of this patch is a smaller memory
footprint, because the block fields in both gimple and tree_exp are
eliminated.
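
One way to picture "location and block in one integer" is a side table indexed by specially marked ids. The sketch below is only an illustration of that idea under an assumed encoding; `location_table`, `adhoc_bit`, `combine`, `locus_of` and `block_of` are hypothetical names, and the real patch may encode things differently. The accessors play the role of the LOCATION_LOCUS and LOCATION_BLOCK macros named in the ChangeLog.

```cpp
#include <cstdint>
#include <vector>

typedef std::uint32_t location_id;   // stand-in for location_t

struct block { int depth; };         // stand-in for a lexical block

// Hypothetical side table: ids with the top bit set index into it.
struct location_table
{
  struct entry { location_id locus; const block *blk; };
  std::vector<entry> entries;

  static const location_id adhoc_bit = 0x80000000u;

  // Return an id that carries both LOCUS and BLK.
  // Assumes LOCUS itself is a plain locus (top bit clear).
  location_id combine (location_id locus, const block *blk)
  {
    if (blk == nullptr)
      return locus;                  // a plain locus needs no table entry
    entries.push_back (entry{locus, blk});
    return adhoc_bit | static_cast<location_id> (entries.size () - 1);
  }

  // Counterpart of LOCATION_LOCUS in this sketch.
  location_id locus_of (location_id id) const
  { return (id & adhoc_bit) ? entries[id & ~adhoc_bit].locus : id; }

  // Counterpart of LOCATION_BLOCK in this sketch.
  const block *block_of (location_id id) const
  { return (id & adhoc_bit) ? entries[id & ~adhoc_bit].blk : nullptr; }
};
```

Because the pair travels as a single value, a pass that copies the location cannot forget to copy the block, which is the consistency property the message describes.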

2012-09-27  Dehao Chen  de...@google.com
gcc:
* toplev.c (toplev_main): Finalize block_locations.
* tree.c (tree_set_block): New.
(tree_block): Change to use LOCATION_BLOCK.
(build1_stat): Remove block.
* tree.h (TREE_SET_BLOCK): New.
(tree_set_block): New.
(tree_block): Change to return tree.
(TREE_BLOCK): Likewise.
(tree_exp): Remove block.
(DECL_IS_BUILTIN): Compare LOCATION_LOCUS.
(EXPR_HAS_LOCATION): Likewise.
(inlined_function_outer_scope_p): Likewise.
(OMP_CLAUSE_HAS_LOCATION): Likewise.
* cfglayout.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK.
(fixup_reorder_chain): Likewise.
(insn_discriminator): Likewise.
(insn_locators_alloc): Remove.
(insn_locators_finalize): Remove.
(insn_locators_free): Remove.
(set_curr_insn_source_location): Remove.
(get_curr_insn_source_location): Remove.
(set_curr_insn_block): Remove.
(get_curr_insn_block): Remove.
(locator_scope): Remove.
(insn_scope): Change to use new location.
(locator_location): Remove.
(insn_line): Change to use new location.
(locator_file): Remove.
(insn_file): Change to use new location.
(locator_eq): Remove.
(insn_locations_init): New.
(insn_locations_finalize): New.
(set_curr_insn_location): New.
(curr_insn_location): New.
* final.c (final_start_function): Likewise.
* input.c (expand_location): Likewise.
(location_with_discriminator): Likewise.
(has_discriminator): Likewise.
(map_discriminator_location): Likewise.
(get_discriminator_from_locus): Likewise.
* input.h (LOCATION_LOCUS): New.
(LOCATION_BLOCK): New.
* fold-const.c (expr_location_or): Change to use new location.
* reorg.c (emit_delay_sequence): Likewise.
(dbr_schedule): Likewise.
* modulo-sched.c (loop_single_full_bb_p): Likewise.
(dump_insn_location): Likewise.
(loop_canon_p): Likewise.
(sms_schedule): Likewise.
* lto-streamer-out.c (lto_output_location_bitpack): Likewise.
* lto-cgraph.c (output_node_opt_summary): Likewise.
* jump.c (rtx_renumbered_equal_p): Likewise.
* ifcvt.c (noce_try_move): Likewise.
(noce_try_store_flag): Likewise.
(noce_try_store_flag_constants): Likewise.
(noce_try_addcc): Likewise.
(noce_try_store_flag_mask): Likewise.
(noce_try_cmove): Likewise.
(noce_try_cmove_arith): Likewise.
(noce_try_minmax): Likewise.
(noce_try_abs): Likewise.
(noce_try_sign_mask): Likewise.
(noce_try_bitop): Likewise.
(noce_process_if_block): Likewise.
(cond_move_process_if_block): Likewise.
(find_cond_trap): Likewise.
* ipa-prop.c (ipa_set_jf_constant): Likewise.
(ipa_write_jump_function): Likewise.
* dwarf2out.c (add_src_coords_attributes): Likewise.
* expr.c (expand_expr_real): Likewise.
* tree-parloops.c (create_loop_fn): Likewise.
* recog.c (peep2_attempt): Likewise.
* function.c (free_after_compilation): Likewise.
(expand_function_end): Likewise.
(maybe_copy_prologue_epilogue_insn): Likewise.
(set_insn_locations): Likewise.
(thread_prologue_and_epilogue_insns): Likewise.
* print-rtl.c (print_rtx): Likewise.
* profile.c (branch_prob): Likewise.
* trans-mem.c (ipa_tm_scan_irr_block): Likewise.
* gimplify.c (gimplify_call_expr): Likewise.
* except.c (duplicate_eh_regions_1): Likewise.
* emit-rtl.c (try_split): Likewise.
(make_insn_raw): Likewise.
(make_debug_insn_raw): Likewise.
(make_jump_insn_raw): Likewise.
(make_call_insn_raw): Likewise.
(emit_pattern_after_setloc): Likewise.
(emit_pattern_after): Likewise.
(emit_insn_after_setloc): Likewise.
(emit_insn_after): Likewise.
(emit_jump_insn_after_setloc): Likewise.
(emit_jump_insn_after): Likewise.
(emit_call_insn_after_setloc): Likewise.
(emit_call_insn_after): Likewise.
(emit_debug_insn_after_setloc): Likewise.
(emit_debug_insn_after): Likewise.
(emit_pattern_before_setloc): Likewise.
(emit_pattern_before): Likewise.
(emit_insn_before_setloc): Likewise.
(emit_insn_before): Likewise.
(emit_jump_insn_before_setloc): Likewise.
(emit_jump_insn_before): Likewise.
(emit_call_insn_before_setloc): Likewise.
(emit_call_insn_before): Likewise.
(emit_debug_insn_before_setloc): Likewise.
(emit_debug_insn_before): Likewise.
(emit_copy_of_insn_after): Likewise.
* cfgexpand.c (gimple_assign_rhs_to_tree): Change to use new location.
(expand_gimple_cond): Likewise.
(expand_call_stmt): Likewise.
(expand_gimple_stmt_1): Likewise.
(expand_gimple_basic_block): Likewise.
(construct_exit_block): Likewise.
(gimple_expand_cfg): Likewise.
* cfgcleanup.c (try_forward_edges): Likewise.
* tree-ssa-live.c (remove_unused_scope_block_p): Likewise.
(dump_scope_block): Likewise.
(mark_all_vars_used): Likewise.
(remove_unused_locals): Likewise.
(clear_unused_block_pointer): New.
(clear_unused_block_pointer_1): New.
* rtl.c

Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2

2012-09-27 Thread Michael Meissner

This patch fixes a long-standing bug that David noticed: if you don't
use -mcpu=xxx, the target options that the configuration .h files set
in TARGET_DEFAULT are cleared if they are in POWERPC_MASKS.  Note that
configuring the compiler with --with-cpu=xxx provides a default cpu, so
users of pre-packaged compilers typically would not see the bug.

In adding the support for the above change, I also tweaked the debug output for
-mdebug=reg so it prints more information about the switches set.  Other than
that, this is the same patch that I previously submitted.

I have done the bootstrap and make check with no regressions.  I have also
built special compilers without --with-cpu to make sure the changes were
correctly propagated.  Are these patches ok to install?
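
The bug described at the top of this message can be modelled with a few bit masks. This is a hedged sketch rather than the rs6000 code: the flag names, the bit values, and the functions `buggy_override` and `fixed_override` are all hypothetical, and `CPU_MASKS` merely plays the role of POWERPC_MASKS.

```cpp
#include <cstdint>

typedef std::uint64_t isa_flags;

// Hypothetical bit assignments, for illustration only.
const isa_flags FLAG_ALTIVEC = 1u << 0;
const isa_flags FLAG_VSX     = 1u << 1;
const isa_flags FLAG_SOFT    = 1u << 2;   // not controlled by the cpu choice

// Bits that a cpu selection is allowed to set or clear.
const isa_flags CPU_MASKS = FLAG_ALTIVEC | FLAG_VSX;

// has_cpu is false when neither -mcpu= nor --with-cpu= picked a processor.
isa_flags buggy_override (isa_flags defaults, bool has_cpu, isa_flags cpu_bits)
{
  isa_flags from_cpu = has_cpu ? cpu_bits : 0;
  return (defaults & ~CPU_MASKS) | from_cpu;   // drops default bits in CPU_MASKS
}

// Without a cpu, keep the default bits that fall inside CPU_MASKS.
isa_flags fixed_override (isa_flags defaults, bool has_cpu, isa_flags cpu_bits)
{
  isa_flags from_cpu = has_cpu ? cpu_bits : (defaults & CPU_MASKS);
  return (defaults & ~CPU_MASKS) | from_cpu;
}
```

In this model the two functions differ only when no cpu was selected, which matches the observation that compilers configured with --with-cpu=xxx do not show the problem.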

2012-09-27  Michael Meissner  meiss...@linux.vnet.ibm.com

* common/config/rs6000/rs6000-common.c (rs6000_handle_option):
Move all switches that set target_flags to set rs6000_isa_flags,
and make it HOST_WIDE_INT.  Save/restore new option words.  Add
TARGET_xxx maps for OPTION_xxx.  Add MASK_xxx maps for
OPTION_MASK_xxx.  Print more debug output for -mdebug=reg.  Move
masks for different cpu levels to rs6000-cpus.def.  Turn off VSX
if the assembler doesn't support Altivec.  Change #ifdef
TARGET_xxx to #ifdef OPTION_xxx.  If no -mcpu=xxx was used,
use all of the bits in TARGET_DEFAULT for the isa bits.
* gcc/config/rs6000/aix43.h (SUBTARGET_OVERRIDE_OPTIONS):
Likewise.
* gcc/config/rs6000/aix51.h (SUBTARGET_OVERRIDE_OPTIONS):
Likewise.
* gcc/config/rs6000/aix52.h (SUBTARGET_OVERRIDE_OPTIONS):
Likewise.
* gcc/config/rs6000/aix53.h (SUBTARGET_OVERRIDE_OPTIONS):
Likewise.
* gcc/config/rs6000/aix61.h (SUBTARGET_OVERRIDE_OPTIONS):
Likewise.
* gcc/config/rs6000/aix64.opt (-maix64): Likewise.
(-maix32): Likewise.
* gcc/config/rs6000/darwin.opt (-m64): Likewise.
(-m32): Likewise.
* gcc/config/rs6000/freebsd.h (RELOCATABLE_NEEDS_FIXUP):
Likewise.
* gcc/config/rs6000/freebsd64.h (RELOCATABLE_NEEDS_FIXUP):
Likewise.
(SUBSUBTARGET_OVERRIDE_OPTIONS): Likewise.
* gcc/config/rs6000/linux.h (RELOCATABLE_NEEDS_FIXUP): Likewise.
* gcc/config/rs6000/linux64.h (RELOCATABLE_NEEDS_FIXUP):
Likewise.
(SUBSUBTARGET_OVERRIDE_OPTIONS): Likewise.
(OPTION_LITTLE_ENDIAN): Likewise.
(OPTION_RELOCATABLE): Likewise.
(OPTION_EABI): Likewise.
(OPTION_PROTOTYPE): Likewise.
* gcc/config/rs6000/option-defaults.h (OPTION_MASK_64BIT):
Likewise.
(OPT_ARCH32): Likewise.
(OPT_ARCH64): Likewise.
* gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Likewise.
(rs6000_cpu_cpp_builtins): Likewise.
* gcc/config/rs6000/rs6000-cpus.def (ISA_2_1_MASKS): Likewise.
(ISA_2_2_MASKS): Likewise.
(ISA_2_4_MASKS): Likewise.
(ISA_2_5_MASKS_EMBEDDED): Likewise.
(ISA_2_5_MASKS_SERVER): Likewise.
(ISA_2_6_MASKS_EMBEDDED): Likewise.
(ISA_2_6_MASKS_SERVER): Likewise.
(POWERPC_7400_MASK): Likewise.
(POWERPC_MASKS): Likewise.
* gcc/config/rs6000/rs6000-protos.h
(rs6000_builtin_mask_calculate): Likewise.
(rs6000_target_modify_macros): Likewise.
(rs6000_target_modify_macros_ptr): Likewise.
* gcc/config/rs6000/rs6000.c (struct builtin_description):
Likewise.
(rs6000_target_modify_macros_ptr): Likewise.
(struct rs6000_builtin_info): Likewise.
(ISA_2_1_MASKS): Likewise.
(ISA_2_2_MASKS): Likewise.
(ISA_2_4_MASKS): Likewise.
(ISA_2_5_MASKS_EMBEDDED): Likewise.
(ISA_2_5_MASKS_SERVER): Likewise.
(ISA_2_6_MASKS_EMBEDDED): Likewise.
(ISA_2_6_MASKS_SERVER): Likewise.
(POWERPC_7400_MASK): Likewise.
(POWERPC_MASKS): Likewise.
(OPTION_MASK_STRICT_ALIGN): Likewise.
(struct rs6000_ptt): Likewise.
(DEBUG_FMT_ID): Likewise.
(DEBUG_FMT_D): Likewise.
(DEBUG_FMT_X): Likewise.
(DEBUG_FMT_WX): Likewise.
(DEBUG_FMT_WX2): Likewise.
(DEBUG_FMT_S): Likewise.
(rs6000_debug_reg_global): Likewise.
(darwin_rs6000_override_options): Likewise.
(rs6000_builtin_mask_calculate): Likewise.
(rs6000_option_override_internal): Likewise.
(rs6000_init_hard_regno_mode_ok): Likewise.
(paired_expand_builtin): Likewise.
(spe_expand_builtin): Likewise.
(rs6000_invalid_builtin): Likewise.
(rs6000_expand_builtin): Likewise.
(rs6000_builtin_decl): Likewise.
(rs6000_common_init_builtins): Likewise.
(rs6000_darwin_file_start): Likewise.
(struct rs6000_opt_mask): Likewise.
(rs6000_opt_masks): Likewise.

vec_cond_expr adjustments

2012-09-27 Thread Marc Glisse


Hello,

I have been experimenting with generating VEC_COND_EXPR from the 
front-end, and these are just a couple things I noticed.


1) optabs.c requires that the first argument of vec_cond_expr be a 
comparison, but verify_gimple_assign_ternary only checks 
is_gimple_condexpr, like for COND_EXPR. In the long term, it seems better 
to also allow ssa_name and vector_cst (thus match the is_gimple_condexpr 
condition), but for now I just want to know early if I created an invalid 
vec_cond_expr.


2) a little refactoring of the code to find a suitable vector type for 
comparison results, and one more place where it should be used (no 
testcase yet because I don't know if that path can be taken without 
front-end changes first). I did wonder, for tree-ssa-forwprop, about using 
directly TREE_TYPE (cond) without truth_type_for.


Hmm, now I am wondering whether I should have waited until I had front-end 
vec_cond_expr support to submit everything at once...


2012-09-27  Marc Glisse  marc.gli...@inria.fr

* tree-cfg.c (verify_gimple_assign_ternary): Stricter check on
first argument of VEC_COND_EXPR.
* tree.c (truth_type_for): New function.
* tree.h (truth_type_for): Declare.
* gimple-fold.c (and_comparisons_1): Call it.
(or_comparisons_1): Likewise.
* tree-ssa-forwprop.c (forward_propagate_into_cond): Likewise.

--
Marc Glisse
Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 191810)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -549,21 +549,22 @@ static bool
 forward_propagate_into_cond (gimple_stmt_iterator *gsi_p)
 {
   gimple stmt = gsi_stmt (*gsi_p);
   tree tmp = NULL_TREE;
   tree cond = gimple_assign_rhs1 (stmt);
   bool swap = false;
 
   /* We can do tree combining on SSA_NAME and comparison expressions.  */
   if (COMPARISON_CLASS_P (cond))
 tmp = forward_propagate_into_comparison_1 (stmt, TREE_CODE (cond),
-  boolean_type_node,
+  truth_type_for
+(TREE_TYPE (cond)),
   TREE_OPERAND (cond, 0),
   TREE_OPERAND (cond, 1));
   else if (TREE_CODE (cond) == SSA_NAME)
 {
   enum tree_code code;
   tree name = cond;
   gimple def_stmt = get_prop_source_stmt (name, true, NULL);
   if (!def_stmt || !can_propagate_from (def_stmt))
return 0;
 
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 191810)
+++ gcc/tree-cfg.c  (working copy)
@@ -3758,22 +3758,24 @@ verify_gimple_assign_ternary (gimple stm
   tree rhs2_type = TREE_TYPE (rhs2);
   tree rhs3 = gimple_assign_rhs3 (stmt);
   tree rhs3_type = TREE_TYPE (rhs3);
 
   if (!is_gimple_reg (lhs))
 {
   error ("non-register as LHS of ternary operation");
   return true;
 }
 
-  if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR)
-   ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1))
+  if (((rhs_code == COND_EXPR) ? !is_gimple_condexpr (rhs1)
+   : (rhs_code == VEC_COND_EXPR) ? (!is_gimple_condexpr (rhs1)
+   || is_gimple_val (rhs1))
+   : !is_gimple_val (rhs1))
   || !is_gimple_val (rhs2)
   || !is_gimple_val (rhs3))
 {
   error ("invalid operands in ternary operation");
   return true;
 }
 
   /* First handle operations that involve different types.  */
   switch (rhs_code)
 {
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 191810)
+++ gcc/gimple-fold.c   (working copy)
@@ -23,21 +23,20 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
 #include "flags.h"
 #include "function.h"
 #include "dumpfile.h"
 #include "tree-flow.h"
 #include "tree-ssa-propagate.h"
 #include "target.h"
 #include "gimple-fold.h"
-#include "langhooks.h"
 
 /* Return true when DECL can be referenced from current unit.
FROM_DECL (if non-null) specify constructor of variable DECL was taken from.
We can get declarations that are not possible to reference for various
reasons:
 
  1) When analyzing C++ virtual tables.
C++ virtual tables do have known constructors even
when they are keyed to other compilation unit.
Those tables can contain pointers to methods and vars
@@ -1686,29 +1685,21 @@ and_var_with_comparison_1 (gimple stmt,
(OP1A CODE1 OP1B) and (OP2A CODE2 OP2B), respectively.
If this can be done without constructing an intermediate value,
return the resulting tree; otherwise NULL_TREE is returned.
This function is deliberately asymmetric as it recurses on SSA_DEFs
in the first comparison but not the second.  */
 
 static tree
 and_comparisons_1 (enum

RFC: LRA for x86/x86-64 [0/9]

2012-09-27 Thread Vladimir Makarov


  Originally I planned to submit LRA at the very beginning of stage1 for
gcc4.9, as was discussed at this summer's GNU Tools Cauldron.  After
some thinking, I've decided to submit LRA now but switched on only for
the *x86/x86-64* target.  The reasons for that are
  o I am already pretty confident in LRA for this target in terms of
reliability, performance, code size, and compiler speed.
  o I am confident that I can fix LRA bugs and pitfalls which might be
recognized and reported during stage2 and 3 of gcc4.8.
  o Wider LRA testing for x86/x86-64 will smooth the hard transition of
other targets to LRA during gcc4.9 development.

  During development of gcc4.9, I'd like to switch major targets to
LRA as it was planned before.  I hope that all targets will be
switched for the next release after gcc4.9 (although it will be
dependent mostly on the target maintainers).  When/if it is done,
reload and reload oriented machine-dependent code can be removed.

  The LRA project was presented at the 2012 GNU Tools Cauldron
(http://gcc.gnu.org/wiki/cauldron2012).  The presentation contains a
high-level description of LRA and the project status.

  The following patches make LRA work for x86/x86-64.  Individually, the
patches mostly do nothing until the last patch switches on LRA for
x86/x86-64.  Although the compiler bootstraps after applying each
patch in the given order, the division is only for review convenience.

  Any comments and proposals are appreciated.  Even if the GCC community
decides that it is too late to submit it to gcc4.8, early reviews
are always useful.

  The patches were successfully bootstrapped and tested for x86/x86-64.

RFC: LRA for x86/x86-64 [1/9]

2012-09-27 Thread Vladimir Makarov

  The following patch adds a new argument to the function alter_subreg.
LRA will sometimes call alter_subreg with a different argument value.


2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* output.h (alter_subreg): Add new argument.
* dbxout.c (dbxout_symbol_location): Pass new argument to
alter_subreg.
* sdbout.c (sdbout_symbol): Pass new argument to alter_subreg.
* final.c (final_scan_insn, cleanup_subreg_operands): Pass new
argument to alter_subreg.
(walk_alter_subreg, output_operand): Ditto.
(alter_subreg): Add new argument.
* config/m32r/m32r.c (gen_split_move_double): Pass new argument to
alter_subreg.
* config/sh/sh.md: Ditto.
* config/xtensa/xtensa.c (fixup_subreg_mem): Ditto.
* config/m68k/m68k.c (emit_move_sequence): Ditto.
* config/arm/arm.c (load_multiple_sequence): Ditto.
(store_multiple_sequence): Ditto.
* config/pa/pa.c (pa_emit_move_sequence): Ditto.
* config/v850/v850.c (v850_reorg): Ditto.

Index: dbxout.c
===
--- dbxout.c	(revision 191771)
+++ dbxout.c	(working copy)
@@ -2994,7 +2994,7 @@ dbxout_symbol_location (tree decl, tree
 	  if (REGNO (value) >= FIRST_PSEUDO_REGISTER)
 	return 0;
 	}
-  home = alter_subreg (home);
+  home = alter_subreg (home, true);
 }
   if (REG_P (home))
 {
Index: config/pa/pa.c
===
--- config/pa/pa.c	(revision 191771)
+++ config/pa/pa.c	(working copy)
@@ -1616,7 +1616,7 @@ pa_emit_move_sequence (rtx *operands, en
   rtx temp = gen_rtx_SUBREG (GET_MODE (operand0),
  reg_equiv_mem (REGNO (SUBREG_REG (operand0))),
  SUBREG_BYTE (operand0));
-  operand0 = alter_subreg (temp);
+  operand0 = alter_subreg (temp, true);
 }
 
   if (scratch_reg
@@ -1633,7 +1633,7 @@ pa_emit_move_sequence (rtx *operands, en
   rtx temp = gen_rtx_SUBREG (GET_MODE (operand1),
  reg_equiv_mem (REGNO (SUBREG_REG (operand1))),
  SUBREG_BYTE (operand1));
-  operand1 = alter_subreg (temp);
+  operand1 = alter_subreg (temp, true);
 }
 
   if (scratch_reg && reload_in_progress && GET_CODE (operand0) == MEM
Index: config/v850/v850.c
===
--- config/v850/v850.c	(revision 191771)
+++ config/v850/v850.c	(working copy)
@@ -1301,11 +1301,11 @@ v850_reorg (void)
 	  if (GET_CODE (dest) == SUBREG
 		  && (GET_CODE (SUBREG_REG (dest)) == MEM
 		  || GET_CODE (SUBREG_REG (dest)) == REG))
-		alter_subreg (dest);
+		alter_subreg (dest, true);
 	  if (GET_CODE (src) == SUBREG
 		  && (GET_CODE (SUBREG_REG (src)) == MEM
 		  || GET_CODE (SUBREG_REG (src)) == REG))
-		alter_subreg (src);
+		alter_subreg (src, true);
 
 	  if (GET_CODE (dest) == MEM  GET_CODE (src) == MEM)
 		mem = NULL_RTX;
Index: config/sh/sh.md
===
--- config/sh/sh.md	(revision 191771)
+++ config/sh/sh.md	(working copy)
@@ -7275,7 +7275,7 @@ label:
 	  rtx regop = operands[store_p], word0 ,word1;
 
 	  if (GET_CODE (regop) == SUBREG)
-	alter_subreg (regop);
+	alter_subreg (regop, true);
 	  if (REGNO (XEXP (addr, 0)) == REGNO (XEXP (addr, 1)))
 	offset = 2;
 	  else
@@ -7283,9 +7283,9 @@ label:
 	  mem = copy_rtx (mem);
 	  PUT_MODE (mem, SImode);
 	  word0 = gen_rtx_SUBREG (SImode, regop, 0);
-	  alter_subreg (word0);
+	  alter_subreg (word0, true);
 	  word1 = gen_rtx_SUBREG (SImode, regop, 4);
-	  alter_subreg (word1);
+	  alter_subreg (word1, true);
 	  if (store_p || ! refers_to_regno_p (REGNO (word0),
 	  REGNO (word0) + 1, addr, 0))
 	{
@@ -7743,7 +7743,7 @@ label:
   else
 	{
 	  x = gen_rtx_SUBREG (V2SFmode, operands[0], i * 8);
-	  alter_subreg (x);
+	  alter_subreg (x, true);
 	}
 
   if (MEM_P (operands[1]))
@@ -7752,7 +7752,7 @@ label:
   else
 	{
 	  y = gen_rtx_SUBREG (V2SFmode, operands[1], i * 8);
-	  alter_subreg (y);
+	  alter_subreg (y, true);
 	}
 
   emit_insn (gen_movv2sf_i (x, y));
Index: config/xtensa/xtensa.c
===
--- config/xtensa/xtensa.c	(revision 191771)
+++ config/xtensa/xtensa.c	(working copy)
@@ -1087,7 +1087,7 @@ fixup_subreg_mem (rtx x)
 	gen_rtx_SUBREG (GET_MODE (x),
 			reg_equiv_mem (REGNO (SUBREG_REG (x))),
 			SUBREG_BYTE (x));
-  x = alter_subreg (temp);
+  x = alter_subreg (temp, true);
 }
   return x;
 }
Index: config/m68k/m68k.c
===
--- config/m68k/m68k.c	(revision 191771)
+++ config/m68k/m68k.c	(working copy)
@@ -3658,7 +3658,7 @@ emit_move_sequence (rtx *operands, enum
   rtx temp = gen_rtx_SUBREG (GET_MODE (operand0),
  reg_equiv_mem (REGNO (SUBREG_REG (operand0))),
  SUBREG_BYTE (operand0));
-  operand0 = alter_subreg (temp);
+  operand0 = alter_subreg (temp, true);
 }
 
   if (scratch_reg
@@

RFC: LRA for x86/x86-64 [2/9]

2012-09-27 Thread Vladimir Makarov

LRA outputs a lot of debug information about insns.  I found that using
the slim insn/rtl presentation helps a lot for LRA debugging.  The following
patch makes the slim presentation printing functions visible to LRA.  It
also implements one more such function.


2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* rtl.h (debug_bb_n_slim, debug_bb_slim, print_value_slim): New
prototypes.
(debug_rtl_slim, debug_insn_slim): Ditto.
* sched-vis.c (print_value_slim): New.

Index: rtl.h
===
--- rtl.h	(revision 191771)
+++ rtl.h	(working copy)
@@ -2487,7 +2487,12 @@ extern rtx make_compound_operation (rtx,
 extern void delete_dead_jumptables (void);
 
 /* In sched-vis.c.  */
-extern void dump_insn_slim (FILE *, const_rtx x);
+extern void debug_bb_n_slim (int);
+extern void debug_bb_slim (struct basic_block_def *);
+extern void print_value_slim (FILE *, const_rtx, int);
+extern void debug_rtl_slim (FILE *, const_rtx, const_rtx, int, int);
+extern void dump_insn_slim (FILE *f, const_rtx x);
+extern void debug_insn_slim (const_rtx x);
 
 /* In sched-rgn.c.  */
 extern void schedule_insns (void);
Index: sched-vis.c
===
--- sched-vis.c	(revision 191771)
+++ sched-vis.c	(working copy)
@@ -546,6 +546,19 @@ print_value (char *buf, const_rtx x, int
 }
 }/* print_value */
 
+/* Prints rtxes, I customarily classified as values.  They're
+   constants, registers, labels, symbols and memory accesses.  Print
+   them to file F.  */
+
+void
+print_value_slim (FILE *f, const_rtx x, int verbose)
+{
+  char buf[BUF_LEN];
+
+  print_value (buf, x, verbose);
+  fprintf (f, "%s", buf);
+}
+
 /* The next step in insn detalization, its pattern recognition.  */
 
 void

RFC: LRA for x86/x86-64 [3/9]

2012-09-27 Thread Vladimir Makarov


LRA creates a lot of new pseudos.  So the following patch allocates
reg info ahead of need, which is important for LRA
compilation speed.

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* reginfo.c (max_regno_since_last_resize): New.
(reg_preferred_class, reg_alternate_class): Add assert.
(allocate_reg_info): Initialize allocated reg info.
(resize_reg_info): Make bigger reg_info and initialize new memory.
(reginfo_init): Initialize max_regno_since_last_resize.
(setup_reg_classes): Change assert.
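
The allocation strategy in this patch can be sketched independently of GCC. The following is a simplified model, assuming a hypothetical `reg_info_table` type built on `std::vector`; only the 3/2 + 1 headroom factor, the -1 fill for new renumber slots, and the "did anything change" return value are taken from the diff below.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the reg info arrays.
struct reg_info_table
{
  std::vector<short> renumber;     // stand-in for reg_renumber
  std::size_t seen_regs;           // role of max_regno_since_last_resize

  reg_info_table () : seen_regs (0) {}

  // MAX_REGS is the current register count.  Returns true if new
  // registers appeared since the previous call.
  bool resize (std::size_t max_regs)
  {
    bool changed = seen_regs != max_regs;
    seen_regs = max_regs;
    if (renumber.size () < max_regs)
      renumber.resize (max_regs * 3 / 2 + 1, -1);   // headroom; new slots are -1
    return changed;
  }
};
```

Because each reallocation leaves roughly 50% spare capacity, a pass that creates pseudos one at a time triggers only occasional reallocations, while the return value still reports every change in the register count.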

Index: reginfo.c
===
--- reginfo.c	(revision 191771)
+++ reginfo.c	(working copy)
@@ -839,6 +839,8 @@ static struct reg_pref *reg_pref;
 
 /* Current size of reg_info.  */
 static int reg_info_size;
+/* Max_reg_num still last resize_reg_info call.  */
+static int max_regno_since_last_resize;
 
 /* Return the reg_class in which pseudo reg number REGNO is best allocated.
This function is sometimes called before the info has been computed.
@@ -849,6 +851,7 @@ reg_preferred_class (int regno)
   if (reg_pref == 0)
 return GENERAL_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].prefclass;
 }
 
@@ -858,6 +861,7 @@ reg_alternate_class (int regno)
   if (reg_pref == 0)
 return ALL_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].altclass;
 }
 
@@ -868,45 +872,64 @@ reg_allocno_class (int regno)
   if (reg_pref == 0)
 return NO_REGS;
 
+  gcc_assert (regno < reg_info_size);
   return (enum reg_class) reg_pref[regno].allocnoclass;
 }
 
 
 
-/* Allocate space for reg info.  */
+/* Allocate space for reg info and initialize it.  */
 static void
 allocate_reg_info (void)
 {
-  reg_info_size = max_reg_num ();
+  int i;
+
+  max_regno_since_last_resize = max_reg_num ();
+  reg_info_size = max_regno_since_last_resize * 3 / 2 + 1;
   gcc_assert (! reg_pref && ! reg_renumber);
   reg_renumber = XNEWVEC (short, reg_info_size);
   reg_pref = XCNEWVEC (struct reg_pref, reg_info_size);
   memset (reg_renumber, -1, reg_info_size * sizeof (short));
+  for (i = 0; i < reg_info_size; i++)
+{
+  reg_pref[i].prefclass = GENERAL_REGS;
+  reg_pref[i].altclass = ALL_REGS;
+  reg_pref[i].allocnoclass = GENERAL_REGS;
+}
 }
 
 
-/* Resize reg info. The new elements will be uninitialized.  Return
-   TRUE if new elements (for new pseudos) were added.  */
+/* Resize reg info. The new elements will be initialized.  Return TRUE
+   if new pseudos were added since the last call.  */
 bool
 resize_reg_info (void)
 {
-  int old;
+  int old, i;
+  bool change_p;
 
   if (reg_pref == NULL)
 {
   allocate_reg_info ();
   return true;
 }
-  if (reg_info_size == max_reg_num ())
-return false;
+  change_p = max_regno_since_last_resize != max_reg_num ();
+  max_regno_since_last_resize = max_reg_num ();
+  if (reg_info_size >= max_reg_num ())
+return change_p;
   old = reg_info_size;
-  reg_info_size = max_reg_num ();
+  reg_info_size = max_reg_num () * 3 / 2 + 1;
   gcc_assert (reg_pref && reg_renumber);
   reg_renumber = XRESIZEVEC (short, reg_renumber, reg_info_size);
   reg_pref = XRESIZEVEC (struct reg_pref, reg_pref, reg_info_size);
   memset (reg_pref + old, -1,
 	  (reg_info_size - old) * sizeof (struct reg_pref));
   memset (reg_renumber + old, -1, (reg_info_size - old) * sizeof (short));
+  for (i = old; i < reg_info_size; i++)
+{
+  reg_pref[i].prefclass = GENERAL_REGS;
+  reg_pref[i].altclass = ALL_REGS;
+  reg_pref[i].allocnoclass = GENERAL_REGS;
+}
   return true;
 }
 
@@ -938,6 +961,7 @@ reginfo_init (void)
   /* This prevents dump_reg_info from losing if called
  before reginfo is run.  */
   reg_pref = NULL;
+  reg_info_size = max_regno_since_last_resize = 0;
   /* No more global register variables may be declared.  */
   no_global_reg_vars = 1;
   return 1;
@@ -964,7 +988,7 @@ struct rtl_opt_pass pass_reginfo_init =
 
 
 
-/* Set up preferred, alternate, and cover classes for REGNO as
+/* Set up preferred, alternate, and allocno classes for REGNO as
PREFCLASS, ALTCLASS, and ALLOCNOCLASS.  */
 void
 setup_reg_classes (int regno,
@@ -973,7 +997,7 @@ setup_reg_classes (int regno,
 {
   if (reg_pref == NULL)
 return;
-  gcc_assert (reg_info_size == max_reg_num ());
+  gcc_assert (reg_info_size >= max_reg_num ());
   reg_pref[regno].prefclass = prefclass;
   reg_pref[regno].altclass = altclass;
   reg_pref[regno].allocnoclass = allocnoclass;

RFC: LRA for x86/x86-64 [4/9]

2012-09-27 Thread Vladimir Makarov


  The following patch implements hooks (and their default values) that
will be used by LRA.
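
A minimal sketch of how such hooks and their defaults fit together, assuming a cut-down hook table (`struct hooks`, `toy_lra_p` and `toy_register_bank` are hypothetical; the real table is generated from target.def):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down hook table.  */
struct hooks
{
  bool (*lra_p) (void);
  int (*register_bank) (int hard_regno);
};

/* Defaults with the same behaviour as the ones in the patch.  */
bool default_lra_p (void) { return false; }
int default_register_bank (int hard_regno) { (void) hard_regno; return 0; }

/* A made-up target that opts in to LRA and ranks registers 8 and up as
   less preferable.  */
bool toy_lra_p (void) { return true; }
int toy_register_bank (int hard_regno) { return hard_regno >= 8 ? 1 : 0; }

/* A target that defines nothing gets the defaults; a ported target
   overrides the entries it cares about.  */
const struct hooks default_hooks = { default_lra_p, default_register_bank };
const struct hooks toy_hooks = { toy_lra_p, toy_register_bank };
```

Generic code only ever calls through the table, so it does not need to know which target supplied the function.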

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* target.h: Include tm.h.
* targhooks.h (default_lra_p): Declare.
(default_register_bank): Ditto.
(default_different_addr_displacement_p): Ditto.
* targhooks.c (default_lra_p): New function.
(default_register_bank): Ditto.
(default_different_addr_displacement_p): Ditto.
* target.def (lra_p): New hook.
(register_bank): Ditto.
(different_addr_displacement_p): Ditto.
(spill_class, spill_class_mode): New hooks.
* doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_BANK,
TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, TARGET_SPILL_CLASS, and
TARGET_SPILL_CLASS_MODE.
* doc/tm.texi: Update.

Index: target.h
===
--- target.h	(revision 191771)
+++ target.h	(working copy)
@@ -51,6 +51,7 @@
 #define GCC_TARGET_H
 
 #include "insn-modes.h"
+#include "tm.h"
 
 #ifdef ENABLE_CHECKING
 
Index: targhooks.c
===
--- targhooks.c	(revision 191771)
+++ targhooks.c	(working copy)
@@ -840,6 +840,24 @@ default_branch_target_register_class (vo
   return NO_REGS;
 }
 
+extern bool
+default_lra_p (void)
+{
+  return false;
+}
+
+int
+default_register_bank (int hard_regno ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+extern bool
+default_different_addr_displacement_p (void)
+{
+  return false;
+}
+
 reg_class_t
 default_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x ATTRIBUTE_UNUSED,
 			  reg_class_t reload_class_i ATTRIBUTE_UNUSED,
Index: targhooks.h
===
--- targhooks.h	(revision 191771)
+++ targhooks.h	(working copy)
@@ -132,6 +132,9 @@ extern rtx default_static_chain (const_t
 extern void default_trampoline_init (rtx, tree, rtx);
 extern int default_return_pops_args (tree, tree, int);
 extern reg_class_t default_branch_target_register_class (void);
+extern bool default_lra_p (void);
+extern int default_register_bank (int);
+extern bool default_different_addr_displacement_p (void);
 extern reg_class_t default_secondary_reload (bool, rtx, reg_class_t,
 	 enum machine_mode,
 	 secondary_reload_info *);
Index: doc/tm.texi
===
--- doc/tm.texi	(revision 191771)
+++ doc/tm.texi	(working copy)
@@ -2893,6 +2893,26 @@ as below:
 @end smallexample
 @end defmac
 
+@deftypefn {Target Hook} bool TARGET_LRA_P (void)
+A target hook which returns true if we use LRA instead of reload pass.  It means that LRA was ported to the target.  The default version of this target hook always returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} int TARGET_REGISTER_BANK (int)
+A target hook which returns the register bank number to which the register @var{hard_regno} belongs.  The smaller the number, the more preferable the hard register usage (when all other conditions are the same).  This hook can be used to prefer some hard registers over others in LRA.  For example, some x86-64 registers need an additional prefix, which makes instructions longer.  The hook can return a bigger bank number for such registers to make them less favorable and, as a result, make the generated code smaller.  The default version of this target hook always returns zero.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_DIFFERENT_ADDR_DISPLACEMENT_P (void)
+A target hook which returns true if an address with the same structure can have different maximal legitimate displacement.  For example, the displacement can depend on memory mode or on operand combinations in the insn.  The default version of this target hook always returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} {enum reg_class} TARGET_SPILL_CLASS (enum @var{reg_class})
+This hook defines a class of registers which could be used for spilled pseudos of the given class instead of memory.
+@end deftypefn
+
+@deftypefn {Target Hook} {enum machine_mode} TARGET_SPILL_CLASS_MODE (enum @var{reg_class}, enum @var{reg_class}, enum @var{machine_mode})
+This hook defines the mode in which a pseudo of the given mode and of the first register class can be spilled into the second register class.
+@end deftypefn
+
 @node Old Constraints
 @section Obsolete Macros for Defining Constraints
 @cindex defining constraints, obsolete method
Index: doc/tm.texi.in
===
--- doc/tm.texi.in	(revision 191771)
+++ doc/tm.texi.in	(working copy)
@@ -2869,6 +2869,16 @@ as below:
 @end smallexample
 @end defmac
 
+@hook TARGET_LRA_P
+
+@hook TARGET_REGISTER_BANK
+
+@hook TARGET_DIFFERENT_ADDR_DISPLACEMENT_P
+
+@hook TARGET_SPILL_CLASS
+
+@hook TARGET_SPILL_CLASS_MODE
+
 @node Old Constraints
 @section Obsolete Macros for Defining Constraints
 @cindex defining constraints, obsolete method
Index: target.def

RFC: LRA for x86/x86-64 [5/9]

2012-09-27 Thread Vladimir Makarov


  The following patch mostly prepares some data from IRA which will be
used by LRA.  It does so by moving some definitions from ira-int.h to
ira.h.  New data, reg_class_subset, is generated in IRA for LRA.
New functions dealing with equivs are created.  They will be used by
LRA.  Some IRA code is rewritten to use them too.

  The patch also adds wrapper code in IRA to prepare for calling
LRA.

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* ira-int.h (struct target_ira_int): Remove x_ira_class_subset_p
and x_ira_reg_classes_intersect_p.
(ira_class_subset_p, ira_reg_classes_intersect_p): Remove.
(ira_reg_equiv_len, ira_reg_equiv_invariant_p): Ditto.
(ira_reg_equiv_const): Ditto.
(ira_equiv_no_lvalue_p): New function.
* ira-color.c (color_pass, move_spill_restore, coalesce_allocnos):
Use ira_equiv_no_lvalue_p.
(coalesce_spill_slots, ira_sort_regnos_for_alter_reg): Ditto.
* ira-emit.c (ira_create_new_reg): Call ira_expand_reg_equiv.
(generate_edge_moves, change_loop): Use ira_equiv_no_lvalue_p.
(emit_move_list): Simplify code.  Call
ira_update_equiv_info_by_shuffle_insn.  Use ira_reg_equiv instead
of ira_reg_equiv_invariant_p and ira_reg_equiv_const. Change
assert.
* ira.c (setup_reg_class_relations): Set up ira_reg_class_subset.
(ira_reg_equiv_invariant_p, ira_reg_equiv_const): Remove.
(find_reg_equiv_invariant_const): Ditto.
(setup_reg_renumber): Use ira_equiv_no_lvalue_p instead
of ira_reg_equiv_invariant_p.  Skip caps for LRA.
(setup_reg_equiv_init, ira_update_equiv_info_by_shuffle_insn): New
functions.
(ira_reg_equiv_len): Move it before ira_reg_equiv. Change
comment.
(ira_reg_equiv): New.
(ira_expand_reg_equiv, finish_reg_equiv): New functions.
(no_equiv, update_equiv_regs): Use ira_reg_equiv instead of
reg_equiv_init.
(setup_reg_equiv): New function.
(ira_use_lra_p): New global.
(ira): Move initialization of ira_obstack and ira_bitmap_obstack
upper.  Call init_reg_equiv, setup_reg_equiv, and
setup_reg_equiv_init instead of initialization of
ira_reg_equiv_len, ira_reg_equiv_invariant_p, and
ira_reg_equiv_const.  Don't flatten IRA for LRA. Don't
reassign conflict allocnos for LRA. Call finish_reg_equiv.
(do_reload): Prepare code for LRA call.
* ira.h (ira_use_lra_p): New external.
(struct target_ira): Add members x_ira_class_subset_p,
x_ira_reg_class_subset, and x_ira_reg_classes_intersect_p.
(ira_class_subset_p, ira_reg_class_subset): New macros.
(ira_reg_classes_intersect_p): New macro.
(ira_reg_equiv_len, ira_reg_equiv): New externals.
(struct ira_reg_equiv): New.
(ira_expand_reg_equiv, ira_update_equiv_info_by_shuffle_insn): New
prototypes.

Index: ira-int.h
===
--- ira-int.h	(revision 191771)
+++ ira-int.h	(working copy)
@@ -795,11 +795,6 @@ struct target_ira_int {
   /* Map class->true if class is a pressure class, false otherwise. */
   bool x_ira_reg_pressure_class_p[N_REG_CLASSES];
 
-  /* Register class subset relation: TRUE if the first class is a subset
- of the second one considering only hard registers available for the
- allocation.  */
-  int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES];
-
   /* Array of the number of hard registers of given class which are
  available for allocation.  The order is defined by the hard
  register numbers.  */
@@ -838,13 +833,8 @@ struct target_ira_int {
  taking all hard-registers including fixed ones into account.  */
   enum reg_class x_ira_reg_class_intersect[N_REG_CLASSES][N_REG_CLASSES];
 
-  /* True if the two classes (that is calculated taking only hard
- registers available for allocation into account; are
- intersected.  */
-  bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES];
-
   /* Classes with end marker LIM_REG_CLASSES which are intersected with
- given class (the first index;.  That includes given class itself.
+ given class (the first index).  That includes given class itself.
  This is calculated taking only hard registers available for
  allocation into account.  */
   enum reg_class x_ira_reg_class_super_classes[N_REG_CLASSES][N_REG_CLASSES];
@@ -861,7 +851,7 @@ struct target_ira_int {
 
   /* For each reg class, table listing all the classes contained in it
  (excluding the class itself.  Non-allocatable registers are
- excluded from the consideration;.  */
+ excluded from the consideration).  */
   enum reg_class x_alloc_reg_class_subclasses[N_REG_CLASSES][N_REG_CLASSES];
 
   /* Array whose values are hard regset of hard registers for which
@@ -894,8 +884,6 @@ extern struct target_ira_int *this_targe
   (this_target_ira_int->x_ira_reg_allocno_class_p)
 #define ira_reg_pressure_class_p \
   (this_target_ira_int->x_ira_reg_pressure_class_p)
-#define ira_class_subset_p \
-

RFC: LRA for x86/x86-64 [6/9]

2012-09-27 Thread Vladimir Makarov


  The following patch modifies some code in the rest of the compiler so
that LRA works correctly.  The code behaves the same way when LRA is
not used, which is achieved by checking a new variable, lra_in_progress.
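
A small sketch of the flag-gating pattern, modelled loosely on the true_regnum change in this patch (`FIRST_PSEUDO`, `renumber` and `toy_true_regnum` are made-up stand-ins, not the jump.c code):

```c
#include <assert.h>

#define FIRST_PSEUDO 4   /* Hypothetical: registers 0..3 are hard registers.  */

int lra_in_progress = 0;   /* Set to 1 while in LRA.  */

/* Made-up assignment table: entry -1 means "no hard register yet".  */
short renumber[8] = { 0, 1, 2, 3, 2, -1, 0, -1 };

/* Return the hard register assigned to REGNO.  Outside LRA an unassigned
   pseudo is reported under its own number, exactly as before; during LRA
   the raw table entry (possibly -1) is returned instead.  */
int
toy_true_regnum (int regno)
{
  if (regno >= FIRST_PSEUDO
      && (lra_in_progress || renumber[regno] >= 0))
    return renumber[regno];
  return regno;
}
```

With the flag clear, `toy_true_regnum (5)` yields 5; with the flag set it yields -1, while hard registers and assigned pseudos are unaffected either way.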

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* rtlanal.c (simplify_subreg_regno): Permit ARG_POINTER_REGNUM and
STACK_POINTER_REGNUM for LRA.
* jump.c (true_regnum): Always use hard_regno for subreg_get_info when
lra is in progress.
* expr.c (emit_move_insn_1): Pass an additional argument to
emit_move_via_integer.  Use emit_move_via_integer for LRA only if
the insn is recognized.
* recog.c (general_operand, register_operand): Accept paradoxical
FLOAT_MODE subregs for LRA.
(scratch_operand): Accept pseudos for LRA.
* emit-rtl.c (gen_rtx_REG): Add lra_in_progress.
(validate_subreg): Don't check offset for LRA and
floating point modes.
* rtl.h (lra_in_progress): New external.
* ira.c (lra_in_progress): Define.

Index: rtl.h
===
--- rtl.h	(revision 191771)
+++ rtl.h	(working copy)
@@ -2369,6 +2369,9 @@ extern int epilogue_completed;
 
 extern int reload_in_progress;
 
+/* Set to 1 while in lra.  */
+extern int lra_in_progress;
+
 /* This macro indicates whether you may create a new
pseudo-register.  */
 
Index: ira.c
===
--- ira.c	(revision 191771)
+++ ira.c	(working copy)
@@ -4308,6 +4308,9 @@ bool ira_conflicts_p;
 /* Saved between IRA and reload.  */
 static int saved_flag_ira_share_spill_slots;
 
+/* Set to 1 while in lra.  */
+int lra_in_progress = 0;
+
 /* This is the main entry of IRA.  */
 static void
 ira (FILE *f)
Index: jump.c
===
--- jump.c	(revision 191771)
+++ jump.c	(working copy)
@@ -1868,7 +1868,8 @@ true_regnum (const_rtx x)
 {
   if (REG_P (x))
 {
-  if (REGNO (x) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO (x)] >= 0)
+  if (REGNO (x) >= FIRST_PSEUDO_REGISTER
+	  && (lra_in_progress || reg_renumber[REGNO (x)] >= 0))
 	return reg_renumber[REGNO (x)];
   return REGNO (x);
 }
@@ -1880,7 +1881,8 @@ true_regnum (const_rtx x)
 	{
 	  struct subreg_info info;
 
-	  subreg_get_info (REGNO (SUBREG_REG (x)),
+	  subreg_get_info (lra_in_progress
+			   ? (unsigned) base : REGNO (SUBREG_REG (x)),
 			   GET_MODE (SUBREG_REG (x)),
 			   SUBREG_BYTE (x), GET_MODE (x), &info);
 
Index: rtlanal.c
===
--- rtlanal.c	(revision 191771)
+++ rtlanal.c	(working copy)
@@ -3465,7 +3465,9 @@ simplify_subreg_regno (unsigned int xreg
   /* Give the backend a chance to disallow the mode change.  */
   if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT
       && GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT
-      && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode))
+      && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)
+      /* We can use mode change in LRA for some transformations.  */
+      && ! lra_in_progress)
 return -1;
 #endif
 
@@ -3475,10 +3477,16 @@ simplify_subreg_regno (unsigned int xreg
 return -1;
 
   if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
-      && xregno == ARG_POINTER_REGNUM)
+      /* We should convert arg register in LRA after the elimination
+	 if it is possible.  */
+      && xregno == ARG_POINTER_REGNUM
+      && ! lra_in_progress)
 return -1;
 
-  if (xregno == STACK_POINTER_REGNUM)
+  if (xregno == STACK_POINTER_REGNUM
+  /* We should convert hard stack register in LRA if it is
+	 possible.  */
+      && ! lra_in_progress)
 return -1;
 
   /* Try to get the register offset.  */
Index: emit-rtl.c
===
--- emit-rtl.c	(revision 191771)
+++ emit-rtl.c	(working copy)
@@ -581,7 +581,7 @@ gen_rtx_REG (enum machine_mode mode, uns
  Also don't do this when we are making new REGs in reload, since
  we don't want to get confused with the real pointers.  */
 
-  if (mode == Pmode && !reload_in_progress)
+  if (mode == Pmode && !reload_in_progress && !lra_in_progress)
 {
   if (regno == FRAME_POINTER_REGNUM
 	  && (!reload_completed || frame_pointer_needed))
@@ -723,7 +723,14 @@ validate_subreg (enum machine_mode omode
  (subreg:SI (reg:DF) 0) isn't.  */
   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
 {
-  if (isize != osize)
+  if (! (isize == osize
+	 /* LRA can use subreg to store a floating point value in
+		an integer mode.  Although the floating point and the
+		integer modes need the same number of hard registers,
+		the size of floating point mode can be less than the
+		integer mode.  LRA also uses subregs for a register that
+		should be used in a different mode in one insn.  */
+	 || lra_in_progress))
 	return false;
 }
 
@@ -756,7 +763,8 @@ validate_subreg (enum machine_mode omode
  of a subword.  A subreg does *not* perform arbitrary bit

RFC: LRA for x86/x86-64 [8/9]

2012-09-27 Thread Vladimir Makarov


  The following patch adds code necessary for LRA to work correctly
(function ira_setup_eliminable_regset) and for the compiler to work
correctly when LRA is used (see file dwarf2out.c).

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* loop-invariant.c (calculate_loop_reg_pressure): Pass new
argument to ira_setup_eliminable_regset.
* haifa-sched.c (sched_init): Pass new argument to
ira_setup_eliminable_regset.
* dwarf2out.c: Include lra.h.
(based_loc_descr, compute_frame_pointer_to_fb_displacement): Use
lra_eliminate_regs for LRA instead of eliminate_regs.
* ira.c (ira_setup_eliminable_regset): Add parameter. Remove
need_fp.  Call lra_init_elimination and mark
HARD_FRAME_POINTER_REGNUM as living forever if
frame_pointer_needed.
(ira): Call ira_setup_eliminable_regset with a new
argument.
* ira.h (ira_setup_eliminable_regset): Add an argument.
* Makefile.in (dwarf2out.o): Add dependence on ira.h and lra.h.

Index: ira.c
===
--- ira.c	(revision 191771)
+++ ira.c	(working copy)
@@ -1828,9 +1828,11 @@ compute_regs_asm_clobbered (void)
 }
 
 
-/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE.  */
+/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE.
+   If the function is called from IRA (not from the insn scheduler or
+   RTL loop invariant motion), FROM_IRA_P is true.  */
 void
-ira_setup_eliminable_regset (void)
+ira_setup_eliminable_regset (bool from_ira_p)
 {
 #ifdef ELIMINABLE_REGS
   int i;
@@ -1840,7 +1842,7 @@ ira_setup_eliminable_regset (void)
  sp for alloca.  So we can't eliminate the frame pointer in that
  case.  At some point, we should improve this by emitting the
  sp-adjusting insns for this case.  */
-  int need_fp
+  frame_pointer_needed
 = (! flag_omit_frame_pointer
        || (cfun->calls_alloca && EXIT_IGNORE_STACK)
/* We need the frame pointer to catch stack overflow exceptions
@@ -1850,8 +1852,14 @@ ira_setup_eliminable_regset (void)
        || crtl->stack_realign_needed
|| targetm.frame_pointer_required ());
 
-  frame_pointer_needed = need_fp;
+  if (from_ira_p && ira_use_lra_p)
+/* It can change FRAME_POINTER_NEEDED.  We call it only from IRA
+   because it is expensive.  */
+lra_init_elimination ();
 
+  if (frame_pointer_needed)
+df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true);
+
   COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs);
   CLEAR_HARD_REG_SET (eliminable_regset);
 
@@ -1864,7 +1872,7 @@ ira_setup_eliminable_regset (void)
 {
   bool cannot_elim
 	= (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
-	   || (eliminables[i].to == STACK_POINTER_REGNUM && need_fp));
+	   || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));
 
   if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
 	{
@@ -1883,10 +1891,10 @@ ira_setup_eliminable_regset (void)
   if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM))
 {
   SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM);
-  if (need_fp)
+  if (frame_pointer_needed)
 	SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM);
 }
-  else if (need_fp)
+  else if (frame_pointer_needed)
     error ("%s cannot be used in asm here",
 	   reg_names[HARD_FRAME_POINTER_REGNUM]);
   else
@@ -1897,10 +1905,10 @@ ira_setup_eliminable_regset (void)
   if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM))
 {
   SET_HARD_REG_BIT (eliminable_regset, FRAME_POINTER_REGNUM);
-  if (need_fp)
+  if (frame_pointer_needed)
 	SET_HARD_REG_BIT (ira_no_alloc_regs, FRAME_POINTER_REGNUM);
 }
-  else if (need_fp)
+  else if (frame_pointer_needed)
     error ("%s cannot be used in asm here", reg_names[FRAME_POINTER_REGNUM]);
   else
 df_set_regs_ever_live (FRAME_POINTER_REGNUM, true);
@@ -4399,7 +4407,7 @@ ira (FILE *f)
 find_moveable_pseudos ();
 
   max_regno_before_ira = max_reg_num ();
-  ira_setup_eliminable_regset ();
+  ira_setup_eliminable_regset (true);
 
   ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
   ira_load_cost = ira_store_cost = ira_shuffle_cost = 0;
Index: ira.h
===
--- ira.h	(revision 191771)
+++ ira.h	(working copy)
@@ -173,7 +173,7 @@ extern struct ira_reg_equiv *ira_reg_equ
 extern void ira_init_once (void);
 extern void ira_init (void);
 extern void ira_finish_once (void);
-extern void ira_setup_eliminable_regset (void);
+extern void ira_setup_eliminable_regset (bool);
 extern rtx ira_eliminate_regs (rtx, enum machine_mode);
 extern void ira_set_pseudo_classes (FILE *);
 extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *);
Index: loop-invariant.c
===
--- loop-invariant.c	(revision 191771)
+++ loop-invariant.c	(working copy)
@@ -1800,7

RFC: LRA for x86/x86-64 [9/9]

2012-09-27 Thread Vladimir Makarov

This is the last patch, which switches on LRA for x86/x86-64.  The patch
also contains code deciding when to spill general regs into SSE regs
instead of memory.
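
The spill decision can be pictured with the following sketch. It assumes a toy model in which "subset of GENERAL_REGS" is reduced to an equality test; all `toy_`/`TOY_` names are hypothetical and this is not the i386.c code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the register classes and modes involved.  */
enum toy_class { TOY_NO_REGS, TOY_GENERAL_REGS, TOY_SSE_REGS };
enum toy_mode { TOY_VOIDmode, TOY_QImode, TOY_SImode, TOY_DImode };

struct toy_target
{
  bool has_sse;     /* SSE registers are available.  */
  bool sse_spill;   /* The tuning flag asks for spills into SSE regs.  */
  bool is_64bit;
};

/* Class to spill pseudos of class RCLASS into, or TOY_NO_REGS for memory.  */
enum toy_class
toy_spill_class (const struct toy_target *t, enum toy_class rclass)
{
  if (t->has_sse && t->sse_spill && rclass == TOY_GENERAL_REGS)
    return TOY_SSE_REGS;
  return TOY_NO_REGS;
}

/* Mode to use for such a spill, or TOY_VOIDmode if it is not possible.  */
enum toy_mode
toy_spill_class_mode (const struct toy_target *t, enum toy_class rclass,
                      enum toy_class spill_class, enum toy_mode mode)
{
  if (toy_spill_class (t, rclass) != spill_class
      || spill_class != TOY_SSE_REGS)
    return TOY_VOIDmode;
  /* Only word-sized integer modes are moved into SSE registers.  */
  if (mode == TOY_SImode || (t->is_64bit && mode == TOY_DImode))
    return mode;
  return TOY_VOIDmode;
}
```

In this model a 32-bit target can spill SImode pseudos to SSE registers but not DImode ones, and a target without the tuning flag always falls back to memory.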


2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* config/i386/i386.h (enum ix86_tune_indices): Add
X86_TUNE_GENERAL_REGS_SSE_SPILL.
(TARGET_GENERAL_REGS_SSE_SPILL): New macro.
* config/i386/i386.c (initial_ix86_tune_features): Set up
X86_TUNE_GENERAL_REGS_SSE_SPILL for m_COREI7 and
m_CORE2I7.
(ix86_lra_p, ix86_register_bank): New functions.
(ix86_secondary_reload): Add NON_Q_REGS, SIREG, DIREG.
(inline_secondary_memory_needed): Change assert.
(ix86_spill_class, ix86_spill_class_mode): New function.
(TARGET_LRA_P, TARGET_REGISTER_BANK, TARGET_SPILL_CLASS): New macros.
(TARGET_SPILL_CLASS_MODE): New macro.

Index: config/i386/i386.c
===
--- config/i386/i386.c	(revision 191771)
+++ config/i386/i386.c	(working copy)
@@ -2267,7 +2267,11 @@ static unsigned int initial_ix86_tune_fe
 
   /* X86_TUNE_REASSOC_FP_TO_PARALLEL: Try to produce parallel computations
  during reassociation of fp computation.  */
-  m_ATOM
+  m_ATOM,
+
+  /* X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
+ regs instead of memory.  */
+  m_COREI7 | m_CORE2I7
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -31694,6 +31698,38 @@ ix86_free_from_memory (enum machine_mode
 }
 }
 
+/* Return true if we use LRA instead of reload pass.  */
+static bool
+ix86_lra_p (void)
+{
+  return true;
+}
+
+/* Return a register bank number for hard reg REGNO.  */
+static int
+ix86_register_bank (int hard_regno)
+{
+  /* ebp and r13 as the base always wants a displacement, r12 as the
+ base always wants an index.  So discourage their usage in an
+ address.  */
+  if (hard_regno == R12_REG || hard_regno == R13_REG)
+return 4;
+  if (hard_regno == BP_REG)
+return 2;
+  /* New x86-64 int registers result in bigger code size.  Discourage
+ them.  */
+  if (FIRST_REX_INT_REG <= hard_regno && hard_regno <= LAST_REX_INT_REG)
+return 3;
+  /* New x86-64 SSE registers result in bigger code size.  Discourage
+ them.  */
+  if (FIRST_REX_SSE_REG <= hard_regno && hard_regno <= LAST_REX_SSE_REG)
+return 3;
+  /* Usage of AX register results in smaller code.  Prefer it.  */
+  if (hard_regno == 0)
+return 0;
+  return 1;
+}
+
 /* Implement TARGET_PREFERRED_RELOAD_CLASS.
 
Put float CONST_DOUBLE in the constant pool instead of fp regs.
@@ -31827,6 +31863,9 @@ ix86_secondary_reload (bool in_p, rtx x,
       && !in_p && mode == QImode
       && (rclass == GENERAL_REGS
 	  || rclass == LEGACY_REGS
+	  || rclass == NON_Q_REGS
+	  || rclass == SIREG
+	  || rclass == DIREG
 	  || rclass == INDEX_REGS))
 {
   int regno;
@@ -31936,7 +31975,7 @@ inline_secondary_memory_needed (enum reg
   || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1)
   || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2))
 {
-  gcc_assert (!strict);
+  gcc_assert (!strict || lra_in_progress);
   return true;
 }
 
@@ -40483,6 +40522,39 @@ ix86_autovectorize_vector_sizes (void)
   return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }
 
+
+
+/* Return class of registers which could be used for pseudo of class
+   RCLASS for spilling instead of memory.  Return NO_REGS if it is not
+   possible or non-profitable.  */
+static enum reg_class
+ix86_spill_class (enum reg_class rclass)
+{
+  if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL
+      && hard_reg_set_subset_p (reg_class_contents[rclass],
+				reg_class_contents[GENERAL_REGS]))
+return SSE_REGS;
+  return NO_REGS;
+}
+
+/* Return mode in which pseudo of MODE and RCLASS can be spilled into
+   a register of class SPILL_CLASS.  Return VOIDmode if it is not
+   possible.  */
+static enum machine_mode
+ix86_spill_class_mode (enum reg_class rclass, enum reg_class spill_class,
+		   enum machine_mode mode)
+{
+  if (! TARGET_SSE || ! TARGET_GENERAL_REGS_SSE_SPILL
+  || ! hard_reg_set_subset_p (reg_class_contents[rclass],
+  reg_class_contents[GENERAL_REGS])
+  || spill_class != SSE_REGS)
+return VOIDmode;
+  if (mode == SImode || (TARGET_64BIT && mode == DImode))
+return mode;
+  return VOIDmode;
+}
+
+
 /* Implement targetm.vectorize.init_cost.  */
 
 static void *
@@ -40885,6 +40957,12 @@ ix86_memmodel_check (unsigned HOST_WIDE_
 #undef TARGET_LEGITIMATE_ADDRESS_P
 #define TARGET_LEGITIMATE_ADDRESS_P ix86_legitimate_address_p
 
+#undef TARGET_LRA_P
+#define TARGET_LRA_P ix86_lra_p
+
+#undef TARGET_REGISTER_BANK
+#define TARGET_REGISTER_BANK ix86_register_bank
+
 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P ix86_legitimate_constant_p
 
@@ -40908,6 +40986,12 @@ ix86_memmodel_check (unsigned HOST_WIDE_
 #define TARGET_INIT_LIBFUNCS darwin_rename_builtins
 #endif
 
+#undef TARGET_SPILL_CLASS
+#define

Re: RFC: LRA for x86/x86-64 [4/9]

2012-09-27 Thread Joseph S. Myers

On Thu, 27 Sep 2012, Vladimir Makarov wrote:

> * target.h: Include tm.h.

That's a backward step; we'd like parts of the compiler that aren't using 
target macros directly not to end up including tm.h.  Why do you need 
this?

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Merge C++ conversion into trunk (4/6 - hash table rewrite)

2012-09-27 Thread Gabriel Dos Reis

On Thu, Sep 27, 2012 at 1:35 PM, Lawrence Crowl cr...@google.com wrote:

> If we were to follow C++ standard library conventions, we would call
> it value_type.  That would be my preference.  However, if folks
> want a shorter name, I'll live with that too.  But as it stands,
> the current name is very confusing.

Yes, and there appears to be no good reason to let it stand.

-- Gaby

Re: RFC: LRA for x86/x86-64 [4/9]

2012-09-27 Thread Vladimir Makarov


On 09/27/2012 07:05 PM, Joseph S. Myers wrote:

> On Thu, 27 Sep 2012, Vladimir Makarov wrote:
>
>> * target.h: Include tm.h.
>
> That's a backward step; we'd like parts of the compiler that aren't using
> target macros directly not to end up including tm.h.  Why do you need
> this?


Thanks, Joseph.

Hook spill_class returns a value of enum reg_class, which is defined in
a target-dependent include file.

If it is really bad, I'll try to find a solution, but it will probably
not be a pleasant one, like the hook returning an int which is converted
to and from enum reg_class.

Re: RFC: LRA for x86/x86-64 [4/9]

2012-09-27 Thread Joseph S. Myers

On Thu, 27 Sep 2012, Vladimir Makarov wrote:

> Hook spill_class returns a value of enum reg_class which is defined in
> target-depend include file.

That's what reg_class_t is for: avoiding enum reg_class in hook 
interfaces.
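
To make the point concrete, here is a minimal sketch of the idea, assuming reg_class_t is a plain integer typedef visible to generic code (the `toy_` names are made up for illustration):

```c
#include <assert.h>

/* Generic side: no target header is needed.  Treating reg_class_t as
   an int is an assumption of this sketch.  */
typedef int reg_class_t;

struct toy_hooks
{
  reg_class_t (*spill_class) (reg_class_t rclass);
};

/* Target side: the enum lives here, out of sight of generic code.  */
enum toy_reg_class { TOY_NO_REGS, TOY_GENERAL_REGS, TOY_SSE_REGS };

reg_class_t
toy_spill_class (reg_class_t rclass)
{
  /* Convert at the boundary and work with the enum inside.  */
  enum toy_reg_class rc = (enum toy_reg_class) rclass;

  return rc == TOY_GENERAL_REGS ? TOY_SSE_REGS : TOY_NO_REGS;
}

const struct toy_hooks toy_target = { toy_spill_class };
```

The hook table can then be declared without the target's enum definition, which is what removes the need for tm.h in target.h.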

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFC: LRA for x86/x86-64 [4/9]

2012-09-27 Thread Vladimir Makarov


On 09/27/2012 08:07 PM, Joseph S. Myers wrote:

> On Thu, 27 Sep 2012, Vladimir Makarov wrote:
>
>> Hook spill_class returns a value of enum reg_class which is defined in
>> target-depend include file.
>
> That's what reg_class_t is for: avoiding enum reg_class in hook
> interfaces.


Ok.  Thanks for pointing this out.

Here is the modified patch.

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

* targhooks.h (default_lra_p): Declare.
(default_register_bank): Ditto.
(default_different_addr_displacement_p): Ditto.
* targhooks.c (default_lra_p): New function.
(default_register_bank): Ditto.
(default_different_addr_displacement_p): Ditto.
* target.def (lra_p): New hook.
(register_bank): Ditto.
(different_addr_displacement_p): Ditto.
(spill_class, spill_class_mode): New hooks.
* doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_BANK,
TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, TARGET_SPILL_CLASS, and
TARGET_SPILL_CLASS_MODE.
* doc/tm.texi: Update.

The change also requires some modification in the 9th patch. The 
ChangeLog for the patch should be the same as before.


Index: targhooks.c
===
--- targhooks.c	(revision 191771)
+++ targhooks.c	(working copy)
@@ -840,6 +840,24 @@ default_branch_target_register_class (vo
   return NO_REGS;
 }
 
+extern bool
+default_lra_p (void)
+{
+  return false;
+}
+
+int
+default_register_bank (int hard_regno ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+extern bool
+default_different_addr_displacement_p (void)
+{
+  return false;
+}
+
 reg_class_t
 default_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x ATTRIBUTE_UNUSED,
 			  reg_class_t reload_class_i ATTRIBUTE_UNUSED,
Index: targhooks.h
===
--- targhooks.h	(revision 191771)
+++ targhooks.h	(working copy)
@@ -132,6 +132,9 @@ extern rtx default_static_chain (const_t
 extern void default_trampoline_init (rtx, tree, rtx);
 extern int default_return_pops_args (tree, tree, int);
 extern reg_class_t default_branch_target_register_class (void);
+extern bool default_lra_p (void);
+extern int default_register_bank (int);
+extern bool default_different_addr_displacement_p (void);
 extern reg_class_t default_secondary_reload (bool, rtx, reg_class_t,
 	 enum machine_mode,
 	 secondary_reload_info *);
Index: doc/tm.texi
===
--- doc/tm.texi	(revision 191771)
+++ doc/tm.texi	(working copy)
@@ -2893,6 +2893,26 @@ as below:
 @end smallexample
 @end defmac
 
+@deftypefn {Target Hook} bool TARGET_LRA_P (void)
+A target hook which returns true if we use LRA instead of reload pass.  It means that LRA was ported to the target.  The default version of this target hook always returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} int TARGET_REGISTER_BANK (int)
+A target hook which returns the register bank number to which the register @var{hard_regno} belongs.  The smaller the number, the more preferable the hard register usage (when all other conditions are the same).  This hook can be used to prefer some hard registers over others in LRA.  For example, some x86-64 registers need an additional prefix, which makes instructions longer.  The hook can return a bigger bank number for such registers to make them less favorable and, as a result, make the generated code smaller.  The default version of this target hook always returns zero.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_DIFFERENT_ADDR_DISPLACEMENT_P (void)
+A target hook which returns true if an address with the same structure can have different maximal legitimate displacement.  For example, the displacement can depend on memory mode or on operand combinations in the insn.  The default version of this target hook always returns false.
+@end deftypefn
+
+@deftypefn {Target Hook} reg_class_t TARGET_SPILL_CLASS (reg_class_t)
+This hook defines a class of registers which could be used for spilled pseudos of the given class instead of memory.
+@end deftypefn
+
+@deftypefn {Target Hook} {enum machine_mode} TARGET_SPILL_CLASS_MODE (reg_class_t, @var{reg_class_t}, enum @var{machine_mode})
+This hook defines the mode in which a pseudo of the given mode and of the first register class can be spilled into the second register class.
+@end deftypefn
+
 @node Old Constraints
 @section Obsolete Macros for Defining Constraints
 @cindex defining constraints, obsolete method
Index: doc/tm.texi.in
===
--- doc/tm.texi.in	(revision 191771)
+++ doc/tm.texi.in	(working copy)
@@ -2869,6 +2869,16 @@ as below:
 @end smallexample
 @end defmac
 
+@hook TARGET_LRA_P
+
+@hook TARGET_REGISTER_BANK
+
+@hook TARGET_DIFFERENT_ADDR_DISPLACEMENT_P
+
+@hook TARGET_SPILL_CLASS
+
+@hook TARGET_SPILL_CLASS_MODE
+
 @node Old Constraints
 @section Obsolete Macros for Defining Constraints
 @cindex defining

[PATCH 4.7] Backport Don't pull in unwinder for 64-bit division routines

2012-09-27 Thread Joey Ye

OK to backport the following patches to 4.7?

http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01193.html

2012-08-17  Julian Brown  jul...@codesourcery.com

* Makefile.in (LIB2_DIVMOD_EXCEPTION_FLAGS): Default to
-fexceptions -fnon-call-exceptions if not defined.
($(lib2-divmod-o), $(lib2-divmod-s-o)): Use above.
* config/arm/t-bpabi (LIB2_DIVMOD_EXCEPTION_FLAGS): Define.

* gcc.target/arm/div64-unwinding.c: New test.

2012-09-26  Janis Johnson  jani...@codesourcery.com

* gcc.target/arm/div64-unwinding.c: XFAIL for GNU/Linux.

* gcc.target/arm/mmx-2.c: Specify -mcpu=iwmmxt.

* gcc.target/arm/combine-movs.c: Use effective target arm_thumb2.

* gcc.target/arm/pr42879.c: Handle big-endian.
