date:20140423

On Tue, 15 Apr 2014, Jakub Jelinek wrote:

 Hi!
 
 This patch adds two new options (compatible with clang) which allow
 users to choose the behavior of undefined behavior sanitization.
 
 By default as before, all undefined behaviors (except for
 __builtin_unreachable and missing return in C++) continue after reporting
 which means that you can get lots of runtime errors from a single program
 run and the exit code will not reflect the failure in that case.
 
 With this patch, one can use -fsanitize=undefined -fno-sanitize-recover,
 which will report just the first undefined behavior and then exit with
 non-zero code.
 Or one can use -fsanitize-undefined-trap-on-error, which will just
 __builtin_trap () on undefined behavior, not report anything and not require
 linking of -lubsan (useful e.g. for the kernel or embedded apps).
 If -fsanitize-undefined-trap-on-error is given, then -f{,no-}sanitize-recover
 is ignored: since UB traps, only the first undefined behavior will
 be reported (through the SIGILL/abort).
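
As a rough illustration of the three behaviours described above, the
following C sketch models what an instrumented signed addition might do
under each mode. It is not the code GCC emits: ub_mode, checked_add and
ub_reports are hypothetical names, and __builtin_add_overflow is a newer
GCC/clang builtin used here only so that the overflow can be detected
without invoking undefined behavior in the example itself.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical policy names mirroring the three modes in the mail.  */
enum ub_mode
{
  UB_RECOVER,   /* default: report and keep going */
  UB_ABORT,     /* -fno-sanitize-recover: report the first error, exit non-zero */
  UB_TRAP       /* -fsanitize-undefined-trap-on-error: no report, just trap */
};

unsigned ub_reports;   /* number of diagnostics printed so far */

/* Add A and B, handling signed overflow according to MODE.  */
int
checked_add (int a, int b, enum ub_mode mode)
{
  int result;
  if (!__builtin_add_overflow (a, b, &result))
    return result;

  if (mode == UB_TRAP)
    __builtin_trap ();          /* needs no runtime library */

  fprintf (stderr, "runtime error: signed integer overflow: %d + %d\n", a, b);
  ub_reports++;
  if (mode == UB_ABORT)
    exit (EXIT_FAILURE);        /* stop at the first report */

  return result;                /* recover: continue with the wrapped value */
}
```

In recover mode, calling checked_add (INT_MAX, 1, UB_RECOVER) twice
produces two reports and the program keeps running, which is the
"lots of runtime errors from a single program run" case; with UB_ABORT
the first call would end the process with a non-zero exit code.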
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Works for me.

Thanks,
Richard.

 2014-04-15  Jakub Jelinek  ja...@redhat.com
 
   PR sanitizer/60275
   * common.opt (fsanitize-recover, fsanitize-undefined-trap-on-error):
   New options.
   * gcc.c (sanitize_spec_function): Don't return "" for "undefined"
   if flag_sanitize_undefined_trap_on_error.
   * sanitizer.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT,
   BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT,
   BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE_ABORT,
   BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT,
   BUILT_IN_UBSAN_HANDLE_ADD_OVERFLOW_ABORT,
   BUILT_IN_UBSAN_HANDLE_SUB_OVERFLOW_ABORT,
   BUILT_IN_UBSAN_HANDLE_MUL_OVERFLOW_ABORT,
   BUILT_IN_UBSAN_HANDLE_NEGATE_OVERFLOW_ABORT,
   BUILT_IN_UBSAN_HANDLE_LOAD_INVALID_VALUE_ABORT): New builtins.
   * ubsan.c (ubsan_instrument_unreachable): Return
   __builtin_trap () if flag_sanitize_undefined_trap_on_error.
   (ubsan_expand_null_ifn): Emit __builtin_trap ()
   if flag_sanitize_undefined_trap_on_error and
   __ubsan_handle_type_mismatch_abort if !flag_sanitize_recover.
   (ubsan_expand_null_ifn, ubsan_build_overflow_builtin,
   instrument_bool_enum_load): Emit __builtin_trap () if
   flag_sanitize_undefined_trap_on_error and
   __builtin_handle_*_abort () if !flag_sanitize_recover.
   * doc/invoke.texi (-fsanitize-recover,
   -fsanitize-undefined-trap-on-error): Document.
 c-family/
   * c-ubsan.c (ubsan_instrument_return): Return __builtin_trap ()
   if flag_sanitize_undefined_trap_on_error.
   (ubsan_instrument_division, ubsan_instrument_shift,
   ubsan_instrument_vla): Likewise.  Use __ubsan_handle_*_abort ()
   if !flag_sanitize_recover.
 testsuite/
   * g++.dg/ubsan/return-2.C: Revert 2014-03-24 changes, add
   -fno-sanitize-recover to dg-options.
   * g++.dg/ubsan/cxx11-shift-1.C: Remove c++11 target restriction,
   add -std=c++11 to dg-options.
   * g++.dg/ubsan/cxx11-shift-2.C: Likewise.
   * g++.dg/ubsan/cxx1y-vla.C: Remove c++1y target restriction,
   add -std=c++1y to dg-options.
   * c-c++-common/ubsan/undefined-1.c: Revert 2014-03-24 changes, add
   -fno-sanitize-recover to dg-options.
   * c-c++-common/ubsan/overflow-sub-1.c: Likewise.
   * c-c++-common/ubsan/vla-4.c: Likewise.
   * c-c++-common/ubsan/pr59503.c: Likewise.
   * c-c++-common/ubsan/vla-3.c: Likewise.
   * c-c++-common/ubsan/save-expr-1.c: Likewise.
   * c-c++-common/ubsan/overflow-add-1.c: Likewise.
   * c-c++-common/ubsan/shift-3.c: Likewise.
   * c-c++-common/ubsan/overflow-1.c: Likewise.
   * c-c++-common/ubsan/overflow-negate-2.c: Likewise.
   * c-c++-common/ubsan/vla-2.c: Likewise.
   * c-c++-common/ubsan/overflow-mul-1.c: Likewise.
   * c-c++-common/ubsan/pr60613-1.c: Likewise.
   * c-c++-common/ubsan/shift-6.c: Likewise.
   * c-c++-common/ubsan/overflow-mul-3.c: Likewise.
   * c-c++-common/ubsan/overflow-add-3.c: New test.
   * c-c++-common/ubsan/overflow-add-4.c: New test.
   * c-c++-common/ubsan/div-by-zero-6.c: New test.
   * c-c++-common/ubsan/div-by-zero-7.c: New test.
 
 --- gcc/common.opt.jj 2014-04-15 09:57:33.400264838 +0200
 +++ gcc/common.opt 2014-04-15 10:28:10.554519376 +0200
 @@ -862,6 +862,14 @@ fsanitize=
  Common Driver Report Joined
  Select what to sanitize
  
 +fsanitize-recover
 +Common Report Var(flag_sanitize_recover) Init(1)
 +After diagnosing undefined behavior attempt to continue execution
 +
 +fsanitize-undefined-trap-on-error
 +Common Report Var(flag_sanitize_undefined_trap_on_error) Init(0)
 +Use trap instead of a library function for undefined behavior sanitization
 +
  fasynchronous-unwind-tables
  Common Report Var(flag_asynchronous_unwind_tables) Optimization
  Generate unwind tables that are exact at each instruction

libsanitizer merge from upstream request

2014-04-23 Thread Christophe Lyon

Konstantin / Jakub,

Could you update GCC's libsanitizer version? I'd like to have the
AArch64 support, which was committed on my behalf in LLVM sources as
svn rev 201303. You may prefer to merge with a more recent revision of
course :-)

Once AArch64 support is merged, I'll post the GCC part.

Thanks,

Christophe.

Re: [PATCH][RFC] Remove RTL loop unswitching

On Sun, 20 Apr 2014, Jan Hubicka wrote:

  
  This removes RTL loop unswitching (see last year's discussion about
  compile-time issues of that pass).  RTL loop unswitching is
  enabled together with GIMPLE loop unswitching at -O3 and by
  -funswitch-loops.  It's clearly the wrong place to do high-level
  loop transforms these days, and the cost of maintenance doesn't
  outweigh the questionable benefit.
  
  Thus the following patch removes it.
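
For readers unfamiliar with the transform being discussed, here is a
small hand-written C sketch of what loop unswitching does in general.
It is only an illustration of the idea, not output from either the
GIMPLE or the RTL pass; the function names are made up for this example.

```c
#include <stddef.h>

/* Original form: the test on SCALE is loop-invariant but is evaluated
   on every iteration.  */
void
scale_plain (int *dst, const int *src, size_t n, int scale)
{
  for (size_t i = 0; i < n; i++)
    {
      if (scale)
        dst[i] = src[i] * 2;
      else
        dst[i] = src[i];
    }
}

/* Unswitched form: the invariant test is hoisted out and the loop is
   duplicated, one copy per branch.  */
void
scale_unswitched (int *dst, const int *src, size_t n, int scale)
{
  if (scale)
    for (size_t i = 0; i < n; i++)
      dst[i] = src[i] * 2;
  else
    for (size_t i = 0; i < n; i++)
      dst[i] = src[i];
}
```

Both functions compute the same result; the second trades code size for
a branch-free loop body, which is why the transform is gated on an
instruction-count limit.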
  
  Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope
  for testsuite fallout).
  
  Any objections?
 
 Not really, I am all for moving more of loop stuff to trees.
 Did you perform any benchmarks? (I remember I did in 2012
 but completely forgot the outcome).

I did that last year and it showed no difference in SPEC 2k6.

When bootstrapping with -O3 and a gcc_unreachable () in the
RTL unswitching path you get some ICEs there, but they are
due to the effective --param max-unswitch-insns limit differing:
on GIMPLE it is applied to tree_num_loop_insns () and on RTL
to num_loop_insns ().

I'll go forward with the patch today.

 On related note, shall I try to update the following?
 http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02270.html

Yeah.

Thanks,
Richard.

 Honza
  
  Thanks,
  Richard.
  
  2014-04-15  Richard Biener  rguent...@suse.de
  
  * Makefile.in (OBJS): Remove loop-unswitch.o.
  * loop-unswitch.c: Delete.
  * tree-pass.h (make_pass_rtl_unswitch): Remove.
  * passes.def (pass_rtl_unswitch): Likewise.
  * loop-init.c (gate_rtl_unswitch): Likewise.
  (rtl_unswitch): Likewise.
  (pass_data_rtl_unswitch): Likewise.
  (pass_rtl_unswitch): Likewise.
  (make_pass_rtl_unswitch): Likewise.
  * rtl.h (reversed_condition): Likewise.
  (compare_and_jump_seq): Likewise.
  * loop-iv.c (reversed_condition): Move here from loop-unswitch.c
  and make static.
  * loop-unroll.c (compare_and_jump_seq): Likewise.
  
  Index: gcc/Makefile.in
  ===
  --- gcc/Makefile.in (revision 209410)
  +++ gcc/Makefile.in (working copy)
  @@ -1294,7 +1294,6 @@ OBJS = \
  loop-invariant.o \
  loop-iv.o \
  loop-unroll.o \
  -   loop-unswitch.o \
  lower-subreg.o \
  lra.o \
  lra-assigns.o \
  Index: gcc/tree-pass.h
  ===
  --- gcc/tree-pass.h (revision 209410)
  +++ gcc/tree-pass.h (working copy)
  @@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg
   extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt);
   extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
   extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context 
  *ctxt);
  -extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
   extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context 
  *ctxt);
   extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
   extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
  Index: gcc/passes.def
  ===
  --- gcc/passes.def  (revision 209410)
  +++ gcc/passes.def  (working copy)
  @@ -341,7 +341,6 @@ along with GCC; see the file COPYING3.
 PUSH_INSERT_PASSES_WITHIN (pass_loop2)
NEXT_PASS (pass_rtl_loop_init);
NEXT_PASS (pass_rtl_move_loop_invariants);
  - NEXT_PASS (pass_rtl_unswitch);
NEXT_PASS (pass_rtl_unroll_and_peel_loops);
NEXT_PASS (pass_rtl_doloop);
NEXT_PASS (pass_rtl_loop_done);
  Index: gcc/loop-init.c
  ===
  --- gcc/loop-init.c (revision 209410)
  +++ gcc/loop-init.c (working copy)
  @@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc:
   }
   
   
  -/* Loop unswitching for RTL.  */
  -static bool
  -gate_rtl_unswitch (void)
  -{
  -  return flag_unswitch_loops;
  -}
  -
  -static unsigned int
  -rtl_unswitch (void)
  -{
  -  if (number_of_loops (cfun) > 1)
  -    unswitch_loops ();
  -  return 0;
  -}
  -
  -namespace {
  -
  -const pass_data pass_data_rtl_unswitch =
  -{
  -  RTL_PASS, /* type */
  -  "loop2_unswitch", /* name */
  -  OPTGROUP_LOOP, /* optinfo_flags */
  -  true, /* has_gate */
  -  true, /* has_execute */
  -  TV_LOOP_UNSWITCH, /* tv_id */
  -  0, /* properties_required */
  -  0, /* properties_provided */
  -  0, /* properties_destroyed */
  -  0, /* todo_flags_start */
  -  TODO_verify_rtl_sharing, /* todo_flags_finish */
  -};
  -
  -class pass_rtl_unswitch : public rtl_opt_pass
  -{
  -public:
  -  pass_rtl_unswitch (gcc::context *ctxt)
  -: rtl_opt_pass (pass_data_rtl_unswitch, ctxt)
  -  {}
  -
  -  /* opt_pass methods: */
  -  bool gate () { return gate_rtl_unswitch (); }
  -  unsigned int execute () { return rtl_unswitch (); }
  -
  -}; // class pass_rtl_unswitch
  -
  -} // anon namespace
  -
  -rtl_opt_pass *
  -make_pass_rtl_unswitch (gcc::context *ctxt)
  -{
  -

[PATCH] Fix PR60891


This fixes an oversight in loop_optimizer_init () loop-fixup code
that fails to honor AVOID_CFG_MANIPULATIONS.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to
trunk and 4.9 branch.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

PR middle-end/60891
* loop-init.c (loop_optimizer_init): Make sure to apply
LOOPS_MAY_HAVE_MULTIPLE_LATCHES before fixing up loops.

* gcc.dg/torture/pr60891.c: New testcase.

Index: gcc/loop-init.c
===
--- gcc/loop-init.c (revision 209559)
+++ gcc/loop-init.c (working copy)
@@ -94,20 +94,15 @@ loop_optimizer_init (unsigned flags)
   else
 {
   bool recorded_exits = loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS);
+  bool needs_fixup = loops_state_satisfies_p (LOOPS_NEED_FIXUP);
 
   gcc_assert (cfun->curr_properties & PROP_loops);
 
   /* Ensure that the dominators are computed, like flow_loops_find does.  */
   calculate_dominance_info (CDI_DOMINATORS);
 
-  if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
-   {
- loops_state_clear (~0U);
- fix_loop_structure (NULL);
-   }
-
 #ifdef ENABLE_CHECKING
-  else
+  if (!needs_fixup)
verify_loop_structure ();
 #endif
 
@@ -115,6 +110,14 @@ loop_optimizer_init (unsigned flags)
   if (recorded_exits)
release_recorded_exits ();
   loops_state_clear (~0U);
+
+  if (needs_fixup)
+   {
+ /* Apply LOOPS_MAY_HAVE_MULTIPLE_LATCHES early as fix_loop_structure
+re-applies flags.  */
+ loops_state_set (flags & LOOPS_MAY_HAVE_MULTIPLE_LATCHES);
+ fix_loop_structure (NULL);
+   }
 }
 
   /* Apply flags to loops.  */
Index: gcc/testsuite/gcc.dg/torture/pr60891.c
===
--- gcc/testsuite/gcc.dg/torture/pr60891.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr60891.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-ch -fno-tree-cselim -fno-tree-dominator-opts" } */
+
+int a, b, c, d, e, f;
+
+void foo (int x)
+{
+  for (;;)
+{
+  int g = c;
+  if (x)
+   {
+ if (e)
+   while (a)
+ --f;
+   }
+  for (b = 5; b; b--)
+   {
+   }
+  if (!g)
+   x = 0;
+}
+}

[PATCH] Fix PR60895


This fixes PR60895 - copying TREE_ADDRESSABLE from a decl to
a handled-component-ref doesn't work as the inliner tries to do.
Use mark_addressable instead.
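
A toy model may help show why copying the flag is not enough. The
sketch below is not GCC's tree representation: struct node and
toy_mark_addressable are hypothetical, and it assumes (as I understand
mark_addressable to behave) that the flag has to end up on the
underlying declaration rather than on the component reference that
names part of it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-in for GCC trees.  */
enum node_kind { NODE_DECL, NODE_COMPONENT_REF };

struct node
{
  enum node_kind kind;
  bool addressable;        /* only meaningful on declarations in this model */
  struct node *base;       /* object being accessed; NULL for a declaration */
};

/* Walk through component references to the underlying declaration and
   flag that declaration.  */
void
toy_mark_addressable (struct node *n)
{
  while (n->kind == NODE_COMPONENT_REF && n->base != NULL)
    n = n->base;
  n->addressable = true;
}
```

Setting ref.addressable directly on a NODE_COMPONENT_REF leaves the
declaration untouched in this model, whereas toy_mark_addressable (&ref)
reaches the declaration behind it.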

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and 4.9 branch.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

PR middle-end/60895
* tree-inline.c (declare_return_variable): Use mark_addressable.

* g++.dg/torture/pr60895.C: New testcase.

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 209559)
+++ gcc/tree-inline.c   (working copy)
@@ -3120,7 +3124,8 @@ declare_return_variable (copy_body_data
{
  var = return_slot;
  gcc_assert (TREE_CODE (var) != SSA_NAME);
- TREE_ADDRESSABLE (var) |= TREE_ADDRESSABLE (result);
+ if (TREE_ADDRESSABLE (result))
+   mark_addressable (var);
}
   if ((TREE_CODE (TREE_TYPE (result)) == COMPLEX_TYPE
|| TREE_CODE (TREE_TYPE (result)) == VECTOR_TYPE)
Index: gcc/testsuite/g++.dg/torture/pr60895.C
===
--- gcc/testsuite/g++.dg/torture/pr60895.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr60895.C  (working copy)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+
+struct C
+{
+  double elems[3];
+};
+
+C
+foo ()
+{
+  C a;
+  double *f = a.elems;
+  int b;
+  for (; b;)
+{
+  *f = 0;
+  ++f;
+}
+  return a;
+}
+
+struct J
+{
+  C c;
+  __attribute__((always_inline)) J () : c (foo ()) {}
+};
+
+void
+bar ()
+{
+  J ();
+}

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-23 Thread Bernhard Reutner-Fischer

On 17 April 2014 19:01, Konstantin Serebryany
konstantin.s.serebry...@gmail.com wrote:
 On Thu, Apr 17, 2014 at 8:45 PM, Bernhard Reutner-Fischer
 rep.dot@gmail.com wrote:
 On 17 April 2014 16:51:23 Konstantin Serebryany
 konstantin.s.serebry...@gmail.com wrote:

 On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
 rep.dot@gmail.com wrote:
  On 17 April 2014 16:07, Konstantin Serebryany
  konstantin.s.serebry...@gmail.com wrote:
  Hi,
 
  If you are trying to modify the libsanitizer files, please read here:
  https://code.google.com/p/address-sanitizer/wiki/HowToContribute
 
  I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
  please install it there, i do not have write access to the LLVM nor
  compiler-rt trees.

 I can commit your patch to llvm tree only after you follow the process
 described on that page.
 Sorry, this is a hard rule.


 What part of the process do you think I did not follow?

 I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu then
 provided the corresponding GCC parts, along with a backport of the new bits that
 I expect to be overwritten once you do a new merge, leaving just the GCC
 configury bits. This is how I read the wiki page you cite.

 Please tell me what you expect me to do differently?

 First, I did not notice that you've sent it to llvm-commits because it
 was also sent to the gcc list (unusual thing to happen)
 and got filtered into the gcc part of my mail. Sorry.
 But second, the patch is far from trivial and you should not expect us
 to commit it w/o a careful review,
 so here comes another part of the wiki: For non-trivial patches
 please use Phabricator -- this will help us reply faster.

http://reviews.llvm.org/D3464

thanks,

Re: fuse-caller-save - hook format

2014-04-23 Thread Richard Earnshaw

On 22/04/14 18:13, Tom de Vries wrote:
 On 22-04-14 18:18, Richard Sandiford wrote:
 Tom de Vries tom_devr...@mentor.com writes:

 On 22-04-14 17:27, Richard Sandiford wrote:
 Tom de Vries tom_devr...@mentor.com writes:
 2. post_expand_call_insn.
 A utility hook to facilitate adding the clobbers to 
 CALL_INSN_FUNCTION_USAGE.

 Why is this needed though?  Like I say, I think targets should update
 CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander.
 Splitting the functionality of the call expanders across the define_expand
 and a new hook just makes things unnecessarily complicated IMO.


 Richard,

 It is not needed, but it is convenient.

 There are targets where the define_expands for calls use the rtl template.
 Having to add clobbers to the CALL_INSN_FUNCTION_USAGE for such a target 
 means
 you cannot use the rtl template any more and instead need to generate
 all needed
 RTL insns in C code.

 This hook means that you can keep using the rtl template, which is less
 intrusive for those targets.

 
 [ switching order of questions ]
 Which target do you have in mind?
 
 Aarch64.
 
   But if the target is simple enough to use a single call pattern for call
   cases, wouldn't it be possible to add the clobber directly to the call
   pattern?
 
 I think that can be done, but that feels intrusive as well. I thought the 
 reason 
 that we added these clobbers to CALL_INSN_FUNCTION_USAGE was exactly because 
 we 
 did not want to add them to the rtl patterns?
 
 But, if the maintainer is fine with that, so am I.
 
 Richard Earnshaw,
 
 are you ok with adding the IP0_REGNUM/IP1_REGNUM clobbers to all the call 
 patterns in the Aarch64 target?
 
 The alternatives are:
 - rewrite the call expansions not to use the rtl templates, and add the 
 clobbers
there to CALL_INSN_FUNCTION_USAGE
 - get the post_expand_call_insn hook approved and use that to add the clobbers
to CALL_INSN_FUNCTION_USAGE.
 
 what is your preference?
 

It seems undesirable to me to be hard-coding ABI constraints directly
into the MD file.  It's not a major problem while there is one ABI
that's common to all targets; but it's quite possible this sort of
detail would change from platform to platform.  That sort of churn is
best kept out of the MD file itself, if at all possible.

R.

 Thanks,
 - Tom

RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Nick Clifton

Hi Guys,

  Please could I have permission to apply the patch below ?  Ideally for
  both mainline and the 4.9 branch.

  The patch adds a file called default-manifest.o to the end of a
  final link command line for the Cygwin and MinGW targets.  The file is
  only added if it can be found in the library search path(s), so the
  patch will have no effect if the file does not exist.
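
The "only added if it can be found" behaviour can be pictured with the
following C sketch. It is a simplified model, not the driver's spec
machinery: find_in_search_path is a hypothetical helper, the directory
list is supplied by the caller, and POSIX access() stands in for
whatever lookup the driver really performs.

```c
#include <stdio.h>
#include <unistd.h>

/* Search DIRS (NDIRS entries) for NAME.  On success write the full path
   to OUT (size OUTSZ) and return 1; return 0 if it is not found, in
   which case the caller adds nothing to the link line.  */
int
find_in_search_path (const char *const *dirs, size_t ndirs,
                     const char *name, char *out, size_t outsz)
{
  for (size_t i = 0; i < ndirs; i++)
    {
      int len = snprintf (out, outsz, "%s/%s", dirs[i], name);
      if (len < 0 || (size_t) len >= outsz)
        continue;               /* path too long for the buffer; skip */
      if (access (out, R_OK) == 0)
        return 1;
    }
  if (outsz > 0)
    out[0] = '\0';
  return 0;
}
```

When the function returns 0 the link command is left exactly as it was,
which corresponds to the "no effect if the file does not exist" case.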

  The default manifest file contains a resource section (.rsrc) holding
  information necessary for the binary to be run under Windows 8.  It is
  placed last on the linker command line so that a user provided
  manifest, if there is one, will take precedence over the default
  manifest.

  The manifest used to be automatically added by the linker, but this
  proved to be problematic as the linker is not good at selectively
  inserting binaries.  The manifest itself is provided by a separate
  project which will have to become a new dependency for the Cygwin and
  MinGW projects.

Cheers
  Nick

gcc/ChangeLog
2014-04-23  Nick Clifton  ni...@redhat.com

* config/i386/cygwin.h (ENDFILE_SPEC): Include
default-manifest.o if it can be found in the search path.
* config/i386/mingw32.h (ENDFILE_SPEC): Likewise.

Index: gcc/config/i386/cygwin.h
===
--- gcc/config/i386/cygwin.h(revision 209670)
+++ gcc/config/i386/cygwin.h(working copy)
@@ -45,6 +45,7 @@
 #undef ENDFILE_SPEC
 #define ENDFILE_SPEC \
   %{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}\
+   %{!shared:%:if-exists(default-manifest.o%s)}\
crtend.o%s
 
 /* Normally, -lgcc is not needed since everything in it is in the DLL, but we
Index: gcc/config/i386/mingw32.h
===
--- gcc/config/i386/mingw32.h   (revision 209670)
+++ gcc/config/i386/mingw32.h   (working copy)
@@ -148,6 +148,7 @@
 #undef ENDFILE_SPEC
 #define ENDFILE_SPEC \
   %{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+   %{!shared:%:if-exists(default-manifest.o%s)}\
   crtend.o%s
 
 /* Override startfile prefix defaults.  */

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-23 Thread Konstantin Serebryany

Thanks. Let's move the discussion there.

On Wed, Apr 23, 2014 at 12:46 PM, Bernhard Reutner-Fischer
rep.dot@gmail.com wrote:
 http://reviews.llvm.org/D3464

Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Kai Tietz

Hello Nick,

2014-04-23 10:53 GMT+02:00 Nick Clifton ni...@redhat.com:
 Hi Guys,

   Please could I have permission to apply the patch below ?  Ideally for
   both mainline and the 4.9 branch.

   The patch adds a file called default-manifest.o to the end of a
   final link command line for the Cygwin and MinGW targets.  The file is
   only added if it can be found in the library search path(s), so the
   patch will have no effect if the file does not exist.

   The default manifest file contains a resource section (.rsrc) holding
   information necessary for the binary to be run under Windows 8.  It is
   placed last on the linker command line so that a user provided
   manifest, if there is one, will take precedence over the default
   manifest.

   The manifest used to be automatically added by the linker, but this
   proved to be problematic as the linker is not good at selectively
   inserting binaries.  The manifest itself is provided by a separate
   project which will have to become a new dependency for the Cygwin and
   MinGW projects.

 Cheers
   Nick

Well, I am a bit concerned about the position of the manifest-object.
What will actually happen, if user specifies an user-specific
manifest-object.  Will the default one, if present, be ignored, or
will it be still linked?

Cheers,
Kai

Re: Remove obsolete Solaris 9 support

2014-04-23 Thread Rainer Orth

Andrew Hughes gnu.and...@redhat.com writes:

 - Original Message -
 On Sat, 2014-04-19 at 09:03 +0100, Andrew Haley wrote:
  On 04/16/2014 12:16 PM, Rainer Orth wrote:
   * I'm removing the sys/loadavg.h check from classpath.  Again, I'm
 uncertain if this is desirable.  In the past, classpath changes were
 merged upstream by one of the libjava maintainers.
  
  We should not diverge from GNU Classpath unless there is a strong reason
  to do so.
 
 I think the configure check is mostly harmless, but wouldn't be opposed
 removing it. It really seems to have been added explicitly for Solaris
 9, which is probably really dead by now. Andrew Hughes, you added it
 back in 2008. Are you still using/building on any Solaris 9 setups?
 

 I vaguely remember adding it. I was building on the university's Solaris 9
 machines at the time. They've long since replaced them with GNU/Linux machines
 and I've been at Red Hat for over five years, so those days are long gone :)

 I have some Freetype fixes to push to Classpath as well, so I'll fix this too
 and look at merging to gcj in the not-too-distant future. I think it's long
 overdue. Ideally, the change should be left out of this patch, so as to avoid
 conflicts.

Based on the other Andrew's comment and the knowledge that classpath
(like libgo) lives upstream, I didn't commit that part with the rest of
the patch.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Nicholas Clifton


Hi Kai,


   The default manifest file contains a resource section (.rsrc) holding
   information necessary for the binary to be run under Windows 8.  It is
   placed last on the linker command line so that a user provided
   manifest, if there is one, will take precedence over the default
   manifest.



Well, I am a bit concerned about the position of the manifest-object.
What will actually happen, if user specifies an user-specific
manifest-object.  Will the default one, if present, be ignored, or
will it be still linked?


The default one, if present, will be ignored[1].

This is why I am using ENDFILE_SPEC to add the default manifest to the 
linker command line.  This ensures that the default manifest is placed 
after any user specified object files on the linker command line.  The 
resource merging code in the linker is specifically designed to drop any 
duplicate resources, only keeping the resource that appeared first on 
the command line.


Cheers
  Nick

[1] Strictly speaking the default manifest will not be ignored.  It will 
be included in the link, and merged into the output .rsrc section.  But 
the resource merging code in the linker will drop everything in the 
default manifest giving preference to the user supplied manifest instead.
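
The "first one on the command line wins" rule described above can be
sketched in C as follows. This is a model of the described behaviour
only, not the linker's resource merging code: struct resource and
merge_first_wins are hypothetical, and a single integer id stands in
for the real resource identity.

```c
#include <stddef.h>

/* Hypothetical resource record; a single integer id stands in for the
   real identity of a .rsrc entry.  */
struct resource
{
  int id;
  const char *origin;   /* which input file supplied it */
};

/* Copy IN (in command-line order) to OUT, keeping only the first
   resource seen for each id.  OUT must have room for N entries.
   Returns the number of entries kept.  */
size_t
merge_first_wins (const struct resource *in, size_t n, struct resource *out)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      size_t j;
      for (j = 0; j < kept; j++)
        if (out[j].id == in[i].id)
          break;                /* duplicate: an earlier input wins */
      if (j == kept)
        out[kept++] = in[i];
    }
  return kept;
}
```

With the default manifest placed last via ENDFILE_SPEC, any user
object carrying a resource with the same id appears earlier in IN and
is therefore the one that survives.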

Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Kai Tietz

2014-04-23 11:06 GMT+02:00 Nicholas Clifton ni...@redhat.com:
 Hi Kai,


The default manifest file contains a resource section (.rsrc) holding
information necessary for the binary to be run under Windows 8.  It is
placed last on the linker command line so that a user provided
manifest, if there is one, will take precedence over the default
manifest.


 Well, I am a bit concerned about the position of the manifest-object.
 What will actually happen, if user specifies an user-specific
 manifest-object.  Will the default one, if present, be ignored, or
 will it be still linked?


 The default one, if present, will be ignored[1].

 This is why I am using ENDFILE_SPEC to add the default manifest to the
 linker command line.  This ensures that the default manifest is placed after
 any user specified object files on the linker command line.  The resource
 merging code in the linker is specifically designed to drop any duplicate
 resources, only keeping the resource that appeared first on the command
 line.

 Cheers
   Nick

 [1] Strictly speaking the default manifest will not be ignored.  It will be
 included in the link, and merged into the output .rsrc section.  But the
 resource merging code in the linker will drop everything in the default
 manifest giving preference to the user supplied manifest instead.


Thanks for explaining.  So patch is ok for trunk, and for 4.9 branch.

Thanks,
Kai

Re: [Patch, Fortran] PR60881 - fix ICE with allocatable scalar coarrays

2014-04-23 Thread Paul Richard Thomas

Dear Tobias,

As you say, this is of a rather obvious nature and is OK for trunk.

Cheers

Paul

On 21 April 2014 22:52, Tobias Burnus bur...@net-b.de wrote:
 Dear all,

 for a change, a patch for the trunk and not for the fortran-caf branch. The
 following is a rather obvious patch which fixes the ICE.

 Built and regtested on x86-64-gnu-linux.
 OK for the trunk? As it is of rather obvious nature, I will commit it to the
 trunk in the next days unless there are objections.

 Tobias



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy

Re: [PATCH] Fix warning in libgfortran configure script

2014-04-23 Thread Richard Earnshaw

On 17/04/14 17:49, Kyrill Tkachov wrote:
 Hi all,
 
 While configuring libgfortran I'm getting this message:
 libgfortran/configure: line 25938: test: =: unary operator expected
 The script doesn't fail and continues afterwards, but I don't think it's 
 supposed to give that warning.
 This patch makes it go away and makes it more consistent with other similar 
 uses 
 (a few lines below $ac_cv_lib_rt_clock_gettime is quoted when used in a test 
 structure). configure.ac is updated and configure is regenerated with 
 autoconf 2.64
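
The warning comes from how many operands `test` receives. The C sketch
below models that, under the simplifying assumption that an unquoted
expansion is split on whitespace only; the function names are
hypothetical and this is not how any shell is implemented.

```c
#include <ctype.h>
#include <stddef.h>

/* Number of words an unquoted expansion of VALUE contributes after
   whitespace splitting (a simplification of the shell's IFS rules).  */
size_t
unquoted_words (const char *value)
{
  size_t words = 0;
  int in_word = 0;
  for (; *value != '\0'; value++)
    {
      if (isspace ((unsigned char) *value))
        in_word = 0;
      else if (!in_word)
        {
          in_word = 1;
          words++;
        }
    }
  return words;
}

/* Operands seen by `test $var = no`: the expansion plus "=" and "no".  */
size_t
operands_unquoted (const char *value)
{
  return unquoted_words (value) + 2;
}

/* Operands seen by `test "$var" = no`: always exactly three.  */
size_t
operands_quoted (const char *value)
{
  (void) value;
  return 3;
}
```

When the variable is empty or unset, the unquoted form leaves `test`
with only `=` and `no`, so `=` is parsed as a unary operator and the
shell prints the message quoted above; the quoted form always passes
three operands.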
 
 Ok for trunk?
 
 Make sure libgfortran builds for arm-none-eabi.
 
 libgfortran/
 2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
  * configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
  * configure: Regenerate.
 

This looks fairly safe to me.  My only question might be why isn't the
variable set to one of 'yes' or 'no'?

OK unless the fortran maintainers chime in within 24 hours.

R.

 
 libgfortran-configure.patch
 
 
 diff --git a/libgfortran/configure b/libgfortran/configure
 index 23f57c7..d3ced74 100755
 --- a/libgfortran/configure
 +++ b/libgfortran/configure
 @@ -25935,7 +25935,7 @@ fi
  # test is copied from libgomp, and modified to not link in -lrt as
  # libgfortran calls clock_gettime via a weak reference if it's found
  # in librt.
 -if test $ac_cv_func_clock_gettime = no; then
 +if test "$ac_cv_func_clock_gettime" = no; then
    { $as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
  $as_echo_n "checking for clock_gettime in -lrt... " >&6; }
  if test "${ac_cv_lib_rt_clock_gettime+set}" = set; then :
 diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
 index de2d65e..24dbf2b 100644
 --- a/libgfortran/configure.ac
 +++ b/libgfortran/configure.ac
 @@ -510,7 +510,7 @@ 
 AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes 
 AC_DEFINE([HAVE_FEENA
  # test is copied from libgomp, and modified to not link in -lrt as
  # libgfortran calls clock_gettime via a weak reference if it's found
  # in librt.
 -if test $ac_cv_func_clock_gettime = no; then
 +if test "$ac_cv_func_clock_gettime" = no; then
AC_CHECK_LIB(rt, clock_gettime,
  [AC_DEFINE(HAVE_CLOCK_GETTIME_LIBRT, 1,
 [Define to 1 if you have the `clock_gettime' function in 
 librt.])])

Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Corinna Vinschen

[This time to everyone, not only to Kai, sorry]

Hi guys,

On Apr 23 11:08, Kai Tietz wrote:
 2014-04-23 11:06 GMT+02:00 Nicholas Clifton ni...@redhat.com:
  Hi Kai,
 
 
 The default manifest file contains a resource section (.rsrc) holding
 information necessary for the binary to be run under Windows 8.  It is
 placed last on the linker command line so that a user provided
 manifest, if there is one, will take precedence over the default
 manifest.
 
 
  Well, I am a bit concerned about the position of the manifest-object.
  What will actually happen, if user specifies an user-specific
  manifest-object.  Will the default one, if present, be ignored, or
  will it be still linked?
 
 
  The default one, if present, will be ignored[1].
 
  This is why I am using ENDFILE_SPEC to add the default manifest to the
  linker command line.  This ensures that the default manifest is placed after
  any user specified object files on the linker command line.  The resource
  merging code in the linker is specifically designed to drop any duplicate
  resources, only keeping the resource that appeared first on the command
  line.
 
  Cheers
Nick
 
  [1] Strictly speaking the default manifest will not be ignored.  It will be
  included in the link, and merged into the output .rsrc section.  But the
  resource merging code in the linker will drop everything in the default
  manifest giving preference to the user supplied manifest instead.
 
 
 Thanks for explaining.  So patch is ok for trunk, and for 4.9 branch.

Couldn't have said it better.

However, we know that the act of merging will currently result in broken
resources in the executable.  Wouldn't it be better to apply the above
patch only after the resource merge fix?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat



Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Nicholas Clifton


Hi Corinna,


However, we know that the act of merging will currently result in broken
resources in the executable.  Wouldn't it be better to apply the above
patch only after the resource merge fix?


No.  Well not in my opinion. :-)  The reason is that this patch only 
makes a difference if the default manifest can be found in a library 
search path.  If there is none present then nothing happens.  So you can 
disable the (broken) merging of a default manifest file by simply not 
having it present.  Which should be the case for all current installations.


Plus - I am hoping to fix the resource merging problem soon.  (Any day 
now, honest).  So I would like to have the gcc patch in place for when 
that happens.


Cheers
  Nick

[PATCH][RFC] (Auto)-add TODO_verify_il


This goes forward with an old idea of doing IL verification after
each pass.  This is a baby-step towards it by adding TODO_verify_il,
auto-added by the pass manager at the todo-after position.  It
moves loop-closed SSA verification (which was done whenever loops
were in loop-closed SSA form - before _and_ after a pass...)
under the TODO_verify_il umbrella.

Bootstrap/regtest ongoing on x86_64-unknown-linux-gnu.

I'm proposing to remove TODO_verify_* by enabling them under
TODO_verify_il.
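
As a rough sketch of the idea (not the actual pass manager code), the
following C fragment shows a verification flag being ORed in
unconditionally after each pass and then gating the loop-closed SSA check.
The bit values are the ones from the tree-pass.h hunk below;
todo_after_pass and should_verify_loop_closed_ssa are hypothetical helper
names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values as in the tree-pass.h hunk of this patch.  */
#define TODO_verify_flow  (1 << 3)
#define TODO_verify_stmts (1 << 4)
#define TODO_verify_il    (1 << 6)

/* Hypothetical helper: the flags handed to execute_todo after a pass,
   with TODO_verify_il always added by the pass manager.  */
unsigned int
todo_after_pass (unsigned int todo_after, unsigned int todo_flags_finish)
{
  return todo_after | todo_flags_finish | TODO_verify_il;
}

/* Hypothetical predicate: loop-closed SSA verification runs only under
   TODO_verify_il, and only if the loops are in loop-closed SSA form.  */
bool
should_verify_loop_closed_ssa (unsigned int flags, bool in_loop_closed_ssa)
{
  return (flags & TODO_verify_il) != 0 && in_loop_closed_ssa;
}
```

The point of the sketch is that the check no longer fires before a pass
merely because the loops happen to be in loop-closed form; it needs the
flag, which is only added at todo-after time.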

Any comments?

Thanks,
Richard.

2014-04-23  Richard Biener  rguent...@suse.de

* tree-pass.h (TODO_verify_il): Define.
(TODO_verify_all): Complete properly.
* passes.c (execute_function_todo): Move existing loop-closed
SSA verification under TODO_verify_il.
(execute_one_pass): Trigger TODO_verify_il at todo-after time.

Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 209677)
+++ gcc/tree-pass.h (working copy)
@@ -234,6 +234,7 @@ protected:
 #define TODO_verify_flow   (1 << 3)
 #define TODO_verify_stmts  (1 << 4)
 #define TODO_cleanup_cfg   (1 << 5)
+#define TODO_verify_il (1 << 6)
 #define TODO_dump_symtab   (1 << 7)
 #define TODO_remove_functions  (1 << 8)
 #define TODO_rebuild_frequencies   (1 << 9)
@@ -309,7 +310,8 @@ protected:
  | TODO_update_ssa_only_virtuals)
 
 #define TODO_verify_all \
-  (TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts)
+  (TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts | TODO_verify_il \
+   | TODO_verify_rtl_sharing)
 
 
 /* Register pass info. */
Index: gcc/passes.c
===
--- gcc/passes.c(revision 209677)
+++ gcc/passes.c(working copy)
@@ -1777,8 +1777,7 @@ execute_function_todo (void *data)
 return;
 
 #if defined ENABLE_CHECKING
-  if (flags & TODO_verify_ssa
-  || (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA)))
+  if (flags & TODO_verify_ssa)
 {
   verify_gimple_in_cfg (cfun);
   verify_ssa (true);
@@ -1787,8 +1786,18 @@ execute_function_todo (void *data)
 verify_gimple_in_cfg (cfun);
   if (flags & TODO_verify_flow)
 verify_flow_info ();
-  if (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA))
-verify_loop_closed_ssa (false);
+  if (flags & TODO_verify_il)
+{
+  if (current_loops
+  && loops_state_satisfies_p (LOOP_CLOSED_SSA))
+   {
+ if (!(flags & (TODO_verify_stmts|TODO_verify_ssa)))
+   verify_gimple_in_cfg (cfun);
+ if (!(flags & TODO_verify_ssa))
+   verify_ssa (true);
+ verify_loop_closed_ssa (false);
+   }
+}
   if (flags & TODO_verify_rtl_sharing)
 verify_rtl_sharing ();
 #endif
@@ -2170,7 +2179,7 @@ execute_one_pass (opt_pass *pass)
 check_profile_consistency (pass->static_pass_number, 0, true);
 
   /* Run post-pass cleanup and verification.  */
-  execute_todo (todo_after | pass->todo_flags_finish);
+  execute_todo (todo_after | pass->todo_flags_finish | TODO_verify_il);
   if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
 check_profile_consistency (pass->static_pass_number, 1, true);

Re: [RFC] Add aarch64 support for ada

2014-04-23 Thread Eric Botcazou

 OK, I have installed a variant of the patch (it should not change anything).

But it breaks on IA-64 for the same reason as on Aarch64 so we'll need to find 
something else.

-- 
Eric Botcazou

Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE

On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:
 On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com 
 wrote:
 Kyrill Tkachov kyrylo.tkac...@arm.com writes:
 Ping.
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html
 Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk
 soon. Bootstrap failure on arm would prevent that...

 Sorry for the late reply.  I hadn't forgotten, but I wanted to wait
 until I had chance to look into the ICE before replying, which I haven't
 had chance to do yet.

 They are separable issues, so, I checked in the change.

 It's a shame we can't use C++ style casts,
 but I suppose that's the price to pay for being able to write
 “unsigned HOST_WIDE_INT”.

 unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were expecting a 
 typedef or better.  I slightly prefer the int (1) style, but I think we 
 should go the direction of the patch.

Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and
require a 64bit integer type on the host and force all targets to use
a 64bit 'hwi'.  Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate
related changes).

Richard.

Re: [wide-int 1/8] Fix some off-by-one errors and bounds tests

On Tue, Apr 22, 2014 at 9:45 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 This is the first of 8 patches from reading through the diff with mainline.
 Some places had an off-by-one error on an index and some used <= 0
 instead of >= 0.
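
For readers skimming the hunks below, here is a small standalone sketch of
the loop shape the patch is aiming for.  It is plain C with a hypothetical
helper name (copy_high_to_low), not code from the branch.

```c
#include <assert.h>
#include <stddef.h>

/* Write ELTS[LEN - 1] .. ELTS[0] into OUT, most significant element
   first, and return the number of elements written.  Starting at LEN
   instead of LEN - 1 would read one element past the end, and a wrong
   comparison against 0 would skip element 0 or the whole loop.  */
size_t
copy_high_to_low (const long *elts, int len, long *out)
{
  size_t n = 0;
  for (int i = len - 1; i >= 0; --i)
    out[n++] = elts[i];
  return n;
}
```

The same pattern covers the empty case: with len == 0 the loop body never
runs.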

 I think we should use MAX_BITSIZE_MODE_ANY_MODE rather than
 MAX_BITSIZE_MODE_ANY_INT when handling floating-point modes.

 Two hunks contain unrelated formatting fixes too.

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard

 Index: gcc/c-family/c-ada-spec.c
 ===
 --- gcc/c-family/c-ada-spec.c   2014-04-22 20:31:10.632895953 +0100
 +++ gcc/c-family/c-ada-spec.c   2014-04-22 20:31:24.880998602 +0100
 @@ -2205,8 +2205,9 @@ dump_generic_ada_node (pretty_printer *b
   val = -val;
 }
   sprintf (pp_buffer (buffer)->digit_buffer,
 -  "16#%" HOST_WIDE_INT_PRINT "x", val.elt (val.get_len () - 1));
 - for (i = val.get_len () - 2; i <= 0; i--)
 +  "16#%" HOST_WIDE_INT_PRINT "x",
 +  val.elt (val.get_len () - 1));
 + for (i = val.get_len () - 2; i >= 0; i--)
 sprintf (pp_buffer (buffer)->digit_buffer,
  HOST_WIDE_INT_PRINT_PADDED_HEX, val.elt (i));
   pp_string (buffer, pp_buffer (buffer)->digit_buffer);
 Index: gcc/dbxout.c
 ===
 --- gcc/dbxout.c2014-04-22 20:31:10.632895953 +0100
 +++ gcc/dbxout.c2014-04-22 20:31:24.881998608 +0100
 @@ -720,7 +720,7 @@ stabstr_O (tree cst)
  }

prec -= res_pres;
 -  for (i = prec - 3; i <= 0; i = i - 3)
 +  for (i = prec - 3; i >= 0; i = i - 3)
  {
digit = wi::extract_uhwi (cst, i, 3);
stabstr_C ('0' + digit);
 Index: gcc/dwarf2out.c
 ===
 --- gcc/dwarf2out.c 2014-04-22 20:31:10.632895953 +0100
 +++ gcc/dwarf2out.c 2014-04-22 20:31:24.884998630 +0100
 @@ -1847,7 +1847,7 @@ output_loc_operands (dw_loc_descr_ref lo
 int i;
 int len = get_full_len (*val2->v.val_wide);
 if (WORDS_BIG_ENDIAN)
 - for (i = len; i >= 0; --i)
 + for (i = len - 1; i >= 0; --i)
 dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
  val2->v.val_wide->elt (i), NULL);
 else
 @@ -2073,7 +2073,7 @@ output_loc_operands (dw_loc_descr_ref lo

   dw2_asm_output_data (1, len * l, NULL);
   if (WORDS_BIG_ENDIAN)
 -   for (i = len; i >= 0; --i)
 +   for (i = len - 1; i >= 0; --i)
   dw2_asm_output_data (l, val2->v.val_wide->elt (i), NULL);
   else
 for (i = 0; i < len; ++i)
 @@ -5398,11 +5398,11 @@ print_die (dw_die_ref die, FILE *outfile
 int i = a->dw_attr_val.v.val_wide->get_len ();
 fprintf (outfile, "constant (");
 gcc_assert (i > 0);
 -   if (a->dw_attr_val.v.val_wide->elt (i) == 0)
 +   if (a->dw_attr_val.v.val_wide->elt (i - 1) == 0)
   fprintf (outfile, "0x");
 fprintf (outfile, HOST_WIDE_INT_PRINT_HEX,
  a->dw_attr_val.v.val_wide->elt (--i));
 -   while (-- i >= 0)
 +   while (--i >= 0)
   fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX,
a->dw_attr_val.v.val_wide->elt (i));
 fprintf (outfile, ")");
 @@ -8723,7 +8723,7 @@ output_die (dw_die_ref die)
NULL);

 if (WORDS_BIG_ENDIAN)
 - for (i = len; i >= 0; --i)
 + for (i = len - 1; i >= 0; --i)
 {
   dw2_asm_output_data (l, a->dw_attr_val.v.val_wide->elt (i),
name);
 Index: gcc/simplify-rtx.c
 ===
 --- gcc/simplify-rtx.c  2014-04-22 20:31:10.632895953 +0100
 +++ gcc/simplify-rtx.c  2014-04-22 20:31:24.884998630 +0100
 @@ -5395,7 +5395,7 @@ simplify_immed_subreg (enum machine_mode
 case MODE_DECIMAL_FLOAT:
   {
 REAL_VALUE_TYPE r;
 -   long tmp[MAX_BITSIZE_MODE_ANY_INT / 32];
 +   long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];

 /* real_from_target wants its input in words affected by
FLOAT_WORDS_BIG_ENDIAN.  However, we ignore this,

Re: [wide-int 4/8] Tweak uses of new API

On Tue, Apr 22, 2014 at 9:55 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 This is an assorted bunch of API tweaks:

 - use neg_p instead of lts_p (..., 0)
 - use STATIC_ASSERT for things that are known at compile time
 - avoid unnecessary wide(st)_int temporaries and arithmetic
 - remove an unnecessary template parameter
 - use to_short_addr for an offset_int->HOST_WIDE_INT offset change

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 Index: gcc/ada/gcc-interface/cuintp.c
 ===
 --- gcc/ada/gcc-interface/cuintp.c  2014-04-22 20:31:10.680896299 +0100
 +++ gcc/ada/gcc-interface/cuintp.c  2014-04-22 20:31:24.526996049 +0100
 @@ -160,7 +160,7 @@ UI_From_gnu (tree Input)
   in a signed 64-bit integer.  */
if (tree_fits_shwi_p (Input))
  return UI_From_Int (tree_to_shwi (Input));
 -  else if (wi::lts_p (Input, 0) && TYPE_UNSIGNED (gnu_type))
 +  else if (wi::neg_p (Input) && TYPE_UNSIGNED (gnu_type))
  return No_Uint;
  #endif

 Index: gcc/expmed.c
 ===
 --- gcc/expmed.c2014-04-22 20:31:10.680896299 +0100
 +++ gcc/expmed.c2014-04-22 20:31:24.527996056 +0100
 @@ -4971,7 +4971,7 @@ make_tree (tree type, rtx x)
return t;

  case CONST_DOUBLE:
 -  gcc_assert (HOST_BITS_PER_WIDE_INT * 2 = MAX_BITSIZE_MODE_ANY_INT);
 +  STATIC_ASSERT (HOST_BITS_PER_WIDE_INT * 2 = MAX_BITSIZE_MODE_ANY_INT);
if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode)
 t = wide_int_to_tree (type,
   wide_int::from_array (CONST_DOUBLE_LOW (x), 2,
 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c2014-04-22 20:31:10.680896299 +0100
 +++ gcc/fold-const.c2014-04-22 20:31:24.530996079 +0100
 @@ -4274,9 +4274,8 @@ build_range_check (location_t loc, tree
if (integer_onep (low) && TREE_CODE (high) == INTEGER_CST)
  {
int prec = TYPE_PRECISION (etype);
 -  wide_int osb = wi::set_bit_in_zero (prec - 1, prec) - 1;

 -  if (osb == high)
 +  if (wi::mask (prec - 1, false, prec) == high)
 {
   if (TYPE_UNSIGNED (etype))
 {
 @@ -12950,7 +12949,7 @@ fold_binary_loc (location_t loc,
&& operand_equal_p (tree_strip_nop_conversions (TREE_OPERAND (arg0, 1)),
   arg1, 0)
 -  && wi::bit_and (TREE_OPERAND (arg0, 0), 1) == 1)
 +  && wi::extract_uhwi (TREE_OPERAND (arg0, 0), 0, 1) == 1)
 {
   return omit_two_operands_loc (loc, type,
 code == NE_EXPR
 Index: gcc/predict.c
 ===
 --- gcc/predict.c   2014-04-22 20:31:10.680896299 +0100
 +++ gcc/predict.c   2014-04-22 20:31:24.531996086 +0100
 @@ -1309,33 +1309,34 @@ predict_iv_comparison (struct loop *loop
bool overflow, overall_overflow = false;
widest_int compare_count, tem;

 -  widest_int loop_bound = wi::to_widest (loop_bound_var);
 -  widest_int compare_bound = wi::to_widest (compare_var);
 -  widest_int base = wi::to_widest (compare_base);
 -  widest_int compare_step = wi::to_widest (compare_step_var);
 -
/* (loop_bound - base) / compare_step */
 -  tem = wi::sub (loop_bound, base, SIGNED, &overflow);
 +  tem = wi::sub (wi::to_widest (loop_bound_var),
 +wi::to_widest (compare_base), SIGNED, &overflow);
overall_overflow |= overflow;
 -  widest_int loop_count = wi::div_trunc (tem, compare_step, SIGNED,
 -&overflow);
 +  widest_int loop_count = wi::div_trunc (tem,
 +wi::to_widest (compare_step_var),
 +SIGNED, &overflow);
overall_overflow |= overflow;

 -  if (!wi::neg_p (compare_step)
 +  if (!wi::neg_p (wi::to_widest (compare_step_var))
^ (compare_code == LT_EXPR || compare_code == LE_EXPR))
 {
   /* (loop_bound - compare_bound) / compare_step */
 - tem = wi::sub (loop_bound, compare_bound, SIGNED, &overflow);
 + tem = wi::sub (wi::to_widest (loop_bound_var),
 +wi::to_widest (compare_var), SIGNED, &overflow);
   overall_overflow |= overflow;
 - compare_count = wi::div_trunc (tem, compare_step, SIGNED, &overflow);
 + compare_count = wi::div_trunc (tem, wi::to_widest (compare_step_var),
 +SIGNED, &overflow);
   overall_overflow |= overflow;
 }
else
  {
   /* (compare_bound - base) / compare_step */
 - tem = wi::sub (compare_bound, base, SIGNED, &overflow);
 + tem =

Re: [wide-int 3/8] Add and use udiv_ceil

On Tue, Apr 22, 2014 at 9:51 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Just a minor tweak to avoid several calculations when one would do.
 Since we have a function for rounded-up division, we might as well
 use it instead of the (X + Y - 1) / Y idiom.
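
A small sketch of the two forms on plain 64-bit unsigned integers may help
here.  This is ordinary C, not the wide-int API: ceil_div_idiom and
ceil_div are hypothetical names, and round_up_to_align only mirrors the
shape of the dwarf2out.c function in the hunk below.  The remark about
wrap-around applies to fixed-width integers and is an observation of this
sketch, not a motivation stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* The (X + Y - 1) / Y idiom.  On a fixed-width type the intermediate
   sum wraps around when X + Y - 1 exceeds UINT64_MAX.  */
uint64_t
ceil_div_idiom (uint64_t x, uint64_t y)
{
  return (x + y - 1) / y;
}

/* Rounded-up division without the intermediate sum: truncating division
   plus one if there is a remainder.  Y must be nonzero.  */
uint64_t
ceil_div (uint64_t x, uint64_t y)
{
  return x / y + (x % y != 0);
}

/* Round T up to the next multiple of ALIGN (ALIGN nonzero).  */
uint64_t
round_up_to_align (uint64_t t, uint64_t align)
{
  return ceil_div (t, align) * align;
}
```

For small inputs both forms agree, for example both give 2 for 9 / 8.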

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 Index: gcc/dwarf2out.c
 ===
 --- gcc/dwarf2out.c 2014-04-22 20:31:25.187000808 +0100
 +++ gcc/dwarf2out.c 2014-04-22 20:31:26.374009366 +0100
 @@ -14824,7 +14824,7 @@ simple_decl_align_in_bits (const_tree de
  static inline offset_int
  round_up_to_align (const offset_int t, unsigned int align)
  {
 -  return wi::udiv_trunc (t + align - 1, align) * align;
 +  return wi::udiv_ceil (t, align) * align;
  }

  /* Given a pointer to a FIELD_DECL, compute and return the byte offset of the
 Index: gcc/wide-int.h
 ===
 --- gcc/wide-int.h  2014-04-22 20:31:25.842005530 +0100
 +++ gcc/wide-int.h  2014-04-22 20:31:26.375009373 +0100
 @@ -521,6 +521,7 @@ #define SHIFT_FUNCTION \
BINARY_FUNCTION udiv_floor (const T1 &, const T2 &);
BINARY_FUNCTION sdiv_floor (const T1 &, const T2 &);
BINARY_FUNCTION div_ceil (const T1 &, const T2 &, signop, bool * = 0);
 +  BINARY_FUNCTION udiv_ceil (const T1 &, const T2 &);
BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0);
BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop,
 WI_BINARY_RESULT (T1, T2) *);
 @@ -2566,6 +2567,13 @@ wi::div_ceil (const T1 &x, const T2 &y,
return quotient;
  }

 +template <typename T1, typename T2>
 +inline WI_BINARY_RESULT (T1, T2)
 +wi::udiv_ceil (const T1 &x, const T2 &y)
 +{
 +  return div_ceil (x, y, UNSIGNED);
 +}
 +
  /* Return X / Y, rouding towards nearest with ties away from zero.
 Treat X and Y as having the signedness given by SGN.  Indicate
 in *OVERFLOW if the result overflows.  */

Re: [wide-int 6/8] Avoid redundant extensions

On Tue, Apr 22, 2014 at 10:04 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 register_edge_assert_for_2 operates on wide_ints of precision nprec
 so a lot of the extensions are redundant.

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 Index: gcc/tree-vrp.c
 ===
 --- gcc/tree-vrp.c  2014-04-22 20:58:26.969683484 +0100
 +++ gcc/tree-vrp.c  2014-04-22 21:00:26.670617168 +0100
 @@ -5125,16 +5125,13 @@ register_edge_assert_for_2 (tree name, e
 {
   wide_int minv, maxv, valv, cst2v;
   wide_int tem, sgnbit;
 - bool valid_p = false, valn = false, cst2n = false;
 + bool valid_p = false, valn, cst2n;
   enum tree_code ccode = comp_code;

   valv = wide_int::from (val, nprec, UNSIGNED);
   cst2v = wide_int::from (cst2, nprec, UNSIGNED);
 - if (TYPE_SIGN (TREE_TYPE (val)) == SIGNED)
 -   {
 - valn = wi::neg_p (wi::sext (valv, nprec));
 - cst2n = wi::neg_p (wi::sext (cst2v, nprec));
 -   }
 + valn = wi::neg_p (valv, TYPE_SIGN (TREE_TYPE (val)));
 + cst2n = wi::neg_p (cst2v, TYPE_SIGN (TREE_TYPE (val)));
   /* If CST2 doesn't have most significant bit set,
  but VAL is negative, we have comparison like
     if ((x & 0x123) > -4) (always true).  Just give up.  */
 @@ -5153,13 +5150,11 @@ register_edge_assert_for_2 (tree name, e
  have folded the comparison into false) and
  maximum unsigned value is VAL | ~CST2.  */
   maxv = valv | ~cst2v;
 - maxv = wi::zext (maxv, nprec);
   valid_p = true;
   break;

 case NE_EXPR:
   tem = valv | ~cst2v;
 - tem = wi::zext (tem, nprec);
   /* If VAL is 0, handle (X & CST2) != 0 as (X & CST2) > 0U.  */
   if (valv == 0)
 {
 @@ -5176,7 +5171,7 @@ register_edge_assert_for_2 (tree name, e
   sgnbit = wi::zero (nprec);
   goto lt_expr;
 }
 - if (!cst2n && wi::neg_p (wi::sext (cst2v, nprec)))
 + if (!cst2n && wi::neg_p (cst2v))
 sgnbit = wi::set_bit_in_zero (nprec - 1, nprec);
   if (sgnbit != 0)
 {
 @@ -5245,7 +5240,6 @@ register_edge_assert_for_2 (tree name, e
   maxv -= 1;
 }
   maxv |= ~cst2v;
 - maxv = wi::zext (maxv, nprec);
   minv = sgnbit;
   valid_p = true;
   break;
 @@ -5274,7 +5268,6 @@ register_edge_assert_for_2 (tree name, e
 }
   maxv -= 1;
   maxv |= ~cst2v;
 - maxv = wi::zext (maxv, nprec);
   minv = sgnbit;
   valid_p = true;
   break;
 @@ -5283,7 +5276,7 @@ register_edge_assert_for_2 (tree name, e
   break;
 }
   if (valid_p
 -  && wi::zext (maxv - minv, nprec) != wi::minus_one (nprec))
 +  && (maxv - minv) != -1)
 {
   tree tmp, new_val, type;
   int i;

Re: [wide-int 7/8] Undo some changes from trunk

On Tue, Apr 22, 2014 at 10:12 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 This patch undoes a few assorted differences from trunk.

 For fold-const.c the old code was:

   /* If INNER is a right shift of a constant and it plus BITNUM does
  not overflow, adjust BITNUM and INNER.  */
   if (TREE_CODE (inner) == RSHIFT_EXPR
       && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
       && tree_fits_uhwi_p (TREE_OPERAND (inner, 1))
       && bitnum < TYPE_PRECISION (type)
       && (tree_to_uhwi (TREE_OPERAND (inner, 1))
           < (unsigned) (TYPE_PRECISION (type) - bitnum)))
 {
   bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
   inner = TREE_OPERAND (inner, 0);
 }

 and we lost the bitnum range test.
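
To show why the range test matters, here is a minimal sketch in plain C
rather than with trees and wide-ints.  can_fold_shift is a hypothetical
name; the assumption is that bitnum, the shift count and the precision are
all unsigned, so prec - bitnum would wrap to a huge value if bitnum were
not checked first.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if testing bit BITNUM of (inner >> SHIFT) can be rewritten
   as testing bit BITNUM + SHIFT of inner in a PREC-bit type.  The first
   test keeps PREC - BITNUM from wrapping around in unsigned arithmetic.  */
bool
can_fold_shift (unsigned int bitnum, unsigned long long shift,
                unsigned int prec)
{
  return bitnum < prec && shift < prec - bitnum;
}
```

For example, can_fold_shift (40, 1, 32) is rejected by the first test;
without it the subtraction would wrap and the second test would pass.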

 The gimple-fold.c change contained an unrelated stylistic change that
 makes the code a bit less efficient.

 For ipa-prop.c we should convert to a HOST_WIDE_INT before multiplying,
 like trunk does.  It doesn't change the result and is more efficient.

 objc-act.c contains three copies of the same code.  The check for 0 was
 kept in the third but not the first two.

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c2014-04-22 21:00:26.921619127 +0100
 +++ gcc/fold-const.c2014-04-22 21:00:27.317622218 +0100
 @@ -6581,8 +6581,9 @@ fold_single_bit_test (location_t loc, en
  not overflow, adjust BITNUM and INNER.  */
if (TREE_CODE (inner) == RSHIFT_EXPR
&& TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
 -  && wi::ltu_p (wi::to_widest (TREE_OPERAND (inner, 1)) + bitnum,
 -   TYPE_PRECISION (type)))
 +  && bitnum < TYPE_PRECISION (type)
 +  && wi::ltu_p (TREE_OPERAND (inner, 1),
 +   TYPE_PRECISION (type) - bitnum))
 {
   bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
   inner = TREE_OPERAND (inner, 0);
 Index: gcc/gimple-fold.c
 ===
 --- gcc/gimple-fold.c   2014-04-22 20:58:26.869682704 +0100
 +++ gcc/gimple-fold.c   2014-04-22 21:00:27.31866 +0100
 @@ -3163,12 +3163,13 @@ fold_const_aggregate_ref_1 (tree t, tree
&& (idx = (*valueize) (TREE_OPERAND (t, 1)))
&& TREE_CODE (idx) == INTEGER_CST)
 {
 - tree low_bound = array_ref_low_bound (t);
 - tree unit_size = array_ref_element_size (t);
 + tree low_bound, unit_size;

   /* If the resulting bit-offset is constant, track it.  */
 - if (TREE_CODE (low_bound) == INTEGER_CST
 -  && tree_fits_uhwi_p (unit_size))
 + if ((low_bound = array_ref_low_bound (t),
 +  TREE_CODE (low_bound) == INTEGER_CST)
 +  && (unit_size = array_ref_element_size (t),
 + tree_fits_uhwi_p (unit_size)))
 {
   offset_int woffset
 = wi::sext (wi::to_offset (idx) - wi::to_offset (low_bound),
 Index: gcc/ipa-prop.c
 ===
 --- gcc/ipa-prop.c  2014-04-22 20:58:26.869682704 +0100
 +++ gcc/ipa-prop.c  2014-04-22 21:00:27.319622234 +0100
 @@ -3787,8 +3787,8 @@ ipa_modify_call_arguments (struct cgraph
   if (TYPE_ALIGN (type) > align)
 align = TYPE_ALIGN (type);
 }
 - misalign += (offset_int::from (off, SIGNED)
 -  * BITS_PER_UNIT).to_short_addr ();
 + misalign += (offset_int::from (off, SIGNED).to_short_addr ()
 +  * BITS_PER_UNIT);
   misalign = misalign & (align - 1);
   if (misalign != 0)
 align = (misalign & -misalign);
 Index: gcc/objc/objc-act.c
 ===
 --- gcc/objc/objc-act.c 2014-04-22 20:58:26.869682704 +0100
 +++ gcc/objc/objc-act.c 2014-04-22 21:00:27.320622242 +0100
 @@ -4882,7 +4882,9 @@ objc_decl_method_attributes (tree *node,
  which specifies the index of the format string
  argument.  Add 2.  */
   number = TREE_VALUE (second_argument);
 - if (number && TREE_CODE (number) == INTEGER_CST)
 + if (number
 +  && TREE_CODE (number) == INTEGER_CST
 +  && !wi::eq_p (number, 0))
 TREE_VALUE (second_argument)
   = wide_int_to_tree (TREE_TYPE (number),
   wi::add (number, 2));
 @@ -4893,7 +4895,9 @@ objc_decl_method_attributes (tree *node,
  in which case we don't need to add 2.  Add 2 if not
  0.  */
   number = TREE_VALUE (third_argument);
 - if (number && TREE_CODE

Re: [C PATCH] Make attributes accept enum values (PR c/50459)

2014-04-23 Thread Marek Polacek

On Sat, Apr 19, 2014 at 09:56:02AM -0400, Jason Merrill wrote:
 On 04/17/2014 12:00 PM, Marek Polacek wrote:
== CPP_CLOSE_PAREN)))
  {
tree arg1 = c_parser_peek_token (parser)->value;
 +  if (!attr_takes_id_p)
 +{
 +  /* This is for enum values, so that they can be used as
 + an attribute parameter; lookup_name will find their
 + CONST_DECLs.  */
 +  tree ln = lookup_name (arg1);
 +  if (ln)
 +arg1 = ln;
 +}
c_parser_consume_token (parser);
 
 Instead, we should add !attr_takes_id_p to the if condition
 immediately above so that we parse the arguments as an
 expression-list.

Ah, indeed.  So like this?  I had to add some ugliness because of
Obj-C and also tweak a few tests, since we now print slightly
different error message if the identifier in attribute argument
isn't declared.
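
For context, this is the kind of user code the change is meant to accept
in C.  It is a minimal sketch written for this message, not the
pr50459.c test itself; BUF_ALIGN and buf are made-up names, and it assumes
a compiler that evaluates the attribute argument as an expression.

```c
#include <assert.h>
#include <stdint.h>

/* An enumerator used where an attribute expects an integer constant.  */
enum { BUF_ALIGN = 16 };

/* With the argument parsed as an expression, the C front end can look
   the enumerator up instead of treating it as a bare identifier.  */
char buf[64] __attribute__ ((aligned (BUF_ALIGN)));
```

The address of buf should then be a multiple of BUF_ALIGN.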

Regtested/bootstrapped on x86_64-linux.

2014-04-22  Marek Polacek  pola...@redhat.com

PR c/50459
c-family/
* c-common.c (check_user_alignment): Return -1 if alignment is error
node.
(handle_aligned_attribute): Don't call default_conversion on
FUNCTION_DECLs.
(handle_vector_size_attribute): Likewise.
(handle_tm_wrap_attribute): Handle case when wrap_decl is error node.
(handle_sentinel_attribute): Call default_conversion and allow even
integral types as an argument.
c/
* c-parser.c (c_parser_attributes): Parse the arguments as an
expression-list if the attribute takes identifier.
testsuite/
* c-c++-common/attributes-1.c: Remove dg-error line.
* c-c++-common/pr50459.c: New test.
* c-c++-common/pr59280.c: Add undeclared to dg-error.
* gcc.dg/nonnull-2.c: Likewise.
* gcc.dg/pr55570.c: Modify dg-error.
* gcc.dg/tm/wrap-2.c: Likewise.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index c0e247b..df44faa 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -7418,6 +7418,8 @@ check_user_alignment (const_tree align, bool allow_zero)
 {
   int i;
 
+  if (error_operand_p (align))
+return -1;
   if (TREE_CODE (align) != INTEGER_CST
   || !INTEGRAL_TYPE_P (TREE_TYPE (align)))
 {
@@ -7539,7 +7541,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED 
(name), tree args,
   if (args)
 {
   align_expr = TREE_VALUE (args);
-  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE)
+  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE
+  && TREE_CODE (align_expr) != FUNCTION_DECL)
align_expr = default_conversion (align_expr);
 }
   else
@@ -8404,9 +8407,11 @@ handle_tm_wrap_attribute (tree *node, tree name, tree 
args,
   else
 {
   tree wrap_decl = TREE_VALUE (args);
-  if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE
-  && TREE_CODE (wrap_decl) != VAR_DECL
-  && TREE_CODE (wrap_decl) != FUNCTION_DECL)
+  if (error_operand_p (wrap_decl))
+;
+  else if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE
+   && TREE_CODE (wrap_decl) != VAR_DECL
+   && TREE_CODE (wrap_decl) != FUNCTION_DECL)
error ("%qE argument not an identifier", name);
   else
{
@@ -8533,7 +8538,8 @@ handle_vector_size_attribute (tree *node, tree name, tree 
args,
   *no_add_attrs = true;
 
   size = TREE_VALUE (args);
-  if (size && TREE_CODE (size) != IDENTIFIER_NODE)
+  if (size && TREE_CODE (size) != IDENTIFIER_NODE
+   && TREE_CODE (size) != FUNCTION_DECL)
 size = default_conversion (size);
 
   if (!tree_fits_uhwi_p (size))
@@ -8944,8 +8950,12 @@ handle_sentinel_attribute (tree *node, tree name, tree 
args,
   if (args)
 {
   tree position = TREE_VALUE (args);
+  if (position && TREE_CODE (position) != IDENTIFIER_NODE
+  && TREE_CODE (position) != FUNCTION_DECL)
+   position = default_conversion (position);
 
-  if (TREE_CODE (position) != INTEGER_CST)
+  if (TREE_CODE (position) != INTEGER_CST
+  || !INTEGRAL_TYPE_P (TREE_TYPE (position)))
{
  warning (OPT_Wattributes,
   "requested position is not an integer constant");
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 5653e49..8d91d6b 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -3943,11 +3943,16 @@ c_parser_attributes (c_parser *parser)
 In objective-c the identifier may be a classname.  */
  if (c_parser_next_token_is (parser, CPP_NAME)
   && (c_parser_peek_token (parser)->id_kind == C_ID_ID
- || (c_dialect_objc ()
-  && c_parser_peek_token (parser)->id_kind == C_ID_CLASSNAME))
+ || (c_dialect_objc ()
+  && c_parser_peek_token (parser)->id_kind
+== C_ID_CLASSNAME))
   && ((c_parser_peek_2nd_token (parser)->type == CPP_COMMA)
  || (c_parser_peek_2nd_token

Re: [wide-int 8/8] Formatting and typo fixes

On Tue, Apr 22, 2014 at 10:14 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Almost obvious, but just in case...

 The first mem_loc_descriptor hunk just reflows the text so that the
 line breaks are less awkward.

 Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 Index: gcc/doc/rtl.texi
 ===
 --- gcc/doc/rtl.texi2014-04-22 21:08:26.002367845 +0100
 +++ gcc/doc/rtl.texi2014-04-22 21:13:54.343668582 +0100
 @@ -1553,7 +1553,7 @@ neither inherently signed nor inherently
  signedness is determined by the rtl operation instead.

  On more modern ports, @code{CONST_DOUBLE} only represents floating
 -point values.  New ports define to @code{TARGET_SUPPORTS_WIDE_INT} to
 +point values.  New ports define @code{TARGET_SUPPORTS_WIDE_INT} to
  make this designation.

  @findex CONST_DOUBLE_LOW
 @@ -1571,7 +1571,7 @@ the precise bit pattern used by the targ

  @findex CONST_WIDE_INT
  @item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{})
 -This contains an array of @code{HOST_WIDE_INTS} that is large enough
 +This contains an array of @code{HOST_WIDE_INT}s that is large enough
  to hold any constant that can be represented on the target.  This form
  of rtl is only used on targets that define
  @code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then
 Index: gcc/dwarf2out.c
 ===
 --- gcc/dwarf2out.c 2014-04-22 21:13:54.297668148 +0100
 +++ gcc/dwarf2out.c 2014-04-22 21:13:54.337668526 +0100
 @@ -12911,14 +12911,13 @@ mem_loc_descriptor (rtx rtl, enum machin
   dw_die_ref type_die;

   /* Note that if TARGET_SUPPORTS_WIDE_INT == 0, a
 -CONST_DOUBLE rtx could represent either an large integer
 -or a floating-point constant.  If
 -TARGET_SUPPORTS_WIDE_INT != 0, the value is always a
 -floating point constant.
 +CONST_DOUBLE rtx could represent either a large integer
 +or a floating-point constant.  If TARGET_SUPPORTS_WIDE_INT != 0,
 +the value is always a floating point constant.

  When it is an integer, a CONST_DOUBLE is used whenever
 -the constant requires 2 HWIs to be adequately
 -represented.  We output CONST_DOUBLEs as blocks.  */
 +the constant requires 2 HWIs to be adequately represented.
 +We output CONST_DOUBLEs as blocks.  */
   if (mode == VOIDmode
   || (GET_MODE (rtl) == VOIDmode
&& GET_MODE_BITSIZE (mode) != HOST_BITS_PER_DOUBLE_INT))
 @@ -15147,9 +15146,9 @@ insert_wide_int (const wide_int &val, un
  }

/* We'd have to extend this code to support odd sizes.  */
 -  gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT) == 0);
 +  gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT) == 0);

 -  int n = elt_size / (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT);
 +  int n = elt_size / (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT);

if (WORDS_BIG_ENDIAN)
   for (i = n - 1; i >= 0; i--)
 Index: gcc/emit-rtl.c
 ===
 --- gcc/emit-rtl.c  2014-04-22 21:08:26.002367845 +0100
 +++ gcc/emit-rtl.c  2014-04-22 21:13:54.338668535 +0100
 @@ -213,8 +213,8 @@ const_wide_int_htab_hash (const void *x)
  const_wide_int_htab_eq (const void *x, const void *y)
  {
int i;
 -  const_rtx xr = (const_rtx)x;
 -  const_rtx yr = (const_rtx)y;
 +  const_rtx xr = (const_rtx) x;
 +  const_rtx yr = (const_rtx) y;
if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr))
  return 0;

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c2014-04-22 21:13:54.308668252 +0100
 +++ gcc/fold-const.c2014-04-22 21:13:54.340668554 +0100
 @@ -1775,7 +1775,7 @@ fold_convert_const_fixed_from_int (tree

di.low = TREE_INT_CST_ELT (arg1, 0);
if (TREE_INT_CST_NUNITS (arg1) == 1)
 -di.high = (HOST_WIDE_INT)di.low < 0 ? (HOST_WIDE_INT)-1 : 0;
 +di.high = (HOST_WIDE_INT) di.low < 0 ? (HOST_WIDE_INT) -1 : 0;
else
  di.high = TREE_INT_CST_ELT (arg1, 1);

 Index: gcc/rtl.c
 ===
 --- gcc/rtl.c   2014-04-22 21:08:26.002367845 +0100
 +++ gcc/rtl.c   2014-04-22 21:13:54.341668564 +0100
 @@ -232,7 +232,7 @@ cwi_output_hex (FILE *outfile, const_rtx
  {
int i = CWI_GET_NUM_ELEM (x);
gcc_assert (i > 0);
 -  if (CWI_ELT (x, i-1) == 0)
 +  if (CWI_ELT (x, i - 1) == 0)
  /* The HOST_WIDE_INT_PRINT_HEX prepends a 0x only if the val is
     non zero.  We want all numbers to have a 0x prefix.  */
  fprintf (outfile, "0x");
 Index: gcc/rtl.h
 ===
 --- gcc/rtl.h   2014-04-22 21:08:26.002367845 +0100
 +++ gcc/rtl.h   2014-04-22 21:13:54.341668564 +0100

[PATCH] Update libstdc++ baseline symbols for m68k

2014-04-23 Thread Andreas Schwab

Committed.

Andreas.

* config/abi/post/m68k-linux-gnu/baseline_symbols.txt
(CXXABI_1.3.9): New version.

diff --git a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt 
b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
index ce247a9..bd2e67f 100644
--- a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
@@ -2520,6 +2520,7 @@ OBJECT:0:CXXABI_1.3.5
 OBJECT:0:CXXABI_1.3.6
 OBJECT:0:CXXABI_1.3.7
 OBJECT:0:CXXABI_1.3.8
+OBJECT:0:CXXABI_1.3.9
 OBJECT:0:CXXABI_TM_1
 OBJECT:0:GLIBCXX_3.4
 OBJECT:0:GLIBCXX_3.4.1
-- 
1.9.2

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

[PATCH] Tweak an error msg a little

2014-04-23 Thread Marek Polacek

I think it's better to be consistent and always quote the
transaction_wrap name, it even looks nicer.

I ran tm.exp tests, ok for trunk?

2014-04-23  Marek Polacek  pola...@redhat.com

* c-common.c (handle_tm_wrap_attribute): Tweak error message.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 0b5ded8..a08c873 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -8421,7 +8421,7 @@ handle_tm_wrap_attribute (tree *node, tree name, tree 
args,
error ("%qD is not compatible with %qD", wrap_decl, decl);
}
  else
-   error ("transaction_wrap argument is not a function");
+   error ("%qE argument is not a function", name);
}
 }
 
Marek

Commit: MSP430: Enhance -mhwmult option

2014-04-23 Thread Nick Clifton

Hi Guys,

  I am applying the attached patch to enhance the -mhwmult command line
  option of the MSP430 backend.  The option can now be used to select
  the type of hardware multiplier to support, as well as simply
  enabling or disabling that support.  The default behaviour is now
  to enable hardware multiply support based upon the -mmcu command line
  option used.  If no -mmcu option has been specified, or the mcu name
  is unrecognised, then the normal 32-bit hardware support will be
  enabled.
  
  The patch also fixes the parsing of the -mmcu= and -mcpu= command line
  options so that the last one specified takes precedence.
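
  As a rough illustration of "last one specified takes precedence", here
  is a minimal sketch.  It is not the code from the attached patch: the
  helper name last_option_value and the MCU names in the usage note are
  hypothetical, and it assumes the simplest reading, namely that a later
  occurrence of an option overrides an earlier one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): return the value of the last
   argument that starts with PREFIX, or NULL if there is none.  */
static const char *
last_option_value (int argc, const char *const argv[], const char *prefix)
{
  assert (prefix != NULL);
  size_t len = strlen (prefix);
  const char *value = NULL;

  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], prefix, len) == 0)
      value = argv[i] + len;	/* A later occurrence overrides an earlier one.  */

  return value;
}
```

  With the arguments "-mmcu=msp430f5529 -O2 -mmcu=msp430g2553" the helper
  would return "msp430g2553" for the "-mmcu=" prefix and NULL for "-mcpu=".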

Cheers
  Nick

gcc/ChangeLog
2014-04-23  Nick Clifton  ni...@redhat.com

* config/msp430/msp430.c (msp430_handle_option): Move function
to msp430-common.c
(msp430_option_override): Simplify mcu and mcpu option handling.
(msp430_is_f5_mcu): Rename to msp430_use_f5_series_hwmult.  Add
support for -mhwmult command line option.
(has_32bit_hwmult): Rename to use_32bit_hwmult.  Add support for
-mhwmult command line option.
(msp430_hwmult_enabled): Delete.
(msp430_output_labelref): Add support for -mhwmult command line
option.
* config/msp430/msp430.md (mulhisi3, umulhisi3, mulsidi3)
(umulsidi3): Likewise.
* config/msp430/msp430.opt (mmcu): Add Report attribute.
(mcpu, mlarge, msmall): Likewise.
(mhwmult): New option.
* config/msp430/msp430-protos.h (msp430_hwmult_enabled): Remove
prototype.
(msp430_is_f5_mcu): Remove prototype.
(msp430_use_f5_series_hwmult): Add prototype.
* config/msp430/msp430-opts.h: New file.
* common/config/msp430: New directory.
* common/config/msp430/msp430-common.c: New file.
* config.gcc (msp430): Remove target_has_targetm_common.
* doc/invoke.texi: Document -mhwmult command line option.



msp430.opts.patch.xz
Description: application/xz

Re: [PATCH] Tweak an error msg a little

On Wed, Apr 23, 2014 at 12:22 PM, Marek Polacek pola...@redhat.com wrote:
 I think it's better to be consistent and always quote the
 transaction_wrap name, it even looks nicer.

 I ran tm.exp tests, ok for trunk?

Ok.

Thanks,
Richard.

 2014-04-23  Marek Polacek  pola...@redhat.com

 * c-common.c (handle_tm_wrap_attribute): Tweak error message.

 diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
 index 0b5ded8..a08c873 100644
 --- gcc/c-family/c-common.c
 +++ gcc/c-family/c-common.c
 @@ -8421,7 +8421,7 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args,
 error ("%qD is not compatible with %qD", wrap_decl, decl);
 }
   else
 -   error ("transaction_wrap argument is not a function");
 +   error ("%qE argument is not a function", name);
 }
  }

 Marek

-fuse-caller-save - Collect register usage information

2014-04-23 Thread Tom de Vries


On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee call clobbers
in CALL_INSN_FUNCTION_USAGE.


Vladimir,

This is the updated version of the previously approved patch 
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html , updated for the new 
hook call_fusage_contains_non_callee_clobbers.


The only difference is in the functions get_call_reg_set_usage and 
collect_fn_hard_reg_usage which use the hook.
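
The collection step can be modelled outside of GCC roughly as follows.
This is only a sketch under stated assumptions: reg_set is a plain
bitmask standing in for HARD_REG_SET, struct insn_info and the
callee_usage_known flag are hypothetical stand-ins for an insn and for
get_call_reg_set_usage succeeding, and the real patch records its result
in the cgraph node rather than returning it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t reg_set;	/* Stand-in for HARD_REG_SET.  */

struct insn_info		/* Hypothetical model of one insn.  */
{
  bool is_call;
  bool callee_usage_known;	/* Models the call-usage query succeeding.  */
  reg_set regs_set;		/* Registers set by this insn (and its callee).  */
};

struct fn_usage
{
  reg_set used;
  bool valid;
};

/* Union the registers set by every insn.  Give up (invalid, empty set)
   as soon as a call is seen whose register usage is unknown.  */
static struct fn_usage
collect_usage (const struct insn_info *insns, size_t n, reg_set fixed_regs)
{
  struct fn_usage u = { 0, false };

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (insns[i].is_call && !insns[i].callee_usage_known)
	{
	  u.used = 0;
	  return u;
	}
      u.used |= insns[i].regs_set;
    }

  u.used |= fixed_regs;		/* Be conservative about fixed registers.  */
  u.valid = true;
  return u;
}
```

A caller would only trust the collected set when valid is true, and
otherwise fall back to the full call-clobbered set.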


OK for trunk?

Thanks,
- Tom

2013-04-29  Radovan Obradovic  robrado...@mips.com
Tom de Vries  t...@codesourcery.com

* cgraph.h (struct cgraph_node): Add function_used_regs,
function_used_regs_initialized and function_used_regs_valid fields.
* final.c: Move include of hard-reg-set.h to before rtl.h to declare
find_all_hard_reg_sets.
(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
(get_call_reg_set_usage): New function.
(rest_of_handle_final): Use collect_fn_hard_reg_usage.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 15310d8..eb0fe8e 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -408,6 +408,15 @@ public:
   /* Time profiler: first run of function.  */
   int tp_first_run;
 
+  /* Call unsaved hard registers really used by the corresponding
+ function (including ones used by functions called by the
+ function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is initialized.  */
+  unsigned function_used_regs_initialized: 1;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
  ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned used_as_abstract_origin : 1;
diff --git a/gcc/final.c b/gcc/final.c
index 83abee2..0b1947d 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "tree.h"
 #include "varasm.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -57,7 +58,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -223,6 +223,7 @@ static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4425,6 +4426,7 @@ rest_of_handle_final (void)
   assemble_start_function (current_function_decl, fnname);
   final_start_function (get_insns (), asm_out_file, optimize);
   final (get_insns (), asm_out_file, optimize);
+  collect_fn_hard_reg_usage ();
   final_end_function ();
 
   /* The IA-64 .handlerdata directive must be issued before the .endp
@@ -4720,3 +4722,119 @@ make_pass_clean_state (gcc::context *ctxt)
 {
   return new pass_clean_state (ctxt);
 }
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+
+  if (!flag_use_caller_save)
+return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+{
+  HARD_REG_SET insn_used_regs;
+
+  if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+  find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+  if (CALL_P (insn)
+	  && (!targetm.call_fusage_contains_non_callee_clobbers ()
+	  || !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set)))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+  IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+}
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+if (global_regs[i])
+  SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+ provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+return NULL_TREE;
+
+  datum = XEXP (note, 0);
+  if (datum != NULL_RTX)
+return SYMBOL_REF_DECL (datum);
+
+  return NULL_TREE;
+}
+
+static struct cgraph_node *
+get_call_cgraph_node (rtx insn)
+{
+  tree

Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links

2014-04-23 Thread Corinna Vinschen

Hi Nick,

On Apr 23 10:41, Nicholas Clifton wrote:
 Hi Corinna,
 
 However, we know that the act of merging will currently result in broken
 resources in the executable.  Wouldn't it be better to apply the above
 patch only after the resource merge fix?
 
 No.  Well not in my opinion. :-)  The reason is that this patch only
 makes a difference if the default manifest can be found in a library
 search path.  If there is none present then nothing happens.  So you
 can disable the (broken) merging of a default manifest file by
 simply not having it present.  Which should be the case for all
 current installations.
 
 Plus - I am hoping to fix the resource merging problem soon.  (Any
 day now, honest).  So I would like to have the gcc patch in place
 for when that happens.

Ok, sounds fine to me.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat



Re: [PATCH, ARM] Improve 64 bit division performance

2014-04-23 Thread Charles Baylis

Ping?

Ramana mentioned at Linaro Connect that this should be tested on more platforms.

I've now checked this on qemu with no regressions on trunk for:
arm-unknown-linux-gnueabihf v7-A: ARM and Thumb-2
arm-unknown-linux-gnueabi v4t, v5t, v6: ARM

OK for trunk?

Archive link: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01611.html

On 27 February 2014 16:38, Charles Baylis charles.bay...@linaro.org wrote:
 [resending as text/plain]

 Hi

 These patches optimise 64 bit division by removing the use of the
 __gnu_[u]ldivmod_helper functions and hence avoiding the redundant
 calculation of the remainder in those functions.

 Bootstrapped, tested and checked for arm-unknown-linux-gnueabihf.

 Benchmarked on Chromebook and Raspberry Pi using attached divbench3.c.
 Loop1 varies the divisor and loop2 varies the dividend.

 Chromebook:

 before:
 loop1 unsigned: 3.474419
 loop2 unsigned: 6.564871
 loop1 signed:   4.127967
 loop2 signed:   6.071490

 after:
 loop1 unsigned: 2.781364
 loop2 unsigned: 6.166478
 loop1 signed:   2.800974
 loop2 signed:   6.129588

 Raspberry pi:
 before
 loop1 unsigned:28.881753
 loop2 unsigned:19.876385
 loop1 signed:  32.074941
 loop2 signed:  20.594860

 after:
 loop1 unsigned:24.893846
 loop2 unsigned:19.537562
 loop1 signed:  25.334509
 loop2 signed:  19.615088

 Any comments? OK for stage 1?


 Patch 1:

 2014-02-27  Charles Baylis  charles.bay...@linaro.org

 * config/arm/bpabi.S (__aeabi_uldivmod): Perform division using call
 to __udivmoddi4.


 Patch 2:

 2014-02-27  Charles Baylis  charles.bay...@linaro.org

 * config/arm/bpabi.S (__aeabi_ldivmod): Perform signed division via
 call to __udivmoddi4 and fixing up for negative operands.
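
The "fixing up for negative operands" idea in Patch 2 can be sketched in
C as below.  This is not the bpabi.S assembly from the patch: udivmod64
is a hypothetical stand-in for __udivmoddi4, the struct and function
names are made up for illustration, and the INT64_MIN corner cases are
deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int64_t quot; int64_t rem; } sdivmod_t;

/* Hypothetical stand-in for __udivmoddi4: unsigned quotient and
   remainder from one call.  */
static uint64_t
udivmod64 (uint64_t n, uint64_t d, uint64_t *rem)
{
  *rem = n % d;
  return n / d;
}

/* Signed division built on the unsigned routine: divide the magnitudes,
   then fix up the signs.  The quotient is negative when the operand
   signs differ; the remainder takes the sign of the dividend.  */
static sdivmod_t
sdivmod64 (int64_t n, int64_t d)
{
  assert (d != 0);

  /* Unsigned negation yields the magnitude without signed overflow.  */
  uint64_t un = n < 0 ? 0 - (uint64_t) n : (uint64_t) n;
  uint64_t ud = d < 0 ? 0 - (uint64_t) d : (uint64_t) d;
  uint64_t ur;
  uint64_t uq = udivmod64 (un, ud, &ur);

  sdivmod_t r;
  r.quot = ((n < 0) != (d < 0)) ? -(int64_t) uq : (int64_t) uq;
  r.rem = n < 0 ? -(int64_t) ur : (int64_t) ur;
  return r;
}
```

For example, sdivmod64 (-7, 2) gives quotient -3 and remainder -1, which
matches C's truncating division.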

[PATCH] Fix PR60903


LIM fails to properly mark new blocks/edges it creates as
belonging to irreducible regions.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and 4.9 branch.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

PR tree-optimization/60903
* tree-ssa-loop-im.c (analyze_memory_references): Remove
commented code block.
(execute_sm_if_changed): Properly apply IRREDUCIBLE_LOOP
loop flags to newly created BBs and edges.

* gcc.dg/torture/pr60903.c: New testcase.

Index: gcc/tree-ssa-loop-im.c
===
*** gcc/tree-ssa-loop-im.c  (revision 209677)
--- gcc/tree-ssa-loop-im.c  (working copy)
*** analyze_memory_references (void)
*** 1544,1558 
struct loop *loop, *outer;
unsigned i, n;
  
- #if 0
-   /* Initialize bb_loop_postorder with a mapping from loop-num to
-  its postorder index.  */
-   i = 0;
-   bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun));
-   FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
- bb_loop_postorder[loop->num] = i++;
- #endif
- 
/* Collect all basic-blocks in loops and sort them after their
   loops postorder.  */
i = 0;
--- 1547,1552 
*** execute_sm_if_changed (edge ex, tree mem
*** 1807,1812 
--- 1803,1809 
gimple_stmt_iterator gsi;
gimple stmt;
struct prev_flag_edges *prev_edges = (struct prev_flag_edges *) ex->aux;
+   bool irr = ex->flags & EDGE_IRREDUCIBLE_LOOP;
  
/* ?? Insert store after previous store if applicable.  See note
   below.  */
*** execute_sm_if_changed (edge ex, tree mem
*** 1821,1828 
old_dest = ex->dest;
new_bb = split_edge (ex);
then_bb = create_empty_bb (new_bb);
!   if (current_loops && new_bb->loop_father)
! add_bb_to_loop (then_bb, new_bb->loop_father);
  
gsi = gsi_start_bb (new_bb);
stmt = gimple_build_cond (NE_EXPR, flag, boolean_false_node,
--- 1818,1826 
old_dest = ex->dest;
new_bb = split_edge (ex);
then_bb = create_empty_bb (new_bb);
!   if (irr)
! then_bb->flags = BB_IRREDUCIBLE_LOOP;
!   add_bb_to_loop (then_bb, new_bb->loop_father);
  
gsi = gsi_start_bb (new_bb);
stmt = gimple_build_cond (NE_EXPR, flag, boolean_false_node,
*** execute_sm_if_changed (edge ex, tree mem
*** 1834,1842 
stmt = gimple_build_assign (unshare_expr (mem), tmp_var);
gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
  
!   make_edge (new_bb, then_bb, EDGE_TRUE_VALUE);
!   make_edge (new_bb, old_dest, EDGE_FALSE_VALUE);
!   then_old_edge = make_edge (then_bb, old_dest, EDGE_FALLTHRU);
  
set_immediate_dominator (CDI_DOMINATORS, then_bb, new_bb);
  
--- 1832,1843 
stmt = gimple_build_assign (unshare_expr (mem), tmp_var);
gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
  
!   make_edge (new_bb, then_bb,
!EDGE_TRUE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0));
!   make_edge (new_bb, old_dest,
!EDGE_FALSE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0));
!   then_old_edge = make_edge (then_bb, old_dest,
!EDGE_FALLTHRU | (irr ? EDGE_IRREDUCIBLE_LOOP : 0));
  
set_immediate_dominator (CDI_DOMINATORS, then_bb, new_bb);
  
Index: gcc/testsuite/gcc.dg/torture/pr60903.c
===
*** gcc/testsuite/gcc.dg/torture/pr60903.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr60903.c  (working copy)
***
*** 0 
--- 1,22 
+ /* { dg-do compile } */
+ 
+ extern int a, b, k, q;
+ 
+ void
+ foo ()
+ {
+   if (a)
+ {
+   while (q)
+   {
+   lbl:
+ if (a)
+   {
+ a = 0;
+ goto lbl;
+   }
+   }
+   b = k;
+ }
+   goto lbl;
+ }

Add clobber_reg

2014-04-23 Thread Tom de Vries


On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee call clobbers
in CALL_INSN_FUNCTION_USAGE.


Eric,

Richard Sandiford mentioned here ( 
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00870.html ):

...
Although we really should have a utility function like use_reg, but for
clobbers, so that the above would become:

  clobber_reg (CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (word_mode, 18));
...


I've implemented a patch that adds clobber_reg and clobber_reg_mode, similar to 
use_reg and use_reg_mode.
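
To show the shape of the interface outside of GCC, here is a minimal
stand-alone model.  It is a sketch, not the patch: struct usage replaces
the rtx EXPR_LIST, registers are plain numbers, and the names
add_usage, use_reg_sketch, clobber_reg_sketch and is_clobbered are
hypothetical.  It mirrors the structure of the patch: one worker that
prepends an entry to the usage list, plus thin wrappers for USE and
CLOBBER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum usage_kind { USAGE_USE, USAGE_CLOBBER };

struct usage			/* Models one entry of a call's usage list.  */
{
  enum usage_kind kind;
  unsigned regno;
  struct usage *next;
};

/* Prepend a KIND entry for REGNO to the (possibly empty) list *LIST.  */
static void
add_usage (struct usage **list, enum usage_kind kind, unsigned regno)
{
  assert (list != NULL);
  struct usage *u = malloc (sizeof *u);
  if (u == NULL)
    abort ();
  u->kind = kind;
  u->regno = regno;
  u->next = *list;
  *list = u;
}

static void
use_reg_sketch (struct usage **list, unsigned regno)
{
  add_usage (list, USAGE_USE, regno);
}

static void
clobber_reg_sketch (struct usage **list, unsigned regno)
{
  add_usage (list, USAGE_CLOBBER, regno);
}

/* Return true if REGNO appears as a clobber in LIST.  */
static bool
is_clobbered (const struct usage *list, unsigned regno)
{
  for (; list != NULL; list = list->next)
    if (list->kind == USAGE_CLOBBER && list->regno == regno)
      return true;
  return false;
}
```

A backend-like caller would then write clobber_reg_sketch (&fusage, 18)
in place of building the list entry by hand.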


Bootstrapped and reg-tested on x86_64 as part of the fuse-caller-save series.

OK for trunk?

Thanks,
- Tom

2014-04-18  Tom de Vries  t...@codesourcery.com

* expr.c (clobber_reg_mode): New function.
* expr.h (clobber_reg): New function.

diff --git a/gcc/expr.c b/gcc/expr.c
index 72e4401..fc58eb7f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2396,6 +2396,18 @@ use_reg_mode (rtx *call_fusage, rtx reg, enum machine_mode mode)
 = gen_rtx_EXPR_LIST (mode, gen_rtx_USE (VOIDmode, reg), *call_fusage);
 }
 
+/* Add a CLOBBER expression for REG to the (possibly empty) list pointed
+   to by CALL_FUSAGE.  REG must denote a hard register.  */
+
+void
+clobber_reg_mode (rtx *call_fusage, rtx reg, enum machine_mode mode)
+{
+  gcc_assert (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER);
+
+  *call_fusage
+= gen_rtx_EXPR_LIST (mode, gen_rtx_CLOBBER (VOIDmode, reg), *call_fusage);
+}
+
 /* Add USE expressions to *CALL_FUSAGE for each of NREGS consecutive regs,
starting at REGNO.  All of these registers must be hard registers.  */
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 524da67..1823feb 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -346,6 +346,7 @@ extern void copy_blkmode_from_reg (rtx, rtx, tree);
 /* Mark REG as holding a parameter for the next CALL_INSN.
Mode is TYPE_MODE of the non-promoted parameter, or VOIDmode.  */
 extern void use_reg_mode (rtx *, rtx, enum machine_mode);
+extern void clobber_reg_mode (rtx *, rtx, enum machine_mode);
 
 extern rtx copy_blkmode_to_reg (enum machine_mode, tree);
 
@@ -356,6 +357,13 @@ use_reg (rtx *fusage, rtx reg)
   use_reg_mode (fusage, reg, VOIDmode);
 }
 
+/* Mark REG as clobbered by the call with FUSAGE as CALL_INSN_FUNCTION_USAGE.  */
+static inline void
+clobber_reg (rtx *fusage, rtx reg)
+{
+  clobber_reg_mode (fusage, reg, VOIDmode);
+}
+
 /* Mark NREGS consecutive regs, starting at REGNO, as holding parameters
for the next CALL_INSN.  */
 extern void use_regs (rtx *, int, int);

Re: [PATCH, ARM] Suppress Redundant Flag Setting for Cortex-A15

2014-04-23 Thread Christophe Lyon

Hi,

On 28 January 2014 13:10, Ramana Radhakrishnan
ramana@googlemail.com wrote:
On Fri, Jan 24, 2014 at 5:16 PM, Ian Bolton ian.bol...@arm.com wrote:
Hi there!

An existing optimisation for Thumb-2 converts t32 encodings to
t16 encodings to reduce codesize, at the expense of causing
redundant flag setting for ADD, AND, etc. This redundant flag
setting can have negative performance impact on cortex-a15.

This patch introduces two new tuning options so that the conversion
from t32 to t16, which takes place in thumb2_reorg, can be suppressed
for cortex-a15.

To maintain some of the original benefit (reduced codesize), the
suppression is only done where the enclosing basic block is deemed
worthy of optimising for speed.

This was tested with no regressions and performance has improved for
the workloads tested on cortex-a15. (It might be beneficial to
other processors too, but that has not been investigated yet.)

OK for stage 1?

This is OK for stage1.

Ramana

Cheers,
Ian

2014-01-24 Ian Bolton ian.bol...@arm.com

gcc/
* config/arm/arm-protos.h (tune_params): New struct members.
* config/arm/arm.c: Initialise tune_params per processor.
(thumb2_reorg): Suppress conversion from t32 to t16 when
optimizing for speed, based on new tune_params.

This causes
gcc.target/arm/negdi-1.c
gcc.target/arm/negdi-2.c
to FAIL when GCC is configured as:
--with-mode=arm
--with-cpu=cortex-a15
--with-fpu=neon-vfpv4

both tests used to PASS.
(see
http://cbuild.validation.linaro.org/build/cross-validation/gcc/209561/report-build-info.html)

Christophe.

Re: Remove obsolete Solaris 9 support

2014-04-23 Thread Uros Bizjak

On Tue, Apr 22, 2014 at 2:35 PM, Rainer Orth
r...@cebitec.uni-bielefeld.de wrote:
 Uros Bizjak ubiz...@gmail.com writes:

 On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth
 r...@cebitec.uni-bielefeld.de wrote:
 Now that 4.9 has branched, it's time to actually remove the obsolete
 Solaris 9 configuration.  Most of this is just legwork and falls under
 my Solaris maintainership.

 A couple of questions, though:

 * Uros: I'm removing all sse_os_support() checks from the testsuite.
   Solaris 9 was the only consumer, so it seems best to do away with it.

 This is OK, but please leave sse-os-check.h (and corresponding
 sse_os_support calls) in the testsuite. Just remove the Solaris 9
 specific code from sse-os-check.h and always return 1, perhaps with
 the comment that all currently supported OSes support SSE
 instructions.

 Here's the final patch I've checked in, incorporating all review
 comments.  I've left out the libgo (already checked in by Ian) and
 classpath parts.

It looks to me that one part was left in libgcc/config/i386/crtfastmath.c:

#if !defined __x86_64__ && defined __sun__ && defined __svr4__
#include <signal.h>
#include <ucontext.h>
...
#endif

Re: [Patch ARM] Allow any register for DImode values in Thumb2.

2014-04-23 Thread Christophe Lyon

On 27 February 2014 14:58, Ramana Radhakrishnan ramra...@arm.com wrote:
 Hi

 I noticed that for T32 we don't allow any old register for DImode values.
 The restriction of an even register is true only for ARM state because the
 ISA doesn't allow any old register in this place. In a few large .i files
 that I had knocking about, noticed a nice drop in stack usage and a
 generally improved register allocation strategy.

 Queued for stage1 after suitable testing including a bootstrap and
 regression test in Thumb2 found no issues.

 regards
 Ramana

 DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

 * config/arm/arm.c (arm_hard_regno_mode_ok): Loosen restrictions on
 core registers for DImode values in Thumb2.


Hi Ramana,

I've noticed some regressions after this patch has been committed (rev 209615):

  gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions
  gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer -funroll-loops
  gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2
  gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none
  gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects

Now all produce ICE in several GCC configurations (mostly when
generating thumb code) eg:
--target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb
--target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb

but it's OK for target arm-none-linux-gnueabihf.

See 
http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html

Christophe.

Re: [Patch ARM] Allow any register for DImode values in Thumb2.

2014-04-23 Thread Ramana Radhakrishnan

On Wed, Apr 23, 2014 at 1:53 PM, Christophe Lyon
christophe.l...@linaro.org wrote:
 On 27 February 2014 14:58, Ramana Radhakrishnan ramra...@arm.com wrote:
 Hi

 I noticed that for T32 we don't allow any old register for DImode values.
 The restriction of an even register is true only for ARM state because the
 ISA doesn't allow any old register in this place. In a few large .i files
 that I had knocking about, noticed a nice drop in stack usage and a
 generally improved register allocation strategy.

 Queued for stage1 after suitable testing including a bootstrap and
 regression test in Thumb2 found no issues.

 regards
 Ramana

 DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

 * config/arm/arm.c (arm_hard_regno_mode_ok): Loosen restrictions on
 core registers for DImode values in Thumb2.


 Hi Ramana,

 I've noticed some regressions after this patch has been committed (rev 
 209615):

   gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions
   gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer -funroll-loops
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
 -fno-use-linker-plugin -flto-partition=none
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
 -fuse-linker-plugin -fno-fat-lto-objects


 Now all produce ICE in several GCC configurations (mostly when
 generating thumb code) eg:
 --target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb
 --target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb


Thanks for the report - I'll have a look. I've had this in a tree for
some time for testing, and that tree runs these configurations, or at
least the bare-metal arm-none-eabi one, with multilib testing for thumb.

 but it's OK for target arm-none-linux-gnueabihf.

 See 
 http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html

 Christophe.

Add post_expand_call_insn hook

2014-04-23 Thread Tom de Vries


On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee call clobbers
in CALL_INSN_FUNCTION_USAGE.


Eric,

this patch adds a post_expand_call_insn hook.

The hook is called right after expansion of calls, and allows a target to do 
additional processing, such as adding clobbers to CALL_INSN_FUNCTION_USAGE.


Instead of using the hook, we could add code to the preparation statements 
operand of the different call expands, but that requires those expands not to 
use the rtl template and to generate all the rtl through C code, which in turn 
requires a rewrite of the call expands in the case of AArch64.
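
The hook mechanism itself can be sketched in plain C as below.  This is
a simplified model, not the target.def machinery: struct call_insn,
struct target_hooks, hook_noop, expand_call_sketch and
mark_reg18_clobbered are all hypothetical names, and the register mask
stands in for CALL_INSN_FUNCTION_USAGE.

```c
#include <assert.h>
#include <stddef.h>

struct call_insn		/* Hypothetical model of an expanded call.  */
{
  unsigned clobbered_regs;	/* Bitmask standing in for the usage list.  */
};

typedef void (*post_expand_call_hook) (struct call_insn *);

struct target_hooks
{
  post_expand_call_hook post_expand_call_insn;
};

/* Default hook: do nothing, like a generic "takes an insn, returns
   void" hook.  */
static void
hook_noop (struct call_insn *insn)
{
  (void) insn;
}

/* A target-specific hook that records an extra clobber of register 18.  */
static void
mark_reg18_clobbered (struct call_insn *insn)
{
  insn->clobbered_regs |= 1u << 18;
}

/* Generic expansion, followed by the target's post-processing hook.  */
static void
expand_call_sketch (const struct target_hooks *target, struct call_insn *insn)
{
  assert (target != NULL && target->post_expand_call_insn != NULL);
  insn->clobbered_regs = 0;	/* Result of the generic expansion.  */
  target->post_expand_call_insn (insn);
}
```

Targets that do not care keep the no-op default, so the generic
expander never needs to know which targets add clobbers.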


Bootstrapped and reg-tested on x86_64 as part of the fuse-caller-save patch 
series.

OK for trunk?

Thanks,
- Tom

2014-04-18  Tom de Vries  t...@codesourcery.com

* target.def (post_expand_call_insn): New DEFHOOK.
* calls.c (expand_call, emit_library_call_value_1): Call
post_expand_call_insn hook.
* tm.texi.in (@section Storage Layout): Add hook
TARGET_POST_EXPAND_CALL_INSN.
* hooks.c (hook_void_rtx): New function.
* hooks.h (hook_void_rtx): Declare function.
diff --git a/gcc/calls.c b/gcc/calls.c
index e798c7a..0777a02 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3507,6 +3507,8 @@ expand_call (tree exp, rtx target, int ignore)
 
   free (stack_usage_map_buf);
 
+  targetm.post_expand_call_insn (last_call_insn ());
+
   return target;
 }
 
@@ -4344,6 +4346,8 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 
   free (stack_usage_map_buf);
 
+  targetm.post_expand_call_insn (last_call_insn ());
+
   return value;
 
 }
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8af8efd..40b5bb1 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1408,6 +1408,11 @@ registers whenever the function being expanded has any SDmode
 usage.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_POST_EXPAND_CALL_INSN (rtx)
+This hook is called just after expansion of a call_expr into rtl, allowing
+the target to perform additional processing.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_INSTANTIATE_DECLS (void)
 This hook allows the backend to perform additional instantiations on rtl
 that are not actually in any insns yet, but will be later.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 8991c3c..812b0b8 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1285,6 +1285,8 @@ The default definition of this macro returns false for all sizes.
 
 @hook TARGET_EXPAND_TO_RTL_HOOK
 
+@hook TARGET_POST_EXPAND_CALL_INSN
+
 @hook TARGET_INSTANTIATE_DECLS
 
 @hook TARGET_MANGLE_TYPE
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 1c67bdf..53e8591 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -461,6 +461,13 @@ hook_void_rtx_int (rtx insn ATTRIBUTE_UNUSED, int mode ATTRIBUTE_UNUSED)
 {
 }
 
+/* Generic hook that takes a rtx and returns void.  */
+
+void
+hook_void_rtx (rtx insn ATTRIBUTE_UNUSED)
+{
+}
+
 /* Generic hook that takes a struct gcc_options * and returns void.  */
 
 void
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 896b41d..4df5ae0 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -66,6 +66,7 @@ extern bool hook_bool_dint_dint_uint_bool_true (double_int, double_int,
 
 extern void hook_void_void (void);
 extern void hook_void_constcharptr (const char *);
+extern void hook_void_rtx (rtx);
 extern void hook_void_rtx_int (rtx, int);
 extern void hook_void_FILEptr_constcharptr (FILE *, const char *);
 extern bool hook_bool_FILEptr_rtx_false (FILE *, rtx);
diff --git a/gcc/target.def b/gcc/target.def
index ae0bc9c..2f7178c 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4639,6 +4639,15 @@ usage.,
  hook_void_void)
 
 /* This target hook allows the backend to perform additional
+   processing after expansion of a call insn.  */
+DEFHOOK
+(post_expand_call_insn,
+ "This hook is called just after expansion of a call_expr into rtl, allowing\n\
+the target to perform additional processing.",
+ void, (rtx),
+ hook_void_rtx)
+
+/* This target hook allows the backend to perform additional
instantiations on rtx that are not actually in insns yet,
but will be later.  */
 DEFHOOK

[Patch, Fortran, testsuite] Increase tolerance level for precision of bessel function.

2014-04-23 Thread Tejas Belagod



Hi,

The attached patch adjusts a fortran test to decrease the precision of one of 
the points on the bessel curve.
gfortran.dg/bessel_7.f90 fails for a value 3.0 because libm does not seem to be 
accurate enough compared to what the test expects.


I did a like-for-like run on x86 vs aarch64. The issue seems to be in the level 
of precision that this test checks for. At the fail point, though the two values 
being compared are comparable, they aren't equal.


On aarch64, it looks like this:

33 -0.138861489E+30 -0.138861319E+30   -0.17E+24 10.2699956894  T  T
34 -0.304842886E+31 -0.304842493E+31   -0.39E+25 10.8117713928  T  T
35 -0.689588648E+32 -0.689587681E+32   -0.97E+26 11.7649326324  T  T
36 -0.160599184E+34 -0.160598952E+34   -0.23E+28 12.1240425110  T  F

In row #36, the 2nd and 3rd column values are comparable, but not equal. 
The delta is indicated in the 5th column, which is greater than what the test 
expects: 12 ULPs.


On x86 it looks like this:

33 -0.138861508E+30 -0.138861366E+30   -0.14E+24  8.5583286285  T  T
34 -0.304842916E+31 -0.304842614E+31   -0.30E+25  8.3167467117  T  T
35 -0.689588696E+32 -0.689587971E+32   -0.73E+26  8.8236989975  T  T
36 -0.160599184E+34 -0.160599029E+34   -0.15E+28  8.0826950073  T  T

The delta on aarch64 is more than x86. If we increase the tolerance level for 
precision as shown in the patch, the test works fine for both x86 and aarch64.
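
To make the arithmetic concrete, the check can be sketched in C.  The
exact formula used by bessel_7.f90 is not shown in this mail, so the
code below is an assumption: it measures the relative difference in
multiples of single-precision epsilon, and the names abs_d,
error_in_eps and within_tolerance are hypothetical.  Feeding it the
rounded values printed in row #36 gives roughly the figures in the 5th
column.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

static double
abs_d (double x)
{
  return x < 0 ? -x : x;
}

/* Relative difference between COMPUTED and REFERENCE, expressed in
   multiples of single-precision epsilon (assumed error measure).  */
static double
error_in_eps (double computed, double reference)
{
  assert (reference != 0.0);
  return abs_d (computed - reference)
	 / (abs_d (reference) * (double) FLT_EPSILON);
}

/* True if the result is within MAX_EPS multiples of epsilon.  */
static bool
within_tolerance (double computed, double reference, double max_eps)
{
  return error_in_eps (computed, reference) <= max_eps;
}
```

With the aarch64 values from row #36 the error is a little over 12, so a
tolerance of 12 rejects it and 13 accepts it, while the x86 values stay
close to 8 and pass with either tolerance.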


Tested on aarch64-none-linux-gnu, x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Tejas.

Changelog:

2014-04-23  Tejas Belagod  tejas.bela...@arm.com

testsuite/

* gfortran.dg/bessel_7.f90 (myeps): Increase precision tolerance level.

diff --git a/gcc/testsuite/gfortran.dg/bessel_7.f90 b/gcc/testsuite/gfortran.dg/bessel_7.f90
index 7e63ed1..c6b5f74 100644
--- a/gcc/testsuite/gfortran.dg/bessel_7.f90
+++ b/gcc/testsuite/gfortran.dg/bessel_7.f90
@@ -16,7 +16,7 @@
 implicit none
 real,parameter :: values(*) = [0.0, 0.5, 1.0, 0.9, 
1.8,2.0,3.0,4.0,4.25,8.0,34.53, 475.78] 
 real,parameter :: myeps(size(values)) = epsilon(0.0) 
-  * [2, 3, 4, 5, 8, 2, 12, 6, 7, 6, 36, 168 ]
+  * [2, 3, 4, 5, 8, 2, 13, 6, 7, 6, 36, 168 ]
 ! The following is sufficient for me - the values above are a bit
 ! more tolerant
 !  * [0, 0, 0, 3, 3, 0, 9, 0, 2, 1, 22, 130 ]

Re: [PATCH] Fix warning in libgfortran configure script

On 23 April 2014 10:22, Richard Earnshaw rearn...@arm.com wrote:

 libgfortran/
 2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
  * configure: Regenerate.


 This looks fairly safe to me.  My only question might be why isn't the
 variable set to one of 'yes' or 'no'?

This is due to the newlib library detection kludgery further up the
file.  Rather than using autoconf to probe the interface, we detect
newlib, bypass the AC_CHECK_FUNC_ONCE() macro and hardwire the
interface.  This has the effect of leaving various ac_cv_func_*
variables undefined.

Cheers
/Marcus

Re: [AArch64/ARM 3/3] Add execution tests of ARM TRN Intrinsics

2014-04-23 Thread Ramana Radhakrishnan

On Fri, Mar 28, 2014 at 3:50 PM, Alan Lawrence alan.lawre...@arm.com wrote:
 Final patch in series, adds new tests of the ARM TRN Intrinsics, that also
 check the execution results, reusing the test bodies introduced into AArch64
 in the first patch. (These tests subsume the autogenerated ones in
 testsuite/gcc.target/arm/neon/ that only check assembler output.)

 Tests use gcc.target/arm/simd/simd.exp from corresponding patch for ZIP
 Intrinsics, will commit that first.

 All tests passing on arm-none-eabi.

The ARM bits are ok.


 testsuite/ChangeLog:
 2012-03-28  Alan Lawrence  alan.lawre...@arm.com

 * gcc.target/arm/simd/vtrnqf32_1.c: New file.
 * gcc.target/arm/simd/vtrnqp16_1.c: New file.
 * gcc.target/arm/simd/vtrnqp8_1.c: New file.
 * gcc.target/arm/simd/vtrnqs16_1.c: New file.
 * gcc.target/arm/simd/vtrnqs32_1.c: New file.
 * gcc.target/arm/simd/vtrnqs8_1.c: New file.
 * gcc.target/arm/simd/vtrnqu16_1.c: New file.
 * gcc.target/arm/simd/vtrnqu32_1.c: New file.
 * gcc.target/arm/simd/vtrnqu8_1.c: New file.
 * gcc.target/arm/simd/vtrnf32_1.c: New file.
 * gcc.target/arm/simd/vtrnp16_1.c: New file.
 * gcc.target/arm/simd/vtrnp8_1.c: New file.
 * gcc.target/arm/simd/vtrns16_1.c: New file.
 * gcc.target/arm/simd/vtrns32_1.c: New file.
 * gcc.target/arm/simd/vtrns8_1.c: New file.
 * gcc.target/arm/simd/vtrnu16_1.c: New file.
 * gcc.target/arm/simd/vtrnu32_1.c: New file.
 * gcc.target/arm/simd/vtrnu8_1.c: New file.
 diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c
 b/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c
 new file mode 100644
 index 000..c9620fb
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c
 @@ -0,0 +1,12 @@
 +/* Test the `vtrnf32' ARM Neon intrinsic.  */
 +
 +/* { dg-do run } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options "-save-temps -O1 -fno-inline" } */
 +/* { dg-add-options arm_neon } */
 +
 +#include <arm_neon.h>
 +#include "../../aarch64/simd/vtrnf32.x"
 +
 +/* { dg-final { scan-assembler-times "vtrn\.32\[ \t\]+\[dD\]\[0-9\]+,
 ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
 +/* { dg-final { cleanup-saved-temps } } */
 diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c
 b/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c
 new file mode 100644
 index 000..0ff4319
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c
 @@ -0,0 +1,12 @@
 +/* Test the `vtrnp16' ARM Neon intrinsic.  */
 +
 +/* { dg-do run } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options "-save-temps -O1 -fno-inline" } */
 +/* { dg-add-options arm_neon } */
 +
 +#include <arm_neon.h>
 +#include "../../aarch64/simd/vtrnp16.x"
 +
 +/* { dg-final { scan-assembler-times "vtrn\.16\[ \t\]+\[dD\]\[0-9\]+,
 ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
 +/* { dg-final { cleanup-saved-temps } } */
 diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c
 b/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c
 new file mode 100644
 index 000..2b047e4
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c
 @@ -0,0 +1,12 @@
 +/* Test the `vtrnp8' ARM Neon intrinsic.  */
 +
 +/* { dg-do run } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options "-save-temps -O1 -fno-inline" } */
 +/* { dg-add-options arm_neon } */
 +
 +#include <arm_neon.h>
 +#include "../../aarch64/simd/vtrnp8.x"
 +
 +/* { dg-final { scan-assembler-times "vtrn\.8\[ \t\]+\[dD\]\[0-9\]+,
 ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
 +/* { dg-final { cleanup-saved-temps } } */
 diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c
 b/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c
 new file mode 100644
 index 000..dd4e883
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c
 @@ -0,0 +1,12 @@
 +/* Test the `vtrnQf32' ARM Neon intrinsic.  */
 +
 +/* { dg-do run } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options "-save-temps -O1 -fno-inline" } */
 +/* { dg-add-options arm_neon } */
 +
 +#include <arm_neon.h>
 +#include "../../aarch64/simd/vtrnqf32.x"
 +
 +/* { dg-final { scan-assembler-times "vtrn\.32\[ \t\]+\[qQ\]\[0-9\]+,
 ?\[qQ\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n" 1 } } */
 +/* { dg-final { cleanup-saved-temps } } */
 diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c
 b/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c
 new file mode 100644
 index 000..374eee3
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c
 @@ -0,0 +1,12 @@
 +/* Test the `vtrnQp16' ARM Neon intrinsic.  */
 +
 +/* { dg-do run } */
 +/* { dg-require-effective-target arm_neon_ok } */
 +/* { dg-options "-save-temps -O1 -fno-inline" } */
 +/* { dg-add-options arm_neon } */
 +
 +#include <arm_neon.h>
 +#include "../../aarch64/simd/vtrnqp16.x"
 +
 +/* { dg-final { scan-assembler-times vtrn\.16\[ \t\]+\[qQ\]\[0-9\]+,

Re: [wide-int 2/8] Fix ubsan internal-fn.c handling

Richard Sandiford rdsandif...@googlemail.com writes:
 This code was mixing hprec and hprec*2 wide_ints.  The simplest fix
 seemed to be to introduce a function that gives the minimum precision
 necessary to represent a value, which also means that no temporary
 wide_ints are needed.

 Other places might be able to use this too, but I'd like to look at
 that after the merge.

 The patch series fixed a regression in c-c++-common/ubsan/overflow-2.c
 and I assume it's due to this change.

 Tested on x86_64-linux-gnu.  OK to install?

Richard B. expressed doubts about this on IRC, so for a bit more detail:

The comparisons we're doing are on the range of an SSA name.
There are three ways that these ranges could be stored in the
range_info_def:

(1) as INTEGER_CSTs.  This was felt to be unacceptable because it would
create too many garbage constants.

(2) as widest_ints.  This too was unacceptable because it would bloat
the range_info_def.

(3) as a form of wide_int in which the HWIs are allocated as a trailing
part of the containing structure.  This means that range_info_defs
for 64-bit types only have 3 HWIs (smaller than now).

We went for (3).

Having decided to store the ranges like wide_ints, the question then is:
what about the get/set_range_info interface?  Two obvious options are:

(a) present the ranges as wide_ints.

(b) present the ranges as widest_ints, converting in and out as necessary.

(a) is more efficient and seems to fit well with the pre-ubsan callers,
so that's what was chosen.

In the patch we have two wide_ints that have the same precision as the
SSA name.  The values we were creating via wi::min_value and wi::max_value
instead had half that precision.  This is the same kind of mismatch as
you'd get comparing HImode and SImode in RTL, say.

We could fix the bug by using something like:

  if (wi::les_p (arg0_max, wi::mask (hprec, false, prec))
      && wi::les_p (wi::mask (hprec, true, prec), arg0_min))

etc.  Or we could extend the wide_ints to widest_ints so that precision
doesn't matter when doing the comparisons.  But both those options
involve temporaries and seem unnecessarily complicated.  All we're
really asking here is: what is the minimum precision needed to represent
this constant?  That's something that could be generally useful
(e.g. when checking whether a value fits a type) so the patch adds
a corresponding wi:: function.
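
As a rough illustration of the idea (a standalone sketch on plain 64-bit
integers, not the wide-int implementation): min_signed_precision below is a
hypothetical stand-in for wi::min_precision (x, SIGNED), and classify_range
and operand_size are invented names that mirror the small/medium tests
made on the two ends of the range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for wi::min_precision (x, SIGNED): the smallest
// number of bits in which X fits as a two's complement signed value.
unsigned int min_signed_precision(int64_t x)
{
    // For a negative value the significant bits are those of ~x;
    // in both cases one extra bit is needed for the sign.
    uint64_t bits = x < 0 ? ~static_cast<uint64_t>(x)
                          : static_cast<uint64_t>(x);
    unsigned int n = 0;
    while (bits != 0) {
        ++n;
        bits >>= 1;
    }
    return n + 1;
}

// Invented classification, mirroring op0_small_p / op0_medium_p.
enum class operand_size { small, medium, large };

operand_size classify_range(int64_t min, int64_t max, unsigned int hprec)
{
    unsigned int mprec0 = min_signed_precision(min);
    unsigned int mprec1 = min_signed_precision(max);
    if (mprec0 <= hprec && mprec1 <= hprec)
        return operand_size::small;   // fits a signed half-width value
    if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
        return operand_size::medium;  // needs one more bit than that
    return operand_size::large;
}
```

Because only small unsigned counts are compared, no constant of either
precision has to be built, which is the point made above about avoiding
temporaries.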

Thanks,
Richard



 Thanks,
 Richard


 Index: gcc/internal-fn.c
 ===
 --- gcc/internal-fn.c 2014-04-22 20:31:10.516895118 +0100
 +++ gcc/internal-fn.c 2014-04-22 20:31:25.842005530 +0100
 @@ -478,7 +478,7 @@ ubsan_expand_si_overflow_mul_check (gimp
 rtx do_overflow = gen_label_rtx ();
 rtx hipart_different = gen_label_rtx ();
  
 -   int hprec = GET_MODE_PRECISION (hmode);
 +   unsigned int hprec = GET_MODE_PRECISION (hmode);
 rtx hipart0 = expand_shift (RSHIFT_EXPR, mode, op0, hprec,
 NULL_RTX, 0);
 hipart0 = gen_lowpart (hmode, hipart0);
 @@ -513,12 +513,11 @@ ubsan_expand_si_overflow_mul_check (gimp
 wide_int arg0_min, arg0_max;
 if (get_range_info (arg0, arg0_min, arg0_max) == VR_RANGE)
   {
 -   if (wi::les_p (arg0_max, wi::max_value (hprec, SIGNED))
 -       && wi::les_p (wi::min_value (hprec, SIGNED), arg0_min))
 +   unsigned int mprec0 = wi::min_precision (arg0_min, SIGNED);
 +   unsigned int mprec1 = wi::min_precision (arg0_max, SIGNED);
 +   if (mprec0 <= hprec && mprec1 <= hprec)
       op0_small_p = true;
 -   else if (wi::les_p (arg0_max, wi::max_value (hprec, UNSIGNED))
 -            && wi::les_p (~wi::max_value (hprec, UNSIGNED),
 -                          arg0_min))
 +   else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
       op0_medium_p = true;
     if (!wi::neg_p (arg0_min, TYPE_SIGN (TREE_TYPE (arg0))))
   op0_sign = 0;
 @@ -531,12 +530,11 @@ ubsan_expand_si_overflow_mul_check (gimp
 wide_int arg1_min, arg1_max;
 if (get_range_info (arg1, arg1_min, arg1_max) == VR_RANGE)
   {
 -   if (wi::les_p (arg1_max, wi::max_value (hprec, SIGNED))
 -       && wi::les_p (wi::min_value (hprec, SIGNED), arg1_min))
 +   unsigned int mprec0 = wi::min_precision (arg1_min, SIGNED);
 +   unsigned int mprec1 = wi::min_precision (arg1_max, SIGNED);
 +   if (mprec0 <= hprec && mprec1 <= hprec)
       op1_small_p = true;
 -   else if (wi::les_p (arg1_max, wi::max_value (hprec, UNSIGNED))
 -            && wi::les_p (~wi::max_value (hprec, UNSIGNED),
 -                          arg1_min))
 +   else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)

Re: [Patch ARM] Allow any register for DImode values in Thumb2.

2014-04-23 Thread Ramana Radhakrishnan

On Wed, Apr 23, 2014 at 2:06 PM, Ramana Radhakrishnan
ramana@googlemail.com wrote:
 On Wed, Apr 23, 2014 at 1:53 PM, Christophe Lyon
 christophe.l...@linaro.org wrote:
 On 27 February 2014 14:58, Ramana Radhakrishnan ramra...@arm.com wrote:
 Hi

 I noticed that for T32 we don't allow any old register for DImode values.
 The restriction of an even register is true only for ARM state because the
 ISA doesn't allow any old register in this place. In a few large .i files
 that I had knocking about, I noticed a nice drop in stack usage and a
 generally improved register allocation strategy.

 Queued for stage1 after suitable testing including a bootstrap and
 regression test in Thumb2 found no issues.

 regards
 Ramana

 DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

 * config/arm/arm.c (arm_hard_regno_mode_ok): Loosen restrictions on
 core registers for DImode values in Thumb2.


 Hi Ramana,

 I've noticed some regressions after this patch has been committed (rev 
 209615):

   gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions
   gcc.c-torture/compile/pr34856.c  -O3 -fomit-frame-pointer -funroll-loops
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
 -fno-use-linker-plugin -flto-partition=none
   gcc.c-torture/execute/scal-to-vec1.c compilation,  -O2 -flto
 -fuse-linker-plugin -fno-fat-lto-objects


 Now all produce ICE in several GCC configurations (mostly when
 generating thumb code) eg:
 --target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb
 --target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb


 Thanks for the report - I'll have a look. I've had this in a tree for
 testing for some time that runs these configurations, at least the
 bare-metal arm-none-eabi one with multilib testing for thumb.

Uggh I hate it that gmail sometimes cuts off your sentences.

Needless to say, this is surprising.


 but it's OK for target arm-none-linux-gnueabihf.

 See 
 http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html

 Christophe.

RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

2014-04-23 Thread Robert Suchanek

 Yeah, I think the lack of elimination is the problem.  process_address
 eliminates $frame temporarily before checking whether the address
 is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
 original uneliminated address.  So the legitimate_address_p hook sees
 the $sp-based address but the W constraint only sees the $frame-based
 address (which might or might not be valid, depending on whether $frame
 is eliminated to the stack or hard frame pointer).  I think the constraints
 should see the eliminated address too.

That makes sense and explains why it worked when $frame was eliminated
to hard frame pointer but didn't for the stack pointer.

 BTW, we might want to define something like:
 
 #define MODE_BASE_REG_CLASS(MODE) \
   (TARGET_MIPS16 \
    ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
    : GR_REGS)
 
 instead of BASE_REG_CLASS.  It might lead to slightly better code
 (or not -- if it doesn't then don't bother :-)).

I have already tried it and no visible difference was seen.

 If this patch is OK then I think the only thing blocking the switch
 to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
 that test on MIPS for now, until there's a consensus about what X means
 for asms.

The patch worked for me and passed the regression test. Thanks.

If we were going to XFAIL the test then it would apply specifically for
-mips16 -O1.  In any other combination it appears to work. Would that be a
stopper?

Below is the revised patch addressing all the comments and changes so far.

Regards,
Robert

2014-03-26  Robert Suchanek  robert.sucha...@imgtec.com

* lra-constraints.c (base_to_reg): New function.
(process_address): Use new function.

* config/mips/constraints.md (d): BASE_REG_CLASS
replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS.
* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
Remove use !strict_p for MIPS16.
(mips_register_priority): New function that implements
the target hook TARGET_REGISTER_PRIORITY.
(mips_spill_class): Likewise for TARGET_SPILL_CLASS
(mips_lra_p): Likewise for TARGET_LRA_P.
* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
classes.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BASE_REG_CLASS): Use M16_SP_REGS.
* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
tuned for LRA. New set attribute to enable alternatives
depending on the register allocator used.
(*lea64): Disable pattern for MIPS16.
* config/mips/mips.opt
(mlra): New option

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index f6834fd..fa33c30 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "TARGET_MIPS16 ? M16_REGS : GR_REGS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 45256e9..81b6c26 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -655,7 +655,7 @@ const enum reg_class 
mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   M16_REGS,M16_STORE_REGS,  LEA_REGS,LEA_REGS,
   LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS,
   T_REG,   PIC_FN_ADDR_REG, LEA_REGS,LEA_REGS,
-  LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS,
+  LEA_REGS,M16_SP_REGS, LEA_REGS,LEA_REGS,
 
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
@@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum 
machine_mode mode,
 return true;
 
   /* In MIPS16 mode, the stack pointer can only address word and doubleword
- values, nothing smaller.  There are two problems here:
-
-   (a) Instantiating virtual registers can introduce new uses of the
-  stack pointer.  If these virtual registers are valid addresses,
-  the stack pointer should be too.
-
-   (b) Most uses of the stack pointer are not made explicit until
-  FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated.
-  We don't know until that stage whether we'll be eliminating to the
-  stack pointer (which needs the restriction) or the hard frame
-  pointer (which doesn't).
-
- All in all, it seems more consistent to only enforce this restriction
- during and after reload.  */
+ values, nothing smaller.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12115,6 +12102,18 @@

Re: [wide-int 2/8] Fix ubsan internal-fn.c handling

On Wed, Apr 23, 2014 at 3:29 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Richard Sandiford rdsandif...@googlemail.com writes:
 This code was mixing hprec and hprec*2 wide_ints.  The simplest fix
 seemed to be to introduce a function that gives the minimum precision
 necessary to represent a value, which also means that no temporary
 wide_ints are needed.

 Other places might be able to use this too, but I'd like to look at
 that after the merge.

 The patch series fixed a regression in c-c++-common/ubsan/overflow-2.c
 and I assume it's due to this change.

 Tested on x86_64-linux-gnu.  OK to install?

 Richard B. expressed doubts about this on IRC, so for a bit more detail:

 The comparisons we're doing are on the range of an SSA name.
 There are three ways that these ranges could be stored in the
 range_info_def:

 (1) as INTEGER_CSTs.  This was felt to be unacceptable because it would
 create too many garbage constants.

 (2) as widest_ints.  This too was unacceptable because it would bloat
 the range_info_def.

 (3) as a form of wide_int in which the HWIs are allocated as a trailing
 part of the containing structure.  This means that range_info_defs
 for 64-bit types only have 3 HWIs (smaller than now).

 We went for (3).

 Having decided to store the ranges like wide_ints, the question then is:
 what about the get/set_range_info interface?  Two obvious options are:

 (a) present the ranges as wide_ints.

 (b) present the ranges as widest_ints, converting in and out as necessary.

 (a) is more efficient and seems to fit well with the pre-ubsan callers,
 so that's what was chosen.

 In the patch we have two wide_ints that have the same precision as the
 SSA name.  The values we were creating via wi::min_value and wi::max_value
 instead had half that precision.  This is the same kind of mismatch as
 you'd get comparing HImode and SImode in RTL, say.

 We could fix the bug by using something like:

   if (wi::les_p (arg0_max, wi::mask (hprec, false, prec))
       && wi::les_p (wi::mask (hprec, true, prec), arg0_min))

 etc.  Or we could extend the wide_ints to widest_ints so that precision
 doesn't matter when doing the comparisons.  But both those options
 involve temporaries and seem unnecessarily complicated.  All we're
 really asking here is: what is the minimum precision needed to represent
 this constant?  That's something that could be generally useful
 (e.g. when checking whether a value fits a type) so the patch adds
 a corresponding wi:: function.

Ah, that makes sense now ;)

Thus the patch is ok.

Thanks,
Richard.

 Thanks,
 Richard



 Thanks,
 Richard


 Index: gcc/internal-fn.c
 ===
 --- gcc/internal-fn.c 2014-04-22 20:31:10.516895118 +0100
 +++ gcc/internal-fn.c 2014-04-22 20:31:25.842005530 +0100
 @@ -478,7 +478,7 @@ ubsan_expand_si_overflow_mul_check (gimp
 rtx do_overflow = gen_label_rtx ();
 rtx hipart_different = gen_label_rtx ();

 -   int hprec = GET_MODE_PRECISION (hmode);
 +   unsigned int hprec = GET_MODE_PRECISION (hmode);
 rtx hipart0 = expand_shift (RSHIFT_EXPR, mode, op0, hprec,
 NULL_RTX, 0);
 hipart0 = gen_lowpart (hmode, hipart0);
 @@ -513,12 +513,11 @@ ubsan_expand_si_overflow_mul_check (gimp
 wide_int arg0_min, arg0_max;
 if (get_range_info (arg0, arg0_min, arg0_max) == VR_RANGE)
   {
 -   if (wi::les_p (arg0_max, wi::max_value (hprec, SIGNED))
 -       && wi::les_p (wi::min_value (hprec, SIGNED), arg0_min))
 +   unsigned int mprec0 = wi::min_precision (arg0_min, SIGNED);
 +   unsigned int mprec1 = wi::min_precision (arg0_max, SIGNED);
 +   if (mprec0 <= hprec && mprec1 <= hprec)
       op0_small_p = true;
 -   else if (wi::les_p (arg0_max, wi::max_value (hprec, UNSIGNED))
 -            && wi::les_p (~wi::max_value (hprec, UNSIGNED),
 -                          arg0_min))
 +   else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
       op0_medium_p = true;
     if (!wi::neg_p (arg0_min, TYPE_SIGN (TREE_TYPE (arg0))))
   op0_sign = 0;
 @@ -531,12 +530,11 @@ ubsan_expand_si_overflow_mul_check (gimp
 wide_int arg1_min, arg1_max;
 if (get_range_info (arg1, arg1_min, arg1_max) == VR_RANGE)
   {
 -   if (wi::les_p (arg1_max, wi::max_value (hprec, SIGNED))
 -       && wi::les_p (wi::min_value (hprec, SIGNED), arg1_min))
 +   unsigned int mprec0 = wi::min_precision (arg1_min, SIGNED);
 +   unsigned int mprec1 = wi::min_precision (arg1_max, SIGNED);
 +   if (mprec0 <= hprec && mprec1 <= hprec)
   op1_small_p = true;
 -   else if (wi::les_p (arg1_max, wi::max_value (hprec,

[PATCH] Add MIPS -mxpa command line option.

2014-04-23 Thread Andrew Bennett

Hi,

This patch adds a GCC MIPS command line option (-mxpa) to enable/disable
support for the eXtended Physical Address (XPA) instructions within
the assembler.

The ChangeLog and patch are shown below.

Many thanks,


Andrew



* doc/invoke.texi: Document -mxpa and -mno-xpa MIPS command line
options.
* config/mips/mips.opt (mxpa): New option.
* config/mips/mips.h (ASM_SPEC): Pass mxpa and mno-xpa to the
assembler.



diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index b25865b..91a33ef 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1176,6 +1176,7 @@ struct mips_cpu_info {
 %{mmcu} %{mno-mcu} \
 %{meva} %{mno-eva} \
 %{mvirt} %{mno-virt} \
+%{mxpa} %{mno-xpa} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 6ee5398..c992cee 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -392,6 +392,10 @@ mvirt
 Target Report Var(TARGET_VIRT)
 Use Virtualization Application Specific instructions
 
+mxpa
+Target Report Var(TARGET_XPA)
+Use eXtended Physical Address (XPA) instructions
+
 mvr4130-align
 Target Report Mask(VR4130_ALIGN)
 Perform VR4130-specific alignment optimizations
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff43f26..22a66e8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -781,6 +781,7 @@ Objective-C and Objective-C++ Dialects}.
 -mmcu -mmno-mcu @gol
 -meva -mno-eva @gol
 -mvirt -mno-virt @gol
+-mxpa -mno-xpa @gol
 -mmicromips -mno-micromips @gol
 -mfpu=@var{fpu-type} @gol
 -msmartmips  -mno-smartmips @gol
@@ -17494,6 +17495,12 @@ Use (do not use) the MIPS Enhanced Virtual Addressing 
instructions.
 @opindex mno-virt
 Use (do not use) the MIPS Virtualization Application Specific instructions.
 
+@item -mxpa
+@itemx -mno-xpa
+@opindex mxpa
+@opindex mno-xpa
+Use (do not use) the MIPS eXtended Physical Address (XPA) instructions.
+
 @item -mlong64
 @opindex mlong64
 Force @code{long} types to be 64 bits wide.  See @option{-mlong32} for
-- 
1.7.1

Re: calloc = malloc + memset

On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote:
 Thanks for the comments!


 On Fri, 18 Apr 2014, Jakub Jelinek wrote:

 The passes.def change makes me a little bit nervous, but if it works,
 perhaps.


 Would you prefer running the pass twice? I thought there would be less
 resistance to moving the pass than duplicating it.

Indeed.  I think placing it after loops and CSE (thus what you have done)
makes sense.  strlenopt itself shouldn't enable much additional
optimizations.  But well, pass ordering is always tricky.
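
For readers joining the thread here, the source-level equivalence behind
the subject line can be sketched as follows. This is only an illustration
with invented function names; it makes no claim about which exact code
shapes the pass in the patch recognizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Two-step form: allocate, then clear the whole block.
unsigned char *zeroed_two_step(std::size_t n)
{
    void *p = std::malloc(n);
    if (p)
        std::memset(p, 0, n);
    return static_cast<unsigned char *>(p);
}

// One-step form that the pair above is equivalent to.
unsigned char *zeroed_one_step(std::size_t n)
{
    return static_cast<unsigned char *>(std::calloc(n, 1));
}

// Small checker used only to compare the two results.
bool all_zero(const unsigned char *p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != 0)
            return false;
    return true;
}
```

Both functions hand back n zeroed bytes (or a null pointer on failure), so
replacing the first form by the second does not change what the caller
observes.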

Didn't look at the rest of the changes, but Jakub is certainly able to
approve the patch so I leave it to him.

Thanks,
Richard.

 By the way, I think even
 passes we run only once should have the required functions implemented so
 they can be run several times (at least most of them), in case users want to
 do that in plugins. I was surprised when I tried adding a second strlen pass
 and the compiler refused.


 --- gcc/testsuite/g++.dg/tree-ssa/calloc.C  (revision 0)
 +++ gcc/testsuite/g++.dg/tree-ssa/calloc.C  (working copy)
 @@ -0,0 +1,35 @@
 +/* { dg-do compile { target c++11 } } */
 +/* { dg-options "-O3 -fdump-tree-optimized" } */
 +
 +#include <new>
 +#include <vector>
 +#include <cstdlib>
 +
 +void g(void*);
 +inline void* operator new(std::size_t sz)
 +{
 +  void *p;
 +
 +  if (sz == 0)
 +sz = 1;
 +
 +  // Slightly modified from the libsupc++ version, that one has 2 calls
 +  // to malloc which makes it too hard to optimize.
 +  while ((p = std::malloc (sz)) == 0)
 +{
 +  std::new_handler handler = std::get_new_handler ();
 +  if (! handler)
 +throw std::bad_alloc();
 +  handler ();
 +}
 +  return p;
 +}
 +
 +void f(void*p,int n){
 +  new(p)std::vector<int>(n);
 +}
 +
 +/* { dg-final { scan-tree-dump-times "calloc" 1 "optimized" } } */
 +/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */
 +/* { dg-final { scan-tree-dump-not "memset" "optimized" } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */


 This looks far too fragile to me: any time the libstdc++
 or glibc headers change a little bit, you might need to adjust the
 dg-final directives.  Much better would be if you just provided
 the prototypes yourself and subset of the std::vector you really need for
 the testcase.  You can throw some class or int, it doesn't have to be
 std::bad_alloc, etc.


 I don't understand what seems so fragile to you. There is a single function
 in the .optimized dump, which just calls calloc in a loop. It doesn't seem
 that likely that a change in glibc/libstdc++ would make an extra memset pop
 up. A change in libstdc++ could easily prevent the optimization completely
 (I'd like to hope we can avoid that, half of the purpose of the testcase was
 making sure libstdc++ didn't change in a bad way), but I don't really see
 how it could keep it in a way that requires tweaking dg-final.

 While trying to write a standalone version, I hit again many missed
 optimizations, getting such nice things in the .optimized dump as:

   _12 = p_13 + sz_7;
   if (_12 != p_13)

 or:

   _12 = p_13 + sz_7;
   _30 = (unsigned long) _12;
   _9 = p_13 + 4;
   _10 = (unsigned long) _9;
   _11 = _30 - _10;
   _22 = _11 /[ex] 4;
   _21 = _22;
   _40 = _21 + 1;
   _34 = _40 * 4;

 It is embarrassing... I hope the combiner GSoC will work well and we can
 just add a dozen patterns to handle this before 4.10.


 --- gcc/testsuite/gcc.dg/strlenopt-9.c  (revision 208772)
 +++ gcc/testsuite/gcc.dg/strlenopt-9.c  (working copy)
 @@ -11,21 +11,21 @@ fn1 (int r)
   optimized away.  */
return strchr (p, '\0');
  }

  __attribute__((noinline, noclone)) size_t
  fn2 (int r)
  {
char *p, q[10];
   strcpy (q, "abc");
   p = r ? "a" : q;
 -  /* String length for p varies, therefore strlen below isn't
 +  /* String length is constant for both alternatives, and strlen is
   optimized away.  */
return strlen (p);


 Is this because of jump threading?


 It is PRE that turns:

   if (r_4(D) == 0)
     goto <bb 5>;
   else
     goto <bb 3>;

   <bb 5>:
   goto <bb 4>;

   <bb 3>:

   <bb 4>:
   # p_1 = PHI <&q(5), "a"(3)>
   _5 = __builtin_strlen (p_1);

 into:

   if (r_4(D) == 0)
     goto <bb 5>;
   else
     goto <bb 3>;

   <bb 5>:
   _7 = __builtin_strlen (&q);
   pretmp_8 = _7;
   goto <bb 4>;

   <bb 3>:

   <bb 4>:
   # p_1 = PHI <&q(5), "a"(3)>
   # prephitmp_9 = PHI <pretmp_8(5), 1(3)>
   _5 = prephitmp_9;

 It says:

 Found partial redundancy for expression
 {call_expr__builtin_strlen,p_1}@.MEM_3 (0005)
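
At the source level, the effect of that PRE step on fn2 can be pictured
roughly like this. The two functions are hand-written approximations with
invented names (fn2_before, fn2_after), not compiler output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Shape of fn2 before PRE: a single strlen call after the paths merge.
std::size_t fn2_before(int r)
{
    char q[10];
    std::strcpy(q, "abc");
    const char *p = r ? "a" : q;
    return std::strlen(p);
}

// Shape after PRE: strlen is evaluated only on the path where its value
// is not already known; the other path supplies the constant directly.
std::size_t fn2_after(int r)
{
    char q[10];
    std::strcpy(q, "abc");
    std::size_t len;
    if (r == 0)
        len = std::strlen(q);  // corresponds to pretmp_8 in bb 5
    else
        len = 1;               // strlen ("a"), known on the edge from bb 3
    return len;
}
```

Once the call sits on the q-only path, its argument has a known length
there as well, which is why the comment in the test now says strlen is
optimized away.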


 --- gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c(working copy)
 @@ -0,0 +1,29 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fdump-tree-optimized" } */
 +
 +#include <stdlib.h>
 +#include <string.h>


 Even this I find unsafe.  The strlenopt*.c tests use their custom
 strlenopt.h header for a reason, you might just add a calloc
 prototype in there and use that header.


 Might as well use __builtin_* then.


 +/* Handle a

Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

Robert Suchanek robert.sucha...@imgtec.com writes:
 If we were going to XFAIL the test then it would apply specifically
 for -mips16 -O1.  In any other combination it appears to work. Would
 that be a stopper?

Hmm, in that case maybe we should just leave it failing.  The alternative
would be to skip the test altogether for MIPS, with a PR referencing it,
but that seems a bit over-the-top.

 2014-03-26  Robert Suchanek  robert.sucha...@imgtec.com

   * lra-constraints.c (base_to_reg): New function.
   (process_address): Use new function.

   * config/mips/constraints.md (d): BASE_REG_CLASS
   replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS.
   * config/mips/mips.c (mips_regno_mode_ok_for_base_p):
   Remove use !strict_p for MIPS16.
   (mips_register_priority): New function that implements
   the target hook TARGET_REGISTER_PRIORITY.
   (mips_spill_class): Likewise for TARGET_SPILL_CLASS
   (mips_lra_p): Likewise for TARGET_LRA_P.
   * config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
   classes.
   (REG_CLASS_NAMES): Likewise.
   (REG_CLASS_CONTENTS): Likewise.
   (BASE_REG_CLASS): Use M16_SP_REGS.
   * config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
   tuned for LRA. New set attribute to enable alternatives
   depending on the register allocator used.
   (*lea64): Disable pattern for MIPS16.
   * config/mips/mips.opt
   (mlra): New option

Looks good.

 @@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
return 0;
  }
  
 +/* Return a register priority for hard reg REGNO.  */
 +
 +static int
 +mips_register_priority (int hard_regno)
 +{
 +  /* Treat MIPS16 registers with higher priority than other regs.  */
 +  if (TARGET_MIPS16
 +      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
 +return 1;
 +  return 0;
 +}
 +
  /* Implement TARGET_MEMORY_MOVE_COST.  */
  
  static int
 @@ -18897,6 +18896,21 @@ mips_atomic_assign_expand_fenv (tree *hold, tree 
 *clear, tree *update)
*update = build2 (COMPOUND_EXPR, void_type_node, *update,
   atomic_feraiseexcept_call);
  }
 +
 +static reg_class_t
 +mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
 +   enum machine_mode mode ATTRIBUTE_UNUSED)
 +{
 +  if (TARGET_MIPS16)
 +return SPILL_REGS;
 +  return NO_REGS;
 +}
 +
 +static bool
 +mips_lra_p (void)
 +{
 +  return mips_lra_flag;
 +}
  
  /* Initialize the GCC target structure.  */
  #undef TARGET_ASM_ALIGNED_HI_OP

Please use comments of the form:

  /* Implement TARGET_FOO.  */

above all three functions (instead of the current one in the case of
mips_register_priority), just so that it's painfully obvious that
these are target hooks.

OK for the MIPS part with that change, thanks.

Out of interest, do you see any difference if you include $sp in SPILL_REGS?
That obviously doesn't make much conceptual sense, but it would give a
cleaner class hierarchy.

Richard

Re: [PATCH] Add MIPS -mxpa command line option.

Andrew Bennett andrew.benn...@imgtec.com writes:
 * doc/invoke.texi: Document -mxpa and -mno-xpa MIPS command line
 options.
 * config/mips/mips.opt (mxpa): New option.
 * config/mips/mips.h (ASM_SPEC): Pass mxpa and mno-xpa to the
 assembler.

OK, thanks.  If your account doesn't have gcc access yet then please ask
overseers@ to add it.  Remember to add yourself to MAINTAINERS afterwards :-)

Or if you'd prefer not to get access, I can commit it for you.

Thanks,
Richard

Re: Remove obsolete Solaris 9 support

2014-04-23 Thread Rainer Orth

Uros Bizjak ubiz...@gmail.com writes:

 It looks to me that one part was left in libgcc/config/i386/crtfastmath.c:

 #if !defined __x86_64__ && defined __sun__ && defined __svr4__
 #include <signal.h>
 #include <ucontext.h>
 ...
 #endif

Right, missed it because it carried no Solaris 9 comment.  I'll remove
it after a round of testing.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH 00/89] Compile-time gimple-checking

On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On April 22, 2014 8:56:56 PM CEST, Richard Sandiford 
 rdsandif...@googlemail.com wrote:
David Malcolm dmalc...@redhat.com writes:
 Alternatively we could change the is-a.h API to eliminate this
 discrepancy, and keep the typedefs; giving something like the
following:

   static void
   dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int
spc,
   int flags)
   [...snip...]

   [...later, within pp_gimple_stmt_1:]

  case GIMPLE_SWITCH:
dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc,
flags);
break;

 which is concise, readable, and avoid the change in pointerness
compared
 to the gimple typedef; the local decls above would look like this:
   gimple some_stmt;  /* note how this doesn't have a star... */
   gimple_assign assign_stmt; /* ...and neither do these */
   gimple_cond assign_stmt;
   gimple_phi phi;

 I think this last proposal is my preferred API, but it requires the
 change to is-a.h

 Attached is a proposed change to the is-a.h API that eliminates the
 discrepancy, allowing the use of typedefs with is-a.h (doesn't yet
 compile, but hopefully illustrates the idea).  Note how it changes
the
 API to match C++'s  dynamic_cast operator i.e. you do

   Q* q = dyn_cast<Q*> (p);

 not:

   Q* q = dyn_cast<Q> (p);

Thanks for being flexible. :-)  I like this version too FWIW, for the
reason you said: it really does look like a proper C++ cast.

 Indeed. I wasn't even aware it is different than a C++ cast...

It would be nice if you can change that with a separate patch posted
in a separate thread to be more visible.

Also I see you introduce a const_FOO class with every FOO one.
I wonder whether, now that we have C++, we can address const-correctness
in a less awkward way than with a typedef.  Can you try to go back
in time and see why we did that in the first place?  ISTR that
it was "oh, if we were only using C++ we wouldn't need to jump through
that hoop".

Thanks,
Richard.

 Richard.

If we ever decide to get rid of the typedefs (maybe at the same time as
using auto) then the choice might be different, but that would be a
much
more systematic and easily-automated change than this one.

Thanks,
Richard

Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE

2014-04-23 Thread Kenneth Zadeck


On 04/23/2014 05:47 AM, Richard Biener wrote:

On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:

On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com 
wrote:

Kyrill Tkachov kyrylo.tkac...@arm.com writes:

Ping.
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html
Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk
soon. Bootstrap failure on arm would prevent that...

Sorry for the late reply.  I hadn't forgotten, but I wanted to wait
until I had chance to look into the ICE before replying, which I haven't
had chance to do yet.

They are separable issues, so, I checked in the change.


It's a shame we can't use C++ style casts,
but I suppose that's the price to pay for being able to write
"unsigned HOST_WIDE_INT".

unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were expecting a 
typedef or better.  I slightly prefer the int (1) style, but I think we should 
go the direction of the patch.

Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and
require a 64bit integer type on the host and force all targets to use
a 64bit 'hwi'.  Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate
related changes).

Richard.

I should point out that there is a community that wants to go in the 
opposite direction here.  They are the people with real 32 bit hosts 
who want to go back to a world where they are allowed to make hwi a 32 
bit value.  They have been waiting for wide-int to be committed because 
they see this as a way to get back to a world where most of the math is 
done natively.


I am not part of this community but they feel that if the math that has 
the potential to be big is done in wide-ints, then they can go 
back to using a 32 bit hwi for everything else.  For them, a wide-int 
built on 32 bit hwi's would be a win.


kenny

Re: [PATCH 00/89] Compile-time gimple-checking

On Wed, Apr 23, 2014 at 4:19 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On April 22, 2014 8:56:56 PM CEST, Richard Sandiford 
 rdsandif...@googlemail.com wrote:
David Malcolm dmalc...@redhat.com writes:
 Alternatively we could change the is-a.h API to eliminate this
 discrepancy, and keep the typedefs; giving something like the
following:

   static void
   dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int
spc,
   int flags)
   [...snip...]

   [...later, within pp_gimple_stmt_1:]

  case GIMPLE_SWITCH:
dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc,
flags);
break;

 which is concise, readable, and avoids the change in pointerness
compared
 to the gimple typedef; the local decls above would look like this:
   gimple some_stmt;  /* note how this doesn't have a star... */
   gimple_assign assign_stmt; /* ...and neither do these */
   gimple_cond assign_stmt;
   gimple_phi phi;

 I think this last proposal is my preferred API, but it requires the
 change to is-a.h

 Attached is a proposed change to the is-a.h API that eliminates the
 discrepancy, allowing the use of typedefs with is-a.h (doesn't yet
 compile, but hopefully illustrates the idea).  Note how it changes
the
 API to match C++'s  dynamic_cast operator i.e. you do

   Q* q = dyn_cast<Q*> (p);

 not:

   Q* q = dyn_cast<Q> (p);

Thanks for being flexible. :-)  I like this version too FWIW, for the
reason you said: it really does look like a proper C++ cast.

 Indeed. I wasn't even aware it is different from a C++ cast...

 It would be nice if you can change that with a separate patch posted
 in a separate thread to be more visible.

 Also I see you introduce a const_FOO class with every FOO one.
 I wonder whether, now that we have C++, we can address const-correctness
 in a less awkward way than with a typedef.  Can you try to go back
 in time and see why we went with that in the first place?  ISTR that
 it was "oh, if we were only using C++ we wouldn't need to jump through
 that hoop".

To followup myself here, it's because 'tree' is a typedef to a pointer
and thus 'const tree' is different from 'const tree_node *'.

Not sure why we re-introduced the 'mistake' of making 'tree' a pointer
when we introduced 'gimple'.  If we were to make 'gimple' the class
type itself we could use gimple *, const gimple * and also const gimple &
(when a NULL pointer is not expected).
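
To make the point about pointer typedefs concrete, here is a minimal sketch.  The names node, node_ptr and const_node_ptr are hypothetical stand-ins for tree_node, tree and const_tree; this is not GCC's real code, only an illustration of why 'const' on a pointer typedef does not give a pointer-to-const.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration only; not GCC's definitions.
struct node { int code; };
typedef node *node_ptr;              // pointer typedef, like 'tree' or 'gimple'
typedef const node *const_node_ptr;  // the extra typedef this design forces

// 'const node_ptr' makes the pointer itself const, not the pointee.
static_assert (std::is_same<const node_ptr, node *const>::value,
               "const applies to the pointer");
static_assert (!std::is_same<const node_ptr, const node *>::value,
               "so it is not a pointer-to-const");
static_assert (std::is_same<const_node_ptr, const node *>::value,
               "hence the separate const_ typedef");

// If the name denotes the class type itself, plain C++ spellings suffice.
int code_of (const node *n) { return n ? n->code : -1; }  // may be NULL
int code_of (const node &n) { return n.code; }            // never NULL

// The pointee stays writable through a 'const node_ptr'.
int overwrite_through_const_typedef ()
{
  node n = { 1 };
  const node_ptr p = &n;
  p->code = 9;        // compiles: only the pointer is const
  return n.code;
}
```

The static_asserts are checked at compile time, so the block building at all is the demonstration.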

Anyway, gazillion new typedefs are ugly :/  (typedefs are ugly)

Richard.

 Thanks,
 Richard.

 Richard.

If we ever decide to get rid of the typedefs (maybe at the same time as
using auto) then the choice might be different, but that would be a
much
more systematic and easily-automated change than this one.

Thanks,
Richard

Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE

On Wed, Apr 23, 2014 at 4:29 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:
 On 04/23/2014 05:47 AM, Richard Biener wrote:

 On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:

 On Apr 22, 2014, at 8:33 AM, Richard Sandiford
 rdsandif...@googlemail.com wrote:

 Kyrill Tkachov kyrylo.tkac...@arm.com writes:

 Ping.
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html
 Any ideas? I recall chatter on IRC that we want to merge wide-int into
 trunk
 soon. Bootstrap failure on arm would prevent that...

 Sorry for the late reply.  I hadn't forgotten, but I wanted to wait
 until I had chance to look into the ICE before replying, which I haven't
 had chance to do yet.

 They are separable issues, so, I checked in the change.

 It's a shame we can't use C++ style casts,
 but I suppose that's the price to pay for being able to write
 "unsigned HOST_WIDE_INT".

 unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were
 expecting a typedef or better.  I slightly prefer the int (1) style, but I
 think we should go the direction of the patch.

 Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and
 require a 64bit integer type on the host and force all targets to use
 a 64bit 'hwi'.  Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate
 related changes).

 Richard.

 I should point out that there is a community that wants to go in the
 opposite direction here.  They are the people with real 32 bit hosts who
 want to go back to a world where they are allowed to make hwi a 32 bit
 value.  They have been waiting for wide-int to be committed because they see
 this as a way to get back to a world where most of the math is done natively.

 I am not part of this community but they feel that if the math that has the
 potential to be big is done in wide-ints, then they can go back to
 using a 32 bit hwi for everything else.  For them, a wide-int built on 32
 bit hwi's would be a win.

That wide-int builds on HWI is an implementation detail.  It can easily
be changed to build on int32_t.
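
As a rough illustration of why the limb type can be an implementation detail, here is a minimal sketch of fixed-precision addition over a vector of limbs, parameterised on the limb type.  add_limbs and its little-endian limb layout are assumptions made up for this example; this is not how wide-int is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, not GCC's wide_int.  Limbs are unsigned and stored
// least-significant first; both operands must have the same length.
template <typename Limb>
std::vector<Limb>
add_limbs (const std::vector<Limb> &a, const std::vector<Limb> &b)
{
  std::vector<Limb> out (a.size ());
  Limb carry = 0;
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      Limb s = a[i] + b[i];   // wraps modulo 2^(limb bits)
      Limb c1 = s < a[i];     // carry out of a[i] + b[i]
      Limb t = s + carry;
      Limb c2 = t < s;        // carry out of adding the carry-in
      out[i] = t;
      carry = c1 | c2;
    }
  return out;                 // final carry is dropped: fixed precision
}
```

The same 128-bit addition can then be run as two 64-bit limbs or four 32-bit limbs, for example add_limbs<std::uint64_t> or add_limbs<std::uint32_t>, without the caller's arithmetic changing.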

Btw, what important target still supports a 32bit HWI?  None as far as
I know.  Look at config.gcc and what does _not_ set need_64bit_hwint.
Even plain arm needs it.

Richard.

 kenny

Re: [PATCH 00/89] Compile-time gimple-checking

2014-04-23 Thread Michael Matz

Hi,

On Mon, 21 Apr 2014, David Malcolm wrote:

 This is a greatly-expanded version of:
   http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01262.html
 
 As of r205034 (de6bd75e3c9bc1efe8a6387d48eedaa4dafe622d) and
 r205428 (a90353203da18288cdac1b0b78fe7b22c69fe63f) the various gimple
 statements form a C++ inheritance hierarchy, but we're not yet making much
 use of that in the code: everything refers to just gimple (or
 const_gimple), and type-checking is performed at run-time within the
 various gimple_foo_* accessors in gimple.h, and almost nowhere else.
 
 The following patch series introduces compile-time checking of much of
 the handling of gimple statements.

FWIW, I still don't like any of this for reasons already outlined here: 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00773.html

(basically: I consider automatically creating types a very bad idea.  You 
do do that by simply creating a type for every gimple code.)

 case GIMPLE_SWITCH:
   dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags);
   break;
 
 where the ->as_a_gimple_switch is a no-op cast from gimple to the more
 concrete gimple_switch in a release build, with runtime checking for
 code == GIMPLE_SWITCH added in a checked build (it uses as_a 
 internally).

Unlike others here I do like the cast-as-method (if we absolutely _must_ 
have a complicated type hierarchy for gimple), but would suggest a 
different name: the gimple_ is tautological, and the a_ is just noise, so 
just name it gs->as_switch() (incidentally, it's then _really_ shorter 
than the ugly is_a/as_a syntax).
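
For illustration, a minimal sketch of what such a cast-as-method could look like.  The types stmt and switch_stmt and the helper demo_labels are hypothetical miniatures invented for this example, not the gimple classes from the patch series.

```cpp
#include <cassert>

// Hypothetical miniature hierarchy; not GCC's gimple classes.
enum stmt_code { STMT_ASSIGN, STMT_SWITCH };

struct switch_stmt;

struct stmt
{
  stmt_code code;
  explicit stmt (stmt_code c) : code (c) {}

  // Cast-as-method: a downcast that is checked via assert.
  switch_stmt *as_switch ();
};

struct switch_stmt : stmt
{
  int num_labels;
  explicit switch_stmt (int n) : stmt (STMT_SWITCH), num_labels (n) {}
};

inline switch_stmt *
stmt::as_switch ()
{
  // Stands in for a checking build; with NDEBUG this is a plain cast.
  assert (code == STMT_SWITCH);
  return static_cast<switch_stmt *> (this);
}

// Usage: start from the base pointer and downcast at the point of use.
int
demo_labels (int n)
{
  switch_stmt sw (n);
  stmt *s = &sw;
  return s->as_switch ()->num_labels;
}
```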


Ciao,
Michael.

[PATCH] Avoid going to GENERIC for TER expansion


$subject - we have the sepops interface for this.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

* expr.c (expand_expr_real_1): Avoid gimple_assign_rhs_to_tree
during TER and instead use the sepops interface for expanding
non-GIMPLE_SINGLE_RHS.

Index: gcc/expr.c
===
*** gcc/expr.c  (revision 209559)
--- gcc/expr.c  (working copy)
*** expand_expr_real_1 (tree exp, rtx target
*** 9395,9406 
if (g)
{
  rtx r;
! location_t saved_loc = curr_insn_location ();
! 
! set_curr_insn_location (gimple_location (g));
! r = expand_expr_real (gimple_assign_rhs_to_tree (g), target,
!   tmode, modifier, NULL, inner_reference_p);
! set_curr_insn_location (saved_loc);
  if (REG_P (r) && !REG_EXPR (r))
set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
  return r;
--- 9395,9427 
if (g)
{
  rtx r;
! ops.code = gimple_assign_rhs_code (g);
!   switch (get_gimple_rhs_class (ops.code))
!   {
!   case GIMPLE_TERNARY_RHS:
! ops.op2 = gimple_assign_rhs3 (g);
! /* Fallthru */
!   case GIMPLE_BINARY_RHS:
! ops.op1 = gimple_assign_rhs2 (g);
! /* Fallthru */
!   case GIMPLE_UNARY_RHS:
! ops.op0 = gimple_assign_rhs1 (g);
! ops.type = TREE_TYPE (gimple_assign_lhs (g));
! ops.location = gimple_location (g);
! r = expand_expr_real_2 (ops, target, tmode, modifier);
! break;
!   case GIMPLE_SINGLE_RHS:
! {
!   location_t saved_loc = curr_insn_location ();
!   set_curr_insn_location (gimple_location (g));
!   r = expand_expr_real (gimple_assign_rhs1 (g), target,
! tmode, modifier, NULL, inner_reference_p);
!   set_curr_insn_location (saved_loc);
!   break;
! }
!   default:
! gcc_unreachable ();
!   }
  if (REG_P (r) && !REG_EXPR (r))
set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
  return r;

Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE

2014-04-23 Thread Jakub Jelinek

On Wed, Apr 23, 2014 at 04:36:23PM +0200, Richard Biener wrote:
  I should point out that there is a community that wants to go in the
  opposite direction here.  They are the people with real 32 bit hosts who
  want to go back to a world where they are allowed to make hwi a 32 bit
  value.  They have been waiting for wide-int to be committed because they see
  this as a way to get back to a world where most of the math is done natively.
 
  I am not part of this community but they feel that if the math that has the
  potential to be big is done in wide-ints, then they can go back to
  using a 32 bit hwi for everything else.  For them, a wide-int built on 32
  bit hwi's would be a win.

I don't think wide-int will be more efficient than 64-bit integer
support on 32-bit architectures; if it were, that would just mean that we
need to improve support for the double word integers for that target.
So what exactly would be the advantage of going back to 32-bit HWI, say on
i?86?

Jakub

Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE

2014-04-23 Thread Kenneth Zadeck


On 04/23/2014 10:36 AM, Richard Biener wrote:

On Wed, Apr 23, 2014 at 4:29 PM, Kenneth Zadeck
zad...@naturalbridge.com wrote:

On 04/23/2014 05:47 AM, Richard Biener wrote:

On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:

On Apr 22, 2014, at 8:33 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:

Kyrill Tkachov kyrylo.tkac...@arm.com writes:

Ping.
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html
Any ideas? I recall chatter on IRC that we want to merge wide-int into
trunk
soon. Bootstrap failure on arm would prevent that...

Sorry for the late reply.  I hadn't forgotten, but I wanted to wait
until I had chance to look into the ICE before replying, which I haven't
had chance to do yet.

They are separable issues, so, I checked in the change.


It's a shame we can't use C++ style casts,
but I suppose that's the price to pay for being able to write
"unsigned HOST_WIDE_INT".

unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were
expecting a typedef or better.  I slightly prefer the int (1) style, but I
think we should go the direction of the patch.

Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and
require a 64bit integer type on the host and force all targets to use
a 64bit 'hwi'.  Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate
related changes).

Richard.

I should point out that there is a community that wants to go in the
opposite direction here.  They are the people with real 32 bit hosts who
want to go back to a world where they are allowed to make hwi a 32 bit
value.  They have been waiting for wide-int to be committed because they see
this as a way to get back to a world where most of the math is done natively.

I am not part of this community but they feel that if the math that has the
potential to be big is done in wide-ints, then they can go back to
using a 32 bit hwi for everything else.  For them, a wide-int built on 32
bit hwi's would be a win.

That wide-int builds on HWI is an implementation detail.  It can easily
be changed to build on int32_t.

Btw, what important target still supports a 32bit HWI?  None for what
I know.  Look at config.gcc and what does _not_ set need_64bit_hwint.
Even plain arm needs it.

I think that originally, hwi was supposed to be a natural integer on the 
host machine and it was corrupted to always be a 64 bit integer.


Right now, wide-int is built on hwis which are always 64 bits.  On a 
32 bit machine, this means that there are two levels of abstraction to 
get to the hardware, one to get from wide-int to 64 bits and one to get 
from 64 bits to 32 bits.


The easy part of converting wide-int to run natively on a 32 bit machine 
is going to be the internals of wide-int.  Of course until you test it 
you never know, but we did try very hard not to care about the internal 
size of the rep. The hard part will be the large number of places 
where wide-int converts to or from hwi. Some of those callers expect 
things to really be 64 bits and some of them deal with numbers that are 
always small enough to be implemented in the efficient native 
representation.  I think that the push against you is that the latter 
case should not be converted to int64_t.



Richard.


kenny

Re: -fuse-caller-save - Collect register usage information

2014-04-23 Thread Vladimir Makarov


On 2014-04-23, 6:41 AM, Tom de Vries wrote:

On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee
call clobbers
in CALL_INSN_FUNCTION_USAGE.


Vladimir,

This is the updated version of the previously approved patch
http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html , updated for
the new hook call_fusage_contains_non_callee_clobbers.

The only difference is in the functions get_call_reg_set_usage and
collect_fn_hard_reg_usage which use the hook.

OK for trunk?

2013-04-29  Radovan Obradovic  robrado...@mips.com
 Tom de Vries  t...@codesourcery.com

 * cgraph.h (struct cgraph_node): Add function_used_regs,
 function_used_regs_initialized and function_used_regs_valid fields.
 * final.c: Move include of hard-reg-set.h to before rtl.h to declare
 find_all_hard_reg_sets.
 (collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
 (get_call_reg_set_usage): New function.
 (rest_of_handle_final): Use collect_fn_hard_reg_usage.



It looks ok to me, Tom.  But to be straight, I am not a maintainer for 
this part of the compiler.  So it is just my recommendation.  I guess to 
get an approval for these changes, you should ask Jan Hubicka (cgraph.h) 
or a global or RTL reviewer for final.c.

Re: Add call_fusage_contains_non_callee_clobbers hook

Tom de Vries tom_devr...@mentor.com writes:
 On 22-04-14 17:05, Tom de Vries wrote:
 I've updated the fuse-caller-save patch series to model non-callee
 call clobbers
 in CALL_INSN_FUNCTION_USAGE.


 Vladimir,

 This patch adds a hook to indicate whether a target has added the non-callee 
 call clobbers to CALL_INSN_FUNCTION_USAGE, meaning it's safe to do the 
 fuse-caller-save optimization.

FWIW I think this should be a plain bool rather than a function,
like delay_sched2 etc.

Thanks,
Richard

Re: -fuse-caller-save - Collect register usage information

Tom de Vries tom_devr...@mentor.com writes:
 +/* Collect hard register usage for the current function.  */
 +
 +static void
 +collect_fn_hard_reg_usage (void)
 +{
 +  rtx insn;
 +  int i;
 +  struct cgraph_node *node;
 +
 +  if (!flag_use_caller_save)
 +return;
 +
 +  node = cgraph_get_node (current_function_decl);
 +  gcc_assert (node != NULL);
 +
 +  gcc_assert (!node->function_used_regs_initialized);
 +  node->function_used_regs_initialized = 1;
 +
 +  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
 +{
 +  HARD_REG_SET insn_used_regs;
 +
 +  if (!NONDEBUG_INSN_P (insn))
 + continue;
 +
 +  find_all_hard_reg_sets (insn, &insn_used_regs, false);
 +
 +  if (CALL_P (insn)
 +      && (!targetm.call_fusage_contains_non_callee_clobbers ()
 +          || !get_call_reg_set_usage (insn, &insn_used_regs,
 +                                      call_used_reg_set)))

If the uses of flag_use_caller_save also check
call_fusage_contains_non_callee_clobbers, would it be better to test
them both together here too, rather than waiting to see a call?

 +  /* Be conservative - mark fixed and global registers as used.  */
 +  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
 +  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 +    if (global_regs[i])
 +      SET_HARD_REG_BIT (node->function_used_regs, i);

The loop isn't needed; all globals are fixed.

Thanks again for working on this.

Richard

[PATCH AARCH64] One-line tidy of bit-twiddle expression in aarch64.c

This patch is a small tidy of a more-complicated expression that just flips a 
single bit and can thus be a simple XOR.
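
For the record, the equivalence can be checked exhaustively with a small standalone sketch.  flip_matches_xor is a hypothetical helper written for this check, and it assumes nelt is a power of two and that each index lies in [0, 2 * nelt), i.e. it selects from the concatenation of the two input vectors.

```cpp
#include <cassert>

// Returns true if (idx + nelt) & (2 * nelt - 1) equals idx ^ nelt for
// every idx in [0, 2 * nelt).  This holds when nelt is a power of two:
// adding nelt modulo 2 * nelt just toggles the single bit worth nelt.
bool
flip_matches_xor (unsigned nelt)
{
  for (unsigned idx = 0; idx < 2 * nelt; ++idx)
    if (((idx + nelt) & (2 * nelt - 1)) != (idx ^ nelt))
      return false;
  return true;
}
```

For a non-power-of-two count such as 3 the two forms disagree, which is why the assumption matters.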


No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code 
is indeed exercised by dg-torture.exp vshuf-v*.c).


Also ok after applying TBL and testsuite patches in 
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and 
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html.


gcc/ChangeLog:
2014-04-23  Alan Lawrence  alan.lawre...@arm.com

* config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip 
expression.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3147ee..b879754 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
   rtx x;
 
   for (i = 0; i < nelt; ++i)
-	d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
+	d->perm[i] ^= nelt; /* Keep the same index, but in the other vector.  */
 
   x = d->op0;
   d->op0 = d->op1;

RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

2014-04-23 Thread Robert Suchanek

 Hmm, in that case maybe we should just leave it failing.  The alternative
 would be to skip the test altogether for MIPS, with a PR referencing it,
 but that seems a bit over-the-top.

I'd leave it as it is for now until the consensus regarding the 'X' constraint
is reached.
 
 Please use comments of the form:
 
   /* Implement TARGET_FOO.  */
 
 above all three functions (instead of the current one in the case of
 mips_register_priority), just so that it's painfully obvious that
 these are target hooks.

Modified as requested and attached the patch below. I tried to keep 
to the conventions but apparently I seem to overlook certain things.
I'll remember this part now :).

 Out of interest, do you see any difference if you include $sp in SPILL_REGS?
 That obviously doesn't make much conceptual sense, but it would give a
 cleaner class hierarchy.

Including $sp does not make any difference, exactly the same code size.
Although I haven't thoroughly tested it, I limited the check to -Os.

Regards,
Robert

2014-03-26  Robert Suchanek  robert.sucha...@imgtec.com

* lra-constraints.c (base_to_reg): New function.
(process_address): Use new function.

* config/mips/constraints.md (d): BASE_REG_CLASS
replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS.
* config/mips/mips.c (mips_regno_mode_ok_for_base_p):
Remove use !strict_p for MIPS16.
(mips_register_priority): New function that implements
the target hook TARGET_REGISTER_PRIORITY.
(mips_spill_class): Likewise for TARGET_SPILL_CLASS
(mips_lra_p): Likewise for TARGET_LRA_P.
* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS
classes.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BASE_REG_CLASS): Use M16_SP_REGS.
* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative
tuned for LRA. New set attribute to enable alternatives
depending on the register allocator used.
(*lea64): Disable pattern for MIPS16.
* config/mips/mips.opt
(mlra): New option

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index f6834fd..fa33c30 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "TARGET_MIPS16 ? M16_REGS : GR_REGS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 45256e9..f8d90b2 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -655,7 +655,7 @@ const enum reg_class 
mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   M16_REGS,M16_STORE_REGS,  LEA_REGS,LEA_REGS,
   LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS,
   T_REG,   PIC_FN_ADDR_REG, LEA_REGS,LEA_REGS,
-  LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS,
+  LEA_REGS,M16_SP_REGS, LEA_REGS,LEA_REGS,
 
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
@@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum 
machine_mode mode,
 return true;
 
   /* In MIPS16 mode, the stack pointer can only address word and doubleword
- values, nothing smaller.  There are two problems here:
-
-   (a) Instantiating virtual registers can introduce new uses of the
-  stack pointer.  If these virtual registers are valid addresses,
-  the stack pointer should be too.
-
-   (b) Most uses of the stack pointer are not made explicit until
-  FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated.
-  We don't know until that stage whether we'll be eliminating to the
-  stack pointer (which needs the restriction) or the hard frame
-  pointer (which doesn't).
-
- All in all, it seems more consistent to only enforce this restriction
- during and after reload.  */
+ values, nothing smaller.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
   return 0;
 }
 
+/* Implement TARGET_REGISTER_PRIORITY.  */
+
+static int
+mips_register_priority (int hard_regno)
+{
+  /* Treat MIPS16 registers with higher priority than other regs.  */
+  if (TARGET_MIPS16
+      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
+return 1;
+  return 0;
+}
+
 /* Implement TARGET_MEMORY_MOVE_COST.  */
 
 static int
@@ -18897,6 +18896,25 @@ mips_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
   *update = build2 (COMPOUND_EXPR,

Re: [PATCH 00/89] Compile-time gimple-checking

2014-04-23 Thread David Malcolm

On Wed, 2014-04-23 at 16:19 +0200, Richard Biener wrote:
 On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener
 richard.guent...@gmail.com wrote:
  On April 22, 2014 8:56:56 PM CEST, Richard Sandiford 
  rdsandif...@googlemail.com wrote:
 David Malcolm dmalc...@redhat.com writes:
  Alternatively we could change the is-a.h API to eliminate this
  discrepancy, and keep the typedefs; giving something like the
 following:
 
static void
dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int
 spc,
int flags)
[...snip...]
 
[...later, within pp_gimple_stmt_1:]
 
   case GIMPLE_SWITCH:
 dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc,
 flags);
 break;
 
  which is concise, readable, and avoids the change in pointerness
 compared
  to the gimple typedef; the local decls above would look like this:
gimple some_stmt;  /* note how this doesn't have a star... */
gimple_assign assign_stmt; /* ...and neither do these */
gimple_cond assign_stmt;
gimple_phi phi;
 
  I think this last proposal is my preferred API, but it requires the
  change to is-a.h
 
  Attached is a proposed change to the is-a.h API that eliminates the
  discrepancy, allowing the use of typedefs with is-a.h (doesn't yet
  compile, but hopefully illustrates the idea).  Note how it changes
 the
  API to match C++'s  dynamic_cast operator i.e. you do
 
Q* q = dyn_cast<Q*> (p);
 
  not:
 
Q* q = dyn_cast<Q> (p);
 
 Thanks for being flexible. :-)  I like this version too FWIW, for the
 reason you said: it really does look like a proper C++ cast.
 
  Indeed. I wasn't even aware it is different from a C++ cast...
 
 It would be nice if you can change that with a separate patch posted
 in a separate thread to be more visible.

Done, as:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01439.html

I've experimentally ported patch 2 of the series (gimple_switch) to this
approach, dropping the unloved casting methods in favor of as_a and
dyn_cast, and it works.
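
To give an idea of the shape of the API, here is a minimal sketch in which the template argument is the pointer type, as with dynamic_cast.  The stmt and switch_stmt types, the is_a_helper layout and demo are assumptions made up for this illustration; this is not the actual is-a.h code from the patch.

```cpp
#include <cassert>

// Hypothetical miniature of the idea; not the real is-a.h.
enum stmt_code { STMT_ASSIGN, STMT_SWITCH };

struct stmt { stmt_code code; };
struct switch_stmt : stmt { int num_labels; };

// Per-type test, keyed on the pointer type as in the proposed API.
template <typename T> struct is_a_helper;

template <> struct is_a_helper<switch_stmt *>
{
  static bool test (const stmt *s) { return s->code == STMT_SWITCH; }
};

template <typename T> bool
is_a (stmt *s)
{
  return is_a_helper<T>::test (s);
}

// Checked cast: asserts on mismatch (a no-op cast when NDEBUG is set).
template <typename T> T
as_a (stmt *s)
{
  assert (is_a<T> (s));
  return static_cast<T> (s);
}

// Querying cast: yields a null pointer on mismatch, like dynamic_cast.
template <typename T> T
dyn_cast (stmt *s)
{
  return is_a<T> (s) ? static_cast<T> (s) : nullptr;
}

// Returns 0 if both casts behave as described above.
int
demo ()
{
  switch_stmt sw;
  sw.code = STMT_SWITCH;
  sw.num_labels = 2;
  stmt plain;
  plain.code = STMT_ASSIGN;

  switch_stmt *a = as_a <switch_stmt *> (&sw);
  switch_stmt *b = dyn_cast <switch_stmt *> (&plain);
  return (a->num_labels == 2 && b == nullptr) ? 0 : 1;
}
```

Because the template argument names the pointer type, a typedef for that pointer type can be passed directly, which is the discrepancy the API change was meant to remove.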

 Also I see you introduce a const_FOO class with every FOO one.
 I wonder whether, now that we have C++, we can address const-correctness
 in a less awkward way than with a typedef.  Can you try to go back
 in time and see why we went with that in the first place?  ISTR that
 it was "oh, if we were only using C++ we wouldn't need to jump through
 that hoop".
 
 Thanks,
 Richard.
 
  Richard.
 
 If we ever decide to get rid of the typedefs (maybe at the same time as
 using auto) then the choice might be different, but that would be a
 much
 more systematic and easily-automated change than this one.
 
 Thanks,
 Richard

Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

2014-04-23 Thread Vladimir Makarov


On 2014-04-21, 8:23 AM, Richard Sandiford wrote:

Robert Suchanek robert.sucha...@imgtec.com writes:

Did you see the failures even after your mips_regno_mode_ok_for_base_p
change?  LRA should know how to reload a W address.


Yes but I realize there is more. It fails because $sp is now included
in BASE_REG_CLASS and W is based on it. However, I suppose that
it would be too eager to say it is wrong and likely there is
something missing
in LRA if we want to keep all alternatives. Currently there is no check
if a reloaded operand has a valid address, use of $sp in lbu/lhu cases.
Even if we added extra checks we are less likely to benefit as we need
to reload the base into register.


Not sure what you mean, sorry.  W exists specifically to exclude
$sp-based and $pc-based addresses.  LRA AFAIK should already be able
to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P
sense but which do not match the constraints for a particular insn.

Can you remember one of the tests that fails?


I couldn't trigger the problem with the original testcase but found
another one that reveals it. The following needs to compiled with
-mips32r2 -mips16 -Os:

struct { int addr; } c;
struct command { int args[1]; };
unsigned short a;

fn1 (struct command *p1)
{
 unsigned short d;
 d = fn2 ();
 a = p1->args[0];
 fn3 (a);
 if (c.addr)
 {
 fn4 (p1->args[0]);
 return;
 }
 (&c)->addr = fn5 ();
 fn6 (d);
}


Thanks.


Not sure how the constraint would/should exclude $sp-based address in
LRA.  In this particular case, a spilled pseudo is changed to memory
giving the following RTL form:

(insn 30 29 31 4 (set (reg:SI 4 $4)
 (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame)
 (const_int 16 [0x10])) [7 %sfp+16 S4 A32])
 (const_int 65535 [0xffff]))) shell.i:17 161 {*andsi3_mips16}
  (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ])
 (nil)))

The operand 1 during alternative selection is not marked as a bad
operand as it is a memory operand. $frame appears to be fine as it
could be eliminated later to hard register. No reloads are inserted
for the instructions concerned. Unless, $frame should be temporarily
eliminated and then a reload would be inserted?


Yeah, I think the lack of elimination is the problem.  process_address
eliminates $frame temporarily before checking whether the address
is valid, but the places that check EXTRA_CONSTRAINT_STR pass the
original uneliminated address.  So the legitimate_address_p hook sees
the $sp-based address but the W constraint only sees the $frame-based
address (which might or might not be valid, depending on whether $frame
is eliminated to the stack or hard frame pointer).  I think the constraints
should see the eliminated address too.

This patch seems to fix it for me.  Tested on x86_64-linux-gnu.
Vlad, is this OK for trunk?

BTW, we might want to define something like:

#define MODE_BASE_REG_CLASS(MODE) \
   (TARGET_MIPS16 \
? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
: GR_REGS)

instead of BASE_REG_CLASS.  It might lead to slightly better code
(or not -- if it doesn't then don't bother :-)).

If this patch is OK then I think the only thing blocking the switch
to LRA is the asm-subreg-1.c failure.  I think it'd be fine to XFAIL
that test on MIPS for now, until there's a consensus about what X means
for asms.


gcc/
* lra-constraints.c (valid_address_p): Move earlier in file.
Add a constraint argument to the address_info version.
(satisfies_memory_constraint_p): New function.
(satisfies_address_constraint_p): Likewise.
(process_alt_operands, curr_insn_transform): Use them.
(process_address): Pass the constraint to valid_address_p when
checking address operands.




Yes, it looks ok to me, Richard.  Thanks for working on this.

I am on vacation till May 4th. If the patch results in problems on other 
targets, I hope you revert it.  But to be honest, I believe it is very 
safe and don't expect any problems at all.

[Committed][ARM][AArch64] Patches previously ok'd for stage1

2014-04-23 Thread Kyrill Tkachov


Hi all,

I've committed to trunk some of my arm and aarch64 patches that I had pending 
for stage1 (approval email in parentheses):


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00933.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01609.html)


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00934.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01634.html)


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00935.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01608.html)


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00936.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01635.html)


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01330.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01343.html)


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01274.html 
(http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01346.html)


I've committed them as revisions r209701 to r209706 inclusive.

Thanks,
Kyrill

[PATCH AARCH64] fix and enable non-const shuffle for bigendian using TBL instruction

At present vec_perm with non-const indices is not handled on bigendian, so gcc 
generates generic, slow, code. This patch fixes up TBL to reverse the indices 
within each input vector (following Richard Henderson's suggestion of using an 
XOR with (nelts - 1) rather than a complicated mask/add/subtract, 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01285.html), and enables the code 
for bigendian.
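
As a rough illustration of the index fix-up described above, here is a scalar model of what happens to each selector element. The helper name adjust_selector and its parameters are hypothetical and not part of the patch; the sketch assumes the number of elements per vector is a power of two, so that XOR with (nelt - 1) reverses the position within a vector while leaving the "which vector" bit alone.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Scalar model of the selector fix-up (hypothetical helper, not GCC code).
// nelt is the number of elements per input vector; it must be a power of two.
std::vector<uint8_t>
adjust_selector (const std::vector<uint8_t> &sel, unsigned nelt,
                 bool one_vector_p, bool big_endian)
{
  assert (nelt != 0 && (nelt & (nelt - 1)) == 0);

  // TBL does not wrap out-of-range indices, so reduce them modulo the
  // total number of selectable elements first.
  const unsigned mask = one_vector_p ? nelt - 1 : 2 * nelt - 1;

  std::vector<uint8_t> out;
  out.reserve (sel.size ());
  for (uint8_t s : sel)
    {
      unsigned idx = s & mask;
      // Big-endian: reverse the index within its vector, keep the vector.
      if (big_endian)
        idx ^= nelt - 1;
      out.push_back (static_cast<uint8_t> (idx));
    }
  return out;
}
```

With nelt = 8 and two input vectors, index 8 (first element of the second vector) becomes 15 on big-endian, i.e. it stays in the second vector but its position is mirrored.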


Regression-tested on aarch64_be-none-elf with no changes. (This is as expected: in all
affected cases, gcc was already producing correct non-arch-specific code using
scalar ops. However, I have manually verified for various tests in
c-c++-common/torture/vshuf-v* that (a) TBL instructions are now produced, and (b) a
version of the compiler that produces TBLs without the index correction fails
the tests).


Note tests c-c++-common/torture/vshuf-{v16hi,v4df,v4di,v8si} (i.e. the 32-byte 
vectors) were broken prior to this patch and are not affected.


gcc/ChangeLog:
2014-04-23  Alan Lawrence  alan.lawre...@arm.com

* config/aarch64/aarch64-simd.md (vec_perm): Enable for bigendian.
* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Remove assert
against bigendian and adjust indices.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 73aee2c..e14e9b0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4002,7 +4002,7 @@
(match_operand:VB 1 register_operand)
(match_operand:VB 2 register_operand)
(match_operand:VB 3 register_operand)]
-  TARGET_SIMD && !BYTES_BIG_ENDIAN
+  TARGET_SIMD
 {
   aarch64_expand_vec_perm (operands[0], operands[1],
 			   operands[2], operands[3]);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d332741..6875b58 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7763,18 +7763,24 @@ aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
   enum machine_mode vmode = GET_MODE (target);
   unsigned int i, nelt = GET_MODE_NUNITS (vmode);
   bool one_vector_p = rtx_equal_p (op0, op1);
-  rtx rmask[MAX_VECT_LEN], mask;
-
-  gcc_checking_assert (!BYTES_BIG_ENDIAN);
+  rtx mask;
 
   /* The TBL instruction does not use a modulo index, so we must take care
  of that ourselves.  */
-  mask = GEN_INT (one_vector_p ? nelt - 1 : 2 * nelt - 1);
-  for (i = 0; i < nelt; ++i)
-rmask[i] = mask;
-  mask = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rmask));
+  mask = aarch64_simd_gen_const_vector_dup (vmode,
+  one_vector_p ? nelt - 1 : 2 * nelt - 1);
   sel = expand_simple_binop (vmode, AND, sel, mask, NULL, 0, OPTAB_LIB_WIDEN);
 
+  /* For big-endian, we also need to reverse the index within the vector
+ (but not which vector).  */
+  if (BYTES_BIG_ENDIAN)
+{
+  /* If one_vector_p, mask is a vector of (nelt - 1)'s already.  */
+  if (!one_vector_p)
+mask = aarch64_simd_gen_const_vector_dup (vmode, nelt - 1);
+  sel = expand_simple_binop (vmode, XOR, sel, mask,
+ NULL, 0, OPTAB_LIB_WIDEN);
+}
   aarch64_expand_vec_perm_1 (target, op0, op1, sel);
 }

Re: [PATCH 00/89] Compile-time gimple-checking

2014-04-23 Thread Andrew MacLeod


On 04/23/2014 10:19 AM, Richard Biener wrote:

On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener
richard.guent...@gmail.com wrote:

On April 22, 2014 8:56:56 PM CEST, Richard Sandiford 
rdsandif...@googlemail.com wrote:

David Malcolm dmalc...@redhat.com writes:

Alternatively we could change the is-a.h API to eliminate this
discrepancy, and keep the typedefs; giving something like the
following:

   static void
   dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc,
                       int flags)
   [...snip...]

   [...later, within pp_gimple_stmt_1:]

     case GIMPLE_SWITCH:
       dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags);
       break;

which is concise, readable, and avoids the change in pointerness compared
to the gimple typedef; the local decls above would look like this:

   gimple some_stmt;  /* note how this doesn't have a star... */
   gimple_assign assign_stmt; /* ...and neither do these */
   gimple_cond assign_stmt;
   gimple_phi phi;

I think this last proposal is my preferred API, but it requires the
change to is-a.h

Attached is a proposed change to the is-a.h API that eliminates the
discrepancy, allowing the use of typedefs with is-a.h (doesn't yet
compile, but hopefully illustrates the idea).  Note how it changes the
API to match C++'s dynamic_cast operator, i.e. you do

   Q* q = dyn_cast <Q*> (p);

not:

   Q* q = dyn_cast <Q> (p);

Thanks for being flexible. :-)  I like this version too FWIW, for the
reason you said: it really does look like a proper C++ cast.

Indeed. I wasn't even aware it is different than a C++ cast...

It would be nice if you can change that with a separate patch posted
in a separate thread to be more visible.

Also I see you introduce a const_FOO class with every FOO one.
I wonder whether, now that we have C++, we can address const-correctness
in a less awkward way than with a typedef.  Can you try to go back
in time and see why we did that in the first place?  ISTR that
it was "oh, if we were only using C++ we wouldn't need to jump through
that hoop".


I was also wondering if we shouldn't be able to get rid of the 'const_' 
versions and just properly use const with the c++ classes.
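
One likely reason the const_ typedefs exist at all can be sketched as follows: applying const to a typedef of a pointer makes the pointer itself const, not the object it points to, so a second typedef is needed to spell "pointer to const". The names stmt, stmt_ptr and const_stmt_ptr below are hypothetical stand-ins for illustration, not the actual GCC declarations.

```cpp
#include <cassert>
#include <type_traits>

struct stmt { int code; };           // hypothetical stand-in for a statement

typedef stmt *stmt_ptr;              // plays the role of a "FOO" typedef
typedef const stmt *const_stmt_ptr;  // plays the role of a "const_FOO" typedef

// "const stmt_ptr" is a const pointer to a mutable stmt...
static_assert (std::is_same<const stmt_ptr, stmt *const>::value,
               "const applies to the pointer, not the pointee");
// ...and therefore not the same type as a pointer to a const stmt.
static_assert (!std::is_same<const stmt_ptr, const stmt *>::value,
               "a separate typedef is needed for pointer-to-const");

// With the class name used directly, plain C++ const does the job.
inline int
read_code (const stmt *s)
{
  assert (s != nullptr);
  return s->code;
}

inline void
set_code (stmt *s, int code)
{
  assert (s != nullptr);
  s->code = code;
}
```

If declarations name the class and write the star explicitly, "const stmt *" expresses read-only access without any extra typedef, which is what dropping the const_ versions would rely on.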


I think we can...

Andrew

[Patch] Fix obsolete autoconf macros in configure.ac

2014-04-23 Thread Steve Ellcey


The gcc configure.ac script is using an obsolete form of the AC_CHECK_TYPE
autoconf macro to check for caddr_t and ssize_t. 

http://www.gnu.org/software/autoconf/manual/autoconf-2.60/html_node/Obsolete-Macros.html#Obsolete-Macros

This usage is causing a build failure for me when building a Windows GCC
using the mingw toolset.

I would like to replace the obsolete autoconf macros with a 'proper' one.
Tested with my mingw build and a MIPS-targeted Linux build.

OK to checkin?

Steve Ellcey
sell...@mips.com



2014-04-23  Steve Ellcey  sell...@mips.com

* configure.ac (caddr_t, ssize_t): Use AC_CHECK_TYPES instead
of obsolete form of AC_CHECK_TYPE.
* configure: Regenerate.


diff --git a/gcc/configure.ac b/gcc/configure.ac
index d789557..98acb1b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1083,8 +1083,8 @@ int main()
   fi
 fi
 
-AC_CHECK_TYPE(ssize_t, int)
-AC_CHECK_TYPE(caddr_t, char *)
+AC_CHECK_TYPES([ssize_t])
+AC_CHECK_TYPES([caddr_t])
 
 GCC_AC_FUNC_MMAP_BLACKLIST

Re: [PATCH][RFC] Remove RTL loop unswitching

2014-04-23 Thread Jan Hubicka

 On Sun, 20 Apr 2014, Jan Hubicka wrote:
 
   
   This removes RTL loop unswitching (see last year's discussion about
   compile-time issues of that pass).  RTL loop unswitching is
   enabled together with GIMPLE loop unswitching at -O3 and by
   -funswitch-loops.  It's clearly the wrong place to do high-level
   loop transforms these days, and the questionable benefit doesn't
   outweigh the cost of maintenance.
   
   Thus the following patch removes it.
   
   Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope
   for testsuite fallout).
   
   Any objections?
  
  Not really, I am all for moving more of the loop stuff to trees.
  Did you perform any benchmarks? (I remember I did in 2012
  but completely forgot the outcome).
 
 I did that last year and it showed no difference in SPEC 2k6.
 
 When bootstrapping with -O3 and a gcc_unreachable () in the
 RTL unswitching path you get some ICEs there but they are
 due to different effective --param max-unswitch-insns that
 is on GIMPLE applied to tree_num_loop_insns () and on RTL
 to num_loop_insns ().

Yep, I remember seeing some interesting special cases where RTL analysis
caught invariants that the tree level didn't, but nothing important.
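
For readers less familiar with the transform under discussion, loop unswitching hoists a loop-invariant condition out of a loop and duplicates the loop body for each outcome. The two functions below are a hand-written sketch of the "before" and "after" shapes; the names are made up for illustration and say nothing about how either the GIMPLE or the RTL pass is implemented.

```cpp
#include <cassert>
#include <vector>

// Before: the loop-invariant flag is tested on every iteration.
long
sum_original (const std::vector<int> &v, bool negate)
{
  long sum = 0;
  for (int x : v)
    {
      if (negate)
        sum -= x;
      else
        sum += x;
    }
  return sum;
}

// After unswitching (done by hand here): the invariant test is hoisted
// out and the loop is duplicated, once per branch of the condition.
long
sum_unswitched (const std::vector<int> &v, bool negate)
{
  long sum = 0;
  if (negate)
    for (int x : v)
      sum -= x;
  else
    for (int x : v)
      sum += x;
  return sum;
}
```

The duplicated loop bodies are the reason the transform is limited by a size parameter such as the max-unswitch-insns one mentioned above.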
 
 I'll go forward with the patch today.
 
  On a related note, shall I try to update the following?
  http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02270.html
 
 Yeah.

Will do,
Honza
 
 Thanks,
 Richard.
 
  Honza
   
   Thanks,
   Richard.
   
   2014-04-15  Richard Biener  rguent...@suse.de
   
 * Makefile.in (OBJS): Remove loop-unswitch.o.
 * loop-unswitch.c: Delete.
 * tree-pass.h (make_pass_rtl_unswitch): Remove.
 * passes.def (pass_rtl_unswitch): Likewise.
 * loop-init.c (gate_rtl_unswitch): Likewise.
 (rtl_unswitch): Likewise.
 (pass_data_rtl_unswitch): Likewise.
 (pass_rtl_unswitch): Likewise.
 (make_pass_rtl_unswitch): Likewise.
 * rtl.h (reversed_condition): Likewise.
 (compare_and_jump_seq): Likewise.
 * loop-iv.c (reversed_condition): Move here from loop-unswitch.c
 and make static.
 * loop-unroll.c (compare_and_jump_seq): Likewise.
   
   Index: gcc/Makefile.in
   ===
   --- gcc/Makefile.in   (revision 209410)
   +++ gcc/Makefile.in   (working copy)
   @@ -1294,7 +1294,6 @@ OBJS = \
 loop-invariant.o \
 loop-iv.o \
 loop-unroll.o \
   - loop-unswitch.o \
 lower-subreg.o \
 lra.o \
 lra-assigns.o \
   Index: gcc/tree-pass.h
   ===
   --- gcc/tree-pass.h   (revision 209410)
   +++ gcc/tree-pass.h   (working copy)
   @@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg
extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt);
extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context 
   *ctxt);
   -extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context 
   *ctxt);
extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
   Index: gcc/passes.def
   ===
   --- gcc/passes.def(revision 209410)
   +++ gcc/passes.def(working copy)
   @@ -341,7 +341,6 @@ along with GCC; see the file COPYING3.
  PUSH_INSERT_PASSES_WITHIN (pass_loop2)
   NEXT_PASS (pass_rtl_loop_init);
   NEXT_PASS (pass_rtl_move_loop_invariants);
   -   NEXT_PASS (pass_rtl_unswitch);
   NEXT_PASS (pass_rtl_unroll_and_peel_loops);
   NEXT_PASS (pass_rtl_doloop);
   NEXT_PASS (pass_rtl_loop_done);
   Index: gcc/loop-init.c
   ===
   --- gcc/loop-init.c   (revision 209410)
   +++ gcc/loop-init.c   (working copy)
   @@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc:
}


   -/* Loop unswitching for RTL.  */
   -static bool
   -gate_rtl_unswitch (void)
   -{
   -  return flag_unswitch_loops;
   -}
   -
   -static unsigned int
   -rtl_unswitch (void)
   -{
   -  if (number_of_loops (cfun) > 1)
   -unswitch_loops ();
   -  return 0;
   -}
   -
   -namespace {
   -
   -const pass_data pass_data_rtl_unswitch =
   -{
   -  RTL_PASS, /* type */
   -  loop2_unswitch, /* name */
   -  OPTGROUP_LOOP, /* optinfo_flags */
   -  true, /* has_gate */
   -  true, /* has_execute */
   -  TV_LOOP_UNSWITCH, /* tv_id */
   -  0, /* properties_required */
   -  0, /* properties_provided */
   -  0, /* properties_destroyed */
   -  0, /* todo_flags_start */
   -  TODO_verify_rtl_sharing, /* todo_flags_finish */
   -};
   -
   -class pass_rtl_unswitch : public rtl_opt_pass
   -{
   -public:
   -  pass_rtl_unswitch (gcc::context *ctxt)
   -: rtl_opt_pass (pass_data_rtl_unswitch,

Re: [PATCH] Change is-a.h to support typedefs of pointers

On April 23, 2014 5:31:42 PM CEST, David Malcolm dmalc...@redhat.com wrote:
The is-a.h API currently implicitly injects a pointer into the type:

  template <typename T, typename U>
  inline T *
         ^^^  Note how it returns a (T*)
  as_a (U *p)
  {
    gcc_checking_assert (is_a <T> (p));
                              ^^^ but uses the specialization of T, not T* here
    return is_a_helper <T>::cast (p);
                       ^^^ and here
  }

so that currently one must write:

  Q* q = dyn_cast <Q> (p);

This causes difficulties when dealing with typedefs to pointers.  For
example, with:

  typedef struct foo_d *foo;
  typedef struct bar_d *bar;

we can't write:

  bar b = dyn_cast <bar> (f);
  ^^^               ^^^

but have to write:

  bar b = dyn_cast <bar_d> (f);
  ^^^               ^^^^^  Note the mismatching types.

The following patch changes the is-a.h API to remove the implicit
injection of a pointer, so that one writes:

  Q* q = dyn_cast <Q*> (p);

rather than:

  Q* q = dyn_cast <Q> (p);

which also gives us more consistency with C++'s dynamic_cast operator, and
allows the above cast to a typedef-ptr to be written as:

  bar b = dyn_cast <bar> (f);
  ^^^               ^^^  they can now match.

The patch also fixes up the users (a fair amount of cgraph/symtable code,
along with the gimple accessors).
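
The shape of the proposed API can be sketched in a few lines of standalone C++. This is a toy model under stated assumptions, not the real is-a.h: the class definitions, the kind field and the single-argument test/cast helpers are simplified stand-ins chosen so the example compiles on its own. The point it illustrates is that the helper is specialized on the pointer type, so the template argument and the return type match and a typedef of a pointer can be used directly.

```cpp
#include <cassert>

enum symtab_kind { KIND_FUNCTION, KIND_VARIABLE };

// Toy stand-ins for the symbol table classes (not the GCC definitions).
struct symtab_node
{
  explicit symtab_node (symtab_kind k) : kind (k) {}
  symtab_kind kind;
};
struct cgraph_node : symtab_node
{
  cgraph_node () : symtab_node (KIND_FUNCTION) {}
};
struct varpool_node : symtab_node
{
  varpool_node () : symtab_node (KIND_VARIABLE) {}
};

// Primary template left undefined; specializations are keyed on T*.
template <typename T> struct is_a_helper;

template <> struct is_a_helper<cgraph_node *>
{
  static bool test (symtab_node *p) { return p->kind == KIND_FUNCTION; }
  static cgraph_node *cast (symtab_node *p)
  { return static_cast<cgraph_node *> (p); }
};

template <typename T, typename U>
inline bool is_a (U *p) { return is_a_helper<T>::test (p); }

// Returns T, not T*: the template argument is the return type.
template <typename T, typename U>
inline T as_a (U *p)
{
  assert (is_a<T> (p));
  return is_a_helper<T>::cast (p);
}

template <typename T, typename U>
inline T dyn_cast (U *p)
{
  return is_a<T> (p) ? is_a_helper<T>::cast (p) : static_cast<T> (nullptr);
}

// A typedef of a pointer now works as the template argument.
typedef cgraph_node *cgraph_node_ptr;
```

With this shape, "cgraph_node_ptr n = dyn_cast <cgraph_node_ptr> (p);" reads the same way on both sides of the assignment, which is the consistency the patch is after.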

The example motivating this is to better support as_a and dyn_cast in
gimple code, in the "Compile-time gimple-checking" patch series, so that,
with suitable typedefs matching the names in gimple.def, such as:

  typedef struct gimple_statement_assign *gimple_assign;

we can write:

case GIMPLE_ASSIGN:
  {
    gimple_assign assign_stmt = as_a <gimple_assign> (stmt);
                                      ^^^^^^^^^^^^^
    /* do assign-related things on assign_stmt */
  }

instead of the clunkier:

case GIMPLE_ASSIGN:
  {
    gimple_assign assign_stmt = as_a <gimple_statement_assign> (stmt);
                                      ^^^^^^^^^^^^^^^^^^^^^^^
    /* do assign-related things on assign_stmt */
  }

See the http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01259.html
subthread for more details, which also considered changing the names of
the structs and eliminating the typedefs.  However, doing so without
the attached fix to the is-a API would introduce an inconsistency between
decls for the base class vs subclass, so there'd be:
 gimple stmt;/* no star */
 gimple_assign *stmt;/* star */
(or to change the gimple typedef, which would be a monster patch).

Successfully bootstrapped and regrtested on x86_64-unknown-linux-gnu.

OK for trunk?  

OK for trunk, no need to wait for 4.9.1 for this.

Thanks,
Richard.

Would the release managers prefer to make this contingent
on holding off from committing until after 4.9.1 is out? (to minimize
impact of this change on backporting effort)

Thanks
Dave

gcc/
   * is-a.h: Update comments to reflect the following changes to the
   pointerness of the API, making the template parameter match the
   return type, allowing use of is-a.h with typedefs of pointers.
   (is_a_helper::cast): Return a T rather than a pointer to a T, so
   that the return type matches the parameter to the is_a_helper.
   (as_a): Likewise.
   (dyn_cast): Likewise.

   * cgraph.c (cgraph_node_for_asm): Update for removal of implicit
   pointer from the is-a.h API.

   * cgraph.h (is_a_helper <cgraph_node>::test): Convert to...
   (is_a_helper <cgraph_node *>::test): ...this, matching change to
   is-a.h API.
   (is_a_helper <varpool_node>::test): Likewise, convert to...
   (is_a_helper <varpool_node *>::test): ...this.

   (varpool_first_variable): Update for removal of implicit pointer
   from the is-a.h API.
   (varpool_next_variable): Likewise.
   (varpool_first_static_initializer): Likewise.
   (varpool_next_static_initializer): Likewise.
   (varpool_first_defined_variable): Likewise.
   (varpool_next_defined_variable): Likewise.
   (cgraph_first_defined_function): Likewise.
   (cgraph_next_defined_function): Likewise.
   (cgraph_first_function): Likewise.
   (cgraph_next_function): Likewise.
   (cgraph_first_function_with_gimple_body): Likewise.
   (cgraph_next_function_with_gimple_body): Likewise.
   (cgraph_alias_target): Likewise.
   (varpool_alias_target): Likewise.
   (cgraph_function_or_thunk_node): Likewise.
   (varpool_variable_node): Likewise.
   (symtab_real_symbol_p): Likewise.
   * cgraphunit.c (referred_to_p): Likewise.
   (analyze_functions): Likewise.
   (handle_alias_pairs): Likewise.
   * gimple-fold.c (can_refer_decl_in_current_unit_p): Likewise.
   * gimple-ssa.h (gimple_vuse_op): Likewise.
   (gimple_vdef_op): Likewise.
   * gimple-streamer-in.c (input_gimple_stmt): Likewise.
   * gimple.c

[ARM] Initialize new tune_params values

2014-04-23 Thread James Greenhalgh


Hi,

Revision 209561 introduces two new parameters for tune_params, but does
not initialize them in the Cortex-A57 or Cortex-A12 tuning structures.

This breaks bootstrap. Fixed by initializing them to sensible values.

Checked to ensure the warnings are cleared, and bootstrap can continue.

Ramana has acked this offline, so I've applied this as revision 209710
under the reasonably obvious rule.

Thanks,
James

---
gcc/

2014-04-23  James Greenhalgh  james.greenha...@arm.com

* config/arm/arm.c (arm_cortex_a57_tune): Initialize all fields.
(arm_cortex_a12_tune): Likewise.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 88d957a..de247cd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1658,7 +1658,8 @@ const struct tune_params arm_cortex_a57_tune =
   true,   /* Prefer LDRD/STRD.  */
   {true, true},/* Prefer non short circuit.  */
   arm_default_vec_cost,   /* Vectorizer costs.  */
-  false/* Prefer Neon for 64-bits bitops.  */
+  false,   /* Prefer Neon for 64-bits bitops.  */
+  true, true   /* Prefer 32-bit encodings.  */
 };
 
 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
@@ -1711,7 +1712,8 @@ const struct tune_params arm_cortex_a12_tune =
   true,		/* Prefer LDRD/STRD.  */
   {true, true},	/* Prefer non short circuit.  */
   arm_default_vec_cost,/* Vectorizer costs.  */
-  false /* Prefer Neon for 64-bits bitops.  */
+  false,/* Prefer Neon for 64-bits bitops.  */
+  false, false  /* Prefer 32-bit encodings.  */
 };
 
 /* armv7m tuning.  On Cortex-M4 cores for example, MOVW/MOVT take a single

Re: [Patch] Fix obsolete autoconf macros in configure.ac

2014-04-23 Thread Andreas Schwab

Steve Ellcey  sell...@mips.com writes:

 diff --git a/gcc/configure.ac b/gcc/configure.ac
 index d789557..98acb1b 100644
 --- a/gcc/configure.ac
 +++ b/gcc/configure.ac
 @@ -1083,8 +1083,8 @@ int main()
fi
  fi
  
 -AC_CHECK_TYPE(ssize_t, int)
 -AC_CHECK_TYPE(caddr_t, char *)
 +AC_CHECK_TYPES([ssize_t])
 +AC_CHECK_TYPES([caddr_t])

You also need to handle the no longer supported default definition.
Moreover, the two macro calls can be combined into one.
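
To spell out the first point: the old two-argument AC_CHECK_TYPE(ssize_t, int) defined ssize_t to int when the type was missing, whereas AC_CHECK_TYPES([ssize_t]) only defines HAVE_SSIZE_T when the type is found. Code that relied on the fallback therefore has to supply it itself. A minimal sketch of such a fallback follows; the name compat_ssize_t is hypothetical and is used only so the snippet cannot clash with a system definition.

```cpp
#include <cassert>
#include <type_traits>

// HAVE_SSIZE_T would come from the generated config header when
// AC_CHECK_TYPES([ssize_t]) finds the type.
#if defined (HAVE_SSIZE_T)
# include <sys/types.h>
typedef ssize_t compat_ssize_t;
#else
// The default that the old AC_CHECK_TYPE(ssize_t, int) used to provide.
typedef int compat_ssize_t;
#endif

static_assert (std::is_signed<compat_ssize_t>::value,
               "the fallback must be a signed type");

// Typical use: a byte count where negative values signal an error.
inline bool
is_error (compat_ssize_t n)
{
  return n < 0;
}
```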

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Re: Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program

2014-04-23 Thread Xinliang David Li

On Tue, Apr 22, 2014 at 1:17 PM, Jan Hubicka hubi...@ucw.cz wrote:
 This looks fine.  LIPO has similar change too.  Other directions worth
 looking into:

 1) To model icache effect better,  weighted callee size need to be
 used with profile. The weight for BB may look like: min(1,
 FREQ(BB)/FREQ(ENTRY)).
 2) When function splitting is turned on, are any inline heuristic
 changes needed? E.g. only consider the hot code part of node for
 unit growth computation?
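
The weighting in point 1) can be written out as a small calculation. The sketch below is only an illustration of the formula min(1, FREQ(BB)/FREQ(ENTRY)); the struct and function names are hypothetical, and per-block sizes and frequencies are assumed to be available from profile data.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-basic-block summary: instruction size and profile count.
struct basic_block_info
{
  int size;
  double freq;
};

// Sum of block sizes, each scaled by min (1, FREQ(BB) / FREQ(ENTRY)).
// Blocks executed at least once per call count in full; colder blocks
// contribute proportionally less.
double
weighted_callee_size (const std::vector<basic_block_info> &bbs,
                      double entry_freq)
{
  assert (entry_freq > 0.0);
  double total = 0.0;
  for (const basic_block_info &bb : bbs)
    total += bb.size * std::min (1.0, bb.freq / entry_freq);
  return total;
}
```

For blocks of size 10, 20 and 5 with frequencies 100, 50 and 400 and an entry frequency of 100, the weighted size is 10 + 10 + 5 = 25 rather than the unweighted 35.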

 We are also looking into a more aggressive approach to track per loop
 (inter-procedural) region growth limit, instead of using one single
 global limit.

 Per-loop growth seems interesting. I assume it is not hard to associate edges
 with loop nests and it has more of a local nature.

per-function loop nests form a graph, which can be embedded inside the
callgraph. One of the main things is loop graph update (just like
callgraph node/edge cloning), and summary data update during inlining.

 Did you experiment with it?

We currently do not have time for this, but you are welcome to
experiment with it:)

Related ideas: 1) per loop priority; 2) more precise code-reuse
(icache locality), and icache/itlb penalty estimate; 3) more precise
per loop register pressure estimate; 4) other loop transformation
hints.

thanks,

David

 Honza

Re: [PATCH 00/89] Compile-time gimple-checking

2014-04-23 Thread Jeff Law


On 04/22/14 02:36, Richard Biener wrote:

On Mon, Apr 21, 2014 at 6:56 PM, David Malcolm dmalc...@redhat.com wrote:

This is a greatly-expanded version of:
   http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01262.html

As of r205034 (de6bd75e3c9bc1efe8a6387d48eedaa4dafe622d) and
r205428 (a90353203da18288cdac1b0b78fe7b22c69fe63f) the various gimple
statements form a C++ inheritance hierarchy, but we're not yet making much
use of that in the code: everything refers to just gimple (or
const_gimple), and type-checking is performed at run-time within the
various gimple_foo_* accessors in gimple.h, and almost nowhere else.

The following patch series introduces compile-time checking of much of
the handling of gimple statements.

Various new typedefs are introduced for pointers to statements where the
specific code is known, matching the corresponding names from gimple.def.


Even though I like these changes in principle I also wear a release
manager's hat.  Being one of the people doing frequent backports
of trunk fixes to branches this will cause a _lot_ of headache.  So ... can
we delay this until, say, 4.9.1 is out?

Understood.

So how about we proceed with the reviews and approvals, but stage them in
after 4.9.1?  Ideally by the time 4.9.1 is ready, the entire series in
its final form has been reviewed and approved.




jeff

Re: [PATCH 00/89] Compile-time gimple-checking

2014-04-23 Thread Jeff Law


On 04/22/14 02:03, Richard Sandiford wrote:

First of all, thanks a lot for doing this.  Maybe one day we'll have
the same in rtl :-)
Funny you should mention that.  I blocked off a hunk of time for David 
to investigate doing some work on that this year.


Jeff

Re: [Patch] Fix obsolete autoconf macros in configure.ac

2014-04-23 Thread Steve Ellcey

On Wed, 2014-04-23 at 18:40 +0200, Andreas Schwab wrote:
 Steve Ellcey  sell...@mips.com writes:
 
  diff --git a/gcc/configure.ac b/gcc/configure.ac
  index d789557..98acb1b 100644
  --- a/gcc/configure.ac
  +++ b/gcc/configure.ac
  @@ -1083,8 +1083,8 @@ int main()
 fi
   fi
   
  -AC_CHECK_TYPE(ssize_t, int)
  -AC_CHECK_TYPE(caddr_t, char *)
  +AC_CHECK_TYPES([ssize_t])
  +AC_CHECK_TYPES([caddr_t])
 
 You also need to handle the no longer supported default definition.
 Moreover, the two macro calls can be combined into one.
 
 Andreas.

Actually, now that I look more at caddr_t, I see that we probably
shouldn't be using it at all.  The only uses in the gcc subdirectory are
for calls to mmap and munmap (in gcc.c, gcc-common.c, and
config/host-solaris.c) and the latest definitions for mmap and munmap
say it should use 'void *', not caddr_t.  I will submit a new
patch to remove the uses (and definition) of caddr_t from gcc.

ssize_t should probably still be fixed, but that was not causing me a
failure and it can be handled separately.

Steve Ellcey
sell...@mips.com

Re: [AArch64/ARM 1/3] Add execution + assembler tests of the AArch64 ZIP Intrinsics.

On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote:
 This adds DejaGNU tests of the existing AArch64 vzip_* intrinsics, both
 checking the assembler output and the runtime results. Test bodies are in
 separate files ready to reuse for ARM in the third patch. Putting these in a
 new subdirectory ready for tests of other/related intrinsics.

 All tests passing on aarch64-none-elf and aarch64_be-none-elf.

 testsuite/ChangeLog:

 2014-03-25  Alan Lawrence  alan.lawre...@arm.com

 * gcc.target/aarch64/simd/simd.exp: New file.
 * gcc.target/aarch64/simd/vzipf32_1.c: New file.
 * gcc.target/aarch64/simd/vzipf32.x: New file.
 * gcc.target/aarch64/simd/vzipp16_1.c: New file.
 * gcc.target/aarch64/simd/vzipp16.x: New file.
 * gcc.target/aarch64/simd/vzipp8_1.c: New file.
 * gcc.target/aarch64/simd/vzipp8.x: New file.
 * gcc.target/aarch64/simd/vzipqf32_1.c: New file.
 * gcc.target/aarch64/simd/vzipqf32.x: New file.
 * gcc.target/aarch64/simd/vzipqp16_1.c: New file.
 * gcc.target/aarch64/simd/vzipqp16.x: New file.
 * gcc.target/aarch64/simd/vzipqp8_1.c: New file.
 * gcc.target/aarch64/simd/vzipqp8.x: New file.
 * gcc.target/aarch64/simd/vzipqs16_1.c: New file.
 * gcc.target/aarch64/simd/vzipqs16.x: New file.
 * gcc.target/aarch64/simd/vzipqs32_1.c: New file.
 * gcc.target/aarch64/simd/vzipqs32.x: New file.
 * gcc.target/aarch64/simd/vzipqs8_1.c: New file.
 * gcc.target/aarch64/simd/vzipqs8.x: New file.
 * gcc.target/aarch64/simd/vzipqu16_1.c: New file.
 * gcc.target/aarch64/simd/vzipqu16.x: New file.
 * gcc.target/aarch64/simd/vzipqu32_1.c: New file.
 * gcc.target/aarch64/simd/vzipqu32.x: New file.
 * gcc.target/aarch64/simd/vzipqu8_1.c: New file.
 * gcc.target/aarch64/simd/vzipqu8.x: New file.
 * gcc.target/aarch64/simd/vzips16_1.c: New file.
 * gcc.target/aarch64/simd/vzips16.x: New file.
 * gcc.target/aarch64/simd/vzips32_1.c: New file.
 * gcc.target/aarch64/simd/vzips32.x: New file.
 * gcc.target/aarch64/simd/vzips8_1.c: New file.
 * gcc.target/aarch64/simd/vzips8.x: New file.
 * gcc.target/aarch64/simd/vzipu16_1.c: New file.
 * gcc.target/aarch64/simd/vzipu16.x: New file.
 * gcc.target/aarch64/simd/vzipu32_1.c: New file.
 * gcc.target/aarch64/simd/vzipu32.x: New file.
 * gcc.target/aarch64/simd/vzipu8_1.c: New file.
 * gcc.target/aarch64/simd/vzipu8.x: New file.

OK /Marcus

Re: [AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics using __builtin_shuffle

On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote:
 This patch replaces the temporary inline assembler for vzip_* in arm_neon.h
 with equivalent calls to __builtin_shuffle. These are matched by
 aarch64_expand_vec_perm_const{,_1} to output the same assembler
 instructions.
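
 To make the shuffle formulation concrete, here is a scalar model of a two-input shuffle together with the index masks that describe ZIP1 and ZIP2. The helper names shuffle2 and zip_mask are hypothetical, the model uses plain lane numbering and ignores endianness, and it is not the arm_neon.h code itself.

```cpp
#include <cassert>
#include <vector>

// Two-input shuffle: indices 0..n-1 select from a, n..2n-1 select from b.
std::vector<int>
shuffle2 (const std::vector<int> &a, const std::vector<int> &b,
          const std::vector<unsigned> &mask)
{
  assert (a.size () == b.size () && mask.size () == a.size ());
  const unsigned n = static_cast<unsigned> (a.size ());
  std::vector<int> out;
  out.reserve (n);
  for (unsigned m : mask)
    {
      assert (m < 2 * n);
      out.push_back (m < n ? a[m] : b[m - n]);
    }
  return out;
}

// Mask for ZIP1 (part == 1) or ZIP2 (part == 2) on n-element vectors:
// interleave the low (or high) halves of the two inputs.
std::vector<unsigned>
zip_mask (unsigned n, unsigned part)
{
  assert (part == 1 || part == 2);
  const unsigned base = (part == 1) ? 0 : n / 2;
  std::vector<unsigned> mask;
  for (unsigned i = 0; i < n / 2; ++i)
    {
      mask.push_back (base + i);
      mask.push_back (n + base + i);
    }
  return mask;
}
```

 For four-element inputs the masks come out as {0, 4, 1, 5} and {2, 6, 3, 7}; a constant mask of that form is what the permute expander can recognise and turn back into a single instruction.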

 Tests from first patch still passing on aarch64-none-elf and
 aarch64_be-none-elf.

 gcc/ChangeLog:

 2012-03-27  Alan Lawrence  alan.lawre...@arm.com

 * config/aarch64/arm_neon.h (vzip1_f32, vzip1_p8, vzip1_p16,
 vzip1_s8,
 vzip1_s16, vzip1_s32, vzip1_u8, vzip1_u16, vzip1_u32, vzip1q_f32,
 vzip1q_f64, vzip1q_p8, vzip1q_p16, vzip1q_s8, vzip1q_s16,
 vzip1q_s32,
 vzip1q_s64, vzip1q_u8, vzip1q_u16, vzip1q_u32, vzip1q_u64,
 vzip2_f32,
 vzip2_p8, vzip2_p16, vzip2_s8, vzip2_s16, vzip2_s32, vzip2_u8,
 vzip2_u16, vzip2_u32, vzip2q_f32, vzip2q_f64, vzip2q_p8, vzip2q_p16,
 vzip2q_s8, vzip2q_s16, vzip2q_s32, vzip2q_s64, vzip2q_u8,
 vzip2q_u16,
 vzip2q_u32, vzip2q_u64): Replace inline __asm__ with
 __builtin_shuffle.

OK /Marcus

Re: [Patch] Fix obsolete autoconf macros in configure.ac

2014-04-23 Thread Rainer Orth

Steve Ellcey sell...@mips.com writes:

 Actually, now that I look more at caddr_t, I see that we probably
 shouldn't be using it at all.  The only uses in the gcc subdirectory are
 for calls to mmap and munmap (in gcc.c, gcc-common.c, and
 config/host-solaris.c) and the latest definitions for mmap and munmap
 say it should use 'void *', not caddr_t.  I will submit a new

This may be irrelevant: Solaris (and other OSes) regularly provide
different compilation environments for various levels of standards
compatibility, and the default is not the latest.  Even apart from
Solaris, not everyone adheres to yesterday's version of POSIX.1.

 patch to remove the uses (and definition) of caddr_t from gcc.

Please be very careful here; this can easily break several ports.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [AArch64/ARM 1/3] Add execution + assembler tests of AArch64 UZP Intrinsics

On 27 March 2014 17:17, Alan Lawrence alan.lawre...@arm.com wrote:
 This adds DejaGNU tests of the existing AArch64 vuzp_* intrinsics, both
 checking the assembler output and the runtime results. Test bodies are in
 separate files ready to reuse for ARM in the third patch.

 Putting these in a new subdirectory with the ZIP Intrinsic tests, using
 simd.exp added there (will commit ZIP tests first).

 All tests passing on aarch64-none-elf and aarch64_be-none-elf.

 testsuite/ChangeLog:
 2014-03-27  Alan Lawrence  alan.lawre...@arm.com

 * gcc.target/aarch64/simd/vuzpf32_1.c: New file.
 * gcc.target/aarch64/simd/vuzpf32.x: New file.
 * gcc.target/aarch64/simd/vuzpp16_1.c: New file.
 * gcc.target/aarch64/simd/vuzpp16.x: New file.
 * gcc.target/aarch64/simd/vuzpp8_1.c: New file.
 * gcc.target/aarch64/simd/vuzpp8.x: New file.
 * gcc.target/aarch64/simd/vuzpqf32_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqf32.x: New file.
 * gcc.target/aarch64/simd/vuzpqp16_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqp16.x: New file.
 * gcc.target/aarch64/simd/vuzpqp8_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqp8.x: New file.
 * gcc.target/aarch64/simd/vuzpqs16_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqs16.x: New file.
 * gcc.target/aarch64/simd/vuzpqs32_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqs32.x: New file.
 * gcc.target/aarch64/simd/vuzpqs8_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqs8.x: New file.
 * gcc.target/aarch64/simd/vuzpqu16_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqu16.x: New file.
 * gcc.target/aarch64/simd/vuzpqu32_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqu32.x: New file.
 * gcc.target/aarch64/simd/vuzpqu8_1.c: New file.
 * gcc.target/aarch64/simd/vuzpqu8.x: New file.
 * gcc.target/aarch64/simd/vuzps16_1.c: New file.
 * gcc.target/aarch64/simd/vuzps16.x: New file.
 * gcc.target/aarch64/simd/vuzps32_1.c: New file.
 * gcc.target/aarch64/simd/vuzps32.x: New file.
 * gcc.target/aarch64/simd/vuzps8_1.c: New file.
 * gcc.target/aarch64/simd/vuzps8.x: New file.
 * gcc.target/aarch64/simd/vuzpu16_1.c: New file.
 * gcc.target/aarch64/simd/vuzpu16.x: New file.
 * gcc.target/aarch64/simd/vuzpu32_1.c: New file.
 * gcc.target/aarch64/simd/vuzpu32.x: New file.
 * gcc.target/aarch64/simd/vuzpu8_1.c: New file.
 * gcc.target/aarch64/simd/vuzpu8.x: New file.

OK /Marcus

Re: [AArch64/ARM 2/3] Rewrite AArch64 UZP Intrinsics using __builtin_shuffle

On 27 March 2014 17:25, Alan Lawrence alan.lawre...@arm.com wrote:
 This patch replaces the temporary inline assembler for vuzp_* in arm_neon.h
 with equivalent calls to __builtin_shuffle.  These are matched by
 aarch64_expand_vec_perm_const{,_1} to output (generally) the same assembler
 instructions.  That is, except for two-element vectors, where ZIP, UZP and
 TRN instructions all have the same effect; gcc's backend chooses to output
 ZIP so this patch also updates the 3 affected tests.
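
 The claim about two-element vectors can be checked by writing out the index masks of the three permutes. The functions below are a hypothetical sketch in plain lane numbering (endianness ignored), where indices 0..n-1 refer to the first input and n..2n-1 to the second; part is 1 or 2 for the "1" and "2" variants of each instruction.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<unsigned> mask_t;

// ZIP: interleave the low (part 1) or high (part 2) halves.
mask_t
zip_mask (unsigned n, unsigned part)
{
  assert (part == 1 || part == 2);
  const unsigned base = (part - 1) * (n / 2);
  mask_t m;
  for (unsigned i = 0; i < n / 2; ++i)
    {
      m.push_back (base + i);
      m.push_back (n + base + i);
    }
  return m;
}

// UZP: even (part 1) or odd (part 2) elements of the concatenation.
mask_t
uzp_mask (unsigned n, unsigned part)
{
  assert (part == 1 || part == 2);
  mask_t m;
  for (unsigned i = 0; i < n; ++i)
    m.push_back (2 * i + (part - 1));
  return m;
}

// TRN: even (part 1) or odd (part 2) elements of each input, paired up.
mask_t
trn_mask (unsigned n, unsigned part)
{
  assert (part == 1 || part == 2);
  mask_t m;
  for (unsigned i = 0; i < n / 2; ++i)
    {
      m.push_back (2 * i + (part - 1));
      m.push_back (n + 2 * i + (part - 1));
    }
  return m;
}
```

 For n = 2 all three give {0, 2} and {1, 3}, so any of the instructions implements the shuffle; for n = 4 the masks differ ({0, 4, 1, 5}, {0, 2, 4, 6} and {0, 4, 2, 6} for the "1" variants).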

 Regression-tested, and tests from first patch still passing modulo updates
 herein, on aarch64-none-elf and aarch64_be-none-elf.

 gcc/testsuite/ChangeLog:
 2014-03-27  Alan Lawrence  alan.lawre...@arm.com

 * gcc.target/aarch64/vuzps32_1.c: Expect zip1/2 insn rather than
 uzp1/2.
 * gcc.target/aarch64/vuzpu32_1.c: Likewise.
 * gcc.target/aarch64/vuzpf32_1.c: Likewise.

 gcc/ChangeLog:
 2014-03-27  Alan Lawrence  alan.lawre...@arm.com

 * config/aarch64/arm_neon.h (vuzp1_f32, vuzp1_p8, vuzp1_p16,
 vuzp1_s8,
 vuzp1_s16, vuzp1_s32, vuzp1_u8, vuzp1_u16, vuzp1_u32, vuzp1q_f32,
 vuzp1q_f64, vuzp1q_p8, vuzp1q_p16, vuzp1q_s8, vuzp1q_s16,
 vuzp1q_s32,
 vuzp1q_s64, vuzp1q_u8, vuzp1q_u16, vuzp1q_u32, vuzp1q_u64,
 vuzp2_f32,
 vuzp2_p8, vuzp2_p16, vuzp2_s8, vuzp2_s16, vuzp2_s32, vuzp2_u8,
 vuzp2_u16, vuzp2_u32, vuzp2q_f32, vuzp2q_f64, vuzp2q_p8, vuzp2q_p16,
 vuzp2q_s8, vuzp2q_s16, vuzp2q_s32, vuzp2q_s64, vuzp2q_u8,
 vuzp2q_u16,
 vuzp2q_u32, vuzp2q_u64): Replace temporary asm with
 __builtin_shuffle.

OK /Marcus

Re: -Wvariadic-macros does not print warning

2014-04-23 Thread Prathamesh Kulkarni

forgot to add gcc-patches@gcc.gnu.org. Sorry for the double-post.

On Wed, Apr 23, 2014 at 11:28 PM, Prathamesh Kulkarni
bilbotheelffri...@gmail.com wrote:
 This is a follow up mail to 
 http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html
 I have attached a patch that prints the warning when passed -Wvariadic-macros
 (I mostly followed it along the lines of -Wlong-long).
 OK for trunk?

 [libcpp]
 * macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic).

 [gcc/c-family]
 * c.opt (-Wvariadic-macros): Init(-1) instead of Init(1).
 * c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros.
(sanitize_cpp_opts): Check condition for pedantic or
 warn_traditional.

 Thanks and Regards,
 Prathamesh

[4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops

2014-04-23 Thread Jeff Law



The more aggressive threading across loop backedges requires 
invalidating equivalences that do not hold across all iterations of a loop.


At first glance, invalidating at PHI nodes should be sufficient as any 
statement which potentially generated a new equivalence would be 
reprocessed as we come across the backedge.  However, there is one 
important case where that does not hold.


Specifically we might have derived a value from a conditional and the 
conditional might have been fed by a statement that doesn't produce 
useful equivalences (such as a GIMPLE_ASM).  Thus the equivalence from 
the conditional is still visible because no new equivalence will be 
recorded for the GIMPLE_ASM.


So if the result of the GIMPLE_ASM that gets used in the conditional 
varies from one loop iteration to the next, we could use a stale value 
from a prior iteration to thread the current iteration.  That's exactly 
what happens in the ffmpeg code.
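
The stale-equivalence problem described above can be sketched with a toy model in C. This is only an illustration under stated assumptions: the table, the function names and the single name `a` are hypothetical and are not the data structures or code in tree-ssa-threadedge.c. The first loop iteration records "a == 0" from the conditional; on the second walk across the backedge the asm redefines `a` without recording anything, so the old fact survives unless it is explicitly invalidated.

```c
#include <stdbool.h>

/* Toy stand-in for the threader's temporary equivalences.
   Hypothetical: the real pass keeps these on a stack keyed by SSA_NAME.  */
enum { NUM_NAMES = 4 };

struct equiv_table
{
  bool known[NUM_NAMES];
  int value[NUM_NAMES];
};

static void
record_equivalence (struct equiv_table *t, int name, int value)
{
  t->known[name] = true;
  t->value[name] = value;
}

static void
invalidate (struct equiv_table *t, int name)
{
  t->known[name] = false;
}

/* Try to fold the condition "name == 0".
   Returns 1 (true), 0 (false) or -1 (unknown, cannot thread).  */
static int
fold_eq_zero (const struct equiv_table *t, int name)
{
  if (!t->known[name])
    return -1;
  return t->value[name] == 0;
}

/* Model the second walk over the loop body, after crossing the backedge.  */
static int
fold_second_iteration (bool invalidate_asm_outputs)
{
  struct equiv_table t = { { false }, { 0 } };
  const int a = 0;	/* Index of the asm output in the toy table.  */

  /* First iteration: the true edge of "a == 0" implies a == 0.  */
  record_equivalence (&t, a, 0);

  /* Second iteration: the asm redefines `a' but records no new
     equivalence, so the old one must be dropped by hand.  */
  if (invalidate_asm_outputs)
    invalidate (&t, a);

  return fold_eq_zero (&t, a);
}
```

In this model `fold_second_iteration (false)` folds the condition to true from the stale fact, while `fold_second_iteration (true)` reports it as unknown, which is the safe answer.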


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Also 
verified that the sample audio in the referenced BZs no longer chops off 
after ~2 seconds.


Installed on the trunk.  OK for 4.9.1 after a suitable soak period on 
the trunk?




commit 02269351ce3a81b5470b8137fb3c34bca27011da
Author: Jeff Law l...@redhat.com
Date:   Wed Apr 23 00:25:47 2014 -0600

PR tree-optimization/60902
* tree-ssa-threadedge.c
(record_temporary_equivalences_from_stmts_at_dest): Make sure to
invalidate outputs from statements that do not produce useful
outputs for threading.

PR tree-optimization/60902
* gcc.target/i386/pr60902.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 638c0da..ddebba7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2014-04-23  Jeff Law  l...@redhat.com
+
+   PR tree-optimization/60902
+   * tree-ssa-threadedge.c
+   (record_temporary_equivalences_from_stmts_at_dest): Make sure to
+   invalidate outputs from statements that do not produce useful
+   outputs for threading.
+
 2014-04-23 Venkataramanan Kumar  venkataramanan.ku...@linaro.org
 
* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 126ad08..62b07f4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-23  Jeff Law  l...@redhat.com
+
+   PR tree-optimization/60902
+   * gcc.target/i386/pr60902.c: New test.
+
 2014-04-23  Alex Velenko  alex.vele...@arm.com
 
* gcc.target/aarch64/vdup_lane_1.c: New testcase.
diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c 
b/gcc/testsuite/gcc.target/i386/pr60902.c
new file mode 100644
index 000..b81dcd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60902.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+extern void abort ();
+extern void exit (int);
+
+int x;
+
+foo()
+{
+  static int count;
+  count++;
+  if (count > 1)
+    abort ();
+}
+
+static inline int
+frob ()
+{
+  int a;
+  __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
+  x++;
+  return a;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 0; i < 10 && frob () == 0; i++)
+    foo();
+  exit (0);
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index c447b72..8a0103b 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
(gimple_code (stmt) != GIMPLE_CALL
   || gimple_call_lhs (stmt) == NULL_TREE
   || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
-   continue;
+   {
+ /* STMT might still have DEFS and we need to invalidate any known
+equivalences for them.
+
+Consider if STMT is a GIMPLE_ASM with one or more outputs that
+feeds a conditional inside a loop.  We might derive an equivalence
+due to the conditional.  */
+ tree op;
+ ssa_op_iter iter;
+
+ if (backedge_seen)
+   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
+ {
+   /* This call only invalidates equivalences created by
+  PHI nodes.  This is by design to keep the cost of
+  invalidation reasonable.  */
+   invalidate_equivalences (op, stack, src_map, dst_map);
+
+   /* However, conditionals can imply values for real
+  operands as well.  And those won't be recorded in the
+  maps.  In fact, those equivalences may be recorded totally
+  outside the threading code.  We can just create a new
+  temporary NULL equivalence here.  */
+   record_temporary_equivalence (op, NULL_TREE, stack);
+ }
+
+ continue;
+   }
 
   /* The result of __builtin_object_size depends on all the arguments
 of a phi node. Temporarily using only one edge

Re: -Wvariadic-macros does not print warning

2014-04-23 Thread Prathamesh Kulkarni

I didn't attach the patch, I am extremely sorry for the noise.
I am re-posting the mail.
This is a follow up mail to http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html
I have attached patch that prints the warning when passed -Wvariadic-macros
(I mostly followed it along lines of -Wlong-long).
OK for trunk ?

[libcpp]
* macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic).

[gcc/c-family]
* c.opt (-Wvariadic-macros): Init(-1) instead of Init(1).
* c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros.
   (sanitize_cpp_opts): Check condition for pedantic or
warn_traditional.

Thanks and Regards,
Prathamesh
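
The effect of switching from Init(1) to Init(-1) can be sketched in plain C. This is a minimal model, not the code in c-opts.c: the two function names are made up for illustration, and `flag` stands for cpp_warn_variadic_macros (1 or 0 when given on the command line, -1 when left unset under the new default).

```c
#include <stdbool.h>

/* Old behaviour (default 1): the option only had an effect together
   with -pedantic or -Wtraditional, so -Wvariadic-macros alone was
   silent.  */
static bool
old_warn_variadic_macros (int flag, bool pedantic, bool warn_traditional)
{
  return flag && (pedantic || warn_traditional);
}

/* Behaviour after the patch (default -1): an explicit setting wins,
   and only the unset case falls back to pedantic || warn_traditional.  */
static bool
new_warn_variadic_macros (int flag, bool pedantic, bool warn_traditional)
{
  if (flag == -1)
    flag = pedantic || warn_traditional;
  return flag != 0;
}
```

With `flag == 1` and neither -pedantic nor -Wtraditional, the old function returns false and the new one returns true, which is the behaviour change the patch is after.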

On Wed, Apr 23, 2014 at 11:30 PM, Prathamesh Kulkarni
bilbotheelffri...@gmail.com wrote:
 forgot to add gcc-patches@gcc.gnu.org. Sorry for the double-post.

 On Wed, Apr 23, 2014 at 11:28 PM, Prathamesh Kulkarni
 bilbotheelffri...@gmail.com wrote:
 This is a follow up mail to 
 http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html
 I have attached patch that prints the warning when passed -Wvariadic-macros
 (I mostly followed it along lines of -Wlong-long).
 OK for trunk ?

 [libcpp]
 * macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic).

 [gcc/c-family]
 * c.opt (-Wvariadic-macros): Init(-1) instead of Init(1).
 * c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros.
(sanitize_cpp_opts): Check condition for pedantic or
 warn_traditional.

 Thanks and Regards,
 Prathamesh
Index: libcpp/macro.c
===
--- libcpp/macro.c	(revision 209470)
+++ libcpp/macro.c	(working copy)
@@ -2800,8 +2800,7 @@ parse_params (cpp_reader *pfile, cpp_mac
   (pfile, CPP_W_VARIADIC_MACROS,
 		   "anonymous variadic macros were introduced in C99");
 	}
-	  else if (CPP_OPTION (pfile, cpp_pedantic)
-		&& CPP_OPTION (pfile, warn_variadic_macros))
+	  else if (CPP_OPTION (pfile, warn_variadic_macros))
 	cpp_pedwarning (pfile, CPP_W_VARIADIC_MACROS,
 		"ISO C does not permit named variadic macros");
 
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c	(revision 209470)
+++ gcc/c-family/c-opts.c	(working copy)
@@ -396,6 +396,10 @@ c_common_handle_option (size_t scode, co
   cpp_opts->cpp_warn_long_long = value;
   break;
 
+case OPT_Wvariadic_macros:
+  cpp_opts->warn_variadic_macros = value;
+  break;
+
 case OPT_Wmissing_include_dirs:
   cpp_opts->warn_missing_include_dirs = value;
   break;
@@ -1227,8 +1231,9 @@ sanitize_cpp_opts (void)
 
   /* Similarly with -Wno-variadic-macros.  No check for c99 here, since
     this also turns off warnings about GCC's extension.  */
-  cpp_opts->warn_variadic_macros
-= cpp_warn_variadic_macros && (pedantic || warn_traditional);
+  if (cpp_warn_variadic_macros == -1)
+cpp_warn_variadic_macros = pedantic || warn_traditional;
+  cpp_opts->warn_variadic_macros = cpp_warn_variadic_macros;
 
   /* If we're generating preprocessor output, emit current directory
  if explicitly requested or if debugging information is enabled.
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 209470)
+++ gcc/c-family/c.opt	(working copy)
@@ -785,7 +785,7 @@ C ObjC C++ ObjC++ Var(warn_unused_result
 Warn if a caller of a function, marked with attribute warn_unused_result, does not use its return value
 
 Wvariadic-macros
-C ObjC C++ ObjC++ Var(cpp_warn_variadic_macros) Init(1) Warning
+C ObjC C++ ObjC++ Var(cpp_warn_variadic_macros) Init(-1) Warning
 Warn about using variadic macros
 
 Wvarargs

Re: Optimize n?rotate(x,n):x

2014-04-23 Thread Marc Glisse


Honza, any comment on Richard's question?

On Tue, 15 Apr 2014, Richard Biener wrote:


On Mon, Apr 14, 2014 at 6:40 PM, Marc Glisse marc.gli...@inria.fr wrote:

On Mon, 14 Apr 2014, Richard Biener wrote:


+  /* If the special case has a high probability, keep it.  */
+  if (EDGE_PRED (middle_bb, 0)->probability < PROB_EVEN)



I suppose Honza has a comment on how to test this properly
(not sure if ->probability or ->frequency is always initialized properly).
for example single_likely_edge tests profile_status_for_fn !=
PROFILE_ABSENT (and uses a fixed probability value ...).
Anyway, the comparison looks backwards to me, but maybe I'm
missing sth - I'd use = PROB_LIKELY ;)



Maybe the comment is confusing? middle_bb contains the expensive operation
(say a/b) that the special case skips entirely. If the division happens in
less than 50% of cases (that's the proba of the edge going from cond to
middle_bb), then doing the comparison+jump may be cheaper and I abort the
optimization. At least the testcase with __builtin_expect should prove that
I didn't do it backwards.
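
As a worked example of the cut-off described above, here is a minimal sketch in C. The constants mirror what I understand GCC's REG_BR_PROB_BASE (10000) and PROB_EVEN (half of that) to be, which is an assumption here, and keep_special_case_p is a hypothetical name, not a function from the patch.

```c
/* Assumed to mirror GCC's branch probability scale.  */
enum
{
  REG_BR_PROB_BASE = 10000,
  PROB_EVEN = REG_BR_PROB_BASE / 2
};

/* PROB_TO_MIDDLE_BB is the probability of the edge from the condition
   to middle_bb, the block holding the expensive operation.  Return
   nonzero if the compare+jump should be kept, i.e. the optimization
   is abandoned because the operation is usually skipped.  */
static int
keep_special_case_p (int prob_to_middle_bb)
{
  return prob_to_middle_bb < PROB_EVEN;
}
```

So an edge taken 20% of the time (2000) keeps the branch, while 50% (5000) or 80% (8000) lets the operation be performed unconditionally.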


Ah, indeed.  My mistake.


value-prof seems to use 50% as the cut-off where it may become interesting
to special case division, hence my choice of PROB_EVEN. I am not sure which
way you want to use PROB_LIKELY (80%). If we have more than 80% chances of
executing the division, always perform it? Or if we have more than 80%
chances of skipping the division, keep the branch?


Ok, if it's from value-prof then that's fine.

The patch is ok if Honza doesn't have any comments on whether it's ok
to look at ->probability unconditionally.

Thanks,
Richard.


Attached is the latest version (passed the testsuite).
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c  (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt1" } */
+
+int f(int a, int b, int c) {
+  if (c > 5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x << n) | (x >> (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+return 0;
+  else
+return a & b;
+}
+
+/* { dg-final { scan-tree-dump-times "goto" 2 "phiopt1" } } */
+/* { dg-final { cleanup-tree-dump "phiopt1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c  (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f(int a, int b) {
+  if (__builtin_expect(a == 0, 1)) return b;
+  return a + b;
+}
+
+// optab_handler can handle if(b==1) but not a/b
+// so we consider a/b too expensive.
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) {
+  if (b == 1)
+return a;
+  else
+return a / b;
+}
+
+/* { dg-final { scan-tree-dump-times "goto " 4 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c   (revision 209353)
+++ gcc/tree-ssa-phiopt.c   (working copy)
@@ -140,20 +140,37 @@ static bool gate_hoist_loads (void);
x = PHI (CONST, a)

Gets replaced with:
  bb0:
  bb2:
t1 = a == CONST;
t2 = b > c;
t3 = t1 & t2;
x = a;

+
+   It also replaces
+
+ bb0:
+   if (a != 0) goto bb1; else goto bb2;
+ bb1:
+   c = a + b;
+ bb2:
+   x = PHI <c (bb1), b (bb0), ...>;
+
+   with
+
+ bb0:
+   c = a + b;
+ bb2:
+   x = PHI <c (bb0), ...>;
+
ABS Replacement
---

This transformation, implemented in abs_replacement, replaces

  bb0:
if (a >= 0) goto bb2; else goto bb1;
  bb1:
x = -a;
  bb2:
@@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;

   tmp = gimple_assign_rhs2 (def);
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
 return true;

   return false;
 }

+/* Returns true if ARG is a neutral element for operation CODE
+   on the RIGHT side.  */
+
+static bool
+neutral_element_p (tree_code code, tree arg, bool right)
+{
+  switch (code)
+{
+case PLUS_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+  return integer_zerop (arg);
+
+case LROTATE_EXPR:
+case RROTATE_EXPR:
+case LSHIFT_EXPR:
+case RSHIFT_EXPR:
+case MINUS_EXPR:
+case POINTER_PLUS_EXPR:
+  return right && integer_zerop (arg);
+
+case MULT_EXPR:
+  return integer_onep (arg);
+
+case TRUNC_DIV_EXPR:
+case CEIL_DIV_EXPR:
+

Re: [c++] typeinfo for target types

2014-04-23 Thread Richard Henderson

On 04/13/2014 01:41 AM, Marc Glisse wrote:
 Hello,
 
 this patch generates typeinfo for target types. On x86_64, it adds these 6
 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and
 version those in the shared library.
 
 + V typeinfo for __float128
 + V typeinfo for __float128 const*
 + V typeinfo for __float128*
 + V typeinfo name for __float128
 + V typeinfo name for __float128 const*
 + V typeinfo name for __float128*
 
 Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in 
 tsan/tls_race.c).
 
 2014-04-13  Marc Glisse  marc.gli...@inria.fr
 
 PR libstdc++/43622
 gcc/c-family/
 * c-common.c (registered_builtin_types): Make non-static.
 * c-common.h (registered_builtin_types): Declare.
 gcc/cp/
 * rtti.c (emit_support_tinfo_1): New function, extracted from
 emit_support_tinfos.
 (emit_support_tinfos): Call it and iterate on registered_builtin_types.
 

This is causing aarch64 builds to break.  Any c++ compilation aborts at

#0  fancy_abort (file=0x14195c8 "../../git-rh/gcc/cp/mangle.c", line=2303,
    function=0x1419ff8 <write_builtin_type(tree_node*)::__FUNCTION__>
    "write_builtin_type") at ../../git-rh/gcc/diagnostic.c:1190
#1  0x007ce2b4 in write_builtin_type (
    type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
    at ../../git-rh/gcc/cp/mangle.c:2303
#2  0x007cc85c in write_type (
    type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
    at ../../git-rh/gcc/cp/mangle.c:1969
#3  0x007d4d98 in mangle_special_for_type (
    type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>,
    code=0x1419a98 "TI") at ../../git-rh/gcc/cp/mangle.c:3569
#4  0x007d4dcc in mangle_typeinfo_for_type (
    type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
    at ../../git-rh/gcc/cp/mangle.c:3585
#5  0x0070618c in get_tinfo_decl (
    type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
    at ../../git-rh/gcc/cp/rtti.c:422
#6  0x00709ff0 in emit_support_tinfo_1 (
    bltn=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
    at ../../git-rh/gcc/cp/rtti.c:1485
#7  0x0070a344 in emit_support_tinfos ()
    at ../../git-rh/gcc/cp/rtti.c:1550

Presumably the backend needs to grow some mangling support for its builtins,
but in the meantime can we do something less drastic than abort?  Isn't this
only really an issue if someone tries to access one of these types via typeinfo?


r~

Re: [i386] define __SIZEOF_FLOAT128__

2014-04-23 Thread Marc Glisse


(Adding an i386 maintainer in Cc)
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html

On Sun, 13 Apr 2014, Marc Glisse wrote:


Hello,

some people like having a macro to test if a type is available 
(__SIZEOF_INT128__ for instance). This adds macros for __float80 and 
__float128. The types seem to be always available, so I didn't add any 
condition.
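
For illustration, this is roughly how such a macro would be used from user code. It is a sketch only: float128_size and int128_size are hypothetical helper names, and the first branch assumes a compiler that defines __SIZEOF_FLOAT128__ as proposed here.

```c
#include <stddef.h>

/* Size in bytes of __float128 if the compiler advertises the type
   through __SIZEOF_FLOAT128__, otherwise 0.  */
static size_t
float128_size (void)
{
#ifdef __SIZEOF_FLOAT128__
  return sizeof (__float128);
#else
  return 0;
#endif
}

/* The same pattern with the existing __SIZEOF_INT128__ macro.  */
static size_t
int128_size (void)
{
#ifdef __SIZEOF_INT128__
  return sizeof (__int128);
#else
  return 0;
#endif
}
```

Code that needs the type can then fall back to something else at preprocessing time instead of failing to compile on targets without it.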


If you think this is a bad idea, please close the PR.

Bootstrap+testsuite on x86_64-linux-gnu.

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR preprocessor/56540
* config/i386/i386-c.c (ix86_target_macros): Define
__SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.


--
Marc Glisse

[AArch64/ARM 0/3] Patch series for REV permute instructions

The meat of this is in the second patch, which makes the AArch64 backend look 
for shuffle masks that can be turned into REV instructions, and updates the VREV 
Neon Intrinsics to use __builtin_shuffle rather than the current inline 
assembler; this then produces the same instructions (unless the midend can do 
better).
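
To show the idea of expressing a REV-style permute as a shuffle, here is a small sketch using GCC's generic vector extensions rather than arm_neon.h. The type name u8x8 and the function rev16_bytes are made up for this example and are not taken from the patch; the fallback branch is only there for compilers that lack __builtin_shuffle.

```c
#include <stdint.h>

/* Eight unsigned bytes in one 64-bit vector (GNU vector extension).  */
typedef uint8_t u8x8 __attribute__ ((vector_size (8)));

/* Swap the two bytes inside each 16-bit group, which is the lane
   permutation a REV16 on byte elements performs.  */
static u8x8
rev16_bytes (u8x8 a)
{
#if defined (__GNUC__) && !defined (__clang__)
  /* Lane i of the result is taken from lane mask[i] of the input.  */
  const u8x8 mask = { 1, 0, 3, 2, 5, 4, 7, 6 };
  return __builtin_shuffle (a, mask);
#else
  /* Fallback without __builtin_shuffle: copy lane by lane.  */
  u8x8 r;
  int i;
  for (i = 0; i < 8; i++)
    r[i] = a[i ^ 1];
  return r;
#endif
}
```

Given the lanes {1, 2, 3, 4, 5, 6, 7, 8} this yields {2, 1, 4, 3, 6, 5, 8, 7}. Whether the compiler turns the constant mask into a single instruction depends on the backend recognizing it, which is what the second patch adds for AArch64.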


Before that, the first patch adds execution + assembler tests of the existing
intrinsics, which then serve as a testcase for the second patch.

Third patch reuses the test bodies from first patch in equivalent tests on the
ARM architecture.

Ok for trunk?

--Alan

Re: [PATCH] Change is-a.h to support typedefs of pointers

2014-04-23 Thread David Malcolm

On Wed, 2014-04-23 at 18:32 +0200, Richard Biener wrote:
 On April 23, 2014 5:31:42 PM CEST, David Malcolm dmalc...@redhat.com wrote:

[...snip...]

 The following patch changes the is-a.h API to remove the implicit
 injection of a pointer, so that one writes:
 
   Q* q = dyn_cast <Q*> (p);
 
 rather than:
 
   Q* q = dyn_cast <Q> (p);
 

[...snip...]

 Successfully bootstrapped&regrtested on x86_64-unknown-linux-gnu.
 
 OK for trunk?  
 
 OK for trunk, no need to wait for 4.9.1 for this.

Thanks.  Committed to trunk as r209719.

[...snip...]

[AArch64/ARM 1/3] Add execution + assembler tests of AArch64 REV Neon Intrinsics

This adds DejaGNU tests of the existing AArch64 vrev_* intrinsics, both checking 
the assembler output and the runtime results. Test bodies are in separate files 
ready to reuse for ARM in the third patch.


All tests passing on aarch64-none-elf and aarch64_be-none-elf.

gcc/testsuite/ChangeLog:

2014-04-23  Alan Lawrence  alan.lawre...@arm.com

* gcc.target/aarch64/simd/vrev16p8_1.c: New file.
* gcc.target/aarch64/simd/vrev16p8.x: New file.
* gcc.target/aarch64/simd/vrev16qp8_1.c: New file.
* gcc.target/aarch64/simd/vrev16qp8.x: New file.
* gcc.target/aarch64/simd/vrev16qs8_1.c: New file.
* gcc.target/aarch64/simd/vrev16qs8.x: New file.
* gcc.target/aarch64/simd/vrev16qu8_1.c: New file.
* gcc.target/aarch64/simd/vrev16qu8.x: New file.
* gcc.target/aarch64/simd/vrev16s8_1.c: New file.
* gcc.target/aarch64/simd/vrev16s8.x: New file.
* gcc.target/aarch64/simd/vrev16u8_1.c: New file.
* gcc.target/aarch64/simd/vrev16u8.x: New file.
* gcc.target/aarch64/simd/vrev32p16_1.c: New file.
* gcc.target/aarch64/simd/vrev32p16.x: New file.
* gcc.target/aarch64/simd/vrev32p8_1.c: New file.
* gcc.target/aarch64/simd/vrev32p8.x: New file.
* gcc.target/aarch64/simd/vrev32qp16_1.c: New file.
* gcc.target/aarch64/simd/vrev32qp16.x: New file.
* gcc.target/aarch64/simd/vrev32qp8_1.c: New file.
* gcc.target/aarch64/simd/vrev32qp8.x: New file.
* gcc.target/aarch64/simd/vrev32qs16_1.c: New file.
* gcc.target/aarch64/simd/vrev32qs16.x: New file.
* gcc.target/aarch64/simd/vrev32qs8_1.c: New file.
* gcc.target/aarch64/simd/vrev32qs8.x: New file.
* gcc.target/aarch64/simd/vrev32qu16_1.c: New file.
* gcc.target/aarch64/simd/vrev32qu16.x: New file.
* gcc.target/aarch64/simd/vrev32qu8_1.c: New file.
* gcc.target/aarch64/simd/vrev32qu8.x: New file.
* gcc.target/aarch64/simd/vrev32s16_1.c: New file.
* gcc.target/aarch64/simd/vrev32s16.x: New file.
* gcc.target/aarch64/simd/vrev32s8_1.c: New file.
* gcc.target/aarch64/simd/vrev32s8.x: New file.
* gcc.target/aarch64/simd/vrev32u16_1.c: New file.
* gcc.target/aarch64/simd/vrev32u16.x: New file.
* gcc.target/aarch64/simd/vrev32u8_1.c: New file.
* gcc.target/aarch64/simd/vrev32u8.x: New file.
* gcc.target/aarch64/simd/vrev64f32_1.c: New file.
* gcc.target/aarch64/simd/vrev64f32.x: New file.
* gcc.target/aarch64/simd/vrev64p16_1.c: New file.
* gcc.target/aarch64/simd/vrev64p16.x: New file.
* gcc.target/aarch64/simd/vrev64p8_1.c: New file.
* gcc.target/aarch64/simd/vrev64p8.x: New file.
* gcc.target/aarch64/simd/vrev64qf32_1.c: New file.
* gcc.target/aarch64/simd/vrev64qf32.x: New file.
* gcc.target/aarch64/simd/vrev64qp16_1.c: New file.
* gcc.target/aarch64/simd/vrev64qp16.x: New file.
* gcc.target/aarch64/simd/vrev64qp8_1.c: New file.
* gcc.target/aarch64/simd/vrev64qp8.x: New file.
* gcc.target/aarch64/simd/vrev64qs16_1.c: New file.
* gcc.target/aarch64/simd/vrev64qs16.x: New file.
* gcc.target/aarch64/simd/vrev64qs32_1.c: New file.
* gcc.target/aarch64/simd/vrev64qs32.x: New file.
* gcc.target/aarch64/simd/vrev64qs8_1.c: New file.
* gcc.target/aarch64/simd/vrev64qs8.x: New file.
* gcc.target/aarch64/simd/vrev64qu16_1.c: New file.
* gcc.target/aarch64/simd/vrev64qu16.x: New file.
* gcc.target/aarch64/simd/vrev64qu32_1.c: New file.
* gcc.target/aarch64/simd/vrev64qu32.x: New file.
* gcc.target/aarch64/simd/vrev64qu8_1.c: New file.
* gcc.target/aarch64/simd/vrev64qu8.x: New file.
* gcc.target/aarch64/simd/vrev64s16_1.c: New file.
* gcc.target/aarch64/simd/vrev64s16.x: New file.
* gcc.target/aarch64/simd/vrev64s32_1.c: New file.
* gcc.target/aarch64/simd/vrev64s32.x: New file.
* gcc.target/aarch64/simd/vrev64s8_1.c: New file.
* gcc.target/aarch64/simd/vrev64s8.x: New file.
* gcc.target/aarch64/simd/vrev64u16_1.c: New file.
* gcc.target/aarch64/simd/vrev64u16.x: New file.
* gcc.target/aarch64/simd/vrev64u32_1.c: New file.
* gcc.target/aarch64/simd/vrev64u32.x: New file.
* gcc.target/aarch64/simd/vrev64u8_1.c: New file.
* gcc.target/aarch64/simd/vrev64u8.x: New file.

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x b/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x
new file mode 100644
index 000..6316abf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x
@@ -0,0 +1,22 @@
+extern void abort (void);
+
+poly8x8_t
+test_vrev16p8 (poly8x8_t _arg)
+{
+  return vrev16_p8 (_arg);
+}
+
+int
+main (int argc, char **argv)
+{
+  int i;
+  poly8x8_t inorder = {1, 2, 3, 4, 5, 6, 7, 8};
+  poly8x8_t reversed =

[AArch64/ARM 2/3] Recognize shuffle patterns for REV instructions on AARch64, rewrite intrinsics.

This patch (borrowing heavily from the ARM backend) makes 
aarch64_expand_vec_perm_const output REV instructions when appropriate,

and then implements the vrev_XXX intrinsics in terms of __builtin_shuffle (which
now produces the same assembly instructions).
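
The mask check at the heart of the recognizer can be sketched as standalone C. This is a simplified model, not the patch itself: is_rev_mask is a hypothetical name, the mask is a plain byte array rather than an expand_vec_perm_d, and the divisibility test stands in for the vector-mode checks that the real code relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if PERM (NELT lane indices into a single input vector)
   reverses the lanes inside every group of DIFF + 1 lanes, where
   DIFF = PERM[0] must be 7, 3 or 1.  */
static bool
is_rev_mask (const unsigned char *perm, unsigned int nelt)
{
  unsigned int i, j, diff;

  if (nelt == 0)
    return false;

  diff = perm[0];
  if (diff != 7 && diff != 3 && diff != 1)
    return false;

  /* Assumption of this sketch: reject vectors that do not hold a
     whole number of groups.  */
  if (nelt % (diff + 1) != 0)
    return false;

  for (i = 0; i < nelt; i += diff + 1)
    for (j = 0; j <= diff; j++)
      {
	assert (i + j < nelt);
	/* Within the group starting at I, lane J must come from the
	   mirrored position.  */
	if (perm[i + j] != i + diff - j)
	  return false;
      }

  return true;
}
```

For eight lanes, {1, 0, 3, 2, 5, 4, 7, 6} and {7, 6, 5, 4, 3, 2, 1, 0} are accepted, while the identity mask is rejected. Which of REV16, REV32 or REV64 a given DIFF maps to then depends on the element size of the vector mode.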

No regressions (and tests in previous patch 
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01468.html still passing) on 
aarch64-none-elf; also on aarch64_be-none-elf, where there are

no regressions following testsuite config changes in
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html, but some noise (due
to unexpected success in vectorization) without that patch.

gcc/ChangeLog:
2014-04-23  Alan Lawrence  alan.lawre...@arm.com

* config/aarch64/iterators.md: add a REVERSE iterator and rev_op
attribute for REV64/32/16 insns.
* config/aarch64/aarch64-simd.md: add corresponding define_insn
parameterized by REVERSE iterator.
* config/aarch64/aarch64.c (aarch64_evpc_rev): recognize REVnn patterns.
(aarch64_expand_vec_perm_const_1): call aarch64_evpc_rev also.
* config/aarch64/arm_neon.h (vrev{16,32,64}[q]_{s,p,u,f}{8,16,32}):
rewrite to use __builtin_shuffle.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4dffb59..d499e86 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4032,6 +4032,15 @@
   [(set_attr "type" "neon_permute<q>")]
 )
 
+(define_insn "aarch64_rev<REVERSE:rev_op><mode>"
+  [(set (match_operand:VALL 0 "register_operand" "=w")
+	(unspec:VALL [(match_operand:VALL 1 "register_operand" "w")]
+                    REVERSE))]
+  "TARGET_SIMD"
+  "rev<REVERSE:rev_op>\\t%0.<Vtype>, %1.<Vtype>"
+  [(set_attr "type" "neon_rev<q>")]
+)
+
 (define_insn "aarch64_st2<mode>_dreg"
   [(set (match_operand:TI 0 "aarch64_simd_struct_operand" "=Utv")
 	(unspec:TI [(match_operand:OI 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 16c51a8..5bb10a2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8047,6 +8047,80 @@ aarch64_evpc_zip (struct expand_vec_perm_d *d)
   return true;
 }
 
+/* Recognize patterns for the REV insns.  */
+
+static bool
+aarch64_evpc_rev (struct expand_vec_perm_d *d)
+{
+  unsigned int i, j, diff, nelt = d->nelt;
+  rtx (*gen) (rtx, rtx);
+
+  if (!d->one_vector_p)
+return false;
+
+  diff = d->perm[0];
+  switch (diff)
+{
+case 7:
+  switch (d->vmode)
+	{
+	case V16QImode: gen = gen_aarch64_rev64v16qi; break;
+	case V8QImode: gen = gen_aarch64_rev64v8qi;  break;
+	default:
+	  return false;
+	}
+  break;
+case 3:
+  switch (d->vmode)
+	{
+	case V16QImode: gen = gen_aarch64_rev32v16qi; break;
+	case V8QImode: gen = gen_aarch64_rev32v8qi;  break;
+	case V8HImode: gen = gen_aarch64_rev64v8hi;  break;
+	case V4HImode: gen = gen_aarch64_rev64v4hi;  break;
+	default:
+	  return false;
+	}
+  break;
+case 1:
+  switch (d->vmode)
+	{
+	case V16QImode: gen = gen_aarch64_rev16v16qi; break;
+	case V8QImode: gen = gen_aarch64_rev16v8qi;  break;
+	case V8HImode: gen = gen_aarch64_rev32v8hi;  break;
+	case V4HImode: gen = gen_aarch64_rev32v4hi;  break;
+	case V4SImode: gen = gen_aarch64_rev64v4si;  break;
+	case V2SImode: gen = gen_aarch64_rev64v2si;  break;
+	case V4SFmode: gen = gen_aarch64_rev64v4sf;  break;
+	case V2SFmode: gen = gen_aarch64_rev64v2sf;  break;
+	default:
+	  return false;
+	}
+  break;
+default:
+  return false;
+}
+
+  for (i = 0; i < nelt ; i += diff + 1)
+for (j = 0; j <= diff; j += 1)
+  {
+	/* This is guaranteed to be true as the value of diff
+	   is 7, 3, 1 and we should have enough elements in the
+	   queue to generate this.  Getting a vector mask with a
+	   value of diff other than these values implies that
+	   something is wrong by the time we get here.  */
+	gcc_assert (i + j < nelt);
+	if (d->perm[i + j] != i + diff - j)
+	  return false;
+  }
+
+  /* Success! */
+  if (d->testing_p)
+return true;
+
+  emit_insn (gen (d->target, d->op0));
+  return true;
+}
+
 static bool
 aarch64_evpc_dup (struct expand_vec_perm_d *d)
 {
@@ -8153,6 +8227,8 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
 	return true;
   else if (aarch64_evpc_trn (d))
 	return true;
+  else if (aarch64_evpc_rev (d))
+return true;
   else if (aarch64_evpc_dup (d))
 	return true;
   return aarch64_evpc_tbl (d);
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6af99361..383ed56 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -10628,402 +10628,6 @@ vrecpeq_u32 (uint32x4_t a)
   return result;
 }
 
-__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__))
-vrev16_p8 (poly8x8_t a)
-{
-  poly8x8_t result;
-  __asm__ ("rev16 %0.8b,%1.8b"
-   : "=w"(result)
-   : "w"(a)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int8x8_t

Re: [c++] typeinfo for target types

2014-04-23 Thread Marc Glisse


On Wed, 23 Apr 2014, Richard Henderson wrote:


On 04/13/2014 01:41 AM, Marc Glisse wrote:

Hello,

this patch generates typeinfo for target types. On x86_64, it adds these 6
lines to nm -C libsupc++.a. A follow-up patch will be needed to export and
version those in the shared library.

+ V typeinfo for __float128
+ V typeinfo for __float128 const*
+ V typeinfo for __float128*
+ V typeinfo name for __float128
+ V typeinfo name for __float128 const*
+ V typeinfo name for __float128*

Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c).

2014-04-13  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/43622
gcc/c-family/
* c-common.c (registered_builtin_types): Make non-static.
* c-common.h (registered_builtin_types): Declare.
gcc/cp/
* rtti.c (emit_support_tinfo_1): New function, extracted from
emit_support_tinfos.
(emit_support_tinfos): Call it and iterate on registered_builtin_types.



This is causing aarch64 builds to break.


If it is causing too much trouble, we could ifdef out the last 2 lines of 
emit_support_tinfos and revert the libstdc++ changes (or even revert the 
whole thing).



Any c++ compilation aborts at


That's surprising, the code I touched is only ever supposed to run while 
compiling one file in libsupc++, if I understand correctly.



#0  fancy_abort (file=0x14195c8 "../../git-rh/gcc/cp/mangle.c", line=2303,
   function=0x1419ff8 <write_builtin_type(tree_node*)::__FUNCTION__>
   "write_builtin_type") at ../../git-rh/gcc/diagnostic.c:1190
#1  0x007ce2b4 in write_builtin_type (
   type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
   at ../../git-rh/gcc/cp/mangle.c:2303
#2  0x007cc85c in write_type (
   type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
   at ../../git-rh/gcc/cp/mangle.c:1969
#3  0x007d4d98 in mangle_special_for_type (
   type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>,
   code=0x1419a98 "TI") at ../../git-rh/gcc/cp/mangle.c:3569
#4  0x007d4dcc in mangle_typeinfo_for_type (
   type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
   at ../../git-rh/gcc/cp/mangle.c:3585
#5  0x0070618c in get_tinfo_decl (
   type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
   at ../../git-rh/gcc/cp/rtti.c:422
#6  0x00709ff0 in emit_support_tinfo_1 (
   bltn=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>)
   at ../../git-rh/gcc/cp/rtti.c:1485
#7  0x0070a344 in emit_support_tinfos ()
   at ../../git-rh/gcc/cp/rtti.c:1550

Presumably the backend needs to grow some mangling support for its builtins,


aarch64 has complicated builtins... __builtin_aarch64_simd_df uses 
double_aarch64_type_node which is not the same as double_type_node. I 
mostly looked at the x86 backend, so I didn't notice that aarch64 
registers a lot more builtins.



but in the meantime can we do something less drastic than abort?


Sounds good, but I am not sure how exactly. We could use a separate hook 
(register_builtin_type_for_typeinfo?) so back-ends have to explicitly say 
they want typeinfo, but it is ugly having to register types multiple 
times. We could add a parameter to the existing register_builtin_type 
saying whether we want typeinfo, but that means updating all back-ends. We 
could get the mangling functions to take a parameter that says whether 
errors should be fatal and skip generating the typeinfo when we can't 
mangle, but there is no convenient way to communicate this mangling 
failure (0 bytes written?).


Would mangling the aarch64 builtins be a lot of work? Did other platforms 
break as well?


Isn't this only really an issue if someone tries to access one of these 
types via typeinfo?


Yes.

--
Marc Glisse

Re: [i386] define __SIZEOF_FLOAT128__

2014-04-23 Thread H.J. Lu

On Wed, Apr 23, 2014 at 11:48 AM, Marc Glisse marc.gli...@inria.fr wrote:
 (Adding an i386 maintainer in Cc)
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html


 On Sun, 13 Apr 2014, Marc Glisse wrote:

 Hello,

 some people like having a macro to test if a type is available
 (__SIZEOF_INT128__ for instance). This adds macros for __float80 and
 __float128. The types seem to be always available, so I didn't add any
 condition.

 If you think this is a bad idea, please close the PR.

 Bootstrap+testsuite on x86_64-linux-gnu.

 2014-04-13  Marc Glisse  marc.gli...@inria.fr

 PR preprocessor/56540
 * config/i386/i386-c.c (ix86_target_macros): Define
 __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.


For __SIZEOF_FLOAT80__, you should check TARGET_128BIT_LONG_DOUBLE
instead of TARGET_64BIT.

-- 
H.J.

[AArch64/ARM 3/3] Add execution tests of ARM REV intrinsics