Re: [PATCH 02/89] Introduce gimple_switch and use it in various places
On 04/22/2014 10:17 PM, David Malcolm wrote:

> or indeed, something like:
>
>   else if (gimple_switch switch_stmt = dyn_cast <gimple_switch> (stmt))
>     {
>
> to avoid an 83-character-wide line :)
>
> Hope that's the appropriate way to split such a line; I can never
> remember if one is supposed to put the linebreak before or after the =.

It's supposed to come before, so that the equal sign is at the start of the line. Operators are also supposed to be at the start and not the end. At least the operator part is covered in the coding standards:

http://www.gnu.org/prep/standards/html_node/Formatting.html

-- 
Florian Weimer / Red Hat Product Security Team
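As a small illustration of the rule described in this reply (break before the operator, including the equal sign), here is a sketch in plain C. The function total_cost and its parameters are hypothetical, invented only to give lines worth splitting; they do not come from the patch under review.

```c
/* Hypothetical example, not from the GCC sources: shows GNU-style
   continuation lines, where each continuation starts with the operator.  */
static int
total_cost (int base_cost, int per_item_cost, int item_count, int penalty)
{
  /* The '=' starts the continuation line instead of ending the first.  */
  int subtotal
    = base_cost + per_item_cost * item_count;

  /* Binary operators likewise lead the continuation line.  */
  return (subtotal
          + penalty);
}
```

The same layout would apply to the dyn_cast line quoted above: the line would end after the variable name and the next line would begin with the equal sign.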
Re: [RFC] Add aarch64 support for ada
> diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
> index dc5e912..302d9a3 100644
> --- a/gcc/ada/gcc-interface/Makefile.in
> +++ b/gcc/ada/gcc-interface/Makefile.in
> @@ -2123,6 +2123,44 @@ ifeq ($(strip $(filter-out alpha% linux%,$(arch) $(osys))),)
>    LIBRARY_VERSION := $(LIB_VERSION)
>  endif
>  
> +# AArch64 Linux
> +ifeq ($(strip $(filter-out aarch64% linux%,$(arch) $(osys))),)
> +  LIBGNAT_TARGET_PAIRS = \
> +  a-exetim.adb<a-exetim-posix.adb \
> +  a-exetim.ads<a-exetim-default.ads \
> +  a-intnam.ads<a-intnam-linux.ads \
> +  a-synbar.adb<a-synbar-posix.adb \
> +  a-synbar.ads<a-synbar-posix.ads \
> +  s-inmaop.adb<s-inmaop-posix.adb \
> +  s-intman.adb<s-intman-posix.adb \
> +  s-linux.ads<s-linux.ads \
> +  s-mudido.adb<s-mudido-affinity.adb \
> +  s-osinte.ads<s-osinte-linux.ads \
> +  s-osinte.adb<s-osinte-posix.adb \
> +  s-osprim.adb<s-osprim-posix.adb \
> +  s-taprop.adb<s-taprop-linux.adb \
> +  s-tasinf.ads<s-tasinf-linux.ads \
> +  s-tasinf.adb<s-tasinf-linux.adb \
> +  s-tpopsp.adb<s-tpopsp-tls.adb \
> +  s-taspri.ads<s-taspri-posix.ads \
> +  g-sercom.adb<g-sercom-linux.adb \
> +  $(ATOMICS_TARGET_PAIRS) \
> +  $(ATOMICS_BUILTINS_TARGET_PAIRS) \
> +  system.ads<system-linux-x86_64.ads
> +  ## ^^ Note the above is a pretty-close placeholder.
> +
> +  TOOLS_TARGET_PAIRS = \
> +mlib-tgt-specific.adb<mlib-tgt-specific-linux.adb \
> +indepsw.adb<indepsw-gnu.adb
> +
> +  EXTRA_GNATRTL_TASKING_OBJS=s-linux.o a-exetim.o
> +  EH_MECHANISM=-gcc
> +  THREADSLIB=-lpthread -lrt
> +  GNATLIB_SHARED=gnatlib-shared-dual
> +  GMEM_LIB = gmemlib
> +  LIBRARY_VERSION := $(LIB_VERSION)
> +endif
> +
>  # x86-64 Linux
>  ifeq ($(strip $(filter-out %x86_64 linux%,$(arch) $(osys))),)
>    LIBGNAT_TARGET_PAIRS = \

This patch was not made against the mainline but was nevertheless applied to the mainline, breaking the build on at least x86 and x86-64 as a result.

-- 
Eric Botcazou
Re: [PATCH] Add support for -fno-sanitize-recover and -fsanitize-undefined-trap-on-error (PR sanitizer/60275)
On Tue, 15 Apr 2014, Jakub Jelinek wrote:

> Hi!
>
> This patch adds two new options (compatible with clang) which allow users to choose the behavior of undefined behavior sanitization.
>
> By default as before, all undefined behaviors (except for __builtin_unreachable and missing return in C++) continue after reporting, which means that you can get lots of runtime errors from a single program run and the exit code will not reflect the failure in that case.
>
> With this patch, one can use -fsanitize=undefined -fno-sanitize-recover, which will report just the first undefined behavior and then exit with non-zero code. Or one can use -fsanitize-undefined-trap-on-error, which will just __builtin_trap () on undefined behavior, not report anything and not require linking of -lubsan (useful e.g. for the kernel or embedded apps). If -fsanitize-undefined-trap-on-error is given, then -f{,no-}sanitize-recover is ignored; as UB traps, of course only the first undefined behavior will be reported (through the SIGILL/abort).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Works for me.

Thanks,
Richard.

> 2014-04-15  Jakub Jelinek  ja...@redhat.com
>
>     PR sanitizer/60275
>     * common.opt (fsanitize-recover, fsanitize-undefined-trap-on-error): New options.
>     * gcc.c (sanitize_spec_function): Don't return "" for "undefined" if flag_sanitize_undefined_trap_on_error.
>     * sanitizer.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT,
>     BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT,
>     BUILT_IN_UBSAN_HANDLE_VLA_BOUND_NOT_POSITIVE_ABORT,
>     BUILT_IN_UBSAN_HANDLE_TYPE_MISMATCH_ABORT,
>     BUILT_IN_UBSAN_HANDLE_ADD_OVERFLOW_ABORT,
>     BUILT_IN_UBSAN_HANDLE_SUB_OVERFLOW_ABORT,
>     BUILT_IN_UBSAN_HANDLE_MUL_OVERFLOW_ABORT,
>     BUILT_IN_UBSAN_HANDLE_NEGATE_OVERFLOW_ABORT,
>     BUILT_IN_UBSAN_HANDLE_LOAD_INVALID_VALUE_ABORT): New builtins.
>     * ubsan.c (ubsan_instrument_unreachable): Return __builtin_trap () if flag_sanitize_undefined_trap_on_error.
>     (ubsan_expand_null_ifn): Emit __builtin_trap () if flag_sanitize_undefined_trap_on_error and __ubsan_handle_type_mismatch_abort if !flag_sanitize_recover.
>     (ubsan_expand_null_ifn, ubsan_build_overflow_builtin, instrument_bool_enum_load): Emit __builtin_trap () if flag_sanitize_undefined_trap_on_error and __builtin_handle_*_abort () if !flag_sanitize_recover.
>     * doc/invoke.texi (-fsanitize-recover, -fsanitize-undefined-trap-on-error): Document.
> c-family/
>     * c-ubsan.c (ubsan_instrument_return): Return __builtin_trap () if flag_sanitize_undefined_trap_on_error.
>     (ubsan_instrument_division, ubsan_instrument_shift, ubsan_instrument_vla): Likewise. Use __ubsan_handle_*_abort () if !flag_sanitize_recover.
> testsuite/
>     * g++.dg/ubsan/return-2.C: Revert 2014-03-24 changes, add -fno-sanitize-recover to dg-options.
>     * g++.dg/ubsan/cxx11-shift-1.C: Remove c++11 target restriction, add -std=c++11 to dg-options.
>     * g++.dg/ubsan/cxx11-shift-2.C: Likewise.
>     * g++.dg/ubsan/cxx1y-vla.C: Remove c++1y target restriction, add -std=c++1y to dg-options.
>     * c-c++-common/ubsan/undefined-1.c: Revert 2014-03-24 changes, add -fno-sanitize-recover to dg-options.
>     * c-c++-common/ubsan/overflow-sub-1.c: Likewise.
>     * c-c++-common/ubsan/vla-4.c: Likewise.
>     * c-c++-common/ubsan/pr59503.c: Likewise.
>     * c-c++-common/ubsan/vla-3.c: Likewise.
>     * c-c++-common/ubsan/save-expr-1.c: Likewise.
>     * c-c++-common/ubsan/overflow-add-1.c: Likewise.
>     * c-c++-common/ubsan/shift-3.c: Likewise.
>     * c-c++-common/ubsan/overflow-1.c: Likewise.
>     * c-c++-common/ubsan/overflow-negate-2.c: Likewise.
>     * c-c++-common/ubsan/vla-2.c: Likewise.
>     * c-c++-common/ubsan/overflow-mul-1.c: Likewise.
>     * c-c++-common/ubsan/pr60613-1.c: Likewise.
>     * c-c++-common/ubsan/shift-6.c: Likewise.
>     * c-c++-common/ubsan/overflow-mul-3.c: Likewise.
>     * c-c++-common/ubsan/overflow-add-3.c: New test.
>     * c-c++-common/ubsan/overflow-add-4.c: New test.
>     * c-c++-common/ubsan/div-by-zero-6.c: New test.
>     * c-c++-common/ubsan/div-by-zero-7.c: New test.
>
> --- gcc/common.opt.jj	2014-04-15 09:57:33.400264838 +0200
> +++ gcc/common.opt	2014-04-15 10:28:10.554519376 +0200
> @@ -862,6 +862,14 @@ fsanitize=
>  Common Driver Report Joined
>  Select what to sanitize
>  
> +fsanitize-recover
> +Common Report Var(flag_sanitize_recover) Init(1)
> +After diagnosing undefined behavior attempt to continue execution
> +
> +fsanitize-undefined-trap-on-error
> +Common Report Var(flag_sanitize_undefined_trap_on_error) Init(0)
> +Use trap instead of a library function for undefined behavior sanitization
> +
>  fasynchronous-unwind-tables
>  Common Report Var(flag_asynchronous_unwind_tables) Optimization
>  Generate unwind tables that are exact at each instruction
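To make the three modes described in this message easier to try out, here is a minimal sketch in C. The option names are the ones from the patch description; the function names add_ints and add_would_overflow are hypothetical and exist only for this illustration.

```c
#include <limits.h>

/* Plain signed addition.  If a + b does not fit in an int the behavior is
   undefined, which is the kind of event -fsanitize=undefined instruments.  */
static int
add_ints (int a, int b)
{
  return a + b;
}

/* Hypothetical helper: returns 1 if add_ints (a, b) would overflow,
   computed without performing the overflowing addition itself.  */
static int
add_would_overflow (int a, int b)
{
  return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

Going by the description above, a program that calls add_ints (INT_MAX, 1) twice and is built with -fsanitize=undefined alone should report both overflows and keep running; with -fno-sanitize-recover added it should stop after the first report with a non-zero exit code; with -fsanitize-undefined-trap-on-error it should trap without printing a report and without needing -lubsan. This is a reading of the patch description, not a verified transcript.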
libsanitizer merge from upstream request
Konstantin / Jakub, Could you update GCC's libsanitizer version? I'd like to have the AArch64 support, which was committed on my behalf in LLVM sources as svn rev 201303. You may prefer to merge with a more recent revision of course :-) Once AArch64 support is merged, I'll post the GCC part. Thanks, Christophe.
Re: [PATCH][RFC] Remove RTL loop unswitching
On Sun, 20 Apr 2014, Jan Hubicka wrote:

>> This removes RTL loop unswitching (see last year's discussion about compile-time issues of that pass). RTL loop unswitching is enabled together with GIMPLE loop unswitching at -O3 and by -funswitch-loops. It's clearly the wrong place to do high-level loop transforms these days, and the cost of maintenance doesn't outweigh the questionable benefit. Thus the following patch removes it.
>>
>> Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope for testsuite fallout).
>>
>> Any objections?
>
> Not really, I am all for moving more of loop stuff to trees. Did you perform some benchmarks? (I remember I did in 2012 but completely forgot the outcome).

I did that last year and it showed no difference in SPEC 2k6. When bootstrapping with -O3 and a gcc_unreachable () in the RTL unswitching path you get some ICEs there, but they are due to the different effective --param max-unswitch-insns, which on GIMPLE is applied to tree_num_loop_insns () and on RTL to num_loop_insns ().

I'll go forward with the patch today.

> On related note, shall I try to update the following?
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02270.html

Yeah.

Thanks,
Richard.

> Honza
>
>> Thanks,
>> Richard.
>>
>> 2014-04-15  Richard Biener  rguent...@suse.de
>>
>>     * Makefile.in (OBJS): Remove loop-unswitch.o.
>>     * loop-unswitch.c: Delete.
>>     * tree-pass.h (make_pass_rtl_unswitch): Remove.
>>     * passes.def (pass_rtl_unswitch): Likewise.
>>     * loop-init.c (gate_rtl_unswitch): Likewise.
>>     (rtl_unswitch): Likewise.
>>     (pass_data_rtl_unswitch): Likewise.
>>     (pass_rtl_unswitch): Likewise.
>>     (make_pass_rtl_unswitch): Likewise.
>>     * rtl.h (reversed_condition): Likewise.
>>     (compare_and_jump_seq): Likewise.
>>     * loop-iv.c (reversed_condition): Move here from loop-unswitch.c and make static.
>>     * loop-unroll.c (compare_and_jump_seq): Likewise.
>>
>> Index: gcc/Makefile.in
>> ===================================================================
>> --- gcc/Makefile.in	(revision 209410)
>> +++ gcc/Makefile.in	(working copy)
>> @@ -1294,7 +1294,6 @@ OBJS = \
>>  	loop-invariant.o \
>>  	loop-iv.o \
>>  	loop-unroll.o \
>> -	loop-unswitch.o \
>>  	lower-subreg.o \
>>  	lra.o \
>>  	lra-assigns.o \
>> Index: gcc/tree-pass.h
>> ===================================================================
>> --- gcc/tree-pass.h	(revision 209410)
>> +++ gcc/tree-pass.h	(working copy)
>> @@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg
>>  extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt);
>>  extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
>>  extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt);
>> -extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
>>  extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context *ctxt);
>>  extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
>>  extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
>> Index: gcc/passes.def
>> ===================================================================
>> --- gcc/passes.def	(revision 209410)
>> +++ gcc/passes.def	(working copy)
>> @@ -341,7 +341,6 @@ along with GCC; see the file COPYING3.
>>        PUSH_INSERT_PASSES_WITHIN (pass_loop2)
>>  	  NEXT_PASS (pass_rtl_loop_init);
>>  	  NEXT_PASS (pass_rtl_move_loop_invariants);
>> -	  NEXT_PASS (pass_rtl_unswitch);
>>  	  NEXT_PASS (pass_rtl_unroll_and_peel_loops);
>>  	  NEXT_PASS (pass_rtl_doloop);
>>  	  NEXT_PASS (pass_rtl_loop_done);
>> Index: gcc/loop-init.c
>> ===================================================================
>> --- gcc/loop-init.c	(revision 209410)
>> +++ gcc/loop-init.c	(working copy)
>> @@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc:
>>  }
>>  
>> -/* Loop unswitching for RTL.  */
>> -static bool
>> -gate_rtl_unswitch (void)
>> -{
>> -  return flag_unswitch_loops;
>> -}
>> -
>> -static unsigned int
>> -rtl_unswitch (void)
>> -{
>> -  if (number_of_loops (cfun) > 1)
>> -    unswitch_loops ();
>> -  return 0;
>> -}
>> -
>> -namespace {
>> -
>> -const pass_data pass_data_rtl_unswitch =
>> -{
>> -  RTL_PASS, /* type */
>> -  "loop2_unswitch", /* name */
>> -  OPTGROUP_LOOP, /* optinfo_flags */
>> -  true, /* has_gate */
>> -  true, /* has_execute */
>> -  TV_LOOP_UNSWITCH, /* tv_id */
>> -  0, /* properties_required */
>> -  0, /* properties_provided */
>> -  0, /* properties_destroyed */
>> -  0, /* todo_flags_start */
>> -  TODO_verify_rtl_sharing, /* todo_flags_finish */
>> -};
>> -
>> -class pass_rtl_unswitch : public rtl_opt_pass
>> -{
>> -public:
>> -  pass_rtl_unswitch (gcc::context *ctxt)
>> -    : rtl_opt_pass (pass_data_rtl_unswitch, ctxt)
>> -  {}
>> -
>> -  /* opt_pass methods: */
>> -  bool gate () { return gate_rtl_unswitch (); }
>> -  unsigned int execute () { return rtl_unswitch (); }
>> -
>> -}; // class pass_rtl_unswitch
>> -
>> -} // anon namespace
>> -
>> -rtl_opt_pass *
>> -make_pass_rtl_unswitch (gcc::context *ctxt)
>> -{
>> -
[PATCH] Fix PR60891
This fixes an oversight in loop_optimizer_init () loop-fixup code that fails to honor AVOID_CFG_MANIPULATIONS.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk and 4.9 branch.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

    PR middle-end/60891
    * loop-init.c (loop_optimizer_init): Make sure to apply LOOPS_MAY_HAVE_MULTIPLE_LATCHES before fixing up loops.

    * gcc.dg/torture/pr60891.c: New testcase.

Index: gcc/loop-init.c
===================================================================
--- gcc/loop-init.c	(revision 209559)
+++ gcc/loop-init.c	(working copy)
@@ -94,20 +94,15 @@ loop_optimizer_init (unsigned flags)
   else
     {
       bool recorded_exits = loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS);
+      bool needs_fixup = loops_state_satisfies_p (LOOPS_NEED_FIXUP);
 
       gcc_assert (cfun->curr_properties & PROP_loops);
 
       /* Ensure that the dominators are computed, like flow_loops_find does.  */
       calculate_dominance_info (CDI_DOMINATORS);
 
-      if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
-        {
-          loops_state_clear (~0U);
-          fix_loop_structure (NULL);
-        }
-
 #ifdef ENABLE_CHECKING
-      else
+      if (!needs_fixup)
         verify_loop_structure ();
 #endif
 
@@ -115,6 +110,14 @@ loop_optimizer_init (unsigned flags)
       if (recorded_exits)
         release_recorded_exits ();
       loops_state_clear (~0U);
+
+      if (needs_fixup)
+        {
+          /* Apply LOOPS_MAY_HAVE_MULTIPLE_LATCHES early as fix_loop_structure
+             re-applies flags.  */
+          loops_state_set (flags & LOOPS_MAY_HAVE_MULTIPLE_LATCHES);
+          fix_loop_structure (NULL);
+        }
     }
 
   /* Apply flags to loops.  */
Index: gcc/testsuite/gcc.dg/torture/pr60891.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/pr60891.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr60891.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-ch -fno-tree-cselim -fno-tree-dominator-opts" } */
+
+int a, b, c, d, e, f;
+
+void foo (int x)
+{
+  for (;;)
+    {
+      int g = c;
+      if (x)
+        {
+          if (e)
+            while (a)
+              --f;
+        }
+      for (b = 5; b; b--)
+        {
+        }
+      if (!g)
+        x = 0;
+    }
+}
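The ordering issue in this fix can be pictured with a toy model in C. Everything prefixed toy_ or TOY_ is hypothetical and is not GCC code; in particular, the idea that a fix-up run without the multiple-latches flag ends up changing the CFG is an assumption drawn from the patch comment ("fix_loop_structure re-applies flags") and from the stated goal of honoring AVOID_CFG_MANIPULATIONS.

```c
/* Toy stand-in for LOOPS_MAY_HAVE_MULTIPLE_LATCHES.  */
#define TOY_MAY_HAVE_MULTIPLE_LATCHES (1u << 0)

struct toy_loops
{
  unsigned state;    /* Currently applied loop-state flags.  */
  int cfg_modified;  /* Set when the fix-up had to change the CFG.  */
};

/* Stand-in for fix_loop_structure: it re-applies whatever flags are set in
   STATE.  Assumption: without TOY_MAY_HAVE_MULTIPLE_LATCHES it would merge
   latches, which this model records as a CFG modification.  */
static void
toy_fix_loop_structure (struct toy_loops *loops)
{
  if (!(loops->state & TOY_MAY_HAVE_MULTIPLE_LATCHES))
    loops->cfg_modified = 1;
}

/* Stand-in for the fix-up path of loop_optimizer_init.  Returns nonzero if
   the CFG was modified.  APPLY_FLAG_EARLY selects the patched ordering.  */
static int
toy_loop_optimizer_init (unsigned flags, int apply_flag_early)
{
  /* State right after the equivalent of loops_state_clear (~0U).  */
  struct toy_loops loops = { 0u, 0 };

  if (apply_flag_early)
    loops.state |= flags & TOY_MAY_HAVE_MULTIPLE_LATCHES;
  toy_fix_loop_structure (&loops);

  /* The equivalent of "Apply flags to loops." happens only afterwards.  */
  loops.state |= flags;
  return loops.cfg_modified;
}
```

In this model, a caller that passes TOY_MAY_HAVE_MULTIPLE_LATCHES sees no CFG change with the patched ordering and does see one with the old ordering, which is the behavior the ChangeLog entry describes in words.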
[PATCH] Fix PR60895
This fixes PR60895: copying TREE_ADDRESSABLE from a decl to a handled-component-ref, as the inliner tries to do, doesn't work. Use mark_addressable instead.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk and 4.9 branch.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

    PR middle-end/60895
    * tree-inline.c (declare_return_variable): Use mark_addressable.

    * g++.dg/torture/pr60895.C: New testcase.

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	(revision 209559)
+++ gcc/tree-inline.c	(working copy)
@@ -3120,7 +3124,8 @@ declare_return_variable (copy_body_data
         {
           var = return_slot;
           gcc_assert (TREE_CODE (var) != SSA_NAME);
-          TREE_ADDRESSABLE (var) |= TREE_ADDRESSABLE (result);
+          if (TREE_ADDRESSABLE (result))
+            mark_addressable (var);
         }
       if ((TREE_CODE (TREE_TYPE (result)) == COMPLEX_TYPE
            || TREE_CODE (TREE_TYPE (result)) == VECTOR_TYPE)
Index: gcc/testsuite/g++.dg/torture/pr60895.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr60895.C	(revision 0)
+++ gcc/testsuite/g++.dg/torture/pr60895.C	(working copy)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+
+struct C
+{
+  double elems[3];
+};
+
+C
+foo ()
+{
+  C a;
+  double *f = a.elems;
+  int b;
+  for (; b;)
+    {
+      *f = 0;
+      ++f;
+    }
+  return a;
+}
+
+struct J
+{
+  C c;
+  __attribute__((always_inline)) J () : c (foo ()) {}
+};
+
+void
+bar ()
+{
+  J ();
+}
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
On 17 April 2014 19:01, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:
> On Thu, Apr 17, 2014 at 8:45 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
>> On 17 April 2014 16:51:23 Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:
>>> On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
>>>> On 17 April 2014 16:07, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:
>>>>> Hi,
>>>>>
>>>>> If you are trying to modify the libsanitizer files, please read here:
>>>>> https://code.google.com/p/address-sanitizer/wiki/HowToContribute
>>>>
>>>> I read that, thanks. Patch 3/3 is for current compiler-rt git repo, please install it there, i do not have write access to the LLVM nor compiler-rt trees.
>>>
>>> I can commit your patch to llvm tree only after you follow the process described on that page. Sorry, this is a hard rule.
>>
>> What part of the process do you think I did not follow? I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu then provided the corresponding GCC parts, along a backport of the new bits that I expect to be overwritten once you do a new merge, leaving just the GCC configuy bits. This is how I read the wiki page you cite. Please tell me what you expect me to do differently?
>
> First, I did not notice that you've sent it to llvm-commits because it was also sent to the gcc list (unusual thing to happen) and got filtered into the gcc part of my mail. Sorry.
>
> But second, the patch is far from trivial and you should not expect us to commit it w/o a careful review, so here comes another part of the wiki:
> For non-trivial patches please use Phabricator -- this will help us reply faster.

http://reviews.llvm.org/D3464

thanks,
Re: fuse-caller-save - hook format
On 22/04/14 18:13, Tom de Vries wrote:
> On 22-04-14 18:18, Richard Sandiford wrote:
>> Tom de Vries tom_devr...@mentor.com writes:
>>> On 22-04-14 17:27, Richard Sandiford wrote:
>>>> Tom de Vries tom_devr...@mentor.com writes:
>>>>> 2. post_expand_call_insn.
>>>>>    A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE.
>>>>
>>>> Why is this needed though? Like I say, I think targets should update CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander. Splitting the functionality of the call expanders across the define_expand and a new hook just makes things unnecessarily complicated IMO.
>>>
>>> Richard,
>>>
>>> It is not needed, but it is convenient.
>>>
>>> There are targets where the define_expands for calls use the rtl template. Having to add clobbers to the CALL_INSN_FUNCTION_USAGE for such a target means you cannot use the rtl template any more and instead need to generate all needed RTL insns in C code.
>>>
>>> This hook means that you can keep using the rtl template, which is less intrusive for those targets.
>
> [ switching order of questions ]
>
>> Which target do you have in mind?
>
> Aarch64.
>
>> But if the target is simple enough to use a single call pattern for call cases, wouldn't it be possible to add the clobber directly to the call pattern?
>
> I think that can be done, but that feels intrusive as well. I thought the reason that we added these clobbers to CALL_INSN_FUNCTION_USAGE was exactly because we did not want to add them to the rtl patterns? But, if the maintainer is fine with that, so am I.
>
> Richard Earnshaw,
>
> are you ok with adding the IP0_REGNUM/IP1_REGNUM clobbers to all the call patterns in the Aarch64 target?
>
> The alternatives are:
> - rewrite the call expansions not to use the rtl templates, and add the clobbers there to CALL_INSN_FUNCTION_USAGE
> - get the post_expand_call_insn hook approved and use that to add the clobbers to CALL_INSN_FUNCTION_USAGE.
>
> what is your preference?

It seems undesirable to me to be hard-coding ABI constraints directly into the MD file. It's not a major problem while there is one ABI that's common to all targets; but it's quite possible this sort of detail would change from platform to platform. That sort of churn is best kept out of the MD file itself, if at all possible.

R.

> Thanks,
> - Tom
RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
Hi Guys,

  Please could I have permission to apply the patch below? Ideally for both mainline and the 4.9 branch.

  The patch adds a file called default-manifest.o to the end of a final link command line for the Cygwin and MinGW targets. The file is only added if it can be found in the library search path(s), so the patch will have no effect if the file does not exist.

  The default manifest file contains a resource section (.rsrc) holding information necessary for the binary to be run under Windows 8. It is placed last on the linker command line so that a user provided manifest, if there is one, will take precedence over the default manifest.

  The manifest used to be automatically added by the linker, but this proved to be problematic as the linker is not good at selectively inserting binaries. The manifest itself is provided by a separate project which will have to become a new dependency for the Cygwin and MinGW projects.

Cheers
  Nick

gcc/ChangeLog
2014-04-23  Nick Clifton  ni...@redhat.com

    * config/i386/cygwin.h (ENDFILE_SPEC): Include default-manifest.o if it can be found in the search path.
    * config/i386/mingw32.h (ENDFILE_SPEC): Likewise.

Index: gcc/config/i386/cygwin.h
===================================================================
--- gcc/config/i386/cygwin.h	(revision 209670)
+++ gcc/config/i386/cygwin.h	(working copy)
@@ -45,6 +45,7 @@
 #undef ENDFILE_SPEC
 #define ENDFILE_SPEC \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}\
+   %{!shared:%:if-exists(default-manifest.o%s)}\
    crtend.o%s"
 
 /* Normally, -lgcc is not needed since everything in it is in the DLL, but we
Index: gcc/config/i386/mingw32.h
===================================================================
--- gcc/config/i386/mingw32.h	(revision 209670)
+++ gcc/config/i386/mingw32.h	(working copy)
@@ -148,6 +148,7 @@
 #undef ENDFILE_SPEC
 #define ENDFILE_SPEC \
   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+   %{!shared:%:if-exists(default-manifest.o%s)}\
    crtend.o%s"
 
 /* Override startfile prefix defaults.  */
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
Thanks. Let's move the discussion there.

On Wed, Apr 23, 2014 at 12:46 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
> [...]
> http://reviews.llvm.org/D3464
>
> thanks,
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
Hello Nick,

2014-04-23 10:53 GMT+02:00 Nick Clifton ni...@redhat.com:
> Hi Guys,
>
>   Please could I have permission to apply the patch below? Ideally for both mainline and the 4.9 branch.
>
>   The patch adds a file called default-manifest.o to the end of a final link command line for the Cygwin and MinGW targets. The file is only added if it can be found in the library search path(s), so the patch will have no effect if the file does not exist.
>
>   The default manifest file contains a resource section (.rsrc) holding information necessary for the binary to be run under Windows 8. It is placed last on the linker command line so that a user provided manifest, if there is one, will take precedence over the default manifest.
>
>   The manifest used to be automatically added by the linker, but this proved to be problematic as the linker is not good at selectively inserting binaries. The manifest itself is provided by a separate project which will have to become a new dependency for the Cygwin and MinGW projects.
>
> Cheers
>   Nick

Well, I am a bit concerned about the position of the manifest object. What will actually happen if the user specifies a user-specific manifest object? Will the default one, if present, be ignored, or will it still be linked?

Cheers,
Kai
Re: Remove obsolete Solaris 9 support
Andrew Hughes gnu.and...@redhat.com writes:

> - Original Message -
>> On Sat, 2014-04-19 at 09:03 +0100, Andrew Haley wrote:
>>> On 04/16/2014 12:16 PM, Rainer Orth wrote:
>>>> * I'm removing the sys/loadavg.h check from classpath. Again, I'm uncertain if this is desirable. In the past, classpath changes were merged upstream by one of the libjava maintainers.
>>>
>>> We should not diverge from GNU Classpath unless there is a strong reason to do so.
>>
>> I think the configure check is mostly harmless, but wouldn't be opposed removing it. It really seems to have been added explicitly for Solaris 9, which is probably really dead by now. Andrew Hughes, you added it back in 2008. Are you still using/building on any Solaris 9 setups?
>
> I vaguely remember adding it. I was building on the university's Solaris 9 machines at the time. They've long since replaced them with GNU/Linux machines and I've been at Red Hat for over five years, so those days are long gone :)
>
> I have some Freetype fixes to push to Classpath as well, so I'll fix this too and look at merging to gcj in the not-too-distant future. I think it's long overdue. Ideally, the change should be left out of this patch, so as to avoid conflicts.

Based on the other Andrew's comment and the knowledge that classpath (like libgo) lives upstream, I didn't commit that part with the rest of the patch.

Thanks.
        Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
Hi Kai,

>> The default manifest file contains a resource section (.rsrc) holding information necessary for the binary to be run under Windows 8. It is placed last on the linker command line so that a user provided manifest, if there is one, will take precedence over the default manifest.
>
> Well, I am a bit concerned about the position of the manifest-object. What will actually happen, if user specifies an user-specific manifest-object. Will the default one, if present, be ignored, or will it be still linked?

The default one, if present, will be ignored[1].

This is why I am using ENDFILE_SPEC to add the default manifest to the linker command line. This ensures that the default manifest is placed after any user specified object files on the linker command line. The resource merging code in the linker is specifically designed to drop any duplicate resources, only keeping the resource that appeared first on the command line.

Cheers
  Nick

[1] Strictly speaking the default manifest will not be ignored. It will be included in the link, and merged into the output .rsrc section. But the resource merging code in the linker will drop everything in the default manifest giving preference to the user supplied manifest instead.
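The "first one on the command line wins" behavior described in this reply can be sketched in C as follows. This is a simplified model, not the linker's actual code: the type toy_resource, the function toy_merge_first_wins and the single string key are all hypothetical (real .rsrc entries are not keyed by one string).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified resource record.  */
struct toy_resource
{
  const char *id;      /* Resource identity, e.g. "manifest".  */
  const char *origin;  /* Which input object supplied it.  */
};

/* Append to OUT (holding *OUT_LEN entries, capacity CAP) every entry of IN
   whose id is not already present.  Inputs are merged in command-line
   order, so a resource from an earlier object is kept and a later
   duplicate is dropped.  Returns the number of entries dropped.  */
static size_t
toy_merge_first_wins (struct toy_resource *out, size_t *out_len, size_t cap,
                      const struct toy_resource *in, size_t in_len)
{
  size_t dropped = 0;

  for (size_t i = 0; i < in_len; i++)
    {
      int seen = 0;

      for (size_t j = 0; j < *out_len; j++)
        if (strcmp (out[j].id, in[i].id) == 0)
          {
            seen = 1;
            break;
          }

      if (seen || *out_len == cap)
        dropped++;
      else
        out[(*out_len)++] = in[i];
    }

  return dropped;
}
```

Merging a user object first and default-manifest.o second leaves only the user's manifest in the output, which matches the reason given above for placing the default manifest via ENDFILE_SPEC, after all user-specified objects.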
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
2014-04-23 11:06 GMT+02:00 Nicholas Clifton ni...@redhat.com:
> Hi Kai,
>
>>> The default manifest file contains a resource section (.rsrc) holding information necessary for the binary to be run under Windows 8. It is placed last on the linker command line so that a user provided manifest, if there is one, will take precedence over the default manifest.
>>
>> Well, I am a bit concerned about the position of the manifest-object. What will actually happen, if user specifies an user-specific manifest-object. Will the default one, if present, be ignored, or will it be still linked?
>
> The default one, if present, will be ignored[1].
>
> This is why I am using ENDFILE_SPEC to add the default manifest to the linker command line. This ensures that the default manifest is placed after any user specified object files on the linker command line. The resource merging code in the linker is specifically designed to drop any duplicate resources, only keeping the resource that appeared first on the command line.
>
> Cheers
>   Nick
>
> [1] Strictly speaking the default manifest will not be ignored. It will be included in the link, and merged into the output .rsrc section. But the resource merging code in the linker will drop everything in the default manifest giving preference to the user supplied manifest instead.

Thanks for explaining. So patch is ok for trunk, and for 4.9 branch.

Thanks,
Kai
Re: [Patch, Fortran] PR60881 - fix ICE with allocatable scalar coarrays
Dear Tobias,

As you say, this is of a rather obvious nature and is OK for trunk.

Cheers

Paul

On 21 April 2014 22:52, Tobias Burnus bur...@net-b.de wrote:
> Dear all,
>
> for a change, a patch for the trunk and not for the fortran-caf branch.
>
> The following is a rather obvious patch which fixes the ICE.
>
> Built and regtested on x86-64-gnu-linux.
> OK for the trunk? As it is of rather obvious nature, I will commit it to the trunk in the next days unless there are objections.
>
> Tobias

-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy
Re: [PATCH] Fix warning in libgfortran configure script
On 17/04/14 17:49, Kyrill Tkachov wrote:
> Hi all,
>
> While configuring libgfortran I'm getting this message:
>
> libgfortran/configure: line 25938: test: =: unary operator expected
>
> The script doesn't fail and continues afterwards, but I don't think it's supposed to give that warning. This patch makes it go away and makes it more consistent with other similar uses (a few lines below $ac_cv_lib_rt_clock_gettime is quoted when used in a test structure).
>
> configure.ac is updated and configure is regenerated with autoconf 2.64
>
> Ok for trunk?
>
> Make sure libgfortran builds for arm-none-eabi.
>
> libgfortran/
> 2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com
>
>     * configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
>     * configure: Regenerate.

This looks fairly safe to me. My only question might be why isn't the variable set to one of 'yes' or 'no'?

OK unless the fortran maintainers chime in within 24 hours.

R.

> libgfortran-configure.patch
>
> diff --git a/libgfortran/configure b/libgfortran/configure
> index 23f57c7..d3ced74 100755
> --- a/libgfortran/configure
> +++ b/libgfortran/configure
> @@ -25935,7 +25935,7 @@ fi
>  # test is copied from libgomp, and modified to not link in -lrt as
>  # libgfortran calls clock_gettime via a weak reference if it's found
>  # in librt.
> -if test $ac_cv_func_clock_gettime = no; then
> +if test "$ac_cv_func_clock_gettime" = no; then
>    { $as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
>  $as_echo_n "checking for clock_gettime in -lrt... " >&6; }
>  if test "${ac_cv_lib_rt_clock_gettime+set}" = set; then :
> diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
> index de2d65e..24dbf2b 100644
> --- a/libgfortran/configure.ac
> +++ b/libgfortran/configure.ac
> @@ -510,7 +510,7 @@ AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes AC_DEFINE([HAVE_FEENA
>  # test is copied from libgomp, and modified to not link in -lrt as
>  # libgfortran calls clock_gettime via a weak reference if it's found
>  # in librt.
> -if test $ac_cv_func_clock_gettime = no; then
> +if test "$ac_cv_func_clock_gettime" = no; then
>    AC_CHECK_LIB(rt, clock_gettime,
>      [AC_DEFINE(HAVE_CLOCK_GETTIME_LIBRT, 1,
>        [Define to 1 if you have the `clock_gettime' function in librt.])])
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
[This time to everyone, not only to Kai, sorry]

Hi guys,

On Apr 23 11:08, Kai Tietz wrote:
> 2014-04-23 11:06 GMT+02:00 Nicholas Clifton ni...@redhat.com:
>> Hi Kai,
>>
>>>> The default manifest file contains a resource section (.rsrc) holding information necessary for the binary to be run under Windows 8. It is placed last on the linker command line so that a user provided manifest, if there is one, will take precedence over the default manifest.
>>>
>>> Well, I am a bit concerned about the position of the manifest-object. What will actually happen, if user specifies an user-specific manifest-object. Will the default one, if present, be ignored, or will it be still linked?
>>
>> The default one, if present, will be ignored[1].
>>
>> This is why I am using ENDFILE_SPEC to add the default manifest to the linker command line. This ensures that the default manifest is placed after any user specified object files on the linker command line. The resource merging code in the linker is specifically designed to drop any duplicate resources, only keeping the resource that appeared first on the command line.
>>
>> Cheers
>>   Nick
>>
>> [1] Strictly speaking the default manifest will not be ignored. It will be included in the link, and merged into the output .rsrc section. But the resource merging code in the linker will drop everything in the default manifest giving preference to the user supplied manifest instead.
>
> Thanks for explaining. So patch is ok for trunk, and for 4.9 branch.

Couldn't have said it better. However, we know that the act of merging will currently result in broken resources in the executable. Wouldn't it be better to apply the above patch only after the resource merge fix?

Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
Hi Corinna,

> However, we know that the act of merging will currently result in broken
> resources in the executable. Wouldn't it be better to apply the above
> patch only after the resource merge fix?

No. Well not in my opinion. :-)

The reason is that this patch only makes a difference if the default manifest can be found in a library search path. If there is none present then nothing happens. So you can disable the (broken) merging of a default manifest file by simply not having it present. Which should be the case for all current installations.

Plus - I am hoping to fix the resource merging problem soon. (Any day now, honest). So I would like to have the gcc patch in place for when that happens.

Cheers
  Nick
[PATCH][RFC] (Auto)-add TODO_verify_il
This goes forward with an old idea of doing IL verification after each pass. This is a baby-step towards it by adding TODO_verify_il, auto-added by the pass manager at the todo-after position. It moves loop-closed SSA verification (which was done whenever loops were in loop-closed SSA form - before _and_ after a pass...) under the TODO_verify_il umbrella.

Bootstrap/regtest ongoing on x86_64-unknown-linux-gnu.

I'm proposing to remove TODO_verify_* by enabling them under TODO_verify_il.

Any comments?

Thanks,
Richard.

2014-04-23  Richard Biener  rguent...@suse.de

    * tree-pass.h (TODO_verify_il): Define.
    (TODO_verify_all): Complete properly.
    * passes.c (execute_function_todo): Move existing loop-closed
    SSA verification under TODO_verify_il.
    (execute_one_pass): Trigger TODO_verify_il at todo-after time.

Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h (revision 209677)
+++ gcc/tree-pass.h (working copy)
@@ -234,6 +234,7 @@ protected:
 #define TODO_verify_flow            (1 << 3)
 #define TODO_verify_stmts           (1 << 4)
 #define TODO_cleanup_cfg            (1 << 5)
+#define TODO_verify_il              (1 << 6)
 #define TODO_dump_symtab            (1 << 7)
 #define TODO_remove_functions       (1 << 8)
 #define TODO_rebuild_frequencies    (1 << 9)
@@ -309,7 +310,8 @@ protected:
    | TODO_update_ssa_only_virtuals)

 #define TODO_verify_all \
-  (TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts)
+  (TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts | TODO_verify_il \
+   | TODO_verify_rtl_sharing)

 /* Register pass info. */

Index: gcc/passes.c
===================================================================
--- gcc/passes.c (revision 209677)
+++ gcc/passes.c (working copy)
@@ -1777,8 +1777,7 @@ execute_function_todo (void *data)
     return;

 #if defined ENABLE_CHECKING
-  if (flags & TODO_verify_ssa
-      || (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA)))
+  if (flags & TODO_verify_ssa)
     {
       verify_gimple_in_cfg (cfun);
       verify_ssa (true);
@@ -1787,8 +1786,18 @@ execute_function_todo (void *data)
     verify_gimple_in_cfg (cfun);
   if (flags & TODO_verify_flow)
     verify_flow_info ();
-  if (current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA))
-    verify_loop_closed_ssa (false);
+  if (flags & TODO_verify_il)
+    {
+      if (current_loops
+          && loops_state_satisfies_p (LOOP_CLOSED_SSA))
+        {
+          if (!(flags & (TODO_verify_stmts|TODO_verify_ssa)))
+            verify_gimple_in_cfg (cfun);
+          if (!(flags & TODO_verify_ssa))
+            verify_ssa (true);
+          verify_loop_closed_ssa (false);
+        }
+    }
   if (flags & TODO_verify_rtl_sharing)
     verify_rtl_sharing ();
 #endif
@@ -2170,7 +2179,7 @@ execute_one_pass (opt_pass *pass)
     check_profile_consistency (pass->static_pass_number, 0, true);

   /* Run post-pass cleanup and verification.  */
-  execute_todo (todo_after | pass->todo_flags_finish);
+  execute_todo (todo_after | pass->todo_flags_finish | TODO_verify_il);
   if (profile_report && cfun && (cfun->curr_properties & PROP_cfg))
     check_profile_consistency (pass->static_pass_number, 1, true);
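
As a rough standalone illustration of the flag handling in this patch (not GCC code), the sketch below models which verifiers would run for a given set of TODO flags. The bit positions of TODO_verify_flow, TODO_verify_stmts and TODO_verify_il come from the tree-pass.h hunk; the position of TODO_verify_ssa, the RUN_* result bits, and the names verifiers_to_run and todo_after_pass are assumptions made up for the example.

```c
#include <stdbool.h>

#define TODO_verify_ssa   (1u << 2)  /* assumed value; not shown in the hunk */
#define TODO_verify_flow  (1u << 3)
#define TODO_verify_stmts (1u << 4)
#define TODO_verify_il    (1u << 6)

/* Hypothetical result bits naming the verifiers that would be invoked.  */
enum
{
  RUN_GIMPLE = 1 << 0,
  RUN_SSA = 1 << 1,
  RUN_FLOW = 1 << 2,
  RUN_LOOP_CLOSED = 1 << 3
};

/* Model of the checking block in execute_function_todo after the patch.
   LOOP_CLOSED_SSA stands in for
   "current_loops && loops_state_satisfies_p (LOOP_CLOSED_SSA)".  */
unsigned
verifiers_to_run (unsigned flags, bool loop_closed_ssa)
{
  unsigned run = 0;

  if (flags & TODO_verify_ssa)
    run |= RUN_GIMPLE | RUN_SSA;
  else if (flags & TODO_verify_stmts)
    run |= RUN_GIMPLE;
  if (flags & TODO_verify_flow)
    run |= RUN_FLOW;
  /* Loop-closed SSA verification now only happens under TODO_verify_il;
     it pulls in the GIMPLE and SSA verifiers if they did not run yet.  */
  if ((flags & TODO_verify_il) && loop_closed_ssa)
    run |= RUN_GIMPLE | RUN_SSA | RUN_LOOP_CLOSED;
  return run;
}

/* Model of the execute_one_pass change: the pass manager always adds
   TODO_verify_il at todo-after time.  */
unsigned
todo_after_pass (unsigned todo_after, unsigned todo_flags_finish)
{
  return todo_after | todo_flags_finish | TODO_verify_il;
}
```

In this model a pass that requests nothing still gets TODO_verify_il after it runs, and loop-closed SSA is only checked when the loops are actually in that form.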
Re: [RFC] Add aarch64 support for ada
OK, I have installed a variant of the patch (it should not change anything). But it breaks on IA-64 for the same reason as on Aarch64 so we'll need to find something else. -- Eric Botcazou
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:
> On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com wrote:
>> Kyrill Tkachov kyrylo.tkac...@arm.com writes:
>>> Ping.
>>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html
>>>
>>> Any ideas? I recall chatter on IRC that we want to merge wide-int into
>>> trunk soon. Bootstrap failure on arm would prevent that...
>>
>> Sorry for the late reply. I hadn't forgotten, but I wanted to wait
>> until I had chance to look into the ICE before replying, which I haven't
>> had chance to do yet.
>
> They are separable issues, so, I checked in the change.
>
>> It's a shame we can't use C++ style casts, but I suppose that's the
>> price to pay for being able to write “unsigned HOST_WIDE_INT”.
>
> unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were
> expecting a typedef or better. I slightly prefer the int (1) style, but
> I think we should go the direction of the patch.

Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and require a 64bit integer type on the host and force all targets to use a 64bit 'hwi'. Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate related changes).

Richard.
Re: [wide-int 1/8] Fix some off-by-one errors and bounds tests
On Tue, Apr 22, 2014 at 9:45 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
> This is the first of 8 patches from reading through the diff with mainline.
> Some places had an off-by-one error on an index and some used "<= 0"
> instead of ">= 0".
>
> I think we should use MAX_BITSIZE_MODE_ANY_MODE rather than
> MAX_BITSIZE_MODE_ANY_INT when handling floating-point modes.
>
> Two hunks contain unrelated formatting fixes too.
>
> Tested on x86_64-linux-gnu. OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
> Index: gcc/c-family/c-ada-spec.c
> ===================================================================
> --- gcc/c-family/c-ada-spec.c 2014-04-22 20:31:10.632895953 +0100
> +++ gcc/c-family/c-ada-spec.c 2014-04-22 20:31:24.880998602 +0100
> @@ -2205,8 +2205,9 @@ dump_generic_ada_node (pretty_printer *b
>                    val = -val;
>                  }
>                sprintf (pp_buffer (buffer)->digit_buffer,
> -                       "16#%" HOST_WIDE_INT_PRINT "x", val.elt (val.get_len () - 1));
> -              for (i = val.get_len () - 2; i <= 0; i--)
> +                       "16#%" HOST_WIDE_INT_PRINT "x",
> +                       val.elt (val.get_len () - 1));
> +              for (i = val.get_len () - 2; i >= 0; i--)
>                  sprintf (pp_buffer (buffer)->digit_buffer,
>                           HOST_WIDE_INT_PRINT_PADDED_HEX, val.elt (i));
>                pp_string (buffer, pp_buffer (buffer)->digit_buffer);
> Index: gcc/dbxout.c
> ===================================================================
> --- gcc/dbxout.c 2014-04-22 20:31:10.632895953 +0100
> +++ gcc/dbxout.c 2014-04-22 20:31:24.881998608 +0100
> @@ -720,7 +720,7 @@ stabstr_O (tree cst)
>      }
>
>    prec -= res_pres;
> -  for (i = prec - 3; i <= 0; i = i - 3)
> +  for (i = prec - 3; i >= 0; i = i - 3)
>      {
>        digit = wi::extract_uhwi (cst, i, 3);
>        stabstr_C ('0' + digit);
> Index: gcc/dwarf2out.c
> ===================================================================
> --- gcc/dwarf2out.c 2014-04-22 20:31:10.632895953 +0100
> +++ gcc/dwarf2out.c 2014-04-22 20:31:24.884998630 +0100
> @@ -1847,7 +1847,7 @@ output_loc_operands (dw_loc_descr_ref lo
>          int i;
>          int len = get_full_len (*val2->v.val_wide);
>          if (WORDS_BIG_ENDIAN)
> -          for (i = len; i >= 0; --i)
> +          for (i = len - 1; i >= 0; --i)
>              dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR,
>                                   val2->v.val_wide->elt (i), NULL);
>          else
> @@ -2073,7 +2073,7 @@ output_loc_operands (dw_loc_descr_ref lo
>            dw2_asm_output_data (1, len * l, NULL);
>
>            if (WORDS_BIG_ENDIAN)
> -            for (i = len; i >= 0; --i)
> +            for (i = len - 1; i >= 0; --i)
>                dw2_asm_output_data (l, val2->v.val_wide->elt (i), NULL);
>            else
>              for (i = 0; i < len; ++i)
> @@ -5398,11 +5398,11 @@ print_die (dw_die_ref die, FILE *outfile
>              int i = a->dw_attr_val.v.val_wide->get_len ();
>              fprintf (outfile, "constant (");
>              gcc_assert (i > 0);
> -            if (a->dw_attr_val.v.val_wide->elt (i) == 0)
> +            if (a->dw_attr_val.v.val_wide->elt (i - 1) == 0)
>                fprintf (outfile, "0x");
>              fprintf (outfile, HOST_WIDE_INT_PRINT_HEX,
>                       a->dw_attr_val.v.val_wide->elt (--i));
> -            while (-- i >= 0)
> +            while (--i >= 0)
>                fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX,
>                         a->dw_attr_val.v.val_wide->elt (i));
>              fprintf (outfile, ")");
> @@ -8723,7 +8723,7 @@ output_die (dw_die_ref die)
>                                     NULL);
>
>                if (WORDS_BIG_ENDIAN)
> -                for (i = len; i >= 0; --i)
> +                for (i = len - 1; i >= 0; --i)
>                    {
>                      dw2_asm_output_data (l, a->dw_attr_val.v.val_wide->elt (i),
>                                           name);
> Index: gcc/simplify-rtx.c
> ===================================================================
> --- gcc/simplify-rtx.c 2014-04-22 20:31:10.632895953 +0100
> +++ gcc/simplify-rtx.c 2014-04-22 20:31:24.884998630 +0100
> @@ -5395,7 +5395,7 @@ simplify_immed_subreg (enum machine_mode
>          case MODE_DECIMAL_FLOAT:
>            {
>              REAL_VALUE_TYPE r;
> -            long tmp[MAX_BITSIZE_MODE_ANY_INT / 32];
> +            long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32];
>
>              /* real_from_target wants its input in words affected by
>                 FLOAT_WORDS_BIG_ENDIAN.  However, we ignore this,
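
The dwarf2out.c hunks above fix loops that started at index len, one past the last valid element. A minimal sketch of the corrected most-significant-first walk follows; it uses a plain array in place of the wide-int element accessor, and emit_big_endian and its parameters are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Copy the LEN elements of ELTS into OUT, most significant element first,
   and return how many elements were written.  */
size_t
emit_big_endian (const long *elts, int len, long *out)
{
  size_t n = 0;

  /* Valid indices are 0 .. len - 1, so the walk starts at len - 1.
     Starting at len would read one element past the end of the array.  */
  for (int i = len - 1; i >= 0; --i)
    out[n++] = elts[i];
  return n;
}
```

With the loop condition written as i >= 0 and the start at len - 1, an empty input (len == 0) simply emits nothing.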
Re: [wide-int 4/8] Tweak uses of new API
On Tue, Apr 22, 2014 at 9:55 PM, Richard Sandiford rdsandif...@googlemail.com wrote: This is an assorted bunch of API tweaks: - use neg_p instead of lts_p (..., 0) - use STATIC_ASSERT for things that are known at compile time - avoid unnecessary wide(st)_int temporaries and arithmetic - remove an unnecessary template parameter - use to_short_addr for an offset_int-HOST_WIDE_INT offset change Tested on x86_64-linux-gnu. OK to install? Ok. Thanks, Richard. Thanks, Richard Index: gcc/ada/gcc-interface/cuintp.c === --- gcc/ada/gcc-interface/cuintp.c 2014-04-22 20:31:10.680896299 +0100 +++ gcc/ada/gcc-interface/cuintp.c 2014-04-22 20:31:24.526996049 +0100 @@ -160,7 +160,7 @@ UI_From_gnu (tree Input) in a signed 64-bit integer. */ if (tree_fits_shwi_p (Input)) return UI_From_Int (tree_to_shwi (Input)); - else if (wi::lts_p (Input, 0) TYPE_UNSIGNED (gnu_type)) + else if (wi::neg_p (Input) TYPE_UNSIGNED (gnu_type)) return No_Uint; #endif Index: gcc/expmed.c === --- gcc/expmed.c2014-04-22 20:31:10.680896299 +0100 +++ gcc/expmed.c2014-04-22 20:31:24.527996056 +0100 @@ -4971,7 +4971,7 @@ make_tree (tree type, rtx x) return t; case CONST_DOUBLE: - gcc_assert (HOST_BITS_PER_WIDE_INT * 2 = MAX_BITSIZE_MODE_ANY_INT); + STATIC_ASSERT (HOST_BITS_PER_WIDE_INT * 2 = MAX_BITSIZE_MODE_ANY_INT); if (TARGET_SUPPORTS_WIDE_INT == 0 GET_MODE (x) == VOIDmode) t = wide_int_to_tree (type, wide_int::from_array (CONST_DOUBLE_LOW (x), 2, Index: gcc/fold-const.c === --- gcc/fold-const.c2014-04-22 20:31:10.680896299 +0100 +++ gcc/fold-const.c2014-04-22 20:31:24.530996079 +0100 @@ -4274,9 +4274,8 @@ build_range_check (location_t loc, tree if (integer_onep (low) TREE_CODE (high) == INTEGER_CST) { int prec = TYPE_PRECISION (etype); - wide_int osb = wi::set_bit_in_zero (prec - 1, prec) - 1; - if (osb == high) + if (wi::mask (prec - 1, false, prec) == high) { if (TYPE_UNSIGNED (etype)) { @@ -12950,7 +12949,7 @@ fold_binary_loc (location_t loc, operand_equal_p (tree_strip_nop_conversions (TREE_OPERAND 
(arg0, 1)), arg1, 0) - wi::bit_and (TREE_OPERAND (arg0, 0), 1) == 1) + wi::extract_uhwi (TREE_OPERAND (arg0, 0), 0, 1) == 1) { return omit_two_operands_loc (loc, type, code == NE_EXPR Index: gcc/predict.c === --- gcc/predict.c 2014-04-22 20:31:10.680896299 +0100 +++ gcc/predict.c 2014-04-22 20:31:24.531996086 +0100 @@ -1309,33 +1309,34 @@ predict_iv_comparison (struct loop *loop bool overflow, overall_overflow = false; widest_int compare_count, tem; - widest_int loop_bound = wi::to_widest (loop_bound_var); - widest_int compare_bound = wi::to_widest (compare_var); - widest_int base = wi::to_widest (compare_base); - widest_int compare_step = wi::to_widest (compare_step_var); - /* (loop_bound - base) / compare_step */ - tem = wi::sub (loop_bound, base, SIGNED, overflow); + tem = wi::sub (wi::to_widest (loop_bound_var), +wi::to_widest (compare_base), SIGNED, overflow); overall_overflow |= overflow; - widest_int loop_count = wi::div_trunc (tem, compare_step, SIGNED, -overflow); + widest_int loop_count = wi::div_trunc (tem, +wi::to_widest (compare_step_var), +SIGNED, overflow); overall_overflow |= overflow; - if (!wi::neg_p (compare_step) + if (!wi::neg_p (wi::to_widest (compare_step_var)) ^ (compare_code == LT_EXPR || compare_code == LE_EXPR)) { /* (loop_bound - compare_bound) / compare_step */ - tem = wi::sub (loop_bound, compare_bound, SIGNED, overflow); + tem = wi::sub (wi::to_widest (loop_bound_var), +wi::to_widest (compare_var), SIGNED, overflow); overall_overflow |= overflow; - compare_count = wi::div_trunc (tem, compare_step, SIGNED, overflow); + compare_count = wi::div_trunc (tem, wi::to_widest (compare_step_var), +SIGNED, overflow); overall_overflow |= overflow; } else { /* (compare_bound - base) / compare_step */ - tem = wi::sub (compare_bound, base, SIGNED, overflow); + tem =
Re: [wide-int 3/8] Add and use udiv_ceil
On Tue, Apr 22, 2014 at 9:51 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
> Just a minor tweak to avoid several calculations when one would do.
> Since we have a function for rounded-up division, we might as well
> use it instead of the (X + Y - 1) / Y idiom.
>
> Tested on x86_64-linux-gnu. OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
> Index: gcc/dwarf2out.c
> ===================================================================
> --- gcc/dwarf2out.c 2014-04-22 20:31:25.187000808 +0100
> +++ gcc/dwarf2out.c 2014-04-22 20:31:26.374009366 +0100
> @@ -14824,7 +14824,7 @@ simple_decl_align_in_bits (const_tree de
>  static inline offset_int
>  round_up_to_align (const offset_int &t, unsigned int align)
>  {
> -  return wi::udiv_trunc (t + align - 1, align) * align;
> +  return wi::udiv_ceil (t, align) * align;
>  }
>
>  /* Given a pointer to a FIELD_DECL, compute and return the byte offset of the
> Index: gcc/wide-int.h
> ===================================================================
> --- gcc/wide-int.h 2014-04-22 20:31:25.842005530 +0100
> +++ gcc/wide-int.h 2014-04-22 20:31:26.375009373 +0100
> @@ -521,6 +521,7 @@ #define SHIFT_FUNCTION \
>    BINARY_FUNCTION udiv_floor (const T1 &, const T2 &);
>    BINARY_FUNCTION sdiv_floor (const T1 &, const T2 &);
>    BINARY_FUNCTION div_ceil (const T1 &, const T2 &, signop, bool * = 0);
> +  BINARY_FUNCTION udiv_ceil (const T1 &, const T2 &);
>    BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0);
>    BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop,
>                                  WI_BINARY_RESULT (T1, T2) *);
> @@ -2566,6 +2567,13 @@ wi::div_ceil (const T1 &x, const T2 &y,
>    return quotient;
>  }
>
> +template <typename T1, typename T2>
> +inline WI_BINARY_RESULT (T1, T2)
> +wi::udiv_ceil (const T1 &x, const T2 &y)
> +{
> +  return div_ceil (x, y, UNSIGNED);
> +}
> +
>  /* Return X / Y, rouding towards nearest with ties away from zero.
>     Treat X and Y as having the signedness given by SGN.  Indicate
>     in *OVERFLOW if the result overflows.  */
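
For readers unfamiliar with the idiom, here is a small sketch of rounded-up unsigned division on plain 64-bit integers, and of the alignment helper built on it. It is only an analogy for the wide-int change: udiv_ceil_u64 and round_up_to_align_u64 are hypothetical names, not part of the wide-int API. On fixed-width integers, computing the quotient and then adjusting for a remainder also sidesteps the wrap-around that x + y - 1 can hit near the top of the range.

```c
#include <stdint.h>

/* Return X / Y rounded up.  Y must be nonzero.  */
uint64_t
udiv_ceil_u64 (uint64_t x, uint64_t y)
{
  /* Truncating quotient, plus one if anything was left over.  */
  return x / y + (x % y != 0);
}

/* Round T up to the next multiple of ALIGN, mirroring the shape of
   round_up_to_align in the patch.  ALIGN must be nonzero.  */
uint64_t
round_up_to_align_u64 (uint64_t t, uint64_t align)
{
  return udiv_ceil_u64 (t, align) * align;
}
```

For example, rounding 13 up to a multiple of 8 goes through udiv_ceil_u64 (13, 8), which is 2, giving 16; a value that is already aligned is returned unchanged.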
Re: [wide-int 6/8] Avoid redundant extensions
On Tue, Apr 22, 2014 at 10:04 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
> register_edge_assert_for_2 operates on wide_ints of precision nprec
> so a lot of the extensions are redundant.
>
> Tested on x86_64-linux-gnu. OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
> Index: gcc/tree-vrp.c
> ===================================================================
> --- gcc/tree-vrp.c 2014-04-22 20:58:26.969683484 +0100
> +++ gcc/tree-vrp.c 2014-04-22 21:00:26.670617168 +0100
> @@ -5125,16 +5125,13 @@ register_edge_assert_for_2 (tree name, e
>          {
>            wide_int minv, maxv, valv, cst2v;
>            wide_int tem, sgnbit;
> -          bool valid_p = false, valn = false, cst2n = false;
> +          bool valid_p = false, valn, cst2n;
>            enum tree_code ccode = comp_code;
>
>            valv = wide_int::from (val, nprec, UNSIGNED);
>            cst2v = wide_int::from (cst2, nprec, UNSIGNED);
> -          if (TYPE_SIGN (TREE_TYPE (val)) == SIGNED)
> -            {
> -              valn = wi::neg_p (wi::sext (valv, nprec));
> -              cst2n = wi::neg_p (wi::sext (cst2v, nprec));
> -            }
> +          valn = wi::neg_p (valv, TYPE_SIGN (TREE_TYPE (val)));
> +          cst2n = wi::neg_p (cst2v, TYPE_SIGN (TREE_TYPE (val)));
>            /* If CST2 doesn't have most significant bit set,
>               but VAL is negative, we have comparison like
>               if ((x & 0x123) > -4) (always true).  Just give up.  */
> @@ -5153,13 +5150,11 @@ register_edge_assert_for_2 (tree name, e
>                   have folded the comparison into false) and
>                   maximum unsigned value is VAL | ~CST2.  */
>                maxv = valv | ~cst2v;
> -              maxv = wi::zext (maxv, nprec);
>                valid_p = true;
>                break;
>
>              case NE_EXPR:
>                tem = valv | ~cst2v;
> -              tem = wi::zext (tem, nprec);
>                /* If VAL is 0, handle (X & CST2) != 0 as (X & CST2) > 0U.  */
>                if (valv == 0)
>                  {
> @@ -5176,7 +5171,7 @@ register_edge_assert_for_2 (tree name, e
>                    sgnbit = wi::zero (nprec);
>                    goto lt_expr;
>                  }
> -              if (!cst2n && wi::neg_p (wi::sext (cst2v, nprec)))
> +              if (!cst2n && wi::neg_p (cst2v))
>                  sgnbit = wi::set_bit_in_zero (nprec - 1, nprec);
>                if (sgnbit != 0)
>                  {
> @@ -5245,7 +5240,6 @@ register_edge_assert_for_2 (tree name, e
>                    maxv -= 1;
>                  }
>                maxv |= ~cst2v;
> -              maxv = wi::zext (maxv, nprec);
>                minv = sgnbit;
>                valid_p = true;
>                break;
> @@ -5274,7 +5268,6 @@ register_edge_assert_for_2 (tree name, e
>                  }
>                maxv -= 1;
>                maxv |= ~cst2v;
> -              maxv = wi::zext (maxv, nprec);
>                minv = sgnbit;
>                valid_p = true;
>                break;
> @@ -5283,7 +5276,7 @@ register_edge_assert_for_2 (tree name, e
>                break;
>              }
>            if (valid_p
> -              && wi::zext (maxv - minv, nprec) != wi::minus_one (nprec))
> +              && (maxv - minv) != -1)
>              {
>                tree tmp, new_val, type;
>                int i;
Re: [wide-int 7/8] Undo some changes from trunk
On Tue, Apr 22, 2014 at 10:12 PM, Richard Sandiford rdsandif...@googlemail.com wrote: This patch undoes a few assorted differences from trunk. For fold-const.c the old code was: /* If INNER is a right shift of a constant and it plus BITNUM does not overflow, adjust BITNUM and INNER. */ if (TREE_CODE (inner) == RSHIFT_EXPR TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST tree_fits_uhwi_p (TREE_OPERAND (inner, 1)) bitnum TYPE_PRECISION (type) (tree_to_uhwi (TREE_OPERAND (inner, 1)) (unsigned) (TYPE_PRECISION (type) - bitnum))) { bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1)); inner = TREE_OPERAND (inner, 0); } and we lost the bitnum range test. The gimple-fold.c change contained an unrelated stylistic change that makes the code a bit less efficient. For ipa-prop.c we should convert to a HOST_WIDE_INT before multiplying, like trunk does. It doesn't change the result and is more efficient. objc-act.c contains three copies of the same code. The check for 0 was kept in the third but not the first two. Tested on x86_64-linux-gnu. OK to install? Ok. Thanks, Richard. Thanks, Richard Index: gcc/fold-const.c === --- gcc/fold-const.c2014-04-22 21:00:26.921619127 +0100 +++ gcc/fold-const.c2014-04-22 21:00:27.317622218 +0100 @@ -6581,8 +6581,9 @@ fold_single_bit_test (location_t loc, en not overflow, adjust BITNUM and INNER. 
*/ if (TREE_CODE (inner) == RSHIFT_EXPR TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST - wi::ltu_p (wi::to_widest (TREE_OPERAND (inner, 1)) + bitnum, - TYPE_PRECISION (type))) + bitnum TYPE_PRECISION (type) + wi::ltu_p (TREE_OPERAND (inner, 1), + TYPE_PRECISION (type) - bitnum)) { bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1)); inner = TREE_OPERAND (inner, 0); Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c 2014-04-22 20:58:26.869682704 +0100 +++ gcc/gimple-fold.c 2014-04-22 21:00:27.31866 +0100 @@ -3163,12 +3163,13 @@ fold_const_aggregate_ref_1 (tree t, tree (idx = (*valueize) (TREE_OPERAND (t, 1))) TREE_CODE (idx) == INTEGER_CST) { - tree low_bound = array_ref_low_bound (t); - tree unit_size = array_ref_element_size (t); + tree low_bound, unit_size; /* If the resulting bit-offset is constant, track it. */ - if (TREE_CODE (low_bound) == INTEGER_CST - tree_fits_uhwi_p (unit_size)) + if ((low_bound = array_ref_low_bound (t), + TREE_CODE (low_bound) == INTEGER_CST) + (unit_size = array_ref_element_size (t), + tree_fits_uhwi_p (unit_size))) { offset_int woffset = wi::sext (wi::to_offset (idx) - wi::to_offset (low_bound), Index: gcc/ipa-prop.c === --- gcc/ipa-prop.c 2014-04-22 20:58:26.869682704 +0100 +++ gcc/ipa-prop.c 2014-04-22 21:00:27.319622234 +0100 @@ -3787,8 +3787,8 @@ ipa_modify_call_arguments (struct cgraph if (TYPE_ALIGN (type) align) align = TYPE_ALIGN (type); } - misalign += (offset_int::from (off, SIGNED) - * BITS_PER_UNIT).to_short_addr (); + misalign += (offset_int::from (off, SIGNED).to_short_addr () + * BITS_PER_UNIT); misalign = misalign (align - 1); if (misalign != 0) align = (misalign -misalign); Index: gcc/objc/objc-act.c === --- gcc/objc/objc-act.c 2014-04-22 20:58:26.869682704 +0100 +++ gcc/objc/objc-act.c 2014-04-22 21:00:27.320622242 +0100 @@ -4882,7 +4882,9 @@ objc_decl_method_attributes (tree *node, which specifies the index of the format string argument. Add 2. 
*/ number = TREE_VALUE (second_argument); - if (number TREE_CODE (number) == INTEGER_CST) + if (number + TREE_CODE (number) == INTEGER_CST + !wi::eq_p (number, 0)) TREE_VALUE (second_argument) = wide_int_to_tree (TREE_TYPE (number), wi::add (number, 2)); @@ -4893,7 +4895,9 @@ objc_decl_method_attributes (tree *node, in which case we don't need to add 2. Add 2 if not 0. */ number = TREE_VALUE (third_argument); - if (number TREE_CODE
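
The fold-const.c part of this message restores a range test on bitnum before comparing the shift amount against what is left of the precision. The sketch below shows the same shape of check on plain integers; shift_fits and its parameter names are hypothetical, and the types are an assumption standing in for the tree and wide-int operands. Subtracting bitnum from the precision, rather than adding it to the shift amount, keeps the comparison free of overflow even for very large shift constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a right shift by SHIFT can be folded into a test of bit
   BITNUM, i.e. if BITNUM + SHIFT still names a bit below PREC.  */
bool
shift_fits (uint64_t shift, unsigned bitnum, unsigned prec)
{
  /* The first test guarantees that prec - bitnum cannot wrap around;
     the second is the overflow-free form of shift + bitnum < prec.  */
  return bitnum < prec && shift < (uint64_t) (prec - bitnum);
}
```

Without the first test, a bitnum at or above the precision would make prec - bitnum wrap to a huge unsigned value and the second comparison would wrongly succeed.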
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Sat, Apr 19, 2014 at 09:56:02AM -0400, Jason Merrill wrote: On 04/17/2014 12:00 PM, Marek Polacek wrote: == CPP_CLOSE_PAREN))) { tree arg1 = c_parser_peek_token (parser)-value; + if (!attr_takes_id_p) +{ + /* This is for enum values, so that they can be used as + an attribute parameter; lookup_name will find their + CONST_DECLs. */ + tree ln = lookup_name (arg1); + if (ln) +arg1 = ln; +} c_parser_consume_token (parser); Instead, we should add !attr_takes_id_p to the if condition immediately above so that we parse the arguments as an expression-list. Ah, indeed. So like this? I had to add some ugliness because of Obj-C and also tweak a few tests, since we now print slightly different error message if the identifier in attribute argument isn't declared. Regtested/bootstrapped on x86_64-linux. 2014-04-22 Marek Polacek pola...@redhat.com PR c/50459 c-family/ * c-common.c (check_user_alignment): Return -1 if alignment is error node. (handle_aligned_attribute): Don't call default_conversion on FUNCTION_DECLs. (handle_vector_size_attribute): Likewise. (handle_tm_wrap_attribute): Handle case when wrap_decl is error node. (handle_sentinel_attribute): Call default_conversion and allow even integral types as an argument. c/ * c-parser.c (c_parser_attributes): Parse the arguments as an expression-list if the attribute takes identifier. testsuite/ * c-c++-common/attributes-1.c: Remove dg-error line. * c-c++-common/pr50459.c: New test. * c-c++-common/pr59280.c: Add undeclared to dg-error. * gcc.dg/nonnull-2.c: Likewise. * gcc.dg/pr55570.c: Modify dg-error. * gcc.dg/tm/wrap-2.c: Likewise. 
diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index c0e247b..df44faa 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -7418,6 +7418,8 @@ check_user_alignment (const_tree align, bool allow_zero) { int i; + if (error_operand_p (align)) +return -1; if (TREE_CODE (align) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (align))) { @@ -7539,7 +7541,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args, if (args) { align_expr = TREE_VALUE (args); - if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE) + if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE + TREE_CODE (align_expr) != FUNCTION_DECL) align_expr = default_conversion (align_expr); } else @@ -8404,9 +8407,11 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args, else { tree wrap_decl = TREE_VALUE (args); - if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE - TREE_CODE (wrap_decl) != VAR_DECL - TREE_CODE (wrap_decl) != FUNCTION_DECL) + if (error_operand_p (wrap_decl)) +; + else if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE + TREE_CODE (wrap_decl) != VAR_DECL + TREE_CODE (wrap_decl) != FUNCTION_DECL) error (%qE argument not an identifier, name); else { @@ -8533,7 +8538,8 @@ handle_vector_size_attribute (tree *node, tree name, tree args, *no_add_attrs = true; size = TREE_VALUE (args); - if (size TREE_CODE (size) != IDENTIFIER_NODE) + if (size TREE_CODE (size) != IDENTIFIER_NODE + TREE_CODE (size) != FUNCTION_DECL) size = default_conversion (size); if (!tree_fits_uhwi_p (size)) @@ -8944,8 +8950,12 @@ handle_sentinel_attribute (tree *node, tree name, tree args, if (args) { tree position = TREE_VALUE (args); + if (position TREE_CODE (position) != IDENTIFIER_NODE + TREE_CODE (position) != FUNCTION_DECL) + position = default_conversion (position); - if (TREE_CODE (position) != INTEGER_CST) + if (TREE_CODE (position) != INTEGER_CST + || !INTEGRAL_TYPE_P (TREE_TYPE (position))) { warning (OPT_Wattributes, requested position is not an integer constant); diff 
--git gcc/c/c-parser.c gcc/c/c-parser.c index 5653e49..8d91d6b 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -3943,11 +3943,16 @@ c_parser_attributes (c_parser *parser) In objective-c the identifier may be a classname. */ if (c_parser_next_token_is (parser, CPP_NAME) (c_parser_peek_token (parser)-id_kind == C_ID_ID - || (c_dialect_objc () - c_parser_peek_token (parser)-id_kind == C_ID_CLASSNAME)) + || (c_dialect_objc () + c_parser_peek_token (parser)-id_kind +== C_ID_CLASSNAME)) ((c_parser_peek_2nd_token (parser)-type == CPP_COMMA) || (c_parser_peek_2nd_token
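
To make the goal of the PR concrete, here is a small sketch of the kind of source this patch is meant to accept: an enumerator used as an attribute argument. It assumes a compiler that includes this change (or otherwise accepts constant expressions in attribute arguments); COUNTER_ALIGN, counter and counter_misalignment are names invented for the example and do not come from the new pr50459.c test.

```c
#include <stdint.h>

/* An enumerator, not a literal, supplies the alignment.  */
enum { COUNTER_ALIGN = 16 };

/* With the arguments parsed as an expression-list, the enumerator is
   looked up and folded to the integer constant 16.  */
int counter __attribute__ ((aligned (COUNTER_ALIGN)));

/* Return the offset of COUNTER from the requested alignment boundary;
   0 means the attribute took effect.  */
unsigned long
counter_misalignment (void)
{
  return (unsigned long) ((uintptr_t) &counter % COUNTER_ALIGN);
}
```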
Re: [wide-int 8/8] Formatting and typo fixes
On Tue, Apr 22, 2014 at 10:14 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Almost obvious, but just in case... The first mem_loc_descriptor hunk just reflows the text so that the line breaks are less awkward. Tested on x86_64-linux-gnu. OK to install? Ok. Thanks, Richard. Thanks, Richard Index: gcc/doc/rtl.texi === --- gcc/doc/rtl.texi2014-04-22 21:08:26.002367845 +0100 +++ gcc/doc/rtl.texi2014-04-22 21:13:54.343668582 +0100 @@ -1553,7 +1553,7 @@ neither inherently signed nor inherently signedness is determined by the rtl operation instead. On more modern ports, @code{CONST_DOUBLE} only represents floating -point values. New ports define to @code{TARGET_SUPPORTS_WIDE_INT} to +point values. New ports define @code{TARGET_SUPPORTS_WIDE_INT} to make this designation. @findex CONST_DOUBLE_LOW @@ -1571,7 +1571,7 @@ the precise bit pattern used by the targ @findex CONST_WIDE_INT @item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{}) -This contains an array of @code{HOST_WIDE_INTS} that is large enough +This contains an array of @code{HOST_WIDE_INT}s that is large enough to hold any constant that can be represented on the target. This form of rtl is only used on targets that define @code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c 2014-04-22 21:13:54.297668148 +0100 +++ gcc/dwarf2out.c 2014-04-22 21:13:54.337668526 +0100 @@ -12911,14 +12911,13 @@ mem_loc_descriptor (rtx rtl, enum machin dw_die_ref type_die; /* Note that if TARGET_SUPPORTS_WIDE_INT == 0, a -CONST_DOUBLE rtx could represent either an large integer -or a floating-point constant. If -TARGET_SUPPORTS_WIDE_INT != 0, the value is always a -floating point constant. +CONST_DOUBLE rtx could represent either a large integer +or a floating-point constant. If TARGET_SUPPORTS_WIDE_INT != 0, +the value is always a floating point constant. 
When it is an integer, a CONST_DOUBLE is used whenever -the constant requires 2 HWIs to be adequately -represented. We output CONST_DOUBLEs as blocks. */ +the constant requires 2 HWIs to be adequately represented. +We output CONST_DOUBLEs as blocks. */ if (mode == VOIDmode || (GET_MODE (rtl) == VOIDmode GET_MODE_BITSIZE (mode) != HOST_BITS_PER_DOUBLE_INT)) @@ -15147,9 +15146,9 @@ insert_wide_int (const wide_int val, un } /* We'd have to extend this code to support odd sizes. */ - gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT) == 0); + gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT) == 0); - int n = elt_size / (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT); + int n = elt_size / (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT); if (WORDS_BIG_ENDIAN) for (i = n - 1; i = 0; i--) Index: gcc/emit-rtl.c === --- gcc/emit-rtl.c 2014-04-22 21:08:26.002367845 +0100 +++ gcc/emit-rtl.c 2014-04-22 21:13:54.338668535 +0100 @@ -213,8 +213,8 @@ const_wide_int_htab_hash (const void *x) const_wide_int_htab_eq (const void *x, const void *y) { int i; - const_rtx xr = (const_rtx)x; - const_rtx yr = (const_rtx)y; + const_rtx xr = (const_rtx) x; + const_rtx yr = (const_rtx) y; if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr)) return 0; Index: gcc/fold-const.c === --- gcc/fold-const.c2014-04-22 21:13:54.308668252 +0100 +++ gcc/fold-const.c2014-04-22 21:13:54.340668554 +0100 @@ -1775,7 +1775,7 @@ fold_convert_const_fixed_from_int (tree di.low = TREE_INT_CST_ELT (arg1, 0); if (TREE_INT_CST_NUNITS (arg1) == 1) -di.high = (HOST_WIDE_INT)di.low 0 ? (HOST_WIDE_INT)-1 : 0; +di.high = (HOST_WIDE_INT) di.low 0 ? 
(HOST_WIDE_INT) -1 : 0; else di.high = TREE_INT_CST_ELT (arg1, 1); Index: gcc/rtl.c === --- gcc/rtl.c 2014-04-22 21:08:26.002367845 +0100 +++ gcc/rtl.c 2014-04-22 21:13:54.341668564 +0100 @@ -232,7 +232,7 @@ cwi_output_hex (FILE *outfile, const_rtx { int i = CWI_GET_NUM_ELEM (x); gcc_assert (i 0); - if (CWI_ELT (x, i-1) == 0) + if (CWI_ELT (x, i - 1) == 0) /* The HOST_WIDE_INT_PRINT_HEX prepends a 0x only if the val is non zero. We want all numbers to have a 0x prefix. */ fprintf (outfile, 0x); Index: gcc/rtl.h === --- gcc/rtl.h 2014-04-22 21:08:26.002367845 +0100 +++ gcc/rtl.h 2014-04-22 21:13:54.341668564 +0100
[PATCH] Update libstdc++ baseline symbols for m68k
Committed.

Andreas.

    * config/abi/post/m68k-linux-gnu/baseline_symbols.txt (CXXABI_1.3.9):
    New version.

diff --git a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
index ce247a9..bd2e67f 100644
--- a/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/m68k-linux-gnu/baseline_symbols.txt
@@ -2520,6 +2520,7 @@ OBJECT:0:CXXABI_1.3.5
 OBJECT:0:CXXABI_1.3.6
 OBJECT:0:CXXABI_1.3.7
 OBJECT:0:CXXABI_1.3.8
+OBJECT:0:CXXABI_1.3.9
 OBJECT:0:CXXABI_TM_1
 OBJECT:0:GLIBCXX_3.4
 OBJECT:0:GLIBCXX_3.4.1
-- 
1.9.2

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
[PATCH] Tweak an error msg a little
I think it's better to be consistent and always quote the transaction_wrap name, it even looks nicer.

I ran tm.exp tests, ok for trunk?

2014-04-23  Marek Polacek  pola...@redhat.com

    * c-common.c (handle_tm_wrap_attribute): Tweak error message.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 0b5ded8..a08c873 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -8421,7 +8421,7 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args,
             error ("%qD is not compatible with %qD", wrap_decl, decl);
         }
       else
-        error ("transaction_wrap argument is not a function");
+        error ("%qE argument is not a function", name);
     }
 }

    Marek
Commit: MSP430: Enhance -mhwmult option
Hi Guys,

I am applying the attached patch to enhance the -mhwmult command line option of the MSP430 backend. The option can now be used to specify which type of hardware multiplier support to enable, as well as simply enabling or disabling that support.

The default behaviour is now to enable hardware multiply support based upon the -mmcu command line option used. If no -mmcu option has been specified, or the mcu name is unrecognised, then the normal 32-bit hardware support will be enabled.

The patch also fixes the parsing of the -mmcu= and -mcpu= command line options so that the last one specified takes precedence.

Cheers
  Nick

gcc/ChangeLog
2014-04-23  Nick Clifton  ni...@redhat.com

    * config/msp430/msp430.c (msp430_handle_option): Move function
    to msp430-common.c
    (msp430_option_override): Simplify mcu and mcpu option handling.
    (msp430_is_f5_mcu): Rename to msp430_use_f5_series_hwmult. Add
    support for -mhwmult command line option.
    (has_32bit_hwmult): Rename to use_32bit_hwmult. Add support for
    -mhwmult command line option.
    (msp430_hwmult_enabled): Delete.
    (msp430_output_labelref): Add support for -mhwmult command line
    option.
    * config/msp430/msp430.md (mulhisi3, umulhisi3, mulsidi3)
    (umulsidi3): Likewise.
    * config/msp430/msp430.opt (mmcu): Add Report attribute.
    (mcpu, mlarge, msmall): Likewise.
    (mhwmult): New option.
    * config/msp430/msp430-protos.h (msp430_hwmult_enabled): Remove
    prototype.
    (msp430_is_f5_mcu): Remove prototype.
    (msp430_use_f5_series_hwmult): Add prototype.
    * config/msp430/msp430-opts.h: New file.
    * common/config/msp430: New directory.
    * common/config/msp430/msp430-common.c: New file.
    * config.gcc (msp430): Remove target_has_targetm_common.
    * doc/invoke.texi: Document -mhwmult command line option.

Attachment: msp430.opts.patch.xz (application/xz)
Re: [PATCH] Tweak an error msg a little
On Wed, Apr 23, 2014 at 12:22 PM, Marek Polacek <pola...@redhat.com> wrote:
> I think it's better to be consistent and always quote the transaction_wrap
> name, it even looks nicer. I ran tm.exp tests, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-04-23  Marek Polacek  <pola...@redhat.com>
>
> 	* c-common.c (handle_tm_wrap_attribute): Tweak error message.
>
> diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
> index 0b5ded8..a08c873 100644
> --- gcc/c-family/c-common.c
> +++ gcc/c-family/c-common.c
> @@ -8421,7 +8421,7 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args,
>  	    error ("%qD is not compatible with %qD", wrap_decl, decl);
>  	}
>        else
> -	error ("transaction_wrap argument is not a function");
> +	error ("%qE argument is not a function", name);
>      }
>  }
>
> 	Marek
-fuse-caller-save - Collect register usage information
On 22-04-14 17:05, Tom de Vries wrote: I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE. Vladimir, This is the updated version of the previously approved patch http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html , updated for the new hook call_fusage_contains_non_callee_clobbers. The only difference is in the functions get_call_reg_set_usage and collect_fn_hard_reg_usage which use the hook. OK for trunk? Thanks, - Tom 2013-04-29 Radovan Obradovic robrado...@mips.com Tom de Vries t...@codesourcery.com * cgraph.h (struct cgraph_node): Add function_used_regs, function_used_regs_initialized and function_used_regs_valid fields. * final.c: Move include of hard-reg-set.h to before rtl.h to declare find_all_hard_reg_sets. (collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node) (get_call_reg_set_usage): New function. (rest_of_handle_final): Use collect_fn_hard_reg_usage. diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 15310d8..eb0fe8e 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -408,6 +408,15 @@ public: /* Time profiler: first run of function. */ int tp_first_run; + /* Call unsaved hard registers really used by the corresponding + function (including ones used by functions called by the + function). */ + HARD_REG_SET function_used_regs; + /* Set if function_used_regs is initialized. */ + unsigned function_used_regs_initialized: 1; + /* Set if function_used_regs is valid. */ + unsigned function_used_regs_valid: 1; + /* Set when decl is an abstract function pointed to by the ABSTRACT_DECL_ORIGIN of a reachable function. */ unsigned used_as_abstract_origin : 1; diff --git a/gcc/final.c b/gcc/final.c index 83abee2..0b1947d 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. If not see #include tree.h #include varasm.h +#include hard-reg-set.h #include rtl.h #include tm_p.h #include regs.h @@ -57,7 +58,6 @@ along with GCC; see the file COPYING3. 
If not see #include recog.h #include conditions.h #include flags.h -#include hard-reg-set.h #include output.h #include except.h #include function.h @@ -223,6 +223,7 @@ static int alter_cond (rtx); static int final_addr_vec_align (rtx); #endif static int align_fuzz (rtx, rtx, int, unsigned); +static void collect_fn_hard_reg_usage (void); /* Initialize data in final at the beginning of a compilation. */ @@ -4425,6 +4426,7 @@ rest_of_handle_final (void) assemble_start_function (current_function_decl, fnname); final_start_function (get_insns (), asm_out_file, optimize); final (get_insns (), asm_out_file, optimize); + collect_fn_hard_reg_usage (); final_end_function (); /* The IA-64 .handlerdata directive must be issued before the .endp @@ -4720,3 +4722,119 @@ make_pass_clean_state (gcc::context *ctxt) { return new pass_clean_state (ctxt); } + +/* Collect hard register usage for the current function. */ + +static void +collect_fn_hard_reg_usage (void) +{ + rtx insn; + int i; + struct cgraph_node *node; + + if (!flag_use_caller_save) +return; + + node = cgraph_get_node (current_function_decl); + gcc_assert (node != NULL); + + gcc_assert (!node-function_used_regs_initialized); + node-function_used_regs_initialized = 1; + + for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn)) +{ + HARD_REG_SET insn_used_regs; + + if (!NONDEBUG_INSN_P (insn)) + continue; + + find_all_hard_reg_sets (insn, insn_used_regs, false); + + if (CALL_P (insn) + (!targetm.call_fusage_contains_non_callee_clobbers () + || !get_call_reg_set_usage (insn, insn_used_regs, call_used_reg_set))) + { + CLEAR_HARD_REG_SET (node-function_used_regs); + return; + } + + IOR_HARD_REG_SET (node-function_used_regs, insn_used_regs); +} + + /* Be conservative - mark fixed and global registers as used. 
*/ + IOR_HARD_REG_SET (node-function_used_regs, fixed_reg_set); + for (i = 0; i FIRST_PSEUDO_REGISTER; i++) +if (global_regs[i]) + SET_HARD_REG_BIT (node-function_used_regs, i); + +#ifdef STACK_REGS + /* Handle STACK_REGS conservatively, since the df-framework does not + provide accurate information for them. */ + + for (i = FIRST_STACK_REG; i = LAST_STACK_REG; i++) +SET_HARD_REG_BIT (node-function_used_regs, i); +#endif + + node-function_used_regs_valid = 1; +} + +/* Get the declaration of the function called by INSN. */ + +static tree +get_call_fndecl (rtx insn) +{ + rtx note, datum; + + if (!flag_use_caller_save) +return NULL_TREE; + + note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX); + if (note == NULL_RTX) +return NULL_TREE; + + datum = XEXP (note, 0); + if (datum != NULL_RTX) +return SYMBOL_REF_DECL (datum); + + return NULL_TREE; +} + +static struct cgraph_node * +get_call_cgraph_node (rtx insn) +{ + tree
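For readers following the collection logic in collect_fn_hard_reg_usage, here is a minimal sketch of the same idea outside GCC. It is not the patch's code: a plain 64-bit mask stands in for HARD_REG_SET, and the names reg_set, insn_info and collect_used_regs are hypothetical. It only illustrates the shape of the loop: OR together the registers touched by each insn, give up (leaving the set empty) on a call whose register usage is unknown, and conservatively add the fixed registers at the end.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for HARD_REG_SET: one bit per hard register.  */
typedef uint64_t reg_set;

/* Hypothetical, simplified view of one instruction.  */
struct insn_info
{
  reg_set sets;         /* registers set by the insn itself */
  bool is_call;         /* true for call insns */
  bool callee_known;    /* true if the callee's register usage is known */
  reg_set callee_regs;  /* registers the callee is known to clobber */
};

/* Accumulate the registers used by a function body into *USED.
   Return false, leaving *USED empty, as soon as a call with unknown
   register usage is seen; this mirrors the early return in the patch.  */
bool
collect_used_regs (const struct insn_info *insns, size_t n,
                   reg_set fixed_regs, reg_set *used)
{
  *used = 0;
  for (size_t i = 0; i < n; i++)
    {
      reg_set insn_used = insns[i].sets;

      if (insns[i].is_call)
        {
          if (!insns[i].callee_known)
            {
              *used = 0;
              return false;
            }
          insn_used |= insns[i].callee_regs;
        }

      *used |= insn_used;
    }

  /* Be conservative: fixed registers always count as used.  */
  *used |= fixed_regs;
  return true;
}
```

A caller would treat a false return the way the patch treats an unset function_used_regs_valid bit: no information is available and the normal call-clobbered set has to be assumed.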
Re: RFA: x86 backend: Add default-manifest to Cygwin/MinGW links
Hi Nick,

On Apr 23 10:41, Nicholas Clifton wrote:
> Hi Corinna,
>
> > However, we know that the act of merging will currently result in
> > broken resources in the executable. Wouldn't it be better to apply
> > the above patch only after the resource merge fix?
>
> No. Well not in my opinion. :-) The reason is that this patch only
> makes a difference if the default manifest can be found in a library
> search path. If there is none present then nothing happens. So you can
> disable the (broken) merging of a default manifest file by simply not
> having it present. Which should be the case for all current
> installations.
>
> Plus - I am hoping to fix the resource merging problem soon. (Any day
> now, honest). So I would like to have the gcc patch in place for when
> that happens.

Ok, sounds fine to me.

Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
Re: [PATCH, ARM] Improve 64 bit division performance
Ping?

Ramana mentioned at Linaro Connect that this should be tested on more platforms. I've now checked this on qemu with no regressions on trunk for:

arm-unknown-linux-gnueabihf  v7-A: ARM and Thumb-2
arm-unknown-linux-gnueabi    v4t, v5t, v6: ARM

OK for trunk?

Archive link: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01611.html

On 27 February 2014 16:38, Charles Baylis <charles.bay...@linaro.org> wrote:
> [resending as text/plain]
>
> Hi
>
> These patches optimise 64 bit division by removing the use of the
> __gnu_[u]ldivmod_helper functions and hence avoiding the redundant
> calculation of the remainder in those functions.
>
> Bootstrapped, tested and checked for arm-unknown-linux-gnueabihf.
>
> Benchmarked on Chromebook and Raspberry Pi using attached divbench3.c.
> Loop1 varies the divisor and loop2 varies the dividend.
>
> Chromebook:
> before:
> loop1 unsigned:  3.474419
> loop2 unsigned:  6.564871
> loop1 signed:    4.127967
> loop2 signed:    6.071490
> after:
> loop1 unsigned:  2.781364
> loop2 unsigned:  6.166478
> loop1 signed:    2.800974
> loop2 signed:    6.129588
>
> Raspberry Pi:
> before:
> loop1 unsigned: 28.881753
> loop2 unsigned: 19.876385
> loop1 signed:   32.074941
> loop2 signed:   20.594860
> after:
> loop1 unsigned: 24.893846
> loop2 unsigned: 19.537562
> loop1 signed:   25.334509
> loop2 signed:   19.615088
>
> Any comments? OK for stage 1?
>
> Patch 1:
> 2014-02-27  Charles Baylis  <charles.bay...@linaro.org>
>
> 	* config/arm/bpabi.S (__aeabi_uldivmod): Perform division using
> 	call to __udivmoddi4.
>
> Patch 2:
> 2014-02-27  Charles Baylis  <charles.bay...@linaro.org>
>
> 	* config/arm/bpabi.S (__aeabi_ldivmod): Perform signed division
> 	via call to __udivmoddi4 and fixing up for negative operands.
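The approach in patch 2 (one unsigned division plus sign fix-ups) can be sketched in portable C. This is only an illustration, not the bpabi.S assembly: the plain / operator stands in for the call to __udivmoddi4, and the names sdivmod_result and sdivmod64 are made up for the example. It assumes a non-zero divisor and relies on GCC's wrapping behaviour when converting an out-of-range unsigned value back to int64_t.

```c
#include <stdint.h>

/* Hypothetical result type; the real __aeabi_ldivmod returns the
   quotient and remainder in registers.  */
struct sdivmod_result
{
  int64_t quot;
  int64_t rem;
};

/* Signed 64-bit division built on a single unsigned division, with the
   signs fixed up afterwards.  Assumes den != 0.  */
struct sdivmod_result
sdivmod64 (int64_t num, int64_t den)
{
  /* Take magnitudes in unsigned arithmetic so INT64_MIN is handled.  */
  uint64_t unum = num < 0 ? 0 - (uint64_t) num : (uint64_t) num;
  uint64_t uden = den < 0 ? 0 - (uint64_t) den : (uint64_t) den;

  /* One division; the remainder falls out of it without a second
     divide.  */
  uint64_t uquot = unum / uden;
  uint64_t urem = unum - uquot * uden;

  struct sdivmod_result r;
  /* The quotient is negative when the operand signs differ; the
     remainder takes the sign of the dividend (truncating division).  */
  r.quot = (int64_t) (((num < 0) != (den < 0)) ? 0 - uquot : uquot);
  r.rem = (int64_t) (num < 0 ? 0 - urem : urem);
  return r;
}
```

The point of the sketch is that quotient and remainder both come from the same unsigned operation, which is the redundancy the patches remove from the old helper functions.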
[PATCH] Fix PR60903
LIM fails to properly mark new blocks/edges it creates as belonging to irreducible regions. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk and 4.9 branch. Richard. 2014-04-23 Richard Biener rguent...@suse.de PR tree-optimization/60903 * tree-ssa-loop-im.c (analyze_memory_references): Remove commented code block. (execute_sm_if_changed): Properly apply IRREDUCIBLE_LOOP loop flags to newly created BBs and edges. * gcc.dg/torture/pr60903.c: New testcase. Index: gcc/tree-ssa-loop-im.c === *** gcc/tree-ssa-loop-im.c (revision 209677) --- gcc/tree-ssa-loop-im.c (working copy) *** analyze_memory_references (void) *** 1544,1558 struct loop *loop, *outer; unsigned i, n; - #if 0 - /* Initialize bb_loop_postorder with a mapping from loop-num to - its postorder index. */ - i = 0; - bb_loop_postorder = XNEWVEC (unsigned, number_of_loops (cfun)); - FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) - bb_loop_postorder[loop-num] = i++; - #endif - /* Collect all basic-blocks in loops and sort them after their loops postorder. */ i = 0; --- 1547,1552 *** execute_sm_if_changed (edge ex, tree mem *** 1807,1812 --- 1803,1809 gimple_stmt_iterator gsi; gimple stmt; struct prev_flag_edges *prev_edges = (struct prev_flag_edges *) ex-aux; + bool irr = ex-flags EDGE_IRREDUCIBLE_LOOP; /* ?? Insert store after previous store if applicable. See note below. */ *** execute_sm_if_changed (edge ex, tree mem *** 1821,1828 old_dest = ex-dest; new_bb = split_edge (ex); then_bb = create_empty_bb (new_bb); ! if (current_loops new_bb-loop_father) ! add_bb_to_loop (then_bb, new_bb-loop_father); gsi = gsi_start_bb (new_bb); stmt = gimple_build_cond (NE_EXPR, flag, boolean_false_node, --- 1818,1826 old_dest = ex-dest; new_bb = split_edge (ex); then_bb = create_empty_bb (new_bb); ! if (irr) ! then_bb-flags = BB_IRREDUCIBLE_LOOP; ! 
add_bb_to_loop (then_bb, new_bb-loop_father); gsi = gsi_start_bb (new_bb); stmt = gimple_build_cond (NE_EXPR, flag, boolean_false_node, *** execute_sm_if_changed (edge ex, tree mem *** 1834,1842 stmt = gimple_build_assign (unshare_expr (mem), tmp_var); gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING); ! make_edge (new_bb, then_bb, EDGE_TRUE_VALUE); ! make_edge (new_bb, old_dest, EDGE_FALSE_VALUE); ! then_old_edge = make_edge (then_bb, old_dest, EDGE_FALLTHRU); set_immediate_dominator (CDI_DOMINATORS, then_bb, new_bb); --- 1832,1843 stmt = gimple_build_assign (unshare_expr (mem), tmp_var); gsi_insert_after (gsi, stmt, GSI_CONTINUE_LINKING); ! make_edge (new_bb, then_bb, !EDGE_TRUE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0)); ! make_edge (new_bb, old_dest, !EDGE_FALSE_VALUE | (irr ? EDGE_IRREDUCIBLE_LOOP : 0)); ! then_old_edge = make_edge (then_bb, old_dest, !EDGE_FALLTHRU | (irr ? EDGE_IRREDUCIBLE_LOOP : 0)); set_immediate_dominator (CDI_DOMINATORS, then_bb, new_bb); Index: gcc/testsuite/gcc.dg/torture/pr60903.c === *** gcc/testsuite/gcc.dg/torture/pr60903.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr60903.c (working copy) *** *** 0 --- 1,22 + /* { dg-do compile } */ + + extern int a, b, k, q; + + void + foo () + { + if (a) + { + while (q) + { + lbl: + if (a) + { + a = 0; + goto lbl; + } + } + b = k; + } + goto lbl; + }
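The flag handling in the fix can be shown in isolation. The sketch below is an assumption-laden model rather than GCC code: the flag constants are hypothetical values (GCC's real EDGE_* bits differ) and new_edge_flags is an invented helper. It shows the rule the patch applies to each make_edge call: a new edge gets its requested kind, and additionally inherits the irreducible-region bit from the edge that was split.

```c
/* Hypothetical flag values; GCC's real EDGE_* constants differ.  */
enum
{
  EDGE_FALLTHRU_F = 1 << 0,
  EDGE_TRUE_F = 1 << 1,
  EDGE_FALSE_F = 1 << 2,
  EDGE_IRREDUCIBLE_F = 1 << 3
};

/* Flags for an edge created while splitting an edge whose flags are
   PARENT_FLAGS: keep the requested KIND and inherit only the
   irreducible-region bit.  */
int
new_edge_flags (int parent_flags, int kind)
{
  int irr = parent_flags & EDGE_IRREDUCIBLE_F;
  return kind | (irr ? EDGE_IRREDUCIBLE_F : 0);
}
```

Other bits of the parent edge, such as its fallthru status, are deliberately not copied, matching the three make_edge calls in the diff.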
Add clobber_reg
On 22-04-14 17:05, Tom de Vries wrote:
> I've updated the fuse-caller-save patch series to model non-callee call
> clobbers in CALL_INSN_FUNCTION_USAGE.

Eric,

Richard Sandiford mentioned here ( http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00870.html ):
...
Although we really should have a utility function like use_reg, but for clobbers, so that the above would become:

  clobber_reg (CALL_INSN_FUNCTION_USAGE (insn), gen_rtx_REG (word_mode, 18));
...

I've implemented a patch that adds clobber_reg and clobber_reg_mode, similar to use_reg and use_reg_mode.

Bootstrapped and reg-tested on x86_64 as part of the fuse-caller-save series.

OK for trunk?

Thanks,
- Tom

2014-04-18  Tom de Vries  <t...@codesourcery.com>

	* expr.c (clobber_reg_mode): New function.
	* expr.h (clobber_reg): New function.

diff --git a/gcc/expr.c b/gcc/expr.c
index 72e4401..fc58eb7f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2396,6 +2396,18 @@ use_reg_mode (rtx *call_fusage, rtx reg, enum machine_mode mode)
     = gen_rtx_EXPR_LIST (mode, gen_rtx_USE (VOIDmode, reg), *call_fusage);
 }
 
+/* Add a CLOBBER expression for REG to the (possibly empty) list pointed
+   to by CALL_FUSAGE.  REG must denote a hard register.  */
+
+void
+clobber_reg_mode (rtx *call_fusage, rtx reg, enum machine_mode mode)
+{
+  gcc_assert (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER);
+
+  *call_fusage
+    = gen_rtx_EXPR_LIST (mode, gen_rtx_CLOBBER (VOIDmode, reg), *call_fusage);
+}
+
 /* Add USE expressions to *CALL_FUSAGE for each of NREGS consecutive regs,
    starting at REGNO.  All of these registers must be hard registers.  */
 
diff --git a/gcc/expr.h b/gcc/expr.h
index 524da67..1823feb 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -346,6 +346,7 @@ extern void copy_blkmode_from_reg (rtx, rtx, tree);
 /* Mark REG as holding a parameter for the next CALL_INSN.
    Mode is TYPE_MODE of the non-promoted parameter, or VOIDmode.  */
 extern void use_reg_mode (rtx *, rtx, enum machine_mode);
+extern void clobber_reg_mode (rtx *, rtx, enum machine_mode);
 
 extern rtx copy_blkmode_to_reg (enum machine_mode, tree);
 
@@ -356,6 +357,13 @@ use_reg (rtx *fusage, rtx reg)
   use_reg_mode (fusage, reg, VOIDmode);
 }
 
+/* Mark REG as clobbered by the call with FUSAGE as CALL_INSN_FUNCTION_USAGE.  */
+static inline void
+clobber_reg (rtx *fusage, rtx reg)
+{
+  clobber_reg_mode (fusage, reg, VOIDmode);
+}
+
 /* Mark NREGS consecutive regs, starting at REGNO, as holding parameters
    for the next CALL_INSN.  */
 extern void use_regs (rtx *, int, int);
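As a rough model of what use_reg and clobber_reg do to the usage list, the sketch below prepends tagged entries to a singly linked list. It is not RTL: usage_node, add_usage, use_reg_model and clobber_reg_model are hypothetical names chosen for the illustration, and a plain register number replaces the rtx. The only property it is meant to show is that both helpers push onto the front of the same list and differ only in the kind of entry.

```c
#include <stdlib.h>

enum usage_kind { USAGE_USE, USAGE_CLOBBER };

/* Hypothetical stand-in for one EXPR_LIST entry in
   CALL_INSN_FUNCTION_USAGE.  */
struct usage_node
{
  enum usage_kind kind;
  int regno;
  struct usage_node *next;
};

/* Prepend an entry of KIND for REGNO to *LIST.  Return 0 on success and
   -1 if allocation fails.  */
int
add_usage (struct usage_node **list, enum usage_kind kind, int regno)
{
  struct usage_node *n = malloc (sizeof *n);
  if (n == NULL)
    return -1;
  n->kind = kind;
  n->regno = regno;
  n->next = *list;
  *list = n;
  return 0;
}

/* Model of use_reg: record that the call reads REGNO.  */
int
use_reg_model (struct usage_node **list, int regno)
{
  return add_usage (list, USAGE_USE, regno);
}

/* Model of clobber_reg: record that the call clobbers REGNO.  */
int
clobber_reg_model (struct usage_node **list, int regno)
{
  return add_usage (list, USAGE_CLOBBER, regno);
}
```

Because entries are prepended, the most recently added register is found first when the list is walked.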
Re: [PATCH, ARM] Suppress Redundant Flag Setting for Cortex-A15
Hi, On 28 January 2014 13:10, Ramana Radhakrishnan ramana@googlemail.com wrote: On Fri, Jan 24, 2014 at 5:16 PM, Ian Bolton ian.bol...@arm.com wrote: Hi there! An existing optimisation for Thumb-2 converts t32 encodings to t16 encodings to reduce codesize, at the expense of causing redundant flag setting for ADD, AND, etc. This redundant flag setting can have negative performance impact on cortex-a15. This patch introduces two new tuning options so that the conversion from t32 to t16, which takes place in thumb2_reorg, can be suppressed for cortex-a15. To maintain some of the original benefit (reduced codesize), the suppression is only done where the enclosing basic block is deemed worthy of optimising for speed. This tested with no regressions and performance has improved for the workloads tested on cortex-a15. (It might be beneficial to other processors too, but that has not been investigated yet.) OK for stage 1? This is OK for stage1. Ramana Cheers, Ian 2014-01-24 Ian Bolton ian.bol...@arm.com gcc/ * config/arm/arm-protos.h (tune_params): New struct members. * config/arm/arm.c: Initialise tune_params per processor. (thumb2_reorg): Suppress conversion from t32 to t16 when optimizing for speed, based on new tune_params. This causes gcc.target/arm/negdi-1.c gcc.target/arm/negdi-2.c to FAIL when GCC is configured as: --with-mode=ar --with-cpu=cortex-a15 --with-fpu=neon-vfpv4 both tests used to PASS. (see http://cbuild.validation.linaro.org/build/cross-validation/gcc/209561/report-build-info.html) Christophe.
Re: Remove obsolete Solaris 9 support
On Tue, Apr 22, 2014 at 2:35 PM, Rainer Orth <r...@cebitec.uni-bielefeld.de> wrote:
> Uros Bizjak <ubiz...@gmail.com> writes:
>
> > On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth <r...@cebitec.uni-bielefeld.de> wrote:
> > > Now that 4.9 has branched, it's time to actually remove the obsolete
> > > Solaris 9 configuration. Most of this is just legwork and falls under
> > > my Solaris maintainership. A couple of questions, though:
> > >
> > > * Uros: I'm removing all sse_os_support() checks from the testsuite.
> > >   Solaris 9 was the only consumer, so it seems best to do away with it.
> >
> > This is OK, but please leave sse-os-check.h (and corresponding
> > sse_os_support calls) in the testsuite. Just remove the Solaris 9
> > specific code from sse-os-check.h and always return 1, perhaps with
> > the comment that all currently supported OSes support SSE
> > instructions.
>
> Here's the final patch I've checked in, incorporating all review
> comments. I've left out the libgo (already checked in by Ian) and
> classpath parts.

It looks to me that one part was left in libgcc/config/i386/crtfastmath.c:

#if !defined __x86_64__ && defined __sun__ && defined __svr4__
#include <signal.h>
#include <ucontext.h>
...
#endif
Re: [Patch ARM] Allow any register for DImode values in Thumb2.
On 27 February 2014 14:58, Ramana Radhakrishnan ramra...@arm.com wrote: Hi I noticed that for T32 we don't allow any old register for DImode values. The restriction of an even register is true only for ARM state because the ISA doesn't allow any old register in this place. In a few large .i files that I had knocking about, noticed a nice drop in stack usage and a generally improved register allocation strategy. Queued for stage1 after suitable testing including a bootstrap and regression test in Thumb2 found no issues. regards Ramana DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com * config/arm/arm.c (arm_hard_regno_mode_ok): Loosen restrictions on core registers for DImode values in Thumb2. Hi Ramana, I've noticed some regressions after this patch has been committed (rev 209615): gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-loops gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fno-use-linker-plugin -flto-partition=none gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects Now all produce ICE in several GCC configurations (mostly when generating thumb code) eg: --target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb --target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb but it's OK for target arm-none-linux-gnueabihf. See http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html Christophe.
Re: [Patch ARM] Allow any register for DImode values in Thumb2.
On Wed, Apr 23, 2014 at 1:53 PM, Christophe Lyon <christophe.l...@linaro.org> wrote:
> On 27 February 2014 14:58, Ramana Radhakrishnan <ramra...@arm.com> wrote:
> > Hi
> >
> > I noticed that for T32 we don't allow any old register for DImode
> > values. The restriction of an even register is true only for ARM state
> > because the ISA doesn't allow any old register in this place. In a few
> > large .i files that I had knocking about, noticed a nice drop in stack
> > usage and a generally improved register allocation strategy.
> >
> > Queued for stage1 after suitable testing including a bootstrap and
> > regression test in Thumb2 found no issues.
> >
> > regards
> > Ramana
> >
> > DATE  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>
> >
> > 	* config/arm/arm.c (arm_hard_regno_mode_ok): Loosen
> > 	restrictions on core registers for DImode values in Thumb2.
>
> Hi Ramana,
>
> I've noticed some regressions after this patch has been committed (rev 209615):
>
> gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
> gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-loops
> gcc.c-torture/execute/scal-to-vec1.c compilation, -O2
> gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fno-use-linker-plugin -flto-partition=none
> gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
>
> Now all produce ICE in several GCC configurations (mostly when
> generating thumb code) eg:
> --target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb
> --target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb

Thanks for the report - I'll have a look. I've had this in a tree for testing for some time that runs these configurations, at least the bare-metal arm-none-eabi one with multilib testing for thumb.

> but it's OK for target arm-none-linux-gnueabihf.
>
> See http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html
>
> Christophe.
Add post_expand_call_insn hook
On 22-04-14 17:05, Tom de Vries wrote: I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE. Eric, this patch adds a post_expand_call_insn hook. The hook is called right after expansion of calls, and allows a target to do additional processing, such as f.i. adding clobbers to CALL_INSN_FUNCTION_USAGE. Instead of using the hook, we could add code to the preparation statements operand of the different call expands, but that requires those expands not to use the rtl template, and generate all the rtl through c code. Which requires a rewrite of the call expands in case of Aarch64. Bootstrapped and reg-tested on x86_64 as part of the fuse-caller-save patch series. OK for trunk? Thanks, - Tom 2014-04-18 Tom de Vries t...@codesourcery.com * target.def (post_expand_call_insn): New DEFHOOK. * calls.c (expand_call, emit_library_call_value_1): Call post_expand_call_insn hook. * tm.texi.in (@section Storage Layout): Add hook TARGET_POST_EXPAND_CALL_INSN. * hooks.c (hook_void_rtx): New function. * hooks.h (hook_void_rtx): Declare function. diff --git a/gcc/calls.c b/gcc/calls.c index e798c7a..0777a02 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -3507,6 +3507,8 @@ expand_call (tree exp, rtx target, int ignore) free (stack_usage_map_buf); + targetm.post_expand_call_insn (last_call_insn ()); + return target; } @@ -4344,6 +4346,8 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value, free (stack_usage_map_buf); + targetm.post_expand_call_insn (last_call_insn ()); + return value; } diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 8af8efd..40b5bb1 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -1408,6 +1408,11 @@ registers whenever the function being expanded has any SDmode usage. @end deftypefn +@deftypefn {Target Hook} void TARGET_POST_EXPAND_CALL_INSN (rtx) +This hook is called just after expansion of a call_expr into rtl, allowing +the target to perform additional processing. 
+@end deftypefn + @deftypefn {Target Hook} void TARGET_INSTANTIATE_DECLS (void) This hook allows the backend to perform additional instantiations on rtl that are not actually in any insns yet, but will be later. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 8991c3c..812b0b8 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -1285,6 +1285,8 @@ The default definition of this macro returns false for all sizes. @hook TARGET_EXPAND_TO_RTL_HOOK +@hook TARGET_POST_EXPAND_CALL_INSN + @hook TARGET_INSTANTIATE_DECLS @hook TARGET_MANGLE_TYPE diff --git a/gcc/hooks.c b/gcc/hooks.c index 1c67bdf..53e8591 100644 --- a/gcc/hooks.c +++ b/gcc/hooks.c @@ -461,6 +461,13 @@ hook_void_rtx_int (rtx insn ATTRIBUTE_UNUSED, int mode ATTRIBUTE_UNUSED) { } +/* Generic hook that takes a rtx and an int and returns void. */ + +void +hook_void_rtx (rtx insn ATTRIBUTE_UNUSED) +{ +} + /* Generic hook that takes a struct gcc_options * and returns void. */ void diff --git a/gcc/hooks.h b/gcc/hooks.h index 896b41d..4df5ae0 100644 --- a/gcc/hooks.h +++ b/gcc/hooks.h @@ -66,6 +66,7 @@ extern bool hook_bool_dint_dint_uint_bool_true (double_int, double_int, extern void hook_void_void (void); extern void hook_void_constcharptr (const char *); +extern void hook_void_rtx (rtx); extern void hook_void_rtx_int (rtx, int); extern void hook_void_FILEptr_constcharptr (FILE *, const char *); extern bool hook_bool_FILEptr_rtx_false (FILE *, rtx); diff --git a/gcc/target.def b/gcc/target.def index ae0bc9c..2f7178c 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -4639,6 +4639,15 @@ usage., hook_void_void) /* This target hook allows the backend to perform additional + processing after expansion of a call insn. 
*/ +DEFHOOK +(post_expand_call_insn, + This hook is called just after expansion of a call_expr into rtl, allowing\n\ +the target to perform additional processing., + void, (rtx), + hook_void_rtx) + +/* This target hook allows the backend to perform additional instantiations on rtx that are not actually in insns yet, but will be later. */ DEFHOOK
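The hook mechanism described above (a default no-op that a target may override, called once the call insn has been expanded) can be modelled with a function pointer. The sketch below is not GCC's targetm machinery; call_insn_model, target_hooks_model and the other names are hypothetical, and a bitmask stands in for CALL_INSN_FUNCTION_USAGE. Register 18 is borrowed from the clobber example quoted earlier in the series.

```c
/* Hypothetical stand-in for a call insn; CLOBBERS models the clobber
   entries in CALL_INSN_FUNCTION_USAGE as a bitmask.  */
struct call_insn_model
{
  unsigned int clobbers;
};

/* Hypothetical hook table with a single entry.  */
struct target_hooks_model
{
  void (*post_expand_call_insn) (struct call_insn_model *);
};

/* Default hook, playing the role of hook_void_rtx: do nothing.  */
void
hook_noop (struct call_insn_model *insn)
{
  (void) insn;
}

/* Example target override: mark register 18 as clobbered by every
   call.  */
void
target_clobber_r18 (struct call_insn_model *insn)
{
  insn->clobbers |= 1u << 18;
}

/* Model of the tail of expand_call: finish the generic expansion, then
   give the target a chance to post-process the new call insn.  */
void
expand_call_model (const struct target_hooks_model *hooks,
                   struct call_insn_model *insn)
{
  insn->clobbers = 0;  /* stands in for the generic expansion */
  hooks->post_expand_call_insn (insn);
}
```

A target that does not care installs hook_noop and sees no change in behaviour, which is why the hook can be added without touching existing backends.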
[Patch, Fortran, testsuite] Increase tolerance level for precision of bessel function.
Hi,

The attached patch adjusts a Fortran test to relax the precision required at one of the points on the Bessel curve. gfortran.dg/bessel_7.f90 fails for the value 3.0 because libm does not seem to be accurate enough compared to what the test expects.

I did a like-for-like run on x86 vs aarch64. The issue seems to be in the level of precision that this test checks for. At the fail point, though the two values being compared are comparable, they aren't equal.

On aarch64, it looks like this:

33 -0.138861489E+30 -0.138861319E+30 -0.17E+24 10.2699956894 T T
34 -0.304842886E+31 -0.304842493E+31 -0.39E+25 10.8117713928 T T
35 -0.689588648E+32 -0.689587681E+32 -0.97E+26 11.7649326324 T T
36 -0.160599184E+34 -0.160598952E+34 -0.23E+28 12.1240425110 T F

If you see row #36, the 2nd and 3rd column values are comparable, but not equal. The delta is indicated in the 5th column, which is greater than what the test expects - 12 ULPs.

On x86 it looks like this:

33 -0.138861508E+30 -0.138861366E+30 -0.14E+24 8.5583286285 T T
34 -0.304842916E+31 -0.304842614E+31 -0.30E+25 8.3167467117 T T
35 -0.689588696E+32 -0.689587971E+32 -0.73E+26 8.8236989975 T T
36 -0.160599184E+34 -0.160599029E+34 -0.15E+28 8.0826950073 T T

The delta on aarch64 is larger than on x86. If we increase the tolerance level for precision as shown in the patch, the test works fine for both x86 and aarch64.

Tested on aarch64-none-linux-gnu, x86_64-unknown-linux-gnu. OK for trunk?

Thanks,
Tejas.

Changelog:

2014-04-23  Tejas Belagod  <tejas.bela...@arm.com>

testsuite/
	* gfortran.dg/bessel_7.f90 (myeps): Increase precision
	tolerance level.

diff --git a/gcc/testsuite/gfortran.dg/bessel_7.f90 b/gcc/testsuite/gfortran.dg/bessel_7.f90
index 7e63ed1..c6b5f74 100644
--- a/gcc/testsuite/gfortran.dg/bessel_7.f90
+++ b/gcc/testsuite/gfortran.dg/bessel_7.f90
@@ -16,7 +16,7 @@ implicit none
 
 real,parameter :: values(*) = [0.0, 0.5, 1.0, 0.9, 1.8,2.0,3.0,4.0,4.25,8.0,34.53, 475.78]
 real,parameter :: myeps(size(values)) = epsilon(0.0) &
-                  * [2, 3, 4, 5, 8, 2, 12, 6, 7, 6, 36, 168 ]
+                  * [2, 3, 4, 5, 8, 2, 13, 6, 7, 6, 36, 168 ]
 ! The following is sufficient for me - the values above are a bit
 ! more tolerant
 ! * [0, 0, 0, 3, 3, 0, 9, 0, 2, 1, 22, 130 ]
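To make the tolerance check concrete, here is a small sketch of how the figure in the fifth column can be reproduced. It assumes that column is the relative difference between the two values divided by the single-precision epsilon; that reading is an assumption, but it agrees with the printed rows to within the rounding of the printed digits. The function name error_in_eps is made up for the example, and the computation is done in double so the division itself adds no visible error.

```c
#include <float.h>

/* Relative difference between COMPUTED and REFERENCE, expressed in
   units of the single-precision epsilon (FLT_EPSILON, which is what
   Fortran's epsilon(0.0) gives for default real).  Assumes
   REFERENCE != 0.  */
double
error_in_eps (double computed, double reference)
{
  double diff = computed - reference;
  if (diff < 0)
    diff = -diff;

  double scale = reference < 0 ? -reference : reference;
  return diff / (scale * FLT_EPSILON);
}
```

Feeding in the row 36 values, the aarch64 pair comes out slightly above 12 and the x86 pair near 8, so a limit of 12 rejects the aarch64 result while a limit of 13 accepts both.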
Re: [PATCH] Fix warning in libgfortran configure script
On 23 April 2014 10:22, Richard Earnshaw <rearn...@arm.com> wrote:
> > libgfortran/
> > 2014-04-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
> >
> > 	* configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
> > 	* configure: Regenerate.
>
> This looks fairly safe to me. My only question might be why isn't the
> variable set to one of 'yes' or 'no'?

This is due to the newlib library detection kludgery further up the file. Rather than using autoconf to probe the interface, we detect newlib, bypass the AC_CHECK_FUNC_ONCE() macro and hardwire the interface. This has the effect of leaving various ac_cv_func_* variables undefined.

Cheers
/Marcus
Re: [AArch64/ARM 3/3] Add execution tests of ARM TRN Intrinsics
On Fri, Mar 28, 2014 at 3:50 PM, Alan Lawrence alan.lawre...@arm.com wrote: Final patch in series, adds new tests of the ARM TRN Intrinsics, that also check the execution results, reusing the test bodies introduced into AArch64 in the first patch. (These tests subsume the autogenerated ones in testsuite/gcc.target/arm/neon/ that only check assembler output.) Tests use gcc.target/arm/simd/simd.exp from corresponding patch for ZIP Intrinsics, will commit that first. All tests passing on arm-none-eabi. The ARM bits are ok. testsuite/ChangeLog: 2012-03-28 Alan Lawrence alan.lawre...@arm.com * gcc.target/arm/simd/vtrnqf32_1.c: New file. * gcc.target/arm/simd/vtrnqp16_1.c: New file. * gcc.target/arm/simd/vtrnqp8_1.c: New file. * gcc.target/arm/simd/vtrnqs16_1.c: New file. * gcc.target/arm/simd/vtrnqs32_1.c: New file. * gcc.target/arm/simd/vtrnqs8_1.c: New file. * gcc.target/arm/simd/vtrnqu16_1.c: New file. * gcc.target/arm/simd/vtrnqu32_1.c: New file. * gcc.target/arm/simd/vtrnqu8_1.c: New file. * gcc.target/arm/simd/vtrnf32_1.c: New file. * gcc.target/arm/simd/vtrnp16_1.c: New file. * gcc.target/arm/simd/vtrnp8_1.c: New file. * gcc.target/arm/simd/vtrns16_1.c: New file. * gcc.target/arm/simd/vtrns32_1.c: New file. * gcc.target/arm/simd/vtrns8_1.c: New file. * gcc.target/arm/simd/vtrnu16_1.c: New file. * gcc.target/arm/simd/vtrnu32_1.c: New file. * gcc.target/arm/simd/vtrnu8_1.c: New file. diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c b/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c new file mode 100644 index 000..c9620fb --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnf32_1.c @@ -0,0 +1,12 @@ +/* Test the `vtrnf32' ARM Neon intrinsic. 
*/ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -O1 -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vtrnf32.x + +/* { dg-final { scan-assembler-times vtrn\.32\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c b/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c new file mode 100644 index 000..0ff4319 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnp16_1.c @@ -0,0 +1,12 @@ +/* Test the `vtrnp16' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -O1 -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vtrnp16.x + +/* { dg-final { scan-assembler-times vtrn\.16\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c b/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c new file mode 100644 index 000..2b047e4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnp8_1.c @@ -0,0 +1,12 @@ +/* Test the `vtrnp8' ARM Neon intrinsic. 
*/ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -O1 -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vtrnp8.x + +/* { dg-final { scan-assembler-times vtrn\.8\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c b/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c new file mode 100644 index 000..dd4e883 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnqf32_1.c @@ -0,0 +1,12 @@ +/* Test the `vtrnQf32' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -O1 -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vtrnqf32.x + +/* { dg-final { scan-assembler-times vtrn\.32\[ \t\]+\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+!?\(?:\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c b/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c new file mode 100644 index 000..374eee3 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vtrnqp16_1.c @@ -0,0 +1,12 @@ +/* Test the `vtrnQp16' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -O1 -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vtrnqp16.x + +/* { dg-final { scan-assembler-times vtrn\.16\[ \t\]+\[qQ\]\[0-9\]+,
Re: [wide-int 2/8] Fix ubsan internal-fn.c handling
Richard Sandiford rdsandif...@googlemail.com writes:

This code was mixing hprec and hprec*2 wide_ints. The simplest fix seemed to be to introduce a function that gives the minimum precision necessary to represent a function, which also means that no temporary wide_ints are needed. Other places might be able to use this too, but I'd like to look at that after the merge.

The patch series fixed a regression in c-c++-common/ubsan/overflow-2.c and I assume it's due to this change.

Tested on x86_64-linux-gnu. OK to install?

Richard B. expressed doubts about this on IRC, so for a bit more detail:

The comparisons we're doing are on the range of an SSA name. There are three ways that these ranges could be stored in the range_info_def:

(1) as INTEGER_CSTs. This was felt to be unacceptable because it would create too many garbage constants.

(2) as widest_ints. This too was unacceptable because it would bloat the range_info_def.

(3) as a form of wide_int in which the HWIs are allocated as a trailing part of the containing structure. This means that range_info_defs for 64-bit types only have 3 HWIs (smaller than now).

We went for (3).

Having decided to store the ranges like wide_ints, the question then is: what about the get/set_range_info interface? Two obvious options are:

(a) present the ranges as wide_ints.

(b) present the ranges as widest_ints, converting in and out as necessary.

(a) is more efficient and seems to fit well with the pre-ubsan callers, so that's what was chosen.

In the patch we have two wide_ints that have the same precision as the SSA name. The values we were creating via wi::min_value and wi::max_value instead had half that precision. This is the same kind of mismatch as you'd get comparing HImode and SImode in RTL, say.

We could fix the bug by using something like:

    if (wi::les_p (arg0_max, wi::mask (hprec, false, prec))
        && wi::les_p (wi::mask (hprec, true, prec), arg0_min))

etc. Or we could extend the wide_ints to widest_ints so that precision doesn't matter when doing the comparisons. But both those options involve temporaries and seem unnecessarily complicated. All we're really asking here is: what is the minimum precision needed to represent this constant? That's something that could be generally useful (e.g. when checking whether a value fits a type) so the patch adds a corresponding wi:: function.

Thanks,
Richard

Thanks,
Richard

Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c 2014-04-22 20:31:10.516895118 +0100
+++ gcc/internal-fn.c 2014-04-22 20:31:25.842005530 +0100
@@ -478,7 +478,7 @@ ubsan_expand_si_overflow_mul_check (gimp
       rtx do_overflow = gen_label_rtx ();
       rtx hipart_different = gen_label_rtx ();
 
-      int hprec = GET_MODE_PRECISION (hmode);
+      unsigned int hprec = GET_MODE_PRECISION (hmode);
       rtx hipart0 = expand_shift (RSHIFT_EXPR, mode, op0, hprec,
                                   NULL_RTX, 0);
       hipart0 = gen_lowpart (hmode, hipart0);
@@ -513,12 +513,11 @@ ubsan_expand_si_overflow_mul_check (gimp
           wide_int arg0_min, arg0_max;
           if (get_range_info (arg0, arg0_min, arg0_max) == VR_RANGE)
             {
-              if (wi::les_p (arg0_max, wi::max_value (hprec, SIGNED))
-                  && wi::les_p (wi::min_value (hprec, SIGNED), arg0_min))
+              unsigned int mprec0 = wi::min_precision (arg0_min, SIGNED);
+              unsigned int mprec1 = wi::min_precision (arg0_max, SIGNED);
+              if (mprec0 <= hprec && mprec1 <= hprec)
                 op0_small_p = true;
-              else if (wi::les_p (arg0_max, wi::max_value (hprec, UNSIGNED))
-                       && wi::les_p (~wi::max_value (hprec, UNSIGNED),
-                                     arg0_min))
+              else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
                 op0_medium_p = true;
               if (!wi::neg_p (arg0_min, TYPE_SIGN (TREE_TYPE (arg0))))
                 op0_sign = 0;
@@ -531,12 +530,11 @@ ubsan_expand_si_overflow_mul_check (gimp
           wide_int arg1_min, arg1_max;
           if (get_range_info (arg1, arg1_min, arg1_max) == VR_RANGE)
             {
-              if (wi::les_p (arg1_max, wi::max_value (hprec, SIGNED))
-                  && wi::les_p (wi::min_value (hprec, SIGNED), arg1_min))
+              unsigned int mprec0 = wi::min_precision (arg1_min, SIGNED);
+              unsigned int mprec1 = wi::min_precision (arg1_max, SIGNED);
+              if (mprec0 <= hprec && mprec1 <= hprec)
                 op1_small_p = true;
-              else if (wi::les_p (arg1_max, wi::max_value (hprec, UNSIGNED))
-                       && wi::les_p (~wi::max_value (hprec, UNSIGNED),
-                                     arg1_min))
+              else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
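
For readers who want to see the "minimum precision" idea outside GCC, here is a rough standalone sketch. It is not the wi:: implementation: min_signed_precision, classify_range and RangeClass are hypothetical names, and a plain int64_t stands in for a wide_int. It only illustrates why comparing precisions avoids building temporaries of a particular width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for wi::min_precision (x, SIGNED): the smallest
// number of bits N such that x fits in an N-bit two's complement integer.
unsigned min_signed_precision(int64_t x) {
    // For negative values the significant bits are those of ~x.
    uint64_t u = x < 0 ? ~static_cast<uint64_t>(x) : static_cast<uint64_t>(x);
    unsigned bits = 0;
    while (u != 0) {
        ++bits;
        u >>= 1;
    }
    return bits + 1;  // one extra bit for the sign
}

// Hypothetical classification mirroring the small/medium tests in the patch.
enum class RangeClass { Small, Medium, Large };

RangeClass classify_range(int64_t min, int64_t max, unsigned hprec) {
    unsigned mprec0 = min_signed_precision(min);
    unsigned mprec1 = min_signed_precision(max);
    if (mprec0 <= hprec && mprec1 <= hprec)
        return RangeClass::Small;   // both bounds fit in hprec signed bits
    if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
        return RangeClass::Medium;  // both bounds fit in hprec + 1 signed bits
    return RangeClass::Large;
}
```

With hprec = 8, a range of [-128, 127] would be classified as small and [0, 255] as medium, without ever constructing an 8-bit constant to compare against.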
Re: [Patch ARM] Allow any register for DImode values in Thumb2.
On Wed, Apr 23, 2014 at 2:06 PM, Ramana Radhakrishnan ramana@googlemail.com wrote:

On Wed, Apr 23, 2014 at 1:53 PM, Christophe Lyon christophe.l...@linaro.org wrote:

On 27 February 2014 14:58, Ramana Radhakrishnan ramra...@arm.com wrote:

Hi

I noticed that for T32 we don't allow any old register for DImode values. The restriction of an even register is true only for ARM state because the ISA doesn't allow any old register in this place. In a few large .i files that I had knocking about, noticed a nice drop in stack usage and a generally improved register allocation strategy.

Queued for stage1 after suitable testing including a bootstrap and regression test in Thumb2 found no issues.

regards
Ramana

DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

	* config/arm/arm.c (arm_hard_regno_mode_ok): Loosen restrictions on core registers for DImode values in Thumb2.

Hi Ramana,

I've noticed some regressions after this patch has been committed (rev 209615):

  gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
  gcc.c-torture/compile/pr34856.c -O3 -fomit-frame-pointer -funroll-loops
  gcc.c-torture/execute/scal-to-vec1.c compilation, -O2
  gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fno-use-linker-plugin -flto-partition=none
  gcc.c-torture/execute/scal-to-vec1.c compilation, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects

Now all produce ICE in several GCC configurations (mostly when generating thumb code) eg:

  --target arm-none-eabi --with-cpu=cortex-a9 --with-mode=thumb
  --target arm-none-linux-gnueabi --with-cpu=cortex-a9 --with-mode=thumb

Thanks for the report - I'll have a look. I've had this in a tree for testing for some time that runs these configurations

at least the bare-metal arm-none-eabi one with multilib testing for thumb. Uggh I hate it that gmail sometimes cuts off your sentences.

Needless to say, this is surprising but it's OK for target arm-none-linux-gnueabihf.

See http://cbuild.validation.linaro.org/build/cross-validation/gcc/209615/report-build-info.html

Christophe.
RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
Yeah, I think the lack of elimination is the problem. process_address eliminates $frame temporarily before checking whether the address is valid, but the places that check EXTRA_CONSTRAINT_STR pass the original uneliminated address. So the legitimate_address_p hook sees the $sp-based address but the W constraint only sees the $frame-based address (which might or might not be valid, depending on whether $frame is eliminated to the stack or hard frame pointer). I think the constraints should see the eliminated address too.

That makes sense and explains why it worked when $frame was eliminated to hard frame pointer but didn't for the stack pointer.

BTW, we might want to define something like:

  #define MODE_BASE_REG_CLASS(MODE) \
    (TARGET_MIPS16 \
     ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \
     : GR_REGS)

instead of BASE_REG_CLASS. It might lead to slightly better code (or not -- if it doesn't then don't bother :-)).

I have already tried it and no visible difference was seen.

If this patch is OK then I think the only thing blocking the switch to LRA is the asm-subreg-1.c failure. I think it'd be fine to XFAIL that test on MIPS for now, until there's a consensus about what X means for asms.

The patch worked for me and passed the regression test. Thanks.

If we were going to XFAIL the test then it would apply specifically for -mips16 -O1. In any other combination it appears to work. Would that be a stopper?

Below is the revised patch addressing all the comments and changes so far.

Regards,
Robert

2014-03-26  Robert Suchanek  robert.sucha...@imgtec.com

	* lra-constraints.c (base_to_reg): New function.
	(process_address): Use new function.
	* config/mips/constraints.md (d): BASE_REG_CLASS replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS.
	* config/mips/mips.c (mips_regno_mode_ok_for_base_p): Remove use !strict_p for MIPS16.
	(mips_register_priority): New function that implements the target hook TARGET_REGISTER_PRIORITY.
	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
	(mips_lra_p): Likewise for TARGET_LRA_P.
	* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS classes.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(BASE_REG_CLASS): Use M16_SP_REGS.
	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative tuned for LRA. New set attribute to enable alternatives depending on the register allocator used.
	(*lea64): Disable pattern for MIPS16.
	* config/mips/mips.opt (mlra): New option

diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md
index f6834fd..fa33c30 100644
--- gcc/config/mips/constraints.md
+++ gcc/config/mips/constraints.md
@@ -19,7 +19,7 @@
 
 ;; Register constraints
 
-(define_register_constraint "d" "BASE_REG_CLASS"
+(define_register_constraint "d" "TARGET_MIPS16 ? M16_REGS : GR_REGS"
   "An address register.  This is equivalent to @code{r} unless
    generating MIPS16 code.")
 
diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c
index 45256e9..81b6c26 100644
--- gcc/config/mips/mips.c
+++ gcc/config/mips/mips.c
@@ -655,7 +655,7 @@ const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   M16_REGS, M16_STORE_REGS, LEA_REGS, LEA_REGS,
   LEA_REGS, LEA_REGS, LEA_REGS, LEA_REGS,
   T_REG, PIC_FN_ADDR_REG, LEA_REGS, LEA_REGS,
-  LEA_REGS, LEA_REGS, LEA_REGS, LEA_REGS,
+  LEA_REGS, M16_SP_REGS, LEA_REGS, LEA_REGS,
   FP_REGS, FP_REGS, FP_REGS, FP_REGS,
   FP_REGS, FP_REGS, FP_REGS, FP_REGS,
@@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode,
     return true;
 
   /* In MIPS16 mode, the stack pointer can only address word and doubleword
-     values, nothing smaller.  There are two problems here:
-
-       (a) Instantiating virtual registers can introduce new uses of the
-           stack pointer.  If these virtual registers are valid addresses,
-           the stack pointer should be too.
-
-       (b) Most uses of the stack pointer are not made explicit until
-           FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated.
-           We don't know until that stage whether we'll be eliminating to the
-           stack pointer (which needs the restriction) or the hard frame
-           pointer (which doesn't).
-
-     All in all, it seems more consistent to only enforce this restriction
-     during and after reload.  */
+     values, nothing smaller.  */
   if (TARGET_MIPS16 && regno == STACK_POINTER_REGNUM)
-    return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
+    return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
 
   return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
 }
@@ -12115,6 +12102,18 @@
Re: [wide-int 2/8] Fix ubsan internal-fn.c handling
On Wed, Apr 23, 2014 at 3:29 PM, Richard Sandiford rdsandif...@googlemail.com wrote:

Richard Sandiford rdsandif...@googlemail.com writes:

This code was mixing hprec and hprec*2 wide_ints. The simplest fix seemed to be to introduce a function that gives the minimum precision necessary to represent a function, which also means that no temporary wide_ints are needed. Other places might be able to use this too, but I'd like to look at that after the merge.

The patch series fixed a regression in c-c++-common/ubsan/overflow-2.c and I assume it's due to this change.

Tested on x86_64-linux-gnu. OK to install?

Richard B. expressed doubts about this on IRC, so for a bit more detail:

The comparisons we're doing are on the range of an SSA name. There are three ways that these ranges could be stored in the range_info_def:

(1) as INTEGER_CSTs. This was felt to be unacceptable because it would create too many garbage constants.

(2) as widest_ints. This too was unacceptable because it would bloat the range_info_def.

(3) as a form of wide_int in which the HWIs are allocated as a trailing part of the containing structure. This means that range_info_defs for 64-bit types only have 3 HWIs (smaller than now).

We went for (3).

Having decided to store the ranges like wide_ints, the question then is: what about the get/set_range_info interface? Two obvious options are:

(a) present the ranges as wide_ints.

(b) present the ranges as widest_ints, converting in and out as necessary.

(a) is more efficient and seems to fit well with the pre-ubsan callers, so that's what was chosen.

In the patch we have two wide_ints that have the same precision as the SSA name. The values we were creating via wi::min_value and wi::max_value instead had half that precision. This is the same kind of mismatch as you'd get comparing HImode and SImode in RTL, say.

We could fix the bug by using something like:

    if (wi::les_p (arg0_max, wi::mask (hprec, false, prec))
        && wi::les_p (wi::mask (hprec, true, prec), arg0_min))

etc. Or we could extend the wide_ints to widest_ints so that precision doesn't matter when doing the comparisons. But both those options involve temporaries and seem unnecessarily complicated. All we're really asking here is: what is the minimum precision needed to represent this constant? That's something that could be generally useful (e.g. when checking whether a value fits a type) so the patch adds a corresponding wi:: function.

Ah, that makes sense now ;) Thus the patch is ok.

Thanks,
Richard.

Thanks,
Richard

Thanks,
Richard

Index: gcc/internal-fn.c
===
--- gcc/internal-fn.c 2014-04-22 20:31:10.516895118 +0100
+++ gcc/internal-fn.c 2014-04-22 20:31:25.842005530 +0100
@@ -478,7 +478,7 @@ ubsan_expand_si_overflow_mul_check (gimp
       rtx do_overflow = gen_label_rtx ();
       rtx hipart_different = gen_label_rtx ();
 
-      int hprec = GET_MODE_PRECISION (hmode);
+      unsigned int hprec = GET_MODE_PRECISION (hmode);
       rtx hipart0 = expand_shift (RSHIFT_EXPR, mode, op0, hprec,
                                   NULL_RTX, 0);
       hipart0 = gen_lowpart (hmode, hipart0);
@@ -513,12 +513,11 @@ ubsan_expand_si_overflow_mul_check (gimp
           wide_int arg0_min, arg0_max;
           if (get_range_info (arg0, arg0_min, arg0_max) == VR_RANGE)
             {
-              if (wi::les_p (arg0_max, wi::max_value (hprec, SIGNED))
-                  && wi::les_p (wi::min_value (hprec, SIGNED), arg0_min))
+              unsigned int mprec0 = wi::min_precision (arg0_min, SIGNED);
+              unsigned int mprec1 = wi::min_precision (arg0_max, SIGNED);
+              if (mprec0 <= hprec && mprec1 <= hprec)
                 op0_small_p = true;
-              else if (wi::les_p (arg0_max, wi::max_value (hprec, UNSIGNED))
-                       && wi::les_p (~wi::max_value (hprec, UNSIGNED),
-                                     arg0_min))
+              else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1)
                 op0_medium_p = true;
               if (!wi::neg_p (arg0_min, TYPE_SIGN (TREE_TYPE (arg0))))
                 op0_sign = 0;
@@ -531,12 +530,11 @@ ubsan_expand_si_overflow_mul_check (gimp
           wide_int arg1_min, arg1_max;
           if (get_range_info (arg1, arg1_min, arg1_max) == VR_RANGE)
             {
-              if (wi::les_p (arg1_max, wi::max_value (hprec, SIGNED))
-                  && wi::les_p (wi::min_value (hprec, SIGNED), arg1_min))
+              unsigned int mprec0 = wi::min_precision (arg1_min, SIGNED);
+              unsigned int mprec1 = wi::min_precision (arg1_max, SIGNED);
+              if (mprec0 <= hprec && mprec1 <= hprec)
                 op1_small_p = true;
-              else if (wi::les_p (arg1_max, wi::max_value (hprec,
[PATCH] Add MIPS -mxpa command line option.
Hi,

This patch adds a GCC MIPS command line option (-mxpa) to enable/disable support for the eXtended Physical Address (XPA) instructions within the assembler.

The ChangeLog and patch are shown below.

Many thanks,

Andrew

	* doc/invoke.texi: Document -mxpa and -mno-xpa MIPS command line options.
	* config/mips/mips.opt (mxpa): New option.
	* config/mips/mips.h (ASM_SPEC): Pass mxpa and mno-xpa to the assembler.

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index b25865b..91a33ef 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1176,6 +1176,7 @@ struct mips_cpu_info {
 %{mmcu} %{mno-mcu} \
 %{meva} %{mno-eva} \
 %{mvirt} %{mno-virt} \
+%{mxpa} %{mno-xpa} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 6ee5398..c992cee 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -392,6 +392,10 @@ mvirt
 Target Report Var(TARGET_VIRT)
 Use Virtualization Application Specific instructions
 
+mxpa
+Target Report Var(TARGET_XPA)
+Use eXtended Physical Address (XPA) instructions
+
 mvr4130-align
 Target Report Mask(VR4130_ALIGN)
 Perform VR4130-specific alignment optimizations
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff43f26..22a66e8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -781,6 +781,7 @@ Objective-C and Objective-C++ Dialects}.
 -mmcu -mmno-mcu @gol
 -meva -mno-eva @gol
 -mvirt -mno-virt @gol
+-mxpa -mno-xpa @gol
 -mmicromips -mno-micromips @gol
 -mfpu=@var{fpu-type} @gol
 -msmartmips -mno-smartmips @gol
@@ -17494,6 +17495,12 @@ Use (do not use) the MIPS Enhanced Virtual Addressing instructions.
 @opindex mno-virt
 Use (do not use) the MIPS Virtualization Application Specific instructions.
 
+@item -mxpa
+@itemx -mno-xpa
+@opindex mxpa
+@opindex mno-xpa
+Use (do not use) the MIPS eXtended Physical Address (XPA) instructions.
+
 @item -mlong64
 @opindex mlong64
 Force @code{long} types to be 64 bits wide.  See @option{-mlong32} for
--
1.7.1
Re: calloc = malloc + memset
On Fri, Apr 18, 2014 at 8:27 PM, Marc Glisse marc.gli...@inria.fr wrote:

Thanks for the comments!

On Fri, 18 Apr 2014, Jakub Jelinek wrote:

The passes.def change makes me a little bit nervous, but if it works, perhaps.

Would you prefer running the pass twice? I thought there would be less resistance to moving the pass than duplicating it.

Indeed. I think placing it after loops and CSE (thus what you have done) makes sense. strlenopt itself shouldn't enable much additional optimizations. But well, pass ordering is always tricky.

Didn't look at the rest of the changes, but Jakub is certainly able to approve the patch so I leave it to him.

Thanks,
Richard.

By the way, I think even passes we run only once should have the required functions implemented so they can be run several times (at least most of them), in case users want to do that in plugins. I was surprised when I tried adding a second strlen pass and the compiler refused.

--- gcc/testsuite/g++.dg/tree-ssa/calloc.C (revision 0)
+++ gcc/testsuite/g++.dg/tree-ssa/calloc.C (working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile { target c++11 } } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+#include <new>
+#include <vector>
+#include <cstdlib>
+
+void g(void*);
+inline void* operator new(std::size_t sz)
+{
+  void *p;
+
+  if (sz == 0)
+    sz = 1;
+
+  // Slightly modified from the libsupc++ version, that one has 2 calls
+  // to malloc which makes it too hard to optimize.
+  while ((p = std::malloc (sz)) == 0)
+    {
+      std::new_handler handler = std::get_new_handler ();
+      if (! handler)
+        throw std::bad_alloc();
+      handler ();
+    }
+  return p;
+}
+
+void f(void*p,int n){
+  new(p)std::vector<int>(n);
+}
+
+/* { dg-final { scan-tree-dump-times "calloc" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "memset" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

This looks to me way too much fragile, any time the libstdc++ or glibc headers change a little bit, you might need to adjust the dg-final directives. Much better would be if you just provided the prototypes yourself and subset of the std::vector you really need for the testcase. You can throw some class or int, it doesn't have to be std::bad_alloc, etc.

I don't understand what seems so fragile to you. There is a single function in the .optimized dump, which just calls calloc in a loop. It doesn't seem that likely that a change in glibc/libstdc++ would make an extra memset pop up. A change in libstdc++ could easily prevent the optimization completely (I'd like to hope we can avoid that, half of the purpose of the testcase was making sure libstdc++ didn't change in a bad way), but I don't really see how it could keep it in a way that requires tweaking dg-final.

While trying to write a standalone version, I hit again many missed optimizations, getting such nice things in the .optimized dump as:

  _12 = p_13 + sz_7;
  if (_12 != p_13)

or:

  _12 = p_13 + sz_7;
  _30 = (unsigned long) _12;
  _9 = p_13 + 4;
  _10 = (unsigned long) _9;
  _11 = _30 - _10;
  _22 = _11 /[ex] 4;
  _21 = _22;
  _40 = _21 + 1;
  _34 = _40 * 4;

It is embarrassing... I hope the combiner GSoC will work well and we can just add a dozen patterns to handle this before 4.10.

--- gcc/testsuite/gcc.dg/strlenopt-9.c (revision 208772)
+++ gcc/testsuite/gcc.dg/strlenopt-9.c (working copy)
@@ -11,21 +11,21 @@ fn1 (int r)
      optimized away.  */
   return strchr (p, '\0');
 }
 
 __attribute__((noinline, noclone)) size_t
 fn2 (int r)
 {
   char *p, q[10];
   strcpy (q, "abc");
   p = r ? "a" : q;
-  /* String length for p varies, therefore strlen below isn't
+  /* String length is constant for both alternatives, and strlen is
      optimized away.  */
   return strlen (p);

Is this because of jump threading?

It is PRE that turns:

  if (r_4(D) == 0)
    goto <bb 5>;
  else
    goto <bb 3>;

  <bb 5>:
  goto <bb 4>;

  <bb 3>:

  <bb 4>:
  # p_1 = PHI <&q(5), "a"(3)>
  _5 = __builtin_strlen (p_1);

into:

  if (r_4(D) == 0)
    goto <bb 5>;
  else
    goto <bb 3>;

  <bb 5>:
  _7 = __builtin_strlen (&q);
  pretmp_8 = _7;
  goto <bb 4>;

  <bb 3>:

  <bb 4>:
  # p_1 = PHI <&q(5), "a"(3)>
  # prephitmp_9 = PHI <pretmp_8(5), 1(3)>
  _5 = prephitmp_9;

It says:

Found partial redundancy for expression {call_expr<__builtin_strlen>,p_1}@.MEM_3 (0005)

--- gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/calloc-1.c (working copy)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include <stdlib.h>
+#include <string.h>

Even this I find unsafe. The strlenopt*.c tests use it's custom strlenopt.h header for a reason, you might just add a calloc prototype in there and use that header.

Might as well use __builtin_* then.

+/* Handle a
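
As a small illustration of the equivalence this thread relies on, the sketch below places the two-step pattern (malloc followed by a zeroing memset of the same size) next to a single calloc call. The function names are hypothetical and the code is not taken from the patch; in particular, the exact calloc arguments the pass emits are not shown in this thread, so calloc (n, 1) here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// The pattern being matched: allocate, then zero the whole block.
void* alloc_zeroed_two_step(std::size_t n) {
    void* p = std::malloc(n);
    if (p)
        std::memset(p, 0, n);
    return p;
}

// An equivalent single call (assumed form of the replacement).
void* alloc_zeroed_one_step(std::size_t n) {
    return std::calloc(n, 1);
}

// Hypothetical helper: true if the first n bytes at p are all zero.
bool all_zero(const void* p, std::size_t n) {
    const unsigned char* b = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i)
        if (b[i] != 0)
            return false;
    return true;
}
```

Both functions hand back either a null pointer or n zeroed bytes, which is why the testcases above can scan the dump for a single calloc and no remaining malloc or memset.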
Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
Robert Suchanek robert.sucha...@imgtec.com writes:

If we were going to XFAIL the test then it would apply specifically for -mips16 -O1. In any other combination it appears to work. Would that be a stopper?

Hmm, in that case maybe we should just leave it failing. The alternative would be to skip the test altogether for MIPS, with a PR referencing it, but that seems a bit over-the-top.

2014-03-26  Robert Suchanek  robert.sucha...@imgtec.com

	* lra-constraints.c (base_to_reg): New function.
	(process_address): Use new function.
	* config/mips/constraints.md (d): BASE_REG_CLASS replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS.
	* config/mips/mips.c (mips_regno_mode_ok_for_base_p): Remove use !strict_p for MIPS16.
	(mips_register_priority): New function that implements the target hook TARGET_REGISTER_PRIORITY.
	(mips_spill_class): Likewise for TARGET_SPILL_CLASS
	(mips_lra_p): Likewise for TARGET_LRA_P.
	* config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS classes.
	(REG_CLASS_NAMES): Likewise.
	(REG_CLASS_CONTENTS): Likewise.
	(BASE_REG_CLASS): Use M16_SP_REGS.
	* config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative tuned for LRA. New set attribute to enable alternatives depending on the register allocator used.
	(*lea64): Disable pattern for MIPS16.
	* config/mips/mips.opt (mlra): New option

Looks good.

@@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode,
   return 0;
 }
 
+/* Return a register priority for hard reg REGNO.  */
+
+static int
+mips_register_priority (int hard_regno)
+{
+  /* Treat MIPS16 registers with higher priority than other regs.  */
+  if (TARGET_MIPS16
+      && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
+    return 1;
+  return 0;
+}
+
 /* Implement TARGET_MEMORY_MOVE_COST.  */
 
 static int
@@ -18897,6 +18896,21 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
     *update = build2 (COMPOUND_EXPR, void_type_node, *update,
                       atomic_feraiseexcept_call);
 }
+
+static reg_class_t
+mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+                  enum machine_mode mode ATTRIBUTE_UNUSED)
+{
+  if (TARGET_MIPS16)
+    return SPILL_REGS;
+  return NO_REGS;
+}
+
+static bool
+mips_lra_p (void)
+{
+  return mips_lra_flag;
+}
 
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP

Please use comments of the form:

  /* Implement TARGET_FOO.  */

above all three functions (instead of the current one in the case of mips_register_priority), just so that it's painfully obvious that these are target hooks.

OK for the MIPS part with that change, thanks.

Out of interest, do you see any difference if you include $sp in SPILL_REGS? That obviously doesn't make much conceptual sense, but it would give a cleaner class hierarchy.

Richard
Re: [PATCH] Add MIPS -mxpa command line option.
Andrew Bennett andrew.benn...@imgtec.com writes:

	* doc/invoke.texi: Document -mxpa and -mno-xpa MIPS command line options.
	* config/mips/mips.opt (mxpa): New option.
	* config/mips/mips.h (ASM_SPEC): Pass mxpa and mno-xpa to the assembler.

OK, thanks. If your account doesn't have gcc access yet then please ask overseers@ to add it. Remember to add yourself to MAINTAINERS afterwards :-)

Or if you'd prefer not to get access, I can commit it for you.

Thanks,
Richard
Re: Remove obsolete Solaris 9 support
Uros Bizjak ubiz...@gmail.com writes:

It looks to me that one part was left in libgcc/config/i386/crtfastmath.c:

  #if !defined __x86_64__ && defined __sun__ && defined __svr4__
  #include <signal.h>
  #include <ucontext.h>
  ...
  #endif

Right, missed it because it carried no Solaris 9 comment. I'll remove it after a round of testing.

Thanks.
	Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH 00/89] Compile-time gimple-checking
On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener richard.guent...@gmail.com wrote:

On April 22, 2014 8:56:56 PM CEST, Richard Sandiford rdsandif...@googlemail.com wrote:

David Malcolm dmalc...@redhat.com writes:

Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following:

  static void
  dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc,
                      int flags)
  [...snip...]

  [...later, within pp_gimple_stmt_1:]

    case GIMPLE_SWITCH:
      dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags);
      break;

which is concise, readable, and avoid the change in pointerness compared to the gimple typedef; the local decls above would look like this:

  gimple some_stmt;  /* note how this doesn't have a star... */
  gimple_assign assign_stmt;  /* ...and neither do these */
  gimple_cond assign_stmt;
  gimple_phi phi;

I think this last proposal is my preferred API, but it requires the change to is-a.h

Attached is a proposed change to the is-a.h API that eliminates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do

  Q* q = dyn_cast<Q*> (p);

not:

  Q* q = dyn_cast<Q> (p);

Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast.

Indeed. I even wasn't aware it is different than a C++ cast...

It would be nice if you can change that with a separate patch posted in a separate thread to be more visible.

Also I see you introduce a const_FOO class with every FOO one. I wonder whether, now that we have C++, can address const-correctness in a less awkward way than with a typedef. Can you try to go back in time and see why we did with that in the first place? ISTR that it was "oh, if we were only using C++ we wouldn't need to jump through that hoop".

Thanks,
Richard.

Richard.

If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one.

Thanks,
Richard
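
To make the dyn_cast<Q*> spelling discussed above concrete, here is a rough standalone sketch of a checked downcast whose template argument is the pointer type, so that it also works through a pointer typedef. This is not the is-a.h implementation: Stmt, SwitchStmt, AssignStmt, StmtKind, kKind and dyn_cast_sketch are all hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical statement hierarchy with a kind tag, standing in for gimple.
enum class StmtKind { Assign, Switch };

struct Stmt {
    StmtKind kind;
    explicit Stmt(StmtKind k) : kind(k) {}
};

struct SwitchStmt : Stmt {
    static constexpr StmtKind kKind = StmtKind::Switch;
    SwitchStmt() : Stmt(StmtKind::Switch) {}
};

struct AssignStmt : Stmt {
    static constexpr StmtKind kKind = StmtKind::Assign;
    AssignStmt() : Stmt(StmtKind::Assign) {}
};

// Pointer typedefs in the style of 'gimple' and 'gimple_switch'.
typedef Stmt* stmt_ptr;
typedef SwitchStmt* switch_ptr;

// The template argument is the pointer type, as with C++'s dynamic_cast,
// so a pointer typedef can be passed directly.
template <typename PtrT>
PtrT dyn_cast_sketch(Stmt* s) {
    typedef typename std::remove_pointer<PtrT>::type T;
    if (s && s->kind == T::kKind)
        return static_cast<PtrT>(s);
    return nullptr;
}
```

Under these assumptions a caller could write `if (switch_ptr sw = dyn_cast_sketch<switch_ptr> (stmt))` with no star in sight, which is the property the typedef-based proposal is after.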
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On 04/23/2014 05:47 AM, Richard Biener wrote:

On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote:

On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com wrote:

Kyrill Tkachov kyrylo.tkac...@arm.com writes:

Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html

Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that...

Sorry for the late reply. I hadn't forgotten, but I wanted to wait until I had chance to look into the ICE before replying, which I haven't had chance to do yet.

They are separable issues, so, I checked in the change.

It's a shame we can't use C++ style casts, but I suppose that's the price to pay for being able to write "unsigned HOST_WIDE_INT".

unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were expecting a typedef or better. I slightly prefer the int (1) style, but I think we should go the direction of the patch.

Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and require a 64bit integer type on the host and force all targets to use a 64bit 'hwi'. Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate related changes).

Richard.

I should point out that there is a community that wants to go in the opposite direction here. They are the people with real 32 bit hosts who want to go back to a world where they are allowed to make hwi a 32 bit value. They have been waiting for wide-int to be committed because they see this as a way to get back to a world where most of the math is done natively. I am not part of this community but they feel that if the math that has the potential to be big is done in wide-ints, then they can go back to using a 32 bit hwi for everything else. For them, a wide-int built on 32 bit hwi's would be a win.

kenny
Re: [PATCH 00/89] Compile-time gimple-checking
On Wed, Apr 23, 2014 at 4:19 PM, Richard Biener richard.guent...@gmail.com wrote:

On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener richard.guent...@gmail.com wrote:

On April 22, 2014 8:56:56 PM CEST, Richard Sandiford rdsandif...@googlemail.com wrote:

David Malcolm dmalc...@redhat.com writes:

Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following:

  static void
  dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc,
                      int flags)
  [...snip...]

  [...later, within pp_gimple_stmt_1:]

    case GIMPLE_SWITCH:
      dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags);
      break;

which is concise, readable, and avoid the change in pointerness compared to the gimple typedef; the local decls above would look like this:

  gimple some_stmt;  /* note how this doesn't have a star... */
  gimple_assign assign_stmt;  /* ...and neither do these */
  gimple_cond assign_stmt;
  gimple_phi phi;

I think this last proposal is my preferred API, but it requires the change to is-a.h

Attached is a proposed change to the is-a.h API that eliminates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do

  Q* q = dyn_cast<Q*> (p);

not:

  Q* q = dyn_cast<Q> (p);

Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast.

Indeed. I even wasn't aware it is different than a C++ cast...

It would be nice if you can change that with a separate patch posted in a separate thread to be more visible.

Also I see you introduce a const_FOO class with every FOO one. I wonder whether, now that we have C++, can address const-correctness in a less awkward way than with a typedef. Can you try to go back in time and see why we did with that in the first place? ISTR that it was "oh, if we were only using C++ we wouldn't need to jump through that hoop".

To followup myself here, it's because 'tree' is a typedef to a pointer and thus 'const tree' is different from 'const tree_node *'.

Not sure why we re-introduced the 'mistake' of making 'tree' a pointer when we introduced 'gimple'. If we were to make 'gimple' the class type itself we can use gimple *, const gimple * and also const gimple (when a NULL pointer is not expected).

Anyway, gazillion new typedefs are ugly :/ (typedefs are ugly)

Richard.

Thanks,
Richard.

Richard.

If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one.

Thanks,
Richard
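
The point about 'const tree' can be checked with a few lines of standalone C++. The names below (tree_node_sketch, tree_sketch, const_tree_sketch, read_code) are hypothetical stand-ins, not GCC's definitions; the sketch only demonstrates the language rule that const applied to a pointer typedef qualifies the pointer rather than the pointee.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical node type and pointer typedef, in the style of 'tree'.
struct tree_node_sketch { int code; };
typedef tree_node_sketch* tree_sketch;

// 'const tree_sketch' is a const pointer to a mutable node...
static_assert(std::is_same<const tree_sketch, tree_node_sketch* const>::value,
              "const applies to the pointer itself");
// ...and not a pointer to a const node.
static_assert(!std::is_same<const tree_sketch, const tree_node_sketch*>::value,
              "the pointee is still mutable");

// Hence a second typedef is needed to express pointer-to-const.
typedef const tree_node_sketch* const_tree_sketch;

// Reads through a pointer-to-const; writing t->code here would not compile.
int read_code(const_tree_sketch t) { return t->code; }
```

If the class type itself were the primary name, as suggested above, callers could spell `const T *` directly and the extra const_ typedef would be unnecessary.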
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On Wed, Apr 23, 2014 at 4:29 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/23/2014 05:47 AM, Richard Biener wrote: On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote: On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Kyrill Tkachov kyrylo.tkac...@arm.com writes: Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that... Sorry for the late reply. I hadn't forgotten, but I wanted to wait until I had chance to look into the ICE before replying, which I haven't had chance to do yet. They are separable issues, so, I checked in the change. It's a shame we can't use C++ style casts, but I suppose that's the price to pay for being able to write unsigned HOST_WIDE_INT”. unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were expecting a typedef or better. I slightly prefer the int (1) style, but I think we should go the direction of the patch. Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and require a 64bit integer type on the host and force all targets to use a 64bit 'hwi'. Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate related changes). Richard. I should point out that there is a community that wants to go in the opposite direction here. They are the people with real 32 bit hosts who want to go back to a world where they are allowed to make hwi a 32 bit value.They have been waiting wide-int to be committed because they see this as a way to get back to world where most of the math is done natively. I am not part of this community but they feel that if the math that has the potential to be big to be is done in wide-ints, then they can go back to using a 32 bit hwi for everything else.For them, a wide-int built on 32 hwi's would be a win. That wide-int builds on HWI is an implementation detail. It can easily be changed to build on int32_t. 
Btw, what important target still supports a 32bit HWI? None that I know of. Look at config.gcc and what does _not_ set need_64bit_hwint. Even plain arm needs it.

Richard.

kenny
Re: [PATCH 00/89] Compile-time gimple-checking
Hi,

On Mon, 21 Apr 2014, David Malcolm wrote:

This is a greatly-expanded version of: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01262.html

As of r205034 (de6bd75e3c9bc1efe8a6387d48eedaa4dafe622d) and r205428 (a90353203da18288cdac1b0b78fe7b22c69fe63f) the various gimple statements form a C++ inheritance hierarchy, but we're not yet making much use of that in the code: everything refers to just gimple (or const_gimple), and type-checking is performed at run-time within the various gimple_foo_* accessors in gimple.h, and almost nowhere else.

The following patch series introduces compile-time checking of much of the handling of gimple statements.

FWIW, I still don't like any of this for reasons already outlined here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00773.html (basically: I consider automatically creating types a very bad idea. You do do that by simply creating a type for every gimple code.)

    case GIMPLE_SWITCH:
      dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags);
      break;

where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally).

Unlike others here I do like the cast-as-method (if we absolutely _must_ have a complicated type hierarchy for gimple), but would suggest a different name: the gimple_ is tautological, and the a_ just noise, just name it gs->as_switch() (incidentally then it's _really_ shorter than the ugly is_a/as_a syntax).

Ciao, Michael.
[PATCH] Avoid going to GENERIC for TER expansion
$subject - we have the sepops interface for this.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2014-04-23  Richard Biener  rguent...@suse.de

	* expr.c (expand_expr_real_1): Avoid gimple_assign_rhs_to_tree during TER and instead use the sepops interface for expanding non-GIMPLE_SINGLE_RHS.

Index: gcc/expr.c
===================================================================
*** gcc/expr.c (revision 209559)
--- gcc/expr.c (working copy)
*************** expand_expr_real_1 (tree exp, rtx target
*** 9395,9406 ****
        if (g)
          {
            rtx r;
!           location_t saved_loc = curr_insn_location ();
!
!           set_curr_insn_location (gimple_location (g));
!           r = expand_expr_real (gimple_assign_rhs_to_tree (g), target,
!                                 tmode, modifier, NULL, inner_reference_p);
!           set_curr_insn_location (saved_loc);
            if (REG_P (r) && !REG_EXPR (r))
              set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
            return r;
--- 9395,9427 ----
        if (g)
          {
            rtx r;
!           ops.code = gimple_assign_rhs_code (g);
!           switch (get_gimple_rhs_class (ops.code))
!             {
!             case GIMPLE_TERNARY_RHS:
!               ops.op2 = gimple_assign_rhs3 (g);
!               /* Fallthru */
!             case GIMPLE_BINARY_RHS:
!               ops.op1 = gimple_assign_rhs2 (g);
!               /* Fallthru */
!             case GIMPLE_UNARY_RHS:
!               ops.op0 = gimple_assign_rhs1 (g);
!               ops.type = TREE_TYPE (gimple_assign_lhs (g));
!               ops.location = gimple_location (g);
!               r = expand_expr_real_2 (&ops, target, tmode, modifier);
!               break;
!             case GIMPLE_SINGLE_RHS:
!               {
!                 location_t saved_loc = curr_insn_location ();
!                 set_curr_insn_location (gimple_location (g));
!                 r = expand_expr_real (gimple_assign_rhs1 (g), target,
!                                       tmode, modifier, NULL, inner_reference_p);
!                 set_curr_insn_location (saved_loc);
!                 break;
!               }
!             default:
!               gcc_unreachable ();
!             }
            if (REG_P (r) && !REG_EXPR (r))
              set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
            return r;
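The cascading switch in the new hunk fills the operand slots from the highest arity downwards, so each RHS class picks up exactly the operands it has. A stripped-down sketch of that pattern follows; rhs_class, ops3 and collect_ops are hypothetical names for illustration and stand in for the real gimple accessors and struct separate_ops, which are not modelled here.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GIMPLE RHS classes.
enum rhs_class { UNARY_RHS, BINARY_RHS, TERNARY_RHS };

// Hypothetical stand-in for the operand slots of struct separate_ops.
struct ops3 { int op0, op1, op2; };

// Fill only as many operand slots as the class has, by entering the
// switch at the highest arity and falling through to the lower ones.
inline ops3 collect_ops(rhs_class cls, const int *rhs) {
  ops3 ops = {0, 0, 0};
  switch (cls) {
    case TERNARY_RHS:
      ops.op2 = rhs[2];
      /* Fallthru */
    case BINARY_RHS:
      ops.op1 = rhs[1];
      /* Fallthru */
    case UNARY_RHS:
      ops.op0 = rhs[0];
      break;
  }
  return ops;
}
```

The shared tail (the UNARY case) is where the patch sets the type and location and calls the expander once, which keeps the three arities on a single code path.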
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On Wed, Apr 23, 2014 at 04:36:23PM +0200, Richard Biener wrote:

I should point out that there is a community that wants to go in the opposite direction here. They are the people with real 32 bit hosts who want to go back to a world where they are allowed to make hwi a 32 bit value. They have been waiting wide-int to be committed because they see this as a way to get back to world where most of the math is done natively. I am not part of this community but they feel that if the math that has the potential to be big to be is done in wide-ints, then they can go back to using a 32 bit hwi for everything else. For them, a wide-int built on 32 hwi's would be a win.

I don't think wide-int will be more efficient than 64-bit integer support on 32-bit architectures; if it would be, that would just mean that we need to improve support for the double word integers for that target. So what exactly would be the advantage of going back to 32-bit HWI, say on i?86?

Jakub
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On 04/23/2014 10:36 AM, Richard Biener wrote: On Wed, Apr 23, 2014 at 4:29 PM, Kenneth Zadeck zad...@naturalbridge.com wrote: On 04/23/2014 05:47 AM, Richard Biener wrote: On Tue, Apr 22, 2014 at 6:04 PM, Mike Stump mikest...@comcast.net wrote: On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Kyrill Tkachov kyrylo.tkac...@arm.com writes: Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that... Sorry for the late reply. I hadn't forgotten, but I wanted to wait until I had chance to look into the ICE before replying, which I haven't had chance to do yet. They are separable issues, so, I checked in the change. It's a shame we can't use C++ style casts, but I suppose that's the price to pay for being able to write unsigned HOST_WIDE_INT”. unsigned_HOST_WIDE_INT isn’t horrible, but, yeah, my fingers were expecting a typedef or better. I slightly prefer the int (1) style, but I think we should go the direction of the patch. Well, on my list of things to try for 4.10 is to kill off HOST_WIDE_* and require a 64bit integer type on the host and force all targets to use a 64bit 'hwi'. Thus, s/HOST_WIDE_INT/int64_t/ (and the appropriate related changes). Richard. I should point out that there is a community that wants to go in the opposite direction here. They are the people with real 32 bit hosts who want to go back to a world where they are allowed to make hwi a 32 bit value.They have been waiting wide-int to be committed because they see this as a way to get back to world where most of the math is done natively. I am not part of this community but they feel that if the math that has the potential to be big to be is done in wide-ints, then they can go back to using a 32 bit hwi for everything else.For them, a wide-int built on 32 hwi's would be a win. That wide-int builds on HWI is an implementation detail. 
It can easily be changed to build on int32_t.

Btw, what important target still supports a 32bit HWI? None for what I know. Look at config.gcc and what does _not_ set need_64bit_hwint. Even plain arm needs it.

I think that originally, hwi was supposed to be a natural integer on the host machine and it was corrupted to always be a 64 bit integer. Right now, wide-int is built on hwis which are always 64 bits. On a 32 bit machine, this means that there are two levels of abstraction to get to the hardware, one to get from wide-int to 64 bits and one to get from 64 bits to 32 bits.

The easy part of converting wide-int to run natively on a 32 bit machine is going to be the internals of wide-int. Of course until you test it you never know, but we did try very hard not to care about the internal size of the rep. The hard part will be the large number of places where wide-int converts to or from hwi. Some of those callers expect things to really be 64 bits and some of them deal with numbers that are always small enough to be implemented in the efficient native representation. I think that the push against you is that the latter case should not be converted to int64_t.

Richard.

kenny
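The claim that the element width is an implementation detail can be illustrated with a multi-word addition that is generic in its limb type. This is only a sketch under stated assumptions: add_limbs is a hypothetical helper, limbs are stored least significant first, both inputs have the same length, and the final carry is discarded. It is not wide-int's actual representation or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-precision addition over a vector of unsigned limbs.
// Limb is assumed to be uint32_t or uint64_t.
template <typename Limb>
std::vector<Limb> add_limbs(const std::vector<Limb> &a,
                            const std::vector<Limb> &b) {
  std::vector<Limb> out(a.size());
  Limb carry = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    Limb partial = static_cast<Limb>(a[i] + b[i]);
    Limb c1 = partial < a[i];        // wrapped while adding the limbs
    Limb sum = static_cast<Limb>(partial + carry);
    Limb c2 = sum < partial;         // wrapped while adding the carry
    out[i] = sum;
    carry = static_cast<Limb>(c1 | c2);
  }
  return out;
}
```

The same 64-bit sum can then be computed as two 32-bit limbs or as one 64-bit limb; only the callers that convert to or from a fixed 64-bit value need to care which one is in use, which matches the "hard part" described above.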
Re: -fuse-caller-save - Collect register usage information
On 2014-04-23, 6:41 AM, Tom de Vries wrote:

On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE.

Vladimir,

This is the updated version of the previously approved patch http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html , updated for the new hook call_fusage_contains_non_callee_clobbers. The only difference is in the functions get_call_reg_set_usage and collect_fn_hard_reg_usage which use the hook.

OK for trunk?

2013-04-29  Radovan Obradovic  robrado...@mips.com
	    Tom de Vries  t...@codesourcery.com

	* cgraph.h (struct cgraph_node): Add function_used_regs, function_used_regs_initialized and function_used_regs_valid fields.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New function.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.

It looks ok for me, Tom. But to be straight, I am not a maintainer for this part of the compiler. So it is just my recommendation. I guess to get an approval for these changes, you should ask Jan Hubicka (cgraph.h) or a global or RTL reviewer for final.c.
Re: Add call_fusage_contains_non_callee_clobbers hook
Tom de Vries tom_devr...@mentor.com writes:

On 22-04-14 17:05, Tom de Vries wrote:

I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE.

Vladimir,

This patch adds a hook to indicate whether a target has added the non-callee call clobbers to CALL_INSN_FUNCTION_USAGE, meaning it's safe to do the fuse-caller-save optimization.

FWIW I think this should be a plain bool rather than a function, like delay_sched2 etc.

Thanks, Richard
Re: -fuse-caller-save - Collect register usage information
Tom de Vries tom_devr...@mentor.com writes:

+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+
+  if (!flag_use_caller_save)
+    return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  gcc_assert (!node->function_used_regs_initialized);
+  node->function_used_regs_initialized = 1;
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+        continue;
+
+      find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+      if (CALL_P (insn)
+          && (!targetm.call_fusage_contains_non_callee_clobbers ()
+              || !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set)))

If the uses of flag_use_caller_save also check call_fusage_contains_non_callee_clobbers, would it be better to test them both together here too, rather than waiting to see a call?

+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    if (global_regs[i])
+      SET_HARD_REG_BIT (node->function_used_regs, i);

The loop isn't needed; all globals are fixed.

Thanks again for working on this.

Richard
[PATCH AARCH64] One-line tidy of bit-twiddle expression in aarch64.c
This patch is a small tidy of a more-complicated expression that just flips a single bit and can thus be a simple XOR.

No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code is indeed exercised by dg-torture.exp vshuf-v*.c). Also ok after applying TBL and testsuite patches in http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html.

gcc/ChangeLog:

2014-04-23  Alan Lawrence  alan.lawre...@arm.com

	* config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3147ee..b879754 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
       rtx x;

       for (i = 0; i < nelt; ++i)
-        d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
+        d->perm[i] ^= nelt; /* Keep the same index, but in the other vector.  */

       x = d->op0;
       d->op0 = d->op1;
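The equivalence behind this tidy can be checked directly. The sketch below assumes that nelt is a power of two and that every index lies in [0, 2 * nelt), which is what a two-vector permutation selector would satisfy; flip_by_add, flip_by_xor and forms_agree are hypothetical helper names, not functions from aarch64.c.

```cpp
#include <cassert>

// Old form: add nelt, then wrap modulo 2 * nelt.
inline unsigned flip_by_add(unsigned idx, unsigned nelt) {
  return (idx + nelt) & (2 * nelt - 1);
}

// New form: toggle the single bit that selects the vector.
inline unsigned flip_by_xor(unsigned idx, unsigned nelt) {
  return idx ^ nelt;
}

// Compare both forms for every power-of-two nelt up to 64 and every
// index in [0, 2 * nelt).
inline bool forms_agree() {
  for (unsigned nelt = 1; nelt <= 64; nelt *= 2)
    for (unsigned idx = 0; idx < 2 * nelt; ++idx)
      if (flip_by_add(idx, nelt) != flip_by_xor(idx, nelt))
        return false;
  return true;
}
```

Adding nelt modulo 2 * nelt changes only the bit of weight nelt, so the low bits (the index within the vector) are untouched in both forms.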
RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
Hmm, in that case maybe we should just leave it failing. The alternative would be to skip the test altogther for MIPS, with a PR referencing it, but that seems a bit over-the-top. I'd leave it as it is for now until the consensus regarding the 'X' constraint is reached. Please use comments of the form: /* Implement TARGET_FOO. */ above all three functions (instead of the current one in the case of mips_register_priority), just so that it's painfully obvious that these are target hooks. Modified as requested and attached the patch below. I tried to keep to the conventions but apparently I seem to overlook certain things. I'll remember this part now :). Out of interest, do you see any difference if you include $sp in SPILL_REGS? That obviously doesn't make much conceptual sense, but it would give a cleaner class hierarchy. Including $sp does not make any difference, exactly the same code size. Although I haven't thoroughly tested it, I limited the check to -Os. Regards, Robert 2014-03-26 Robert Suchanek robert.sucha...@imgtec.com * lra-constraints.c (base_to_reg): New function. (process_address): Use new function. * config/mips/constraints.md (d): BASE_REG_CLASS replaced by TARGET_MIPS16 ? M16_REGS : GR_REGS. * config/mips/mips.c (mips_regno_mode_ok_for_base_p): Remove use !strict_p for MIPS16. (mips_register_priority): New function that implements the target hook TARGET_REGISTER_PRIORITY. (mips_spill_class): Likewise for TARGET_SPILL_CLASS (mips_lra_p): Likewise for TARGET_LRA_P. * config/mips/mips.h (reg_class): Add M16_SP_REGS and SPILL_REGS classes. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BASE_REG_CLASS): Use M16_SP_REGS. * config/mips/mips.md (*mul_acc_si, *mul_sub_si): Add alternative tuned for LRA. New set attribute to enable alternatives depending on the register allocator used. (*lea64): Disable pattern for MIPS16. 
* config/mips/mips.opt (mlra): New option diff --git gcc/config/mips/constraints.md gcc/config/mips/constraints.md index f6834fd..fa33c30 100644 --- gcc/config/mips/constraints.md +++ gcc/config/mips/constraints.md @@ -19,7 +19,7 @@ ;; Register constraints -(define_register_constraint d BASE_REG_CLASS +(define_register_constraint d TARGET_MIPS16 ? M16_REGS : GR_REGS An address register. This is equivalent to @code{r} unless generating MIPS16 code.) diff --git gcc/config/mips/mips.c gcc/config/mips/mips.c index 45256e9..f8d90b2 100644 --- gcc/config/mips/mips.c +++ gcc/config/mips/mips.c @@ -655,7 +655,7 @@ const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = { M16_REGS,M16_STORE_REGS, LEA_REGS,LEA_REGS, LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS, T_REG, PIC_FN_ADDR_REG, LEA_REGS,LEA_REGS, - LEA_REGS,LEA_REGS,LEA_REGS,LEA_REGS, + LEA_REGS,M16_SP_REGS, LEA_REGS,LEA_REGS, FP_REGS, FP_REGS,FP_REGS,FP_REGS, FP_REGS, FP_REGS,FP_REGS,FP_REGS, @@ -2241,22 +2241,9 @@ mips_regno_mode_ok_for_base_p (int regno, enum machine_mode mode, return true; /* In MIPS16 mode, the stack pointer can only address word and doubleword - values, nothing smaller. There are two problems here: - - (a) Instantiating virtual registers can introduce new uses of the - stack pointer. If these virtual registers are valid addresses, - the stack pointer should be too. - - (b) Most uses of the stack pointer are not made explicit until - FRAME_POINTER_REGNUM and ARG_POINTER_REGNUM have been eliminated. - We don't know until that stage whether we'll be eliminating to the - stack pointer (which needs the restriction) or the hard frame - pointer (which doesn't). - - All in all, it seems more consistent to only enforce this restriction - during and after reload. */ + values, nothing smaller. */ if (TARGET_MIPS16 regno == STACK_POINTER_REGNUM) -return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8; +return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8; return TARGET_MIPS16 ? 
M16_REG_P (regno) : GP_REG_P (regno); } @@ -12115,6 +12102,18 @@ mips_register_move_cost (enum machine_mode mode, return 0; } +/* Implement TARGET_REGISTER_PRIORITY. */ + +static int +mips_register_priority (int hard_regno) +{ + /* Treat MIPS16 registers with higher priority than other regs. */ + if (TARGET_MIPS16 + TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno)) +return 1; + return 0; +} + /* Implement TARGET_MEMORY_MOVE_COST. */ static int @@ -18897,6 +18896,25 @@ mips_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) *update = build2 (COMPOUND_EXPR,
Re: [PATCH 00/89] Compile-time gimple-checking
On Wed, 2014-04-23 at 16:19 +0200, Richard Biener wrote: On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener richard.guent...@gmail.com wrote: On April 22, 2014 8:56:56 PM CEST, Richard Sandiford rdsandif...@googlemail.com wrote: David Malcolm dmalc...@redhat.com writes: Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a gimple_switch (gs), spc, flags); break; which is concise, readable, and avoid the change in pointerness compared to the gimple typedef; the local decls above would look like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign assign_stmt; /* ...and neither do these */ gimple_cond assign_stmt; gimple_phi phi; I think this last proposal is my preferred API, but it requires the change to is-a.h Attached is a proposed change to the is-a.h API that elimintates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do Q* q = dyn_castQ* (p); not: Q* q = dyn_castQ (p); Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast. Indeed. I even wasn't aware it is different Than a c++ cast... It would be nice if you can change that with a separate patch posted in a separate thread to be more visible. Done, as: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01439.html I've experimentally ported patch 2 of the series (gimple_switch) to this approach, dropping the unloved casting methods in favor of as_a and dyn_cast, and it works. Also I see you introduce a const_FOO class with every FOO one. 
I wonder whether, now that we have C++, can address const-correctness in a less awkward way than with a typedef. Can you try to go back in time and see why we did with that in the first place? ISTR that it was oh, if we were only using C++ we wouldn't need to jump through that hoop. Thanks, Richard. Richard. If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one. Thanks, Richard
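The dynamic_cast-like spelling discussed in this thread, where the template argument is the pointer type so that a pointer typedef can be passed directly, can be sketched in a few lines. Everything here (stmt, switch_stmt, cond_stmt, gswitch, code_of, pointee, dyn_cast_sketch) is a hypothetical illustration and not the is-a.h implementation.

```cpp
#include <cassert>

// Hypothetical statement hierarchy tagged by a code field.
enum stmt_code { CODE_SWITCH, CODE_COND };
struct stmt { stmt_code code; };
struct switch_stmt : stmt { int num_labels; };
struct cond_stmt : stmt { int dummy; };

// Pointer typedef, in the style of the gimple_switch typedef.
typedef switch_stmt *gswitch;

// Map each subclass to the code that identifies it.
template <typename T> struct code_of;
template <> struct code_of<switch_stmt> {
  static stmt_code value() { return CODE_SWITCH; }
};
template <> struct code_of<cond_stmt> {
  static stmt_code value() { return CODE_COND; }
};

// Strip the pointer from the template argument.
template <typename Ptr> struct pointee;
template <typename T> struct pointee<T *> { typedef T type; };

// Called as dyn_cast_sketch<Q *> (p), mirroring dynamic_cast<Q *> (p).
// Returns a null pointer when p is null or has a different code.
template <typename Ptr>
inline Ptr dyn_cast_sketch(stmt *p) {
  typedef typename pointee<Ptr>::type target;
  if (p && p->code == code_of<target>::value())
    return static_cast<Ptr>(p);
  return 0;
}
```

With this shape a caller can write `if (gswitch sw = dyn_cast_sketch<gswitch> (s))`, which is the typedef-friendly form the thread is asking for.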
Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend
On 2014-04-21, 8:23 AM, Richard Sandiford wrote: Robert Suchanek robert.sucha...@imgtec.com writes: Did you see the failures even after your mips_regno_mode_ok_for_base_p change? LRA should know how to reload a W address. Yes but I realize there is more. It fails because $sp is now included in BASE_REG_CLASS and W is based on it. However, I suppose that it would be too eager to say it is wrong and likely there is something missing in LRA if we want to keep all alternatives. Currently there is no check if a reloaded operand has a valid address, use of $sp in lbu/lhu cases. Even if we added extra checks we are less likely to benefit as we need to reload the base into register. Not sure what you mean, sorry. W exists specifically to exclude $sp-based and $pc-based addresses. LRA AFAIK should already be able to reload addresses that are valid in the TARGET_LEGITIMATE_ADDRESS_P sense but which do not match the constraints for a particular insn. Can you remember one of the tests that fails? I couldn't trigger the problem with the original testcase but found another one that reveals it. The following needs to compiled with -mips32r2 -mips16 -Os: struct { int addr; } c; struct command { int args[1]; }; unsigned short a; fn1 (struct command *p1) { unsigned short d; d = fn2 (); a = p1-args[0]; fn3 (a); if (c.addr) { fn4 (p1-args[0]); return; } (c)-addr = fn5 (); fn6 (d); } Thanks. Not sure how the constraint would/should exclude $sp-based address in LRA. In this particular case, a spilled pseudo is changed to memory giving the following RTL form: (insn 30 29 31 4 (set (reg:SI 4 $4) (and:SI (mem/c:SI (plus:SI (reg/f:SI 78 $frame) (const_int 16 [0x10])) [7 %sfp+16 S4 A32]) (const_int 65535 [0x]))) shell.i:17 161 {*andsi3_mips16} (expr_list:REG_DEAD (reg:SI 194 [ D.1469 ]) (nil))) The operand 1 during alternative selection is not marked as a bad operand as it is a memory operand. $frame appears to be fine as it could be eliminated later to hard register. 
No reloads are inserted for the instructions concerned. Unless, $frame should be temporarily eliminated and then a reload would be inserted? Yeah, I think the lack of elimination is the problem. process_address eliminates $frame temporarily before checking whether the address is valid, but the places that check EXTRA_CONSTRAINT_STR pass the original uneliminated address. So the legitimate_address_p hook sees the $sp-based address but the W constraint only sees the $frame-based address (which might or might not be valid, depending on whether $frame is eliminated to the stack or hard frame pointer). I think the constraints should see the eliminated address too. This patch seems to fix it for me. Tested on x86_64-linux-gnu. Vlad, is this OK for trunk? BTW, we might want to define something like: #define MODE_BASE_REG_CLASS(MODE) \ (TARGET_MIPS16 \ ? ((MODE) == SImode || (MODE) == DImode ? M16_SP_REGS : M16_REGS) \ : GR_REGS) instead of BASE_REG_CLASS. It might lead to slightly better code (or not -- if it doesn't then don't bother :-)). If this patch is OK then I think the only thing blocking the switch to LRA is the asm-subreg-1.c failure. I think it'd be fine to XFAIL that test on MIPS for now, until there's a consensus about what X means for asms. gcc/ * lra-constraints.c (valid_address_p): Move earlier in file. Add a constraint argument to the address_info version. (satisfies_memory_constraint_p): New function. (satisfies_address_constraint_p): Likewise. (process_alt_operands, curr_insn_transform): Use them. (process_address): Pass the constraint to valid_address_p when checking address operands. Yes, it looks ok for me, Richard. Thanks on working on this. I am on vacation till May 4th. If the patch results in problems on other targets, I hope you revert it. But to be honest, I believe it is very safe and don't expect any problems at all.
[Committed][ARM][AArch64] Patches previously ok'd for stage1
Hi all,

I've committed to trunk some of my arm and aarch64 patches that I had pending for stage1 (approval email in parentheses):

http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00933.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01609.html)
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00934.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01634.html)
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00935.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01608.html)
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00936.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01635.html)
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01330.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01343.html)
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01274.html (http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01346.html)

I've committed them as revisions r209701 to r209706 inclusive.

Thanks, Kyrill
[PATCH AARCH64] fix and enable non-const shuffle for bigendian using TBL instruction
At present vec_perm with non-const indices is not handled on bigendian, so gcc generates generic, slow, code. This patch fixes up TBL to reverse the indices within each input vector (following Richard Henderson's suggestion of using an XOR with (nelts - 1) rather than a complicated mask/add/subtract, http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01285.html), and enables the code for bigendian. Regressed on aarch64_be-none-elf with no changes. (This is as expected: in all affected cases, gcc was already producing correct non-arch-specific code using scalar op. However, I have manually verified for various tests in c-c++-common/torture/vshuf-v* that (a) TBL instructions are now produced, (b) a version of the compiler that produces TBLs without the index correction, fails tests). Note tests c-c++-common/torture/vshuf-{v16hi,v4df,v4di,v8si} (i.e. the 32-byte vectors) were broken prior to this patch and are not affected. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/aarch64-simd.md (vec_perm): Enable for bigendian. 
	* config/aarch64/aarch64.c (aarch64_expand_vec_perm): Remove assert against bigendian and adjust indices.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 73aee2c..e14e9b0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4002,7 +4002,7 @@
    (match_operand:VB 1 "register_operand")
    (match_operand:VB 2 "register_operand")
    (match_operand:VB 3 "register_operand")]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
+  "TARGET_SIMD"
 {
   aarch64_expand_vec_perm (operands[0], operands[1],
                            operands[2], operands[3]);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d332741..6875b58 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7763,18 +7763,24 @@ aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
   enum machine_mode vmode = GET_MODE (target);
   unsigned int i, nelt = GET_MODE_NUNITS (vmode);
   bool one_vector_p = rtx_equal_p (op0, op1);
-  rtx rmask[MAX_VECT_LEN], mask;
-
-  gcc_checking_assert (!BYTES_BIG_ENDIAN);
+  rtx mask;

   /* The TBL instruction does not use a modulo index, so we must take care
      of that ourselves.  */
-  mask = GEN_INT (one_vector_p ? nelt - 1 : 2 * nelt - 1);
-  for (i = 0; i < nelt; ++i)
-    rmask[i] = mask;
-  mask = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rmask));
+  mask = aarch64_simd_gen_const_vector_dup (vmode,
+      one_vector_p ? nelt - 1 : 2 * nelt - 1);
   sel = expand_simple_binop (vmode, AND, sel, mask, NULL, 0, OPTAB_LIB_WIDEN);

+  /* For big-endian, we also need to reverse the index within the vector
+     (but not which vector).  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* If one_vector_p, mask is a vector of (nelt - 1)'s already.  */
+      if (!one_vector_p)
+        mask = aarch64_simd_gen_const_vector_dup (vmode, nelt - 1);
+      sel = expand_simple_binop (vmode, XOR, sel, mask,
+                                 NULL, 0, OPTAB_LIB_WIDEN);
+    }
   aarch64_expand_vec_perm_1 (target, op0, op1, sel);
 }
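The selector adjustment in this patch is easier to follow as scalar arithmetic on a single lane. The sketch below models only the index computation (the AND that wraps the index, then the big-endian XOR), assuming nelt is a power of two; adjust_selector is a hypothetical helper and none of the RTL generation is represented.

```cpp
#include <cassert>

// Scalar model of what the patch does to each selector element.
inline unsigned adjust_selector(unsigned sel, unsigned nelt,
                                bool one_vector, bool big_endian) {
  // TBL does not wrap out-of-range indices, so wrap them here.
  unsigned wrap = one_vector ? nelt - 1 : 2 * nelt - 1;
  sel &= wrap;
  // For big-endian, reverse the index within the vector; the bit of
  // weight nelt, which picks the vector, is left alone.
  if (big_endian)
    sel ^= nelt - 1;
  return sel;
}
```

For nelt = 16 with two inputs on big-endian, index 0 becomes 15 and index 16 becomes 31: the lane order is reversed inside each vector while the choice of vector is preserved.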
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/23/2014 10:19 AM, Richard Biener wrote: On Tue, Apr 22, 2014 at 9:42 PM, Richard Biener richard.guent...@gmail.com wrote: On April 22, 2014 8:56:56 PM CEST, Richard Sandiford rdsandif...@googlemail.com wrote: David Malcolm dmalc...@redhat.com writes: Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a gimple_switch (gs), spc, flags); break; which is concise, readable, and avoid the change in pointerness compared to the gimple typedef; the local decls above would look like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign assign_stmt; /* ...and neither do these */ gimple_cond assign_stmt; gimple_phi phi; I think this last proposal is my preferred API, but it requires the change to is-a.h Attached is a proposed change to the is-a.h API that elimintates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do Q* q = dyn_castQ* (p); not: Q* q = dyn_castQ (p); Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast. Indeed. I even wasn't aware it is different Than a c++ cast... It would be nice if you can change that with a separate patch posted in a separate thread to be more visible. Also I see you introduce a const_FOO class with every FOO one. I wonder whether, now that we have C++, can address const-correctness in a less awkward way than with a typedef. Can you try to go back in time and see why we did with that in the first place? ISTR that it was oh, if we were only using C++ we wouldn't need to jump through that hoop. 
I was also wondering if we shouldn't be able to get rid of the 'const_' versions and just properly use const with the c++ classes. I think we can... Andrew
[Patch] Fix obsolete autoconf macros in configure.ac
The gcc configure.ac script is using an obsolete form of the AC_CHECK_TYPE autoconf macro to check for caddr_t and ssize_t.

http://www.gnu.org/software/autoconf/manual/autoconf-2.60/html_node/Obsolete-Macros.html#Obsolete-Macros

This usage is causing a build failure for me when building a windows GCC using the mingw toolset. I would like to replace the obsolete autoconf macros with a 'proper' one.

Tested with my mingw build and a MIPS targeted linux build.

OK to checkin?

Steve Ellcey
sell...@mips.com

2014-04-23  Steve Ellcey  sell...@mips.com

	* configure.ac (caddr_t, ssize_t): Use AC_CHECK_TYPES instead of obsolete form of AC_CHECK_TYPE.
	* configure: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index d789557..98acb1b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1083,8 +1083,8 @@ int main()
 fi
 fi

-AC_CHECK_TYPE(ssize_t, int)
-AC_CHECK_TYPE(caddr_t, char *)
+AC_CHECK_TYPES([ssize_t])
+AC_CHECK_TYPES([caddr_t])

 GCC_AC_FUNC_MMAP_BLACKLIST
Re: [PATCH][RFC] Remove RTL loop unswitching
On Sun, 20 Apr 2014, Jan Hubicka wrote:

This removes RTL loop unswitching (see last year's discussion about compile-time issues of that pass). RTL loop unswitching is enabled together with GIMPLE loop unswitching at -O3 and by -floop-unswitch. It's clearly the wrong place to do high-level loop transforms these days, and the cost of maintenance doesn't outweigh the questionable benefit. Thus the following patch removes it.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu (I hope for testsuite fallout). Any objections?

Not really, I am all for moving more of the loop stuff to trees. Did you perform some benchmarks? (I remember I did in 2012 but completely forgot the outcome.)

I did that last year and it showed no difference in SPEC 2k6. When bootstrapping with -O3 and a gcc_unreachable () in the RTL unswitching path you get some ICEs there, but they are due to a different effective --param max-unswitch-insns, which on GIMPLE is applied to tree_num_loop_insns () and on RTL to num_loop_insns ().

Yep, I remember seeing some interesting special cases where RTL analysis did catch invariants that tree didn't, but nothing important.

I'll go forward with the patch today.

On a related note, shall I try to update the following?
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02270.html

Yeah.

Will do,
Honza

Thanks,
Richard.

Honza

Thanks,
Richard.

2014-04-15  Richard Biener  rguent...@suse.de

	* Makefile.in (OBJS): Remove loop-unswitch.o.
	* loop-unswitch.c: Delete.
	* tree-pass.h (make_pass_rtl_unswitch): Remove.
	* passes.def (pass_rtl_unswitch): Likewise.
	* loop-init.c (gate_rtl_unswitch): Likewise.
	(rtl_unswitch): Likewise.
	(pass_data_rtl_unswitch): Likewise.
	(pass_rtl_unswitch): Likewise.
	(make_pass_rtl_unswitch): Likewise.
	* rtl.h (reversed_condition): Likewise.
	(compare_and_jump_seq): Likewise.
	* loop-iv.c (reversed_condition): Move here from loop-unswitch.c and make static.
	* loop-unroll.c (compare_and_jump_seq): Likewise.
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 209410)
+++ gcc/Makefile.in (working copy)
@@ -1294,7 +1294,6 @@ OBJS = \
 	loop-invariant.o \
 	loop-iv.o \
 	loop-unroll.o \
-	loop-unswitch.o \
 	lower-subreg.o \
 	lra.o \
 	lra-assigns.o \
Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 209410)
+++ gcc/tree-pass.h (working copy)
@@ -512,7 +512,6 @@ extern rtl_opt_pass *make_pass_outof_cfg
 extern rtl_opt_pass *make_pass_loop2 (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt);
-extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
 extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
Index: gcc/passes.def
===
--- gcc/passes.def (revision 209410)
+++ gcc/passes.def (working copy)
@@ -341,7 +341,6 @@ along with GCC; see the file COPYING3.
       PUSH_INSERT_PASSES_WITHIN (pass_loop2)
 	  NEXT_PASS (pass_rtl_loop_init);
 	  NEXT_PASS (pass_rtl_move_loop_invariants);
-	  NEXT_PASS (pass_rtl_unswitch);
 	  NEXT_PASS (pass_rtl_unroll_and_peel_loops);
 	  NEXT_PASS (pass_rtl_doloop);
 	  NEXT_PASS (pass_rtl_loop_done);
Index: gcc/loop-init.c
===
--- gcc/loop-init.c (revision 209410)
+++ gcc/loop-init.c (working copy)
@@ -518,61 +518,7 @@ make_pass_rtl_move_loop_invariants (gcc:
 }
 
-/* Loop unswitching for RTL.  */
-static bool
-gate_rtl_unswitch (void)
-{
-  return flag_unswitch_loops;
-}
-
-static unsigned int
-rtl_unswitch (void)
-{
-  if (number_of_loops (cfun) > 1)
-    unswitch_loops ();
-  return 0;
-}
-
-namespace {
-
-const pass_data pass_data_rtl_unswitch =
-{
-  RTL_PASS, /* type */
-  "loop2_unswitch", /* name */
-  OPTGROUP_LOOP, /* optinfo_flags */
-  true, /* has_gate */
-  true, /* has_execute */
-  TV_LOOP_UNSWITCH, /* tv_id */
-  0, /* properties_required */
-  0, /* properties_provided */
-  0, /* properties_destroyed */
-  0, /* todo_flags_start */
-  TODO_verify_rtl_sharing, /* todo_flags_finish */
-};
-
-class pass_rtl_unswitch : public rtl_opt_pass
-{
-public:
-  pass_rtl_unswitch (gcc::context *ctxt)
-    : rtl_opt_pass (pass_data_rtl_unswitch,
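For readers unfamiliar with the transform being discussed, here is a minimal source-level sketch of what loop unswitching does in general: a loop-invariant test is hoisted out of the loop and the loop is duplicated for each outcome. The function names are invented for illustration and nothing here is taken from loop-unswitch.c; the GIMPLE pass that remains performs this kind of rewrite on the compiler's internal representation rather than on source code.

```cpp
#include <cassert>
#include <vector>

// Original shape: the test on `flag` does not depend on the loop,
// yet it is evaluated on every iteration.
long sum_original(const std::vector<int> &v, bool flag) {
    long total = 0;
    for (int x : v) {
        if (flag)
            total += x;
        else
            total -= x;
    }
    return total;
}

// Unswitched shape: the invariant test is done once and each branch
// gets its own copy of the loop, at the price of larger code.
long sum_unswitched(const std::vector<int> &v, bool flag) {
    long total = 0;
    if (flag) {
        for (int x : v)
            total += x;
    } else {
        for (int x : v)
            total -= x;
    }
    return total;
}
```

The code growth from duplicating the loop body is what a limit such as --param max-unswitch-insns is meant to bound.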
Re: [PATCH] Change is-a.h to support typedefs of pointers
On April 23, 2014 5:31:42 PM CEST, David Malcolm dmalc...@redhat.com wrote:

The is-a.h API currently implicitly injects a pointer into the type:

  template <typename T, typename U>
  inline T *
         ^^^ Note how it returns a (T*)
  as_a (U *p)
  {
    gcc_checking_assert (is_a <T> (p));
                              ^^^ but uses the specialization of T, not T* here
    return is_a_helper <T>::cast (p);
                       ^^^ and here
  }

so that currently one must write:

  Q* q = dyn_cast <Q> (p);

This causes difficulties when dealing with typedefs to pointers. For example, with:

  typedef struct foo_d foo;
  typedef struct bar_d bar;

we can't write:

  bar b = dyn_cast <bar> (f);
  ^^^              ^^^

but have to write:

  bar b = dyn_cast <bar_d> (f);
  ^^^              ^^^^^
  Note the mismatching types.

The following patch changes the is-a.h API to remove the implicit injection of a pointer, so that one writes:

  Q* q = dyn_cast <Q*> (p);

rather than:

  Q* q = dyn_cast <Q> (p);

which also gives us more consistency with C++'s dynamic_cast operator, and allows the above cast to a typedef-ptr to be written as:

  bar b = dyn_cast <bar> (f);
  ^^^              ^^^ they can now match.

The patch also fixes up the users (a fair amount of cgraph/symtable code, along with the gimple accessors).

The example motivating this is to better support as_a and dyn_cast in gimple code, in the "Compile-time gimple-checking" patch series, so that, with suitable typedefs matching the names in gimple.def, such as:

  typedef struct gimple_statement_assign *gimple_assign;

we can write:

  case GIMPLE_ASSIGN:
    {
      gimple_assign assign_stmt = as_a <gimple_assign> (stmt);
                                       ^^^^^^^^^^^^^^^
      /* do assign-related things on assign_stmt */
    }

instead of the clunkier:

  case GIMPLE_ASSIGN:
    {
      gimple_assign assign_stmt = as_a <gimple_statement_assign> (stmt);
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^
      /* do assign-related things on assign_stmt */
    }

See the http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01259.html subthread for more details, which also considered changing the names of the structs and eliminating the typedefs.
However, doing so without the attached fix to the is-a API would introduce an inconsistency between decls for the base class vs subclass, so there'd be:

  gimple stmt;          /* no star */
  gimple_assign *stmt;  /* star */

(or to change the gimple typedef, which would be a monster patch).

Successfully bootstrapped & regrtested on x86_64-unknown-linux-gnu.

OK for trunk?

OK for trunk, no need to wait for 4.9.1 for this.

Thanks,
Richard.

Would the release managers prefer to make this contingent on holding off from committing until after 4.9.1 is out? (to minimize impact of this change on backporting effort)

Thanks
Dave

gcc/
	* is-a.h: Update comments to reflect the following changes to the "pointerness" of the API, making the template parameter match the return type, allowing use of is-a.h with typedefs of pointers.
	(is_a_helper::cast): Return a T rather than a pointer to a T, so that the return type matches the parameter to the is_a_helper.
	(as_a): Likewise.
	(dyn_cast): Likewise.
	* cgraph.c (cgraph_node_for_asm): Update for removal of implicit pointer from the is-a.h API.
	* cgraph.h (is_a_helper <cgraph_node>::test): Convert to...
	(is_a_helper <cgraph_node *>::test): ...this, matching change to is-a.h API.
	(is_a_helper <varpool_node>::test): Likewise, convert to...
	(is_a_helper <varpool_node *>::test): ...this.
	(varpool_first_variable): Update for removal of implicit pointer from the is-a.h API.
	(varpool_next_variable): Likewise.
	(varpool_first_static_initializer): Likewise.
	(varpool_next_static_initializer): Likewise.
	(varpool_first_defined_variable): Likewise.
	(varpool_next_defined_variable): Likewise.
	(cgraph_first_defined_function): Likewise.
	(cgraph_next_defined_function): Likewise.
	(cgraph_first_function): Likewise.
	(cgraph_next_function): Likewise.
	(cgraph_first_function_with_gimple_body): Likewise.
	(cgraph_next_function_with_gimple_body): Likewise.
	(cgraph_alias_target): Likewise.
	(varpool_alias_target): Likewise.
	(cgraph_function_or_thunk_node): Likewise.
(varpool_variable_node): Likewise. (symtab_real_symbol_p): Likewise. * cgraphunit.c (referred_to_p): Likewise. (analyze_functions): Likewise. (handle_alias_pairs): Likewise. * gimple-fold.c (can_refer_decl_in_current_unit_p): Likewise. * gimple-ssa.h (gimple_vuse_op): Likewise. (gimple_vdef_op): Likewise. * gimple-streamer-in.c (input_gimple_stmt): Likewise. * gimple.c
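The API shape described in this thread can be sketched roughly as follows. This is a deliberately simplified model, assuming invented types (stmt_base, stmt_assign, the CODE_* constants) and a reduced is_a_helper; the real is-a.h differs in detail. What it tries to show is the point of the patch: the template argument is now the pointer type itself, so a pointer typedef can appear on both sides of the cast.

```cpp
#include <cassert>

// Hypothetical class hierarchy standing in for the gimple statements.
enum { CODE_ASSIGN = 1, CODE_OTHER = 2 };
struct stmt_base { int code; };
struct stmt_assign : stmt_base { int lhs; };

typedef stmt_base *stmt;           // like `gimple`
typedef stmt_assign *assign_stmt;  // like `gimple_assign`

// Specialised on the pointer type, as in the changed API.
template <typename T> struct is_a_helper;

template <> struct is_a_helper<stmt_assign *> {
    static bool test(stmt_base *p) { return p->code == CODE_ASSIGN; }
    static stmt_assign *cast(stmt_base *p) { return static_cast<stmt_assign *>(p); }
};

template <typename T, typename U>
inline bool is_a(U *p) { return is_a_helper<T>::test(p); }

// Checked cast: the return type is exactly the template argument T.
template <typename T, typename U>
inline T as_a(U *p) {
    assert(is_a<T>(p));
    return is_a_helper<T>::cast(p);
}

// Returns a null pointer when the dynamic type does not match.
template <typename T, typename U>
inline T dyn_cast(U *p) {
    return is_a<T>(p) ? is_a_helper<T>::cast(p) : static_cast<T>(nullptr);
}
```

Under this model `assign_stmt a = dyn_cast<assign_stmt>(s);` compiles with the typedef on both sides, which is the "they can now match" case from the mail above.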
[ARM] Initialize new tune_params values
Hi,

Revision 209561 introduces two new parameters for tune_params, but does not initialize them in the Cortex-A57 or Cortex-A12 tuning structures. This breaks bootstrap.

Fixed by initializing them to sensible values. Checked to ensure the warnings are cleared, and bootstrap can continue.

Ramana has acked this offline, so I've applied this as revision 209710 under the reasonably obvious rule.

Thanks,
James

---
gcc/

2014-04-23  James Greenhalgh  james.greenha...@arm.com

	* config/arm/arm.c (arm_cortex_a57_tune): Initialize all fields.
	(arm_cortex_a12_tune): Likewise.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 88d957a..de247cd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1658,7 +1658,8 @@ const struct tune_params arm_cortex_a57_tune =
   true,                       /* Prefer LDRD/STRD.  */
   {true, true},               /* Prefer non short circuit.  */
   &arm_default_vec_cost,      /* Vectorizer costs.  */
-  false                       /* Prefer Neon for 64-bits bitops.  */
+  false,                      /* Prefer Neon for 64-bits bitops.  */
+  true, true                  /* Prefer 32-bit encodings.  */
 };
 
 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
@@ -1711,7 +1712,8 @@ const struct tune_params arm_cortex_a12_tune =
   true,                       /* Prefer LDRD/STRD.  */
   {true, true},               /* Prefer non short circuit.  */
   &arm_default_vec_cost,      /* Vectorizer costs.  */
-  false                       /* Prefer Neon for 64-bits bitops.  */
+  false,                      /* Prefer Neon for 64-bits bitops.  */
+  false, false                /* Prefer 32-bit encodings.  */
 };
 
 /* armv7m tuning.  On Cortex-M4 cores for example, MOVW/MOVT take a single
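A small sketch of the failure mode, using an invented four-field struct (tune_params_sketch, new_field_a and new_field_b are hypothetical names, far smaller than the real tune_params). Omitting trailing initializers is legal and leaves the fields zero, but compilers can warn about it (GCC's -Wmissing-field-initializers, for instance), and it seems likely that such a warning, promoted to an error during bootstrap, is what the fix above clears.

```cpp
#include <cassert>

// Hypothetical miniature of a tuning table entry.
struct tune_params_sketch {
    bool prefer_ldrd_strd;
    bool prefer_neon_for_64bits;
    bool new_field_a;  // stands in for the first newly added field
    bool new_field_b;  // stands in for the second newly added field
};

// Written before the fields were added: the two new members are
// silently value-initialized to false, which may or may not be intended.
const tune_params_sketch stale_tune = { true, false };

// After the fix: every field is spelled out, so the chosen values are
// explicit and no missing-initializer diagnostic is issued.
const tune_params_sketch fixed_tune = { true, false, true, true };
```

Spelling out every field also forces a per-core decision, as the differing A57 (true, true) and A12 (false, false) values in the patch show.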
Re: [Patch] Fix obsolete autoconf macros in configure.ac
Steve Ellcey sell...@mips.com writes:

diff --git a/gcc/configure.ac b/gcc/configure.ac
index d789557..98acb1b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1083,8 +1083,8 @@ int main()
 fi
 fi
-AC_CHECK_TYPE(ssize_t, int)
-AC_CHECK_TYPE(caddr_t, char *)
+AC_CHECK_TYPES([ssize_t])
+AC_CHECK_TYPES([caddr_t])

You also need to handle the no longer supported default definition. Moreover, the two macro calls can be combined into one.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program
On Tue, Apr 22, 2014 at 1:17 PM, Jan Hubicka hubi...@ucw.cz wrote:

This looks fine. LIPO has a similar change too.

Other directions worth looking into:

1) To model the icache effect better, weighted callee size needs to be used with profile. The weight for a BB may look like: min(1, FREQ(BB)/FREQ(ENTRY)).
2) When function splitting is turned on, are any inline heuristic changes needed? E.g. only consider the hot code part of a node for unit growth computation?

We are also looking into a more aggressive approach to track a per-loop (inter-procedural) region growth limit, instead of using one single global limit.

Per-loop growth seems interesting. I assume it is not hard to associate edges with loop nests and it has more of a local nature.

Per-function loop nests form a graph, which can be embedded inside the callgraph. One of the main things is the loop graph update (just like callgraph node/edge cloning), and the summary data update during inlining.

Did you experiment with it?

We currently do not have time for this, but you are welcome to experiment with it :)

Related ideas:

1) per-loop priority;
2) more precise code-reuse (icache locality), and icache/itlb penalty estimate;
3) more precise per-loop register pressure estimate;
4) other loop transformation hints.

thanks,
David

Honza
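The weighting formula suggested in point 1) can be written out as a short sketch. The block_info record and the function name are invented for illustration, and the sketch assumes a positive entry frequency; it is not how the inliner stores its summaries, only the arithmetic of min(1, FREQ(BB)/FREQ(ENTRY)) applied to per-block sizes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-basic-block summary: instruction count and profile frequency.
struct block_info {
    int size;
    double freq;
};

// Sum of block sizes, each scaled by min(1, FREQ(BB) / FREQ(ENTRY)).
// Blocks at least as hot as the entry count in full; colder blocks are
// discounted. Assumes entry_freq > 0.
double weighted_callee_size(const std::vector<block_info> &blocks,
                            double entry_freq) {
    double total = 0.0;
    for (const block_info &bb : blocks) {
        double weight = std::min(1.0, bb.freq / entry_freq);
        total += weight * bb.size;
    }
    return total;
}
```

For a callee with blocks of size 10, 20 and 5 at frequencies 100, 25 and 400 against an entry frequency of 100, the cold 20-instruction block contributes only 5, so the weighted size is 20 rather than the raw 35.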
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/22/14 02:36, Richard Biener wrote:
On Mon, Apr 21, 2014 at 6:56 PM, David Malcolm dmalc...@redhat.com wrote:

This is a greatly-expanded version of: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01262.html

As of r205034 (de6bd75e3c9bc1efe8a6387d48eedaa4dafe622d) and r205428 (a90353203da18288cdac1b0b78fe7b22c69fe63f) the various gimple statements form a C++ inheritance hierarchy, but we're not yet making much use of that in the code: everything refers to just gimple (or const_gimple), and type-checking is performed at run-time within the various gimple_foo_* accessors in gimple.h, and almost nowhere else.

The following patch series introduces compile-time checking of much of the handling of gimple statements. Various new typedefs are introduced for pointers to statements where the specific code is known, matching the corresponding names from gimple.def.

Even though I like these changes in principle, I also wear a release manager's hat. Being one of the persons doing frequent backports of trunk fixes to branches, this will cause a _lot_ of headache. So ... can we delay this until, say, 4.9.1 is out?

Understood. So how about we proceed with the review approvals, but have the patches staged in after 4.9.1? Ideally by the time 4.9.1 is ready, the entire series in its final form has been reviewed and approved.

jeff
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/22/14 02:03, Richard Sandiford wrote: First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) Funny you should mention that. I blocked off a hunk of time for David to investigate doing some work on that this year. Jeff
Re: [Patch] Fix obsolete autoconf macros in configure.ac
On Wed, 2014-04-23 at 18:40 +0200, Andreas Schwab wrote:
Steve Ellcey sell...@mips.com writes:

diff --git a/gcc/configure.ac b/gcc/configure.ac
index d789557..98acb1b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1083,8 +1083,8 @@ int main()
 fi
 fi
-AC_CHECK_TYPE(ssize_t, int)
-AC_CHECK_TYPE(caddr_t, char *)
+AC_CHECK_TYPES([ssize_t])
+AC_CHECK_TYPES([caddr_t])

You also need to handle the no longer supported default definition. Moreover, the two macro calls can be combined into one.

Andreas.

Actually, now that I look more at caddr_t, I see that we probably shouldn't be using it at all. The only uses in the gcc subdirectory are for calls to mmap and munmap (in gcc.c, gcc-common.c, and config/host-solaris.c) and the latest definitions for mmap and munmap say it should use 'void *', not caddr_t. I will submit a new patch to remove the uses (and definition) of caddr_t from gcc.

ssize_t should probably still be fixed, but that was not causing me a failure and it can be handled separately.

Steve Ellcey
sell...@mips.com
Re: [AArch64/ARM 1/3] Add execution + assembler tests of the AArch64 ZIP Intrinsics.
On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote: This adds DejaGNU tests of the existing AArch64 vzip_* intrinsics, both checking the assembler output and the runtime results. Test bodies are in separate files ready to reuse for ARM in the third patch. Putting these in a new subdirectory ready for tests of other/related intrinsics. All tests passing on aarch64-none-elf and aarch64_be-none-elf. testsuite/ChangeLog: 2014-03-25 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/simd.exp: New file. * gcc.target/aarch64/simd/vzipf32_1.c: New file. * gcc.target/aarch64/simd/vzipf32.x: New file. * gcc.target/aarch64/simd/vzipp16_1.c: New file. * gcc.target/aarch64/simd/vzipp16.x: New file. * gcc.target/aarch64/simd/vzipp8_1.c: New file. * gcc.target/aarch64/simd/vzipp8.x: New file. * gcc.target/aarch64/simd/vzipqf32_1.c: New file. * gcc.target/aarch64/simd/vzipqf32.x: New file. * gcc.target/aarch64/simd/vzipqp16_1.c: New file. * gcc.target/aarch64/simd/vzipqp16.x: New file. * gcc.target/aarch64/simd/vzipqp8_1.c: New file. * gcc.target/aarch64/simd/vzipqp8.x: New file. * gcc.target/aarch64/simd/vzipqs16_1.c: New file. * gcc.target/aarch64/simd/vzipqs16.x: New file. * gcc.target/aarch64/simd/vzipqs32_1.c: New file. * gcc.target/aarch64/simd/vzipqs32.x: New file. * gcc.target/aarch64/simd/vzipqs8_1.c: New file. * gcc.target/aarch64/simd/vzipqs8.x: New file. * gcc.target/aarch64/simd/vzipqu16_1.c: New file. * gcc.target/aarch64/simd/vzipqu16.x: New file. * gcc.target/aarch64/simd/vzipqu32_1.c: New file. * gcc.target/aarch64/simd/vzipqu32.x: New file. * gcc.target/aarch64/simd/vzipqu8_1.c: New file. * gcc.target/aarch64/simd/vzipqu8.x: New file. * gcc.target/aarch64/simd/vzips16_1.c: New file. * gcc.target/aarch64/simd/vzips16.x: New file. * gcc.target/aarch64/simd/vzips32_1.c: New file. * gcc.target/aarch64/simd/vzips32.x: New file. * gcc.target/aarch64/simd/vzips8_1.c: New file. * gcc.target/aarch64/simd/vzips8.x: New file. 
* gcc.target/aarch64/simd/vzipu16_1.c: New file. * gcc.target/aarch64/simd/vzipu16.x: New file. * gcc.target/aarch64/simd/vzipu32_1.c: New file. * gcc.target/aarch64/simd/vzipu32.x: New file. * gcc.target/aarch64/simd/vzipu8_1.c: New file. * gcc.target/aarch64/simd/vzipu8.x: New file. OK /Marcus
Re: [AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics using __builtin_shuffle
On 27 March 2014 10:52, Alan Lawrence alan.lawre...@arm.com wrote: This patch replaces the temporary inline assembler for vzip_* in arm_neon.h with equivalent calls to __builtin_shuffle. These are matched by aarch64_expand_vec_perm_const{,_1} to output the same assembler instructions. Tests from first patch still passing on aarch64-none-elf and aarch64_be-none-elf. gcc/ChangeLog: 2012-03-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vzip1_f32, vzip1_p8, vzip1_p16, vzip1_s8, vzip1_s16, vzip1_s32, vzip1_u8, vzip1_u16, vzip1_u32, vzip1q_f32, vzip1q_f64, vzip1q_p8, vzip1q_p16, vzip1q_s8, vzip1q_s16, vzip1q_s32, vzip1q_s64, vzip1q_u8, vzip1q_u16, vzip1q_u32, vzip1q_u64, vzip2_f32, vzip2_p8, vzip2_p16, vzip2_s8, vzip2_s16, vzip2_s32, vzip2_u8, vzip2_u16, vzip2_u32, vzip2q_f32, vzip2q_f64, vzip2q_p8, vzip2q_p16, vzip2q_s8, vzip2q_s16, vzip2q_s32, vzip2q_s64, vzip2q_u8, vzip2q_u16, vzip2q_u32, vzip2q_u64): Replace inline __asm__ with __builtin_shuffle. OK /Marcus
Re: [Patch] Fix obsolete autoconf macros in configure.ac
Steve Ellcey sell...@mips.com writes:

Actually, now that I look more at caddr_t, I see that we probably shouldn't be using it at all. The only uses in the gcc subdirectory are for calls to mmap and munmap (in gcc.c, gcc-common.c, and config/host-solaris.c) and the latest definitions for mmap and munmap say it should use 'void *', not caddr_t. I will submit a new

This may be irrelevant: Solaris (and other OSes) regularly provide different compilation environments for various levels of standards compatibility, and the default is not the latest. Even apart from Solaris, not everyone adheres to yesterday's version of POSIX.1.

patch to remove the uses (and definition) of caddr_t from gcc.

Please be very careful here; this can easily break several ports.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [AArch64/ARM 1/3] Add execution + assembler tests of AArch64 UZP Intrinsics
On 27 March 2014 17:17, Alan Lawrence alan.lawre...@arm.com wrote: This adds DejaGNU tests of the existing AArch64 vuzp_* intrinsics, both checking the assembler output and the runtime results. Test bodies are in separate files ready to reuse for ARM in the third patch. Putting these in a new subdirectory with the ZIP Intrinsic tests, using simd.exp added there (will commit ZIP tests first). All tests passing on aarch64-none-elf and aarch64_be-none-elf. testsuite/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/vuzpf32_1.c: New file. * gcc.target/aarch64/simd/vuzpf32.x: New file. * gcc.target/aarch64/simd/vuzpp16_1.c: New file. * gcc.target/aarch64/simd/vuzpp16.x: New file. * gcc.target/aarch64/simd/vuzpp8_1.c: New file. * gcc.target/aarch64/simd/vuzpp8.x: New file. * gcc.target/aarch64/simd/vuzpqf32_1.c: New file. * gcc.target/aarch64/simd/vuzpqf32.x: New file. * gcc.target/aarch64/simd/vuzpqp16_1.c: New file. * gcc.target/aarch64/simd/vuzpqp16.x: New file. * gcc.target/aarch64/simd/vuzpqp8_1.c: New file. * gcc.target/aarch64/simd/vuzpqp8.x: New file. * gcc.target/aarch64/simd/vuzpqs16_1.c: New file. * gcc.target/aarch64/simd/vuzpqs16.x: New file. * gcc.target/aarch64/simd/vuzpqs32_1.c: New file. * gcc.target/aarch64/simd/vuzpqs32.x: New file. * gcc.target/aarch64/simd/vuzpqs8_1.c: New file. * gcc.target/aarch64/simd/vuzpqs8.x: New file. * gcc.target/aarch64/simd/vuzpqu16_1.c: New file. * gcc.target/aarch64/simd/vuzpqu16.x: New file. * gcc.target/aarch64/simd/vuzpqu32_1.c: New file. * gcc.target/aarch64/simd/vuzpqu32.x: New file. * gcc.target/aarch64/simd/vuzpqu8_1.c: New file. * gcc.target/aarch64/simd/vuzpqu8.x: New file. * gcc.target/aarch64/simd/vuzps16_1.c: New file. * gcc.target/aarch64/simd/vuzps16.x: New file. * gcc.target/aarch64/simd/vuzps32_1.c: New file. * gcc.target/aarch64/simd/vuzps32.x: New file. * gcc.target/aarch64/simd/vuzps8_1.c: New file. * gcc.target/aarch64/simd/vuzps8.x: New file. 
* gcc.target/aarch64/simd/vuzpu16_1.c: New file. * gcc.target/aarch64/simd/vuzpu16.x: New file. * gcc.target/aarch64/simd/vuzpu32_1.c: New file. * gcc.target/aarch64/simd/vuzpu32.x: New file. * gcc.target/aarch64/simd/vuzpu8_1.c: New file. * gcc.target/aarch64/simd/vuzpu8.x: New file. OK /Marcus
Re: [AArch64/ARM 2/3] Rewrite AArch64 UZP Intrinsics using __builtin_shuffle
On 27 March 2014 17:25, Alan Lawrence alan.lawre...@arm.com wrote: This patch replaces the temporary inline assembler for vuzp_* in arm_neon.h with equivalent calls to __builtin_shuffle. These are matched by aarch64_expand_vec_perm_const{,_1} to output (generally) the same assembler instructions. That is, except for two-element vectors, where ZIP, UZP and TRN instructions all have the same effect; gcc's backend chooses to output ZIP so this patch also updates the 3 affected tests. Regressed, and tests from first patch still passing modulo updates herein, on aarch64-none-elf and aarch64_be-none-elf. gcc/testsuite/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/vuzps32_1.c: Expect zip1/2 insn rather than uzp1/2. * gcc.target/aarch64/vuzpu32_1.c: Likewise. * gcc.target/aarch64/vuzpf32_1.c: Likewise. gcc/ChangeLog: 2014-03-27 Alan Lawrence alan.lawre...@arm.com * config/aarch64/arm_neon.h (vuzp1_f32, vuzp1_p8, vuzp1_p16, vuzp1_s8, vuzp1_s16, vuzp1_s32, vuzp1_u8, vuzp1_u16, vuzp1_u32, vuzp1q_f32, vuzp1q_f64, vuzp1q_p8, vuzp1q_p16, vuzp1q_s8, vuzp1q_s16, vuzp1q_s32, vuzp1q_s64, vuzp1q_u8, vuzp1q_u16, vuzp1q_u32, vuzp1q_u64, vuzp2_f32, vuzp2_p8, vuzp2_p16, vuzp2_s8, vuzp2_s16, vuzp2_s32, vuzp2_u8, vuzp2_u16, vuzp2_u32, vuzp2q_f32, vuzp2q_f64, vuzp2q_p8, vuzp2q_p16, vuzp2q_s8, vuzp2q_s16, vuzp2q_s32, vuzp2q_s64, vuzp2q_u8, vuzp2q_u16, vuzp2q_u32, vuzp2q_u64): Replace temporary asm with __builtin_shuffle. OK /Marcus
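The remark above about two-element vectors can be checked with a small model of the permutation masks. The helper names are invented, and the index convention is an assumption: lanes are numbered little-endian style, with indices 0..n-1 selecting from the first input and n..2n-1 from the second, which is the way a two-operand shuffle mask is usually described. The sketch builds the lane selections for the first and second variants of ZIP, UZP and TRN and shows that they coincide when n is 2 and differ for wider vectors.

```cpp
#include <cassert>
#include <vector>

// part is 1 or 2 (e.g. zip1 / zip2). n is the number of lanes, assumed even.

// ZIP: interleave the low (part 1) or high (part 2) halves of a and b.
std::vector<int> zip_mask(int n, int part) {
    std::vector<int> m(n);
    int base = (part == 2) ? n / 2 : 0;
    for (int i = 0; i < n / 2; ++i) {
        m[2 * i] = base + i;          // lane from a
        m[2 * i + 1] = n + base + i;  // matching lane from b
    }
    return m;
}

// UZP: take the even (part 1) or odd (part 2) lanes of a followed by b.
std::vector<int> uzp_mask(int n, int part) {
    std::vector<int> m(n);
    for (int i = 0; i < n; ++i)
        m[i] = 2 * i + (part - 1);
    return m;
}

// TRN: pair up even (part 1) or odd (part 2) lanes of a and b.
std::vector<int> trn_mask(int n, int part) {
    std::vector<int> m(n);
    for (int i = 0; i < n / 2; ++i) {
        m[2 * i] = 2 * i + (part - 1);
        m[2 * i + 1] = n + 2 * i + (part - 1);
    }
    return m;
}
```

With n = 2 all three first variants select lanes {0, 2} and all three second variants select {1, 3}, so a backend that matches permutations by their mask is free to emit any one of the three instructions, consistent with the three updated tests expecting zip1/zip2.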
Re: -Wvariadic-macros does not print warning
forgot to add gcc-patches@gcc.gnu.org. Sorry for the double-post. On Wed, Apr 23, 2014 at 11:28 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: This is a follow up mail to http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html I have attached patch that prints the warning when passed -Wvariadic-macros (I mostly followed it along lines of -Wlong-long). OK for trunk ? [libcpp] * macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic). [gcc/c-family] * c.opt (-Wvariadic-macros): Init(-1) instead of Init(1). * c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros. (sanitize_cpp_opts): Check condition for pedantic or warn_traditional. Thanks and Regards, Prathamesh
[4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
The more aggressive threading across loop backedges requires invalidating equivalences that do not hold across all iterations of a loop.

At first glance, invalidating at PHI nodes should be sufficient, as any statement which potentially generated a new equivalence would be reprocessed as we come across the backedge. However, there is one important case where that does not hold.

Specifically, we might have derived a value from a conditional, and the conditional might have been fed by a statement that doesn't produce useful equivalences (such as a GIMPLE_ASM). Thus the equivalence from the conditional is still visible because no new equivalence will be recorded for the GIMPLE_ASM. So if the result of the GIMPLE_ASM that gets used in the conditional varies from one loop iteration to the next, we could use a stale value from a prior iteration to thread the current iteration.

That's exactly what happens in the ffmpeg code.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also verified that the sample audio in the referenced BZs no longer chops off after ~2 seconds.

Installed on the trunk. OK for 4.9.1 after a suitable soak period on the trunk?

commit 02269351ce3a81b5470b8137fb3c34bca27011da
Author: Jeff Law l...@redhat.com
Date: Wed Apr 23 00:25:47 2014 -0600

	PR tree-optimization/60902
	* tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Make sure to invalidate outputs from statements that do not produce useful outputs for threading.

	PR tree-optimization/60902
	* gcc.target/i386/pr60902.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 638c0da..ddebba7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2014-04-23  Jeff Law  l...@redhat.com
+
+	PR tree-optimization/60902
+	* tree-ssa-threadedge.c
+	(record_temporary_equivalences_from_stmts_at_dest): Make sure to
+	invalidate outputs from statements that do not produce useful
+	outputs for threading.
+
 2014-04-23  Venkataramanan Kumar  venkataramanan.ku...@linaro.org
 
 	* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 126ad08..62b07f4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-23  Jeff Law  l...@redhat.com
+
+	PR tree-optimization/60902
+	* gcc.target/i386/pr60902.c: New test.
+
 2014-04-23  Alex Velenko  alex.vele...@arm.com
 
 	* gcc.target/aarch64/vdup_lane_1.c: New testcase.
diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c b/gcc/testsuite/gcc.target/i386/pr60902.c
new file mode 100644
index 000..b81dcd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60902.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+extern void abort ();
+extern void exit (int);
+
+int x;
+
+foo()
+{
+  static int count;
+  count++;
+  if (count > 1)
+    abort ();
+}
+
+static inline int
+frob ()
+{
+  int a;
+  __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
+  x++;
+  return a;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 0; i < 10 && frob () == 0; i++)
+    foo();
+  exit (0);
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index c447b72..8a0103b 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
 	  && (gimple_code (stmt) != GIMPLE_CALL
 	      || gimple_call_lhs (stmt) == NULL_TREE
 	      || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
-	continue;
+	{
+	  /* STMT might still have DEFS and we need to invalidate any known
+	     equivalences for them.
+
+	     Consider if STMT is a GIMPLE_ASM with one or more outputs that
+	     feeds a conditional inside a loop.  We might derive an equivalence
+	     due to the conditional.  */
+	  tree op;
+	  ssa_op_iter iter;
+
+	  if (backedge_seen)
+	    FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
+	      {
+		/* This call only invalidates equivalences created by
+		   PHI nodes.  This is by design to keep the cost of
+		   of invalidation reasonable.  */
+		invalidate_equivalences (op, stack, src_map, dst_map);
+
+		/* However, conditionals can imply values for real
+		   operands as well.  And those won't be recorded in the
+		   maps.  In fact, those equivalences may be recorded totally
+		   outside the threading code.  We can just create a new
+		   temporary NULL equivalence here.  */
+		record_temporary_equivalence (op, NULL_TREE, stack);
+	      }
+
+	  continue;
+	}
 
       /* The result of __builtin_object_size depends on all the arguments
 	 of a phi node.  Temporarily using only one edge
Re: -Wvariadic-macros does not print warning
I didn't attach the patch, I am extremely sorry for the noise. I am re-posting the mail. This is a follow up mail to http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html I have attached patch that prints the warning when passed -Wvariadic-macros (I mostly followed it along lines of -Wlong-long). OK for trunk ? [libcpp] * macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic). [gcc/c-family] * c.opt (-Wvariadic-macros): Init(-1) instead of Init(1). * c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros. (sanitize_cpp_opts): Check condition for pedantic or warn_traditional. Thanks and Regards, Prathamesh On Wed, Apr 23, 2014 at 11:30 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: forgot to add gcc-patches@gcc.gnu.org. Sorry for the double-post. On Wed, Apr 23, 2014 at 11:28 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: This is a follow up mail to http://gcc.gnu.org/ml/gcc-help/2014-04/msg00096.html I have attached patch that prints the warning when passed -Wvariadic-macros (I mostly followed it along lines of -Wlong-long). OK for trunk ? [libcpp] * macro.c (parse_params): Remove condition CPP_OPTION (pfile, cpp_pedantic). [gcc/c-family] * c.opt (-Wvariadic-macros): Init(-1) instead of Init(1). * c-opts.c (c_common_handle_option): Add case OPT_Wvariadic_macros. (sanitize_cpp_opts): Check condition for pedantic or warn_traditional. 
Thanks and Regards,
Prathamesh

Index: libcpp/macro.c
===
--- libcpp/macro.c (revision 209470)
+++ libcpp/macro.c (working copy)
@@ -2800,8 +2800,7 @@ parse_params (cpp_reader *pfile, cpp_mac
 		(pfile, CPP_W_VARIADIC_MACROS,
 		 "anonymous variadic macros were introduced in C99");
 	    }
-	  else if (CPP_OPTION (pfile, cpp_pedantic)
-		   && CPP_OPTION (pfile, warn_variadic_macros))
+	  else if (CPP_OPTION (pfile, warn_variadic_macros))
 	    cpp_pedwarning (pfile, CPP_W_VARIADIC_MACROS,
 			    "ISO C does not permit named variadic macros");
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c (revision 209470)
+++ gcc/c-family/c-opts.c (working copy)
@@ -396,6 +396,10 @@ c_common_handle_option (size_t scode, co
       cpp_opts->cpp_warn_long_long = value;
       break;
 
+    case OPT_Wvariadic_macros:
+      cpp_opts->warn_variadic_macros = value;
+      break;
+
     case OPT_Wmissing_include_dirs:
       cpp_opts->warn_missing_include_dirs = value;
       break;
@@ -1227,8 +1231,9 @@ sanitize_cpp_opts (void)
 
   /* Similarly with -Wno-variadic-macros.  No check for c99 here, since
      this also turns off warnings about GCCs extension.  */
-  cpp_opts->warn_variadic_macros
-    = cpp_warn_variadic_macros && (pedantic || warn_traditional);
+  if (cpp_warn_variadic_macros == -1)
+    cpp_warn_variadic_macros = pedantic || warn_traditional;
+  cpp_opts->warn_variadic_macros = cpp_warn_variadic_macros;
 
   /* If we're generating preprocessor output, emit current directory
      if explicitly requested or if debugging information is enabled.
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt (revision 209470)
+++ gcc/c-family/c.opt (working copy)
@@ -785,7 +785,7 @@ C ObjC C++ ObjC++ Var(warn_unused_result
 Warn if a caller of a function, marked with attribute warn_unused_result, does not use its return value
 
 Wvariadic-macros
-C ObjC C++ ObjC++ Var(cpp_warn_variadic_macros) Init(1) Warning
+C ObjC C++ ObjC++ Var(cpp_warn_variadic_macros) Init(-1) Warning
 Warn about using variadic macros
 
 Wvarargs
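The Init(-1) change in the patch above follows a common tri-state pattern: -1 means the option was not given on the command line, while 0 and 1 record an explicit -Wno-variadic-macros or -Wvariadic-macros. A minimal sketch of that resolution step, with an invented function name and plain parameters in place of GCC's global option variables:

```cpp
#include <cassert>

// option_value: -1 if the option was not given, 0 for the -Wno- form,
// 1 for the -W form. The other two parameters stand in for the
// pedantic and warn_traditional settings.
inline bool resolve_warn_variadic_macros(int option_value,
                                         bool pedantic,
                                         bool warn_traditional) {
    if (option_value == -1)
        return pedantic || warn_traditional;  // fall back to the implied default
    return option_value != 0;                 // an explicit choice always wins
}
```

With the previous Init(1) and the `&&` combination, an explicit -Wvariadic-macros without -pedantic or -Wtraditional still resolved to false, which appears to be the behaviour the patch sets out to change.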
Re: Optimize n?rotate(x,n):x
Honza, any comment on Richard's question?

On Tue, 15 Apr 2014, Richard Biener wrote:
On Mon, Apr 14, 2014 at 6:40 PM, Marc Glisse marc.gli...@inria.fr wrote:
On Mon, 14 Apr 2014, Richard Biener wrote:

+ /* If the special case has a high probability, keep it. */
+ if (EDGE_PRED (middle_bb, 0)->probability < PROB_EVEN)

I suppose Honza has a comment on how to test this properly (not sure if ->probability or ->frequency is always initialized properly). For example single_likely_edge tests profile_status_for_fn != PROFILE_ABSENT (and uses a fixed probability value ...). Anyway, the comparison looks backwards to me, but maybe I'm missing sth - I'd use = PROB_LIKELY ;)

Maybe the comment is confusing? middle_bb contains the expensive operation (say a/b) that the special case skips entirely. If the division happens in less than 50% of cases (that's the probability of the edge going from cond to middle_bb), then doing the comparison+jump may be cheaper and I abort the optimization. At least the testcase with __builtin_expect should prove that I didn't do it backwards.

Ah, indeed. My mistake.

value-prof seems to use 50% as the cut-off where it may become interesting to special-case division, hence my choice of PROB_EVEN. I am not sure which way you want to use PROB_LIKELY (80%). If we have more than an 80% chance of executing the division, always perform it? Or if we have more than an 80% chance of skipping the division, keep the branch?

Ok, if it's from value-prof then that's fine. The patch is ok if Honza doesn't have any comments on whether it's ok to look at ->probability unconditionally.

Thanks,
Richard.

Attached is the latest version (passed the testsuite).
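To make the idea behind the subject line concrete, here is a source-level sketch of why the guard can be dropped when the guarded value is a neutral element of the operation. The function names are invented. For addition the guarded and unguarded forms are trivially equal. For the rotate, the guarded form is modelled on the usual idiom; the branch-free variant uses a masked shift count, which is this sketch's own restatement, because writing the unguarded shift directly in source would shift by the full width when n is 0. The compiler works on its internal rotate operation and does not have that problem.

```cpp
#include <cassert>
#include <climits>

// Guarded addition: returns b when a is 0, otherwise a + b.
int add_guarded(int a, int b) {
    if (a == 0)
        return b;
    return a + b;
}

// Since 0 is neutral for +, the guard is redundant.
int add_plain(int a, int b) { return a + b; }

// Guarded rotate-left: the test on n avoids shifting by the full width.
unsigned rot_guarded(unsigned x, int n) {
    const int bits = static_cast<int>(CHAR_BIT * sizeof(unsigned));
    return (n == 0) ? x : ((x << n) | (x >> (bits - n)));
}

// Branch-free rotate-left with a masked count; n == 0 yields x unchanged,
// so 0 behaves as a neutral element here as well.
unsigned rot_branch_free(unsigned x, unsigned n) {
    const unsigned mask = CHAR_BIT * sizeof(unsigned) - 1;
    n &= mask;
    return (x << n) | (x >> ((0u - n) & mask));
}
```

Whether removing the branch pays off is a separate cost question, which is what the probability check discussed above is meant to decide when the skipped operation is expensive.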
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-12.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt1" } */
+
+int f(int a, int b, int c) {
+  if (c > 5) return c;
+  if (a == 0) return b;
+  return a + b;
+}
+
+unsigned rot(unsigned x, int n) {
+  const int bits = __CHAR_BIT__ * __SIZEOF_INT__;
+  return (n == 0) ? x : ((x << n) | (x >> (bits - n)));
+}
+
+unsigned m(unsigned a, unsigned b) {
+  if (a == 0)
+    return 0;
+  else
+    return a & b;
+}
+
+/* { dg-final { scan-tree-dump-times "goto" 2 "phiopt1" } } */
+/* { dg-final { cleanup-tree-dump "phiopt1" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-13.c (working copy)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f(int a, int b) {
+  if (__builtin_expect(a == 0, 1)) return b;
+  return a + b;
+}
+
+// optab_handler can handle if(b==1) but not a/b
+// so we consider a/b too expensive.
+unsigned __int128 g(unsigned __int128 a, unsigned __int128 b) {
+  if (b == 1)
+    return a;
+  else
+    return a / b;
+}
+
+/* { dg-final { scan-tree-dump-times "goto" 4 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/tree-ssa-phiopt.c
===================================================================
--- gcc/tree-ssa-phiopt.c (revision 209353)
+++ gcc/tree-ssa-phiopt.c (working copy)
@@ -140,20 +140,37 @@ static bool gate_hoist_loads (void);
        x = PHI (CONST, a)
 
    Gets replaced with:
      bb0:
      bb2:
        t1 = a == CONST;
        t2 = b > c;
        t3 = t1 & t2;
        x = a;
 
+
+   It also replaces
+
+     bb0:
+       if (a != 0) goto bb1; else goto bb2;
+     bb1:
+       c = a + b;
+     bb2:
+       x = PHI <c (bb1), b (bb0), ...>;
+
+   with
+
+     bb0:
+       c = a + b;
+     bb2:
+       x = PHI <c (bb0), ...>;
+
    ABS Replacement
    ---
 
    This transformation, implemented in abs_replacement, replaces
 
      bb0:
        if (a >= 0) goto bb2; else goto bb1;
      bb1:
        x = -a;
      bb2:
@@ -809,20 +826,103 @@ operand_equal_for_value_replacement (con
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
     return true;
 
   tmp = gimple_assign_rhs2 (def);
   if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
     return true;
 
   return false;
 }
 
+/* Returns true if ARG is a neutral element for operation CODE
+   on the RIGHT side.  */
+
+static bool
+neutral_element_p (tree_code code, tree arg, bool right)
+{
+  switch (code)
+    {
+    case PLUS_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_XOR_EXPR:
+      return integer_zerop (arg);
+
+    case LROTATE_EXPR:
+    case RROTATE_EXPR:
+    case LSHIFT_EXPR:
+    case RSHIFT_EXPR:
+    case MINUS_EXPR:
+    case POINTER_PLUS_EXPR:
+      return right && integer_zerop (arg);
+
+    case MULT_EXPR:
+      return integer_onep (arg);
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
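[Editorial sketch] The idea behind the patch is that a guard such as n == 0 ? x : rotate (x, n) is redundant when 0 is a neutral element of the operation, because the operation already returns x in that case. The C below illustrates this at source level only; it is not what the phiopt pass emits. rot_guarded mirrors the rot() test case, while rot_masked and check_rotates_agree are hypothetical names, and the masking trick assumes the bit width of unsigned is a power of two.

```c
#include <assert.h>
#include <limits.h>

/* Guarded form, as in the rot() test case: n == 0 is special-cased
   because shifting by the full width (x >> bits) is undefined in C.  */
static unsigned
rot_guarded (unsigned x, unsigned n)
{
  const unsigned bits = CHAR_BIT * sizeof (unsigned);
  return (n == 0) ? x : ((x << n) | (x >> (bits - n)));
}

/* Branch-free form written by hand (an illustration, not compiler
   output): masking both shift counts keeps them in range, so n == 0
   needs no special case.  Assumes bits is a power of two.  */
static unsigned
rot_masked (unsigned x, unsigned n)
{
  const unsigned bits = CHAR_BIT * sizeof (unsigned);
  return (x << (n & (bits - 1))) | (x >> ((bits - n) & (bits - 1)));
}

/* Small self-check: both forms agree for every valid rotate count.  */
void
check_rotates_agree (void)
{
  const unsigned bits = CHAR_BIT * sizeof (unsigned);
  for (unsigned n = 0; n < bits; n++)
    assert (rot_guarded (0x12345678u, n) == rot_masked (0x12345678u, n));
  assert (rot_guarded (1u, 1) == 2u);
  assert (rot_masked (0x12345678u, 0) == 0x12345678u);
}
```

Since a rotate by zero already yields x, the branch in the guarded form carries no information once the expression is recognized as a rotate, which is the case neutral_element_p covers with "right && integer_zerop (arg)".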
Re: [c++] typeinfo for target types
On 04/13/2014 01:41 AM, Marc Glisse wrote:
> Hello,
>
> this patch generates typeinfo for target types. On x86_64, it adds these 6 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and version those in the shared library.
>
> + V typeinfo for __float128
> + V typeinfo for __float128 const*
> + V typeinfo for __float128*
> + V typeinfo name for __float128
> + V typeinfo name for __float128 const*
> + V typeinfo name for __float128*
>
> Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c).
>
> 2014-04-13 Marc Glisse marc.gli...@inria.fr
>
> PR libstdc++/43622
> gcc/c-family/
> * c-common.c (registered_builtin_types): Make non-static.
> * c-common.h (registered_builtin_types): Declare.
> gcc/cp/
> * rtti.c (emit_support_tinfo_1): New function, extracted from emit_support_tinfos.
> (emit_support_tinfos): Call it and iterate on registered_builtin_types.

This is causing aarch64 builds to break. Any c++ compilation aborts at

#0 fancy_abort (file=0x14195c8 "../../git-rh/gcc/cp/mangle.c", line=2303, function=0x1419ff8 <write_builtin_type(tree_node*)::__FUNCTION__> "write_builtin_type") at ../../git-rh/gcc/diagnostic.c:1190
#1 0x007ce2b4 in write_builtin_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:2303
#2 0x007cc85c in write_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:1969
#3 0x007d4d98 in mangle_special_for_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>, code=0x1419a98 "TI") at ../../git-rh/gcc/cp/mangle.c:3569
#4 0x007d4dcc in mangle_typeinfo_for_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:3585
#5 0x0070618c in get_tinfo_decl (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/rtti.c:422
#6 0x00709ff0 in emit_support_tinfo_1 (bltn=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/rtti.c:1485
#7 0x0070a344 in emit_support_tinfos () at ../../git-rh/gcc/cp/rtti.c:1550

Presumably the backend needs to grow some mangling support for its builtins, but in the meantime can we do something less drastic than abort? Isn't this only really an issue if someone tries to access one of these types via typeinfo?

r~
Re: [i386] define __SIZEOF_FLOAT128__
(Adding an i386 maintainer in Cc)
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html

On Sun, 13 Apr 2014, Marc Glisse wrote:

> Hello,
>
> some people like having a macro to test if a type is available (__SIZEOF_INT128__ for instance). This adds macros for __float80 and __float128. The types seem to be always available, so I didn't add any condition.
>
> If you think this is a bad idea, please close the PR.
>
> Bootstrap+testsuite on x86_64-linux-gnu.
>
> 2014-04-13 Marc Glisse marc.gli...@inria.fr
>
> PR preprocessor/56540
> * config/i386/i386-c.c (ix86_target_macros): Define __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.

--
Marc Glisse
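[Editorial sketch] The point of such macros is that user code can test for the type with the preprocessor, in the same way __SIZEOF_INT128__ is commonly used. A minimal illustration follows; float128_size and check_float128_size are hypothetical names, and the sketch assumes only that a compiler defining __SIZEOF_FLOAT128__ also accepts the __float128 type.

```c
#include <assert.h>
#include <stddef.h>

/* Returns sizeof (__float128) when the compiler advertises the type
   through __SIZEOF_FLOAT128__, and 0 when the macro is absent.  */
size_t
float128_size (void)
{
#ifdef __SIZEOF_FLOAT128__
  return sizeof (__float128);
#else
  return 0;
#endif
}

/* Small self-check: when defined, the macro must match sizeof.  */
void
check_float128_size (void)
{
#ifdef __SIZEOF_FLOAT128__
  assert (float128_size () == __SIZEOF_FLOAT128__);
#else
  assert (float128_size () == 0);
#endif
}
```

On a compiler without the macro, the function simply reports 0, so callers can fall back to another type without a configure-time check.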
[AArch64/ARM 0/3] Patch series for REV permute instructions
The meat of this is in the second patch, which makes the AArch64 backend look for shuffle masks that can be turned into REV instructions, and updates the VREV Neon Intrinsics to use __builtin_shuffle rather than the current inline assembler; this then produces the same instructions (unless the midend can do better).

Before that, the first patch adds execution + assembler tests of the existing intrinsics, which then serve as a testcase for the second patch. The third patch reuses the test bodies from the first patch in equivalent tests on the ARM architecture.

Ok for trunk?

--Alan
Re: [PATCH] Change is-a.h to support typedefs of pointers
On Wed, 2014-04-23 at 18:32 +0200, Richard Biener wrote:
> On April 23, 2014 5:31:42 PM CEST, David Malcolm dmalc...@redhat.com wrote:

[...snip...]

>> The following patch changes the is-a.h API to remove the implicit injection of a pointer, so that one writes:
>>
>>   Q* q = dyn_cast <Q*> (p);
>>
>> rather than:
>>
>>   Q* q = dyn_cast <Q> (p);

[...snip...]

>> Successfully bootstrapped & regrtested on x86_64-unknown-linux-gnu.
>>
>> OK for trunk?
>
> OK for trunk, no need to wait for 4.9.1 for this.

Thanks. Committed to trunk as r209719.

[...snip...]
[AArch64/ARM 1/3] Add execution + assembler tests of AArch64 REV Neon Intrinsics
This adds DejaGNU tests of the existing AArch64 vrev_* intrinsics, both checking the assembler output and the runtime results. Test bodies are in separate files ready to reuse for ARM in the third patch. All tests passing on aarch64-none-elf and aarch64_be-none-elf. gcc/testsuite/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * gcc.target/aarch64/simd/vrev16p8_1.c: New file. * gcc.target/aarch64/simd/vrev16p8.x: New file. * gcc.target/aarch64/simd/vrev16qp8_1.c: New file. * gcc.target/aarch64/simd/vrev16qp8.x: New file. * gcc.target/aarch64/simd/vrev16qs8_1.c: New file. * gcc.target/aarch64/simd/vrev16qs8.x: New file. * gcc.target/aarch64/simd/vrev16qu8_1.c: New file. * gcc.target/aarch64/simd/vrev16qu8.x: New file. * gcc.target/aarch64/simd/vrev16s8_1.c: New file. * gcc.target/aarch64/simd/vrev16s8.x: New file. * gcc.target/aarch64/simd/vrev16u8_1.c: New file. * gcc.target/aarch64/simd/vrev16u8.x: New file. * gcc.target/aarch64/simd/vrev32p16_1.c: New file. * gcc.target/aarch64/simd/vrev32p16.x: New file. * gcc.target/aarch64/simd/vrev32p8_1.c: New file. * gcc.target/aarch64/simd/vrev32p8.x: New file. * gcc.target/aarch64/simd/vrev32qp16_1.c: New file. * gcc.target/aarch64/simd/vrev32qp16.x: New file. * gcc.target/aarch64/simd/vrev32qp8_1.c: New file. * gcc.target/aarch64/simd/vrev32qp8.x: New file. * gcc.target/aarch64/simd/vrev32qs16_1.c: New file. * gcc.target/aarch64/simd/vrev32qs16.x: New file. * gcc.target/aarch64/simd/vrev32qs8_1.c: New file. * gcc.target/aarch64/simd/vrev32qs8.x: New file. * gcc.target/aarch64/simd/vrev32qu16_1.c: New file. * gcc.target/aarch64/simd/vrev32qu16.x: New file. * gcc.target/aarch64/simd/vrev32qu8_1.c: New file. * gcc.target/aarch64/simd/vrev32qu8.x: New file. * gcc.target/aarch64/simd/vrev32s16_1.c: New file. * gcc.target/aarch64/simd/vrev32s16.x: New file. * gcc.target/aarch64/simd/vrev32s8_1.c: New file. * gcc.target/aarch64/simd/vrev32s8.x: New file. * gcc.target/aarch64/simd/vrev32u16_1.c: New file. 
* gcc.target/aarch64/simd/vrev32u16.x: New file. * gcc.target/aarch64/simd/vrev32u8_1.c: New file. * gcc.target/aarch64/simd/vrev32u8.x: New file. * gcc.target/aarch64/simd/vrev64f32_1.c: New file. * gcc.target/aarch64/simd/vrev64f32.x: New file. * gcc.target/aarch64/simd/vrev64p16_1.c: New file. * gcc.target/aarch64/simd/vrev64p16.x: New file. * gcc.target/aarch64/simd/vrev64p8_1.c: New file. * gcc.target/aarch64/simd/vrev64p8.x: New file. * gcc.target/aarch64/simd/vrev64qf32_1.c: New file. * gcc.target/aarch64/simd/vrev64qf32.x: New file. * gcc.target/aarch64/simd/vrev64qp16_1.c: New file. * gcc.target/aarch64/simd/vrev64qp16.x: New file. * gcc.target/aarch64/simd/vrev64qp8_1.c: New file. * gcc.target/aarch64/simd/vrev64qp8.x: New file. * gcc.target/aarch64/simd/vrev64qs16_1.c: New file. * gcc.target/aarch64/simd/vrev64qs16.x: New file. * gcc.target/aarch64/simd/vrev64qs32_1.c: New file. * gcc.target/aarch64/simd/vrev64qs32.x: New file. * gcc.target/aarch64/simd/vrev64qs8_1.c: New file. * gcc.target/aarch64/simd/vrev64qs8.x: New file. * gcc.target/aarch64/simd/vrev64qu16_1.c: New file. * gcc.target/aarch64/simd/vrev64qu16.x: New file. * gcc.target/aarch64/simd/vrev64qu32_1.c: New file. * gcc.target/aarch64/simd/vrev64qu32.x: New file. * gcc.target/aarch64/simd/vrev64qu8_1.c: New file. * gcc.target/aarch64/simd/vrev64qu8.x: New file. * gcc.target/aarch64/simd/vrev64s16_1.c: New file. * gcc.target/aarch64/simd/vrev64s16.x: New file. * gcc.target/aarch64/simd/vrev64s32_1.c: New file. * gcc.target/aarch64/simd/vrev64s32.x: New file. * gcc.target/aarch64/simd/vrev64s8_1.c: New file. * gcc.target/aarch64/simd/vrev64s8.x: New file. * gcc.target/aarch64/simd/vrev64u16_1.c: New file. * gcc.target/aarch64/simd/vrev64u16.x: New file. * gcc.target/aarch64/simd/vrev64u32_1.c: New file. * gcc.target/aarch64/simd/vrev64u32.x: New file. * gcc.target/aarch64/simd/vrev64u8_1.c: New file. 
* gcc.target/aarch64/simd/vrev64u8.x: New file.diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x b/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x new file mode 100644 index 000..6316abf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/vrev16p8.x @@ -0,0 +1,22 @@ +extern void abort (void); + +poly8x8_t +test_vrev16p8 (poly8x8_t _arg) +{ + return vrev16_p8 (_arg); +} + +int +main (int argc, char **argv) +{ + int i; + poly8x8_t inorder = {1, 2, 3, 4, 5, 6, 7, 8}; + poly8x8_t reversed =
[AArch64/ARM 2/3] Recognize shuffle patterns for REV instructions on AARch64, rewrite intrinsics.
This patch (borrowing heavily from the ARM backend) makes aarch64_expand_vec_perm_const output REV instructions when appropriate, and then implements the vrev_XXX intrinsics in terms of __builtin_shuffle (which now produces the same assembly instructions). No regressions (and tests in previous patch http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01468.html still passing) on aarch64-none-elf; also on aarch64_be-none-elf, where there are no regressions following testsuite config changes in http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html, but some noise (due to unexpected success in vectorization) without that patch. gcc/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com * config/aarch64/iterators.md: add a REVERSE iterator and rev_op attribute for REV64/32/16 insns. * config/aarch64/aarch64-simd.md: add corresponding define_insn parameterized by REVERSE iterator. * config/aarch64/aarch64.c (aarch64_evpc_rev): recognize REVnn patterns. (aarch64_expand_vec_perm_const_1): call aarch64_evpc_rev also. 
* config/aarch64/arm_neon.h (vrev{16,32,64}[q]_{s,p,u,f}{8,16,32}): rewrite to use __builtin_shuffle.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 4dffb59..d499e86 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4032,6 +4032,15 @@
   [(set_attr "type" "neon_permute<q>")]
 )
 
+(define_insn "aarch64_rev<REVERSE:rev_op><mode>"
+  [(set (match_operand:VALL 0 "register_operand" "=w")
+        (unspec:VALL [(match_operand:VALL 1 "register_operand" "w")]
+                    REVERSE))]
+  "TARGET_SIMD"
+  "rev<REVERSE:rev_op>\\t%0.<Vtype>, %1.<Vtype>"
+  [(set_attr "type" "neon_rev<q>")]
+)
+
 (define_insn "aarch64_st2<mode>_dreg"
   [(set (match_operand:TI 0 "aarch64_simd_struct_operand" "=Utv")
         (unspec:TI [(match_operand:OI 1 "register_operand" "w")
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 16c51a8..5bb10a2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8047,6 +8047,80 @@ aarch64_evpc_zip (struct expand_vec_perm_d *d)
   return true;
 }
 
+/* Recognize patterns for the REV insns.  */
+
+static bool
+aarch64_evpc_rev (struct expand_vec_perm_d *d)
+{
+  unsigned int i, j, diff, nelt = d->nelt;
+  rtx (*gen) (rtx, rtx);
+
+  if (!d->one_vector_p)
+    return false;
+
+  diff = d->perm[0];
+  switch (diff)
+    {
+    case 7:
+      switch (d->vmode)
+        {
+        case V16QImode: gen = gen_aarch64_rev64v16qi; break;
+        case V8QImode: gen = gen_aarch64_rev64v8qi; break;
+        default:
+          return false;
+        }
+      break;
+    case 3:
+      switch (d->vmode)
+        {
+        case V16QImode: gen = gen_aarch64_rev32v16qi; break;
+        case V8QImode: gen = gen_aarch64_rev32v8qi; break;
+        case V8HImode: gen = gen_aarch64_rev64v8hi; break;
+        case V4HImode: gen = gen_aarch64_rev64v4hi; break;
+        default:
+          return false;
+        }
+      break;
+    case 1:
+      switch (d->vmode)
+        {
+        case V16QImode: gen = gen_aarch64_rev16v16qi; break;
+        case V8QImode: gen = gen_aarch64_rev16v8qi; break;
+        case V8HImode: gen = gen_aarch64_rev32v8hi; break;
+        case V4HImode: gen = gen_aarch64_rev32v4hi; break;
+        case V4SImode: gen = gen_aarch64_rev64v4si; break;
+        case V2SImode: gen = gen_aarch64_rev64v2si; break;
+        case V4SFmode: gen = gen_aarch64_rev64v4sf; break;
+        case V2SFmode: gen = gen_aarch64_rev64v2sf; break;
+        default:
+          return false;
+        }
+      break;
+    default:
+      return false;
+    }
+
+  for (i = 0; i < nelt ; i += diff + 1)
+    for (j = 0; j <= diff; j += 1)
+      {
+        /* This is guaranteed to be true as the value of diff
+           is 7, 3, 1 and we should have enough elements in the
+           queue to generate this.  Getting a vector mask with a
+           value of diff other than these values implies that
+           something is wrong by the time we get here.  */
+        gcc_assert (i + j < nelt);
+        if (d->perm[i + j] != i + diff - j)
+          return false;
+      }
+
+  /* Success! */
+  if (d->testing_p)
+    return true;
+
+  emit_insn (gen (d->target, d->op0));
+  return true;
+}
+
 static bool
 aarch64_evpc_dup (struct expand_vec_perm_d *d)
 {
@@ -8153,6 +8227,8 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
         return true;
       else if (aarch64_evpc_trn (d))
         return true;
+      else if (aarch64_evpc_rev (d))
+        return true;
       else if (aarch64_evpc_dup (d))
         return true;
       return aarch64_evpc_tbl (d);
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 6af99361..383ed56 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -10628,402 +10628,6 @@ vrecpeq_u32 (uint32x4_t a)
   return result;
 }
 
-__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__))
-vrev16_p8 (poly8x8_t a)
-{
-  poly8x8_t result;
-  __asm__ ("rev16 %0.8b,%1.8b"
-           : "=w"(result)
-           : "w"(a)
-           : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int8x8_t
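[Editorial sketch] The index check in aarch64_evpc_rev can be read on its own: the first mask element gives diff, the mask is cut into groups of diff + 1 lanes, and each group must list its own lanes in reverse order. Below is a simplified standalone restatement of that loop. It is not the patch's code: the vector-mode checks and insn emission are left out, is_rev_permutation is a hypothetical name, and where the patch asserts that each group fits, this sketch returns false instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified restatement of the index check in aarch64_evpc_rev.
   PERM holds NELT lane indices of a single-input shuffle.  Returns
   true when every group of diff + 1 lanes is reversed in place.  */
bool
is_rev_permutation (const unsigned char *perm, size_t nelt)
{
  if (nelt == 0)
    return false;

  /* The first index tells how wide each reversed group is.  */
  size_t diff = perm[0];
  if (diff != 1 && diff != 3 && diff != 7)
    return false;

  /* Sketch-only guard: reject masks that do not hold a whole number
     of groups (the patch uses gcc_assert for the equivalent check).  */
  if (nelt % (diff + 1) != 0)
    return false;

  for (size_t i = 0; i < nelt; i += diff + 1)
    for (size_t j = 0; j <= diff; j++)
      if (perm[i + j] != i + diff - j)
        return false;

  return true;
}
```

For example, on eight byte lanes the mask {1, 0, 3, 2, 5, 4, 7, 6} has diff 1 and corresponds to the V8QImode case that selects rev16, while {7, 6, 5, 4, 3, 2, 1, 0} has diff 7 and selects rev64. The same diff on wider lanes maps to a wider REV, which is why the patch switches on both diff and the vector mode.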
Re: [c++] typeinfo for target types
On Wed, 23 Apr 2014, Richard Henderson wrote:

> On 04/13/2014 01:41 AM, Marc Glisse wrote:
>> Hello,
>>
>> this patch generates typeinfo for target types. On x86_64, it adds these 6 lines to nm -C libsupc++.a. A follow-up patch will be needed to export and version those in the shared library.
>>
>> + V typeinfo for __float128
>> + V typeinfo for __float128 const*
>> + V typeinfo for __float128*
>> + V typeinfo name for __float128
>> + V typeinfo name for __float128 const*
>> + V typeinfo name for __float128*
>>
>> Bootstrap and testsuite on x86_64-linux-gnu (a bit of noise in tsan/tls_race.c).
>>
>> 2014-04-13 Marc Glisse marc.gli...@inria.fr
>>
>> PR libstdc++/43622
>> gcc/c-family/
>> * c-common.c (registered_builtin_types): Make non-static.
>> * c-common.h (registered_builtin_types): Declare.
>> gcc/cp/
>> * rtti.c (emit_support_tinfo_1): New function, extracted from emit_support_tinfos.
>> (emit_support_tinfos): Call it and iterate on registered_builtin_types.
>
> This is causing aarch64 builds to break.

If it is causing too much trouble, we could ifdef out the last 2 lines of emit_support_tinfos and revert the libstdc++ changes (or even revert the whole thing).

> Any c++ compilation aborts at

That's surprising, the code I touched is only ever supposed to run while compiling one file in libsupc++, if I understand correctly.

> #0 fancy_abort (file=0x14195c8 "../../git-rh/gcc/cp/mangle.c", line=2303, function=0x1419ff8 <write_builtin_type(tree_node*)::__FUNCTION__> "write_builtin_type") at ../../git-rh/gcc/diagnostic.c:1190
> #1 0x007ce2b4 in write_builtin_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:2303
> #2 0x007cc85c in write_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:1969
> #3 0x007d4d98 in mangle_special_for_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>, code=0x1419a98 "TI") at ../../git-rh/gcc/cp/mangle.c:3569
> #4 0x007d4dcc in mangle_typeinfo_for_type (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/mangle.c:3585
> #5 0x0070618c in get_tinfo_decl (type=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/rtti.c:422
> #6 0x00709ff0 in emit_support_tinfo_1 (bltn=<real_type 0x7fb1653540 __builtin_aarch64_simd_df>) at ../../git-rh/gcc/cp/rtti.c:1485
> #7 0x0070a344 in emit_support_tinfos () at ../../git-rh/gcc/cp/rtti.c:1550
>
> Presumably the backend needs to grow some mangling support for its builtins,

aarch64 has complicated builtins... __builtin_aarch64_simd_df uses double_aarch64_type_node which is not the same as double_type_node. I mostly looked at the x86 backend, so I didn't notice that aarch64 registers a lot more builtins.

> but in the meantime can we do something less drastic than abort?

Sounds good, but I am not sure how exactly. We could use a separate hook (register_builtin_type_for_typeinfo?) so back-ends have to explicitly say they want typeinfo, but it is ugly having to register types multiple times. We could add a parameter to the existing register_builtin_type saying whether we want typeinfo, but that means updating all back-ends. We could get the mangling functions to take a parameter that says whether errors should be fatal and skip generating the typeinfo when we can't mangle, but there is no convenient way to communicate this mangling failure (0 bytes written?).

Would mangling the aarch64 builtins be a lot of work? Did other platforms break as well?

> Isn't this only really an issue if someone tries to access one of these types via typeinfo?

Yes.

--
Marc Glisse
Re: [i386] define __SIZEOF_FLOAT128__
On Wed, Apr 23, 2014 at 11:48 AM, Marc Glisse marc.gli...@inria.fr wrote:
> (Adding an i386 maintainer in Cc)
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00620.html
>
> On Sun, 13 Apr 2014, Marc Glisse wrote:
>
>> Hello,
>>
>> some people like having a macro to test if a type is available (__SIZEOF_INT128__ for instance). This adds macros for __float80 and __float128. The types seem to be always available, so I didn't add any condition.
>>
>> If you think this is a bad idea, please close the PR.
>>
>> Bootstrap+testsuite on x86_64-linux-gnu.
>>
>> 2014-04-13 Marc Glisse marc.gli...@inria.fr
>>
>> PR preprocessor/56540
>> * config/i386/i386-c.c (ix86_target_macros): Define __SIZEOF_FLOAT80__ and __SIZEOF_FLOAT128__.

For __SIZEOF_FLOAT80__, you should check TARGET_128BIT_LONG_DOUBLE instead of TARGET_64BIT.

--
H.J.
[AArch64/ARM 3/3] Add execution tests of ARM REV intrinsics
Final patch in series, adds new tests of the REV Neon Intrinsics for ARM. These tests subsume the autogenerated tests in gcc/testsuite/gcc.target/arm/neon (that only check assembler output) by also checking the execution results, reusing the test bodies introduced into AArch64 in the first patch. Testsuite driver simd.exp from http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01500.html, will ensure that's committed first. All passing on arm-none-eabi. gcc/testsuite/ChangeLog: 2014-04-23 Alan Lawrence alan.lawre...@arm.com gcc.target/arm/simd/vrev16p8_1.c: New file. gcc.target/arm/simd/vrev16qp8_1.c: New file. gcc.target/arm/simd/vrev16qs8_1.c: New file. gcc.target/arm/simd/vrev16qu8_1.c: New file. gcc.target/arm/simd/vrev16s8_1.c: New file. gcc.target/arm/simd/vrev16u8_1.c: New file. gcc.target/arm/simd/vrev32p16_1.c: New file. gcc.target/arm/simd/vrev32p8_1.c: New file. gcc.target/arm/simd/vrev32qp16_1.c: New file. gcc.target/arm/simd/vrev32qp8_1.c: New file. gcc.target/arm/simd/vrev32qs16_1.c: New file. gcc.target/arm/simd/vrev32qs8_1.c: New file. gcc.target/arm/simd/vrev32qu16_1.c: New file. gcc.target/arm/simd/vrev32qu8_1.c: New file. gcc.target/arm/simd/vrev32s16_1.c: New file. gcc.target/arm/simd/vrev32s8_1.c: New file. gcc.target/arm/simd/vrev32u16_1.c: New file. gcc.target/arm/simd/vrev32u8_1.c: New file. gcc.target/arm/simd/vrev64f32_1.c: New file. gcc.target/arm/simd/vrev64p16_1.c: New file. gcc.target/arm/simd/vrev64p8_1.c: New file. gcc.target/arm/simd/vrev64qf32_1.c: New file. gcc.target/arm/simd/vrev64qp16_1.c: New file. gcc.target/arm/simd/vrev64qp8_1.c: New file. gcc.target/arm/simd/vrev64qs16_1.c: New file. gcc.target/arm/simd/vrev64qs32_1.c: New file. gcc.target/arm/simd/vrev64qs8_1.c: New file. gcc.target/arm/simd/vrev64qu16_1.c: New file. gcc.target/arm/simd/vrev64qu32_1.c: New file. gcc.target/arm/simd/vrev64qu8_1.c: New file. gcc.target/arm/simd/vrev64s16_1.c: New file. gcc.target/arm/simd/vrev64s32_1.c: New file. 
gcc.target/arm/simd/vrev64s8_1.c: New file. gcc.target/arm/simd/vrev64u16_1.c: New file. gcc.target/arm/simd/vrev64u32_1.c: New file. gcc.target/arm/simd/vrev64u8_1.c: New file.diff --git a/gcc/testsuite/gcc.target/arm/simd/vrev16p8_1.c b/gcc/testsuite/gcc.target/arm/simd/vrev16p8_1.c new file mode 100644 index 000..fddb32f --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vrev16p8_1.c @@ -0,0 +1,12 @@ +/* Test the `vrev16p8' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vrev16p8.x + +/* { dg-final { scan-assembler vrev16\.8\[ \t\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vrev16qp8_1.c b/gcc/testsuite/gcc.target/arm/simd/vrev16qp8_1.c new file mode 100644 index 000..b4634b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vrev16qp8_1.c @@ -0,0 +1,12 @@ +/* Test the `vrev16q_p8' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vrev16qp8.x + +/* { dg-final { scan-assembler vrev16\.8\[ \t\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vrev16qs8_1.c b/gcc/testsuite/gcc.target/arm/simd/vrev16qs8_1.c new file mode 100644 index 000..691799b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vrev16qs8_1.c @@ -0,0 +1,12 @@ +/* Test the `vrev16q_s8' ARM Neon intrinsic. 
*/ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vrev16qs8.x + +/* { dg-final { scan-assembler vrev16\.8\[ \t\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \t\]+@\[a-zA-Z0-9 \]+\)?\n } } */ +/* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.target/arm/simd/vrev16qu8_1.c b/gcc/testsuite/gcc.target/arm/simd/vrev16qu8_1.c new file mode 100644 index 000..f6ab4ac --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/simd/vrev16qu8_1.c @@ -0,0 +1,12 @@ +/* Test the `vrev16q_u8' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options -save-temps -fno-inline } */ +/* { dg-add-options arm_neon } */ + +#include arm_neon.h +#include ../../aarch64/simd/vrev16qu8.x + +/* { dg-final {