[Ping] Allow dg-skip-if to use compiler flags specified through set_board_info cflags
Hello,

Could one of the maintainers please check in the patch below? It was
already approved at
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00689.html

Regards
Senthil

On Tue, Aug 21, 2012 at 03:08:43PM -0700, Mike Stump wrote:
> On Aug 11, 2012, at 10:39 AM, Senthil Kumar Selvaraj wrote:
> > This patch allows cflags set in board config files using
> > set_board_info cflags to be used in the selectors of dg-skip-if
> > and other dejagnu commands that use the check-flags proc.
>
> Ok.

    * lib/target-supports-dg.exp (check-flags): Add cflags from
    board config to compiler_flags

diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp
index 2f6c4c2..bdf7476 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -304,6 +304,9 @@ proc check-flags { args } {
     # If running a subset of the test suite, $TEST_ALWAYS_FLAGS may not exist.
     catch {append compiler_flags $TEST_ALWAYS_FLAGS }
     set dest [target_info name]
+    if [board_info $dest exists cflags] {
+        append compiler_flags [board_info $dest cflags]
+    }
     if [board_info $dest exists multilib_flags] {
         append compiler_flags [board_info $dest multilib_flags]
     }
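For illustration only (this file is not part of the patch): once board cflags are visible to check-flags, a test could be skipped because of a flag that appears only in the board configuration. The sketch below uses the usual dg-skip-if form (comment, target selector, include-opts, exclude-opts); the flag name -mhypothetical-flag and the function name are placeholders I made up.

```c
/* Hypothetical testsuite file, for illustration only.  The option
   "-mhypothetical-flag" is a placeholder, not a real GCC option.  */
/* { dg-do compile } */
/* { dg-skip-if "not meaningful with this flag" { *-*-* } { "-mhypothetical-flag" } { "" } } */

/* Trivial body so that the file is a complete translation unit.  */
int
sketch_add (int a, int b)
{
  return a + b;
}
```

With the patch applied, the include-opts list would presumably also match when the flag comes from set_board_info cflags rather than from the command line or multilib flags.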
Re: [patch, mips] Patch for new mips triplet - mips-mti-elf
Steve Ellcey sell...@mips.com writes: On Mon, 2012-09-17 at 21:36 +0100, Richard Sandiford wrote: It's a hosted vs. embedded thing. Hosted targets like *-linux-gnu have dynamic ABI requirements and so are keyed off an ABI rather than an architecture. The only effect of -march= and --with-arch= should be to extend the choice of available instructions. That was actually very useful in pre MIPS(isa)32 and MIPS(isa)64 days, because MIPS IV had ISA extensions that could be used in 32-bit as well as 64-bit code. You could therefore use the MIPS IV extensions with an existing 32-bit MIPS I or MIPS II sysroot. The same sort of thing applied to processor-specific extensions in 64-bit processors (of which there were many :-)). It's less useful with the stock MIPS32 and MIPS64 ISAs because the 32-bit subset of MIPS64 is (by design) essentially MIPS32. If this is less useful now and since the multilib mips-mti-linux-gnu target I created earlier is only supporting the mips32(r2) and mips64(r2) ISAs (and not MIPS IV, etc) what do you think about me changing that target to default to n32 when specifying the mips64 or mips64r2 architectures and not specifying an explicit ABI? That way both the mips-mti-linux-gnu and mips-mti-elf targets will behave in the same way with regards to the default ABI. JFTR, it would be the case even if you did support MIPS IV. n32 was the best ABI there too. My point was that if you only had access to a 32-bit sysroot and/or kernel, compiling for o32 with MIPS IV (i.e. a 64-bit arch) did have something to offer over o32 with the highest 32-bit arch (MIPS II). But yeah, since mips-mti-linux-gnu provides separate multilibs and sysroots for mips64 and mips64r2, you can define the ABI of those multilibs and sysroots to be what you like. And I agree n32 makes sense. But I think it would be a bad idea simply to add a rule to DRIVER_SELF_SPECS. I think you'd also want to make the 64-bit sysroots mips64-linux-gnu-style (i.e. 
IRIX-6-style) sysroots, with n32 stuff in /lib32, /usr/lib32, etc., rather than in /lib and /usr/lib. That way, your sysroots are compatible with mipsisa64-linux-gnu and mipsisa64r2-linux-gnu, in case anyone ever does need to build their own toolchain. You also won't need to patch glibc for your layout, and won't confuse package build scripts that expect n32 stuff to be in the standard locations. Obviously that'll mean a bit of work in the t-* makefile fragments though. Another thing to watch out for is that mips-sde-elf redefines the n32 ABI so that long double is only 64 bits (i.e. equivalent to double), not the ABI-prescribed 128 bits. I think it'd be better to avoid that change for mips-mti-linux-gnu, for the same reasons as above. Richard
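As a side note on the long double point, a quick way to see which variant a given toolchain implements is to compare the two types directly. This is a generic C sketch with a made-up function name; nothing in it is MIPS-specific.

```c
#include <float.h>

/* Returns 1 if long double has more mantissa bits than double, and 0
   if the two formats match (the situation described above for the
   mips-sde-elf flavour of n32).  */
int
sketch_long_double_is_wider (void)
{
  return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

The result depends on the target ABI, so it is only meaningful when built with the toolchain and ABI options under discussion.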
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
On 17/09/2012 20:45, Mikael Morin wrote:
> *** resolve_fl_derived0 (gfc_symbol *sym)
> *** 12282,12289
> --- 12558,12574
>                            || c->attr.proc_pointer
>                            || c->attr.allocatable)) == FAILURE)
>         return FAILURE;
> +
> +       if (c->ts.type == BT_DERIVED
> +           && c->ts.u.derived->f2k_derived
> +           && c->ts.u.derived->f2k_derived->tb_op[INTRINSIC_ASSIGN])
> +         sym->attr.defined_assign_comp = 1;
>       }
>
> +   if (super_type)
> +     sym->attr.defined_assign_comp
> +       = super_type->attr.defined_assign_comp;

I guess Tobias' reported bug is here. The flag shouldn't be cleared here if it was set just before.

Or maybe it is just before, as it doesn't check c->ts.u.derived->attr.defined_assign_comp
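To make the first concern concrete, here is a reduced C sketch of two ways to combine the flag with the parent type's flag. The struct and function names are invented for illustration and are not gfortran's; that OR-ing is the right fix for the PR is an assumption on my part, not something stated in the thread.

```c
/* Stand-in for the symbol attributes; not gfortran's symbol_attribute.  */
struct sketch_attr
{
  unsigned defined_assign_comp : 1;
};

/* Plain assignment: a flag set from the components just before is
   lost again when the parent type does not have it.  */
void
sketch_inherit_assign (struct sketch_attr *sym,
                       const struct sketch_attr *super)
{
  if (super)
    sym->defined_assign_comp = super->defined_assign_comp;
}

/* OR-ing instead keeps a flag that was already set.  */
void
sketch_inherit_or (struct sketch_attr *sym,
                   const struct sketch_attr *super)
{
  if (super)
    sym->defined_assign_comp |= super->defined_assign_comp;
}
```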
Re: [Patch ARM] big-endian support for Neon vext tests
On 17/09/12 20:04, Christophe Lyon wrote: On 17 September 2012 20:04, Richard Earnshaw rearn...@arm.com wrote: On 17/09/12 16:50, Christophe Lyon wrote: On 17 September 2012 17:21, Richard Earnshaw rearn...@arm.com wrote: On 17/09/12 16:13, Christophe Lyon wrote: On 17 September 2012 14:56, Richard Earnshaw rearn...@arm.com wrote: On 05/09/12 23:14, Christophe Lyon wrote: Hello, Although the recent optimization I have committed to use Neon vext instruction for suitable builtin_shuffle calls does not support big-endian yet, I have written a patch to the existing testcases such that they now support big-endian mode. I think it's worth improving these tests since writing the right masks for big-endian (such that the program computes the same results as in little-endian) is not always straightforward. In particular: * I have added some comments in a few tests where it took me a while to find the right mask. * In the case of the test which is executed, I had to force the noinline attribute on the helper functions, otherwise the computed results are wrong in big-endian. It is probably an overkill workaround but it works :-) I am going to file a bugzilla for this problem. I have checked that replacing calls to builtin_shuffle by the expected Neon vext variant produces the expected results in big-endian mode, and I arranged the big-endian masks to get the same results. Christophe. [Attachment: neon-vext-big-endian-tests.patch] I'm not sure about this. The documentation in the manual for builtin_shuffle makes no mention of the results/behaviour being endian dependent, which makes me wonder why this test needs to be. R. Indeed, but I had to modify the mask value in order to get the same results in big and little-endian. If the mask should be the same (it would be much more comfortable for the developers indeed), then GCC needs to be changed/fixed. That's what I'm trying to establish. I suspect that there is a bug in GCC for all big-endian code here. 
What happens for a test of uint8x8_t? Well, in my sample testcase in little-endian, I used mask = {2, 3, 4, 5, 6, 7, 8, 9}, which can be optimized into vext #2. In big-endian mode, explicitly forcing use of vext #2 leads to the right result, but to achieve it using builtin_shuffle, I had to change the mask into {14, 15, 0, 1, 2, 3, 4, 5}. I did read the thread starting at http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01133.html and the threads it references, and I must admit that I got a bit confused :-) IMHO, it's currently impossible for a GCC user to write code using vector initializers that would be portable on big and little endian targets. It's too much of a headache. It was also a purpose of this patch: have someone react if it looked inappropriate. Thanks for the review, Christophe. I think for big-endian, __builtin_shuffle needs to expand to (for 64-bit vectors) vrev64.size mask vext and for 128-bit vectors vrev64.size mask vswap masktop-dword, masklow-dword vext ... Obviously, if you've got a literal you can simplify the operations down to something that doesn't need the extra instructions. R. Does this mean that the currently generated code is wrong (I mean when no optimization is performed by the compiler, as is currently the case with GCC in big-endian)? Quite possibly. In order to determine what is right, we first need to understand the specification. My reading of that is that the semantics should be endian independent, but I was hoping that someone would know for certain and be able to chip in. The alternative would be to try the code on PPC and x86 to check the behaviour there for big and little endian respectively, but I don't have access to a PPC board and I'd rather not trust simulators for some of this. When the input mask is known by the compiler, it generates a series of moves to perform the shuffle operation. Which theoretically should, I think, be doing the transform I described above. R.
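For reference while reading the thread, here is a scalar model of the two-operand shuffle in terms of element numbers, following the GCC manual's description (mask values taken modulo 2*N; indices 0..N-1 select from the first vector, N..2*N-1 from the second). It is a sketch with a made-up function name, and it deliberately says nothing about how elements map to NEON lanes on big-endian, which is the open question above.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a two-operand shuffle over N-element byte vectors,
   expressed purely in terms of element numbers.  */
void
sketch_shuffle2 (const uint8_t *a, const uint8_t *b, const uint8_t *mask,
                 uint8_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      size_t idx = mask[i] % (2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}
```

With a = {0..7}, b = {8..15} and the little-endian mask {2, 3, 4, 5, 6, 7, 8, 9}, the model yields elements 2..7 of a followed by elements 0..1 of b, which is the "extract starting at element 2" pattern that vext #2 implements.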
[libbacktrace] Fix bootstrap with gcc 4.4
The libbacktrace integration broke Solaris 10 and 11 bootstrap when
using gcc 4.4 (any version of gcc without __sync_* support actually):
stage1 config.h has

/* #undef HAVE_SYNC_FUNCTIONS */

and fileline.c and mmap.c fail to compile:

/vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c: In function 'fileline_initialize':
/vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c:58: error: implicit declaration of function 'abort'

The following patch fixes this by including stdlib.h for the abort()
declaration in the affected files. It allows the Solaris 11 bootstrap
to continue. Ok for mainline?

Unfortunately, Solaris 10 (and certainly Solaris 9, too) bootstrap is
still broken:

/vol/gcc/src/hg/trunk/local/libbacktrace/dwarf.c:652: error: implicit declaration of function 'strnlen'
make[1]: *** [dwarf.lo] Error 1

Both completely lack strnlen(). I haven't done anything about this yet.

Rainer

2012-09-18  Rainer Orth  r...@cebitec.uni-bielefeld.de

    * fileline.c: Include stdlib.h.
    * mmap.c: Likewise.

# HG changeset patch
# Parent a22dd5d7246fa4e8a73de2e66db7594cf9ae9f5a
Fix bootstrap with gcc 4.4

diff --git a/libbacktrace/fileline.c b/libbacktrace/fileline.c
--- a/libbacktrace/fileline.c
+++ b/libbacktrace/fileline.c
@@ -35,6 +35,7 @@ POSSIBILITY OF SUCH DAMAGE. */
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
+#include <stdlib.h>
 
 #include "backtrace.h"
 #include "internal.h"
diff --git a/libbacktrace/mmap.c b/libbacktrace/mmap.c
--- a/libbacktrace/mmap.c
+++ b/libbacktrace/mmap.c
@@ -34,6 +34,7 @@ POSSIBILITY OF SUCH DAMAGE. */
 
 #include <errno.h>
 #include <string.h>
+#include <stdlib.h>
 #include <unistd.h>
 #include <sys/mman.h>
 
--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] PR other/54411: libiberty: objalloc_alloc integer overflows (CVE-2012-3509)
On 09/17/2012 05:59 PM, Ian Lance Taylor wrote:

>> Fair enough. I've added a wraparound check to the macro. Okay for trunk?
>
>>  {
>> +  unsigned long len = original_len;
>>    /* We avoid confusion from zero sized objects by always allocating at
>>       least 1 byte. */
>
> Please add a blank line after the variable declaration.
>
>> -     (__len <= __o->current_space \
>> +     (__len && __len <= __o->current_space \
>
> Please write __len != 0 or len > 0.
>
> This is OK with those changes.

Thanks, committed with these changes.

--
Florian Weimer / Red Hat Product Security Team
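For readers who have not looked at the patch itself, the kind of wraparound being guarded against can be sketched as follows. This is not the libiberty code: the alignment constant and the function name are assumptions chosen for illustration. The point is only that rounding a huge request up to the alignment can wrap to a small value (here, to 0), which a check like __len != 0 then catches.

```c
#include <limits.h>

/* Illustrative alignment; the real value in objalloc differs by host.  */
#define SKETCH_ALIGN 8UL

/* Round ORIGINAL_LEN up to a multiple of SKETCH_ALIGN.  Returns 1 and
   stores the result in *ROUNDED on success, or 0 if the rounding
   wrapped around.  */
int
sketch_round_up (unsigned long original_len, unsigned long *rounded)
{
  unsigned long len = original_len;

  /* Mirror the "always allocate at least 1 byte" convention.  */
  if (len == 0)
    len = 1;

  len = (len + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);

  /* After a wrap the rounded value is smaller than the request.  */
  if (len < original_len)
    return 0;

  *rounded = len;
  return 1;
}
```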
Re: PATCH RFA: Print backtrace on ICE
On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: This patch to the diagnostic code uses the new backtrace library to print a backtrace on an ICE. For example, here is the output of a test case I took from a C++ PR: /home/iant/foo2.cc:6:6: internal compiler error: in cp_lexer_new_from_tokens, at cp/parser.c:638 0xec549f internal_error(char const*, ...) ../../trunk/gcc/diagnostic.c:1057 0xec3f53 fancy_abort(char const*, int, char const*) ../../trunk/gcc/diagnostic.c: 0x5ff78e cp_lexer_new_from_tokens ../../trunk/gcc/cp/parser.c:638 0x5ff78e cp_parser_push_lexer_for_tokens ../../trunk/gcc/cp/parser.c:3290 0x60ff40 cp_parser_late_parsing_for_member ../../trunk/gcc/cp/parser.c:21713 0x60ff40 cp_parser_class_specifier_1 ../../trunk/gcc/cp/parser.c:18207 0x60ff40 cp_parser_class_specifier ../../trunk/gcc/cp/parser.c:18231 0x60ff40 cp_parser_type_specifier ../../trunk/gcc/cp/parser.c:13390 0x61c83d cp_parser_decl_specifier_seq ../../trunk/gcc/cp/parser.c:10731 0x628317 cp_parser_single_declaration ../../trunk/gcc/cp/parser.c:21313 0x6289c0 cp_parser_template_declaration_after_export ../../trunk/gcc/cp/parser.c:21198 0x62de39 cp_parser_declaration ../../trunk/gcc/cp/parser.c:10183 0x62c487 cp_parser_declaration_seq_opt ../../trunk/gcc/cp/parser.c:10105 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file() ../../trunk/gcc/cp/parser.c:27557 0x72e4e4 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1138 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. Bootstrapped on x86_64-unknown-linux-gnu. I didn't bother to run the testsuite, since the code only changes when an ICE occurs anyhow. OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? 
I'd say that we should do sth fancy with the backtrace first, like in your example note that it came from an assert (and skip the first two frames), or more simple - skip frames until the function name we printed anyways is listed. Then for !ENABLE_CHECKING I'd derive bugzilla components (backtrace from the frontend? from which tree/RTL pass?). I mean the above is so verbose that bugreporters likely will only paste the last non-interesting lines like 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file() ../../trunk/gcc/cp/parser.c:27557 0x72e4e4 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1138 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. also consider ICEs from infinite recursion - you'd get a way too large backtrace (so please consider pruning recursions). Or at least provide a way to disable the backtrace printing with a configure switch. Thanks, Richard. Ian gcc/: 2012-09-17 Ian Lance Taylor i...@google.com * diagnostic.c: Include demangle.h and backtrace.h. (bt_stop): New static array. (bt_callback, bt_err_callback): New static functions. (diagnostic_action_after_output): Call backtrace_full for DK_ICE. * Makefile.in (BACKTRACE): New variable. (BACKTRACEINC, LIBBACKTRACE): New variables. (BACKTRACE_H): New variable. (LIBDEPS, LIBS): Add $(LIBBACKTRACE). (INCLUDES): Add $(BACKTRACEINC). (diagnostic.o): Depend upon $(DEMANGLE_H) and $(BACKTRACE_H). ./: 2012-09-17 Ian Lance Taylor i...@google.com * Makefile.def: Make all-gcc depend on all-libbacktrace. * Makefile.in: Rebuild.
Re: PATCH RFA: Print backtrace on ICE
On Tue, Sep 18, 2012 at 10:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: This patch to the diagnostic code uses the new backtrace library to print a backtrace on an ICE. For example, here is the output of a test case I took from a C++ PR: /home/iant/foo2.cc:6:6: internal compiler error: in cp_lexer_new_from_tokens, at cp/parser.c:638 0xec549f internal_error(char const*, ...) ../../trunk/gcc/diagnostic.c:1057 0xec3f53 fancy_abort(char const*, int, char const*) ../../trunk/gcc/diagnostic.c: 0x5ff78e cp_lexer_new_from_tokens ../../trunk/gcc/cp/parser.c:638 0x5ff78e cp_parser_push_lexer_for_tokens ../../trunk/gcc/cp/parser.c:3290 0x60ff40 cp_parser_late_parsing_for_member ../../trunk/gcc/cp/parser.c:21713 0x60ff40 cp_parser_class_specifier_1 ../../trunk/gcc/cp/parser.c:18207 0x60ff40 cp_parser_class_specifier ../../trunk/gcc/cp/parser.c:18231 0x60ff40 cp_parser_type_specifier ../../trunk/gcc/cp/parser.c:13390 0x61c83d cp_parser_decl_specifier_seq ../../trunk/gcc/cp/parser.c:10731 0x628317 cp_parser_single_declaration ../../trunk/gcc/cp/parser.c:21313 0x6289c0 cp_parser_template_declaration_after_export ../../trunk/gcc/cp/parser.c:21198 0x62de39 cp_parser_declaration ../../trunk/gcc/cp/parser.c:10183 0x62c487 cp_parser_declaration_seq_opt ../../trunk/gcc/cp/parser.c:10105 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file() ../../trunk/gcc/cp/parser.c:27557 0x72e4e4 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1138 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. Bootstrapped on x86_64-unknown-linux-gnu. I didn't bother to run the testsuite, since the code only changes when an ICE occurs anyhow. OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? 
I'd say that we should do sth fancy with the backtrace first, like in your example note that it came from an assert (and skip the first two frames), or more simple - skip frames until the function name we printed anyways is listed. Then for !ENABLE_CHECKING I'd derive bugzilla components (backtrace from the frontend? from which tree/RTL pass?). I mean the above is so verbose that bugreporters likely will only paste the last non-interesting lines like 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file() ../../trunk/gcc/cp/parser.c:27557 0x72e4e4 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1138 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. also consider ICEs from infinite recursion - you'd get a way too large backtrace (so please consider pruning recursions). Which also means to use an alternate stack for all this (we probably should use sigaltstack for the ICEs anyway). Richard.
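The two pruning ideas above (skip frames until the function already named in the ICE message, and collapse runs caused by recursion) can be sketched on a plain array of function names. This is only an illustration with made-up helper names; it does not use the libbacktrace API and is not what the patch does.

```c
#include <stddef.h>
#include <string.h>

/* Index of the first frame whose name equals REPORTED (the function
   already printed in the "internal compiler error: in ..." line), or
   0 if it is not found, so that nothing is skipped in that case.  */
size_t
sketch_first_interesting (const char *const *names, size_t n,
                          const char *reported)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (names[i], reported) == 0)
      return i;
  return 0;
}

/* Collapse runs of identical adjacent names in place, as a crude way
   to prune direct recursion.  Returns the new number of frames.  */
size_t
sketch_prune_repeats (const char **names, size_t n)
{
  size_t out = 0;

  for (size_t i = 0; i < n; i++)
    if (out == 0 || strcmp (names[out - 1], names[i]) != 0)
      names[out++] = names[i];
  return out;
}
```

Applied to the example backtrace, the first helper would skip internal_error and fancy_abort and start at cp_lexer_new_from_tokens. Mutual recursion would need something smarter than comparing adjacent frames.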
Re: [libbacktrace] Fix bootstrap with gcc 4.4
On Tue, Sep 18, 2012 at 10:32 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
> The libbacktrace integration broke Solaris 10 and 11 bootstrap when
> using gcc 4.4 (any version of gcc without __sync_* support actually):

Ouch, that's bad.

> stage1 config.h has
>
> /* #undef HAVE_SYNC_FUNCTIONS */
>
> and fileline.c and mmap.c fail to compile:
>
> /vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c: In function 'fileline_initialize':
> /vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c:58: error: implicit declaration of function 'abort'
>
> The following patch fixes this by including stdlib.h for the abort()
> declaration in the affected files. It allows the Solaris 11 bootstrap
> to continue. Ok for mainline?

Ok.

Thanks,
Richard.

> Unfortunately, Solaris 10 (and certainly Solaris 9, too) bootstrap is
> still broken:
>
> /vol/gcc/src/hg/trunk/local/libbacktrace/dwarf.c:652: error: implicit declaration of function 'strnlen'
> make[1]: *** [dwarf.lo] Error 1
>
> Both completely lack strnlen(). I haven't done anything about this yet.
>
> Rainer
>
> 2012-09-18  Rainer Orth  r...@cebitec.uni-bielefeld.de
>
>     * fileline.c: Include stdlib.h.
>     * mmap.c: Likewise.
>
> --
> Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [libbacktrace] Fix bootstrap with gcc 4.4
On Tue, Sep 18, 2012 at 10:54 AM, Richard Guenther richard.guent...@gmail.com wrote:
> On Tue, Sep 18, 2012 at 10:32 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>> The libbacktrace integration broke Solaris 10 and 11 bootstrap when
>> using gcc 4.4 (any version of gcc without __sync_* support actually):
>
> Ouch, that's bad.

Btw, why do we need to build libbacktrace during stage1?

>> stage1 config.h has
>>
>> /* #undef HAVE_SYNC_FUNCTIONS */
>>
>> and fileline.c and mmap.c fail to compile:
>>
>> /vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c: In function 'fileline_initialize':
>> /vol/gcc/src/hg/trunk/local/libbacktrace/fileline.c:58: error: implicit declaration of function 'abort'
>>
>> The following patch fixes this by including stdlib.h for the abort()
>> declaration in the affected files. It allows the Solaris 11 bootstrap
>> to continue. Ok for mainline?
>
> Ok.
>
> Thanks,
> Richard.
>
>> Unfortunately, Solaris 10 (and certainly Solaris 9, too) bootstrap is
>> still broken:
>>
>> /vol/gcc/src/hg/trunk/local/libbacktrace/dwarf.c:652: error: implicit declaration of function 'strnlen'
>> make[1]: *** [dwarf.lo] Error 1
>>
>> Both completely lack strnlen(). I haven't done anything about this yet.
>>
>> Rainer
>>
>> 2012-09-18  Rainer Orth  r...@cebitec.uni-bielefeld.de
>>
>>     * fileline.c: Include stdlib.h.
>>     * mmap.c: Likewise.
>>
>> --
>> Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH, libbacktrace]: Fix compilation on CentOS 5.8
Hello!

CentOS 5.8 uses glibc version 2.5 that needs _GNU_SOURCE defined to use strnlen.

2012-09-18  Uros Bizjak  ubiz...@gmail.com

    * dwarf.c: Define _GNU_SOURCE.

Tested on CentOS x86_64-pc-linux-gnu. OK for mainline?

Uros.

Index: dwarf.c
===================================================================
--- dwarf.c (revision 191413)
+++ dwarf.c (working copy)
@@ -30,6 +30,8 @@
 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE. */
 
+#define _GNU_SOURCE
+
 #include "config.h"
 
 #include <errno.h>
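The placement in the patch matters: a feature-test macro only has an effect if it is defined before the first system header is included. A minimal sketch of the same arrangement follows; the wrapper function name is made up, and on recent C libraries strnlen is usually declared even without the macro.

```c
/* Feature-test macros must come before any system header.  The guard
   avoids a redefinition warning if the build already passes
   -D_GNU_SOURCE.  */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <string.h>

/* Length of S, looking at no more than MAX bytes.  */
size_t
sketch_bounded_len (const char *s, size_t max)
{
  return strnlen (s, max);
}
```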
Re: [PATCH][RFC] Use overloads for gimple_build_assign_with_ops
On Mon, 17 Sep 2012, Richard Guenther wrote: On Mon, 17 Sep 2012, Diego Novillo wrote: On 2012-09-17 09:25 , Richard Guenther wrote: This makes use of the new builtin functions for FILE, LINE and FUNCTION to turn gimple_build_assign_with_ops/gimple_build_assign_with_ops3 into two overloads of gimple_build_assign_with_ops (in theory the _stats function can go and we could move the inlines to gimple.c instead, eventually removing the assert and simply calling the 3-op overload from the 2-op overload?) Sounds like a good next step, yes. The patch merely serves as an example on how to transform mem-stat code to non-macros. Quickly build-tested with --disable-gather-detailed-mem-stats, stage1 with --enable-gather-detailed-mem-stats and host GCC 4.6 prints gimple.h:766 ((null)) 112: 0.0% 0: 0.0% 0: 0.0% 0: 0.0% for a former gimple_build_assign_with_ops3 call, stage2 (or stage1 with a GCC 4.8 host compiler) prints tree-ssa-math-opts.c:2610 (convert_mult_to_fma) 112: 0.0% 0: 0.0% 0: 0.0% 0: 0.0% 1 so it effectively cripples -fmem-report when not compiled with a compiler supporting the builtins. But for a bootstrapped 4.8+ compiler, this won't matter, right? It's only when using a host compiler that doesn't support the builtins. Yes. Though as it's mostly used for development in which case non-bootstrapped compilers are used it makes --enable-gather-detailed-mem-stats less useful unless you know of this fact. It will also actively break installed pre-release 4.8 compilers used as host compilers ... Any comments/objections? Looks good to me. Thanks. Thanks, currently bootstrapping / reg-testing on x86_64-unknown-linux-gnu, I'll apply it tomorrow. It turns out the assert doesn't work anyway, so I've gone with the following, bootstrapped and tested on x86_64-unknown-linux-gnu (surprisingly PCH doesn't work with --enable-gather-mem-stats ... huh). Richard. 2012-09-18 Richard Guenther rguent...@suse.de * statistics.h (CXX_MEM_STAT_INFO): New define. 
* gimple.h (gimple_build_assign_with_ops_stat, gimple_build_assign_with_ops, gimple_build_assign_with_ops3): Turn into an overload of the function gimple_build_assign_with_ops. * gimple.c (gimple_build_assign_with_ops_stat): Rename to ... (gimple_build_assign_with_ops): ... this. * tree-ssa-loop-im.c (move_computations_stmt): Adjust. * tree-ssa-math-opts.c (convert_mult_to_fma): Likewise. * tree-vect-data-refs.c (vect_permute_store_chain): Likewise. (vect_permute_load_chain): Likewise. * tree-vect-generic.c (expand_vector_divmod): Likewise. * tree-vect-patterns.c (vect_recog_dot_prod_pattern): Likewise. (vect_recog_divmod_pattern): Likewise. (vect_recog_mixed_size_cond_pattern): Likewise. (adjust_bool_pattern): Likewise. * tree-vect-slp.c (vect_create_mask_and_perm): Likewise. * tree-vect-stmts.c (vectorizable_operation): Likewise. (permute_vec_elements): Likewise. (vectorizable_load): Likewise. Index: trunk/gcc/gimple.h === *** trunk.orig/gcc/gimple.h 2012-09-11 12:25:21.0 +0200 --- trunk/gcc/gimple.h 2012-09-17 17:06:59.424806556 +0200 *** gimple gimple_build_assign_stat (tree, t *** 744,755 void extract_ops_from_tree_1 (tree, enum tree_code *, tree *, tree *, tree *); ! gimple gimple_build_assign_with_ops_stat (enum tree_code, tree, tree, ! tree, tree MEM_STAT_DECL); ! #define gimple_build_assign_with_ops(c,o1,o2,o3) \ ! gimple_build_assign_with_ops_stat (c, o1, o2, o3, NULL_TREE MEM_STAT_INFO) ! #define gimple_build_assign_with_ops3(c,o1,o2,o3,o4) \ ! gimple_build_assign_with_ops_stat (c, o1, o2, o3, o4 MEM_STAT_INFO) gimple gimple_build_debug_bind_stat (tree, tree, gimple MEM_STAT_DECL); #define gimple_build_debug_bind(var,val,stmt) \ --- 744,755 void extract_ops_from_tree_1 (tree, enum tree_code *, tree *, tree *, tree *); ! gimple ! gimple_build_assign_with_ops (enum tree_code, tree, ! tree, tree CXX_MEM_STAT_INFO); ! gimple ! gimple_build_assign_with_ops (enum tree_code, tree, ! 
tree, tree, tree CXX_MEM_STAT_INFO); gimple gimple_build_debug_bind_stat (tree, tree, gimple MEM_STAT_DECL); #define gimple_build_debug_bind(var,val,stmt) \ Index: trunk/gcc/tree-ssa-loop-im.c === *** trunk.orig/gcc/tree-ssa-loop-im.c 2012-09-11 16:02:22.0 +0200 --- trunk/gcc/tree-ssa-loop-im.c2012-09-17 14:50:36.634089905 +0200 ***
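For readers unfamiliar with the mem-stat pattern that this patch moves away from, the macro style can be sketched in plain C as below. All sketch_* names are made up, and the real MEM_STAT_DECL / MEM_STAT_INFO macros differ in detail. The patch instead uses C++ overloads whose call-site information presumably comes from the new builtins via CXX_MEM_STAT_INFO, which has no direct C equivalent.

```c
/* Last recorded call site; a stand-in for real statistics bookkeeping.  */
static const char *sketch_last_file;
static int sketch_last_line;
static const char *sketch_last_function;

/* The "_stat" worker receives the call site explicitly.  */
static int
sketch_build_stat (int value, const char *file, int line,
                   const char *function)
{
  sketch_last_file = file;
  sketch_last_line = line;
  sketch_last_function = function;
  return value;
}

/* Macro wrapper: __FILE__, __LINE__ and __func__ are expanded where
   the macro is used, so the worker sees the caller's location.  */
#define sketch_build(value) \
  sketch_build_stat ((value), __FILE__, __LINE__, __func__)

/* Example call site.  */
int
sketch_caller (void)
{
  return sketch_build (42);
}
```

The drawback discussed in the thread is the mirror image of this: with the builtin-based approach, a host compiler lacking the builtins cannot supply the caller's location, hence the "(null)" entry in the report.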
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 11:09 AM, Uros Bizjak ubiz...@gmail.com wrote: Hello! CentOS 5.8 uses glibc version 2.5 that needs _GNU_SOURCE defined to use strnlen. Hm, shouldn't libiberty contain a xstrnlen? I bet strnlen isn't available everywhere. Richard. 2012-09-18 Uros Bizjak ubiz...@gmail.com * dwarf.c: Define _GNU_SOURCE. Tested on CentOS x86_64-pc-linux-gnu. OK for mainline? Uros. Index: dwarf.c === --- dwarf.c (revision 191413) +++ dwarf.c (working copy) @@ -30,6 +30,8 @@ IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ +#define _GNU_SOURCE + #include config.h #include errno.h
[PATCH, AArch64] Implement ctz and clrsb standard patterns
I've implemented the following standard patterns:

* clrsb
* ctz

Regression runs passed and I have added compilation tests for them,
and clz as well. (Execution tests are covered by
gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c.)

OK for aarch64-branch and aarch64-4.7-branch?

Cheers,
Ian

2012-09-18  Ian Bolton  ian.bol...@arm.com

gcc/
    * config/aarch64/aarch64.h: Define CTZ_DEFINED_VALUE_AT_ZERO.
    * config/aarch64/aarch64.md (clrsb<mode>2): New pattern.
    * config/aarch64/aarch64.md (rbit<mode>2): New pattern.
    * config/aarch64/aarch64.md (ctz<mode>2): New pattern.

gcc/testsuite/
    * gcc.target/aarch64/clrsb.c: New test.
    * gcc.target/aarch64/clz.c: New test.
    * gcc.target/aarch64/ctz.c: New test.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 5d121fa..abf96c5 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -703,6 +703,8 @@ do { \
 #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
   ((VALUE) = ((MODE) == SImode ? 32 : 64), 2)
+#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
+  ((VALUE) = ((MODE) == SImode ? 32 : 64), 2)
 #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, LR_REGNUM)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 33815ff..5278957 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -153,6 +153,8 @@
     (UNSPEC_CMTST 83) ; Used in aarch64-simd.md.
     (UNSPEC_FMAX 83) ; Used in aarch64-simd.md.
     (UNSPEC_FMIN 84) ; Used in aarch64-simd.md.
+    (UNSPEC_CLS 85) ; Used in aarch64-simd.md.
+    (UNSPEC_RBIT 86) ; Used in aarch64-simd.md.
   ]
 )
@@ -2128,6 +2130,33 @@
   [(set_attr "v8type" "clz")
    (set_attr "mode" "<MODE>")])
+(define_insn "clrsb<mode>2"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_CLS))]
+  ""
+  "cls\\t%<w>0, %<w>1"
+  [(set_attr "v8type" "clz")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "rbit<mode>2"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_RBIT))]
+  ""
+  "rbit\\t%<w>0, %<w>1"
+  [(set_attr "v8type" "rbit")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "ctz<mode>2"
+  [(match_operand:GPI 0 "register_operand")
+   (match_operand:GPI 1 "register_operand")]
+  ""
+  {
+    emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
+    emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
+    DONE;
+  }
+)
+
 (define_insn "*and<mode>3nr_compare0"
   [(set (reg:CC CC_REGNUM)
         (compare:CC
diff --git a/gcc/testsuite/gcc.target/aarch64/clrsb.c b/gcc/testsuite/gcc.target/aarch64/clrsb.c
new file mode 100644
index 000..a75dfa0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/clrsb.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int functest(unsigned int x)
+{
+  return __builtin_clrsb(x);
+}
+
+/* { dg-final { scan-assembler "cls\tw" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/clz.c b/gcc/testsuite/gcc.target/aarch64/clz.c
new file mode 100644
index 000..66e2d29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/clz.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int functest(unsigned int x)
+{
+  return __builtin_clz(x);
+}
+
+/* { dg-final { scan-assembler "clz\tw" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/ctz.c b/gcc/testsuite/gcc.target/aarch64/ctz.c
new file mode 100644
index 000..15a2473
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ctz.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int functest(unsigned int x)
+{
+  return __builtin_ctz(x);
+}
+
+/* { dg-final { scan-assembler "rbit\tw" } } */
+/* { dg-final { scan-assembler "clz\tw" } } */
+
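The ctz expander relies on the identity ctz(x) = clz(bit-reverse(x)). A portable C sketch of that identity, for the 32-bit case, is below; the function names are made up, and the loops only model what the rbit and clz instructions compute rather than how the hardware does it.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit value (what an RBIT-style
   instruction computes).  */
uint32_t
sketch_rbit32 (uint32_t x)
{
  uint32_t r = 0;

  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1u);
      x >>= 1;
    }
  return r;
}

/* Count leading zeros; returns 32 for a zero input, matching the
   SImode value the patch gives in CLZ_DEFINED_VALUE_AT_ZERO.  */
unsigned
sketch_clz32 (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Count trailing zeros as clz of the bit-reversed value, i.e. the
   rbit + clz sequence emitted by the ctz expander.  */
unsigned
sketch_ctz32 (uint32_t x)
{
  return sketch_clz32 (sketch_rbit32 (x));
}
```

Since bit-reversing zero gives zero, the sequence returns 32 for a zero input in this model, which is consistent with defining CTZ_DEFINED_VALUE_AT_ZERO with the same values as the CLZ macro.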
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 11:16 AM, Richard Guenther richard.guent...@gmail.com wrote: CentOS 5.8 uses glibc version 2.5 that needs _GNU_SOURCE defined to use strnlen. Hm, shouldn't libiberty contain a xstrnlen? I bet strnlen isn't available everywhere. I didn't find it in the sources. OTOH, mmapio.c already defines _GNU_SOURCE for some reason, so I just follow this approach. Uros.
Re: [PATCH, AArch64] Implement ctz and clrsb standard patterns
Ian Bolton ian.bol...@arm.com writes: diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 33815ff..5278957 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -153,6 +153,8 @@ (UNSPEC_CMTST 83) ; Used in aarch64-simd.md. (UNSPEC_FMAX 83) ; Used in aarch64-simd.md. (UNSPEC_FMIN 84) ; Used in aarch64-simd.md. +(UNSPEC_CLS 85) ; Used in aarch64-simd.md. +(UNSPEC_RBIT 86) ; Used in aarch64-simd.md. The comment doesn't appear to be true. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
... I bet strnlen isn't available everywhere. You won!-(it is not available on darwin10) I had to change the strnlen to strlen in order to bootstrap. Dominique
RE: [PATCH, AArch64] Implement ctz and clrsb standard patterns
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 33815ff..5278957 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -153,6 +153,8 @@ (UNSPEC_CMTST 83) ; Used in aarch64-simd.md. (UNSPEC_FMAX83) ; Used in aarch64-simd.md. (UNSPEC_FMIN84) ; Used in aarch64-simd.md. +(UNSPEC_CLS 85) ; Used in aarch64-simd.md. +(UNSPEC_RBIT86) ; Used in aarch64-simd.md. The comment doesn't appear to be true. Fair point! I will fix that.
RE: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c
Ping. -----Original Message----- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Bin.Cheng Sent: Tuesday, September 04, 2012 11:20 PM To: Richard Guenther; gcc-patches@gcc.gnu.org Cc: Richard Earnshaw Subject: Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c Sorry, I mis-sent this offline. On Tue, Sep 4, 2012 at 11:00 PM, Bin.Cheng amker.ch...@gmail.com wrote: It's not ok (I btw fail to see the patch in this thread). The current way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should instead be changed to always match the pattern LOGICAL_OP_NON_SHORT_CIRCUIT && (BRANCH_COST (optimize_function_for_speed_p (cfun), false) >= 2) and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1, defined in defaults.h (and the docs updated). That's not going to work for modern ARM cores. We want to set BRANCH_COST to 1 but still have it generate the non-short-circuit code (because conditional compares are really cheap). Hi Richard, For now, the LOGICAL_OP_NON_SHORT_CIRCUIT macro is defined as below, which duplicates the BRANCH_COST condition. #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT #define LOGICAL_OP_NON_SHORT_CIRCUIT \ (BRANCH_COST (optimize_function_for_speed_p (cfun), \ false) >= 2) #endif Recently we measured performance on some ARM processors and found it would be better to have the non-short-circuit optimization while setting BRANCH_COST to 1, which is impossible with the present code. So here comes this patch as below: Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 189835) +++ gcc/fold-const.c(working copy) @@ -8443,9 +8443,7 @@ if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0) return tem; - if ((BRANCH_COST (optimize_function_for_speed_p (cfun), - false) >= 2) - && LOGICAL_OP_NON_SHORT_CIRCUIT + if (LOGICAL_OP_NON_SHORT_CIRCUIT && (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR || code == TRUTH_OR_EXPR The purpose is to remove the duplicate check on BRANCH_COST. 
As Andrew pointed out that the patch may change behavior if some back-ends define the macro independent of BRANCH_COST. After looking into the code, there are two uses of the macro in fold-const.c, each controls one kind code transformation. The first use is: else if (LOGICAL_OP_NON_SHORT_CIRCUIT lhs != 0 rhs != 0 (code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) operand_equal_p (lhs, rhs, 0)) The second one is: if ((BRANCH_COST (optimize_function_for_speed_p (cfun), false) = 2) LOGICAL_OP_NON_SHORT_CIRCUIT (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR || code == TRUTH_OR_EXPR || code == TRUTH_ORIF_EXPR)) I am not sure why the 2nd condition is designed in current way and haven't found any useful changelog on it. But considering back end can factor BRANCH_COST in LOGICAL_OP_NON_SHORT_CIRCUIT or not, we can conclude that the behavior will only be changed if some back-end want to control the two transformations differently. So the problem becomes whether the 2nd condition should be changed. Either way there is scenario cannot be covered. And for now, FTR, only two targets redefine L_O_N_S_C: mips and rs6000. Both set it to zero so won't be affected by this change. Hi Richard, I have tried to explain the change, but I am not sure whether it is agreed or... Thanks very much.
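The difference between the old and the patched second condition can be sketched with a small toy model. This is not GCC code: `struct toy_target`, `lonsc`, `old_condition` and `new_condition` are hypothetical names invented for illustration, and the ARM-like configuration mentioned afterwards is only an assumption about how a back end might define the macro once the patch is in.

```c
#include <stdbool.h>

/* Toy description of a back end (hypothetical, for illustration only).  */
struct toy_target {
  int branch_cost;        /* stands in for BRANCH_COST (...) */
  bool overrides_lonsc;   /* true if the back end defines the macro itself */
  bool lonsc_value;       /* value used when overrides_lonsc is true */
};

/* Models LOGICAL_OP_NON_SHORT_CIRCUIT: the back end's own value if it
   defines one, otherwise the default "BRANCH_COST >= 2".  */
static bool lonsc(const struct toy_target *t)
{
  return t->overrides_lonsc ? t->lonsc_value : t->branch_cost >= 2;
}

/* Second use before the patch: BRANCH_COST is checked again.  */
static bool old_condition(const struct toy_target *t)
{
  return t->branch_cost >= 2 && lonsc(t);
}

/* Second use after the patch: only the macro decides.  */
static bool new_condition(const struct toy_target *t)
{
  return lonsc(t);
}
```

In this model a target that keeps the default definition, or one that sets the macro to zero as mips and rs6000 are said to do, gets the same answer from both functions. Only a target with a branch cost of 1 that sets the macro to 1 sees a change: `old_condition` is false and `new_condition` is true.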
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 11:25 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Sep 18, 2012 at 11:16 AM, Richard Guenther richard.guent...@gmail.com wrote: CentOS 5.8 uses glibc version 2.5 that needs _GNU_SOURCE defined to use strnlen. Hm, shouldn't libiberty contain a xstrnlen? I bet strnlen isn't available everywhere. I didn't find it in the sources. OTOH, mmapio.c already defines _GNU_SOURCE for some reason, so I just follow this approach. I said it _should_ contain xstrnlen, not that it does ;) I suppose Ian should have used strlen or added xstrnlen to libiberty. Richard. Uros.
Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c
On Tue, Sep 18, 2012 at 11:32 AM, Bin Cheng bin.ch...@arm.com wrote: Ping. I already approved your original patch upthread. Richard.
Re: [PATCH] Changes in mode switching
Hi Richard, You are right, I do not need the changes in mode-switching.c at all. After I removed the additional argument from EMIT_MODE_SET and ran 'make check', I found no differences from the 'make check' results of the previous run. So I do not need any changes in the middle-end part. Regards, Vladimir P.S. I'll be on vacation till the end of the month. 2012/9/16 Richard Sandiford rdsandif...@googlemail.com: Vladimir Yakovlev vbyakov...@gmail.com writes: I reproduced the failure and found the reason for it. I understand how to resolve it, and now I need only a small change: an additional argument to EMIT_MODE_SET. Is it good for trunk? I'm not sure I understand why you need to know the instruction. The x86 code was: + if (mode == AVX_U128_CLEAN) + { + if (insn) + { + rtx pat = PATTERN(insn); + if (!is_vzeroupper(pat) && !is_vzeroall(pat)) + ix86_emit_vzeroupper (); + } + else + ix86_emit_vzeroupper (); + } + break; But the pass should already know via MODE_AFTER that the mode is set to AVX_U128_CLEAN by vzeroupper and vzeroall. Under what circumstances do we think that we need to set the mode to AVX_U128_CLEAN immediately before vzeroupper or vzeroall? I'm probably making you repeat yourself here, sorry. Richard
Fix instability of -fschedule-insn for x86
Hi All, This patch aims to fix all stability issues related to using the first scheduler in gcc for the x86 target (there are several reported issues related to this problem). The main goal of this work is to let users safely turn on the first scheduler for their code. In some cases this could positively affect performance, especially for in-order Atom. Below is a short description of the proposed changes. The main idea of this patch is to restrict code motion of instructions that have likely spilled HW registers as their operands - in general these instructions unload incoming function arguments at function entry and pass outgoing function arguments. This is essential for correct scheduling since the live range of such insns is not defined before the register allocation phase. This is done through 2 hooks: 1. ix86_adjust_priority, which sets the priority of moves from likely spilled HW registers to the maximum so that such insns are scheduled as soon as possible, i.e. all moves corresponding to incoming function arguments will be scheduled at the top of function entry and moves corresponding to the function return value will be scheduled immediately after the call. 2. ix86_dependencies_evaluation_hook, which inserts additional dependencies for outgoing function arguments passed in likely spilled HW registers to avoid their code motion. This is done through the following steps: - scan the current schedule region to find all call instructions in reverse order; - find the first argument that is passed in a likely spilled HW register. Starting from it, insert an output dependency between it and a possible previous argument (it does not matter whether that one is passed in a likely spilled register or not). This is done in add_parameter_dependencies; after it, all arguments starting from the first one that is passed in a likely spilled register are pairwise connected through output dependencies.
- add dependencies to the first function argument on the rest of the instructions in the current block (until the next call) to avoid intra-block code motion. - to avoid possible inter-block code motion we also check whether the first argument has a dependee in another block and, if so, insert a dependency to the last non-jump set instruction. This patch was extensively tested on Atom (eembc_2_0, spec2000 in base/peak mode) and Big Core (spec2006 in base/peak mode). A full gcc bootstrap with the 1st scheduler turned on by default was also done. Tested for i386 and x86-64, ok for trunk? ChangeLog: 2012-09-18 Yuri Rumyantsev ysrum...@gmail.com * config/i386/i386.c (ix86_dep_by_shift_count_body) : Add check on reload_completed since it can be invoked before register allocation phase in 1st scheduler. (ia32_multipass_dfa_lookahead) : Do not use dfa_lookahead for 1st Scheduler to save compile time. (ix86_sched_reorder) : Do not perform ready list reordering for 1st Scheduler to save compile time. (insn_is_function_arg) : New function. Returns true if lhs of insn is HW function argument register. (add_parameter_dependencies) : New function. Add output dependencies for chain of function adjacent arguments if only there is a move to likely spilled HW registers. Return first argument if at least one dependence was added or NULL otherwise. (avoid_func_arg_motion) : New function. Add output or anti dependency from insn to first_arg to restrict code motion. (add_dependee_for_func_arg) : New function. Avoid cross block motion of function argument through adding dependency from the first non-jump insn in bb. (ix86_dependencies_evaluation_hook) : New function. Hook for schedule1: avoid motion of function arguments passed in likely spilled HW registers. (ix86_adjust_priority) : New function. Hook for schedule1: set priority of moves from likely spilled HW registers to maximum to schedule them as soon as possible.
(ix86_sched_init_global): Do not perform multipass scheduling for 1st Scheduler to save compile time.
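The priority rule described for ix86_adjust_priority can be illustrated with a minimal sketch. The patch itself is not shown in this message, so everything here is an assumption made for illustration: `struct toy_insn`, its fields and `toy_adjust_priority` are hypothetical and do not reflect the real hook signature or GCC's insn representation.

```c
#include <stdbool.h>

/* Toy instruction record (hypothetical, not GCC's rtx).  */
struct toy_insn {
  int priority;                             /* priority computed by the scheduler */
  bool moves_from_likely_spilled_hard_reg;  /* e.g. unloading an incoming argument */
};

/* Return the priority the scheduler should use for INSN.  Moves from
   likely spilled hard registers are raised to MAX_PRIORITY so that they
   are scheduled as early as possible; all other insns keep the priority
   they already had.  */
static int toy_adjust_priority(const struct toy_insn *insn, int max_priority)
{
  if (insn->moves_from_likely_spilled_hard_reg)
    return max_priority;
  return insn->priority;
}
```

Under this reading, an argument-unloading move with a low computed priority is still picked first at function entry, while ordinary instructions are left to the normal priority computation.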
[PATCH] Fix instability of -fschedule-insn for x86
Hi All, Forgot to attach the patch. i386-fschedule-insn_for_x86.diff Description: Binary data
[PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1). Discussion on g...@gcc.gnu.org at least shows interest in this, so this is a formal patch submission with a request for comments on the implementation (not necessarily on what passes are enabled and why). I have bootstrapped and tested this patch with BOOT_C/CXX_FLAGS=-Og -g TARGET_CFLAGS=-Og -g with all languages included (but -Werror disabled, as expected some new maybe-uninit uses pop up). Ok for trunk? Thanks, Richard. 2012-09-18 Richard Guenther rguent...@suse.de PR other/53316 * common.opt (optimize_debug): New variable. (Og): New optimization level. * doc/invoke.texi (Og): Document. * opts.c (maybe_default_option): Add debug parameter. (maybe_default_options): Likewise. (default_options_optimization): Handle -Og. (common_handle_option): Likewise. * passes.c (gate_all_optimizations): Do not run with -Og. (gate_all_optimizations_g): New gate, run with -Og. (pass_all_optimizations_g): New container pass, run with -Og. (init_optimization_passes): Schedule pass_all_optimizations_g alongside pass_all_optimizations. * gcc/testsuite/lib/c-torture.exp: Add -Og -g to default TORTURE_OPTIONS. Index: trunk/gcc/common.opt === *** trunk.orig/gcc/common.opt 2012-07-19 10:39:47.0 +0200 --- trunk/gcc/common.opt2012-08-10 11:58:22.218122816 +0200 *** int optimize *** 32,37 --- 32,40 Variable int optimize_size + Variable + int optimize_debug + ; Not used directly to control optimizations, only to save -Ofast ; setting for optimize attributes. 
Variable *** Ofast *** 446,451 --- 449,458 Common Optimization Optimize for speed disregarding exact standards compliance + Og + Common Optimization + Optimize for debugging experience rather than speed or size + Q Driver Index: trunk/gcc/opts.c === *** trunk.orig/gcc/opts.c 2012-07-24 10:35:57.0 +0200 --- trunk/gcc/opts.c 2012-08-10 13:47:45.678895549 +0200 *** init_options_struct (struct gcc_options *** 314,328 } /* If indicated by the optimization level LEVEL (-Os if SIZE is set, !-Ofast if FAST is set), apply the option DEFAULT_OPT to OPTS and !OPTS_SET, diagnostic context DC, location LOC, with language mask !LANG_MASK and option handlers HANDLERS. */ static void maybe_default_option (struct gcc_options *opts, struct gcc_options *opts_set, const struct default_options *default_opt, ! int level, bool size, bool fast, unsigned int lang_mask, const struct cl_option_handlers *handlers, location_t loc, --- 314,328 } /* If indicated by the optimization level LEVEL (-Os if SIZE is set, !-Ofast if FAST is set, -Og if DEBUG is set), apply the option DEFAULT_OPT !to OPTS and OPTS_SET, diagnostic context DC, location LOC, with language !mask LANG_MASK and option handlers HANDLERS. */ static void maybe_default_option (struct gcc_options *opts, struct gcc_options *opts_set, const struct default_options *default_opt, ! int level, bool size, bool fast, bool debug, unsigned int lang_mask, const struct cl_option_handlers *handlers, location_t loc, *** maybe_default_option (struct gcc_options *** 335,340 --- 335,342 gcc_assert (level == 2); if (fast) gcc_assert (level == 3); + if (debug) + gcc_assert (level == 1); switch (default_opt->levels) { *** maybe_default_option (struct gcc_options *** 351,357 break; case OPT_LEVELS_1_PLUS_SPEED_ONLY: ! enabled = (level >= 1 && !size); break; case OPT_LEVELS_2_PLUS: --- 353,363 break; case OPT_LEVELS_1_PLUS_SPEED_ONLY: ! enabled = (level >= 1 && !size && !debug); ! break; ! ! case OPT_LEVELS_1_PLUS_NOT_DEBUG: !
enabled = (level >= 1 && !debug); break; case OPT_LEVELS_2_PLUS: *** maybe_default_option (struct gcc_options *** 359,365 break; case OPT_LEVELS_2_PLUS_SPEED_ONLY: ! enabled = (level >= 2 && !size); break; case OPT_LEVELS_3_PLUS: --- 365,371 break; case OPT_LEVELS_2_PLUS_SPEED_ONLY: ! enabled = (level >= 2 && !size && !debug); break; case OPT_LEVELS_3_PLUS: *** static void *** 405,411 maybe_default_options (struct gcc_options *opts,
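The level checks touched by the opts.c hunk above can be condensed into a small standalone sketch. It is a toy restatement, not the GCC function: `enum toy_opt_levels` and `toy_option_enabled` are hypothetical names, only the three cases changed or added by the patch are modelled, and plain `assert` stands in for `gcc_assert`.

```c
#include <assert.h>
#include <stdbool.h>

/* The three option classes whose gating the patch changes or adds.  */
enum toy_opt_levels {
  TOY_1_PLUS_SPEED_ONLY,   /* models OPT_LEVELS_1_PLUS_SPEED_ONLY */
  TOY_1_PLUS_NOT_DEBUG,    /* models the new OPT_LEVELS_1_PLUS_NOT_DEBUG */
  TOY_2_PLUS_SPEED_ONLY    /* models OPT_LEVELS_2_PLUS_SPEED_ONLY */
};

/* Decide whether an option of class LEVELS is enabled by default.
   SIZE models -Os (which implies level 2) and DEBUG models -Og
   (which implies level 1), as asserted in the patch.  */
static bool toy_option_enabled(enum toy_opt_levels levels, int level,
                               bool size, bool debug)
{
  if (size)
    assert(level == 2);
  if (debug)
    assert(level == 1);

  switch (levels) {
  case TOY_1_PLUS_SPEED_ONLY:
    return level >= 1 && !size && !debug;
  case TOY_1_PLUS_NOT_DEBUG:
    return level >= 1 && !debug;
  case TOY_2_PLUS_SPEED_ONLY:
    return level >= 2 && !size && !debug;
  }
  return false;
}
```

The sketch makes the intent visible: -Og behaves like level 1 for the assertions, but every class listed here is switched off when `debug` is set, and the new "not debug" class stays on for -Os where the "speed only" class does not.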
Re: [PATCH] Changes in mode switching
Hello! You are right I no need the changes in mode-switchig.c at all. After I remove additional argument from EMIT_MODE_SET and run 'make check' I found no differences with make check result of previous run. So I no need in any changes in the middle end part. Vladimir, can you please investigate, how to emit vzeroupper insns after reload? Vzeroupper emits hard registers, and reload moves the insn around even when declared with unspec_volatile. Uros.
Re: [PATCH] Fix instability of -fschedule-insn for x86
Hello! This patch aims to fix all stability issues related to using the first scheduler in gcc for x86 target (there several reported issues related to this problem). Main idea of this activity is mostly to provide user a possibility to safely turn on first scheduler for his codes. In some cases this could positively affect performance, especially for in-order Atom. Below is short description of proposed changes. 2012-09-18 Yuri Rumyantsev ysrum...@gmail.com * config/i386/i386.c (ix86_dep_by_shift_count_body) : Add check on reload_completed since it can be invoked before register allocation phase in 1st scheduler. (ia32_multipass_dfa_lookahead) : Do not use dfa_lookahead for 1st Scheduler to save compile time. (ix86_sched_reorder) : Do not perform ready list reordering for 1st Scheduler to save compile time. (insn_is_function_arg) : New function. Returns true if lhs of insn is HW function argument register. (add_parameter_dependencies) : New function. Add output dependencies for chain of function adjacent arguments if only there is a move to likely spilled HW registers. Return first argument if at least one dependence was added or NULL otherwise. (avoid_func_arg_motion) : New function. Add output or anti dependency from insn to first_arg to restrict code motion. (add_dependee_for_func_arg) : New function. Avoid cross block motion of function argument through adding dependency from the first non-jump insn in bb. (ix86_dependencies_evaluation_hook) : New function. Hook for schedule1: avoid motion of function arguments passed in passed in likely spilled HW registers. (ix86_adjust_priority) : New function. Hook for schedule1: set priority of moves from likely spilled HW registers to maximum to schedule them as soon as possible. (ix86_sched_init_global): Do not perform multipass scheduling for 1st Scheduler to save compile time. I would kindly ask scheduler expert to review the patch from the scheduler functionality POV. Thanks, Uros.
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, Sep 18, 2012 at 4:23 AM, Richard Guenther rguent...@suse.de wrote: This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1). Discussion on g...@gcc.gnu.org at least shows interest in this, so this is a formal patch submission with a request for comments on the implementation (not necessarily on what passes are enabled and why). I have bootstrapped and tested this patch with BOOT_C/CXX_FLAGS=-Og -g TARGET_CFLAGS=-Og -g with all languages included (but -Werror disabled, as expected some new maybe-uninit uses pop up). Ok for trunk? Thanks, Richard. 2012-09-18 Richard Guenther rguent...@suse.de PR other/53316 * common.opt (optimize_debug): New variable. (Og): New optimization level. * doc/invoke.texi (Og): Document. * opts.c (maybe_default_option): Add debug parameter. (maybe_default_options): Likewise. (default_options_optimization): Handle -Og. (common_handle_option): Likewise. * passes.c (gate_all_optimizations): Do not run with -Og. (gate_all_optimizations_g): New gate, run with -Og. (pass_all_optimizations_g): New container pass, run with -Og. (init_optimization_passes): Schedule pass_all_optimizations_g alongside pass_all_optimizations. * gcc/testsuite/lib/c-torture.exp: Add -Og -g to default TORTURE_OPTIONS. Glibc must be compiled with optimization. Will -Og build glibc? -- H.J.
[PATCH] Fix i386 costs (was: i386: Fix logic error in r188785, PR target/54592)
On Wed, Jun 27, 2012 at 02:36:14PM -0700, Richard Henderson wrote: As noticed by Igor Zamyatin. Committed. PR target/53749 * config/i386/i386.c (ix86_rtx_costs): Fix typo vs UNITS_PER_WORD in 2012-06-23 change. Adjust two other DImode tests as well. This change broke cost computation for vector PLUS/MINUS/AND/IOR/XOR, the vector modes are all wider than word, but they are now all handled as double word integer arithmetics. --- gcc/ChangeLog |6 ++ gcc/config/i386/i386.c |8 +++- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index edfc649..aae8a4d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c ... @@ -32441,7 +32440,7 @@ ix86_rtx_costs (rtx x, int code_i, int outer_code_i, int opno, int *total, case AND: case IOR: case XOR: - if (!TARGET_64BIT && mode == DImode) + if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) { *total = (cost->add * 2 + (rtx_cost (XEXP (x, 0), outer_code, opno, speed) Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-09-18 Jakub Jelinek ja...@redhat.com PR target/54592 * config/i386/i386.c (ix86_rtx_costs): Limit > UNITS_PER_WORD AND/IOR/XOR cost calculation to MODE_INT class modes. * gcc.target/i386/pr54592.c: New test.
--- gcc/config/i386/i386.c.jj 2012-09-13 18:29:08.0 +0200 +++ gcc/config/i386/i386.c 2012-09-18 08:55:08.747028184 +0200 @@ -32792,7 +32792,8 @@ ix86_rtx_costs (rtx x, int code_i, int o case AND: case IOR: case XOR: - if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) + if (GET_MODE_CLASS (mode) == MODE_INT + && GET_MODE_SIZE (mode) > UNITS_PER_WORD) { *total = (cost->add * 2 + (rtx_cost (XEXP (x, 0), outer_code, opno, speed) --- gcc/testsuite/gcc.target/i386/pr54592.c.jj 2012-09-18 09:06:09.399013382 +0200 +++ gcc/testsuite/gcc.target/i386/pr54592.c 2012-09-18 09:13:04.482914236 +0200 @@ -0,0 +1,17 @@ +/* PR target/54592 */ +/* { dg-do compile } */ +/* { dg-options "-Os -msse2" } */ +/* { dg-require-effective-target sse2 } */ + +#include <emmintrin.h> + +void +func (__m128i * foo, size_t a, size_t b, int *dst) +{ + __m128i x = foo[a]; + __m128i y = foo[b]; + __m128i sum = _mm_add_epi32 (x, y); + *dst = _mm_cvtsi128_si32 (sum); +} + +/* { dg-final { scan-assembler "paddd\[^\n\r\]*(\\(\[^\n\r\]*\\)|XMMWORD PTR)" } } */ Jakub
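Why the size-only test misfires on vector modes can be seen in a toy model of the two conditions. The types and helpers below (`struct toy_mode`, `size_only_check`, `int_class_check`) are hypothetical stand-ins, not GCC's machine-mode API; sizes are in bytes and the word size is passed in explicitly.

```c
#include <stdbool.h>

/* Toy stand-in for a machine mode: its class and its size in bytes.  */
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_VECTOR_INT };

struct toy_mode {
  enum toy_mode_class mclass;
  int size;
};

/* Condition after the June change: any mode wider than a word is
   costed as double-word integer arithmetic.  */
static bool size_only_check(struct toy_mode m, int units_per_word)
{
  return m.size > units_per_word;
}

/* Condition after the fix: only scalar integer modes wider than a word
   take the double-word path.  */
static bool int_class_check(struct toy_mode m, int units_per_word)
{
  return m.mclass == TOY_MODE_INT && m.size > units_per_word;
}
```

An 8-byte integer mode on a 4-byte-word target passes both checks, which is the intended double-word case. A 16-byte vector integer mode on an 8-byte-word target passes only the size-only check, which is the miscosting the fix removes.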
Re: [PATCH] Changes in mode switching
I tried to emit vzeroupper after reload as an additional pass of mode switching. I saw one problem that I don't know how to overcome. After 'pro_and_epilogue', there can be no flow edge to the exit block, and the pre_exit block is not created in this case (see routine create_pre_exit). Without that I cannot properly insert vzeroupper at routine exit. Regards, Vladimir 2012/9/18 Uros Bizjak ubiz...@gmail.com: Hello! You are right, I do not need the changes in mode-switching.c at all. After I removed the additional argument from EMIT_MODE_SET and ran 'make check', I found no differences from the 'make check' results of the previous run. So I do not need any changes in the middle-end part. Vladimir, can you please investigate, how to emit vzeroupper insns after reload? Vzeroupper emits hard registers, and reload moves the insn around even when declared with unspec_volatile. Uros.
[PATCH] Fix vector permutation forwprop optimization (PR tree-optimization/54610)
Hi! vect_gen_perm_mask is not suitable for use outside of the vectorizer, it uses current vector size to determine the number of units of a vector, which isn't something that should be used outside of the vectorizer. The following patch just does construct the mask inline, it is not that long code. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-09-18 Jakub Jelinek ja...@redhat.com PR tree-optimization/54610 * tree-ssa-forwprop.c: Include optabs.h. (simplify_vector_constructor): Don't use vect_gen_perm_mask, instead create the mask constant here. * Makefile.in (tree-ssa-forwprop.o): Depend on $(OPTABS_H). * gcc.target/i386/pr54610.c: New test. --- gcc/tree-ssa-forwprop.c.jj 2012-09-14 14:20:56.0 +0200 +++ gcc/tree-ssa-forwprop.c 2012-09-18 10:17:40.627193548 +0200 @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. #include "expr.h" #include "cfgloop.h" #include "tree-vectorizer.h" +#include "optabs.h" /* This pass propagates the RHS of assignment statements into use sites of the LHS of the assignment.
It's basically a specialized @@ -2854,14 +2855,24 @@ simplify_vector_constructor (gimple_stmt return false; if (maybe_ident) -{ - gimple_assign_set_rhs_from_tree (gsi, orig); -} +gimple_assign_set_rhs_from_tree (gsi, orig); else { - op2 = vect_gen_perm_mask (type, sel); - if (!op2) + tree mask_type, *mask_elts; + + if (!can_vec_perm_p (TYPE_MODE (type), false, sel)) + return false; + mask_type + = build_vector_type (build_nonstandard_integer_type (elem_size, 1), +nelts); + if (GET_MODE_CLASS (TYPE_MODE (mask_type)) != MODE_VECTOR_INT + || GET_MODE_SIZE (TYPE_MODE (mask_type)) +!= GET_MODE_SIZE (TYPE_MODE (type))) return false; + mask_elts = XALLOCAVEC (tree, nelts); + for (i = 0; i < nelts; i++) + mask_elts[i] = build_int_cst (TREE_TYPE (mask_type), sel[i]); + op2 = build_vector (mask_type, mask_elts); gimple_assign_set_rhs_with_ops_1 (gsi, VEC_PERM_EXPR, orig, orig, op2); } update_stmt (gsi_stmt (*gsi)); --- gcc/Makefile.in.jj 2012-09-13 07:54:44.0 +0200 +++ gcc/Makefile.in 2012-09-18 10:18:05.717067056 +0200 @@ -2245,7 +2245,7 @@ tree-ssa-forwprop.o : tree-ssa-forwprop.
$(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \ - $(TREE_VECTORIZER_H) + $(TREE_VECTORIZER_H) $(OPTABS_H) tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ --- gcc/testsuite/gcc.target/i386/pr54610.c.jj 2012-09-18 10:24:58.793981091 +0200 +++ gcc/testsuite/gcc.target/i386/pr54610.c 2012-09-18 10:26:26.838535968 +0200 @@ -0,0 +1,17 @@ +/* PR tree-optimization/54610 */ +/* { dg-do compile } */ +/* { dg-options "-O -mavx -fdump-tree-optimized" } */ + +typedef double vec __attribute__((vector_size (2 * sizeof (double)))); +void f (vec *px, vec *y, vec *z) +{ + vec x = *px; + vec t1 = { x[1], x[0] }; + vec t2 = { x[0], x[1] }; + *y = t1; + *z = t2; +} + +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Jakub
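The inline mask construction in the patch boils down to copying the selector indices into a constant vector, unless the selector is the identity. A minimal sketch of that idea on plain arrays follows. It is a toy: `toy_build_perm_mask` is a hypothetical helper, `long` stands in for the integer mask element type, and the `can_vec_perm_p` and mode checks from the patch are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* SEL[i] is the index of the source element that ends up in lane i.
   If SEL is the identity, no permutation is needed: return false and
   leave MASK_ELTS untouched (the patch reuses the original vector in
   that case).  Otherwise store one mask element per lane and return
   true.  */
static bool toy_build_perm_mask(const unsigned char *sel, size_t nelts,
                                long *mask_elts)
{
  bool identity = true;
  for (size_t i = 0; i < nelts; i++)
    if (sel[i] != i)
      identity = false;
  if (identity)
    return false;

  for (size_t i = 0; i < nelts; i++)
    mask_elts[i] = sel[i];
  return true;
}
```

For the new test case, `t1 = { x[1], x[0] }` corresponds to the selector {1, 0} and needs a mask, while `t2 = { x[0], x[1] }` is the identity selector {0, 1} and needs none, which matches the expectation of exactly one VEC_PERM_EXPR in the optimized dump.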
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, 18 Sep 2012, H.J. Lu wrote: On Tue, Sep 18, 2012 at 4:23 AM, Richard Guenther rguent...@suse.de wrote: This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1). Discussion on g...@gcc.gnu.org at least shows interest in this, so this is a formal patch submission with a request for comments on the implementation (not necessarily on what passes are enabled and why). I have bootstrapped and tested this patch with BOOT_C/CXX_FLAGS=-Og -g TARGET_CFLAGS=-Og -g with all languages included (but -Werror disabled, as expected some new maybe-uninit uses pop up). Ok for trunk? Thanks, Richard. 2012-09-18 Richard Guenther rguent...@suse.de PR other/53316 * common.opt (optimize_debug): New variable. (Og): New optimization level. * doc/invoke.texi (Og): Document. * opts.c (maybe_default_option): Add debug parameter. (maybe_default_options): Likewise. (default_options_optimization): Handle -Og. (common_handle_option): Likewise. * passes.c (gate_all_optimizations): Do not run with -Og. (gate_all_optimizations_g): New gate, run with -Og. (pass_all_optimizations_g): New container pass, run with -Og. (init_optimization_passes): Schedule pass_all_optimizations_g alongside pass_all_optimizations. * gcc/testsuite/lib/c-torture.exp: Add -Og -g to default TORTURE_OPTIONS. Glibc must be compiled with optimization. Will -Og build glibc? Apply the patch and check for yourself. I suppose the answer will be yes (whatever optimization requirements glibc has). Richard.
Re: [PATCH] Fix vector permutation forwprop optimization (PR tree-optimization/54610)
On Tue, Sep 18, 2012 at 2:20 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! vect_gen_perm_mask is not suitable for use outside of the vectorizer, it uses current vector size to determine the number of units of a vector, which isn't something that should be used outside of the vectorizer. The following patch just does construct the mask inline, it is not that long code. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard.
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, Sep 18, 2012 at 02:38:27PM +0200, Richard Guenther wrote: Glibc must be compiled with optimization. Will -Og build glibc? Apply the patch and check for yourself. I suppose the answer will be yes (whatever optimization requirements glibc has). glibc headers check __OPTIMIZE__ macro, and that is defined with -Og, so yes, it will build. It builds even with -O0 -D__OPTIMIZE__, at least most of it ;) Jakub
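As a rough illustration of the point about `__OPTIMIZE__`: GCC predefines this macro in optimizing compilations, and a header can branch on it. The sketch below is a hypothetical stand-alone example, not code taken from glibc; the names `SKETCH_FAST_PATH` and `sketch_uses_fast_path` are made up here. It only reports which branch was selected at compile time, so the same source gives a different answer with, say, -O0 versus -O0 -D__OPTIMIZE__.

```c
#include <assert.h>

/* Hypothetical header-style switch: pick a code path depending on
   whether the compiler (or a -D flag) defined __OPTIMIZE__.  */
#ifdef __OPTIMIZE__
# define SKETCH_FAST_PATH 1
#else
# define SKETCH_FAST_PATH 0
#endif

/* Return 1 if this translation unit was compiled with __OPTIMIZE__
   defined, 0 otherwise.  */
int
sketch_uses_fast_path (void)
{
  /* The macro is either defined or not; there is no third state.  */
  assert (SKETCH_FAST_PATH == 0 || SKETCH_FAST_PATH == 1);
  return SKETCH_FAST_PATH;
}
```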
Re: [PATCH] Changes in mode switching
On Tue, Sep 18, 2012 at 2:18 PM, Vladimir Yakovlev vbyakov...@gmail.com wrote: I tried to perform vzeroupper emitting after reload as an additional pass of mode switching. I saw one problem that I don't know how to overcome. After 'pro_and_epilogue', there can be no flow edge to the exit block, and the pre_exit block is not created in this case (see routine create_pre_exit). Without that I cannot properly perform vzeroupper insertion at routine exit. Can you update us on whether any problem remains with vzeroupper insertion in a pre-reload mode switching pass? BTW: Can you please repost the latest version of the target-dependent patch? Uros.
RE: [PATCH, AArch64] Implement ctz and clrsb standard patterns
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 33815ff..5278957 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -153,6 +153,8 @@ (UNSPEC_CMTST 83) ; Used in aarch64-simd.md. (UNSPEC_FMAX 83) ; Used in aarch64-simd.md. (UNSPEC_FMIN 84) ; Used in aarch64-simd.md. +(UNSPEC_CLS 85) ; Used in aarch64-simd.md. +(UNSPEC_RBIT 86) ; Used in aarch64-simd.md. The comment doesn't appear to be true. Fair point! I will fix that. New patch with comment fixed is attached. Now good to commit to aarch64-branch and aarch64-4.7-branch? Cheers, Iandiff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 5d121fa..abf96c5 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -703,6 +703,8 @@ do { \ #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \ ((VALUE) = ((MODE) == SImode ? 32 : 64), 2) +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \ + ((VALUE) = ((MODE) == SImode ? 32 : 64), 2) #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, LR_REGNUM) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 33815ff..5278957 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -153,6 +153,8 @@ (UNSPEC_CMTST 83) ; Used in aarch64-simd.md. (UNSPEC_FMAX83) ; Used in aarch64-simd.md. (UNSPEC_FMIN84) ; Used in aarch64-simd.md. +(UNSPEC_CLS 85) ; Used in aarch64.md. +(UNSPEC_RBIT86) ; Used in aarch64.md. 
] ) @@ -2128,6 +2130,33 @@ [(set_attr v8type clz) (set_attr mode MODE)]) +(define_insn clrsbmode2 + [(set (match_operand:GPI 0 register_operand =r) + (unspec:GPI [(match_operand:GPI 1 register_operand r)] UNSPEC_CLS))] + + cls\\t%w0, %w1 + [(set_attr v8type clz) + (set_attr mode MODE)]) + +(define_insn rbitmode2 + [(set (match_operand:GPI 0 register_operand =r) + (unspec:GPI [(match_operand:GPI 1 register_operand r)] UNSPEC_RBIT))] + + rbit\\t%w0, %w1 + [(set_attr v8type rbit) + (set_attr mode MODE)]) + +(define_expand ctzmode2 + [(match_operand:GPI 0 register_operand) + (match_operand:GPI 1 register_operand)] + + { +emit_insn (gen_rbitmode2 (operands[0], operands[1])); +emit_insn (gen_clzmode2 (operands[0], operands[0])); +DONE; + } +) + (define_insn *andmode3nr_compare0 [(set (reg:CC CC_REGNUM) (compare:CC diff --git a/gcc/testsuite/gcc.target/aarch64/clrsb.c b/gcc/testsuite/gcc.target/aarch64/clrsb.c new file mode 100644 index 000..a75dfa0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/clrsb.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest(unsigned int x) +{ + return __builtin_clrsb(x); +} + +/* { dg-final { scan-assembler cls\tw } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/clz.c b/gcc/testsuite/gcc.target/aarch64/clz.c new file mode 100644 index 000..66e2d29 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/clz.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest(unsigned int x) +{ + return __builtin_clz(x); +} + +/* { dg-final { scan-assembler clz\tw } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/ctz.c b/gcc/testsuite/gcc.target/aarch64/ctz.c new file mode 100644 index 000..15a2473 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ctz.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest(unsigned int x) +{ + return __builtin_ctz(x); +} + +/* { dg-final { scan-assembler rbit\tw } } */ +/* { dg-final { 
scan-assembler clz\tw } } */ +
RE: [PATCH, AArch64] Implement ctz and clrsb standard patterns
New version attached with better formatted test cases. OK for aarch64-branch and aarch64-4.7-branch? Cheers, Ian - 2012-09-18 Ian Bolton ian.bol...@arm.com gcc/ * config/aarch64/aarch64.h: Define CTZ_DEFINED_VALUE_AT_ZERO. * config/aarch64/aarch64.md (clrsbmode2): New pattern. * config/aarch64/aarch64.md (rbitmode2): New pattern. * config/aarch64/aarch64.md (ctzmode2): New pattern. gcc/testsuite/ * gcc.target/aarch64/clrsb.c: New test. * gcc.target/aarch64/clz.c: New test. * gcc.target/aarch64/ctz.c: New test.diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 5d121fa..abf96c5 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -703,6 +703,8 @@ do { \ #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \ ((VALUE) = ((MODE) == SImode ? 32 : 64), 2) +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \ + ((VALUE) = ((MODE) == SImode ? 32 : 64), 2) #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, LR_REGNUM) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 33815ff..5278957 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -153,6 +153,8 @@ (UNSPEC_CMTST 83) ; Used in aarch64-simd.md. (UNSPEC_FMAX83) ; Used in aarch64-simd.md. (UNSPEC_FMIN84) ; Used in aarch64-simd.md. +(UNSPEC_CLS 85) ; Used in aarch64.md. +(UNSPEC_RBIT86) ; Used in aarch64.md. 
] ) @@ -2128,6 +2130,33 @@ [(set_attr v8type clz) (set_attr mode MODE)]) +(define_insn clrsbmode2 + [(set (match_operand:GPI 0 register_operand =r) + (unspec:GPI [(match_operand:GPI 1 register_operand r)] UNSPEC_CLS))] + + cls\\t%w0, %w1 + [(set_attr v8type clz) + (set_attr mode MODE)]) + +(define_insn rbitmode2 + [(set (match_operand:GPI 0 register_operand =r) + (unspec:GPI [(match_operand:GPI 1 register_operand r)] UNSPEC_RBIT))] + + rbit\\t%w0, %w1 + [(set_attr v8type rbit) + (set_attr mode MODE)]) + +(define_expand ctzmode2 + [(match_operand:GPI 0 register_operand) + (match_operand:GPI 1 register_operand)] + + { +emit_insn (gen_rbitmode2 (operands[0], operands[1])); +emit_insn (gen_clzmode2 (operands[0], operands[0])); +DONE; + } +) + (define_insn *andmode3nr_compare0 [(set (reg:CC CC_REGNUM) (compare:CC diff --git a/gcc/testsuite/gcc.target/aarch64/clrsb.c b/gcc/testsuite/gcc.target/aarch64/clrsb.c new file mode 100644 index 000..a75dfa0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/clrsb.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest (unsigned int x) +{ + return __builtin_clrsb (x); +} + +/* { dg-final { scan-assembler cls\tw } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/clz.c b/gcc/testsuite/gcc.target/aarch64/clz.c new file mode 100644 index 000..66e2d29 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/clz.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest (unsigned int x) +{ + return __builtin_clz (x); +} + +/* { dg-final { scan-assembler clz\tw } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/ctz.c b/gcc/testsuite/gcc.target/aarch64/ctz.c new file mode 100644 index 000..15a2473 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ctz.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +unsigned int functest (unsigned int x) +{ + return __builtin_ctz (x); +} + +/* { dg-final { scan-assembler rbit\tw } } */ +/* { dg-final { 
scan-assembler clz\tw } } */ +
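The ctz expander in the patch above emits rbit followed by clz. A minimal software sketch of that identity for 32-bit values is given below. The helper names `rbit32`, `clz32`, and `ctz32` are made up for illustration and are not part of the patch; the loops stand in for the hardware instructions. Returning 32 for a zero input mirrors the SImode value that the patch gives for CTZ_DEFINED_VALUE_AT_ZERO.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of a 32-bit bit reversal (what RBIT does on a W register).  */
static uint32_t
rbit32 (uint32_t x)
{
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
  x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
  x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
  return (x >> 16) | (x << 16);
}

/* Software model of count-leading-zeros, defined as 32 at zero.  */
static unsigned
clz32 (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Count trailing zeros as clz of the bit-reversed input.  */
unsigned
ctz32 (uint32_t x)
{
  unsigned n = clz32 (rbit32 (x));

  assert (n <= 32);
  return n;
}
```

Because reversing the bits turns trailing zeros into leading zeros, no special case is needed for zero: the reversed value is still zero and the clz model already returns 32.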
Rewrite lto-symtab to work on symbol table
Hi, this patch reorganizes lto-symtab to work across symtab's symbol table instead of building its own. This simplifies things a bit and with the previous changes it is rather straightforward - i.e. replace all uses of lto_symtab_entry_t by symtab_node. There are a few differences between the symtab as built by lto-symtab and our symbol table. In one direction, the declarations that are not going to be output to final assembly (i.e. are used by debug info and such) are not in the symbol table and consequently they no longer get merged. I think this is fine. The other difference is that the symbol table contains some symbols that are not really symbols in the classical definition - such as inline clones or functions held in the table only for purposes of materialization. I added the symtab_real_symbol_p predicate for this. It would make more sense to exclude those from the assembler name hash and drop the checks I added to lto-symtab.c. I plan to work on this incrementally - it is not completely trivial. The symbol can become non-real in several ways and it will need a bit of work to get this consistent. Bootstrapped/regtested x86_64-linux, tested by building Mozilla, Qt and other stuff with LTO. OK? Honza * symtab.c (insert_to_assembler_name_hash): Do not insert register vars. (unlink_from_assembler_name_hash): NULL out pointers of unlinked var. (symtab_prevail_in_asm_name_hash): New. (symtab_initialize_asm_name_hash): Break out from ... (symtab_node_for_asm): ... here. (dump_symtab_base): Dump LTO file data. (verify_symtab_base): Register vars are not in symtab. * cgraph.h (symtab_initialize_asm_name_hash, symtab_prevail_in_asm_name_hash): New functions. (symtab_real_symbol_p): New inline. * lto-symtab.c: Do not include gt-lto-symtab.h. (lto_symtab_entry_def): Remove. (lto_symtab_entry_t): Remove. (lto_symtab_identifiers): Remove. (lto_symtab_free): Remove. (lto_symtab_entry_hash): Remove. (lto_symtab_entry_eq): Remove. (lto_symtab_entry_marked_p): Remove.
(lto_symtab_maybe_init_hash_table): Remove. (resolution_guessed_p, set_resolution_guessed): New functions. (lto_symtab_register_decl): Only set resolution info. (lto_symtab_get, lto_symtab_get_resolution): Remove. (lto_symtab_merge): Reorg to work across symtab; do nothing if decls are same. (lto_symtab_resolve_replaceable_p): Reorg to work on symtab. (lto_symtab_resolve_can_prevail_p): Likewise; only real symbols can prevail. (lto_symtab_resolve_symbols): Reorg to work on symtab. (lto_symtab_merge_decls_2): Likewise. (lto_symtab_merge_decls_1): Likewise; add debug dumps. (lto_symtab_merge_decls): Likewise; do not merge at ltrans stage. (lto_symtab_merge_cgraph_nodes_1): Reorg to work on symtab. (lto_symtab_merge_cgraph_nodes): Likewise; do not merge at ltrans stage. (lto_symtab_prevailing_decl): Rewrite to lookup into symtab. * lto-streaer.h (lto_symtab_free): Remove. * lto-cgraph.c (add_references): Cleanup. * varpool.c (varpool_assemble_decl): Skip hard regs. * lto.c (lto_materialize_function): Update confused comment. (read_cgraph_and_symbols): Do not free symtab. Index: symtab.c === *** symtab.c(revision 191418) --- symtab.c(working copy) *** eq_assembler_name (const void *p1, const *** 104,109 --- 104,111 static void insert_to_assembler_name_hash (symtab_node node) { + if (symtab_variable_p (node) DECL_HARD_REGISTER (node-symbol.decl)) + return; gcc_checking_assert (!node-symbol.previous_sharing_asm_name !node-symbol.next_sharing_asm_name); if (assembler_name_hash) *** unlink_from_assembler_name_hash (symtab_ *** 151,159 --- 153,172 else *slot = node-symbol.next_sharing_asm_name; } + node-symbol.next_sharing_asm_name = NULL; + node-symbol.previous_sharing_asm_name = NULL; } } + /* Arrange node to be first in its entry of assembler_name_hash. */ + + void + symtab_prevail_in_asm_name_hash (symtab_node node) + { + unlink_from_assembler_name_hash (node); + insert_to_assembler_name_hash (node); + } + /* Add node into symbol table. 
This function is not used directly, but via cgraph/varpool node creation routines. */ *** symtab_remove_node (symtab_node node) *** 287,301 varpool_remove_node (varpool (node)); } ! /* Return the cgraph node that has ASMNAME for its DECL_ASSEMBLER_NAME. !Return NULL if there's no such node. */ ! symtab_node ! symtab_node_for_asm (const_tree asmname) { symtab_node node; - void **slot; - if (!assembler_name_hash) {
Re: PATCH RFA: Print backtrace on ICE
On Tue, Sep 18, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: This patch to the diagnostic code uses the new backtrace library to print a backtrace on an ICE. For example, here is the output of a test case I took from a C++ PR: /home/iant/foo2.cc:6:6: internal compiler error: in cp_lexer_new_from_tokens, at cp/parser.c:638 0xec549f internal_error(char const*, ...) ../../trunk/gcc/diagnostic.c:1057 0xec3f53 fancy_abort(char const*, int, char const*) ../../trunk/gcc/diagnostic.c: 0x5ff78e cp_lexer_new_from_tokens ../../trunk/gcc/cp/parser.c:638 0x5ff78e cp_parser_push_lexer_for_tokens ../../trunk/gcc/cp/parser.c:3290 0x60ff40 cp_parser_late_parsing_for_member ../../trunk/gcc/cp/parser.c:21713 0x60ff40 cp_parser_class_specifier_1 ../../trunk/gcc/cp/parser.c:18207 0x60ff40 cp_parser_class_specifier ../../trunk/gcc/cp/parser.c:18231 0x60ff40 cp_parser_type_specifier ../../trunk/gcc/cp/parser.c:13390 0x61c83d cp_parser_decl_specifier_seq ../../trunk/gcc/cp/parser.c:10731 0x628317 cp_parser_single_declaration ../../trunk/gcc/cp/parser.c:21313 0x6289c0 cp_parser_template_declaration_after_export ../../trunk/gcc/cp/parser.c:21198 0x62de39 cp_parser_declaration ../../trunk/gcc/cp/parser.c:10183 0x62c487 cp_parser_declaration_seq_opt ../../trunk/gcc/cp/parser.c:10105 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file() ../../trunk/gcc/cp/parser.c:27557 0x72e4e4 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1138 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. Bootstrapped on x86_64-unknown-linux-gnu. I didn't bother to run the testsuite, since the code only changes when an ICE occurs anyhow. OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? That would be easy enough but I don't think it's a good idea. 
The time when this can help the most is when we get a bug report from somebody who doesn't know how to or doesn't want to share the input file. The backtrace can show us whether this is a known ICE. But that will only work if we actually dump the backtrace for a release compiler. It's not like this is something that happens in an ordinary compilation. I think verbosity is just fine here. I'd say that we should do sth fancy with the backtrace first, like in your example note that it came from an assert (and skip the first two frames), or more simple - skip frames until the function name we printed anyways is listed. That function name is not convenient to access. I changed the backtrace to simply skip any leading stack frames in diagnostic.c, which achieves the same effect. Then for !ENABLE_CHECKING I'd derive bugzilla components (backtrace from the frontend? from which tree/RTL pass?). That's a good idea but I'd rather leave it for later. I mean the above is so verbose that bugreporters likely will only paste the last non-interesting lines like I added an explicit note directing them to include the complete backtrace. also consider ICEs from infinite recursion - you'd get a way too large backtrace (so please consider pruning recursions). In my original patch the backtrace is already cut off after 20 functions. Or at least provide a way to disable the backtrace printing with a configure switch. Again, I don't think this is necessary or appropriate. I could add a command line option to disable the backtrace if you think that is important, but I think it's important that the default be to print it. I've attached the updated patch. Here is what it prints now. 
/home/iant/foo2.cc:6:6: internal compiler error: in cp_lexer_new_from_tokens, at cp/parser.c:638 0x5ff78e cp_lexer_new_from_tokens ../../trunk/gcc/cp/parser.c:638 0x5ff78e cp_parser_push_lexer_for_tokens ../../trunk/gcc/cp/parser.c:3290 0x60ff40 cp_parser_late_parsing_for_member ../../trunk/gcc/cp/parser.c:21713 0x60ff40 cp_parser_class_specifier_1 ../../trunk/gcc/cp/parser.c:18207 0x60ff40 cp_parser_class_specifier ../../trunk/gcc/cp/parser.c:18231 0x60ff40 cp_parser_type_specifier ../../trunk/gcc/cp/parser.c:13390 0x61c83d cp_parser_decl_specifier_seq ../../trunk/gcc/cp/parser.c:10731 0x628317 cp_parser_single_declaration ../../trunk/gcc/cp/parser.c:21313 0x6289c0 cp_parser_template_declaration_after_export ../../trunk/gcc/cp/parser.c:21198 0x62de39 cp_parser_declaration ../../trunk/gcc/cp/parser.c:10183 0x62c487 cp_parser_declaration_seq_opt ../../trunk/gcc/cp/parser.c:10105 0x62c762 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:3757 0x62c762 c_parse_file()
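The two pruning rules mentioned in this message (skip any leading frames that come from diagnostic.c, and cut the listing off after 20 functions) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct sketch_frame`, `ends_with`, and `print_pruned_backtrace` are hypothetical names, and the frames are passed in as a plain array rather than collected by the backtrace library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical frame record used only for this sketch.  */
struct sketch_frame
{
  const char *function;
  const char *filename;
  int lineno;
};

/* Return nonzero if NAME ends with SUFFIX.  */
static int
ends_with (const char *name, const char *suffix)
{
  size_t nlen = strlen (name);
  size_t slen = strlen (suffix);

  return nlen >= slen && strcmp (name + nlen - slen, suffix) == 0;
}

/* Print FRAMES to OUT, skipping leading frames located in diagnostic.c
   and stopping once LIMIT frames have been printed.  Return the number
   of frames printed.  */
int
print_pruned_backtrace (FILE *out, const struct sketch_frame *frames,
                        int count, int limit)
{
  int i = 0;
  int printed = 0;

  assert (frames != NULL || count == 0);

  /* Drop the reporting machinery itself from the top of the stack.  */
  while (i < count && ends_with (frames[i].filename, "diagnostic.c"))
    i++;

  /* Print the remaining frames, up to the cut-off.  */
  for (; i < count && printed < limit; i++, printed++)
    fprintf (out, "%s\n\t%s:%d\n", frames[i].function,
             frames[i].filename, frames[i].lineno);

  return printed;
}
```

Only leading frames are skipped, so a diagnostic.c frame that appears deeper in the stack would still be printed.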
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 2:25 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Sep 18, 2012 at 11:16 AM, Richard Guenther richard.guent...@gmail.com wrote: CentOS 5.8 uses glibc version 2.5 that needs _GNU_SOURCE defined to use strnlen. Hm, shouldn't libiberty contain a xstrnlen? I bet strnlen isn't available everywhere. I didn't find it in the sources. OTOH, mmapio.c already defines _GNU_SOURCE for some reason, so I just follow this approach. mmapio.c uses _GNU_SOURCE so that getpagesize is available on GNU/Linux systems. I'll fix the strnlen issue some other way. Ian
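One possible way to avoid depending on _GNU_SOURCE for strnlen is a small local replacement. The sketch below is only an illustration under that assumption; the name `sketch_strnlen` is hypothetical, and this is not necessarily how the issue was eventually fixed in libbacktrace.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of S, but never look at more than MAXLEN bytes.
   Behaves like strnlen without requiring any feature-test macro.  */
size_t
sketch_strnlen (const char *s, size_t maxlen)
{
  size_t n = 0;

  assert (s != NULL);
  while (n < maxlen && s[n] != '\0')
    n++;
  return n;
}
```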
Re: [PATCH 1-2/12 ] New configure option --enable-espf=(all|ssp|pie|no)
On Tuesday 11 September 2012 01.33.42, Magnus Granberg wrote: On Friday 07 September 2012 18.52.11, you wrote: On Fri, 7 Sep 2012, Magnus Granberg wrote: * Makefile.in Add -fno-stack-protector when needed for espf. Toplevel Makefile.in is a generated file. You need to patch Makefile.def or Makefile.tpl and regenerate Makefile.in. I'm surprised this passes bootstrap, since I wouldn't expect bootstrap to avoid -Wformat-security warnings, and all the previous patch submissions I recall to avoid such warnings have been incorrect (you can't just change error (msg) to error ("%s", msg) when the reason the code is written how it is is that no-argument formats such as %< and %> may appear in msg and need interpreting). I have updated the Makefile and configure patch, and it bootstraps with --enable-werror; I didn't have that enabled last time. I have a new changelog too. Thank you for the help. .. Does anyone else have any comments or hints for the patches? Gentoo Hardened Project Magnus Granberg
[AARCH64][PATCH] Remove hardwired multiarch.
I've just committed this patch to aarch64-branch to remove the multi-arch fudge.

/Marcus

2012-09-18  Marcus Shawcroft  marcus.shawcr...@arm.com

	* config/aarch64/aarch64-linux.h (MULTIARCH_TUPLE): Remove.
	(STANDARD_STARTFILE_PREFIX_1): Likewise.
	(STANDARD_STARTFILE_PREFIX_2): Likewise.

diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 2b5ec7e..faf2e6c 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -21,14 +21,6 @@
 #ifndef GCC_AARCH64_LINUX_H
 #define GCC_AARCH64_LINUX_H
 
-#define MULTIARCH_TUPLE aarch64-linux-gnu
-
-#undef STANDARD_STARTFILE_PREFIX_1
-#define STANDARD_STARTFILE_PREFIX_1 /lib/ MULTIARCH_TUPLE /
-
-#undef STANDARD_STARTFILE_PREFIX_2
-#define STANDARD_STARTFILE_PREFIX_2 /usr/lib/ MULTIARCH_TUPLE /
-
 #define GLIBC_DYNAMIC_LINKER /lib/ld-linux-aarch64.so.1
 
 #define LINUX_TARGET_LINK_SPEC %{h*} \
-- 
1.7.12.rc0.22.gcdd159b
[v3] libstdc++/54612
Hi, tested x86_64-linux (with and without #include opt_random.h at the end of ext/random), committed to mainline. Should be fixed now. Thanks, Paolo. PS: I just noticed that in ext/random, inside namespace __gnu_cxx, we are using, unqualified, size_t and other types. We shouldn't: it's only because of an accident of our implementation that those types are also available in the global namespace. We should qualify with std::, or use equivalent solutions. PS2: I think we should add ext/random to include/precompiled/extc++.h, to speed up the testsuite and early catch trivial issues. // 2012-09-18 Paolo Carlini paolo.carl...@oracle.com PR libstdc++/54612 * include/ext/random.tcc (operator== (const __gnu_cxx::simd_fast_mersenne_twister_engine, const __gnu_cxx::simd_fast_mersenne_twister_engine)): Fix state_size use. * config/cpu/i486/opt/ext/opt_random.h: Guard with __SSE2__. Index: include/ext/random.tcc === --- include/ext/random.tcc (revision 191415) +++ include/ext/random.tcc (working copy) @@ -328,7 +328,12 @@ __msk1, __msk2, __msk3, __msk4, __parity1, __parity2, __parity3, __parity4 __rhs) { - return (std::equal(__lhs._M_stateT, __lhs._M_stateT + state_size, + typedef __gnu_cxx::simd_fast_mersenne_twister_engine_UIntType, + __m, __pos1, __sl1, __sl2, __sr1, __sr2, + __msk1, __msk2, __msk3, __msk4, + __parity1, __parity2, __parity3, __parity4 __engine; + return (std::equal(__lhs._M_stateT, +__lhs._M_stateT + __engine::state_size, __rhs._M_stateT) __lhs._M_pos == __rhs._M_pos); } Index: config/cpu/i486/opt/ext/opt_random.h === --- config/cpu/i486/opt/ext/opt_random.h(revision 191415) +++ config/cpu/i486/opt/ext/opt_random.h(working copy) @@ -32,6 +32,7 @@ #pragma GCC system_header +#ifdef __SSE2__ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) { @@ -130,5 +131,6 @@ _GLIBCXX_END_NAMESPACE_VERSION } // namespace +#endif // __SSE2__ #endif // _EXT_OPT_RANDOM_H
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
Ian Lance Taylor i...@google.com writes: mmapio.c uses _GNU_SOURCE so that getpagesize is available on GNU/Linux systems. That should be available by default (which includes BSD things) unless the namespace has been restricted in some other way. There's of course also the POSIX way of using sysconf(_SC_PAGESIZE). Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
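The POSIX route mentioned above can be sketched as below. The wrapper name `sketch_page_size` is made up for illustration; `sysconf` and `_SC_PAGESIZE` come from `<unistd.h>` and need no feature-test macro on a POSIX system. The assert stands in for real error handling, which a library would presumably want instead.

```c
#include <assert.h>
#include <unistd.h>

/* Query the system page size through the POSIX interface instead of
   getpagesize.  */
long
sketch_page_size (void)
{
  long size = sysconf (_SC_PAGESIZE);

  /* sysconf returns -1 on error; a real caller would handle that
     case rather than assert.  */
  assert (size > 0);
  return size;
}
```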
[AARCH64-4.7][PATCH] Remove hardwired multiarch.
I've just back ported this into ARM/AARCH64-4.7.

/Marcus

2012-09-18  Marcus Shawcroft  marcus.shawcr...@arm.com

	* config/aarch64/aarch64-linux.h (MULTIARCH_TUPLE): Remove.
	(STANDARD_STARTFILE_PREFIX_1): Likewise.
	(STANDARD_STARTFILE_PREFIX_2): Likewise.

diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 2b5ec7e..faf2e6c 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -21,14 +21,6 @@
 #ifndef GCC_AARCH64_LINUX_H
 #define GCC_AARCH64_LINUX_H
 
-#define MULTIARCH_TUPLE aarch64-linux-gnu
-
-#undef STANDARD_STARTFILE_PREFIX_1
-#define STANDARD_STARTFILE_PREFIX_1 /lib/ MULTIARCH_TUPLE /
-
-#undef STANDARD_STARTFILE_PREFIX_2
-#define STANDARD_STARTFILE_PREFIX_2 /usr/lib/ MULTIARCH_TUPLE /
-
 #define GLIBC_DYNAMIC_LINKER /lib/ld-linux-aarch64.so.1
 
 #define LINUX_TARGET_LINK_SPEC %{h*} \
-- 
1.7.12.rc0.22.gcdd159b
Re: PATCH RFA: Print backtrace on ICE
On 2012.09.18 at 06:58 -0700, Ian Lance Taylor wrote: On Tue, Sep 18, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? That would be easy enough but I don't think it's a good idea. The time when this can help the most is when we get a bug report from somebody who doesn't know how to or doesn't want to share the input file. The backtrace can show us whether this is a known ICE. But that will only work if we actually dump the backtrace for a release compiler. It's not like this is something that happens in an ordinary compilation. I think verbosity is just fine here. Or at least provide a way to disable the backtrace printing with a configure switch. Again, I don't think this is necessary or appropriate. I could add a command line option to disable the backtrace if you think that is important, but I think it's important that the default be to print it. If you use make install-strip to install, then libbacktrace will have been built in vain. At least for this case a way to disable libbacktrace should be available. -- Markus
Re: Rewrite lto-symtab to work on symbol table
On Tue, 18 Sep 2012, Jan Hubicka wrote: Hi, this patch reorganizes lto-symtab to work across symtab's symbol table instead of building its own. This simplifies things a bit and with the previous changes it is rather straightforward - i.e. replace all uses of lto_symtab_entry_t by symtab_node. There are a few differences between the symtab as built by lto-symtab and our symbol table. In one direction, the declarations that are not going to be output to final assembly (i.e. are used by debug info and such) are not in the symbol table and consequently they no longer get merged. I think this is fine. Yes, that sounds fine. The other difference is that the symbol table contains some symbols that are not really symbols in the classical definition - such as inline clones or functions held in the table only for purposes of materialization. I added the symtab_real_symbol_p predicate for this. It would make more sense to exclude those from the assembler name hash and drop the checks I added to lto-symtab.c. Yes indeed. I also miss a function to compute said hash to be able to traverse the next/prev asm name alias chain. Also the current routine to do that computes assembler names for things that did not have one ... I plan to work on this incrementally - it is not completely trivial. The symbol can become non-real in several ways and it will need a bit of work to get this consistent. Bootstrapped/regtested x86_64-linux, tested by building Mozilla, Qt and other stuff with LTO. OK? I also have in my local tree: Index: gcc/symtab.c === --- gcc/symtab.c (revision 191415) +++ gcc/symtab.c (working copy) @@ -151,6 +151,8 @@ unlink_from_assembler_name_hash (symtab_ else *slot = node->symbol.next_sharing_asm_name; } + node->symbol.next_sharing_asm_name = NULL; + node->symbol.previous_sharing_asm_name = NULL; } } seems that path is not exercised in the current tree ;) Ok for trunk. Thanks, Richard. Honza * symtab.c (insert_to_assembler_name_hash): Do not insert register vars.
(unlink_from_assembler_name_hash): NULL out pointers of unlinked var. (symtab_prevail_in_asm_name_hash): New. (symtab_initialize_asm_name_hash): Break out from ... (symtab_node_for_asm): ... here. (dump_symtab_base): Dump LTO file data. (verify_symtab_base): Register vars are not in symtab. * cgraph.h (symtab_initialize_asm_name_hash, symtab_prevail_in_asm_name_hash): New functions. (symtab_real_symbol_p): New inline. * lto-symtab.c: Do not include gt-lto-symtab.h. (lto_symtab_entry_def): Remove. (lto_symtab_entry_t): Remove. (lto_symtab_identifiers): Remove. (lto_symtab_free): Remove. (lto_symtab_entry_hash): Remove. (lto_symtab_entry_eq): Remove. (lto_symtab_entry_marked_p): Remove. (lto_symtab_maybe_init_hash_table): Remove. (resolution_guessed_p, set_resolution_guessed): New functions. (lto_symtab_register_decl): Only set resolution info. (lto_symtab_get, lto_symtab_get_resolution): Remove. (lto_symtab_merge): Reorg to work across symtab; do nothing if decls are same. (lto_symtab_resolve_replaceable_p): Reorg to work on symtab. (lto_symtab_resolve_can_prevail_p): Likewise; only real symbols can prevail. (lto_symtab_resolve_symbols): Reorg to work on symtab. (lto_symtab_merge_decls_2): Likewise. (lto_symtab_merge_decls_1): Likewise; add debug dumps. (lto_symtab_merge_decls): Likewise; do not merge at ltrans stage. (lto_symtab_merge_cgraph_nodes_1): Reorg to work on symtab. (lto_symtab_merge_cgraph_nodes): Likewise; do not merge at ltrans stage. (lto_symtab_prevailing_decl): Rewrite to lookup into symtab. * lto-streaer.h (lto_symtab_free): Remove. * lto-cgraph.c (add_references): Cleanup. * varpool.c (varpool_assemble_decl): Skip hard regs. * lto.c (lto_materialize_function): Update confused comment. (read_cgraph_and_symbols): Do not free symtab. 
Index: symtab.c === *** symtab.c (revision 191418) --- symtab.c (working copy) *** eq_assembler_name (const void *p1, const *** 104,109 --- 104,111 static void insert_to_assembler_name_hash (symtab_node node) { + if (symtab_variable_p (node) DECL_HARD_REGISTER (node-symbol.decl)) + return; gcc_checking_assert (!node-symbol.previous_sharing_asm_name !node-symbol.next_sharing_asm_name); if (assembler_name_hash) *** unlink_from_assembler_name_hash (symtab_ *** 151,159 --- 153,172 else *slot = node-symbol.next_sharing_asm_name; } + node-symbol.next_sharing_asm_name = NULL; +
Re: [PATCHv3] rs6000: Add 2 built-ins to read the Time Base Register on PowerPC
Hi Tulio,

Thanks for all the cleanups! Two quite minor things...

+(define_insn rs6000_get_timebase_ppc64
+ [(set (match_operand:DI 0 gpc_reg_operand =r)
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
+ TARGET_POWERPC64
+{
+ if (TARGET_MFCRF)
+return mfspr %0, 268;
+ else
+return mftb %0;
+})
+
+(define_insn rs6000_mftb_mode
+ [(set (match_operand:P 0 gpc_reg_operand =r)
+(unspec_volatile:P [(const_int 0)] UNSPECV_MFTB))]
+
+ {
+ if (TARGET_MFCRF)
+return mfspr %0, 268;
+ else
+return mftb %0;
+ })

These are identical; remove the _ppc64 pattern? (The indenting of the {} is wrong in the mftb pattern).

Segher
Re: PATCH RFA: Print backtrace on ICE
On Tue, Sep 18, 2012 at 4:40 PM, Markus Trippelsdorf mar...@trippelsdorf.de wrote: On 2012.09.18 at 06:58 -0700, Ian Lance Taylor wrote: On Tue, Sep 18, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? That would be easy enough but I don't think it's a good idea. The time when this can help the most is when we get a bug report from somebody who doesn't know how to or doesn't want to share the input file. The backtrace can show us whether this is a known ICE. But that will only work if we actually dump the backtrace for a release compiler. It's not like this is something that happens in an ordinary compilation. I think verbosity is just fine here. Or at least provide a way to disable the backtrace printing with a configure switch. Again, I don't think this is necessary or appropriate. I could add a command line option to disable the backtrace if you think that is important, but I think it's important that the default be to print it. If you use make install-strip to install, then libbacktrace will have been built in vain. At least for this case a way to disable libbacktrace should be available. Indeed - we ship binaries with stripped debug info, usually not installed. libbacktrace will only produce useless garbage then. So I want a way to disable it (at least by default) at configure time. Richard. -- Markus
Re: [PATCH] Fix i386 costs
On 09/18/2012 05:10 AM, Jakub Jelinek wrote: 2012-09-18 Jakub Jelinek ja...@redhat.com PR target/54592 * config/i386/i386.c (ix86_rtx_costs): Limit UNITS_PER_WORD AND/IOR/XOR cost calculation to MODE_INT class modes. * gcc.target/i386/pr54592.c: New test. Ok. r~
Re: PATCH RFA: Print backtrace on ICE
On 2012.09.18 at 16:55 +0200, Richard Guenther wrote: On Tue, Sep 18, 2012 at 4:40 PM, Markus Trippelsdorf mar...@trippelsdorf.de wrote: On 2012.09.18 at 06:58 -0700, Ian Lance Taylor wrote: On Tue, Sep 18, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 17, 2012 at 7:17 PM, Ian Lance Taylor i...@google.com wrote: OK for mainline? Hm. Can you please be that verbose only for ENABLE_CHECKING compilers? That would be easy enough but I don't think it's a good idea. The time when this can help the most is when we get a bug report from somebody who doesn't know how to or doesn't want to share the input file. The backtrace can show us whether this is a known ICE. But that will only work if we actually dump the backtrace for a release compiler. It's not like this is something that happens in an ordinary compilation. I think verbosity is just fine here. Or at least provide a way to disable the backtrace printing with a configure switch. Again, I don't think this is necessary or appropriate. I could add a command line option to disable the backtrace if you think that is important, but I think it's important that the default be to print it. If you use make install-strip to install, then libbacktrace will have been built in vain. At least for this case a way to disable libbacktrace should be available. Indeed - we ship binaries with stripped debug info, usually not installed. libbacktrace will only produce useless garbage then. So I want a way to disable it (at least by default) at configure time. To be fair, libbacktrace doesn't print garbage in this case. It's clever enough to just get out of the way... -- Markus
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Tue, Sep 18, 2012 at 1:48 AM, Sharad Singhai sing...@google.com wrote: In response to the recent comments, I have updated the patch to do the following: - Remove pass handling from -fopt-info - Support additional flags in regular dumps I have massaged the options so that they have the following (hopefully clearer) behavior: gcc ... -fopt-info --- dump all optimization info on stderr gcc ... -fopt-info-missed-optimized=file.txt -- dump info about optimization applied as well as missed opportunities on to file.txt. If no file.txt is provided, then use stderr. I have enhanced regular dump flags, so that values accepted by -fopt-info are also accepted. For example, gcc ... -O2 -ftree-vectorize -fdump-tree-vect-optimized=foo.dump Now foo.dump will include the regular tree-vect dump as well as the output of -fopt-info=optimized. This way developers can get more detailed dumps when needed. I have also changed the meaning of dump option details to include optimization details. Thus -details flag implies -missed-optimized-note in addition to other dumps. The pass level filtering of -fopt-info dumps can be done in a follow up patch. It may even turn out to be unnecessary, because the equivalent effect can be achieved by -ftree-PASS-optimized-missed-note. Richard's suggestion to map high level 'pass' names to internal passes and make it available to -fopt-info filtering for end users as a follow up pass will be useful. thanks, David I have bootstrapped and tested the attached patch on x86_64 and didn't observe any new failures. Okay for trunk? Thanks, Sharad
Re: [PATCH] PowerPC VLE port
On Tue, 11 Sep 2012, David Edelsohn wrote: 2012-09-10 Maciej W. Rozycki ma...@codesourcery.com gcc/ * config/rs6000/rs6000.c (print_operand) 'c': Remove. * config/rs6000/spe.md: Remove a leftover comment. Okay. I have applied this change now, thanks for your review. Maciej
libiberty patch committed: Add strnlen
This patch to libiberty adds support for strnlen if it is not already present. I rebuilt the Makefile dependencies. This revealed that maint-tool wasn't recognizing that files that include dwarf2.h also depend on dwarf2.def, so I fixed maint-tool. I also rebuilt functions.texi, which had not been rebuilt for a while. Bootstrapped on x86_64-unknown-linux-gnu, where, as expected, the function was not compiled. Ian 2012-09-18 Ian Lance Taylor i...@google.com * strnlen.c: New file. * configure.ac: Check for strnlen, add it to AC_LIBOBJ if it's not present. * Makefile.in: Rebuild dependencies. (CFILES): Add strnlen.c. (CONFIGURED_OFILES): Add ./strnlen.$(objext). * configure, config.in, functions.texi: Rebuild. * maint-tool: Accept .def files in the include directory. Index: configure.ac === --- configure.ac (revision 191430) +++ configure.ac (working copy) @@ -322,6 +322,7 @@ funcs=$funcs strchr funcs=$funcs strdup funcs=$funcs strncasecmp funcs=$funcs strndup +funcs=$funcs strnlen funcs=$funcs strrchr funcs=$funcs strstr funcs=$funcs strtod @@ -362,8 +363,8 @@ if test x = y; then random realpath rename rindex \ sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe \ stpcpy stpncpy strcasecmp strchr strdup \ - strerror strncasecmp strndup strrchr strsignal strstr strtod strtol \ - strtoul strverscmp sysconf sysctl sysmp \ + strerror strncasecmp strndup strnlen strrchr strsignal strstr strtod \ + strtol strtoul strverscmp sysconf sysctl sysmp \ table times tmpnam \ vasprintf vfprintf vprintf vsprintf \ wait3 wait4 waitpid) @@ -442,13 +443,14 @@ if test -n ${with_target_subdir}; then AC_LIBOBJ([stpcpy]) AC_LIBOBJ([stpncpy]) AC_LIBOBJ([strndup]) +AC_LIBOBJ([strnlen]) AC_LIBOBJ([strverscmp]) AC_LIBOBJ([vasprintf]) AC_LIBOBJ([waitpid]) for f in $funcs; do case $f in - asprintf | basename | bcmp | bcopy | bzero | clock | ffs | getpagesize | index | insque | mempcpy | mkstemps | random | rindex | sigsetmask | stpcpy | stpncpy | strdup | strndup | strverscmp 
| vasprintf | waitpid) + asprintf | basename | bcmp | bcopy | bzero | clock | ffs | getpagesize | index | insque | mempcpy | mkstemps | random | rindex | sigsetmask | stpcpy | stpncpy | strdup | strndup | strnlen | strverscmp | vasprintf | waitpid) ;; *) n=HAVE_`echo $f | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'` Index: maint-tool === --- maint-tool (revision 191430) +++ maint-tool (working copy) @@ -222,7 +222,7 @@ sub deps { opendir(INC, $incdir); while ($f = readdir INC) { - next unless $f =~ /\.h$/; + next unless $f =~ /\.h$/ || $f =~ /\.def$/; $mine{$f} = \$(INCDIR)/$f; $deps{$f} = join(' ', deps_for($incdir/$f)); } Index: strnlen.c === --- strnlen.c (revision 0) +++ strnlen.c (revision 0) @@ -0,0 +1,28 @@ +/* Portable version of strnlen. + This function is in the public domain. */ + +/* + +@deftypefn Supplemental size_t strnlen (const char *@var{s}, size_t @var{maxlen}) + +Returns the length of @var{s}, as with @code{strlen}, but never looks +past the first @var{maxlen} characters in the string. If there is no +'\0' character in the first @var{maxlen} characters, returns +@var{maxlen}. 
+ +@end deftypefn + +*/ + +#include "config.h" + +size_t +strnlen (const char *s, size_t maxlen) +{ + size_t i; + + for (i = 0; i < maxlen; ++i) +if (s[i] == '\0') + break; + return i; +} Index: Makefile.in === --- Makefile.in (revision 191430) +++ Makefile.in (working copy) @@ -151,7 +151,7 @@ CFILES = alloca.c argv.c asprintf.c atex spaces.c splay-tree.c stack-limit.c stpcpy.c stpncpy.c \ strcasecmp.c strchr.c strdup.c strerror.c strncasecmp.c \ strncmp.c strrchr.c strsignal.c strstr.c strtod.c strtol.c \ - strtoul.c strndup.c strverscmp.c\ + strtoul.c strndup.c strnlen.c strverscmp.c \ timeval-utils.c tmpnam.c \ unlink-if-ordinary.c \ vasprintf.c vfork.c vfprintf.c vprintf.c vsnprintf.c vsprintf.c \ @@ -215,9 +215,9 @@ CONFIGURED_OFILES = ./asprintf.$(objext) ./sigsetmask.$(objext) ./snprintf.$(objext) \ ./stpcpy.$(objext) ./stpncpy.$(objext) ./strcasecmp.$(objext) \ ./strchr.$(objext) ./strdup.$(objext) ./strncasecmp.$(objext) \ - ./strncmp.$(objext) ./strndup.$(objext) ./strrchr.$(objext) \ - ./strstr.$(objext) ./strtod.$(objext) ./strtol.$(objext) \ - ./strtoul.$(objext) ./strverscmp.$(objext) \ + ./strncmp.$(objext) ./strndup.$(objext) ./strnlen.$(objext) \ + ./strrchr.$(objext) ./strstr.$(objext) ./strtod.$(objext) \ + ./strtol.$(objext) ./strtoul.$(objext) ./strverscmp.$(objext) \ ./tmpnam.$(objext) \
libbacktrace patch committed: Declare strnlen if not declared
This patch to libbacktrace declares strnlen if it is not declared in a standard header file. If necessary, the actual definition will come from libiberty. Bootstrapped on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2012-09-18 Ian Lance Taylor i...@google.com * configure.ac: Check whether strnlen is declared. * dwarf.c: Declare strnlen if not declared. Index: dwarf.c === --- dwarf.c (revision 191432) +++ dwarf.c (working copy) @@ -44,6 +44,11 @@ POSSIBILITY OF SUCH DAMAGE. */ #include "backtrace.h" #include "internal.h" +#ifndef HAVE_DECL_STRNLEN +/* The function is defined in libiberty if needed. */ +extern size_t strnlen (const char *, size_t); +#endif + /* A buffer to read DWARF info. */ struct dwarf_buf Index: configure.ac === --- configure.ac (revision 191432) +++ configure.ac (working copy) @@ -199,6 +199,8 @@ if test $ALLOC_FILE = alloc.lo; then fi AC_SUBST(BACKTRACE_USES_MALLOC) +AC_CHECK_DECLS(strnlen) + AC_CACHE_CHECK([whether tests can run], [libbacktrace_cv_sys_native], [AC_RUN_IFELSE([AC_LANG_PROGRAM([], [return 0;])],
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 7:08 AM, Ian Lance Taylor i...@google.com wrote: I'll fix the strnlen issue some other way. I committed a pair of patches, to libiberty and libbacktrace, that should fix the problem. Let me know if it is still present. Sorry for the difficulty. Ian
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, 18 Sep 2012, Uros Bizjak wrote: Index: dwarf.c === --- dwarf.c (revision 191413) +++ dwarf.c (working copy) @@ -30,6 +30,8 @@ IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ +#define _GNU_SOURCE I think AC_USE_SYSTEM_EXTENSIONS is preferred to defining _GNU_SOURCE in individual source files. -- Joseph S. Myers jos...@codesourcery.com
Re: [libbacktrace] Fix bootstrap with gcc 4.4
On Tue, Sep 18, 2012 at 1:32 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The libbacktrace integration broke Solaris 10 and 11 bootstrap when using gcc 4.4 (any version of gcc without __sync_* support actually): The patch is fine and should fix the problem, but GCC 4.4 does have __sync_* support. Might be worth looking into why the test failed. Unfortunately, Solaris 10 (and certainly Solaris 9, too) bootstrap is still broken: /vol/gcc/src/hg/trunk/local/libbacktrace/dwarf.c:652: error: implicit declaration of function 'strnlen' make[1]: *** [dwarf.lo] Error 1 Both completely lack strnlen(). I haven't done anything about this yet. This should be fixed now. Sorry about the problems. Ian
Re: [libbacktrace] Fix bootstrap with gcc 4.4
On Tue, Sep 18, 2012 at 1:55 AM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Sep 18, 2012 at 10:54 AM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Sep 18, 2012 at 10:32 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: The libbacktrace integration broke Solaris 10 and 11 bootstrap when using gcc 4.4 (any version of gcc without __sync_* support actually): Ouch, that's bad. Btw, why do we need to build libbacktrace during stage1? It's easier for users. The library is designed to quietly do nothing if it does not have the required support. This was just a bug in that design. Ian
Re: PATCH RFA: Print backtrace on ICE
On Tue, Sep 18, 2012 at 7:40 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote: If you use make install-strip to install, then libbacktrace will have been built in vain. At least for this case a way to disable libbacktrace should be available. Why bother? The library will fail to find anything useful to print, and nothing will be printed. Ian
Re: PATCH RFA: Print backtrace on ICE
On Tue, Sep 18, 2012 at 7:55 AM, Richard Guenther richard.guent...@gmail.com wrote: Indeed - we ship binaries with stripped debug info, usually not installed. libbacktrace will only produce useless garbage then. So I want a way to disable it (at least by default) at configure time. The library won't print useless garbage. That would be pointless. It simply won't print anything at all. I can add a configure option to not dump backtraces if you really want it. I only care that the default is to print them. Ian
Re: Implement Nakagami distribution as an extension.
On 09/18/2012 10:28 AM, Paolo Carlini wrote: On 09/18/2012 03:23 AM, Ed Smith-Rowland wrote: Here is another tweak for the Nakagami distribution. operator() is a one liner without a local variable. The template friend is fixed and the relevant test has been added. Changed dates. If this version is tested (please remember to say where and how) it's Ok. Thanks! Paolo. PS: lately I normally use -std=c++11 for new testcases: IMHO at this point -std=c++0x should be considered a legacy switch. tweaked dates in comment. Tested x86_64 x86_64 GNU/Linux. Committed. 2012-09-18 Edward Smith-Rowland 3dw...@verizon.net * include/ext/random: Add __gnu_cxx::nakagami_distribution class. * include/ext/random.tcc: Add out-of-line functions for __gnu_cxx::nakagami_distribution. * testsuite/ext/random/nakagami_distribution/operators/equal.cc: New file. * testsuite/ext/random/nakagami_distribution/operators/serialize.cc: New file. * testsuite/ext/random/nakagami_distribution/operators/inequal.cc: New file. * testsuite/ext/random/nakagami_distribution/cons/parms.cc: New file. * testsuite/ext/random/nakagami_distribution/cons/default.cc: New file. * testsuite/ext/random/nakagami_distribution/requirements/typedefs.cc: New file. * testsuite/ext/random/nakagami_distribution/requirements/explicit_instantiation/1.cc: New file. Index: include/ext/random === --- include/ext/random (revision 191411) +++ include/ext/random (working copy) @@ -1139,6 +1139,227 @@ const rice_distribution<_RealType1>& __d2) { return !(__d1 == __d2); } + + /** + * @brief A Nakagami continuous distribution for random numbers. + * + * The formula for the Nakagami probability density function is + * @f[ + * p(x|\mu,\omega) = \frac{2\mu^\mu}{\Gamma(\mu)\omega^\mu} + * x^{2\mu-1}e^{-\mu x / \omega} + * @f] + * where @f$\Gamma(z)@f$ is the gamma function and @f$\mu >= 0.5@f$ + * and @f$\omega > 0@f$.
+ */ + template<typename _RealType = double> +class +nakagami_distribution +{ + static_assert(std::is_floating_point<_RealType>::value, + "template argument not a floating point type"); + +public: + /** The type of the range of the distribution. */ + typedef _RealType result_type; + /** Parameter type. */ + struct param_type + { + typedef nakagami_distribution<result_type> distribution_type; + + param_type(result_type __mu = result_type(1), + result_type __omega = result_type(1)) + : _M_mu(__mu), _M_omega(__omega) + { + _GLIBCXX_DEBUG_ASSERT(_M_mu >= result_type(0.5L)); + _GLIBCXX_DEBUG_ASSERT(_M_omega > result_type(0)); + } + + result_type + mu() const + { return _M_mu; } + + result_type + omega() const + { return _M_omega; } + + friend bool + operator==(const param_type& __p1, const param_type& __p2) + { return __p1._M_mu == __p2._M_mu + && __p1._M_omega == __p2._M_omega; } + + private: + void _M_initialize(); + + result_type _M_mu; + result_type _M_omega; + }; + + /** + * @brief Constructors. + */ + explicit + nakagami_distribution(result_type __mu = result_type(1), + result_type __omega = result_type(1)) + : _M_param(__mu, __omega), + _M_gd(__mu, __omega / __mu) + { } + + explicit + nakagami_distribution(const param_type& __p) + : _M_param(__p), + _M_gd(__p.mu(), __p.omega() / __p.mu()) + { } + + /** + * @brief Resets the distribution state. + */ + void + reset() + { _M_gd.reset(); } + + /** + * @brief Return the parameters of the distribution. + */ + result_type + mu() const + { return _M_param.mu(); } + + result_type + omega() const + { return _M_param.omega(); } + + /** + * @brief Returns the parameter set of the distribution. + */ + param_type + param() const + { return _M_param; } + + /** + * @brief Sets the parameter set of the distribution. + * @param __param The new parameter set of the distribution. + */ + void + param(const param_type& __param) + { _M_param = __param; } + + /** + * @brief Returns the greatest lower bound value of the distribution.
+ */ + result_type + min() const + { return result_type(0); } + + /** + * @brief Returns the least upper bound value of the distribution. + */ + result_type + max() const + { return std::numeric_limits<result_type>::max(); } + + /** + * @brief Generating functions. + */ + template<typename _UniformRandomNumberGenerator> +
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Sep 18, 2012 8:43 AM, Xinliang David Li davi...@google.com wrote: On Tue, Sep 18, 2012 at 1:48 AM, Sharad Singhai sing...@google.com wrote: In response to the recent comments, I have updated the patch to do the following: - Remove pass handling from -fopt-info - Support additional flags in regular dumps I have massaged the options so that they have the following (hopefully clearer) behavior: gcc ... -fopt-info --- dump all optimization info on stderr gcc ... -fopt-info-missed-optimized=file.txt -- dump info about optimization applied as well as missed opportunities on to file.txt. If no file.txt is provided, then use stderr. I have enhanced regular dump flags, so that values accepted by -fopt-info are also accepted. For example, gcc ... -O2 -ftree-vectorize -fdump-tree-vect-optimized=foo.dump Now foo.dump will include the regular tree-vect dump as well as the output of -fopt-info=optimized. This way developers can get more detailed dumps when needed. I have also changed the meaning of dump option details to include optimization details. Thus -details flag implies -missed-optimized-note in addition to other dumps. The pass level filtering of -fopt-info dumps can be done in a follow up patch. It may even turn out to be unnecessary, because the equivalent effect can be achieved by -ftree-PASS-optimized-missed-note. Richard's suggestion to map high level 'pass' names to internal passes and make it available to -fopt-info filtering for end users as a follow up pass will be useful. Yes, certainly. I plan to do that in a follow up patch. Currently only vectorization passes use the new dump infrastructure. But as more passes get converted, it will be nice to have an option for high-level -fopt-info filtering for end users. I presume a group of passes would be covered under a single -fopt-info name, such as loop-optimizations. The exact scheme is yet to be designed/discussed. 
Thanks, Sharad thanks, David I have bootstrapped and tested the attached patch on x86_64 and didn't observe any new failures. Okay for trunk? Thanks, Sharad
Re: [PATCH, AArch64] Implement ctz and clrsb standard patterns
On 18/09/12 14:24, Ian Bolton wrote: New version attached with better formatted test cases. OK for aarch64-branch and aarch64-4.7-branch? Cheers, Ian - 2012-09-18 Ian Bolton ian.bol...@arm.com gcc/ * config/aarch64/aarch64.h: Define CTZ_DEFINED_VALUE_AT_ZERO. * config/aarch64/aarch64.md (clrsb<mode>2): New pattern. * config/aarch64/aarch64.md (rbit<mode>2): New pattern. * config/aarch64/aarch64.md (ctz<mode>2): New pattern. gcc/testsuite/ * gcc.target/aarch64/clrsb.c: New test. * gcc.target/aarch64/clz.c: New test. * gcc.target/aarch64/ctz.c: New test. OK. R.
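For readers unfamiliar with the operations named in the ChangeLog above, here is a minimal portable C sketch of what ctz and clrsb compute. It assumes 32-bit operands, and it assumes ctz can be expressed as a bit reversal followed by a count of leading zeros, which is what the presence of an rbit pattern next to ctz suggests. The patch itself is not shown in this message, so this is an illustration and not the port's implementation. The helper names clz32, rbit32, ctz32 and clrsb32 are made up for the example.

```c
#include <stdint.h>

/* Count leading zero bits; defined here to return 32 for x == 0.  */
static unsigned
clz32 (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 0x80000000u) == 0)
    {
      x <<= 1;
      ++n;
    }
  return n;
}

/* Reverse the order of the 32 bits in x.  */
static uint32_t
rbit32 (uint32_t x)
{
  uint32_t r = 0;
  int i;

  for (i = 0; i < 32; ++i)
    {
      r = (r << 1) | (x & 1u);
      x >>= 1;
    }
  return r;
}

/* Count trailing zeros as "reverse the bits, then count leading zeros".
   With the clz32 above this yields 32 for x == 0.  */
static unsigned
ctz32 (uint32_t x)
{
  return clz32 (rbit32 (x));
}

/* Count leading redundant sign bits: the number of bits below the
   sign bit that are equal to it.  */
static unsigned
clrsb32 (int32_t v)
{
  uint32_t x = (uint32_t) v;
  uint32_t sign = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;

  /* x ^ sign clears every bit that matches the sign bit, so its
     leading-zero count is the sign bit plus the redundant copies.  */
  return clz32 (x ^ sign) - 1;
}
```

Because ctz32 is built on a clz32 that returns 32 for zero, ctz of zero is well defined in this sketch, which is the kind of property a macro like CTZ_DEFINED_VALUE_AT_ZERO describes to the compiler.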
Re: User directed Function Multiversioning via Function Overloading (issue5752064)
Ping. On Fri, Aug 24, 2012 at 5:34 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Jason, I have created a new patch to use target hooks for all the functionality and make the front-end just call the target hooks at the appropriate places. This is more like what you suggested in a previous mail. In particular, target hooks address the following questions: * Determine if two function decls with the same signature are versions. * Determine the new assembler name of a function version. * Generate the dispatcher function for a set of function versions. * Compare versions to see if one has a higher priority over the other. Patch attached and also available for review at: http://codereview.appspot.com/5752064/ Hope this is more along the lines of what you had in mind, please let me know what you think. Thanks, -Sri. On Mon, Jul 30, 2012 at 12:01 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jul 19, 2012 at 1:39 PM, Jason Merrill ja...@redhat.com wrote: On 07/10/2012 03:14 PM, Sriraman Tallam wrote: I am using the questions you asked previously to explain how I solved each of them. When working on this patch, these are the exact questions I had and tried to address it. * Does this attribute affect a function signature? The function signature should be changed when there is more than one definition/declaration of foo distinguished by unique target attributes. [...] I agree. I was trying to suggest that these questions are what the front end needs to care about, not about versioning specifically. If these questions are turned into target hooks, all of the logic specific to versioning can be contained in the target. My only question intended to be answered by humans is, do people think moving the versioning logic behind more generic target hooks is worthwhile? I have some comments related For the example below, // Default version. int foo () { . } // Version XXX feature supported by Target ABC. 
int foo __attribute__ ((target (XXX))) { } How should the second version of foo be treated for targets where feature XXX is not supported? Right now, I am working on having my patch completely ignore such function versions when compiled for targets that do not understand the attribute. I could move this check into a generic target hook so that a function definition that does not make sense for the current target is ignored. Also, currently the patch uses target hooks to do the following: - Find if a particular version can be called directly, rather than go through the dispatcher. - Determine what the dispatcher body should be. - Determining the order in which function versions must be dispatched. I do not have a strong opinion on whether the entire logic should be based on target hooks. Thanks, -Sri. Jason
Re: libiberty patch committed: Add strnlen
On Tue, Sep 18, 2012 at 09:03:03AM -0700, Ian Lance Taylor wrote: --- strnlen.c (revision 0) +++ strnlen.c (revision 0) @@ -0,0 +1,28 @@ +/* Portable version of strnlen. + This function is in the public domain. */ + +/* + +@deftypefn Supplemental size_t strnlen (const char *@var{s}, size_t @var{maxlen}) + +Returns the length of @var{s}, as with @code{strlen}, but never looks +past the first @var{maxlen} characters in the string. If there is no +'\0' character in the first @var{maxlen} characters, returns +@var{maxlen}. + +@end deftypefn + +*/ + +#include "config.h" Shouldn't this #include <stddef.h> for size_t, or is config.h providing size_t? Or #include <sys/types.h> ? From what I can see, config.h doesn't always define size_t, only if sys/types.h doesn't define it. + +size_t +strnlen (const char *s, size_t maxlen) +{ + size_t i; + + for (i = 0; i < maxlen; ++i) +if (s[i] == '\0') + break; + return i; +} Jakub
Re: [PATCH, libbacktrace]: Fix compilation on CentOS 5.8
On Tue, Sep 18, 2012 at 9:08 AM, Joseph S. Myers jos...@codesourcery.com wrote: I think AC_USE_SYSTEM_EXTENSIONS is preferred to defining _GNU_SOURCE in individual source files. Thanks for the pointer. I have committed this patch after bootstrap and libbacktrace test on x86_64-unknown-linux-gnu. Ian 2012-09-18 Ian Lance Taylor i...@google.com * configure.ac: Add AC_USE_SYSTEM_EXTENSIONS. * mmapio.c: Don't define _GNU_SOURCE. * configure, config.h.in: Rebuild.
Re: libiberty patch committed: Add strnlen
On Tue, Sep 18, 2012 at 9:25 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Sep 18, 2012 at 09:03:03AM -0700, Ian Lance Taylor wrote: +*/ + +#include "config.h" Shouldn't this #include <stddef.h> for size_t, or is config.h providing size_t? Or #include <sys/types.h> ? From what I can see, config.h doesn't always define size_t, only if sys/types.h doesn't define it. Yes, the patch as committed does #include <stddef.h>. There are other files in libiberty that unconditionally #include <stddef.h> so I figured that was portable enough. I guess I forgot to update the patch attachment after making that change. Thanks for pointing it out. Ian
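To make the outcome of this exchange concrete, here is a self-contained sketch of the function with the stddef.h include that the committed version is said to carry. The body follows the loop posted in the patch; the name portable_strnlen is hypothetical and is used only so the example cannot collide with a strnlen already provided by the C library, and the config.h include is left out because the sketch is meant to build on its own.

```c
#include <stddef.h>  /* size_t */

/* Return the length of S, but never look past the first MAXLEN
   characters.  If no '\0' is found in that range, return MAXLEN.  */
size_t
portable_strnlen (const char *s, size_t maxlen)
{
  size_t i;

  for (i = 0; i < maxlen; ++i)
    if (s[i] == '\0')
      break;
  return i;
}
```

Including stddef.h directly means the file does not rely on config.h or sys/types.h happening to define size_t, which is the portability concern raised above.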
libbacktrace patch committed: Fix test of HAVE_DECL_STRNLEN
I foolishly assumed that the autoconf macro AC_CHECK_DECLS worked like most autoconf macros, and did not define HAVE_DECL_xx when the declaration is not available. However, it turns out that it actually #defines it to 0. This patch fixes the test of HAVE_DECL_STRNLEN to match that behaviour. Bootstrapped and ran libbacktrace testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2012-09-18 Ian Lance Taylor i...@google.com * dwarf.c: Correct test of HAVE_DECL_STRNLEN. Index: dwarf.c === --- dwarf.c (revision 191433) +++ dwarf.c (working copy) @@ -44,7 +44,7 @@ POSSIBILITY OF SUCH DAMAGE. */ #include "backtrace.h" #include "internal.h" -#ifndef HAVE_DECL_STRNLEN +#if !defined(HAVE_DECL_STRNLEN) || !HAVE_DECL_STRNLEN /* The function is defined in libiberty if needed. */ extern size_t strnlen (const char *, size_t); #endif
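The difference between the two preprocessor tests can be seen in a small standalone sketch. It simulates the configure result by hand: the #define below stands in for what config.h would contain when AC_CHECK_DECLS(strnlen) finds no declaration. The macro names NAIVE_TEST_DECLARES and FIXED_TEST_DECLARES and the two accessor functions are invented for the illustration and do not appear in libbacktrace.

```c
/* Simulated configure result: the declaration was NOT found, so the
   macro is defined, but to 0.  */
#define HAVE_DECL_STRNLEN 0

/* Original test: fires only when the macro is entirely undefined,
   so it stays silent here even though a declaration is needed.  */
#ifndef HAVE_DECL_STRNLEN
#define NAIVE_TEST_DECLARES 1
#else
#define NAIVE_TEST_DECLARES 0
#endif

/* Corrected test: fires when the macro is undefined or defined to 0.  */
#if !defined(HAVE_DECL_STRNLEN) || !HAVE_DECL_STRNLEN
#define FIXED_TEST_DECLARES 1
#else
#define FIXED_TEST_DECLARES 0
#endif

/* Expose the results so they can be checked at run time.  */
int
naive_test_declares (void)
{
  return NAIVE_TEST_DECLARES;
}

int
fixed_test_declares (void)
{
  return FIXED_TEST_DECLARES;
}
```

With the simulated value of 0, only the corrected form would emit the fallback extern declaration, which matches the behaviour the patch is after.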
Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support
On 09/18/2012 09:51 AM, Dodji Seketeli wrote: + VEC_safe_push (scoped_attributes, heap, attributes_table, sa); + result = VEC_last (scoped_attributes, attributes_table); Here you can set result from the return value of VEC_safe_push. + if ((flags ATTR_FLAG_CXX11) + !(flags ATTR_FLAG_TYPE_IN_PLACE + (TREE_CODE (*node) == RECORD_TYPE + || TREE_CODE (*node) == UNION_TYPE))) + { + /* unused is being used as a c++11 attribute. In this mode +we prevent it from applying to types, unless it's for a +class defintion. */ + warning (OPT_Wattributes, + attribute %qE cannot be applied to a non-class type, name); + return NULL_TREE; + } I think this should now be covered by the general ignoring of attributes that appertain to type-specifiers, so we don't need to check it here. + error (requested alignment %d is larger than %d, +requested_alignment, max_align); Let's make this a pedwarn. +cxx_fundamental_alignment_p (unsigned align) +{ + + return (align = MAX (TYPE_ALIGN (long_long_integer_type_node), +TYPE_ALIGN (long_double_type_node))); +} Unneeded blank line. + inform (token-location, + an attribute for a declaration should be either + at the begining of the declaration or after + the declarator-id); Putting this code here means that we give the diagnostic for attributes in the decl-specifier-seq, but not for attributes that apply to type parts of the declarator such as a ptr-operator or a function declarator. I think the best place to implement this in decl_attributes; there you can just ignore any attributes applied to a type without ATTR_FLAG_TYPE_IN_PLACE. + declarator = cp_parser_make_indirect_declarator (code, type, cv_quals, declarator); + if (declarator != NULL declarator != cp_error_declarator) + declarator-attributes = std_attributes; Let's pass std_attributes to make_indirect_declarator. 
declarator = cp_parser_make_indirect_declarator (code, class_type, cv_quals, declarator); + + /* For C++11 attributes, the standard at [decl.ptr]/1 says: + +the optional attribute-specifier-seq appertains to the +pointer and not to the object pointed to. */ + if (std_attributes + declarator + (declarator != cp_error_declarator)) + declarator-std_attributes = std_attributes; Here too. +cp_parser_attributes_opt (cp_parser *parser) +{ + + if (cp_next_tokens_can_be_gnu_attribute_p (parser)) Unneeded newline. + alignas_expr = + cp_parser_assignment_expression (parser, /*cast_p=*/false, +/**cp_id_kind=*/NULL); Let's require_potential_rvalue_constant_expression here or in cxx_alignas_expr. + alignas_expr = fold_non_dependent_expr (alignas_expr); You don't need this both in the parser and in cxx_alignas_expr. +cxx_alignas_expr (tree e, tsubst_flags_t complain) I don't think you need the complain parameter anymore, so you don't need to make fold_non_dependent_expr_sfinae non-static either. + /* [dcl.align]/2 says: + + [the assignment-expression shall be an integral constant + expression]. */ + e = fold_non_dependent_expr_sfinae (e, complain); + if (e == NULL_TREE + || e == error_mark_node + || TREE_CODE (e) != INTEGER_CST) +return error_mark_node; This needs to allow value-dependent expressions. I'd say e = fold_non_dependent_expr (e); if (value_dependent_expression_p (e)) /* Leave value-dependent expression alone for now. */; else e = cxx_constant_value (e); Jason
[PATCH] rs6000: Remove integer abs/nabs/min/max patterns
Without these patterns, the exact same code is generated. This is a leftover from when we still had a doz instruction. Tested on powerpc64-linux --enable-languages=c,c++,fortran; no regressions. Also tested all these patterns manually, -m32 and -m64, -misel and -mno-isel. Okay to apply? 2012-09-18 Segher Boessenkool seg...@kernel.crashing.org * gcc/config/rs6000/rs6000.md (sminsi3, smaxsi3, uminsi3, umaxsi3): Delete. (abssi2, absmode2_isel, nabsmode2_isel, abssi2_nopower, nabs_nopower): Delete. (absdi2, absdi2_internal, nabsdi2): Delete. (smindi3, smaxdi3, umindi3, umaxdi3): Delete. --- gcc/config/rs6000/rs6000.md | 230 --- 1 files changed, 0 insertions(+), 230 deletions(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index f2bc15f..2f3795b 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -1796,154 +1796,6 @@ (define_expand submode3 } }) -(define_expand sminsi3 - [(set (match_dup 3) - (if_then_else:SI (gt:SI (match_operand:SI 1 gpc_reg_operand ) - (match_operand:SI 2 reg_or_short_operand )) -(const_int 0) -(minus:SI (match_dup 2) (match_dup 1 - (set (match_operand:SI 0 gpc_reg_operand ) - (minus:SI (match_dup 2) (match_dup 3)))] - TARGET_ISEL - -{ - operands[2] = force_reg (SImode, operands[2]); - rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); - DONE; -}) - -(define_expand smaxsi3 - [(set (match_dup 3) - (if_then_else:SI (gt:SI (match_operand:SI 1 gpc_reg_operand ) - (match_operand:SI 2 reg_or_short_operand )) -(const_int 0) -(minus:SI (match_dup 2) (match_dup 1 - (set (match_operand:SI 0 gpc_reg_operand ) - (plus:SI (match_dup 3) (match_dup 1)))] - TARGET_ISEL - -{ - operands[2] = force_reg (SImode, operands[2]); - rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); - DONE; -}) - -(define_expand uminsi3 - [(set (match_dup 3) (xor:SI (match_operand:SI 1 gpc_reg_operand ) - (match_dup 5))) - (set (match_dup 4) (xor:SI (match_operand:SI 2 gpc_reg_operand ) - (match_dup 5))) - (set 
(match_dup 3) (if_then_else:SI (gt (match_dup 3) (match_dup 4)) - (const_int 0) - (minus:SI (match_dup 4) (match_dup 3 - (set (match_operand:SI 0 gpc_reg_operand ) - (minus:SI (match_dup 2) (match_dup 3)))] - TARGET_ISEL - -{ - rs6000_emit_minmax (operands[0], UMIN, operands[1], operands[2]); - DONE; -}) - -(define_expand umaxsi3 - [(set (match_dup 3) (xor:SI (match_operand:SI 1 gpc_reg_operand ) - (match_dup 5))) - (set (match_dup 4) (xor:SI (match_operand:SI 2 gpc_reg_operand ) - (match_dup 5))) - (set (match_dup 3) (if_then_else:SI (gt (match_dup 3) (match_dup 4)) - (const_int 0) - (minus:SI (match_dup 4) (match_dup 3 - (set (match_operand:SI 0 gpc_reg_operand ) - (plus:SI (match_dup 3) (match_dup 1)))] - TARGET_ISEL - -{ - rs6000_emit_minmax (operands[0], UMAX, operands[1], operands[2]); - DONE; -}) - -;; We don't need abs with condition code because such comparisons should -;; never be done. -(define_expand abssi2 - [(set (match_operand:SI 0 gpc_reg_operand ) - (abs:SI (match_operand:SI 1 gpc_reg_operand )))] - - -{ - if (TARGET_ISEL) -{ - emit_insn (gen_abssi2_isel (operands[0], operands[1])); - DONE; -} - else -{ - emit_insn (gen_abssi2_nopower (operands[0], operands[1])); - DONE; -} -}) - -(define_insn_and_split absmode2_isel - [(set (match_operand:GPR 0 gpc_reg_operand =r) -(abs:GPR (match_operand:GPR 1 gpc_reg_operand b))) - (clobber (match_scratch:GPR 2 =b)) - (clobber (match_scratch:CC 3 =y))] - TARGET_ISEL - # - reload_completed - [(set (match_dup 2) (neg:GPR (match_dup 1))) - (set (match_dup 3) - (compare:CC (match_dup 1) - (const_int 0))) - (set (match_dup 0) - (if_then_else:GPR (lt (match_dup 3) - (const_int 0)) - (match_dup 2) - (match_dup 1)))] - ) - -(define_insn_and_split nabsmode2_isel - [(set (match_operand:GPR 0 gpc_reg_operand =r) -(neg:GPR (abs:GPR (match_operand:GPR 1 gpc_reg_operand b - (clobber (match_scratch:GPR 2 =b)) - (clobber (match_scratch:CC 3 =y))] - TARGET_ISEL - # - reload_completed - [(set (match_dup 2) (neg:GPR (match_dup 
1))) - (set (match_dup 3) - (compare:CC (match_dup 1) - (const_int 0))) - (set (match_dup 0) - (if_then_else:GPR (lt (match_dup 3) -
[PATCH, ARM] 64-bit shifts in NEON
Hello, a while ago Andrew Stubbs posted a patch to use NEON registers and instructions to perform 64-bit integer shifts: http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01645.html As Andrew no longer works on ARM, I've now picked this up and reworked it a bit: - Updated for current mainline changes. - Fixed a typo in the left shift by 1 special case. - Reworked constraint lists to have the NEON alternatives actually reliably chosen in the left shift by register case. - Noticed that arm_emit_coreregs_64bit_shift actually does *not* need a scratch for shifting by constant in any case, which simplifies the implementation a bit. - Further minor simplifications cleanup. Tested on arm-linux-gnueabi (--with-arch=armv7-a --with-float=softfp --with-fpu=neon --with-mode=thumb) with no regressions. OK for mainline? Bye, Ulrich ChangeLog: 2012-09-17 Andrew Stubbs a...@codesourcery.com Ulrich Weigand ulrich.weig...@linaro.org * config/arm/arm.c (arm_print_operand): Add new 'E' format code. (arm_emit_coreregs_64bit_shift): Fix comment. * config/arm/arm.h (enum reg_class): Add VFP_LO_REGS_EVEN. (REG_CLASS_NAMES, REG_CLASS_CONTENTS, IS_VFP_CLASS): Likewise. * config/arm/arm.md (opt, opt_enabled): New attributes. (enabled): Use opt_enabled. (ashldi3, ashrdi3, lshrdi3): Add TARGET_NEON case. * config/arm/constraints.md (T): New register constraint. * config/arm/iterators.md (rshifts): New code iterator. (shift, shifttype): New code attributes. * config/arm/neon.md (signed_shift_di3_neon, unsigned_shift_di3_neon, ashldi3_neon_noclobber, ashldi3_neon, ashrdi3_neon_imm_noclobber, lshrdi3_neon_imm_noclobber, shiftdi3_neon): New patterns. Index: gcc/config/arm/arm.c === *** gcc/config/arm/arm.c(revision 191400) --- gcc/config/arm/arm.c(working copy) *** arm_print_operand (FILE *stream, rtx x, *** 17280,17285 --- 17280,17303 } return; + /* Print the VFP/Neon double precision register name that overlaps the +given single-precision register. 
*/ + case 'E': + { + int mode = GET_MODE (x); + + if (GET_MODE_SIZE (mode) != 4 + || GET_CODE (x) != REG + || !IS_VFP_REGNUM (REGNO (x))) + { + output_operand_lossage (invalid operand for code '%c', code); + return; + } + + fprintf (stream, d%d, (REGNO (x) - FIRST_VFP_REGNUM) 1); + } + return; + /* These two codes print the low/high doubleword register of a Neon quad register, respectively. For pair-structure types, can also print low/high quadword registers. */ *** arm_autoinc_modes_ok_p (enum machine_mod *** 26293,26300 Input requirements: - It is safe for the input and output to be the same register, but early-clobber rules apply for the shift amount and scratch registers. ! - Shift by register requires both scratch registers. Shift by a constant ! less than 32 in Thumb2 mode requires SCRATCH1 only. In all other cases the scratch registers may be NULL. - Ashiftrt by a register also clobbers the CC register. */ void --- 26311,26317 Input requirements: - It is safe for the input and output to be the same register, but early-clobber rules apply for the shift amount and scratch registers. ! - Shift by register requires both scratch registers. In all other cases the scratch registers may be NULL. - Ashiftrt by a register also clobbers the CC register. 
*/ void Index: gcc/config/arm/arm.h === *** gcc/config/arm/arm.h(revision 191254) --- gcc/config/arm/arm.h(working copy) *** enum reg_class *** 1120,1125 --- 1120,1126 CORE_REGS, VFP_D0_D7_REGS, VFP_LO_REGS, + VFP_LO_REGS_EVEN, VFP_HI_REGS, VFP_REGS, IWMMXT_REGS, *** enum reg_class *** 1146,1151 --- 1147,1153 CORE_REGS,\ VFP_D0_D7_REGS, \ VFP_LO_REGS, \ + VFP_LO_REGS_EVEN, \ VFP_HI_REGS, \ VFP_REGS, \ IWMMXT_REGS, \ *** enum reg_class *** 1169,1174 --- 1171,1177 { 0x7FFF, 0x, 0x, 0x }, /* CORE_REGS */ \ { 0x, 0x, 0x, 0x }, /* VFP_D0_D7_REGS */ \ { 0x, 0x, 0x, 0x }, /* VFP_LO_REGS */ \ + { 0x, 0x, 0x, 0x }, /* VFP_LO_REGS_EVEN */ \ { 0x, 0x, 0x, 0x }, /* VFP_HI_REGS */ \ { 0x, 0x, 0x, 0x }, /* VFP_REGS */ \ { 0x, 0x, 0x, 0x }, /* IWMMXT_REGS */ \ ***
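A note on the new 'E' operand code in the arm.c hunk above: the archive appears to have dropped the operator between `(REGNO (x) - FIRST_VFP_REGNUM)` and `1`. The sketch below assumes it is a right shift by one, which matches the architectural layout where single-precision registers s(2n) and s(2n+1) overlay double-precision register d(n). `DEMO_FIRST_VFP_REGNUM` and `overlapping_d_reg` are hypothetical names for illustration and are not GCC's definitions.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical base register number; an arbitrary value chosen for the
// demo, not the value GCC uses for FIRST_VFP_REGNUM.
const int DEMO_FIRST_VFP_REGNUM = 64;

// Return the name of the double-precision register that overlaps the
// single-precision register with hard register number REGNO.
// Assumption: the lost operator in the archived patch is ">>".
std::string
overlapping_d_reg (int regno)
{
  assert (regno >= DEMO_FIRST_VFP_REGNUM);
  char buf[16];
  // s<2n> and s<2n+1> both map to d<n>, hence the shift by one.
  std::snprintf (buf, sizeof buf, "d%d",
                 (regno - DEMO_FIRST_VFP_REGNUM) >> 1);
  return std::string (buf);
}
```

With these assumptions, the registers numbered 64 and 65 (s0 and s1 in the demo numbering) both map to d0.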
libbacktrace patch committed: Mark test functions as unused
This patch to libbacktrace marks the test functions with the unused
attribute. This avoids producing a compilation warning when building
the test on a system that does not support libbacktrace. Bootstrapped
and ran libbacktrace testsuite on x86_64-unknown-linux-gnu. Committed
to mainline.

Ian

2012-09-18  Ian Lance Taylor  i...@google.com

        * btest.c (test1, test2, test3, test4): Add the unused attribute.

Index: btest.c
===================================================================
--- btest.c (revision 191432)
+++ btest.c (working copy)
@@ -269,7 +269,7 @@ error_callback_three (void *vdata, const
 
 /* Test the backtrace function with non-inlined functions.  */
 
-static int test1 (void) __attribute__ ((noinline));
+static int test1 (void) __attribute__ ((noinline, unused));
 static int f2 (int) __attribute__ ((noinline));
 static int f3 (int, int) __attribute__ ((noinline));
 
@@ -323,7 +323,7 @@ f3 (int f1line, int f2line)
 
 /* Test the backtrace function with inlined functions.  */
 
-static inline int test2 (void) __attribute__ ((always_inline));
+static inline int test2 (void) __attribute__ ((always_inline, unused));
 static inline int f12 (int) __attribute__ ((always_inline));
 static inline int f13 (int, int) __attribute__ ((always_inline));
 
@@ -375,7 +375,7 @@ f13 (int f1line, int f2line)
 
 /* Test the backtrace_simple function with non-inlined functions.  */
 
-static int test3 (void) __attribute__ ((noinline));
+static int test3 (void) __attribute__ ((noinline, unused));
 static int f22 (int) __attribute__ ((noinline));
 static int f23 (int, int) __attribute__ ((noinline));
 
@@ -524,7 +524,7 @@ f23 (int f1line, int f2line)
 
 /* Test the backtrace_simple function with inlined functions.  */
 
-static inline int test4 (void) __attribute__ ((always_inline));
+static inline int test4 (void) __attribute__ ((always_inline, unused));
 static inline int f32 (int) __attribute__ ((always_inline));
 static inline int f33 (int, int) __attribute__ ((always_inline));
 
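To illustrate why the attribute helps, here is a minimal sketch under stated assumptions: `DEMO_BACKTRACE_SUPPORTED`, `demo_test1` and `run_demo_tests` are made-up names, not part of btest.c. When the guarded call is compiled out, the static function has no callers, and GCC would normally report it under -Wunused-function; the `unused` attribute suppresses that warning.

```cpp
// Hypothetical feature switch standing in for "this system supports
// backtraces"; it is not a real libbacktrace macro.
#ifndef DEMO_BACKTRACE_SUPPORTED
#define DEMO_BACKTRACE_SUPPORTED 0
#endif

// 'unused' tells GCC not to warn if this static function is never
// called; 'noinline' mirrors the declaration style used in the patch.
static int demo_test1 (void) __attribute__ ((noinline, unused));

static int
demo_test1 (void)
{
  return 42;
}

// Call the test only when support is available.  In the unsupported
// configuration demo_test1 has no callers at all.
int
run_demo_tests (void)
{
#if DEMO_BACKTRACE_SUPPORTED
  return demo_test1 ();
#else
  return 0;
#endif
}
```

Building this with -Wall and the switch left at 0 should stay quiet, whereas dropping `unused` from the declaration would be expected to produce a "defined but not used" warning.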
PING Re: [PATCH, MIPS] add new peephole for 74k dspr2
On 08/27/2012 10:36 AM, Richard Sandiford wrote:
> Sandra Loosemore san...@codesourcery.com writes:
>> On 08/19/2012 11:22 AM, Richard Sandiford wrote:
>>> Not sure whether a peephole is the right choice here. In practice,
>>> I'd imagine these opportunities would only come from a DImode move
>>> of $0 into a doubleword register, so we could simply emit the
>>> pattern in mips_split_doubleword_move. That would also allow us to
>>> use it for plain HI and LO. It wasn't obvious from the patch why it
>>> was restricted to the DSP extension registers. Please also add a
>>> scan-assembler test.
>>
>> How is this version of the fix?
>
> Just to say that I've not forgotten about this. I'd still like to
> remove the !TARGET_64BIT and ISA_HAS_DSP_MULT tests, because the idea
> isn't specific to either. Also, reviewing the patch made me realise
> that it might be better to keep the move intact and simply use mult
> in the output code. That's my fault for suggesting the wrong thing,
> though, so I was hoping to find time this weekend to try it myself.
> The testsuite stuff ended up taking up all the available time instead.

Richard,

Have you had time to think about this some more? I am not sure I can
guess how you'd like me to fix this patch now without some more
specific review and/or suggestions about where the optimization should
happen and what cases it should be extended to detect in addition to
the dsp accumulator multiplies.

-Sandra
[Patch,wwwdocs,4.7,committed]: Document --with-avrlibc
Added the new avr-gcc configure option --with-avrlibc to the 4.7 release notes. Johann Index: gcc-4.7/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v retrieving revision 1.125 diff -u -p -r1.125 changes.html --- gcc-4.7/changes.html12 Sep 2012 09:06:46 - 1.125 +++ gcc-4.7/changes.html18 Sep 2012 17:22:53 - @@ -717,6 +717,15 @@ int add_values (const __flash int *p, in { return values[i] + *p; }/pre/blockquote/li +liSupport has been added for the AVR-specific configure option + code--with-avrlibc=yes/code in order to arrange for better + integration of a href=http://nongnu.org/avr-libc/;AVR-Libc/a. + This configure option is supported in avr-gcc 4.7.2 and newer and will + only take effect in non-RTEMS configurations. If avr-gcc is configured + for RTEMS, the option will be ignored which is the same as + specifying code--with-avrlibc=no/code. + See a href=http://gcc.gnu.org/PR54461;PR54461/a for more technical + details./li liSupport for AVR-specific a href=http://gcc.gnu.org/onlinedocs/gcc-4.7.1/gcc/AVR-Built%5f002din-Functions.html;built-in functions/a has been added./li liSupport has been added for the signed and unsigned 24-bit scalar
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, Sep 18, 2012 at 4:23 AM, Richard Guenther rguent...@suse.de wrote:
> 2012-09-18  Richard Guenther  rguent...@suse.de
>
>         PR other/53316
>         * common.opt (optimize_debug): New variable.
>         (Og): New optimization level.
>         * doc/invoke.texi (Og): Document.
>         * opts.c (maybe_default_option): Add debug parameter.
>         (maybe_default_options): Likewise.
>         (default_options_optimization): Handle -Og.
>         (common_handle_option): Likewise.
>         * passes.c (gate_all_optimizations): Do not run with -Og.
>         (gate_all_optimizations_g): New gate, run with -Og.
>         (pass_all_optimizations_g): New container pass, run with -Og.
>         (init_optimization_passes): Schedule pass_all_optimizations_g
>         alongside pass_all_optimizations.
>
>         * gcc/testsuite/lib/c-torture.exp: Add -Og -g to default
>         TORTURE_OPTIONS.

This looks good to me. Thanks for working on it.

Ian
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, Sep 18, 2012 at 4:23 AM, Richard Guenther rguent...@suse.de wrote:
> This adds -Og as optimization level targeted at the
> devel-compile-debug cycle (formerly mostly tied to -O0 due to debug
> issues with even -O1).

This needs an entry in gcc-4.8/changes.html, of course.

Ian
Re: [C++ Patch] for c++/54537
2012/9/11 Fabien Chêne fabien.ch...@gmail.com:
> Oops, not sure how I tested that change initially, or I must be
> blind, because it triggers an error in tr1/cmath about pow. I'll see
> what I can do...

Well, as summarized in the code below, the problem seems to be the
redundant overload of std::tr1::pow(double,double). Note that
std::pow(double,double) is not defined, so I guess the right fix would
be to remove the definition of std::tr1::pow(double,double).

extern double pow (double __x, double __y) throw ();

namespace std
{
  using ::pow;

  inline float
  pow(float __x, float __y)
  { return __builtin_powf(__x, __y); }

  inline long double
  pow(long double __x, long double __y)
  { return __builtin_powl(__x, __y); }
}

namespace std
{
namespace tr1
{
  // inline double
  // pow(double __x, double __y)
  // { return std::pow(__x, __y); }

  inline float
  pow(float __x, float __y)
  { return std::pow(__x, __y); }

  inline long double
  pow(long double __x, long double __y)
  { return std::pow(__x, __y); }
}
}

While looking into this problem, I found it bothersome that the
conflicting declaration was not mentioned. Hence, I have modified the
original patch to call diagnose_name_conflict in case of error.

Bootstrapped/Tested x86_64-unknown-linux-gnu.

2012-09-18  Fabien Chêne  fab...@gcc.gnu.org

        PR c++/54537
        * cp-tree.h: Check OVL_USED with OVERLOAD_CHECK.
        * name-lookup.c (do_nonmember_using_decl): Make sure we have an
        OVERLOAD before calling OVL_USED. Call diagnose_name_conflict
        instead of issuing an error without mentioning the conflicting
        declaration.

2012-09-18  Fabien Chêne  fab...@gcc.gnu.org

        PR c++/54537
        * g++.dg/overload/using3.C: New.
        * g++.dg/overload/using2.C: Adjust.
        * g++.dg/lookup/using9.C: Likewise.

2012-09-18  Fabien Chêne  fab...@gcc.gnu.org

        PR c++/54537
        * include/tr1/cmath: Remove pow(double,double) overload.

-- 
Fabien
Re: [C++ Patch] for c++/54537
... And the patch. 2012/9/18 Fabien Chêne fabien.ch...@gmail.com: 2012/9/11 Fabien Chêne fabien.ch...@gmail.com: Oops, not sure how I test that change initially, or I must be blind, because it triggers an error in tr1/cmath about pow. I'll see what I can do... Well, as summarized in the code below, the problem seems to be the redundant overload of std::tr1::pow(double,double). As one can note that std::pow(double,double) is not defined, I guess the right fix would consist in removing the definition of std::tr1::pow(double,double). extern double pow (double __x, double __y) throw (); namespace std { using ::pow; inline float pow(float __x, float __y) { return __builtin_powf(__x, __y); } inline long double pow(long double __x, long double __y) { return __builtin_powl(__x, __y); } } namespace std { namespace tr1 { // inline double // pow(double __x, double __y) // { return std::pow(__x, __y); } inline float pow(float __x, float __y) { return std::pow(__x, __y); } inline long double pow(long double __x, long double __y) { return std::pow(__x, __y); } } } While looking into this problem, I found it bothering not to have the conflicting declaration mentioned. Hence, I have modified the original patch to call diagnose_name_conflict in case of error. Bootstrapped/Tested x86_64-unknown-linux-gnu. 2012-09-18 Fabien Chêne fab...@gcc.gnu.org PR c++/54537 * cp-tree.h: Check OVL_USED with OVERLOAD_CHECK. * name-lookup.c (do_nonmember_using_decl): Make sure we have an OVERLOAD before calling OVL_USED. Call diagnose_name_conflict instead of issuing an error without mentioning the conflicting declaration. 2012-09-18 Fabien Chêne fab...@gcc.gnu.org PR c++/54537 * g++.dg/overload/using3.C: New. * g++.dg/overload/using2.C: Adjust. * g++.dg/lookup/using9.C: Likewise. 2012-09-18 Fabien Chêne fab...@gcc.gnu.org PR c++/54537 * include/tr1/cmath: Remove pow(double,double) overload. -- Fabien -- Fabien 54537_2.patch Description: Binary data
[PATCH, middle-end]: Fix g++.dg/other/vector-compare.C testsuite failure on alpha
Hello!

g++.dg/other/vector-compare.C recently started to fail on
alphaev68-pc-linux-gnu with:

[uros@localhost other]$ ~/gcc-build-alpha/gcc/cc1plus -std=gnu++11 -quiet vector-compare.C
vector-compare.C: In function ‘int main()’:
vector-compare.C:26:5: internal compiler error: in emit_cmp_and_jump_insn_1, at optabs.c:4261
 int main ()
     ^
Please submit a full bug report, with preprocessed source if appropriate.

prepare_cmp_insn in optabs.c expands BLKmode compares using either
cmp{mem,str,strn}_optab or emit_library_call_value into an integer
result register, and then expands a comparison of that result with
zero. However, the code blindly assumes that the target is able to
compare the resulting SImode value, which is not true on alpha.
Because alpha has no SImode compare pattern, the assert shown above
triggers in emit_cmp_and_jump_insn_1.

The patch fixes this oversight by expanding the comparison of the
result through the generic comparison expansion code that conveniently
follows the BLKmode compare expansion.

2012-09-18  Uros Bizjak  ubiz...@gmail.com

        * optabs.c (prepare_cmp_insn): Expand comparison of the result
        of memory block compare through generic comparison expansion code.

Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu
and x86_64-pc-linux-gnu {,-m32}.

OK for mainline and release branches?

Thanks,
Uros.
Index: optabs.c
===================================================================
--- optabs.c (revision 191413)
+++ optabs.c (working copy)
@@ -4087,9 +4087,13 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code comp
       size = convert_to_mode (cmp_mode, size, 1);
       emit_insn (GEN_FCN (cmp_code) (result, x, y, size, opalign));
 
-      *ptest = gen_rtx_fmt_ee (comparison, VOIDmode, result, const0_rtx);
-      *pmode = result_mode;
-      return;
+      x = result;
+      y = const0_rtx;
+      mode = result_mode;
+      methods = OPTAB_LIB_WIDEN;
+      unsignedp = false;
+
+      goto result_compare;
     }
 
   if (methods != OPTAB_LIB && methods != OPTAB_LIB_WIDEN)
@@ -4109,11 +4113,15 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx_code comp
                                         XEXP (y, 0), Pmode,
                                         size, cmp_mode);
 
-      *ptest = gen_rtx_fmt_ee (comparison, VOIDmode, result, const0_rtx);
-      *pmode = result_mode;
-      return;
+      x = result;
+      y = const0_rtx;
+      mode = result_mode;
+      methods = OPTAB_LIB_WIDEN;
+      unsignedp = false;
     }
 
+ result_compare:
+
   /* Don't allow operands to the compare to trap, as that can put the
      compare and branch in different basic blocks.  */
   if (cfun->can_throw_non_call_exceptions)
Re: [PATCH] Add extra location information - PR43486
On Tue, Sep 18, 2012 at 3:58 AM, Arnaud Charlet char...@adacore.com wrote:
> Since this issue is more general, I have split my changes and
> introduced a new tentative switch called -fextra-slocs, which is the
> subject of this email.

Sorry for picking on simple stuff, but the switch name seems
meaningless, and there isn't any documentation.

Conceptually it looks like you are trying to make up for the absence
of a proper AST by building an on-the-side hash table to track
expression locations. The hash table key is the tree structure itself.
The thing is, any call into fold-const may give you an entirely new
tree, and at that point you have lost your extra location information.
And the C/C++ frontends call into fold-const regularly, which is why
we don't have a proper AST in the first place.

So it seems to me that this is going to be kind of frustrating, in
that we will often have the extra location information but sometimes
we won't. And whether we have it or not will change as the frontends
change.

So while a proper AST would be nice, I'm not convinced that this is
the right workaround.

Another approach might be to tie this to the location information,
because the location information does generally survive fold-const.
E.g., perhaps we could grab a bit in the location information to mean
that it is special. And we could keep an on-the-side hash table
mapping special location values to additional location information.

Ian
Re: [PATCH] Add extra location information - PR43486
> Sorry for picking on simple stuff, but the switch name seems

No problem, and thanks for your feedback.

> meaningless, and there isn't any documentation.

Ah. I'm open to suggestions for a better name, or I can come up with a
new one. I'll indeed add documentation as soon as there's some kind of
agreement on the approach.

> Conceptually it looks like you are trying to make up for the absence
> of a proper AST by building an on-the-side hash table to track
> expression locations.

Right, that's the idea.

> The hash table key is the tree structure itself.

Yes.

> The thing is, any call into fold-const may give you an entirely new
> tree,

Exactly.

> and at that point you have lost your extra location information.

Actually no, see the c-family/c-common.c patch, copied here, which
ensures that folding does preserve such information:

        * c-common.c (c_fully_fold_internal): Copy extra locations on
        new node.

--- c-family/c-common.c (revision 190939)
+++ c-family/c-common.c (working copy)
@@ -1440,7 +1440,10 @@ c_fully_fold_internal (tree expr, bool i
       TREE_NO_WARNING (ret) = 1;
     }
   if (ret != expr)
-    protected_set_expr_location (ret, loc);
+    {
+      protected_set_expr_location (ret, loc);
+      duplicate_expr_locations (ret, expr);
+    }
   return ret;
 }
 
> And the C/C++ frontends call into fold-const regularly, which is why
> we don't have a proper AST in the first place.
>
> So it seems to me that this is going to be kind of frustrating, in
> that we will often have the extra location information but sometimes
> we won't.

That's not the case: as per the c-common.c patch, the locations are
preserved across fold. Otherwise, as you said, the whole approach
would be pretty useless.

I should perhaps have mentioned that this patch (and the -fdump-xref
implementation on top of it) has been in production in our (AdaCore)
tree for more than 2 years now, with pretty good results, and
certainly most expression trees do have extra sloc info available in
our experience.

> And whether we have it or not will change as the frontends change.

See above. Does this address your concern?

Arno
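For readers following the design discussion, the side-table idea can be sketched roughly as below. This is only an illustration under stated assumptions: `Node`, `Location`, `set_extra_locations`, `get_extra_locations`, `duplicate_locations` and `keep_locations_across_fold` are hypothetical stand-ins, not GCC's tree, location_t, or the patch's actual duplicate_expr_locations implementation.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for an expression node and a source location.
struct Node { int code; };
typedef unsigned int Location;

// Side table keyed on the node's address, holding extra locations.
static std::unordered_map<const Node *, std::vector<Location> > extra_locs;

// Record extra locations for N, replacing any earlier entry.
void
set_extra_locations (const Node *n, const std::vector<Location> &locs)
{
  extra_locs[n] = locs;
}

// Return the extra locations recorded for N, or a null pointer.
const std::vector<Location> *
get_extra_locations (const Node *n)
{
  std::unordered_map<const Node *, std::vector<Location> >::const_iterator
    it = extra_locs.find (n);
  return it == extra_locs.end () ? nullptr : &it->second;
}

// Copy FROM's entry to TO; loosely modeled on the role that
// duplicate_expr_locations plays in the patch (an assumption).
void
duplicate_locations (const Node *to, const Node *from)
{
  assert (to != nullptr && from != nullptr);
  const std::vector<Location> *src = get_extra_locations (from);
  if (src != nullptr)
    {
      std::vector<Location> copy (*src);  // copy before inserting
      extra_locs[to] = copy;
    }
}

// If folding produced a different node, carry the entry over to it.
Node *
keep_locations_across_fold (Node *expr, Node *folded)
{
  if (folded != expr)
    duplicate_locations (folded, expr);
  return folded;
}
```

The sketch also shows the weakness Ian raises: any path that creates a new node without going through a wrapper like `keep_locations_across_fold` silently loses the entry.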
Re: [C++ Patch] for c++/54537
Hi,

Fabien Chêne fabien.ch...@gmail.com wrote:
> 2012/9/11 Fabien Chêne fabien.ch...@gmail.com:
>> Oops, not sure how I test that change initially, or I must be blind,
>> because it triggers an error in tr1/cmath about pow. I'll see what I
>> can do...
>
> Well, as summarized in the code below, the problem seems to be the
> redundant overload of std::tr1::pow(double,double). As one can note
> that std::pow(double,double) is not defined, I guess the right fix
> would consist in removing the definition of
> std::tr1::pow(double,double).
>
> extern double pow (double __x, double __y) throw ();
>
> namespace std
> {
>   using ::pow;
>
>   inline float
>   pow(float __x, float __y)
>   { return __builtin_powf(__x, __y); }
>
>   inline long double
>   pow(long double __x, long double __y)
>   { return __builtin_powl(__x, __y); }
> }
>
> namespace std
> {
> namespace tr1
> {
>   // inline double
>   // pow(double __x, double __y)
>   // { return std::pow(__x, __y); }
>
>   inline float
>   pow(float __x, float __y)
>   { return std::pow(__x, __y); }
>
>   inline long double
>   pow(long double __x, long double __y)
>   { return std::pow(__x, __y); }
> }
> }

I don't understand: what's wrong - exactly - with the std::tr1::pow
(double, double) overload above? I can't check right now, but do the
EDG and CLANG front ends accept it or not?

Paolo
libbacktrace patch committed: Add some mingw support
This patch to libbacktrace adds some support for mingw. The executable
is opened with O_BINARY. The fcntl function is not called.
Bootstrapped and ran libbacktrace testsuite on
x86_64-unknown-linux-gnu. Committed to mainline.

Ian

2012-09-18  Ian Lance Taylor  i...@google.com

        * posix.c (O_BINARY): Define if not defined.
        (backtrace_open): Pass O_BINARY to open. Only call fcntl if
        HAVE_FCNTL is defined.
        * configure.ac: Test for the fcntl function.
        * configure, config.h.in: Rebuild.

Index: posix.c
===================================================================
--- posix.c (revision 191432)
+++ posix.c (working copy)
@@ -41,6 +41,10 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include "backtrace.h"
 #include "internal.h"
 
+#ifndef O_BINARY
+#define O_BINARY 0
+#endif
+
 #ifndef O_CLOEXEC
 #define O_CLOEXEC 0
 #endif
@@ -57,18 +61,20 @@ backtrace_open (const char *filename, ba
 {
   int descriptor;
 
-  descriptor = open (filename, O_RDONLY | O_CLOEXEC);
+  descriptor = open (filename, O_RDONLY | O_BINARY | O_CLOEXEC);
   if (descriptor < 0)
     {
       error_callback (data, filename, errno);
       return -1;
     }
 
+#ifdef HAVE_FCNTL
   /* Set FD_CLOEXEC just in case the kernel does not support
      O_CLOEXEC. It doesn't matter if this fails for some reason.
      FIXME: At some point it should be safe to only do this if
      O_CLOEXEC == 0.  */
   fcntl (descriptor, F_SETFD, FD_CLOEXEC);
+#endif
 
   return descriptor;
 }
Index: configure.ac
===================================================================
--- configure.ac (revision 191435)
+++ configure.ac (working copy)
@@ -201,6 +201,20 @@ if test "$ALLOC_FILE" = "alloc.lo"; then
 fi
 AC_SUBST(BACKTRACE_USES_MALLOC)
 
+# Check for the fcntl function.
+if test -n "${with_target_subdir}"; then
+   case "${host}" in
+   *-*-mingw*) have_fcntl=no ;;
+   *) have_fcntl=yes ;;
+   esac
+else
+  AC_CHECK_FUNC(fcntl, [have_fcntl=yes], [have_fcntl=no])
+fi
+if test "$have_fcntl" = "yes"; then
+  AC_DEFINE([HAVE_FCNTL], 1,
+            [Define to 1 if you have the fcntl function])
+fi
+
 AC_CHECK_DECLS(strnlen)
 
 AC_CACHE_CHECK([whether tests can run],
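The portability pattern in this patch can be shown in isolation. The sketch below is not the libbacktrace source: `demo_open_readonly` and `DEMO_HAVE_FCNTL` are hypothetical, and the `_WIN32` check merely stands in for the configure test that defines HAVE_FCNTL in the real build. The idea is that a flag missing on some platform is defined as 0 so it can be OR-ed in unconditionally, while a missing function has to be guarded with the preprocessor.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Flags that do not exist on a platform become 0, so OR-ing them into
// the open mode is harmless there.
#ifndef O_BINARY
#define O_BINARY 0
#endif

#ifndef O_CLOEXEC
#define O_CLOEXEC 0
#endif

// Hypothetical stand-in for the configure-provided HAVE_FCNTL macro.
#if !defined(_WIN32)
#define DEMO_HAVE_FCNTL 1
#endif

// Open FILENAME read-only in binary mode; return the descriptor, or
// -1 on failure.
int
demo_open_readonly (const char *filename)
{
  assert (filename != nullptr);

  int descriptor = open (filename, O_RDONLY | O_BINARY | O_CLOEXEC);
  if (descriptor < 0)
    return -1;

#ifdef DEMO_HAVE_FCNTL
  // Set close-on-exec explicitly in case O_CLOEXEC was defined to 0
  // above or is ignored by the kernel.  Failure here is not fatal.
  fcntl (descriptor, F_SETFD, FD_CLOEXEC);
#endif

  return descriptor;
}
```

On a POSIX system, calling `demo_open_readonly ("/dev/null")` should return a valid descriptor that the caller then closes with `close`.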
Re: Use conditional casting with symtab_node
> When the property test is embedded within a larger condition, a
> little restructuring is required to pull out the secondary
> conditions. For example,
>
>   if (symtab_variable_p (node) && varpool (node)->finalized)
>     varpool_analyze_node (varpool (node));
>
> becomes
>
>   if (varpool_node *vnode = node->try_variable ())
>     if (vnode->finalized)
>       varpool_analyze_node (vnode);

Please avoid cascading if's like this, use the existing idiom instead.

-- 
Eric Botcazou
Re: [PATCH] Add extra location information - PR43486
On Tue, Sep 18, 2012 at 10:58 AM, Arnaud Charlet char...@adacore.com wrote:
>> and at that point you have lost your extra location information.
>
> Actually no, see the c-family/c-common.c patch, copied here, which
> ensures that folding does preserve such information:

Thanks. I think I would like some clarity on when the extra location
information is available. For better or for worse the C frontend does
sometimes call directly into fold-const, without going through
c_fully_fold. E.g., I see calls to fold_convert and fold_build2_loc.
What happens then?

Ian
Re: [C++ Patch] for c++/54537
2012/9/18 Paolo Carlini paolo.carl...@oracle.com:
> I don't understand: what's wrong - exactly - with the std::tr1::pow
> (double, double) overload above? Now I can't immediately check, but
> do the EDG and CLANG front ends accept it or not?

They don't. The problem is that it conflicts with ::pow(double,double).

// in GLIBC math.h
extern double pow (double __x, double __y) throw (); // (1)

// in std::tr1::cmath
namespace std { namespace tr1 {
  inline double
  pow(double __x, double __y)
  { return std::pow(__x, __y); }
}}

// in std::tr1::math.h
using std::tr1::pow; // this one conflicts with (1)

The removal of std::tr1::pow is surely wrong. On second thought, I
think we should perhaps do:

namespace std { namespace tr1 { using std::pow; }}

But I see the comment below in std::tr1::math.h...

// DR 550. What should the return type of pow(float,int) be?
// NB: C++0x and TR1 != C++03.
// using std::pow;

What do you think?
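The shape of the `using std::pow` suggestion can be tried out on a toy model. Everything below is hypothetical: `demo_pow`, `lib` and `lib::tr1` only mimic the roles of `::pow`, `std` and `std::tr1`, and the bodies return marker values instead of computing powers. The point of the sketch is that a using-declaration at global scope is accepted when it names the very same function that is already declared there, whereas a separately defined function with the same signature would conflict.

```cpp
// Plays the role of ::pow(double,double) from the C library header.
// Returns a marker value so the chosen overload can be identified.
inline double demo_pow (double, double) { return 1.0; }

namespace lib
{
  // Like "using ::pow;" inside namespace std.
  using ::demo_pow;

  // Extra overload, like std::pow(float,float).
  inline float demo_pow (float, float) { return 2.0f; }

  namespace tr1
  {
    // Re-export the existing overload set instead of defining a new
    // demo_pow(double,double).  A new definition here would be a
    // different function with the same signature as ::demo_pow, and
    // the global using-declaration below would then be ill-formed.
    using lib::demo_pow;
  }
}

// Like "using std::tr1::pow;" in the TR1 math.h wrapper.  This is fine
// because the (double,double) candidate it names is ::demo_pow itself.
using lib::tr1::demo_pow;
```

After this, an unqualified call with double arguments still reaches the global function, and a call with float arguments reaches the `lib` overload.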
Re: PING Re: [PATCH, MIPS] add new peephole for 74k dspr2
Sandra Loosemore san...@codesourcery.com writes: On 08/27/2012 10:36 AM, Richard Sandiford wrote: Sandra Loosemoresan...@codesourcery.com writes: On 08/19/2012 11:22 AM, Richard Sandiford wrote: Not sure whether a peephole is the right choice here. In practice, I'd imagine these opportunities would only come from a DImode move of $0 into a doubleword register, so we could simply emit the pattern in mips_split_doubleword_move. That would also allow us to use it for plain HI and LO. It wasn't obvious from the patch why it was restricted to the DSP extension registers. Please also add a scan-assembler test. How is this version of the fix? Just to say that I've not forgotten about this. I'd still like to remove the !TARGET_64BIT and ISA_HAS_DSP_MULT tests, because the idea isn't specific to either. Also, reviewing the patch made me realise that it might be better to keep the move intact and simply use mult in the output code. That's my fault for suggesting the wrong thing, though, so I was hoping to find time this weekend to try it myself. The testsuite stuff ended up taking up all the available time instead. Richard, Have you had time to think about this some more? I am not sure I can guess how you'd like me to fix this patch now without some more specific review and/or suggestions about where the optimization should happen and what cases it should be extended to detect in addition to the dsp accumulator multiplies. The patch below is the one I've been testing. But I got sidetracked by looking into the possibility of removing the MD0_REG and MD1_REG classes, in order to get more sensible costs. I think that was needed for the madd-9.c test to pass. 
Richard Index: gcc/config/mips/mips-protos.h === --- gcc/config/mips/mips-protos.h 2012-09-03 07:49:57.319188985 +0100 +++ gcc/config/mips/mips-protos.h 2012-09-04 20:15:10.240130458 +0100 @@ -212,8 +212,8 @@ extern int m16_simm8_8 (rtx, enum machin extern int m16_nsimm8_8 (rtx, enum machine_mode); extern rtx mips_subword (rtx, bool); -extern bool mips_split_64bit_move_p (rtx, rtx); -extern void mips_split_doubleword_move (rtx, rtx); +extern bool mips_split_move_p (rtx, rtx); +extern void mips_split_move (rtx, rtx); extern const char *mips_output_move (rtx, rtx); extern bool mips_cfun_has_cprestore_slot_p (void); extern bool mips_cprestore_address_p (rtx, bool); Index: gcc/config/mips/mips.c === --- gcc/config/mips/mips.c 2012-09-04 20:15:08.191130518 +0100 +++ gcc/config/mips/mips.c 2012-09-04 20:15:17.173130256 +0100 @@ -2395,11 +2395,11 @@ mips_load_store_insns (rtx mem, rtx insn mode = GET_MODE (mem); /* Try to prove that INSN does not need to be split. */ - might_split_p = true; - if (GET_MODE_BITSIZE (mode) == 64) + might_split_p = GET_MODE_SIZE (mode) UNITS_PER_WORD; + if (might_split_p) { set = single_set (insn); - if (set !mips_split_64bit_move_p (SET_DEST (set), SET_SRC (set))) + if (set !mips_split_move_p (SET_DEST (set), SET_SRC (set))) might_split_p = false; } @@ -4105,39 +4105,55 @@ mips_subword (rtx op, bool high_p) return simplify_gen_subreg (word_mode, op, mode, byte); } -/* Return true if a 64-bit move from SRC to DEST should be split into two. */ +/* Return true if SRC can be moved into DEST using MULT $0, $0. */ + +static bool +mips_mult_move_p (rtx dest, rtx src) +{ + return (src == const0_rtx + REG_P (dest) + GET_MODE_SIZE (GET_MODE (dest)) == 2 * UNITS_PER_WORD + (ISA_HAS_DSP_MULT + ? ACC_REG_P (REGNO (dest)) + : MD_REG_P (REGNO (dest; +} + +/* Return true if a move from SRC to DEST should be split into two. 
*/ bool -mips_split_64bit_move_p (rtx dest, rtx src) +mips_split_move_p (rtx dest, rtx src) { - if (TARGET_64BIT) + /* Check whether the move can be done using some variant of MULT $0,$0. */ + if (mips_mult_move_p (dest, src)) return false; /* FPR-to-FPR moves can be done in a single instruction, if they're allowed at all. */ - if (FP_REG_RTX_P (src) FP_REG_RTX_P (dest)) + unsigned int size = GET_MODE_SIZE (GET_MODE (dest)); + if (size == 8 FP_REG_RTX_P (src) FP_REG_RTX_P (dest)) return false; /* Check for floating-point loads and stores. */ - if (ISA_HAS_LDC1_SDC1) + if (size == 8 ISA_HAS_LDC1_SDC1) { if (FP_REG_RTX_P (dest) MEM_P (src)) return false; if (FP_REG_RTX_P (src) MEM_P (dest)) return false; } - return true; + + /* Otherwise split all multiword moves. */ + return size UNITS_PER_WORD; } -/* Split a doubleword move from SRC to DEST. On 32-bit targets, - this function handles 64-bit moves for which mips_split_64bit_move_p - holds. For 64-bit targets, this function handles 128-bit moves. */ +/* Split a move from
Re: [C++ Patch] for c++/54537
Hi,

Fabien Chêne fabien.ch...@gmail.com wrote:
> 2012/9/18 Paolo Carlini paolo.carl...@oracle.com:
>> I don't understand: what's wrong - exactly - with the std::tr1::pow
>> (double, double) overload above? Now I can't immediately check, but
>> do the EDG and CLANG front ends accept it or not?
>
> They don't. The problem is that it conflicts with
> ::pow(double,double).
>
> // in GLIBC math.h
> extern double pow (double __x, double __y) throw (); // (1)
>
> // in std::tr1::cmath
> namespace std { namespace tr1 {
>   inline double
>   pow(double __x, double __y)
>   { return std::pow(__x, __y); }
> }}
>
> // in std::tr1::math.h
> using std::tr1::pow; // this one conflicts with (1)

Thus, *this* is the problem. You have to add a using-declaration in
the global namespace to your previous snippet to see it. But I'm not
surprised, frankly. I think the conflict is expected, *IF* (please
check) TR1 says that those three overloads, for float, double and long
double, must be declared in std::tr1 (likewise for all the other math
functions).

Now, given that our implementation has the C math.h injecting stuff in
the global namespace - and that is legal - I would say that there is
nothing to fix, maybe just a library testcase to tweak.

As a matter of QoI the idea of having using std::pow in tr1 seems
good, if this is what you are suggesting. At this stage, however, tr1
is really in *deep* regressions-only mode, and this is a pretty large
change, isn't it? This impacts all the math functions, not just pow,
right?

Paolo
Re: Use conditional casting with symtab_node
On 9/18/12, Eric Botcazou ebotca...@adacore.com wrote:
>> When the property test is embedded within a larger condition, a
>> little restructuring is required to pull out the secondary
>> conditions. For example,
>>
>>   if (symtab_variable_p (node) && varpool (node)->finalized)
>>     varpool_analyze_node (varpool (node));
>>
>> becomes
>>
>>   if (varpool_node *vnode = node->try_variable ())
>>     if (vnode->finalized)
>>       varpool_analyze_node (vnode);
>
> Please avoid cascading if's like this, use the existing idiom instead.

The language syntax would bind the conditional into the initializer,
as in

  if (varpool_node *vnode = (node->try_variable () && vnode->finalized))
    varpool_analyze_node (vnode);

which does not type-match. So, if you want the type safety and
performance, the cascade is really unavoidable.

-- 
Lawrence Crowl
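The two shapes under discussion can be compared on a self-contained toy. `SymNode`, `VarNode`, `analyze_cascaded` and `analyze_hoisted` are hypothetical and only loosely modeled on symtab_node and varpool_node; they are not the GCC classes. The first function uses the declaration-in-condition cascade from the proposal. The second hoists the declaration so a single condition suffices, at the cost of the pointer staying in scope after the test. Whether that is the "existing idiom" Eric has in mind is not stated in the thread.

```cpp
#include <cassert>

struct VarNode;

// Toy base class with a checked downcast, loosely modeled on the
// try_variable () accessor proposed in the thread.
struct SymNode
{
  enum Kind { FUNCTION, VARIABLE };
  Kind kind;
  explicit SymNode (Kind k) : kind (k) {}
  VarNode *try_variable ();
};

struct VarNode : SymNode
{
  bool finalized;
  bool analyzed;
  explicit VarNode (bool f)
    : SymNode (VARIABLE), finalized (f), analyzed (false) {}
};

// Return the node as a VarNode if it is one, else a null pointer.
inline VarNode *
SymNode::try_variable ()
{
  return kind == VARIABLE ? static_cast<VarNode *> (this) : nullptr;
}

// Cascaded form: the pointer is scoped to the outer if statement.
bool
analyze_cascaded (SymNode *node)
{
  assert (node != nullptr);
  if (VarNode *v = node->try_variable ())
    if (v->finalized)
      {
        v->analyzed = true;
        return true;
      }
  return false;
}

// Hoisted form: one condition, but 'v' remains visible afterwards.
bool
analyze_hoisted (SymNode *node)
{
  assert (node != nullptr);
  VarNode *v = node->try_variable ();
  if (v && v->finalized)
    {
      v->analyzed = true;
      return true;
    }
  return false;
}
```

Both functions behave the same on every input; the difference is only in how far the downcast pointer's scope extends.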
Re: [PATCH] OpenBSD/hppa support
On Thu, 06 Sep 2012, Mark Kettenis wrote:
> Most bits are stolen from Linux, but there are a few subtle differences
> since our assembler is configured to be slightly more HP-UX-ish.
>
> libgcc/:
>
> 2012-09-06  Mark Kettenis  kette...@openbsd.org
>
> 	* config.host (hppa-*-openbsd*): New target.
> 	* config/pa/t-openbsd: New file.
>
> gcc/:
>
> 2012-09-06  Mark Kettenis  kette...@openbsd.org
>
> 	* config.gcc (hppa*-*-openbsd*): New target.
> 	* config/pa/pa-openbsd.h: New file.
> 	* config/pa/pa32-openbsd.h: New file.
> 	* config/host-openbsd.c (TRY_EXCEPT_VM_SPACE): Define for
> 	OpenBSD/hppa.

OK. Please add 2012 to files with copyrights.

Thanks,
Dave
-- 
J. David Anglin dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
[patch, fortran] Fix an issue found by Coverity scan
Hello world,

pretty self-explanatory. No test case because there is no change in behavior.

OK for trunk?

	Thomas

2012-09-18  Thomas König  tkoe...@gcc.gnu.org

	PR fortran/54599
	* dependency.c (gfc_dep_compare_expr): Clarify logic, remove dead
	code.

Index: dependency.c
===================================================================
--- dependency.c	(Revision 191342)
+++ dependency.c	(Arbeitskopie)
@@ -395,30 +395,21 @@ gfc_dep_compare_expr (gfc_expr *e1, gfc_expr *e2)
       l = gfc_dep_compare_expr (e1->value.op.op1, e2->value.op.op1);
       r = gfc_dep_compare_expr (e1->value.op.op2, e2->value.op.op2);
 
-      if (l <= -2)
+      if (l != 0)
         return l;
 
-      if (l == 0)
-        {
-          /* Watch out for 'A ' // x vs. 'A' // x.  */
-          gfc_expr *e1_left = e1->value.op.op1;
-          gfc_expr *e2_left = e2->value.op.op1;
+      /* Left expressions of // compare equal, but
+         watch out for 'A ' // x vs. 'A' // x.  */
+      gfc_expr *e1_left = e1->value.op.op1;
+      gfc_expr *e2_left = e2->value.op.op1;
 
-          if (e1_left->expr_type == EXPR_CONSTANT
-              && e2_left->expr_type == EXPR_CONSTANT
-              && e1_left->value.character.length
-                 != e2_left->value.character.length)
-            return -2;
-          else
-            return r;
-        }
+      if (e1_left->expr_type == EXPR_CONSTANT
+          && e2_left->expr_type == EXPR_CONSTANT
+          && e1_left->value.character.length
+             != e2_left->value.character.length)
+        return -2;
       else
-        {
-          if (l != 0)
-            return l;
-          else
-            return r;
-        }
+        return r;
     }
 
   /* Compare X vs. X-C, for INTEGER only.  */
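The "no change in behavior" claim can be checked on a reduced model of the two control-flow shapes. This is a sketch under assumptions: `concat_compare_old`, `concat_compare_new` and `old_and_new_agree` are hypothetical helpers, the constant-length test is collapsed into one boolean, the removed comparison is assumed to have been `l <= -2`, and the result codes are assumed to lie in -3..1.

```cpp
// Shape of the code before the patch.  l and r stand for the results of
// comparing the left and right operands of the two concatenations;
// lengths_differ stands for "both left operands are constants of
// different length".
inline int concat_compare_old(int l, int r, bool lengths_differ)
{
  if (l <= -2)
    return l;

  if (l == 0)
    {
      if (lengths_differ)
        return -2;
      else
        return r;
    }
  else
    {
      if (l != 0)
        return l;
      else
        return r;   // unreachable: l != 0 always holds in this branch
    }
}

// Shape of the code after the patch.
inline int concat_compare_new(int l, int r, bool lengths_differ)
{
  if (l != 0)
    return l;

  if (lengths_differ)
    return -2;
  else
    return r;
}

// Exhaustively compare both shapes over the assumed range of result codes.
inline bool old_and_new_agree()
{
  for (int l = -3; l <= 1; ++l)
    for (int r = -3; r <= 1; ++r)
      for (int d = 0; d <= 1; ++d)
        if (concat_compare_old(l, r, d != 0)
            != concat_compare_new(l, r, d != 0))
          return false;
  return true;
}
```

Under these assumptions both functions return the same value for every input, which is consistent with the patch being a pure cleanup.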
[SH] Use more braced strings in MD
Hello,

Like the topic says. No functional change, just cosmetics.

Tested on rev 191342 with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.

OK to install?

Cheers,
Oleg

ChangeLog:

	* config/sh/sh.md (prologue, epilogue): Use braced strings.

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md	(revision 191342)
+++ gcc/config/sh/sh.md	(working copy)
@@ -10303,12 +10303,17 @@
 (define_expand "prologue"
   [(const_int 0)]
   ""
-  "sh_expand_prologue (); DONE;")
+{
+  sh_expand_prologue ();
+  DONE;
+})
 
 (define_expand "epilogue"
   [(return)]
   ""
-  "sh_expand_epilogue (false);")
+{
+  sh_expand_epilogue (false);
+})
 
 (define_expand "eh_return"
   [(use (match_operand 0 "register_operand" ""))]
[SH] PR 54236 - Add another addc case
Hello,

There is another opportunity where SH's addc insn can be used.

Tested on rev 191342 with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.

OK to install?

Cheers,
Oleg

gcc/ChangeLog:

	PR target/54236
	* config/sh/sh.md (*addc): Add pattern to handle one bit left shifts.

testsuite/ChangeLog:

	PR target/54236
	* gcc.target/sh/pr54236-1.c (test_08): Add one bit left shift case.

Index: gcc/testsuite/gcc.target/sh/pr54236-1.c
===================================================================
--- gcc/testsuite/gcc.target/sh/pr54236-1.c	(revision 191342)
+++ gcc/testsuite/gcc.target/sh/pr54236-1.c	(working copy)
@@ -4,9 +4,9 @@
 /* { dg-do compile { target "sh*-*-*" } } */
 /* { dg-options "-O1" } */
 /* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } } */
-/* { dg-final { scan-assembler-times "addc" 3 } } */
+/* { dg-final { scan-assembler-times "addc" 4 } } */
 /* { dg-final { scan-assembler-times "subc" 3 } } */
-/* { dg-final { scan-assembler-times "sett" 4 } } */
+/* { dg-final { scan-assembler-times "sett" 5 } } */
 /* { dg-final { scan-assembler-times "negc" 1 } } */
 /* { dg-final { scan-assembler-not "movt" } } */
 
@@ -74,3 +74,10 @@
 
   return vi;
 }
+
+int
+test_08 (int a)
+{
+  /* 1x addc, 1x sett  */
+  return (a << 1) + 1;
+}
Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md	(revision 191342)
+++ gcc/config/sh/sh.md	(working copy)
@@ -1787,6 +1787,22 @@
                           (reg:SI T_REG)))
               (clobber (reg:SI T_REG))])])
 
+;; Left shifts by one are usually done with an add insn to avoid T_REG
+;; clobbers.  Thus addc can also be used to do something like '(x << 1) + 1'.
+(define_insn_and_split "*addc"
+  [(set (match_operand:SI 0 "arith_reg_dest")
+        (plus:SI (mult:SI (match_operand:SI 1 "arith_reg_operand")
+                          (const_int 2))
+                 (const_int 1)))
+   (clobber (reg:SI T_REG))]
+  "TARGET_SH1"
+  "#"
+  "&& 1"
+  [(set (reg:SI T_REG) (const_int 1))
+   (parallel [(set (match_dup 0) (plus:SI (plus:SI (match_dup 1) (match_dup 1))
+                                          (reg:SI T_REG)))
+              (clobber (reg:SI T_REG))])])
+
 ;; Sometimes combine will try to do 'reg + (0-reg) + 1' if the *addc pattern
 ;; matched.  Split this up into a simple sub add sequence, as this will save
 ;; us one sett insn.
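The arithmetic behind the new pattern can be modelled in a few lines. This is only a sketch: `addc_model` and `shift_left_one_plus_one` are hypothetical helper names, and the model assumes the usual addc semantics (Rn + Rm + T -> Rn, carry -> T) with unsigned 32-bit registers.

```cpp
#include <cstdint>

// Model of SH "addc Rm,Rn": Rn + Rm + T -> Rn, carry out -> T.
inline std::uint32_t addc_model(std::uint32_t rn, std::uint32_t rm,
                                unsigned &t)
{
  const std::uint64_t sum = std::uint64_t(rn) + rm + t;
  t = static_cast<unsigned>(sum >> 32) & 1u;   // carry out of bit 31
  return static_cast<std::uint32_t>(sum);      // low 32 bits
}

// '(x << 1) + 1' expressed as the split sequence from the pattern:
// set T to 1 (sett), then add x to itself with carry-in (addc).
inline std::uint32_t shift_left_one_plus_one(std::uint32_t x)
{
  unsigned t = 1;                 // sett
  return addc_model(x, x, t);     // addc x,x
}
```

Since x + x equals x << 1 modulo 2^32, adding the carry-in of 1 yields (x << 1) + 1, which matches the "1x addc, 1x sett" expectation noted in test_08.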
[SH] PR 54089 - Add another rotcr case
Hello,

There is another opportunity where SH's rotcr insn can be used.

Tested on rev 191342 with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.

OK to install?

Cheers,
Oleg

gcc/ChangeLog:

	PR target/54089
	* config/sh/predicates.md (arith_reg_or_t_reg_operand): New predicate.
	* config/sh/sh.md (*rotcr): Use arith_reg_or_t_reg_operand predicate.
	Handle the case where one of the operands is T_REG.
	Add new pattern to handle MSB extraction.

testsuite/ChangeLog:

	PR target/54089
	* gcc.target/sh/pr54089-1.c (test_11, test_12, test_13, test_14): New
	functions.

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md	(revision 191342)
+++ gcc/config/sh/sh.md	(working copy)
@@ -3924,7 +3924,7 @@
   [(set (match_operand:SI 0 "arith_reg_dest")
         (ior:SI (lshiftrt:SI (match_operand:SI 1 "arith_reg_operand")
                              (match_operand:SI 2 "const_int_operand"))
-                (ashift:SI (match_operand:SI 3 "t_reg_operand")
+                (ashift:SI (match_operand:SI 3 "arith_reg_or_t_reg_operand")
                            (const_int 31))))
    (clobber (reg:SI T_REG))]
   "TARGET_SH1"
@@ -3976,6 +3976,17 @@
       emit_insn (gen_cmpgtsi_t (tmp_t_reg, const0_rtx));
     }
 
+  /* For the rotcr insn to work, operands[3] must be in T_REG.
+     If it is not we can get it there by shifting it right one bit.
+     In this case T_REG is not an input for this insn, thus we don't have to
+     pay attention as of where to insert the shlr insn.  */
+  if (! t_reg_operand (operands[3], SImode))
+    {
+      /* We don't care about the shifted result here, only the T_REG.  */
+      emit_insn (gen_shlr (gen_reg_rtx (SImode), operands[3]));
+      operands[3] = get_t_reg_rtx ();
+    }
+
   emit_insn (gen_rotcr (operands[0], operands[1], operands[3]));
   DONE;
 })
@@ -3995,6 +4006,24 @@
               (set (reg:SI T_REG)
                    (and:SI (match_dup 0) (const_int 1)))])])
 
+(define_insn_and_split "*rotcr"
+  [(set (match_operand:SI 0 "arith_reg_dest")
+        (ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand")
+                        (const_int -2147483648)) ;; 0x80000000
+                (lshiftrt:SI (match_operand:SI 2 "arith_reg_operand")
+                             (const_int 1))))
+   (clobber (reg:SI T_REG))]
+  "TARGET_SH1"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+{
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_insn (gen_shll (tmp, operands[1]));
+  emit_insn (gen_rotcr (operands[0], operands[2], get_t_reg_rtx ()));
+  DONE;
+})
+
 ;; . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 ;; SImode shift left
 
Index: gcc/config/sh/predicates.md
===================================================================
--- gcc/config/sh/predicates.md	(revision 191342)
+++ gcc/config/sh/predicates.md	(working copy)
@@ -1028,3 +1028,8 @@
       return 0;
     }
 })
+
+;; Returns true of OP is arith_reg_operand or t_reg_operand.
+(define_predicate "arith_reg_or_t_reg_operand"
+  (ior (match_operand 0 "arith_reg_operand")
+       (match_operand 0 "t_reg_operand")))
Index: gcc/testsuite/gcc.target/sh/pr54089-1.c
===================================================================
--- gcc/testsuite/gcc.target/sh/pr54089-1.c	(revision 191342)
+++ gcc/testsuite/gcc.target/sh/pr54089-1.c	(working copy)
@@ -2,7 +2,8 @@
 /* { dg-do compile { target "sh*-*-*" } } */
 /* { dg-options "-O1" } */
 /* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } } */
-/* { dg-final { scan-assembler-times "rotcr" 11 } } */
+/* { dg-final { scan-assembler-times "rotcr" 15 } } */
+/* { dg-final { scan-assembler-times "shll\t" 1 } } */
 
 typedef char bool;
 
@@ -81,3 +82,30 @@
   bool r = a == b;
   return r << 31;
 }
+
+unsigned int
+test_11 (unsigned int a, int b)
+{
+  /* 1x shlr, 1x rotcr  */
+  return (a >> 1) | (b << 31);
+}
+
+unsigned int
+test_12 (unsigned int a, int b)
+{
+  return (a >> 2) | (b << 31);
+}
+
+unsigned int
+test_13 (unsigned int a, int b)
+{
+  return (a >> 3) | (b << 31);
+}
+
+unsigned int
+test_14 (unsigned int a, int b)
+{
+  /* 1x shll, 1x rotcr  */
+  bool r = b < 0;
+  return ((a >> 1) | (r << 31));
+}
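The two new rotcr cases can be illustrated with a small register-level model. This is a sketch, not compiler output: the `*_model` helpers are hypothetical names, and they assume the usual semantics of shlr (LSB -> T), shll (MSB -> T) and rotcr (T -> MSB, LSB -> T) on unsigned 32-bit registers.

```cpp
#include <cstdint>

// shlr Rn: logical shift right by one, old LSB -> T.
inline std::uint32_t shlr_model(std::uint32_t rn, unsigned &t)
{
  t = rn & 1u;
  return rn >> 1;
}

// shll Rn: logical shift left by one, old MSB -> T.
inline std::uint32_t shll_model(std::uint32_t rn, unsigned &t)
{
  t = rn >> 31;
  return rn << 1;
}

// rotcr Rn: rotate right through T (old T -> MSB, old LSB -> T).
inline std::uint32_t rotcr_model(std::uint32_t rn, unsigned &t)
{
  const unsigned old_t = t;
  t = rn & 1u;
  return (rn >> 1) | (std::uint32_t(old_t) << 31);
}

// Shape of test_11: (a >> 1) | (b << 31).  Only bit 0 of b matters, so a
// shlr on b moves that bit into T and the rotcr inserts it as the MSB.
inline std::uint32_t test_11_model(std::uint32_t a, std::uint32_t b)
{
  unsigned t = 0;
  (void) shlr_model(b, t);   // shifted value is discarded, only T is used
  return rotcr_model(a, t);
}

// Shape of the MSB-extraction pattern: (x & 0x80000000) | (y >> 1).
// A shll on x moves its MSB into T; the rotcr on y then inserts it.
inline std::uint32_t msb_extract_model(std::uint32_t x, std::uint32_t y)
{
  unsigned t = 0;
  (void) shll_model(x, t);   // shifted value is discarded, only T is used
  return rotcr_model(y, t);
}
```

In both cases the shifted value of the first insn is thrown away, mirroring the scratch register created with gen_reg_rtx in the patch; only the T bit feeds the rotcr.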
Re: [PATCH] Add extra location information - PR43486
>>> and at that point you have lost your extra location information.
>>
>> Actually no, see the c-family/c-common.c patch, copied here, which ensures
>> that folding does preserve such information:
>
> Thanks. I think I would like some clarity on when the extra location
> information is available.

Typically the extra information would be used right before gimplification, just after the front end has done its job (e.g. via the PLUGIN_PRE_GENERICIZE hooks if implemented via a plug-in).

It could also be used during the front end itself, e.g. to generate more accurate error messages or warnings.

> For better or for worse the C frontend does sometimes call directly into
> fold-const, without going through c_fully_fold.

Ah yes... I can't resist noting the following comment in fold_convert_loc:

--
   Used by the middle-end for simple conversions in preference to
   calling the front-end's convert.
--

Not only used by the middle-end, apparently...

So I guess we should first clarify whether this is an API/layering violation.

> E.g., I see calls to fold_convert and fold_build2_loc. What happens then?

I've looked at most of these calls in the C and C++ front ends, and I suspect most of them (e.g. most calls to fold_build2_loc) correspond to internally generated expressions, not directly relevant to the source code.

In other words, my patch aims at providing more detailed slocs (source locations) for the source representation, so that e.g. static analysis tools, plug-ins, and diagnostic tools (potentially error messages) have more precise source location info.

In a few cases of calls to fold_convert_loc/fold_build2_loc/etc., I guess we might indeed lose some sloc info, to be confirmed. In that case, depending on whether calling e.g. fold_convert_loc() is indeed expected or not, we could refine the approach so as not to lose this information.

Also note that we are talking about very few cases, and the idea behind the patch is to provide as much extra info as possible, but there's always an available fallback, which is the main source location. In other words, this approach can be incremental, and does not need to be complete to become useful.

Arno
[testsuite] remove dg-do run from a vect test
The infrastructure for gcc.dg/vect tests determines whether the default is for tests to be compile-only or compile plus execute. Tests that should not be executed use { dg-do compile }, but no test should use { dg-do run }. This patch removes { dg-do run } from pr52298.c.

Tested on arm-none-eabi for default and big-endian, checked in on trunk as obvious. I'll backport to 4.6 when the branch is open.

Janis

2012-09-18  Janis Johnson  jani...@codesourcery.com

	* gcc.dg/vect/pr52298.c: Remove dg-do run.

Index: gcc.dg/vect/pr52298.c
===================================================================
--- gcc.dg/vect/pr52298.c	(revision 191440)
+++ gcc.dg/vect/pr52298.c	(working copy)
@@ -1,4 +1,3 @@
-/* { dg-do run } */
 /* { dg-options "-O1 -ftree-vectorize -fno-tree-pre -fno-tree-loop-im" } */
 
 extern void abort (void);
[testsuite] vect effective targets should use arm_neon_ok
In most cases a test that requires ARM NEON should use effective target arm_neon, which means that flags run for all tests include NEON support. The result is cached the first time it is checked for a multilib.

Vectorization tests, when run for ARM, add flags to support NEON if it's OK to do so, but those flags are not reflected in the cached results for arm_neon, nor should they be. Because of this, vect effective-target checks should use arm_neon_ok (as most already do) instead of arm_neon.

This patch changes the checks for 7 effective targets, allowing more tests to run and decreasing the number of failures. The only new failures I've seen in tests on arm-none-eabi with a variety of test multilibs are for big-endian with vect_multiple_sizes, which means either that vect_multiple_sizes should be false for big-endian or that there's a bug in ARM big-endian support.

Checked in on trunk as obvious. I'll backport to 4.6 when it's open.

Janis

2012-09-18  Janis Johnson  jani...@codesourcery.com

	* lib/target-supports.exp
	(check_effective_target_vect_widen_mult_qi_to_hi,
	check_effective_target_vect_widen_mult_hi_to_si,
	check_effective_target_vect_widen_mult_qi_to_hi_pattern,
	check_effective_target_vect_widen_mult_hi_to_si_pattern,
	check_effective_target_vect_pack_trunc,
	check_effective_target_vect_unpack,
	check_effective_target_vect_multiple_sizes): Check arm_neon_ok
	instead of arm_neon.
Index: lib/target-supports.exp
===================================================================
--- lib/target-supports.exp	(revision 191440)
+++ lib/target-supports.exp	(working copy)
@@ -3097,7 +3097,7 @@
             set et_vect_widen_mult_qi_to_hi_saved 0
         }
         if { [istarget powerpc*-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) } {
             set et_vect_widen_mult_qi_to_hi_saved 1
         }
     }
@@ -3131,7 +3131,7 @@
               || [istarget ia64-*-*]
               || [istarget i?86-*-*]
               || [istarget x86_64-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) } {
             set et_vect_widen_mult_hi_to_si_saved 1
         }
     }
@@ -3152,7 +3152,7 @@
     } else {
         set et_vect_widen_mult_qi_to_hi_pattern_saved 0
         if { [istarget powerpc*-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) } {
             set et_vect_widen_mult_qi_to_hi_pattern_saved 1
         }
     }
@@ -3177,7 +3177,7 @@
               || [istarget ia64-*-*]
               || [istarget i?86-*-*]
               || [istarget x86_64-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) } {
             set et_vect_widen_mult_hi_to_si_pattern_saved 1
         }
     }
@@ -3307,7 +3307,7 @@
               || [istarget i?86-*-*]
               || [istarget x86_64-*-*]
               || [istarget spu-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]
                   && [check_effective_target_arm_little_endian]) } {
             set et_vect_pack_trunc_saved 1
         }
@@ -,7 +,7 @@
               || [istarget x86_64-*-*]
               || [istarget spu-*-*]
               || [istarget ia64-*-*]
-              || ([istarget arm*-*-*] && [check_effective_target_arm_neon]
+              || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]
                   && [check_effective_target_arm_little_endian]) } {
             set et_vect_unpack_saved 1
         }
@@ -3751,7 +3751,7 @@
     global et_vect_multiple_sizes_saved
 
     set et_vect_multiple_sizes_saved 0
-    if { ([istarget arm*-*-*] && [check_effective_target_arm_neon]) } {
+    if { ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) } {
        set et_vect_multiple_sizes_saved 1
     }
     if { ([istarget x86_64-*-*] || [istarget i?86-*-*]) } {
[testsuite] for vect_multiple_sizes, skip instead of xfail for some checks
Seventeen tests in gcc.dg/vect that use vect_multiple_sizes have checks similar to:

/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { xfail vect_multiple_sizes } } } */
/* { dg-final { scan-tree-dump-times "can't determine dependence" 4 "vect" { target vect_multiple_sizes } } } */

When vect_multiple_sizes is true the first check shouldn't be reported as XFAIL; it should instead be skipped. The convention in other vect tests is to instead use:

/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target { ! vect_multiple_sizes } } } } */
/* { dg-final { scan-tree-dump-times "can't determine dependence" 4 "vect" { target vect_multiple_sizes } } } */

This patch fixes those 17 tests.

Tested on arm-none-eabi with a variety of test multilibs, checked in on trunk as obvious. I'll backport to 4.6 when the branch is open.

Janis

2012-09-18  Janis Johnson  jani...@codesourcery.com

	* gcc.dg/vect/no-vfa-vect-101.c: Skip a check for an irrelevant
	target instead of xfailing it.
	* gcc.dg/vect/no-vfa-vect-102.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-102a.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-37.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-79.c: Likewise.
	* gcc.dg/vect/vect-104.c: Likewise.
	* gcc.dg/vect/vect-outer-1-big-array.c: Likewise.
	* gcc.dg/vect/vect-outer-1.c: Likewise.
	* gcc.dg/vect/vect-outer-1a-big-array.c: Likewise.
	* gcc.dg/vect/vect-outer-1a.c: Likewise.
	* gcc.dg/vect/vect-outer-1b-big-array.c: Likewise.
	* gcc.dg/vect/vect-outer-1b.c: Likewise.
	* gcc.dg/vect/vect-outer-2b.c: Likewise.
	* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
	* gcc.dg/vect/vect-outer-3a.c: Likewise.
	* gcc.dg/vect/vect-outer-3b.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise.
Index: gcc.dg/vect/no-vfa-vect-101.c
===================================================================
--- gcc.dg/vect/no-vfa-vect-101.c	(revision 191440)
+++ gcc.dg/vect/no-vfa-vect-101.c	(working copy)
@@ -45,7 +45,7 @@
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { xfail vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "can't determine dependence" 1 "vect" { target { ! vect_multiple_sizes } } } } */
 /* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target vect_multiple_sizes } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc.dg/vect/no-vfa-vect-102.c
===================================================================
--- gcc.dg/vect/no-vfa-vect-102.c	(revision 191440)
+++ gcc.dg/vect/no-vfa-vect-102.c	(working copy)
@@ -53,7 +53,7 @@
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { xfail vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target { ! vect_multiple_sizes } } } } */
 /* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 "vect" { target vect_multiple_sizes } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc.dg/vect/no-vfa-vect-102a.c
===================================================================
--- gcc.dg/vect/no-vfa-vect-102a.c	(revision 191440)
+++ gcc.dg/vect/no-vfa-vect-102a.c	(working copy)
@@ -53,7 +53,7 @@
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { xfail vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 1 "vect" { target { ! vect_multiple_sizes } } } } */
 /* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 2 "vect" { target vect_multiple_sizes } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc.dg/vect/no-vfa-vect-37.c
===================================================================
--- gcc.dg/vect/no-vfa-vect-37.c	(revision 191440)
+++ gcc.dg/vect/no-vfa-vect-37.c	(working copy)
@@ -58,6 +58,6 @@
    If/when the aliasing problems are resolved, unalignment may
    prevent vectorization on some targets.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { xfail vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "can't determine dependence" 2 "vect" { target { ! vect_multiple_sizes } } } } */
 /* { dg-final { scan-tree-dump-times "can't determine dependence" 4 "vect" { target vect_multiple_sizes } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc.dg/vect/no-vfa-vect-79.c
===================================================================
--- gcc.dg/vect/no-vfa-vect-79.c
[testsuite] vect/fast-math-pr35982: skip check instead of xfail
Test gcc.dg/vect/fast-math-pr35982.c uses xfail in a dg-final check when it should instead skip the check for that effective target.

Tested on arm-none-eabi for a variety of test multilibs, checked in on trunk as obvious. I'll backport to 4.6 when the branch is open.

Janis

2012-09-18  Janis Johnson  jani...@codesourcery.com

	* gcc.dg/vect/fast-math-pr35982.c: Skip check instead of xfail.

Index: gcc.dg/vect/fast-math-pr35982.c
===================================================================
--- gcc.dg/vect/fast-math-pr35982.c	(revision 191440)
+++ gcc.dg/vect/fast-math-pr35982.c	(working copy)
@@ -21,5 +21,5 @@
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_extract_even_odd || vect_strided2 } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { xfail { vect_extract_even_odd || vect_strided2 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_extract_even_odd || vect_strided2 } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */