Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, Jun 26, 2014 at 07:43:55AM +0200, Marc Glisse wrote:

+  if (compute_type == TREE_TYPE (type)
+      && !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2)))
+    {
+      optab oplv, opl, oprv, opr, opo;
+      oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector);
+      oprv = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
+      opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default);
+      opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar);
+      opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);

How well are we separating lshiftrt from ashiftrt? Are ROTATE_EXPR always on an unsigned type so it is fine? Or do we expand one in terms of the other if it isn't available so it doesn't matter?

You're right, for RSHIFT_EXPR I should use
  oprv = vlshr_optab;
  opr = lshr_optab;
because expmed.c always uses lshiftrt.

+      if (optab_handler (oplv, TYPE_MODE (type)) != CODE_FOR_nothing
+          && optab_handler (opl, TYPE_MODE (type)) == CODE_FOR_nothing
+          && optab_handler (oprv, TYPE_MODE (type)) != CODE_FOR_nothing
+          && optab_handler (opr, TYPE_MODE (type)) == CODE_FOR_nothing)
+        {
+          opl = oplv;
+          opr = oprv;
+        }

Maybe arrange the conditions in order (oplv!= && oprv!= && (opl== || opr==)), or separate the replacement of opl and of opv into 2 separate ifs?

The reason I wrote them in this order was so that it fits on the lines ;) Guess two separate ifs would be fine.

Somehow, it feels like those checks should be somewhere in get_compute_type so we test both scalar and vector versions for each size, or we could call get_compute_type for both and pick the best.

It would be easier to just call get_compute_type in each case and pick the wider, but the preexisting case didn't bother, so I haven't bothered either. Passing two optabs optionally to get_compute_type would be IMHO too ugly.

+      compute_type = get_compute_type (LSHIFT_EXPR, opl, type);
+      if (compute_type == TREE_TYPE (type)
+          || compute_type != get_compute_type (RSHIFT_EXPR, opr, type)
+          || compute_type != get_compute_type (BIT_IOR_EXPR, opo, type))
+        compute_type = TREE_TYPE (type);

Since we have determined compute_type from ashift (let's assume that's the one least likely to exist), I would just check that optab is ok with using this mode for the other 2 ops. Here, if we have shifts in 128 bits and ior in both 128 and 256 bits, we will fail (I thought that might be the case in AVX, but apparently not). Plus it is faster ;-)

Makes sense.

Does rotate hit PR 56873? (I noticed the -mno-xop and no -mxop test)

There are exec testcases, -mxop AFAIK has rotates, so I think it is not worth testing. And there is a single asm match compile testcase, where the -mno-xop ensures that when somebody -mxop in RUNTESTFLAGS or CPU defaults to -mxop we don't get a matching failure, because in that case it would emit a vector rotate insn.

Jakub
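For readers following the lshiftrt/ashiftrt point above: a rotate lowered to two shifts and an IOR is only correct when the right shift is a logical one. Below is a minimal scalar sketch of that shape. The function names are made up for illustration and do not appear in the patch, and the second function relies on the compiler implementing >> on negative signed values as an arithmetic shift, which GCC does but ISO C does not require.

```c
#include <stdint.h>

/* Rotate left built from LSHIFT, RSHIFT and IOR, the shape a rotate is
   lowered to when no rotate instruction is available.  The right shift
   operates on an unsigned value, so it is a logical shift (lshiftrt).  */
uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;                          /* keep both shift counts in 0..31 */
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Same shape, but the right shift is done on a signed value.  With an
   arithmetic shift (ashiftrt) the sign bit is smeared into the high
   bits, so the result is not a rotate when the top bit of X is set.
   This is an illustration of the failure mode, not code to use.  */
uint32_t
rotl32_with_ashiftrt (uint32_t x, unsigned n)
{
  n &= 31;
  int32_t sx = (int32_t) x;         /* implementation-defined for large X */
  return (x << n) | (uint32_t) (sx >> ((32 - n) & 31));
}
```

The masking of the second shift count is there so that a rotate by zero does not shift by the full width, which would be undefined in C.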
Re: [PATCH] Change default for --param allow-...-data-races to off
On June 26, 2014 12:03:21 AM CEST, Martin Jambor mjam...@suse.cz wrote:

Hi,

On Wed, Jun 25, 2014 at 03:14:31PM -0600, Jeff Law wrote:
On 06/24/14 14:19, Martin Jambor wrote:
On Mon, Jun 23, 2014 at 03:35:01PM +0200, Bernd Edlinger wrote:

Hi Martin,

Well actually, I am not sure if we ever wanted to have a race condition here. Have you seen any impact of --param allow-store-data-races on any benchmark?

It's trivial to write one. The only pass that checks the param is tree loop invariant motion and it does that when it applies store-motion. Register pressure is increased by a factor of two. So I'd agree that we might want to disable this again for -Ofast. As nothing tests for the PACKED variants nor for the LOAD variant I'd rather remove those. Claiming we don't create races for those when you disable it via the param is simply not true.

Thanks,
Richard.

OK, please go ahead with your patch.

Perhaps not unsurprisingly, the patch is very similar. Bootstrapped and tested on x86_64-linux. OK for trunk?

Thanks,
Martin

2014-06-24  Martin Jambor  mjam...@suse.cz

    * params.def (PARAM_ALLOW_LOAD_DATA_RACES)
    (PARAM_ALLOW_PACKED_LOAD_DATA_RACES)
    (PARAM_ALLOW_PACKED_STORE_DATA_RACES): Removed.
    (PARAM_ALLOW_STORE_DATA_RACES): Set default to zero.
    * opts.c (default_options_optimization): Set
    PARAM_ALLOW_STORE_DATA_RACES to one at -Ofast.
    * doc/invoke.texi (allow-load-data-races)
    (allow-packed-load-data-races, allow-packed-store-data-races): Removed.
    (allow-store-data-races): Document the new default.

testsuite/
    * g++.dg/simulate-thread/bitfields-2.C: Remove allow-load-data-races
    parameter.
    * g++.dg/simulate-thread/bitfields.C: Likewise.
    * gcc.dg/simulate-thread/strict-align-global.c: Remove
    allow-packed-store-data-races parameter.
    * gcc.dg/simulate-thread/subfields.c: Likewise.
    * gcc.dg/tree-ssa/20050314-1.c: Set parameter allow-store-data-races
    to one.

Don't we want to deprecate, not remove the dead options?

Is there a mechanism for deprecating parameters (I could not quickly find any) or do you mean to leave them there and only document them as deprecated? I am not really concerned how we deal with the unused parameters, removing or any form of deprecating is fine with me.

--params are not a stable interface, so we can just remove those. Of course this would be the opportunity to introduce a real option for this task and leave the param as an implementation detail.

Richard.

Thanks,
Martin
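To make the store-motion point concrete, here is a hand-written approximation of what the transformation does at the source level. It is not compiler output, and the names shared_hits, count_as_written and count_after_store_motion are invented for this illustration.

```c
/* Imagine another thread may read or write this concurrently.  */
int shared_hits;

/* As written: shared_hits is stored to only when a negative element
   is actually found.  */
void
count_as_written (const int *v, int n)
{
  for (int i = 0; i < n; i++)
    if (v[i] < 0)
      shared_hits++;
}

/* Roughly what store motion in loop invariant motion can produce: the
   global is kept in a temporary for the whole loop and written back
   once at the end.  The final store happens even if no element was
   negative, so a concurrent update by another thread can be lost.
   That unconditional store is the data race the param controls.  */
void
count_after_store_motion (const int *v, int n)
{
  int tmp = shared_hits;
  for (int i = 0; i < n; i++)
    if (v[i] < 0)
      tmp++;
  shared_hits = tmp;
}
```

In a single thread both versions compute the same result, which is why the transformation is attractive; the difference only shows up when another thread touches shared_hits while the loop runs.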
Re: [PATCH, 3/10] skip swapping operands used in ccmp
On 26 June 2014 05:03, Jeff Law l...@redhat.com wrote:
On 06/25/14 08:44, Richard Earnshaw wrote:
On 23/06/14 07:58, Zhenqiang Chen wrote:

Hi,

Swapping operands in a ccmp will lead to illegal instructions. So the patch disables it in simplify_while_replacing.

The patch is separated from https://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html to keep it clean. It adds two files, ccmp.{c,h}, to hold all new ccmp-related functions.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-23  Zhenqiang Chen  zhenqiang.c...@linaro.org

    * Makefile.in: Add ccmp.o
    * ccmp.c: New file.

Do we really need a new file for one 10-line function? Seems like overkill. I think it would be better to just drop this function into recog.c.

Right. And if we did want a new file, clearly the #includes need to be trimmed :-)

Yes. The new file is not necessary for this patch itself. It is needed for [PATCH, 4/10] expand ccmp, where overall it will have more than 300 lines of code. And the #includes are trimmed. Previously all the code was in expr.c, which always conflicted when rebasing, so I moved it into separate files.

Thanks!
-Zhenqiang
Re: [PATCH, 3/10] skip swapping operands used in ccmp
On 25 June 2014 22:44, Richard Earnshaw rearn...@arm.com wrote:
On 23/06/14 07:58, Zhenqiang Chen wrote:

Hi,

Swapping operands in a ccmp will lead to illegal instructions. So the patch disables it in simplify_while_replacing.

The patch is separated from https://gcc.gnu.org/ml/gcc-patches/2014-02/msg01407.html to keep it clean. It adds two files, ccmp.{c,h}, to hold all new ccmp-related functions.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-06-23  Zhenqiang Chen  zhenqiang.c...@linaro.org

    * Makefile.in: Add ccmp.o
    * ccmp.c: New file.

Do we really need a new file for one 10-line function? Seems like overkill. I think it would be better to just drop this function into recog.c.

Also, can you explain more clearly what the problem is with swapping the operands? If this can't be done, then SWAPPABLE_OPERANDS is arguably doing the wrong thing; and that might mean that rtx class you've applied to your new code is incorrect.

Thanks for the comments. In previous tests, I got several new fails if the operands were swapped. I will try to reproduce it and get back to you.

Thanks!
-Zhenqiang

R.

    * ccmp.h: New file.
    * recog.c (simplify_while_replacing): Check ccmp_insn_p.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 5587b75..8757a30 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1169,6 +1169,7 @@ OBJS = \
        builtins.o \
        caller-save.o \
        calls.o \
+       ccmp.o \
        cfg.o \
        cfganal.o \
        cfgbuild.o \
diff --git a/gcc/ccmp.c b/gcc/ccmp.c
new file mode 100644
index 000..665c2a5
--- /dev/null
+++ b/gcc/ccmp.c
@@ -0,0 +1,62 @@
+/* Conditional compare related functions
+   Copyright (C) 2014-2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "regs.h"
+#include "expr.h"
+#include "optabs.h"
+#include "tree-iterator.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-ssa.h"
+#include "tree-ssanames.h"
+#include "target.h"
+#include "common/common-target.h"
+#include "df.h"
+#include "tree-ssa-live.h"
+#include "tree-outof-ssa.h"
+#include "cfgexpand.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "expmed.h"
+#include "ccmp.h"
+
+bool
+ccmp_insn_p (rtx object)
+{
+  rtx x = PATTERN (object);
+  if (targetm.gen_ccmp_first
+      && GET_CODE (x) == SET
+      && GET_CODE (XEXP (x, 1)) == COMPARE
+      && (GET_CODE (XEXP (XEXP (x, 1), 0)) == IOR
+          || GET_CODE (XEXP (XEXP (x, 1), 0)) == AND))
+    return true;
+  return false;
+}
+
diff --git a/gcc/ccmp.h b/gcc/ccmp.h
new file mode 100644
index 000..7e139aa
--- /dev/null
+++ b/gcc/ccmp.h
@@ -0,0 +1,25 @@
+/* Conditional compare related functions.
+   Copyright (C) 2014-2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#ifndef GCC_CCMP_H
+#define GCC_CCMP_H
+
+extern bool ccmp_insn_p (rtx);
+
+#endif /* GCC_CCMP_H  */
diff --git a/gcc/recog.c b/gcc/recog.c
index 8d10a4f..b53a28c 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "df.h"
 #include "insn-codes.h"
+#include "ccmp.h"
 
 #ifndef STACK_PUSH_CODE
 #ifdef STACK_GROWS_DOWNWARD
@@ -577,7 +578,8 @@ simplify_while_replacing (rtx *loc, rtx to, rtx object,
   enum rtx_code code = GET_CODE (x);
   rtx new_rtx = NULL_RTX;
 
-  if (SWAPPABLE_OPERANDS_P (x)
+  /* Do not swap compares in conditional compare instruction.  */
Re: [PATCH] Add missing -fdump-* options
On Wed, Jun 25, 2014 at 4:21 PM, Teresa Johnson tejohn...@google.com wrote:
On Tue, May 13, 2014 at 8:19 AM, Xinliang David Li davi...@google.com wrote:
On Tue, May 13, 2014 at 1:39 AM, Richard Biener richard.guent...@gmail.com wrote:
On Fri, May 9, 2014 at 5:54 PM, Teresa Johnson tejohn...@google.com wrote:

I discovered that the support for the documented -fdump-* options optimized, missed, note and optall was missing. Added that and fixed a minor typo in the documentation. Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

I'm not sure they were intended for user-consumption. ISTR they are just an implementation detail exposed by -fopt-info-X (which is where they are documented). The typo fix is ok, also adding a comment before the dump flags definition to the above fact. David, do I remember correctly?

I remember we talked about content filtering dump flags. Things like

  -fdump-xxx-ir -- dump IR only
  -fdump-xxx-transformation -- optimization note
  -fdump-xxx-debug -- other debug traces

Other than that, now I think 'details' and 'all' seem redundant. 'verbose' flag/modifier can achieve the same effect depending on the context.

  -fdump-xxx-ir-verbose -- dump IR, and turn on IR modifiers such as vops, lineno, etc
  -fdump-xxx-transformation-verbose -- dump transformations + missed optimizations + notes
  -fdump-xxx-debug-verbose -- turn on detailed trace.

The above proposal seems fine to me as a longer-term direction, but also seems somewhat orthogonal to the issue my patch is trying to solve in the short term, namely inconsistent documentation and behavior:

1) optimized, missed, note and optall are documented as being sub-options for -fdump-tree-* in doc/invoke.texi, but not implemented.
2) optimized, missed, note and optall are however enabled via -fdump-tree-all

Could we at least fix these issues in the short term, as it doesn't affect the documented behavior (but rather adds the documented behavior)?

Sure.

Richard.

Thanks,
Teresa

thanks,
David

Thanks,
Richard.

Thanks,
Teresa

2014-05-09  Teresa Johnson  tejohn...@google.com

    * doc/invoke.texi: Fix typo.
    * dumpfile.c: Add support for documented -fdump-* options
    optimized/missed/note/optall.

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 210157)
+++ doc/invoke.texi (working copy)
@@ -6278,7 +6278,7 @@ passes).
 @item missed
 Enable showing missed optimization information (only available in certain
 passes).
-@item notes
+@item note
 Enable other detailed optimization information (only available in
 certain passes).
 @item =@var{filename}
Index: dumpfile.c
===================================================================
--- dumpfile.c (revision 210157)
+++ dumpfile.c (working copy)
@@ -107,6 +107,10 @@ static const struct dump_option_value_info dump_op
   {"nouid", TDF_NOUID},
   {"enumerate_locals", TDF_ENUMERATE_LOCALS},
   {"scev", TDF_SCEV},
+  {"optimized", MSG_OPTIMIZED_LOCATIONS},
+  {"missed", MSG_MISSED_OPTIMIZATION},
+  {"note", MSG_NOTE},
+  {"optall", MSG_ALL},
   {"all", ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_TREE | TDF_RTL | TDF_IPA
             | TDF_STMTADDR | TDF_GRAPH | TDF_DIAGNOSTIC | TDF_VERBOSE
             | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV)},

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
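As a rough picture of why adding four rows to that table is all the patch needs, here is a self-contained sketch of a name-to-flag table with a parser for a '-'-separated option suffix. Everything in it is hypothetical: the DEMO_ constants, demo_options and demo_parse_flags are invented for this note, and the real table, flag values and parsing code in dumpfile.c differ.

```c
#include <string.h>

/* Hypothetical flag bits standing in for the TDF_ and MSG_ values.  */
enum
{
  DEMO_SCEV = 1 << 0,
  DEMO_OPTIMIZED = 1 << 1,
  DEMO_MISSED = 1 << 2,
  DEMO_NOTE = 1 << 3,
  DEMO_OPTALL = DEMO_OPTIMIZED | DEMO_MISSED | DEMO_NOTE
};

struct demo_option
{
  const char *name;
  int value;
};

/* A sub-option is recognised only if it has a row here, which is why a
   documented name without a row silently does not work.  */
static const struct demo_option demo_options[] = {
  { "scev", DEMO_SCEV },
  { "optimized", DEMO_OPTIMIZED },
  { "missed", DEMO_MISSED },
  { "note", DEMO_NOTE },
  { "optall", DEMO_OPTALL },
};

/* Parse a suffix such as "missed-note" and return the OR of the
   matching flags, or -1 if any component is not in the table.  */
int
demo_parse_flags (const char *suffix)
{
  const size_t n = sizeof demo_options / sizeof demo_options[0];
  int flags = 0;

  while (*suffix)
    {
      size_t len = strcspn (suffix, "-");   /* length of this component */
      size_t i;

      for (i = 0; i < n; i++)
        if (strlen (demo_options[i].name) == len
            && memcmp (demo_options[i].name, suffix, len) == 0)
          break;
      if (i == n)
        return -1;                          /* unknown sub-option */

      flags |= demo_options[i].value;
      suffix += len;
      if (*suffix == '-')
        suffix++;
    }
  return flags;
}
```

With this table, the misspelt "notes" from the old documentation is rejected while "note" is accepted, which mirrors the typo fix in invoke.texi.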
[PATCH v2] typeof: Remove type qualifiers for atomic types
GCC provides its own version of stdatomic.h since GCC 4.9. Here we have:

#define atomic_load_explicit(PTR, MO)                                   \
  __extension__                                                         \
  ({                                                                    \
    __auto_type __atomic_load_ptr = (PTR);                              \
    __typeof__ (*__atomic_load_ptr) __atomic_load_tmp;                  \
    __atomic_load (__atomic_load_ptr, &__atomic_load_tmp, (MO));        \
    __atomic_load_tmp;                                                  \
  })

According to

http://en.cppreference.com/w/c/atomic/atomic_load

(or in the standard 7.17.7.2 The atomic_load generic functions) we have

C atomic_load_explicit( volatile A* obj, memory_order order );

This test case

#include <stdatomic.h>

int ld(volatile atomic_int *i)
{
  return atomic_load_explicit(i, memory_order_relaxed);
}

yields on ARM

arm-rtems4.11-gcc -march=armv7-a -O2 test.c -S && cat test.s
        .arch armv7-a
        .fpu softvfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "test.c"
        .text
        .align  2
        .global ld
        .type   ld, %function
ld:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, [r0]
        sub     sp, sp, #8
        str     r3, [sp, #4]
        ldr     r0, [sp, #4]
        add     sp, sp, #8
        @ sp needed
        bx      lr
        .size   ld, .-ld
        .ident  "GCC: (GNU) 4.9.1 20140515 (prerelease)"

To solve this performance issue discard all qualifiers in __typeof__ and __auto_type for atomic types.

With this patch we have

arm-rtems4.11-gcc -march=armv7-a -O2 test.c -S && cat test.s
        .arch armv7-a
        .fpu softvfp
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "test.c"
        .text
        .align  2
        .global ld
        .type   ld, %function
ld:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r0, [r0]
        bx      lr
        .size   ld, .-ld
        .ident  "GCC: (GNU) 4.9.1 20140625 (prerelease)"

gcc/c/ChangeLog
2014-06-26  Sebastian Huber  sebastian.hu...@embedded-brains.de

    * c-parser.c (c_parser_declaration_or_fndef): Discard all type
    qualifiers in __auto_type for atomic types.
    (c_parser_typeof_specifier): Discard all type qualifiers in
    __typeof__ for atomic types.

gcc/testsuite/ChangeLog
2014-06-26  Sebastian Huber  sebastian.hu...@embedded-brains.de

    * gcc.dg/typeof-2.c: New testcase.
---
 gcc/c/c-parser.c                | 21 ++---
 gcc/testsuite/gcc.dg/typeof-2.c | 28
 2 files changed, 34 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/typeof-2.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 99ff546..037da03 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1707,14 +1707,10 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
                                                  initializer);
                  init = convert_lvalue_to_rvalue (init_loc, init, true, true);
                  tree init_type = TREE_TYPE (init.value);
-                 /* As with typeof, remove _Atomic and const
-                    qualifiers from atomic types.  */
+                 /* As with typeof, remove all qualifiers from atomic types.  */
                  if (init_type != error_mark_node && TYPE_ATOMIC (init_type))
                    init_type
-                     = c_build_qualified_type (init_type,
-                                               (TYPE_QUALS (init_type)
-                                                & ~(TYPE_QUAL_ATOMIC
-                                                    | TYPE_QUAL_CONST)));
+                     = c_build_qualified_type (init_type, TYPE_UNQUALIFIED);
                  bool vm_type = variably_modified_type_p (init_type,
                                                           NULL_TREE);
                  if (vm_type)
@@ -3011,16 +3007,11 @@ c_parser_typeof_specifier (c_parser *parser)
       if (was_vm)
        ret.expr = c_fully_fold (expr.value, false, &ret.expr_const_operands);
       pop_maybe_used (was_vm);
-      /* For use in macros such as those in stdatomic.h, remove
-        _Atomic and const qualifiers from atomic types.  (Possibly
-        all qualifiers should be removed; const can be an issue for
-        more macros using typeof than just the stdatomic.h
-        ones.)  */
+      /*
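The extra stack traffic in the first assembly listing comes from the temporary inheriting the volatile qualifier through __typeof__. A small sketch of that mechanism, using a plain volatile int rather than an atomic type so that it does not depend on the patch: the helper names and the IS_VOLATILE_INT macro are invented for this note, and the code assumes a compiler with GNU __typeof__ and C11 _Generic, such as GCC or Clang.

```c
/* Evaluates to 1 if OBJ is an object of type 'volatile int', else 0.
   Taking the address keeps the qualifier visible to _Generic, which
   would otherwise drop it during lvalue conversion.  */
#define IS_VOLATILE_INT(obj) \
  _Generic (&(obj), volatile int *: 1, default: 0)

/* Mirrors the shape of the stdatomic.h macro: the temporary is declared
   with __typeof__ (*p), so it is itself volatile and the compiler has
   to store it to memory and load it back.  */
int
load_via_typeof_tmp (volatile int *p)
{
  __typeof__ (*p) tmp = *p;
  return tmp;
}

/* Reports whether a __typeof__ (*p) temporary is volatile-qualified.  */
int
typeof_tmp_is_volatile (void)
{
  volatile int v = 0;
  volatile int *p = &v;
  __typeof__ (*p) tmp = *p;
  (void) tmp;
  return IS_VOLATILE_INT (tmp);
}

/* The same check for an explicitly unqualified temporary.  */
int
plain_tmp_is_volatile (void)
{
  volatile int v = 0;
  volatile int *p = &v;
  int tmp = *p;
  (void) tmp;
  return IS_VOLATILE_INT (tmp);
}
```

The patch makes the atomic case behave like plain_tmp_is_volatile by having __typeof__ and __auto_type drop all qualifiers when the operand has an atomic type.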
RE: [PATCH] Change default for --param allow-...-data-races to off
Hi,

On Thu, 26 Jun 2014 08:43:46, Richard Biener wrote:
On June 26, 2014 12:03:21 AM CEST, Martin Jambor mjam...@suse.cz wrote:

Hi,

On Wed, Jun 25, 2014 at 03:14:31PM -0600, Jeff Law wrote:
On 06/24/14 14:19, Martin Jambor wrote:
On Mon, Jun 23, 2014 at 03:35:01PM +0200, Bernd Edlinger wrote:

Hi Martin,

Well actually, I am not sure if we ever wanted to have a race condition here. Have you seen any impact of --param allow-store-data-races on any benchmark?

It's trivial to write one. The only pass that checks the param is tree loop invariant motion and it does that when it applies store-motion. Register pressure is increased by a factor of two. So I'd agree that we might want to disable this again for -Ofast. As nothing tests for the PACKED variants nor for the LOAD variant I'd rather remove those. Claiming we don't create races for those when you disable it via the param is simply not true.

Thanks,
Richard.

OK, please go ahead with your patch.

Perhaps not unsurprisingly, the patch is very similar. Bootstrapped and tested on x86_64-linux. OK for trunk?

Thanks,
Martin

2014-06-24  Martin Jambor  mjam...@suse.cz

    * params.def (PARAM_ALLOW_LOAD_DATA_RACES)
    (PARAM_ALLOW_PACKED_LOAD_DATA_RACES)
    (PARAM_ALLOW_PACKED_STORE_DATA_RACES): Removed.
    (PARAM_ALLOW_STORE_DATA_RACES): Set default to zero.
    * opts.c (default_options_optimization): Set
    PARAM_ALLOW_STORE_DATA_RACES to one at -Ofast.
    * doc/invoke.texi (allow-load-data-races)
    (allow-packed-load-data-races, allow-packed-store-data-races): Removed.
    (allow-store-data-races): Document the new default.

testsuite/
    * g++.dg/simulate-thread/bitfields-2.C: Remove allow-load-data-races
    parameter.
    * g++.dg/simulate-thread/bitfields.C: Likewise.
    * gcc.dg/simulate-thread/strict-align-global.c: Remove
    allow-packed-store-data-races parameter.
    * gcc.dg/simulate-thread/subfields.c: Likewise.
    * gcc.dg/tree-ssa/20050314-1.c: Set parameter allow-store-data-races
    to one.

Don't we want to deprecate, not remove the dead options?

Is there a mechanism for deprecating parameters (I could not quickly find any) or do you mean to leave them there and only document them as deprecated? I am not really concerned how we deal with the unused parameters, removing or any form of deprecating is fine with me.

--params are not a stable interface, so we can just remove those. Of course this would be the opportunity to introduce a real option for this task and leave the param as an implementation detail.

Well, of course, given the fact that --param allow-store-data-races=0 is actually used now by linux kernel makefiles, we should keep this parameter. I'd agree with Richard about the other parameters.

Note however that they are not really a secret any more. See https://gcc.gnu.org/wiki/Atomic/GCCMM/ExecutiveSummary where these --params are documented. Should this page be adjusted too when we remove them?

Bernd.

Richard.

Thanks,
Martin
Re: [PATCH] Fix parts of PR61607
On Wed, 25 Jun 2014, Jeff Law wrote:
On 06/25/14 08:05, Richard Biener wrote:

This removes restrictions in DOM cprop_operand that inhibit some optimizations. The volatile pointer thing is really, really old and no longer necessary, while the loop-depth consideration is only valid for loop-closed PHI nodes (but we're not in loop-closed SSA in DOM) - the coalescing is handled in the out-of-SSA phase by inserting copies appropriately.

Bootstrapped on x86_64-unknown-linux-gnu, ok?

Thanks,
Richard.

2014-06-25  Richard Biener  rguent...@suse.de

    PR tree-optimization/61607
    * tree-ssa-dom.c (cprop_operand): Remove restriction on
    propagating volatile pointers and on loop depth.

The first hunk is OK. I thought we had tests for the "do not copy propagate out of a loop nest" case in the suite. Did you check that tests in BZ 19038 still generate good code after this change? If we still generate good code for those tests, then this hunk is fine too.

I have applied the first hunk and will investigate further. Testing didn't show any issue and I know how to retain the check but not cause the missed optimization shown in PR61607.

Richard.
[GSoC][match-and-simplify] factor gimple expressions and builtin functions
This patch factors expression checking for GIMPLE. Generates code as:

  if (TREE_CODE (opname) == SSA_NAME)
    {
      gimple def_stmt = SSA_NAME_DEF_STMT (opname);
      if (is_gimple_assign (def_stmt))
        {
          if (gimple_assign_rhs_code (def_stmt) == expr-code1)
            {
            }
          if (gimple_assign_rhs_code (def_stmt) == expr-code2)
            {
            }
        }
    }

We cannot use switch-case for convert/nop since we use CONVERT_EXPR_CODE_P, so I used if statements. Unfortunately, back-tracking is still done. I shall look into that.

    * genmatch.c (dt_node::get_expr_code): New member function.
    (dt_node::is_gimple_expr): Likewise.
    (dt_node::is_gimple_fn): Likewise.
    (dt_operand::kids_type): New struct.
    (dt_operand::gen_gimple_expr): Remove.
    (dt_operand::gen_gimple_expr_expr): Remove 2nd argument and change
    return type to unsigned. Adjust code-gen for expressions.
    (dt_operand::gen_gimple_expr_fn): Remove 2nd argument and change
    return type to unsigned.
    (dt_operand::grok_kids): New member function.
    (dt_operand::gen_gimple_kids): Likewise.
    (dt_operand::gen_gimple): Adjust code-gen for gimple expressions.
    Call dt_operand::gen_gimple_kids.
    (decision_tree::gen_gimple): Call dt_operand::gen_gimple_kids.
Thanks and Regards,
Prathamesh

Index: gcc/genmatch.c
===================================================================
--- gcc/genmatch.c (revision 211973)
+++ gcc/genmatch.c (working copy)
@@ -325,6 +325,10 @@ struct dt_node

   virtual void gen_gimple (FILE *) {}
   virtual void gen_generic (FILE *) {}
+
+  bool get_expr_code (enum tree_code);
+  bool is_gimple_expr ();
+  bool is_gimple_fn ();
 };

 struct dt_operand: public dt_node
@@ -333,6 +337,15 @@ struct dt_operand: public dt_node
   dt_operand *match_dop;
   dt_operand *parent;
   unsigned pos;
+
+  struct kids_type {
+    vec<dt_node *> gimple_exprs;
+    vec<dt_node *> fns;
+    vec<dt_node *> others;
+    dt_node *true_operand;
+
+    kids_type (): gimple_exprs (vNULL), fns (vNULL), others (vNULL), true_operand (0) {}
+  };

   dt_operand (enum dt_type type, operand *op_, dt_operand *match_dop_, dt_operand *parent_ = 0, unsigned pos_ = 0)
        : dt_node (type), op (op_), match_dop (match_dop_), parent (parent_), pos (pos_) {}
@@ -342,9 +355,8 @@ struct dt_operand: public dt_node
   unsigned gen_predicate (FILE *, const char *);
   unsigned gen_match_op (FILE *, const char *);

-  unsigned gen_gimple_expr (FILE *, const char *);
-  void gen_gimple_expr_expr (FILE *, expr *);
-  void gen_gimple_expr_fn (FILE *, expr *);
+  unsigned gen_gimple_expr_expr (FILE *);
+  unsigned gen_gimple_expr_fn (FILE *);

   unsigned gen_generic_expr (FILE *, const char *, bool);
   void gen_generic_expr_expr (FILE *, expr *, const char *, bool);
@@ -352,6 +364,9 @@ struct dt_operand: public dt_node

   char *get_name (char *);
   void gen_opname (char *, unsigned);
+
+  void grok_kids(kids_type);
+  void gen_gimple_kids (FILE *);
 };


@@ -906,9 +921,10 @@ dt_operand::gen_match_op (FILE *f, const
   return 1;
 }

-void
-dt_operand::gen_gimple_expr_fn (FILE *f, expr *e)
+unsigned
+dt_operand::gen_gimple_expr_fn (FILE *f)
 {
+  expr *e = static_cast<expr *> (op);
   unsigned n_ops = e->ops.length ();

   fn_id *op = static_cast <fn_id *> (e->operation->op);
@@ -924,20 +940,22 @@ dt_operand::gen_gimple_expr_fn (FILE *f,
       fprintf (f, "if ((%s = do_valueize (valueize, %s)) != 0)\n", child_opname, child_opname);
       fprintf (f, "{\n");
     }
+
+  return n_ops + 1;
 }

-void
-dt_operand::gen_gimple_expr_expr (FILE *f, expr *e)
+unsigned
+dt_operand::gen_gimple_expr_expr (FILE *f)
 {
+  expr *e = static_cast<expr *> (op);
   unsigned n_ops = e->ops.length ();

   operator_id *op_id = static_cast <operator_id *> (e->operation->op);

   if (op_id->code == NOP_EXPR || op_id->code == CONVERT_EXPR)
-    fprintf (f, if (is_gimple_assign (def_stmt)\n
-             CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt;
+    fprintf (f, if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt;
   else
-    fprintf (f, "if (is_gimple_assign (def_stmt) && gimple_assign_rhs_code (def_stmt) == %s)\n", op_id->id);
+    fprintf (f, "if (gimple_assign_rhs_code (def_stmt) == %s)\n", op_id->id);

   fprintf (f, "{\n");

@@ -950,22 +968,9 @@ dt_operand::gen_gimple_expr_expr (FILE *
       fprintf (f, "if ((%s = do_valueize (valueize, %s)) != 0)\n", child_opname, child_opname);
       fprintf (f, "{\n");
     }
-}
-
-unsigned
-dt_operand::gen_gimple_expr (FILE *f, const char *opname)
-{
-  expr *e = static_cast<expr *> (op);
-
-  fprintf (f, "if (TREE_CODE (%s) == SSA_NAME)\n", opname);
-  fprintf (f, "{\n");
-
-  fprintf (f, "gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n", opname);
-  (e->operation->op->kind == id_base::CODE) ? gen_gimple_expr_expr (f, e) : gen_gimple_expr_fn (f, e);
-
-  return e->ops.length () + 2;
-}

+  return n_ops + 1;
+}

 void
 dt_operand::gen_generic_expr_expr (FILE *f, expr *e, const char *opname,
@@
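The factoring this patch describes can be pictured with a toy matcher: the "is this an assignment" test is done once, and the individual codes are then tried in an if-chain because the conversion test is a predicate over several codes rather than a single case label. This is only an analogy in plain C. The DEMO_ names and demo_match are invented here and have nothing to do with the real GIMPLE API or with the code genmatch emits.

```c
#include <stddef.h>

/* Stand-ins for tree codes and for a defining statement.  */
enum demo_code { DEMO_PLUS, DEMO_MINUS, DEMO_NOP, DEMO_CONVERT, DEMO_OTHER };

struct demo_stmt
{
  int is_assign;              /* plays the role of is_gimple_assign */
  enum demo_code rhs_code;    /* plays the role of gimple_assign_rhs_code */
};

/* A predicate covering two codes, like CONVERT_EXPR_CODE_P; this is
   what rules out a plain switch on the code.  */
#define DEMO_CONVERT_CODE_P(c) ((c) == DEMO_NOP || (c) == DEMO_CONVERT)

/* Returns the index of the first pattern that matches, or -1.  The
   outer checks are shared by all patterns instead of being repeated
   inside each one.  */
int
demo_match (const struct demo_stmt *def_stmt)
{
  if (def_stmt == NULL || !def_stmt->is_assign)
    return -1;

  if (def_stmt->rhs_code == DEMO_PLUS)
    return 0;
  if (def_stmt->rhs_code == DEMO_MINUS)
    return 1;
  if (DEMO_CONVERT_CODE_P (def_stmt->rhs_code))
    return 2;

  return -1;
}
```

In the real generator a failed inner match falls through to the next if, which is the back-tracking the message says is still present.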
Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Ping x2.

On 14/6/20 2:24 PM, Chung-Lin Tang wrote:

Ping.

On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:

Hi Richard,

As we talked about earlier, here's a patch to add a compiler option to work around Cortex-A9 MPCore errata 761319:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

What the option does basically, is to scan for volatile loads during reorg, and add a dmb barrier after it. It also strives to make dmb conditionally executed under TARGET_THUMB2, which means a new Thumb-2 specific *memory_barrier_t2 pattern in sync.md, with adjusted conds/predicable attributes and %? in output strings.

Patch originally written by Julian, with additions by Meador, and finally a few trivial adjustments by me. Again, we've been carrying this fix for a release or two. Okay for trunk?

Thanks,
Chung-Lin

2014-06-09  Julian Brown  jul...@codesourcery.com
            Meador Inge  mead...@codesourcery.com
            Chung-Lin Tang  clt...@codesourcery.com

    * config/arm/arm.c (arm_option_override): Emit warning if
    -mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
    (any_volatile_loads_p): New.
    (arm_cortex_a9_errata_reorg): New.
    (arm_reorg): Call arm_cortex_a9_errata_reorg.
    * config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add
    option.
    * config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
    (*memory_barrier_t2): New, allow conditional execution on Thumb-2.
    * doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
    documentation.

testsuite/
    * lib/target-supports.exp (check_effective_target_arm_dmb): New.
    * gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
    * gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
    * gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
    * gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
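For a feel of what the option achieves, here is a source-level approximation: a volatile load followed immediately by a full memory barrier. This is not how the patch works internally, since it inserts the barrier during machine reorg on the instruction stream. The function name is invented for this note, and __atomic_thread_fence is the GCC/Clang builtin, which on ARMv7 targets is typically emitted as a dmb instruction.

```c
/* Load through a volatile pointer, then issue a full barrier so that
   no later load can be observed ahead of this one.  Doing this by hand
   at every volatile access is what the proposed option automates.  */
int
load_then_barrier (volatile int *p)
{
  int value = *p;                             /* the volatile load */
  __atomic_thread_fence (__ATOMIC_SEQ_CST);   /* full barrier after it */
  return value;
}
```

On targets other than ARM the fence maps to whatever the architecture needs for a sequentially consistent barrier, so the sketch compiles and runs anywhere the builtin is available.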
Re: [PATCH] Fix parts of PR61607
On Thu, 26 Jun 2014, Richard Biener wrote:
On Wed, 25 Jun 2014, Jeff Law wrote:
On 06/25/14 08:05, Richard Biener wrote:

This removes restrictions in DOM cprop_operand that inhibit some optimizations. The volatile pointer thing is really, really old and no longer necessary, while the loop-depth consideration is only valid for loop-closed PHI nodes (but we're not in loop-closed SSA in DOM) - the coalescing is handled in the out-of-SSA phase by inserting copies appropriately.

Bootstrapped on x86_64-unknown-linux-gnu, ok?

Thanks,
Richard.

2014-06-25  Richard Biener  rguent...@suse.de

    PR tree-optimization/61607
    * tree-ssa-dom.c (cprop_operand): Remove restriction on
    propagating volatile pointers and on loop depth.

The first hunk is OK. I thought we had tests for the "do not copy propagate out of a loop nest" case in the suite. Did you check that tests in BZ 19038 still generate good code after this change? If we still generate good code for those tests, then this hunk is fine too.

I have applied the first hunk and will investigate further. Testing didn't show any issue and I know how to retain the check but not cause the missed optimization shown in PR61607.

Let's try to summarize what the restriction is supposed to avoid. It tries to avoid introducing uses of SSA names defined inside a loop outside of it, because if the SSA name is live over the backedge we will then have an overlapping life-range which prevents out-of-SSA from coalescing it to a single register.

Now, the existing test is not working in that way. Rather, the best way we have to ensure this property (all outside uses go through a copy that is placed on exit edges rather than possibly on the backedge) is to go into loop-closed SSA form. This is also where the PHI nodes that confuse DOM in PR61607 come from in the first place.

Now, as the existing measure is ineffective in some cases, out-of-SSA has gotten the ability to deal with this (or a subset):

  /* If elimination of a PHI requires inserting a copy on a backedge,
     then we will have to split the backedge which has numerous
     undesirable performance effects.

     A significant number of such cases can be handled here by inserting
     copies into the loop itself.  */
  insert_backedge_copies ();

Now, this doesn't seem to deal with outside uses. But eventually the coalescing code already assigns proper cost to backedge copies so that we choose to place copies on the exit edges rather than the backedge ones - seems not so from looking at coalesce_cost_edge.

So I think that we should remove the copy-propagation restrictions and instead address this in out-of-SSA.

For now the following patch retains the exact same restriction in DOM as it is present in copyprop (but not in FRE - ok my recent fault, or in VRP). By avoiding to record the equivalency for PHIs (where we know that either all or no uses should be covered by the loop depth check) we retain the ability to record the equivalency for the two loop exit PHI nodes and thus the threading (if only on the false path).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I'll try to see what happens to the PR19038 testcases (though that PR is a mess ...)

Richard.

2014-06-26  Richard Biener  rguent...@suse.de

    PR tree-optimization/61607
    * tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment
    explaining why we restrict copies on loop depth.
    * tree-ssa-dom.c (cprop_operand): Remove restriction on
    loop depth.
    (record_equivalences_from_phis): Instead add it here.

    * gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase.

Index: gcc/tree-ssa-copy.c
===================================================================
--- gcc/tree-ssa-copy.c (revision 212012)
+++ gcc/tree-ssa-copy.c (working copy)
@@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi)
        arg_value = valueize_val (arg);

       /* Avoid copy propagation from an inner into an outer loop.
-        Otherwise, this may move loop variant variables outside of
-        their loops and prevent coalescing opportunities.  If the
-        value was loop invariant, it will be hoisted by LICM and
-        exposed for copy propagation.
-        ???  The value will be always loop invariant.
+        Otherwise, this may introduce uses of loop variant variables
+        outside of their loops and prevent coalescing opportunities.
         In loop-closed SSA form do not copy-propagate through
         PHI nodes in blocks with a loop exit edge predecessor.  */
       if (TREE_CODE (arg_value) == SSA_NAME
Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c (revision 212013)
+++ gcc/tree-ssa-dom.c (working copy)
@@ -1234,7 +1234,13 @@ record_equivalences_from_phis (basic_blo
             this, since this is a true assignment and not an equivalence
             inferred from a
Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, Jun 26, 2014 at 08:10:15AM +0200, Jakub Jelinek wrote: +compute_type = get_compute_type (LSHIFT_EXPR, opl, type); +if (compute_type == TREE_TYPE (type) +|| compute_type != get_compute_type (RSHIFT_EXPR, opr, type) +|| compute_type != get_compute_type (BIT_IOR_EXPR, opo, type)) + compute_type = TREE_TYPE (type); Since we have determined compute_type from ashift (let's assume that's the one least likely to exist), I would just check that optab is ok with using this mode for the other 2 ops. Here, if we have shifts in 128 bits and ior in both 128 and 256 bits, we will fail (I thought that might be the case in AVX, but apparently not). Plus it is faster ;-) Makes sense. ... So like this? I've also changed get_compute_type so that it will DTRT even for -mavx and V4DImode vectors, so e.g. f5/f6/f8 routines in avx-pr57233.c improve. Also, even for shifts by scalar, if e.g. target doesn't have shifts by scalar at all, and only has narrower vector by vector shifts, it should handle this case too. 2014-06-25 Jakub Jelinek ja...@redhat.com PR tree-optimization/57233 PR tree-optimization/61299 * tree-vect-generic.c (get_compute_type, count_type_subparts): New functions. (expand_vector_operations_1): Use them. If {L,R}ROTATE_EXPR would be lowered to scalar shifts, check if corresponding shifts and vector BIT_IOR_EXPR are supported and don't lower or lower just to narrower vector type in that case. * expmed.c (expand_shift_1): Fix up handling of vector shifts and rotates. * gcc.dg/pr57233.c: New test. * gcc.target/i386/pr57233.c: New test. * gcc.target/i386/sse2-pr57233.c: New test. * gcc.target/i386/avx-pr57233.c: New test. * gcc.target/i386/avx2-pr57233.c: New test. * gcc.target/i386/avx512f-pr57233.c: New test. * gcc.target/i386/xop-pr57233.c: New test. 
--- gcc/tree-vect-generic.c.jj 2014-05-11 22:21:28.0 +0200 +++ gcc/tree-vect-generic.c 2014-06-26 10:18:32.815895488 +0200 @@ -1334,15 +1334,67 @@ lower_vec_perm (gimple_stmt_iterator *gs update_stmt (gsi_stmt (*gsi)); } +/* Return type in which CODE operation with optab OP can be + computed. */ + +static tree +get_compute_type (enum tree_code code, optab op, tree type) +{ + /* For very wide vectors, try using a smaller vector mode. */ + tree compute_type = type; + if (op + (!VECTOR_MODE_P (TYPE_MODE (type)) + || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)) +{ + tree vector_compute_type + = type_for_widest_vector_mode (TREE_TYPE (type), op); + if (vector_compute_type != NULL_TREE + (TYPE_VECTOR_SUBPARTS (vector_compute_type) + TYPE_VECTOR_SUBPARTS (compute_type)) + (optab_handler (op, TYPE_MODE (vector_compute_type)) + != CODE_FOR_nothing)) + compute_type = vector_compute_type; +} + + /* If we are breaking a BLKmode vector into smaller pieces, + type_for_widest_vector_mode has already looked into the optab, + so skip these checks. */ + if (compute_type == type) +{ + enum machine_mode compute_mode = TYPE_MODE (compute_type); + if (VECTOR_MODE_P (compute_mode)) + { + if (op optab_handler (op, compute_mode) != CODE_FOR_nothing) + return compute_type; + if (code == MULT_HIGHPART_EXPR + can_mult_highpart_p (compute_mode, + TYPE_UNSIGNED (compute_type))) + return compute_type; + } + /* There is no operation in hardware, so fall back to scalars. */ + compute_type = TREE_TYPE (type); +} + + return compute_type; +} + +/* Helper function of expand_vector_operations_1. Return number of + vector elements for vector types or 1 for other types. */ + +static inline int +count_type_subparts (tree type) +{ + return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1; +} + /* Process one statement. If we identify a vector operation, expand it. 
*/ static void expand_vector_operations_1 (gimple_stmt_iterator *gsi) { gimple stmt = gsi_stmt (*gsi); - tree lhs, rhs1, rhs2 = NULL, type, compute_type; + tree lhs, rhs1, rhs2 = NULL, type, compute_type = NULL_TREE; enum tree_code code; - enum machine_mode compute_mode; optab op = unknown_optab; enum gimple_rhs_class rhs_class; tree new_rhs; @@ -1455,11 +1507,83 @@ expand_vector_operations_1 (gimple_stmt_ { op = optab_for_tree_code (code, type, optab_scalar); + compute_type = get_compute_type (code, op, type); + if (compute_type == type) + return; /* The rtl expander will expand vector/scalar as vector/vector -if necessary. Don't bother converting the stmt here. */ - if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing -
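Editor's sketch, not part of the patch: the lowering discussed in this thread expresses a rotate as a left shift, a right shift and a bitwise OR. The scalar loop below shows that identity per element; the right shift has to be a logical one, which is why the thread settles on the lshr optabs. The name rotate_left_elts and the four-element width are assumptions made for illustration only.

```c
#include <stdint.h>

#define NELTS 4   /* hypothetical vector width, for illustration only */

/* Rotate each 32-bit element of V left by N bits, using only a left
   shift, a logical right shift and a bitwise OR.  The elements are
   unsigned, so >> is a logical shift here.  */
static void
rotate_left_elts (uint32_t v[NELTS], unsigned int n)
{
  n &= 31;   /* keep the count in range */
  for (int i = 0; i < NELTS; i++)
    /* The "& 31" avoids an out-of-range shift by 32 when n == 0.  */
    v[i] = (v[i] << n) | (v[i] >> ((32 - n) & 31));
}
```

With this sketch, rotating 0x12345678 left by 8 gives 0x34567812, and a count of 0 leaves every element unchanged.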
Commit: RX: Fix typo in description of RX vector attribute
Hi Guys,

  I am checking in the patch below to fix a small typo in the
  description of the RX vector function attribute.

Cheers
  Nick

gcc/ChangeLog
2014-06-26  Nick Clifton  ni...@redhat.com

        * doc/extend.texi (Function Attributes): Fix typo in description
        of RX vector attribute.

Index: doc/extend.texi
===
--- doc/extend.texi (revision 212014)
+++ doc/extend.texi (working copy)
@@ -4317,7 +4317,7 @@
 @item vector
 @cindex @code{vector} attribute
-This RX attribute is similar to the @code{attribute}, including its
+This RX attribute is similar to the @code{interrupt} attribute, including its
 parameters, but does not make the function an interrupt-handler type
 function (i.e. it retains the normal C function calling ABI).  See the
 @code{interrupt} attribute for a description of its arguments.
Re: [PATCH] Fix parts of PR61607
On Thu, 26 Jun 2014, Richard Biener wrote: On Thu, 26 Jun 2014, Richard Biener wrote: On Wed, 25 Jun 2014, Jeff Law wrote: On 06/25/14 08:05, Richard Biener wrote: This removes restrictions in DOM cprop_operand that inhibit some optimizations. The volatile pointer thing is really realy old and no longer necessary while the loop-depth consideration is only valid for loop-closed PHI nodes (but we're not in loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA phase by inserting copies appropriately. Bootstrapped on x86_64-unknown-linux-gnu, ok? Thanks, Richard. 2014-06-25 Richard Biener rguent...@suse.de PR tree-optimization/61607 * tree-ssa-dom.c (cprop_operand): Remove restriction on propagating volatile pointers and on loop depth. The first hunk is OK. I thought we had tests for the do not copy propagate out of a loop nest in the suite. Did you check that tests in BZ 19038 still generate good code after this change? If we still generate good code for those tests, then this hunk is fine too. I have applied the first hunk and will investigate further. Testing didn't show any issue and I know how to retain the check but not cause the missed optimization shown in PR61607. Let's try to summarize what the restriction is supposed to avoid. It tries to avoid introducing uses of SSA names defined inside a loop outside of it because if the SSA name is live over the backedge we will then have an overlapping life-range which prevents out-of-SSA from coalescing it to a single register. Now, the existing test is not working in that way. Rather the best way we have to ensure this property (all outside uses go through a copy that is placed on exit edges rather than possibly on the backedge) is to go into loop-closed SSA form. This is also where the PHI nodes that confuse DOM in PR61607 come from in the first place. 
Now as the existing measure is ineffective in some cases out-of-SSA has gotten the ability to deal with this (or a subset): /* If elimination of a PHI requires inserting a copy on a backedge, then we will have to split the backedge which has numerous undesirable performance effects. A significant number of such cases can be handled here by inserting copies into the loop itself. */ insert_backedge_copies (); now, this doesn't seem to deal with outside uses. But eventually the coalescing code already assigns proper cost to backedge copies so that we choose to place copies on the exit edges rather than the backedge ones - seems not so from looking at coalesce_cost_edge. So I think that we should remove the copy-propagation restrictions and instead address this in out-of-SSA. For now the following patch retains the exact same restriction in DOM as it is present in copyprop (but not in FRE - ok my recent fault, or in VRP). By avoiding to record the equivalency for PHIs (where we know that either all or no uses should be covered by the loop depth check) we retain the ability to record the equivalency for the two loop exit PHI nodes and thus the threading (if only on the false path). Bootstrap and regtest running on x86_64-unknown-linux-gnu. I'll try to see what happens to the PR19038 testcases (though that PR is a mess ...) I checked the very original one (thin6d.f from sixtrack) and the generated assembly for -Ofast is the same without any patch and with _all_ loop_depth_of_name restrictions removed from both DOM and copyprop (thus making loop_depth_of_name dead). The cost of out-of-SSA copies for backedges (or in the case of the PR, loop latch edges causing an edge split) is dealt with by /* Inserting copy on critical edge costs more than inserting it elsewhere. */ if (EDGE_CRITICAL_P (e)) mult = 2; in coalesce_cost_edge. So in the end, without a testcase to investigate, I'd propose to get rid of those restrictions. I'm still going forward with the patch below for now. 
Richard. Richard. 2014-06-26 Richard Biener rguent...@suse.de PR tree-optimization/61607 * tree-ssa-copy.c (copy_prop_visit_phi_node): Adjust comment explaining why we restrict copies on loop depth. * tree-ssa-dom.c (cprop_operand): Remove restriction on on loop depth. (record_equivalences_from_phis): Instead add it here. * gcc.dg/tree-ssa/ssa-dom-thread-5.c: New testcase. Index: gcc/tree-ssa-copy.c === --- gcc/tree-ssa-copy.c (revision 212012) +++ gcc/tree-ssa-copy.c (working copy) @@ -401,11 +401,8 @@ copy_prop_visit_phi_node (gimple phi) arg_value = valueize_val (arg); /* Avoid copy propagation from an inner into an outer loop. - Otherwise, this may move loop variant variables outside of - their loops and
Commit: Testsuite: Fix typo in proc check_effective_target_trapping
Hi Guys,

  I am applying the patch below as an obvious fix for a typo in the
  check_effective_target_trapping proc in the testsuite's
  target-supports.exp file.

Cheers
  Nick

gcc/testsuite/ChangeLog
2014-06-26  Nick Clifton  ni...@redhat.com

        * lib/target-supports.exp (check_effective_target_trapping): Fix
        typo.

Index: testsuite/lib/target-supports.exp
===
--- testsuite/lib/target-supports.exp (revision 212014)
+++ testsuite/lib/target-supports.exp (working copy)
@@ -706,7 +706,7 @@
 # Return 1 if trapping arithmetic is available, 0 otherwise.

 proc check_effective_target_trapping {} {
-    return [check_no_compiler_messages scheduling object {
+    return [check_no_compiler_messages trapping object {
         add (int a, int b) { return a + b; }
     } "-ftrapv"]
 }
Commit: FRV: Remove redundant assert
Hi Guys,

  I am checking in the patch below to remove a redundant assert, now
  that DECL_SECTION_NAME returns a string rather than a tree.

Cheers
  Nick

gcc/ChangeLog
2014-06-26  Nick Clifton  ni...@redhat.com

        * config/frv/frv.c (frv_in_small_data_p): Remove redundant assert.

Index: config/frv/frv.c
===
--- config/frv/frv.c (revision 212016)
+++ config/frv/frv.c (working copy)
@@ -9488,7 +9488,6 @@
       section_name = DECL_SECTION_NAME (decl);
       if (section_name)
         {
-          gcc_assert (TREE_CODE (section_name) == STRING_CST);
           if (frv_string_begins_with (section_name, ".sdata"))
             return true;
           if (frv_string_begins_with (section_name, ".sbss"))
[fixincludes] Fix iso/math_c99.h signbit on Solaris
As reported before https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00213.html the Solaris signbit implementation for GNU C in iso/math_c99.h gives warnings with -Wstrict-aliasing, breaking Go bootstrap. The following patch fixes this along the lines of the current solaris_math_8 fix. I've tried to use here documents to improve readability by avoiding the quoting necessary in C strings, but to no avail. All leading tabs were converted into blanks instead, and I couldn't get the pattern to match. So I've decided to go for the working, though a bit less readable, version. Bootstrapped with no regressions on i386-pc-solaris2.11. Passes fixincludes make check. Tested the testcase on x86_64-unknown-linux-gnu. Ok for mainline? Thanks. Rainer 2014-06-25 Rainer Orth r...@cebitec.uni-bielefeld.de fixincludes: * inclhack.def (solaris_math_11): New fix. * fixincl.x: Regenerate. * tests/base/iso/math_c99.h [SOLARIS_MATH_11_CHECK]: New test. gcc/testsuite: * gcc.dg/signbit-sa.c: New test. # HG changeset patch # Parent 4bb6a086dc232e27851ff33b22610d45dd18be57 Fix iso/math_c99.h signbit on Solaris diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x --- a/fixincludes/fixincl.x +++ b/fixincludes/fixincl.x @@ -2,11 +2,11 @@ * * DO NOT EDIT THIS FILE (fixincl.x) * - * It has been AutoGen-ed Tuesday January 7, 2014 at 12:02:54 PM MET + * It has been AutoGen-ed Wednesday June 25, 2014 at 05:24:42 PM MEST * From the definitionsinclhack.def * and the template file fixincl */ -/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Jan 7 12:02:54 MET 2014 +/* DO NOT SVN-MERGE THIS FILE, EITHER Wed Jun 25 17:24:42 MEST 2014 * * You must regenerate it. Use the ./genfixes script. * @@ -15,7 +15,7 @@ * certain ANSI-incompatible system header files which are fixed to work * correctly with ANSI C and placed in a directory that GNU C will search. * - * This file contains 224 fixup descriptions. + * This file contains 225 fixup descriptions. * * See README for more information. 
* @@ -6893,6 +6893,60 @@ static const char* apzSolaris_Math_9Patc /* * * * * * * * * * * * * * * * * * * * * * * * * * * + * Description of Solaris_Math_11 fix + */ +tSCC zSolaris_Math_11Name[] = + solaris_math_11; + +/* + * File name selection pattern + */ +tSCC zSolaris_Math_11List[] = + iso/math_c99.h\0; +/* + * Machine/OS name selection pattern + */ +#define apzSolaris_Math_11Machs (const char**)NULL + +/* + * content selection pattern - do fix if pattern found + */ +tSCC zSolaris_Math_11Select0[] = + @\\(#\\)math_c99\\.h[ \t]+1\\.[0-9]+[ \t]+[0-9/]+ ; + +#defineSOLARIS_MATH_11_TEST_CT 1 +static tTestDesc aSolaris_Math_11Tests[] = { + { TT_EGREP,zSolaris_Math_11Select0, (regex_t*)NULL }, }; + +/* + * Fix Command Arguments for Solaris_Math_11 + */ +static const char* apzSolaris_Math_11Patch[] = { +format, +#undef\tsignbit\n\ +#define\tsignbit(x)\t(sizeof(x) == sizeof(float) \\\n\ +\t\t\t ? __builtin_signbitf(x) \\\n\ +\t\t\t : sizeof(x) == sizeof(long double) \\\n\ +\t\t\t ? __builtin_signbitl(x) \\\n\ +\t\t\t : __builtin_signbit(x)), +^#undef[ \t]+signbit\n\ +#if defined\\(__sparc\\)\n\ +#define[ \t]+signbit\\(x\\)[ \t]+__extension__\\( \n\ +[ \t]+\\{[ \t]*__typeof\\(x\\)[ \t]*__x_s[ \t]*=[ \t]*\\(x\\);[ \t]*\n\ +[ \t]+\\(int\\)[ \t]*\\(\\*\\(unsigned[ \t]*\\*\\)[ \t]*\\__x_s[ \t]*[ \t]*31\\);[ \t]*\\}\\)\n\ +#elif defined\\(__i386\\) \\|\\| defined\\(__amd64\\)\n\ +#define[ \t]+signbit\\(x\\)[ \t]+__extension__\\( \n\ +[ \t]+\\{ __typeof\\(x\\) __x_s = \\(x\\); \n\ +[ \t]+\\(sizeof \\(__x_s\\) == sizeof \\(float\\) \\? \n\ +[ \t]+\\(int\\) \\(\\*\\(unsigned \\*\\) \\__x_s 31\\) : \n\ +[ \t]+sizeof \\(__x_s\\) == sizeof \\(double\\) \\? 
\n\ +[ \t]+\\(int\\) \\(\\(\\(unsigned \\*\\) \\__x_s\\)\\[1\\] 31\\) : \n\ +[ \t]+\\(int\\) \\(\\(\\(unsigned short \\*\\) \\__x_s\\)\\[4\\] 15\\)\\); \\}\\)\n\ +#endif, +(char*)NULL }; + +/* * * * * * * * * * * * * * * * * * * * * * * * * * + * * Description of Solaris_Once_Init_1 fix */ tSCC zSolaris_Once_Init_1Name[] = @@ -9187,9 +9241,9 @@ static const char* apzX11_SprintfPatch[] * * List of all fixes */ -#define REGEX_COUNT 261 +#define REGEX_COUNT 262 #define MACH_LIST_SIZE_LIMIT 187 -#define FIX_COUNT224 +#define FIX_COUNT225 /* * Enumerate the fixes @@ -9361,6 +9415,7 @@ typedef enum { SOLARIS_MATH_4_FIXIDX, SOLARIS_MATH_8_FIXIDX, SOLARIS_MATH_9_FIXIDX, +SOLARIS_MATH_11_FIXIDX, SOLARIS_ONCE_INIT_1_FIXIDX, SOLARIS_POSIX_SPAWN_RESTRICT_FIXIDX, SOLARIS_POW_INT_OVERLOAD_FIXIDX, @@ -10252,6 +10307,11 @@ tFixDesc fixDescList[ FIX_COUNT ] = { SOLARIS_MATH_9_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE, aSolaris_Math_9Tests, apzSolaris_Math_9Patch, 0 }, + { zSolaris_Math_11Name,zSolaris_Math_11List, +
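Editor's sketch, not part of the patch: the replacement text installed by the solaris_math_11 fix selects a compiler builtin by operand size instead of reading the sign bit through a type-punned pointer, which is what triggered the -Wstrict-aliasing warnings. The macro below repeats that replacement under the hypothetical name my_signbit so that it cannot clash with the real signbit from math.h; it assumes a compiler that provides the GCC __builtin_signbit family.

```c
/* Same shape as the replacement text in the fix, under a hypothetical
   name.  Every branch is compiled for every operand type, but only the
   one selected by sizeof is evaluated.  */
#define my_signbit(x) (sizeof (x) == sizeof (float) \
                       ? __builtin_signbitf (x) \
                       : sizeof (x) == sizeof (long double) \
                       ? __builtin_signbitl (x) \
                       : __builtin_signbit (x))
```

The result is nonzero for any value with the sign bit set, including negative zero: my_signbit (-0.0f) is nonzero while my_signbit (0.0) is zero. No pointer casts are involved, so there is nothing for the aliasing warning to complain about.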
Re: [PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
On 26/06/14 15:50, Jakub Jelinek wrote: On Thu, Jun 26, 2014 at 11:06:26AM +1000, Kugan wrote: Since our aim is to perform single bit checks, why don’t we just use this representation internally (i.e. _rtx-unchanging = 1 if SRP_SIGNED and _rtx-volatil = 1 if SRP_UNSIGNED). As for SUBREG_PROMOTED_SIGNED_P, we still have to return -1 or 1 depending on SRP_POINTER or SRP_UNSIGNED. Why don't you make SUBREG_PROMOTED_UNSIGNED_P just return 0/1 (i.e. the single bit), and for places where it would like to match both SRP_UNSIGNED and SRP_POINTER use SUBREG_PROMOTED_GET () SRP_UNSIGNED or so? If we use SUBREG_PROMOTED_GET () SRP_UNSIGNED, we will miss the case SRP_SIGNED_AND_UNSIGNED. Though this is not wrong, we might miss some optimization opportunities here. We can however use (SUBREG_PROMOTED_GET () != SRP_SIGNED) if you like this. Other option is to define another macro that explicilty says some think like SUBREG_PROMOTED_POINTER_OR_UNSIGNED_P. Ok, sure, if you want to make the test pass for SRP_UNSIGNED, SRP_POINTER and SRP_UNSIGNED_AND_SIGNED, then != SRP_SIGNED is the right thing. What I wanted is make SUBREG_PROMOTED_UNSIGNED_P be a 0/1 again. --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -1448,8 +1448,11 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code, || byte_vtrue != byte_vfalse || (SUBREG_PROMOTED_VAR_P (vtrue) != SUBREG_PROMOTED_VAR_P (vfalse)) -|| (SUBREG_PROMOTED_UNSIGNED_P (vtrue) -!= SUBREG_PROMOTED_UNSIGNED_P (vfalse))) +|| ((SUBREG_PROMOTED_UNSIGNED_P (vtrue) + != SUBREG_PROMOTED_UNSIGNED_P (vfalse)) + (SUBREG_PROMOTED_SIGNED_P (vtrue) +!= SUBREG_PROMOTED_SIGNED_P (vfalse Shouldn't this be SUBREG_PROMOTED_GET (vtrue) != SUBREG_PROMOTED_GET (vfalse) ? The reason why I checked like this to cover one side with SRP_SIGNED_AND_UNSIGNED and other with SRP_SIGNED or SRP_UNSIGNED. If we check SUBREG_PROMOTED_GET (vtrue) != SUBREG_PROMOTED_GET (vfalse) we will miss that. What you have above is just wrong though. 
Either you need to make sure the flags are the same (i.e. GET != GET), and keep the SET a few lines below as is, or you would allow (some?) mismatches of the promotion flags, but in that case you'd need to deal with it in the SET conservatively. Like, if one is SRP_SIGNED_AND_UNSIGNED and another one is just SRP_SIGNED or just SRP_UNSIGNED, you'd use the simpler one, if one is promoted and another one is not, you'd not make the SUBREG promoted at all, etc. Not worth it IMHO, at least not for now. + +/* Predicate to check if RTX of SUBREG_PROMOTED_VAR_P() is promoted + for UNSIGNED type. In case of SRP_POINTER, SUBREG_PROMOTED_UNSIGNED_P + returns -1 as this is in most cases handled like unsigned extension, + except for generating instructions where special code is emitted for + (ptr_extend insns) on some architectures. */ #define SUBREG_PROMOTED_UNSIGNED_P(RTX) \ - ((RTL_FLAG_CHECK1 (SUBREG_PROMOTED_UNSIGNED_P, (RTX), SUBREG)-volatil) \ - ? -1 : (int) (RTX)-unchanging) + RTL_FLAG_CHECK1 (SUBREG_PROMOTED_UNSIGNED_P, (RTX), SUBREG)-volatil)\ + + (RTX)-unchanging) == 0) ? -1 : ((RTX)-volatil == 1)) + +/* Checks if RTX of SUBREG_PROMOTED_VAR_P() is promotd for given SIGN. */ +#define SUBREG_CHECK_PROMOTED_SIGN(RTX, SIGN) \ Use space rather than tab. Also, why do we need this macro? Can't you just use SUBREG_PROMOTED_GET () == sign ? I mean, sign in that case is typically just 0 or 1. Again I wanted to cover SRP_SIGNED_AND_UNSIGNED as well in this case. Ah, ok. It is fine as is (with the whitespace change). Thanks for the review. I have now changed it based on the comments. Is this look OK? Thanks, Kugan gcc/ 2014-06-26 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET (expand_call): Likewise. * cfgexpand.c (expand_gimple_stmt_1): Use SUBREG_PROMOTED_GET SRP_UNSIGNED to get promoted mode as SRP_POINTER is treated the same way as SRP_UNSIGNED. 
* combine.c (record_promoted_value): Skip 0 comparison with SUBREG_PROMOTED_UNSIGNED_P as it now returns only 0 or 1. * expr.c (convert_move): Use SUBREG_CHECK_PROMOTED_SIGN instead of SUBREG_PROMOTED_UNSIGNED_P. (convert_modes): Likewise. (store_expr): Use SUBREG_PROMOTED_GET SRP_UNSIGNED to get promoted mode as SRP_POINTER is treated the same way as SRP_UNSIGNED. (expand_expr_real_1): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET. * function.c (assign_parm_setup_reg): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET. * ifcvt.c (noce_emit_cmove): Updated to use SUBREG_PROMOTED_GET and SUBREG_PROMOTED_SET. * internal-fn.c
[PATCH, alpha]: FIX PR61586, ICE in alpha_handle_trap_shadows
Hello!

Attached patch handles (barrier) RTXes that can be reached when
__builtin_trap builtin is used.

2014-06-26  Uros Bizjak  ubiz...@gmail.com

        PR target/61586
        * config/alpha/alpha.c (alpha_handle_trap_shadows): Handle BARRIER RTX.

testsuite/ChangeLog:

2014-06-26  Uros Bizjak  ubiz...@gmail.com

        PR target/61586
        * gcc.target/alpha/pr61586.c: New test.

Bootstrapped and regression tested on alphaev68-linux-gnu.

OK for mainline and 4.9?

Uros.

Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c (revision 211941)
+++ config/alpha/alpha.c (working copy)
@@ -8717,6 +8717,11 @@ alpha_handle_trap_shadows (void)
                     }
                   break;

+                case BARRIER:
+                  /* __builtin_unreachable can expand to no code at all,
+                     leaving (barrier) RTXes in the instruction stream.  */
+                  goto close_shadow_notrapb;
+
                 case JUMP_INSN:
                 case CALL_INSN:
                 case CODE_LABEL:
@@ -8732,6 +8737,7 @@ alpha_handle_trap_shadows (void)
                   n = emit_insn_before (gen_trapb (), i);
                   PUT_MODE (n, TImode);
                   PUT_MODE (i, TImode);
+                close_shadow_notrapb:
                   trap_pending = 0;
                   shadow.used.i = 0;
                   shadow.used.fp = 0;
Index: testsuite/gcc.target/alpha/pr61586.c
===
--- testsuite/gcc.target/alpha/pr61586.c (revision 0)
+++ testsuite/gcc.target/alpha/pr61586.c (working copy)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mieee" } */
+
+void foo (int *dimensions, double **params, int hh)
+{
+  if (params[hh])
+    ;
+  else if (dimensions[hh] 0)
+    params[hh][0] = 1.0f;
+}
Re: [PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
On Thu, Jun 26, 2014 at 07:41:22PM +1000, Kugan wrote: 2014-06-26 Kugan Vivekanandarajah kug...@linaro.org * calls.c (precompute_arguments): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET Missing full stop. --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3297,7 +3297,7 @@ expand_gimple_stmt_1 (gimple stmt) ; else if (promoted) { - int unsignedp = SUBREG_PROMOTED_UNSIGNED_P (target); + int unsignedp = SUBREG_PROMOTED_GET (target) SRP_UNSIGNED; From what I understand, here you want the -1/0/1 value and not 2, so that is int unsignedp = SUBREG_PROMOTED_GET (target); if (unsignedp == SRP_SIGNED_AND_UNSIGNED) unsignedp = SRP_UNSIGNED; I think. Do you agree? BTW, the final patch will probably need to be tested on one of the weirdo ptr_extend targets (ia64-hpux or x86_64-linux -mx32). --- a/gcc/expr.c +++ b/gcc/expr.c @@ -329,7 +329,7 @@ convert_move (rtx to, rtx from, int unsignedp) if (GET_CODE (from) == SUBREG SUBREG_PROMOTED_VAR_P (from) (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from))) = GET_MODE_PRECISION (to_mode)) - SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp) + SUBREG_CHECK_PROMOTED_SIGN (from, unsignedp)) I think unsignedp (misnamed) may be -1/0/1 here, so either SUBREG_CHECK_PROMOTED_SIGN needs to handle those 3, or you need to use something else. If it handles all 3 values, then it would be say ((SIGN) == SRP_POINTER ? SUBREG_PROMOTED_GET (RTX) == SRP_POINTER : (SIGN) == SRP_SIGNED ? SUBREG_PROMOTED_SIGNED_P (RTX) : SUBREG_PROMOTED_UNSIGNED_P (RTX)) or so. from = gen_lowpart (to_mode, from), from_mode = to_mode; gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to)); @@ -703,7 +703,7 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns if (GET_CODE (x) == SUBREG SUBREG_PROMOTED_VAR_P (x) GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))) = GET_MODE_SIZE (mode) - SUBREG_PROMOTED_UNSIGNED_P (x) == unsignedp) + SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp)) x = gen_lowpart (mode, SUBREG_REG (x)); Similarly. 
@@ -5203,24 +5203,25 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal) == TYPE_PRECISION (TREE_TYPE (exp))) { if (TYPE_UNSIGNED (TREE_TYPE (exp)) - != SUBREG_PROMOTED_UNSIGNED_P (target)) + != SUBREG_PROMOTED_GET (target) SRP_UNSIGNED) Here TYPE_UNSIGNED is 0 or 1, so if you define SUBREG_PROMOTED_CHECK_SIGN the way suggested above, this would be SUBREG_PROMOTED_CHECK_SIGN then, or if (TYPE_UNSIGNED (TREE_TYPE (exp)) ? SUBREG_PROMOTED_UNSIGNED_P (target) : SUBREG_PROMOTED_SIGNED_P (target)) { /* Some types, e.g. Fortran's logical*4, won't have a signed version, so use the mode instead. */ tree ntype = (signed_or_unsigned_type_for -(SUBREG_PROMOTED_UNSIGNED_P (target), TREE_TYPE (exp))); +(SUBREG_PROMOTED_GET (target) SRP_UNSIGNED, I'd just use TYPE_UNSIGNED (TREE_TYPE (exp)) here instead, no reason to repeat what the guarding condition did. + TREE_TYPE (exp))); if (ntype == NULL) ntype = lang_hooks.types.type_for_mode (TYPE_MODE (TREE_TYPE (exp)), -SUBREG_PROMOTED_UNSIGNED_P (target)); +SUBREG_PROMOTED_GET (target) SRP_UNSIGNED); exp = fold_convert_loc (loc, ntype, exp); } exp = fold_convert_loc (loc, lang_hooks.types.type_for_mode (GET_MODE (SUBREG_REG (target)), -SUBREG_PROMOTED_UNSIGNED_P (target)), +SUBREG_PROMOTED_GET (target) SRP_UNSIGNED), exp); I believe fold_convert only considers zero and non-zero, so no idea what we want here for SRP_POINTER. Doing what we used to do would be SUBREG_PROMOTED_GET (target) != SRP_SIGNED. 
inner_target = SUBREG_REG (target); @@ -5234,14 +5235,14 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal) if (CONSTANT_P (temp) GET_MODE (temp) == VOIDmode) { temp = convert_modes (GET_MODE (target), TYPE_MODE (TREE_TYPE (exp)), - temp, SUBREG_PROMOTED_UNSIGNED_P (target)); + temp, SUBREG_PROMOTED_GET (target) SRP_UNSIGNED); temp = convert_modes (GET_MODE (SUBREG_REG (target)), GET_MODE (target), temp, - SUBREG_PROMOTED_UNSIGNED_P (target)); + SUBREG_PROMOTED_GET (target) SRP_UNSIGNED); } convert_move (SUBREG_REG (target), temp, - SUBREG_PROMOTED_UNSIGNED_P (target)); + SUBREG_PROMOTED_GET (target)
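Editor's sketch, not part of the patch: the three-way SUBREG_CHECK_PROMOTED_SIGN that Jakub outlines in this message can be modelled on a plain struct. Everything below is a hypothetical stand-in. The two flag bits follow the encoding Kugan proposed earlier in the thread (unchanging set for a signed promotion, volatil set for an unsigned one, neither set for a pointer extension), and the assumption that the signed and unsigned predicates each test a single bit is the editor's reading of the discussion, not GCC's actual rtl.h.

```c
/* Hypothetical model of the two promotion flag bits; not GCC's rtx.  */
struct promoted_flags
{
  unsigned int volatil : 1;      /* set when promoted as unsigned */
  unsigned int unchanging : 1;   /* set when promoted as signed */
};

/* Four-valued state: -1 pointer, 0 signed, 1 unsigned, 2 both.  */
static int
promoted_get (struct promoted_flags f)
{
  return f.volatil ? (f.unchanging ? 2 : 1) : (int) f.unchanging - 1;
}

/* Three-way check: SIGN is -1 (pointer), 0 (signed) or 1 (unsigned).
   A subreg promoted as both signed and unsigned passes either sign
   check, but never the pointer check.  */
static int
check_promoted_sign (struct promoted_flags f, int sign)
{
  return sign == -1 ? promoted_get (f) == -1
         : sign == 0 ? (int) f.unchanging
         : (int) f.volatil;
}
```

In this model a value with both bits set satisfies check_promoted_sign for sign 0 and for sign 1, which is the extra optimization opportunity the SRP_SIGNED_AND_UNSIGNED state is meant to expose.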
Re: [PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
Kugan kugan.vivekanandara...@linaro.org writes:

> @@ -5203,24 +5203,25 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal)
>           == TYPE_PRECISION (TREE_TYPE (exp)))
>         {
>           if (TYPE_UNSIGNED (TREE_TYPE (exp))
> -             != SUBREG_PROMOTED_UNSIGNED_P (target))
> +             != SUBREG_PROMOTED_GET (target) & SRP_UNSIGNED)

& has lower precedence than !=.  You should have got a warning that
fails bootstrap.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
And now for something completely different.
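Editor's sketch, not part of the thread: the precedence problem Andreas points out is that in C the bitwise & operator binds less tightly than !=, so a comparison followed by a mask is parsed as "compare first, then mask". The two functions below use hypothetical names and a hypothetical one-bit mask to show that the unparenthesized and the parenthesized forms can disagree. GCC's -Wparentheses warns about the first form, which is the warning that would break a -Werror bootstrap.

```c
/* Hypothetical one-bit mask standing in for the flag being tested.  */
enum { FLAG_UNSIGNED = 1 };

/* As written in the patch: parsed as
   (type_unsigned != flags) & FLAG_UNSIGNED.  */
static int
mismatch_as_written (int type_unsigned, int flags)
{
  return type_unsigned != flags & FLAG_UNSIGNED;
}

/* Presumably intended: mask first, then compare.  */
static int
mismatch_intended (int type_unsigned, int flags)
{
  return type_unsigned != (flags & FLAG_UNSIGNED);
}
```

For type_unsigned = 1 and flags = 3, the first form evaluates (1 != 3) & 1 and reports a mismatch, while the second evaluates 1 != (3 & 1) and reports none.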
Re: testsuite allocators patch
On 25/06/14 21:47 +0200, François Dumont wrote: I would like to finally propose this patch before the one on _Rb_tree, as a separate one. I have adopted the same evolution on the tracker_allocator with even a perfect forwarding constructor to allow its usage on top of the uneq_allocator which take a personality parameter. Doing so I realized that move_assign_neg.cc tests were not accurate enough as they needed a non move propagating allocator and the uneq_allocator were not explicitly non propagating. Ah, that's good to improve them then. Index: testsuite/util/testsuite_allocator.h === --- testsuite/util/testsuite_allocator.h(revision 211713) +++ testsuite/util/testsuite_allocator.h(working copy) @@ -87,103 +81,142 @@ static intdestructCount_; }; - // A simple basic allocator that just forwards to the + // Helper to detect inconsistency between type used to instantiate an + // allocator and the underlying allocator value_type. + templatetypename T, typename Alloc, + typename = typename Alloc::value_type +struct check_consistent_alloc_value_type; + + templatetypename T, typename Alloc +struct check_consistent_alloc_value_typeT, Alloc, T +{ typedef T value_type; }; + + // An allocator facade that just intercepts some calls and forward them to the // tracker_allocator_counter to fulfill memory requests. This class This comment is no longer true, tracker_allocator_counter does not fulfil the memory requests. // is templated on the target object type, but tracker isn't. 
- templateclass T - class tracker_allocator - { - private: -typedef tracker_allocator_counter counter_type; + templatetypename T, typename Alloc = std::allocatorT +class tracker_allocator : public Alloc +{ +private: + typedef tracker_allocator_counter counter_type; - public: -typedef T value_type; -typedef T* pointer; -typedef const T* const_pointer; -typedef T reference; -typedef const T const_reference; -typedef std::size_tsize_type; -typedef std::ptrdiff_t difference_type; + typedef __gnu_cxx::__alloc_traitsAlloc AllocTraits; + +public: + typedef typename + check_consistent_alloc_value_typeT, Alloc::value_type value_type; + typedef typename AllocTraits::pointer pointer; + typedef typename AllocTraits::size_type size_type; Thanks for doing this - I think it makes the facade more useful if it uses allocator_traits and so can be combined with SimpleAllocator and CustomPointerAlloc. -templateclass U struct rebind { typedef tracker_allocatorU other; }; + templateclass U + struct rebind + { + typedef tracker_allocatorU, + typename AllocTraits::template rebindU::other other; + }; -pointer -address(reference value) const _GLIBCXX_NOEXCEPT -{ return std::__addressof(value); } +#if __cplusplus = 201103L + tracker_allocator() = default; + tracker_allocator(const tracker_allocator) = default; + tracker_allocator(tracker_allocator) = default; -const_pointer -address(const_reference value) const _GLIBCXX_NOEXCEPT -{ return std::__addressof(value); } + // Perfect forwarding constructor. + templatetypename... _Args + tracker_allocator(_Args... __args) + : Alloc(std::forward_Args(__args)...) + { } +#else + tracker_allocator() _GLIBCXX_USE_NOEXCEPT The _GLIBCXX_USE_NOEXCEPT macro expands to nothing in C++03 mode, so you might as well omit it in the #else branch. OK for trunk if you make the tracker_allocator comment correct. Thanks!
Re: [PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
On Thu, Jun 26, 2014 at 12:12:03PM +0200, Jakub Jelinek wrote:
> > @@ -5234,14 +5235,14 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal)
> >        if (CONSTANT_P (temp) && GET_MODE (temp) == VOIDmode)
> >          {
> >            temp = convert_modes (GET_MODE (target), TYPE_MODE (TREE_TYPE (exp)),
> > -                                temp, SUBREG_PROMOTED_UNSIGNED_P (target));
> > +                                temp, SUBREG_PROMOTED_GET (target) & SRP_UNSIGNED);
> >            temp = convert_modes (GET_MODE (SUBREG_REG (target)),
> >                                  GET_MODE (target), temp,
> > -                                SUBREG_PROMOTED_UNSIGNED_P (target));
> > +                                SUBREG_PROMOTED_GET (target) & SRP_UNSIGNED);
> >          }
> >
> >        convert_move (SUBREG_REG (target), temp,
> > -                    SUBREG_PROMOTED_UNSIGNED_P (target));
> > +                    SUBREG_PROMOTED_GET (target) & SRP_UNSIGNED);

In all 3 cases here you want -1/0/1 and treat SRP_SIGNED_AND_UNSIGNED as
probably 1, so supposedly you want a macro for that and use it in the
3 cases here, in expand_gimple_stmt_1 etc.

That macro (not sure about best name for it), which would for
SUBREG_PROMOTED_GET -1, 0, 1, 2 return -1, 0, 1, 1 could be defined e.g. as

  ((RTL_FLAG_CHECK1 (SUBREG_PROMOTED_GET, (RTX), SUBREG)->volatil) ? 1 \
   : (RTX)->unchanging - 1)

        Jakub
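Editor's sketch, not part of the thread: the mapping Jakub describes (-1, 0, 1, 2 becomes -1, 0, 1, 1) can be checked on a small model of the two flag bits. The struct and function names are hypothetical, and the bit encoding is the one Kugan proposed earlier in the thread (unchanging for signed, volatil for unsigned, neither for a pointer extension); this is not GCC's real rtx layout.

```c
/* Hypothetical model of the two promotion flag bits; not GCC's rtx.  */
struct promoted_flags
{
  unsigned int volatil : 1;      /* set when promoted as unsigned */
  unsigned int unchanging : 1;   /* set when promoted as signed */
};

/* Full four-valued state: -1 pointer, 0 signed, 1 unsigned, 2 both.  */
static int
promoted_get (struct promoted_flags f)
{
  return f.volatil ? (f.unchanging ? 2 : 1) : (int) f.unchanging - 1;
}

/* Same expression shape as the macro suggested above: any value with
   the unsigned bit set collapses to 1, otherwise unchanging - 1 yields
   0 for signed and -1 for pointer.  */
static int
promoted_sign (struct promoted_flags f)
{
  return f.volatil ? 1 : (int) f.unchanging - 1;
}
```

Walking the four bit combinations through both functions gives -1, 0, 1, 2 from promoted_get and -1, 0, 1, 1 from promoted_sign, matching the table in the message.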
Re: Fix finding reg-sets of call insn in collect_fn_hard_reg_usage
On 19-06-14 18:47, Richard Henderson wrote:
And I forgot to mention it might be worth while to notice simple
recursion. Avoid the early exit path if caller == callee, despite the
caller-save info not being valid.

Richard,

attached patch enables handling of self-recursive functions in the
fuse-caller-save optimization, and adds a test-case.

I've done an x86_64 build and ran the i386.exp testsuite.

OK for trunk if full bootstrap and reg-test succeeds?

Thanks,
- Tom

2014-06-26  Tom de Vries  t...@codesourcery.com

	* final.c (get_call_fndecl): Declare.
	(self_recursive_call_p): New function.
	(collect_fn_hard_reg_usage): Handle self-recursive function calls.

	* gcc.target/i386/fuse-caller-save-rec.c: New test.

diff --git a/gcc/final.c b/gcc/final.c
index 9525efc..ed0ba0b 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -225,6 +225,7 @@ static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
 static void collect_fn_hard_reg_usage (void);
+static tree get_call_fndecl (rtx);
 
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4750,6 +4751,16 @@ make_pass_clean_state (gcc::context *ctxt)
   return new pass_clean_state (ctxt);
 }
 
+/* Return true if INSN is a call to the current function.  */
+
+static bool
+self_recursive_call_p (rtx insn)
+{
+  tree fndecl = get_call_fndecl (insn);
+  return (fndecl == current_function_decl
+          && decl_binds_to_current_def_p (fndecl));
+}
+
 /* Collect hard register usage for the current function.  */
 
 static void
@@ -4775,7 +4786,8 @@ collect_fn_hard_reg_usage (void)
       if (!NONDEBUG_INSN_P (insn))
         continue;
 
-      if (CALL_P (insn))
+      if (CALL_P (insn)
+          && !self_recursive_call_p (insn))
         {
           if (!get_call_reg_set_usage (insn, &insn_used_regs,
                                        call_used_reg_set))
diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c
new file mode 100644
index 000..b30a0b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fuse-caller-save -fomit-frame-pointer -fno-optimize-sibling-calls" } */
+/* { dg-additional-options "-mregparm=1" { target ia32 } } */
+
+/* Test -fuse-caller-save optimization on self-recursive function.  */
+
+static int __attribute__((noinline))
+bar (int x)
+{
+  if (x > 4)
+    return bar (x - 3);
+  return 0;
+}
+
+int __attribute__((noinline))
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Verify that no registers were saved on stack.  */
+/* { dg-final { scan-assembler-not "\.cfi_offset" } } */
+
+/* Verify that bar is self-recursive.  */
+/* { dg-final { scan-assembler-times "call\tbar" 2 } } */
+
-- 
1.9.1
Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, 26 Jun 2014, Jakub Jelinek wrote:

So like this? I've also changed get_compute_type so that it will DTRT
even for -mavx and V4DImode vectors, so e.g. f5/f6/f8 routines in
avx-pr57233.c improve. Also, even for shifts by scalar, if e.g. target
doesn't have shifts by scalar at all, and only has narrower vector by
vector shifts, it should handle this case too.

All that? Cool!

@@ -1455,11 +1507,83 @@ expand_vector_operations_1 (gimple_stmt_
         {
           op = optab_for_tree_code (code, type, optab_scalar);
 
+          compute_type = get_compute_type (code, op, type);
+          if (compute_type == type)
+            return;
           /* The rtl expander will expand vector/scalar as vector/vector
-             if necessary.  Don't bother converting the stmt here.  */
-          if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing
-              && optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing)
+             if necessary.  Pick one with wider vector type.  */
+          tree compute_vtype = get_compute_type (code, opv, type);
+          if (count_type_subparts (compute_vtype)
+              > count_type_subparts (compute_type))
+            {
+              compute_type = compute_vtype;
+              op = opv;
+            }
+        }
+
+      if (code == LROTATE_EXPR || code == RROTATE_EXPR)
+        {
+          if (compute_type == NULL_TREE)
+            compute_type = get_compute_type (code, op, type);
+          if (compute_type == type)
             return;
+          /* Before splitting vector rotates into scalar rotates,
+             see if we can't use vector shifts and BIT_IOR_EXPR
+             instead.  For vector by vector rotates we'd also
+             need to check BIT_AND_EXPR and NEGATE_EXPR, punt there
+             for now, fold doesn't seem to create such rotates anyway.  */
+          if (compute_type == TREE_TYPE (type)
+              && !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2)))
+            {
+              optab oplv, opl, oprv, opr, opo;
+              oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector);
+              /* Right shift always has to be logical, no matter what
+                 signedness type has.  */
+              oprv = vlshr_optab;
+              opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default);
+              opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar);
+              oprv = lshr_optab;
+              opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);

Looks like there are some typos in there, you are assigning to oprv
twice.

-- 
Marc Glisse
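
For readers following the rotate lowering, here is a minimal stand-alone sketch of the identity the patch relies on: a rotate by a scalar amount can be built from a left shift, a logical right shift and an OR applied to every lane. It is plain C++ and not GCC internals; `rotl_lanes` and `rotl_arith` are hypothetical names, and the 32-bit lane width is just an assumption for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Rotate every 32-bit lane left by n using two shifts and an OR.
// The lanes are unsigned, so >> is a logical shift here.
template <std::size_t N>
std::array<std::uint32_t, N> rotl_lanes(std::array<std::uint32_t, N> v,
                                        unsigned n) {
  n &= 31u;
  if (n == 0)
    return v;  // avoid the undefined shift by 32
  for (auto &x : v)
    x = (x << n) | (x >> (32u - n));
  return v;
}

// Deliberately wrong variant: the right shift is done on a signed value.
// On the usual targets that is an arithmetic shift (implementation-defined
// before C++20), so the sign bit is smeared into the result.
inline std::uint32_t rotl_arith(std::uint32_t x, unsigned n) {
  n &= 31u;
  if (n == 0)
    return x;
  std::int32_t s = static_cast<std::int32_t>(x);
  return (x << n) | static_cast<std::uint32_t>(s >> (32u - n));
}
```

With the top bit set, the arithmetic variant fills the upper bits with ones, which is why the right shift has to be the logical one regardless of the signedness of the element type.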
[PATCH][match-and-simplify] Improve GENERIC code-gen
This re-structues GENERIC code to the suggested switch-stmt use which ideally should behave sanely wrt backtracking now (fingers crossing - we don't have any testcases yet excercising the GENERIC code-path). Installed. Richard. 2014-06-26 Richard Biener rguent...@suse.de * genmatch.c (dt_operand::gen_generic_kids): New function. (dt_operand::gen_generic_expr_expr): Conditionalize matching of code to GIMPLE code-gen. (dt_operand::gen_generic_expr_fn): Likewise. (dt_operand::gen_generic_expr): Adjust to close parens. (dt_operand::gen_generic): Call gen_generic_kids. (decision_tree::gen_generic): Likewise, use a switch stmt. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 212021) +++ gcc/genmatch.c (working copy) @@ -367,6 +367,7 @@ struct dt_operand: public dt_node void grok_kids(kids_type); void gen_gimple_kids (FILE *); + void gen_generic_kids (FILE *); }; @@ -978,8 +979,16 @@ dt_operand::gen_generic_expr_expr (FILE { unsigned n_ops = e-ops.length (); - fprintf (f, if (TREE_CODE (%s) == %s)\n, opname, e-operation-op-id); - fprintf (f, {\n); + operator_id *op_id = static_cast operator_id * (e-operation-op); + + if (valueize) +{ + if (op_id-code == NOP_EXPR || op_id-code == CONVERT_EXPR) + fprintf (f, if (CONVERT_EXPR_P (%s))\n, opname); + else + fprintf (f, if (TREE_CODE (%s) == %s)\n, opname, e-operation-op-id); + fprintf (f, {\n); +} for (unsigned i = 0; i n_ops; ++i) { @@ -1001,13 +1010,16 @@ dt_operand::gen_generic_expr_fn (FILE *f unsigned n_ops = e-ops.length (); fn_id *op = static_cast fn_id * (e-operation-op); - fprintf (f, if (TREE_CODE (%s) == CALL_EXPR\n -TREE_CODE (CALL_EXPR_FN (%s)) == ADDR_EXPR\n -TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == FUNCTION_DECL\n -DECL_BUILT_IN_CLASS (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == BUILT_IN_NORMAL\n -DECL_FUNCTION_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == %s)\n, - opname, opname, opname, opname, opname, op-id); - fprintf (f, {\n); + if (valueize) +{ + fprintf (f, if (TREE_CODE (%s) == CALL_EXPR\n + 
TREE_CODE (CALL_EXPR_FN (%s)) == ADDR_EXPR\n + TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == FUNCTION_DECL\n + DECL_BUILT_IN_CLASS (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == BUILT_IN_NORMAL\n + DECL_FUNCTION_CODE (TREE_OPERAND (CALL_EXPR_FN (%s), 0)) == %s)\n, + opname, opname, opname, opname, opname, op-id); + fprintf (f, {\n); +} for (unsigned i = 0; i n_ops; ++i) { @@ -1028,7 +1040,7 @@ dt_operand::gen_generic_expr (FILE *f, c { expr *e = static_castexpr * (op); (e-operation-op-kind == id_base::CODE) ? gen_generic_expr_expr (f, e, opname, valueize) : gen_generic_expr_fn (f, e, opname, valueize); - return valueize ? e-ops.length () + 1 : 1; + return valueize ? e-ops.length () + 1 : 0; } bool @@ -1197,6 +1209,7 @@ dt_operand::gen_gimple (FILE *f) fprintf (f, }\n); } + void dt_operand::gen_generic (FILE *f) { @@ -1230,8 +1243,7 @@ dt_operand::gen_generic (FILE *f) unsigned i; - for (i = 0; i kids.length (); ++i) -kids[i]-gen_generic (f); + gen_generic_kids (f); for (i = 0; i n_braces; ++i) fprintf (f, }\n); @@ -1240,6 +1252,100 @@ dt_operand::gen_generic (FILE *f) } void +dt_operand::gen_generic_kids (FILE *f) +{ + bool any = false; + for (unsigned j = 0; j kids.length (); ++j) +{ + dt_node *node = kids[j]; + if (node-type == DT_OPERAND) + { + dt_operand *kid = static_castdt_operand *(node); + if (kid-op-type == operand::OP_EXPR) + any = true; + } +} + + if (any) +{ + char opname[20]; + static_cast dt_operand *(kids[0])-get_name (opname); + fprintf (f, switch (TREE_CODE (%s))\n + {\n, opname); + for (unsigned j = 0; j kids.length (); ++j) + { + dt_node *node = kids[j]; + if (node-type != DT_OPERAND) + continue; + dt_operand *kid = static_castdt_operand *(node); + if (kid-op-type != operand::OP_EXPR) + continue; + expr *e = static_cast expr *(kid-op); + if (e-operation-op-kind != id_base::CODE) + continue; + + /* ??? 
CONVERT */ + fprintf (f, case %s:\n + {\n, e-operation-op-id); + kid-gen_generic (f); + fprintf (f, break;\n + }\n); + } + + bool first = true; + for (unsigned j = 0; j kids.length (); ++j) + { + dt_node *node = kids[j]; + if (node-type != DT_OPERAND) + continue; + dt_operand *kid = static_castdt_operand *(node); + if (kid-op-type !=
Re: [GSoC][match-and-simplify] factor gimple expressions and builtin functions
On Thu, Jun 26, 2014 at 11:43 AM, Richard Biener richard.guent...@gmail.com wrote:
On Thu, Jun 26, 2014 at 10:11 AM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote:

This patch factors expression checking for GIMPLE. Generates code as:

if (TREE_CODE (opname) == SSA_NAME)
  {
    gimple def_stmt = SSA_NAME_DEF_STMT (opname);
    if (is_gimple_assign (def_stmt))
      {
        if (gimple_assign_rhs_code (def_stmt) == expr-code1)
          {
          }
        if (gimple_assign_rhs_code (def_stmt) == expr-code2)
          {
          }
      }
  }

We cannot use switch-case for convert/nop since we use
CONVERT_EXPR_CODE_P, so I used an if stmt.

Actually we can by using

  case NOP_EXPR:
  case CONVERT_EXPR:

for it. Note that currently you still do

  if (is_gimple_assign (def_stmt))
    {
      if (gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
        {
        }
      if (gimple_assign_rhs_code (def_stmt) == MINUS_EXPR)
        {
        }

thus not use else if (). Of course in the end we want

  switch (gimple_assign_rhs_code (def_stmt))
    {
    case PLUS_EXPR:
      ...
    case MINUS_EXPR:
      ...

Unfortunately, back-tracking is still done. I shall look into that.

Note that we need to backtrack to the next parent with a 'true' kid
(and from there to its next parent with a 'true' kid). Possibly with
using a switch-case or proper if-else-if there isn't anything special
to do as the backtracking would work naturally then. So I suggest to
concentrate on getting the code to use

  if (TREE_CODE (op0) == SSA_NAME)
    {
      gimple def_stmt = SSA_NAME_DEF_STMT (op0);
      if (is_gimple_assign (def_stmt))
        {
          switch (gimple_assign_rhs_code ())
            {
            case PLUS_EXPR:
              ...
            }
      else if (gimple_call_builtin_p (def_stmt, BUILT_IN_NORMAL))
        {
          switch (DECL_FUNCTION_CODE (gimple_call_fndecl (def_stmt)))
            {
            case BUILT_IN_ABS:
              ...
            }
        }
  else if (TREE_CODE (op0) == REALPART_EXPR)
    ... other GENERIC cases
  true match (not in else {})

that would backtrack through the true cases by means of them not
short-circuited by an else but everything else short-circuited by
proper use of switch () and if - else-if code.

Actually it is quite a bit more complicated. While OP_EXPR naturally
short-circuit, a failed DT_MATCH doesn't mean we don't have to check
other DT_OPERANDs - likewise we have to test all OP_PREDICATE until we
find a matching one. So we should emit OP_EXPR checks first,
short-circuiting each other, and then follow with DT_MATCH and
OP_PREDICATE ones, but not short-circuiting.

I have tried to hack that into GENERIC code-gen but the current
code-gen structure is somewhat awkward ... (see patch in separate
thread). I see you get around most of the ugliness by using grok_kids,
but with a sorted kids array this would come for free I think (sorted
after desired code-gen order).

Feel free to comment the GENERIC code-gen (just emit empty functions)
when you want to refactor things. I will be on vacation for the next
week.

Richard.

Thanks,
Richard.

	* genmatch.c (dt_node::get_expr_code): New member function.
	(dt_node::is_gimple_expr): Likewise.
	(dt_node::is_gimple_fn): Likewise.
	(dt_operand::kids_type): New struct.
	(dt_operand::gen_gimple_expr): Remove.
	(dt_operand::gen_gimple_expr_expr): Remove 2nd argument and
	change return type to unsigned. Adjust code-gen for expressions.
	(dt_operand::gen_gimple_expr_fn): Remove 2nd argument and change
	return type to unsigned.
	(dt_operand::grok_kids): New member function.
	(dt_operand::gen_gimple_kids): Likewise.
	(dt_operand::gen_gimple): Adjust code-gen for gimple expressions.
	Call dt_operand::gen_gimple_kids.
	(decision_tree::gen_gimple): Call dt_operand::gen_gimple_kids.

Thanks and Regards,
Prathamesh
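
The control-flow shape being discussed can be sketched outside of genmatch. The following is only an illustration under stated assumptions: `code` is a hypothetical stand-in for tree codes, `operands_ok` stands in for the nested operand checks, and `match` is a made-up function, not generated genmatch output.

```cpp
#include <string>

// Hypothetical, tiny stand-in for tree codes; not GCC's enum tree_code.
enum class code { plus_expr, minus_expr, nop_expr, convert_expr, other };

// Returns the name of the first pattern that matches.
inline std::string match(code c, bool operands_ok) {
  switch (c) {
    case code::plus_expr:
      if (operands_ok)
        return "plus";
      break;  // operand checks failed: fall out to the 'true' match
    case code::nop_expr:
    case code::convert_expr:  // grouped labels instead of a predicate test
      if (operands_ok)
        return "convert";
      break;
    default:
      break;
  }
  // 'true' match: deliberately not inside an else, so every failed
  // case above still reaches it (the backtracking described above).
  return "any";
}
```

The expression cases exclude each other through the switch, while the catch-all sits after it rather than in an else branch, so a failed expression match backtracks to it without any extra bookkeeping.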
Re: [patch, libgfortran] [4.9/4.10 Regression] Internal read of negative integer broken
Hi Jerry,

The patch looks to be OK for trunk. Did you check it with the NIST by
any chance?

Thanks a lot

Paul

On 26 June 2014 03:58, Jerry DeLisle jvdeli...@charter.net wrote:

Hi,

This bug has nothing to do with negative numbers as in the
description. However, the problem is due to seeking when there are no
spaces to skip. I restructured the loop so that the skipping is not
done if there are no spaces.

Regression tested on x86-64. New test case from the PR.

OK for trunk and 4.9?

Regards,

Jerry

2014-06-25  Jerry DeLisle  jvdeli...@gcc.gnu.org

	PR libgfortran/61499
	* io/list_read.c (eat_spaces): Use a 'for' loop instead of
	'while' loop to skip the loop if there are no bytes left in the
	string. Only seek if actual spaces can be skipped.

-- 
The knack of flying is learning how to throw yourself at the ground
and miss.
       --Hitchhikers Guide to the Galaxy
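
The idea in the ChangeLog entry can be sketched roughly as follows. This is not the libgfortran code: `unit`, its fields and this `eat_spaces` are hypothetical stand-ins for an internal unit and its seek operation, written only to show the "count first, seek only if something was skipped" structure.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-memory unit: a buffer, a read position, and a counter
// that records how often a "seek" was performed.
struct unit {
  std::string buf;
  std::size_t pos;
  int seeks;
};

// Skip leading blanks starting at u.pos and return how many were skipped.
inline std::size_t eat_spaces(unit &u) {
  std::size_t n = 0;
  // A for loop tests its condition before the first iteration, so an
  // exhausted buffer never enters the body.
  for (std::size_t i = u.pos; i < u.buf.size() && u.buf[i] == ' '; ++i)
    ++n;
  // Only move the position when there really was something to skip.
  if (n > 0) {
    u.pos += n;
    ++u.seeks;
  }
  return n;
}
```

Reading a value such as "-5" with no leading blanks then performs no seek at all, which is the case the original loop got wrong according to the description.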
Re: [patch] Simplify allocator use
On 25/06/14 21:56 +0100, Jonathan Wakely wrote: The other adds an RAII type to help manage pointers obtained from allocators. The new type means I can remove several ugly try-catch blocks that are all very similar in structure and have been bothering me for some time. The new type also makes it trivial to support allocators with fancy pointers, fixing long-standing (but not very important) bugs in std::promise and std::shared_ptr. This patch applies the __allocated_ptr type to hashtable_policy.h to remove most explicit deallocation (yay!) The buckets are still allocated and deallocated manually, because __allocated_ptr only works for allocations of single objects, not arrays. As well as __allocated_ptr this change relies on two things: 1) the node type has a trivial destructor, so we don't actually need to call it, we can just reuse or release its storage. (See 3.8 [basic.life] p1) 2) allocator_traits::construct and allocator_traits::destroy can be used with an allocator that has a different value_type, so we don't need to create a rebound copy to destroy every element, we can just use the node-allocator. (See http://cplusplus.github.io/LWG/lwg-active.html#2218 which is Open, but I've discussed the issue with Howard, Pablo and others, and I think libc++ already relies on this assumption). François, could you check it, and let me know if you see anything wrong or have any comments? commit d2fd02daab715c79c766bc0a476d1d01da1fc305 Author: Jonathan Wakely jwak...@redhat.com Date: Thu Jun 26 12:28:56 2014 +0100 * include/bits/hashtable_policy.h (_ReuseOrAllocNode::operator()): Use __allocated_ptr. (_Hashtable_alloc::_M_allocate_node): Likewise. (_Hashtable_alloc::_M_deallocate_node): Likewise. 
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h index 606fbab..ed6b2d7 100644 --- a/libstdc++-v3/include/bits/hashtable_policy.h +++ b/libstdc++-v3/include/bits/hashtable_policy.h @@ -31,6 +31,8 @@ #ifndef _HASHTABLE_POLICY_H #define _HASHTABLE_POLICY_H 1 +#include bits/allocated_ptr.h + namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION @@ -137,20 +139,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __node_type* __node = _M_nodes; _M_nodes = _M_nodes-_M_next(); __node-_M_nxt = nullptr; - __value_alloc_type __a(_M_h._M_node_allocator()); - __value_alloc_traits::destroy(__a, __node-_M_valptr()); - __try - { - __value_alloc_traits::construct(__a, __node-_M_valptr(), - std::forward_Arg(__arg)); - } - __catch(...) - { - __node-~__node_type(); - __node_alloc_traits::deallocate(_M_h._M_node_allocator(), - __node, 1); - __throw_exception_again; - } + auto __a = _M_h._M_node_allocator(); + __node_alloc_traits::destroy(__a, __node-_M_valptr()); + __allocated_ptr_NodeAlloc __guard{__a, __node}; + __node_alloc_traits::construct(__a, __node-_M_valptr(), + std::forward_Arg(__arg)); + __guard = nullptr; return __node; } return _M_h._M_allocate_node(std::forward_Arg(__arg)); @@ -1947,33 +1941,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typename _Hashtable_alloc_NodeAlloc::__node_type* _Hashtable_alloc_NodeAlloc::_M_allocate_node(_Args... __args) { - auto __nptr = __node_alloc_traits::allocate(_M_node_allocator(), 1); - __node_type* __n = std::__addressof(*__nptr); - __try - { - __value_alloc_type __a(_M_node_allocator()); - ::new ((void*)__n) __node_type; - __value_alloc_traits::construct(__a, __n-_M_valptr(), - std::forward_Args(__args)...); - return __n; - } - __catch(...) 
- { - __node_alloc_traits::deallocate(_M_node_allocator(), __nptr, 1); - __throw_exception_again; - } + auto __a = _M_node_allocator(); + auto __guard = std::__allocate_guarded(__a); + __node_type* __n = __guard.get(); + ::new ((void*)__n) __node_type; + __node_alloc_traits::construct(__a, __n-_M_valptr(), + std::forward_Args(__args)...); + __guard = nullptr; + return __n; } templatetypename _NodeAlloc void _Hashtable_alloc_NodeAlloc::_M_deallocate_node(__node_type* __n) { - typedef typename __node_alloc_traits::pointer _Ptr; - auto __ptr = std::pointer_traits_Ptr::pointer_to(*__n); - __value_alloc_type __a(_M_node_allocator()); - __value_alloc_traits::destroy(__a, __n-_M_valptr()); - __n-~__node_type(); - __node_alloc_traits::deallocate(_M_node_allocator(), __ptr, 1); + static_assert(std::is_trivially_destructible__node_type::value, + Nodes must not require non-trivial destruction); + auto __alloc = _M_node_allocator(); + __allocated_ptr__node_alloc_type __guard{__alloc, __n}; + __node_alloc_traits::destroy(__alloc, __n-_M_valptr()); } templatetypename _NodeAlloc
Re: [patch] Simplify allocator use
On 26/06/14 12:31 +0100, Jonathan Wakely wrote:

@@ -137,20 +139,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
           __node_type* __node = _M_nodes;
           _M_nodes = _M_nodes->_M_next();
           __node->_M_nxt = nullptr;
-          __value_alloc_type __a(_M_h._M_node_allocator());
-          __value_alloc_traits::destroy(__a, __node->_M_valptr());
-          __try
-            {
-              __value_alloc_traits::construct(__a, __node->_M_valptr(),
-                                              std::forward<_Arg>(__arg));
-            }
-          __catch(...)
-            {
-              __node->~__node_type();
-              __node_alloc_traits::deallocate(_M_h._M_node_allocator(),
-                                              __node, 1);

I forgot to mention the change also fixes a bug in the line above:
__node should be converted to the allocator's pointer type before
passing it to deallocate. Using __allocated_ptr takes care of that.
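
To illustrate the kind of RAII guard this thread is about, here is a minimal sketch written against the standard std::allocator_traits interface. It is not libstdc++'s __allocated_ptr: `alloc_guard` and `make_one` are hypothetical names, and the design (the guard stores the allocator's own pointer type, and assigning nullptr releases ownership) is only assumed from the way the guard is used in the quoted hunks.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical guard owning storage for exactly one object.
template <typename Alloc>
class alloc_guard {
  using traits = std::allocator_traits<Alloc>;

 public:
  using pointer = typename traits::pointer;

  alloc_guard(Alloc &a, pointer p) : alloc_(a), ptr_(p) {}
  alloc_guard(const alloc_guard &) = delete;
  alloc_guard &operator=(const alloc_guard &) = delete;

  // Deallocate through the allocator's pointer type unless released.
  ~alloc_guard() {
    if (ptr_)
      traits::deallocate(alloc_, ptr_, 1);
  }

  // Assigning nullptr releases ownership; the storage is kept alive.
  alloc_guard &operator=(std::nullptr_t) {
    ptr_ = nullptr;
    return *this;
  }

  pointer get() const { return ptr_; }

 private:
  Alloc &alloc_;
  pointer ptr_;
};

// Allocate and construct one T. If construct throws, the guard's
// destructor returns the storage, so no try/catch block is needed.
template <typename T, typename Alloc, typename... Args>
T *make_one(Alloc &a, Args &&... args) {
  using traits = std::allocator_traits<Alloc>;
  alloc_guard<Alloc> guard(a, traits::allocate(a, 1));
  T *raw = std::addressof(*guard.get());
  traits::construct(a, raw, std::forward<Args>(args)...);
  guard = nullptr;  // success: hand the storage to the caller
  return raw;
}
```

Because the guard keeps the value returned by allocate, the pointer later handed to deallocate always has the allocator's pointer type, which is the conversion issue mentioned above for fancy pointers.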
Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, Jun 26, 2014 at 01:16:41PM +0200, Marc Glisse wrote: + if (compute_type == TREE_TYPE (type) + !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2))) +{ + optab oplv, opl, oprv, opr, opo; + oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector); + /* Right shift always has to be logical, no matter what + signedness type has. */ + oprv = vlshr_optab; + opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default); + opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar); + oprv = lshr_optab; + opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar); Looks like there are some typos in there, you are assigning to oprv twice. Oops, fixed thusly. 2014-06-25 Jakub Jelinek ja...@redhat.com PR tree-optimization/57233 PR tree-optimization/61299 * tree-vect-generic.c (get_compute_type, count_type_subparts): New functions. (expand_vector_operations_1): Use them. If {L,R}ROTATE_EXPR would be lowered to scalar shifts, check if corresponding shifts and vector BIT_IOR_EXPR are supported and don't lower or lower just to narrower vector type in that case. * expmed.c (expand_shift_1): Fix up handling of vector shifts and rotates. * gcc.dg/pr57233.c: New test. * gcc.target/i386/pr57233.c: New test. * gcc.target/i386/sse2-pr57233.c: New test. * gcc.target/i386/avx-pr57233.c: New test. * gcc.target/i386/avx2-pr57233.c: New test. * gcc.target/i386/avx512f-pr57233.c: New test. * gcc.target/i386/xop-pr57233.c: New test. --- gcc/tree-vect-generic.c.jj 2014-06-26 11:00:00.477268305 +0200 +++ gcc/tree-vect-generic.c 2014-06-26 13:33:33.024069715 +0200 @@ -1334,15 +1334,67 @@ lower_vec_perm (gimple_stmt_iterator *gs update_stmt (gsi_stmt (*gsi)); } +/* Return type in which CODE operation with optab OP can be + computed. */ + +static tree +get_compute_type (enum tree_code code, optab op, tree type) +{ + /* For very wide vectors, try using a smaller vector mode. 
*/ + tree compute_type = type; + if (op + (!VECTOR_MODE_P (TYPE_MODE (type)) + || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)) +{ + tree vector_compute_type + = type_for_widest_vector_mode (TREE_TYPE (type), op); + if (vector_compute_type != NULL_TREE + (TYPE_VECTOR_SUBPARTS (vector_compute_type) + TYPE_VECTOR_SUBPARTS (compute_type)) + (optab_handler (op, TYPE_MODE (vector_compute_type)) + != CODE_FOR_nothing)) + compute_type = vector_compute_type; +} + + /* If we are breaking a BLKmode vector into smaller pieces, + type_for_widest_vector_mode has already looked into the optab, + so skip these checks. */ + if (compute_type == type) +{ + enum machine_mode compute_mode = TYPE_MODE (compute_type); + if (VECTOR_MODE_P (compute_mode)) + { + if (op optab_handler (op, compute_mode) != CODE_FOR_nothing) + return compute_type; + if (code == MULT_HIGHPART_EXPR + can_mult_highpart_p (compute_mode, + TYPE_UNSIGNED (compute_type))) + return compute_type; + } + /* There is no operation in hardware, so fall back to scalars. */ + compute_type = TREE_TYPE (type); +} + + return compute_type; +} + +/* Helper function of expand_vector_operations_1. Return number of + vector elements for vector types or 1 for other types. */ + +static inline int +count_type_subparts (tree type) +{ + return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1; +} + /* Process one statement. If we identify a vector operation, expand it. 
*/ static void expand_vector_operations_1 (gimple_stmt_iterator *gsi) { gimple stmt = gsi_stmt (*gsi); - tree lhs, rhs1, rhs2 = NULL, type, compute_type; + tree lhs, rhs1, rhs2 = NULL, type, compute_type = NULL_TREE; enum tree_code code; - enum machine_mode compute_mode; optab op = unknown_optab; enum gimple_rhs_class rhs_class; tree new_rhs; @@ -1455,11 +1507,76 @@ expand_vector_operations_1 (gimple_stmt_ { op = optab_for_tree_code (code, type, optab_scalar); + compute_type = get_compute_type (code, op, type); + if (compute_type == type) + return; /* The rtl expander will expand vector/scalar as vector/vector -if necessary. Don't bother converting the stmt here. */ - if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing - optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing) +if necessary. Pick one with wider vector type. */ + tree compute_vtype = get_compute_type (code, opv, type); + if (count_type_subparts (compute_vtype) + count_type_subparts (compute_type)) + { +
[PATCH] RTEMS: Add Nios 2 support
This patch should be applied to GCC 4.9 and mainline. I do not have write access, so in case this gets approved, please commit it for me. gcc/ChangeLog 2014-06-26 Sebastian Huber sebastian.hu...@embedded-brains.de * config.gcc (nios2-*-*): Add RTEMS support. * config/nios2/rtems.h: New file. * config/nios2/t-rtems: Likewise. --- gcc/config.gcc |4 ++ gcc/config/nios2/rtems.h | 30 ++ gcc/config/nios2/t-rtems | 133 ++ 3 files changed, 167 insertions(+), 0 deletions(-) create mode 100644 gcc/config/nios2/rtems.h create mode 100644 gcc/config/nios2/t-rtems diff --git a/gcc/config.gcc b/gcc/config.gcc index 63e1222..6174375 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2156,6 +2156,10 @@ nios2-*-*) tm_file=${tm_file} newlib-stdint.h nios2/elf.h extra_options=${extra_options} nios2/elf.opt ;; + nios2-*-rtems*) + tm_file=${tm_file} rtems.h nios2/rtems.h newlib-stdint.h + tmake_file=${tmake_file} nios2/t-rtems + ;; esac ;; pdp11-*-*) diff --git a/gcc/config/nios2/rtems.h b/gcc/config/nios2/rtems.h new file mode 100644 index 000..1028048 --- /dev/null +++ b/gcc/config/nios2/rtems.h @@ -0,0 +1,30 @@ +/* Definitions of RTEMS target support for Altera Nios II. + Copyright (C) 2014 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + http://www.gnu.org/licenses/. 
*/ + +#define TARGET_LINUX_ABI 0 + +#undef TARGET_OS_CPP_BUILTINS +#define TARGET_OS_CPP_BUILTINS() \ + do \ +{ \ + builtin_define (__rtems__);\ + builtin_define (__USE_INIT_FINI__);\ + builtin_assert (system=rtems); \ +} \ + while (false) diff --git a/gcc/config/nios2/t-rtems b/gcc/config/nios2/t-rtems new file mode 100644 index 000..f95fa3c --- /dev/null +++ b/gcc/config/nios2/t-rtems @@ -0,0 +1,133 @@ +# Custom RTEMS multilibs + +MULTILIB_OPTIONS = mhw-mul mhw-mulx mhw-div mcustom-fadds=253 mcustom-fdivs=255 mcustom-fmuls=252 mcustom-fsubs=254 + +# Enumeration of multilibs + +# MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fdivs=255 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fadds=253 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fdivs=255 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div/mcustom-fsubs=254 +# MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mhw-div +MULTILIB_EXCEPTIONS += 
mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fdivs=255 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fmuls=252 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253/mcustom-fsubs=254 +MULTILIB_EXCEPTIONS += mhw-mul/mhw-mulx/mcustom-fadds=253 +MULTILIB_EXCEPTIONS +=
Re: [PATCH] Fix vector rotate regression (PR tree-optimization/57233)
On Thu, 26 Jun 2014, Jakub Jelinek wrote: On Thu, Jun 26, 2014 at 01:16:41PM +0200, Marc Glisse wrote: +if (compute_type == TREE_TYPE (type) + !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2))) + { +optab oplv, opl, oprv, opr, opo; +oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector); +/* Right shift always has to be logical, no matter what + signedness type has. */ +oprv = vlshr_optab; +opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default); +opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar); +oprv = lshr_optab; +opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar); Looks like there are some typos in there, you are assigning to oprv twice. Oops, fixed thusly. Ok. Thanks, Richard. 2014-06-25 Jakub Jelinek ja...@redhat.com PR tree-optimization/57233 PR tree-optimization/61299 * tree-vect-generic.c (get_compute_type, count_type_subparts): New functions. (expand_vector_operations_1): Use them. If {L,R}ROTATE_EXPR would be lowered to scalar shifts, check if corresponding shifts and vector BIT_IOR_EXPR are supported and don't lower or lower just to narrower vector type in that case. * expmed.c (expand_shift_1): Fix up handling of vector shifts and rotates. * gcc.dg/pr57233.c: New test. * gcc.target/i386/pr57233.c: New test. * gcc.target/i386/sse2-pr57233.c: New test. * gcc.target/i386/avx-pr57233.c: New test. * gcc.target/i386/avx2-pr57233.c: New test. * gcc.target/i386/avx512f-pr57233.c: New test. * gcc.target/i386/xop-pr57233.c: New test. --- gcc/tree-vect-generic.c.jj2014-06-26 11:00:00.477268305 +0200 +++ gcc/tree-vect-generic.c 2014-06-26 13:33:33.024069715 +0200 @@ -1334,15 +1334,67 @@ lower_vec_perm (gimple_stmt_iterator *gs update_stmt (gsi_stmt (*gsi)); } +/* Return type in which CODE operation with optab OP can be + computed. */ + +static tree +get_compute_type (enum tree_code code, optab op, tree type) +{ + /* For very wide vectors, try using a smaller vector mode. 
*/ + tree compute_type = type; + if (op + (!VECTOR_MODE_P (TYPE_MODE (type)) + || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)) +{ + tree vector_compute_type + = type_for_widest_vector_mode (TREE_TYPE (type), op); + if (vector_compute_type != NULL_TREE +(TYPE_VECTOR_SUBPARTS (vector_compute_type) +TYPE_VECTOR_SUBPARTS (compute_type)) +(optab_handler (op, TYPE_MODE (vector_compute_type)) + != CODE_FOR_nothing)) + compute_type = vector_compute_type; +} + + /* If we are breaking a BLKmode vector into smaller pieces, + type_for_widest_vector_mode has already looked into the optab, + so skip these checks. */ + if (compute_type == type) +{ + enum machine_mode compute_mode = TYPE_MODE (compute_type); + if (VECTOR_MODE_P (compute_mode)) + { + if (op optab_handler (op, compute_mode) != CODE_FOR_nothing) + return compute_type; + if (code == MULT_HIGHPART_EXPR +can_mult_highpart_p (compute_mode, + TYPE_UNSIGNED (compute_type))) + return compute_type; + } + /* There is no operation in hardware, so fall back to scalars. */ + compute_type = TREE_TYPE (type); +} + + return compute_type; +} + +/* Helper function of expand_vector_operations_1. Return number of + vector elements for vector types or 1 for other types. */ + +static inline int +count_type_subparts (tree type) +{ + return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1; +} + /* Process one statement. If we identify a vector operation, expand it. 
*/ static void expand_vector_operations_1 (gimple_stmt_iterator *gsi) { gimple stmt = gsi_stmt (*gsi); - tree lhs, rhs1, rhs2 = NULL, type, compute_type; + tree lhs, rhs1, rhs2 = NULL, type, compute_type = NULL_TREE; enum tree_code code; - enum machine_mode compute_mode; optab op = unknown_optab; enum gimple_rhs_class rhs_class; tree new_rhs; @@ -1455,11 +1507,76 @@ expand_vector_operations_1 (gimple_stmt_ { op = optab_for_tree_code (code, type, optab_scalar); + compute_type = get_compute_type (code, op, type); + if (compute_type == type) + return; /* The rtl expander will expand vector/scalar as vector/vector - if necessary. Don't bother converting the stmt here. */ - if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing -optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing) + if necessary. Pick one with wider vector type. */ + tree compute_vtype = get_compute_type (code, opv, type); + if (count_type_subparts
Re: [patch] Simplify allocator use
On 26/06/14 00:06 +0100, Jonathan Wakely wrote:

This simplifies some of the test changes in my last patch, I was misusing the CustomPointerAlloc due to confusion with some uncommitted changes.

And this fixes the -fno-rtti version of make_shared, I shouldn't have changed the deleter's parameter to the allocator's pointer. That worked with the current test, but only because our CustomPointerAlloc uses a custom pointer that is implicitly-convertible from value_type*.

I have a completely rewritten custom pointer for the testsuite which doesn't support implicit conversions (only the minimum requirements) and that caught this bug. The new custom pointer type is proving very useful while I'm making some std::list changes but isn't ready for prime-time yet.

Tested x86_64-linux, committed to trunk.

commit e69d8134edde691db7ea2567032229b210dd263d
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu Jun 26 13:27:30 2014 +0100

    * include/bits/shared_ptr_base.h (__shared_ptr::_Deleter): Fix
    parameter type.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 590a8d3..6f85ffa 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1085,7 +1085,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       template<typename _Alloc>
         struct _Deleter
         {
-          void operator()(typename _Alloc::pointer __ptr)
+          void operator()(_Tp* __ptr)
           {
             __allocated_ptr<_Alloc> __guard{ _M_alloc, __ptr };
             allocator_traits<_Alloc>::destroy(_M_alloc, __guard.get());
[PATCH] Devirtualization dump functions fix
Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e-call_stmt) is called for e-call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)? Bootstrapped and regtested on x86_64-unknown-linux-gnu. Thanks, Martin ChangeLog: 2014-06-26 Martin Liska mli...@suse.cz * include/ansidecl.h: New collection of ATTRIBUTE_NULL_PRINTF_X_0 defined. gcc/ChangeLog: 2014-06-26 Martin Liska mli...@suse.cz * dumpfile.h: New function dump_printf_loc_for_stmt. * dumpfile.c: Implementation added. (dump_vprintf): New function.i * cgraphunit.c: dump_printf_loc_for_stmt usage replaces dump_printf_loc. * gimple-fold.c: Likewise. * ipa-devirt.c: Likewise. * ipa-prop.c: Likewise. * ipa.c: Likewise. * tree-ssa-pre.c: Likewise. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 76b2fda1..3b01718 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -905,12 +905,9 @@ walk_polymorphic_call_targets (pointer_set_t *reachable_call_targets, TDF_SLIM); } if (dump_enabled_p ()) -{ - location_t locus = gimple_location (edge-call_stmt); - dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, locus, - devirtualizing call in %s to %s\n, - edge-caller-name (), target-name ()); - } + dump_printf_loc_for_stmt (MSG_OPTIMIZED_LOCATIONS, edge-call_stmt, + devirtualizing call in %s to %s\n, + edge-caller-name (), target-name ()); cgraph_make_edge_direct (edge, target); cgraph_redirect_edge_call_stmt_to_callee (edge); diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c index fd630a6..b7a791c 100644 --- a/gcc/dumpfile.c +++ b/gcc/dumpfile.c @@ -23,6 +23,12 @@ along with GCC; see the file COPYING3. 
If not see #include diagnostic-core.h #include dumpfile.h #include tree.h +#include basic-block.h +#include tree-ssa-alias.h +#include internal-fn.h +#include gimple-expr.h +#include is-a.h +#include gimple.h #include gimple-pretty-print.h #include context.h @@ -343,52 +349,80 @@ dump_generic_expr_loc (int dump_kind, source_location loc, } } -/* Output a formatted message using FORMAT on appropriate dump streams. */ +/* Output a formatted message using FORMAT on appropriate dump streams. + Accepts va_list AP as the last argument. */ -void -dump_printf (int dump_kind, const char *format, ...) +ATTRIBUTE_NULL_PRINTF_2_0 +static void +dump_vprintf (int dump_kind, const char *format, va_list ap) { if (dump_file (dump_kind pflags)) -{ - va_list ap; - va_start (ap, format); vfprintf (dump_file, format, ap); - va_end (ap); -} if (alt_dump_file (dump_kind alt_flags)) -{ - va_list ap; - va_start (ap, format); vfprintf (alt_dump_file, format, ap); - va_end (ap); -} } -/* Similar to dump_printf, except source location is also printed. */ +/* Output a formatted message using FORMAT on appropriate dump streams. */ void -dump_printf_loc (int dump_kind, source_location loc, const char *format, ...) +dump_printf (int dump_kind, const char *format, ...) +{ + va_list ap; + va_start (ap, format); + dump_vprintf (dump_kind, format, ap); + va_end (ap); +} + +/* Similar to dump_printf, except source location is also printed. + Accepts va_list AP as the last argument. */ + +void +dump_vprintf_loc (int dump_kind, source_location loc, const char *format, + va_list ap) { if (dump_file (dump_kind pflags)) { - va_list ap; dump_loc (dump_kind, dump_file, loc); - va_start (ap, format); vfprintf (dump_file, format, ap); - va_end (ap); } if (alt_dump_file (dump_kind alt_flags)) { - va_list ap; dump_loc (dump_kind, alt_dump_file, loc); - va_start (ap, format); vfprintf (alt_dump_file, format, ap); - va_end (ap); } } +/* Similar to dump_printf, except source location is also printed. 
*/ + +void +dump_printf_loc (int dump_kind, source_location loc, const char *format, ...) +{ + va_list ap; + va_start (ap, format); + dump_vprintf_loc (dump_kind, loc, format, ap); + va_end (ap); +} + +/* Similar to dump_printf, except source location is also printed if STMT + is not null. Otherwise, fallback to dump_fprintf is called. */ + +void +dump_printf_loc_for_stmt (int dump_kind, const_gimple stmt, const char *format, + ...) +{ + va_list ap; + va_start (ap, format); + + if (stmt) +dump_vprintf_loc (dump_kind, gimple_location (stmt), format, ap); + else +dump_vprintf (dump_kind, format, ap); + + va_end (ap); +} + /* Start a dump for PHASE. Store user-supplied dump flags in *FLAG_PTR. Return
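The refactoring in this patch follows a common pattern: each variadic entry point only does `va_start`/`va_end` and forwards to a core that takes a `va_list`. A minimal sketch of that pattern in C++ follows. It is not the dumpfile.c code: `stmt`, `vformat` and `format_for_stmt` are hypothetical names, and the output goes to a string instead of a dump stream. One detail worth keeping in mind is that a `va_list` which has already been handed to a `v*printf` function should not be consumed a second time without `va_copy`, which matters when the same arguments are written to two streams.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a statement that carries a source line.
struct stmt { int line; };

// va_list core: formats into a std::string.  The list is consumed twice
// (once to measure, once to write), so the first pass works on a copy.
static std::string vformat(const char *format, va_list ap) {
  va_list copy;
  va_copy(copy, ap);
  int n = std::vsnprintf(nullptr, 0, format, copy);  // measure only
  va_end(copy);
  if (n < 0)
    return std::string();
  std::string out(static_cast<std::string::size_type>(n) + 1, '\0');
  std::vsnprintf(&out[0], out.size(), format, ap);   // real write
  out.resize(static_cast<std::string::size_type>(n)); // drop the NUL
  return out;
}

// Variadic wrapper: prefix the location only when a statement exists,
// otherwise fall back to the plain message.
std::string format_for_stmt(const stmt *s, const char *format, ...) {
  va_list ap;
  va_start(ap, format);
  std::string body = vformat(format, ap);
  va_end(ap);
  if (s)
    return "line " + std::to_string(s->line) + ": " + body;
  return body;
}
```

Keeping the null check in the wrapper means callers do not need to know whether a statement is attached, which is the same trade-off the thread goes on to debate.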
[patch] c++/58051 Implement Core 1579
DR1579 relaxes [class.copy]/32 so that expressions in return statements can be looked up as rvalues even when they aren't the same type as the function return type.

Implementing that seems as simple as removing the restriction on the types.

Tested x86_64-linux, no regressions. OK for trunk?

commit 45e8a7ceb267cafde4d4411563a3e84bbd49ad8c
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu Jun 26 11:00:54 2014 +0100

    gcc/cp:
    DR 1579
    PR c++/58051
    * typeck.c (check_return_expr): Lookup as an rvalue even when the
    types aren't the same.

    gcc/testsuite:
    * g++.dg/cpp0x/elision_conv.C: New.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 65dccf7..042e600 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8607,7 +8607,7 @@ check_return_expr (tree retval, bool *no_warning)
   if (VOID_TYPE_P (functype))
     return error_mark_node;
 
-  /* Under C++0x [12.8/16 class.copy], a returned lvalue is sometimes
+  /* Under C++11 [12.8/32 class.copy], a returned lvalue is sometimes
      treated as an rvalue for the purposes of overload resolution to
      favor move constructors over copy constructors.
 
@@ -8618,8 +8618,6 @@ check_return_expr (tree retval, bool *no_warning)
           || TREE_CODE (retval) == PARM_DECL)
       && DECL_CONTEXT (retval) == current_function_decl
       && !TREE_STATIC (retval)
-      && same_type_p ((TYPE_MAIN_VARIANT (TREE_TYPE (retval))),
-                      (TYPE_MAIN_VARIANT (functype)))
       /* This is only interesting for class type.  */
       && CLASS_TYPE_P (functype))
     flags = flags | LOOKUP_PREFER_RVALUE;
diff --git a/gcc/testsuite/g++.dg/cpp0x/elision_conv.C b/gcc/testsuite/g++.dg/cpp0x/elision_conv.C
new file mode 100644
index 000..d778a0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/elision_conv.C
@@ -0,0 +1,18 @@
+// Core 1579 return by converting move constructor
+// PR c++/58051
+// { dg-do compile { target c++11 } }
+
+struct A {
+  A() = default;
+  A(A&&) = default;
+};
+
+struct B {
+  B(A) { }
+};
+
+B f()
+{
+  A a;
+  return a;
+}
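The effect of DR 1579 can also be observed at run time. The sketch below is an illustration only, assuming a compiler that implements the DR; it is not part of the patch or its testsuite. `B` gets two converting constructors and records which one overload resolution picked; the `from_rvalue` member is a hypothetical addition for this purpose.

```cpp
#include <cassert>

struct A {
  A() = default;
  A(const A &) = default;
  A(A &&) = default;
};

struct B {
  bool from_rvalue;                          // hypothetical marker member
  B(const A &) : from_rvalue(false) {}       // chosen for an lvalue A
  B(A &&) : from_rvalue(true) {}             // chosen for an rvalue A
};

B f() {
  A a;
  // The returned expression names a local of type A while the function
  // returns B.  With DR 1579, overload resolution is first tried with
  // `a` treated as an rvalue, so B(A &&) is expected to be selected.
  return a;
}
```

Under the pre-DR rule the types differ, so `a` would be treated as an lvalue and `B(const A &)` would be called instead.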
[PATCH][match-and-simplify] Restore bootstrap
We've accumulated some warnings - fixed below.

Richard.

2014-06-26  Richard Biener  rguent...@suse.de

    * genmatch.c (c_expr::gen_transform): Fix unused parameter.
    (dt_simplify::gen_gimple): Mark captures as possibly unused.
    (dt_simplify::gen_generic): Likewise.
    (decision_tree::gen_generic): Mark type parameter as possibly
    unused.

Index: genmatch.c
===================================================================
--- genmatch.c (revision 212025)
+++ genmatch.c (working copy)
@@ -602,7 +602,7 @@ expr::gen_transform (FILE *f, const char
 }
 
 void
-c_expr::gen_transform (FILE *f, const char *dest, bool gimple)
+c_expr::gen_transform (FILE *f, const char *dest, bool)
 {
   /* If this expression has an outlined function variant, call it.  */
   if (fname)
@@ -1352,7 +1352,7 @@ dt_simplify::gen_gimple (FILE *f)
 
   fprintf (f, "/* simplify %u */\n", pattern_no);
 
   fprintf (f, "{\n");
-  fprintf (f, "tree captures[4] = {};\n");
+  fprintf (f, "tree captures[4] ATTRIBUTE_UNUSED = {};\n");
 
   for (unsigned i = 0; i < dt_simplify::capture_max; ++i)
     if (indexes[i])
@@ -1410,7 +1410,7 @@ dt_simplify::gen_generic (FILE *f)
 
   fprintf (f, "/* simplify %u */\n", pattern_no);
 
   fprintf (f, "{\n");
-  fprintf (f, "tree captures[4] = {};\n");
+  fprintf (f, "tree captures[4] ATTRIBUTE_UNUSED = {};\n");
 
   for (unsigned i = 0; i < dt_simplify::capture_max; ++i)
     if (indexes[i])
@@ -1512,7 +1512,7 @@ decision_tree::gen_generic (FILE *f)
   for (unsigned n = 1; n <= 3; ++n)
     {
       fprintf (f, "\ntree\n"
-               "generic_match_and_simplify (enum tree_code code, tree type");
+               "generic_match_and_simplify (enum tree_code code, tree type ATTRIBUTE_UNUSED");
       for (unsigned i = 0; i < n; ++i)
         fprintf (f, ", tree op%d", i);
       fprintf (f, ")\n");
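The two techniques used in this patch are easy to show in isolation. The sketch below assumes a GCC-compatible compiler for the `__attribute__` spelling (ATTRIBUTE_UNUSED in GCC's own sources is a macro around it); the function names are hypothetical and unrelated to genmatch.c.

```cpp
#include <cassert>

// Technique 1: leave the parameter unnamed.  An unnamed parameter cannot
// be referenced, so there is nothing for -Wunused-parameter to report.
int transform(int value, bool /* gimple */) {
  return value * 2;
}

// Technique 2: keep the name but mark it as possibly unused.  This suits
// generated code, where the generator cannot know in advance whether the
// emitted body will reference the parameter or local.
int match(int code, int type __attribute__((__unused__))) {
  int captures[4] __attribute__((__unused__)) = {};
  return code;
}
```

The first form suits hand-written functions whose signature is fixed by an interface; the second suits emitted code such as the `captures` array above.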
Re: [build, driver] RFC: Support compressed debug sections
Hi Gerald,

sorry for the delay, I've been away for a couple of days.

On Tue, 3 Jun 2014, Rainer Orth wrote:

It's been another week, and I still need approval for the build, doc, and Darwin changes:

https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01860.html

On the doc side, things are fine. Just a suggestion or two:

+Produce compressed debug sections in DWARF format, if that is supported.

Supported by what?

By the toolchain used. TBH, I just copied that fragment from various other debug options (-gstabs, -gcoff, -gdwarf-N). Given the precedent and the verbosity of a more detailed explanation, I'd leave this as is.

doesn't -> does not, especially given the emphasis we want to make here.

Good point, fixed.

And could the

If the linker doesn't support writing compressed debug sections, the option is rejected. Otherwise, if the assembler doesn't support them, @option{-gz} is silently ignored when producing object files.

be moved to the very end, or is this only applicable to the case where no type has been specified?

No, you're right: it's better to first explain the values for type in the working case, then explain potential error scenarios. The section now reads

@item -gz@r{[}=@var{type}@r{]}
@opindex gz
Produce compressed debug sections in DWARF format, if that is supported.
If @var{type} is not given, the default type depends on the capabilities
of the assembler and linker used.  @var{type} may be one of
@option{none} (don't compress debug sections), @option{zlib} (use zlib
compression in ELF gABI format), or @option{zlib-gnu} (use zlib
compression in traditional GNU format).  If the linker doesn't support
writing compressed debug sections, the option is rejected.  Otherwise,
if the assembler does not support them, @option{-gz} is silently ignored
when producing object files.

Thanks for your comments. I'm still missing review of the build parts after three weeks and several reminders, though. Paolo, Nathanael, Alexandre, could one of you please have a look?

Thanks.
Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Devirtualization dump functions fix
On Thu, Jun 26, 2014 at 3:01 PM, Martin Liška mli...@suse.cz wrote:

Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e->call_stmt) is called for e->call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)?

Bootstrapped and regtested on x86_64-unknown-linux-gnu.

Hmm, I don't like that very much - dump_printf_loc_for_stmt still implies stmt is not NULL. So you could have fixed gimple_location as well. I suppose dump_printf_loc already does sth sane with UNKNOWN_LOCATION.

Richard.

Thanks,
Martin

ChangeLog:

2014-06-26  Martin Liska  mli...@suse.cz

    * include/ansidecl.h: New collection of ATTRIBUTE_NULL_PRINTF_X_0
    defined.

gcc/ChangeLog:

2014-06-26  Martin Liska  mli...@suse.cz

    * dumpfile.h: New function dump_printf_loc_for_stmt.
    * dumpfile.c: Implementation added.
    (dump_vprintf): New function.i
    * cgraphunit.c: dump_printf_loc_for_stmt usage replaces
    dump_printf_loc.
    * gimple-fold.c: Likewise.
    * ipa-devirt.c: Likewise.
    * ipa-prop.c: Likewise.
    * ipa.c: Likewise.
    * tree-ssa-pre.c: Likewise.
Re: [build, driver] RFC: Support compressed debug sections
Eric Christopher echri...@gmail.com writes:

If it is just to reach compatibility with the debugger, then I’d rather either just mandate a certain debugger or autoconf for what the current debugger supports. As of late people seem to just break the debugging experience with non-updated gdbs and assume that a newer gdb is used.

You cannot do that: unlike the assembler and linker used, which are often hardcoded into gcc, the debugger can easily be changed below the compiler's feet, so to speak. Besides, on several platforms, you have more than one debugger available (like gdb and dbx, or others), so this isn't an option.

Apart from that, the debugging experience when e.g. emitting very recent DWARF extensions and trying to use them with a gdb that doesn't understand them usually leads to some debug info missing. In this case, emitting compressed debug with a debugger that cannot read it leads to the debugger claiming (correctly, from its point of view) that there's no debugging info present. I don't want to tell users who come complaining `I compiled with -g, but my debugger tells me there's no debug info present': `look, your debugger lies, it is present, but it cannot read it'. That's a lot worse than the DWARF extensions scenario above.

Agreed :) FWIW it's already a gas/assembler option, I'm curious about wanting to expose it via the compiler?

One reason: ease of use:

* -gz is far easier to use/type than -Wa,--compress-debug-sections + -Wl,--compress-debug-sections, and

* one common option irrespective of assemblers (the Solaris assembler will eventually gain compressed debug support, too) and linkers used (Solaris ld requires -z compress-sections=type), and even the Apple assembler might at some point ;-)

On top of all that, compressed debug is a tradeoff: in some cases it may be worth it to save space on debug info if disk space is at a premium for some reason (e.g. for release builds), but in others you want to compile as fast as possible, but assembling and linking compressed debug takes more CPU time. Otherwise we could just as well default to -Os, telling our users it's better for them since it generates faster and smaller code, not minding the compile time cost and worse debugging experience.

FWIW I've found in some limited timing that compression is nearly always worth it here at Google - even for compile time given the cost of writing files versus cpu time. Might be worth making it a default at some point in the future and making sure the option is invertible.

One might be not so lucky with different/slower CPUs, though. I wonder how this would affect bootstrap times on my current SPARC systems ;-( But yes, a configure option to default -gz to on would certainly be helpful at some point.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix forwprop pattern (T)(P + A) - (T)P -> (T)A
Only if we could somehow rule out that chars_per_limb can be zero. Then we know for sure that unsigned overflow must happen, and the only possible result would be -1. But at this time, both -1 and 4294967295 are possible.

I see, I thought you meant that the result was -1 statically. Thanks for correcting this annoying blunder...

--
Eric Botcazou
[PATCH] Improve -fdump-tree-all efficiency
The following patch fixes a big inefficiency when using -fdump-tree-all for large source files. I found that when using this option the compile time became unreasonably slow, and I traced this to the fact that dump_begin/dump_end are called around every function/class that are dumped via -fdump-tree-original and -fdump-class-hierarchy. I fixed this by opening the .original and .class dumps once before invoking the parser and closing them once after. For a file containing ~7000 each of functions and classes, the real time measured for the compile is: no dumping: 8s -fdump-tree-all, no patch: 10m30s -fdump-tree-all, my patch: 1m21s Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa 2014-06-26 Teresa Johnson tejohn...@google.com * c-family/c-common.h (get_dump_info): Declare. * c-family/c-gimplify.c (c_genericize): Use saved dump files. * c-family/c-opts.c (c_common_parse_file): Begin and end dumps once around parsing invocation. (get_dump_info): New function. * cp/class.c (dump_class_hierarchy): Use saved dump files. (dump_vtable): Ditto. (dump_vtt): Ditto. Index: c-family/c-common.h === --- c-family/c-common.h (revision 211980) +++ c-family/c-common.h (working copy) @@ -835,6 +835,7 @@ extern bool c_common_post_options (const char **); extern bool c_common_init (void); extern void c_common_finish (void); extern void c_common_parse_file (void); +extern FILE *get_dump_info (int, int *); extern alias_set_type c_common_get_alias_set (tree); extern void c_register_builtin_type (tree, const char*); extern bool c_promoting_integer_type_p (const_tree); Index: c-family/c-gimplify.c === --- c-family/c-gimplify.c (revision 211980) +++ c-family/c-gimplify.c (working copy) @@ -123,7 +123,7 @@ c_genericize (tree fndecl) } /* Dump the C-specific tree IR. 
*/ - dump_orig = dump_begin (TDI_original, local_dump_flags); + dump_orig = get_dump_info (TDI_original, local_dump_flags); if (dump_orig) { fprintf (dump_orig, \n;; Function %s, @@ -140,8 +140,6 @@ c_genericize (tree fndecl) else print_c_tree (dump_orig, DECL_SAVED_TREE (fndecl)); fprintf (dump_orig, \n); - - dump_end (TDI_original, dump_orig); } /* Dump all nested functions now. */ Index: c-family/c-opts.c === --- c-family/c-opts.c (revision 211980) +++ c-family/c-opts.c (working copy) @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see TARGET_FLT_EVAL_METHOD_NON_DEFAULT and TARGET_OPTF. */ #include tm_p.h /* For C_COMMON_OVERRIDE_OPTIONS. */ +#include dumpfile.h #ifndef DOLLARS_IN_IDENTIFIERS # define DOLLARS_IN_IDENTIFIERS true @@ -102,6 +103,12 @@ static size_t deferred_count; /* Number of deferred options scanned for -include. */ static size_t include_cursor; +/* Dump files/flags to use during parsing. */ +static FILE *original_dump_file = NULL; +static int original_dump_flags; +static FILE *class_dump_file = NULL; +static int class_dump_flags; + /* Whether any standard preincluded header has been preincluded. */ static bool done_preinclude; @@ -1088,6 +1095,10 @@ c_common_parse_file (void) for (;;) { c_finish_options (); + /* Open the dump files to use for the original and class dump output + here, to be used during parsing for the current file. */ + original_dump_file = dump_begin (TDI_original, original_dump_flags); + class_dump_file = dump_begin (TDI_class, class_dump_flags); pch_init (); push_file_scope (); c_parse_file (); @@ -1101,6 +1112,16 @@ c_common_parse_file (void) cpp_clear_file_cache (parse_in); this_input_filename = cpp_read_main_file (parse_in, in_fnames[i]); + if (original_dump_file) +{ + dump_end (TDI_original, original_dump_file); + original_dump_file = NULL; +} + if (class_dump_file) +{ + dump_end (TDI_class, class_dump_file); + class_dump_file = NULL; +} /* If an input file is missing, abandon further compilation. 
cpplib has issued a diagnostic. */ if (!this_input_filename) @@ -1108,6 +1129,23 @@ c_common_parse_file (void) } } +/* Returns the appropriate dump file for PHASE to dump with FLAGS. */ +FILE * +get_dump_info (int phase, int *flags) +{ + gcc_assert (phase == TDI_original || phase == TDI_class); + if (phase == TDI_original) +{ + *flags = original_dump_flags; + return original_dump_file; +} + else +{ + *flags = class_dump_flags; + return class_dump_file; +} +} + /* Common finish hook for the C, ObjC and C++ front
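The speedup in this patch comes from opening each dump stream once around the whole parse instead of once per function or class. A minimal sketch of that idea in C++ follows. It is not the c-opts.c code: `dump_sink` and `dump_function` are hypothetical names, and an in-memory string stands in for the dump file so that the number of opens can be counted.

```cpp
#include <cassert>
#include <string>

// Hypothetical dump sink that records how often it was opened.
class dump_sink {
 public:
  // Open once before parsing (stand-in for the single dump_begin call).
  void begin() { ++opens_; open_ = true; }
  // Close once after parsing (stand-in for the single dump_end call).
  void end() { open_ = false; }
  // Accessor in the spirit of get_dump_info: hand out the cached
  // stream, or null when dumping is not active.
  std::string *get() { return open_ ? &text_ : nullptr; }
  int opens() const { return opens_; }
  const std::string &text() const { return text_; }

 private:
  std::string text_;
  int opens_ = 0;
  bool open_ = false;
};

// Per-function dump: uses the cached stream and never opens or closes it.
void dump_function(dump_sink &sink, const std::string &name) {
  if (std::string *out = sink.get())
    *out += ";; Function " + name + "\n";
}
```

With this shape the cost of opening and closing is paid once per translation unit, no matter how many functions are dumped in between.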
Re: [PATCH] Devirtualization dump functions fix
On 06/26/2014 03:20 PM, Richard Biener wrote:

On Thu, Jun 26, 2014 at 3:01 PM, Martin Liška mli...@suse.cz wrote:

Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e->call_stmt) is called for e->call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)?

Bootstrapped and regtested on x86_64-unknown-linux-gnu.

Hmm, I don't like that very much - dump_printf_loc_for_stmt still implies stmt is not NULL. So you could have fixed gimple_location as well. I suppose dump_printf_loc already does sth sane with UNKNOWN_LOCATION.

Richard.

Hi, you are right that it is quite complex change. Do you mean this one line change can be sufficient ?

diff --git a/gcc/gimple.h b/gcc/gimple.h
index ceefbc0..954195e 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1498,7 +1498,7 @@ gimple_set_block (gimple g, tree block)
 static inline location_t
 gimple_location (const_gimple g)
 {
-  return g->location;
+  return g ? g->location : UNKNOWN_LOCATION;
 }
 
 /* Return pointer to location information for statement G.  */

I will double-check if it solves the problem ;)

Martin

Thanks,
Martin

ChangeLog:

2014-06-26  Martin Liska  mli...@suse.cz

    * include/ansidecl.h: New collection of ATTRIBUTE_NULL_PRINTF_X_0
    defined.

gcc/ChangeLog:

2014-06-26  Martin Liska  mli...@suse.cz

    * dumpfile.h: New function dump_printf_loc_for_stmt.
    * dumpfile.c: Implementation added.
    (dump_vprintf): New function.i
    * cgraphunit.c: dump_printf_loc_for_stmt usage replaces
    dump_printf_loc.
    * gimple-fold.c: Likewise.
    * ipa-devirt.c: Likewise.
    * ipa-prop.c: Likewise.
    * ipa.c: Likewise.
    * tree-ssa-pre.c: Likewise.
Re: [PATCH] Devirtualization dump functions fix
On Thu, Jun 26, 2014 at 3:43 PM, Martin Liška mli...@suse.cz wrote: On 06/26/2014 03:20 PM, Richard Biener wrote: On Thu, Jun 26, 2014 at 3:01 PM, Martin Liška mli...@suse.cz wrote: Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e-call_stmt) is called for e-call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)? Bootstrapped and regtested on x86_64-unknown-linux-gnu. Hmm, I don't like that very much - dump_printf_loc_for_stmt still implies stmt is not NULL. So you could have fixed gimple_location as well. I suppose dump_printf_loc already does sth sane with UNKNOWN_LOCATION. Richard. Hi, you are right that it is quite complex change. Do you mean this one line change can be sufficient ? diff --git a/gcc/gimple.h b/gcc/gimple.h index ceefbc0..954195e 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -1498,7 +1498,7 @@ gimple_set_block (gimple g, tree block) static inline location_t gimple_location (const_gimple g) { - return g-location; + return g ? g-location : UNKNOWN_LOCATION; } /* Return pointer to location information for statement G. */ I will double-check if it solves the problem ;) Well yes - it is of course similar broken in spirit but at least a lot simpler ;) I'd put a comment there why we do check g for NULL. Thanks, Richard. Martin Thanks, Martin ChangeLog: 2014-06-26 Martin Liska mli...@suse.cz * include/ansidecl.h: New collection of ATTRIBUTE_NULL_PRINTF_X_0 defined. gcc/ChangeLog: 2014-06-26 Martin Liska mli...@suse.cz * dumpfile.h: New function dump_printf_loc_for_stmt. * dumpfile.c: Implementation added. (dump_vprintf): New function.i * cgraphunit.c: dump_printf_loc_for_stmt usage replaces dump_printf_loc. * gimple-fold.c: Likewise. * ipa-devirt.c: Likewise. 
* ipa-prop.c: Likewise. * ipa.c: Likewise. * tree-ssa-pre.c: Likewise.
Re: [PATCH] Devirtualization dump functions fix
On Thu, Jun 26, 2014 at 04:10:03PM +0200, Richard Biener wrote: On Thu, Jun 26, 2014 at 3:43 PM, Martin Liška mli...@suse.cz wrote: On 06/26/2014 03:20 PM, Richard Biener wrote: On Thu, Jun 26, 2014 at 3:01 PM, Martin Liška mli...@suse.cz wrote: Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e-call_stmt) is called for e-call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)? Bootstrapped and regtested on x86_64-unknown-linux-gnu. Hmm, I don't like that very much - dump_printf_loc_for_stmt still implies stmt is not NULL. So you could have fixed gimple_location as well. I suppose dump_printf_loc already does sth sane with UNKNOWN_LOCATION. Richard. Hi, you are right that it is quite complex change. Do you mean this one line change can be sufficient ? diff --git a/gcc/gimple.h b/gcc/gimple.h index ceefbc0..954195e 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -1498,7 +1498,7 @@ gimple_set_block (gimple g, tree block) static inline location_t gimple_location (const_gimple g) { - return g-location; + return g ? g-location : UNKNOWN_LOCATION; } /* Return pointer to location information for statement G. */ I will double-check if it solves the problem ;) Well yes - it is of course similar broken in spirit but at least a lot simpler ;) I'd put a comment there why we do check g for NULL. But it increases overhead, there are hundreds of gimple_location calls and most of them will never pass NULL. Can't you simply do what you do in the inline here in the couple of spots where the stmt might be NULL? Jakub
Re: [PATCH] Devirtualization dump functions fix
On 06/26/2014 04:18 PM, Jakub Jelinek wrote: On Thu, Jun 26, 2014 at 04:10:03PM +0200, Richard Biener wrote: On Thu, Jun 26, 2014 at 3:43 PM, Martin Liška mli...@suse.cz wrote: On 06/26/2014 03:20 PM, Richard Biener wrote: On Thu, Jun 26, 2014 at 3:01 PM, Martin Liška mli...@suse.cz wrote: Hello, I encountered similar issue to PR ipa/61462 where location_t locus = gimple_location (e-call_stmt) is called for e-call_stmt == NULL (Firefox with -flto -fdump-ipa-devirt). So that, I decided to introduce new function that is called for all potentially unsafe locations. I am wondering if a newly added function can be added in more seamless way (without playing with va_list and ATTRIBUTE_PRINTF stuff)? Bootstrapped and regtested on x86_64-unknown-linux-gnu. Hmm, I don't like that very much - dump_printf_loc_for_stmt still implies stmt is not NULL. So you could have fixed gimple_location as well. I suppose dump_printf_loc already does sth sane with UNKNOWN_LOCATION. Richard. Hi, you are right that it is quite complex change. Do you mean this one line change can be sufficient ? diff --git a/gcc/gimple.h b/gcc/gimple.h index ceefbc0..954195e 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -1498,7 +1498,7 @@ gimple_set_block (gimple g, tree block) static inline location_t gimple_location (const_gimple g) { - return g-location; + return g ? g-location : UNKNOWN_LOCATION; } /* Return pointer to location information for statement G. */ I will double-check if it solves the problem ;) Well yes - it is of course similar broken in spirit but at least a lot simpler ;) I'd put a comment there why we do check g for NULL. But it increases overhead, there are hundreds of gimple_location calls and most of them will never pass NULL. Can't you simply do what you do in the inline here in the couple of spots where the stmt might be NULL? Sure, do you have any suggestion how should be called such function? Suggestion: gimple_location_or_unknown ? Thanks, Martin Jakub
Re: [PATCH] Devirtualization dump functions fix
On Thu, Jun 26, 2014 at 04:27:49PM +0200, Martin Liška wrote:

Well yes - it is of course similar broken in spirit but at least a lot simpler ;) I'd put a comment there why we do check g for NULL.

But it increases overhead, there are hundreds of gimple_location calls and most of them will never pass NULL. Can't you simply do what you do in the inline here in the couple of spots where the stmt might be NULL?

Sure, do you have any suggestion how should be called such function? Suggestion: gimple_location_or_unknown ?

gimple_location_safe or gimple_safe_location?

Jakub
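The direction the thread converges on, a separate checked accessor used only where the statement may be NULL, can be sketched as follows. This is an illustration with stand-in types, not the gimple.h code: `gimple_stmt`, the `location_t` alias and the value chosen for `UNKNOWN_LOCATION` are assumptions made so the example compiles on its own, and `gimple_location_safe` is one of the names proposed above.

```cpp
#include <cassert>

using location_t = unsigned;             // stand-in for the real type
const location_t UNKNOWN_LOCATION = 0;   // stand-in value

// Hypothetical, minimal statement type.
struct gimple_stmt { location_t location; };

// Hot-path accessor: no null check, callers must pass a valid statement.
inline location_t gimple_location(const gimple_stmt *g) {
  return g->location;
}

// Checked variant for the few call sites where the statement may be
// missing, e.g. dump code running on an edge without a call statement.
inline location_t gimple_location_safe(const gimple_stmt *g) {
  return g ? gimple_location(g) : UNKNOWN_LOCATION;
}
```

Keeping the unchecked accessor untouched addresses the overhead concern, since only the handful of dump call sites pay for the extra branch.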
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
On 06/25/2014 10:03 AM, Andrew Sutton wrote:

I did a full 3-stage bootstrap which is the default these days. I'll try --disable-bootstrap and see what happens.

I just did a full bootstrap build and got the same errors. The errors are correct for C++11, which was enabled by default in this branch. IIRC, aggregate initialization requires the initializer-clause to match the structure exactly (or at least not omit any const initializers?)

I think this was something Gaby wanted when we created the branch, but I'm not sure it's worth keeping because of the bootstrapping errors.

I could reset the default dialect to 98 and turn on concepts iff 1y is enabled, or I could turn on 1y if -fconcepts is enabled. Thoughts?

Andrew

I did --disable-bootstrap and it worked a charm.

I, for one, would like gcc to bootstrap with c++11/c++14. I think we should be starting to shake down that path. I'm probably not alone in this. On the other hand, I don't think c++-concepts branch should be the leader on this. We have our work cut out for us without fighting these bugs. Maybe a c++11-bootstrap branch could be started to work the c++1* bootstrap out.

As long as gcc defaults to bootstrapping with c++98 I think we should do that if it won't preclude concepts work. Put it this way: I want concepts in trunk faster than I think we could get c++11 bootstrapping gcc working and set as default. I could be wrong - maybe c++11-bootstrap won't be that hard.

As for flags. I vote for concepts switched on for -std=c++1y. As for -fconcepts turning on c++1y I'm less sure. We could allow concepts for C++11 (I don't think c++98 would work because of constexpr and maybe new template syntax). I hadn't thought about that. Personally I leave -std=c++14 and use all the things... ;-)

I'm CCing Jason.
Re: [PATCH, rs6000] Fix PR61542 - V4SF vector extract for little endian
Bernd, thanks. At this point I think I will avoid opening this can of worms and not worry about backporting the test case.

Thanks,
Bill

On Wed, 2014-06-18 at 19:18 +0200, Bernd Edlinger wrote:

Hi,

On Wed, 18 Jun 2014 09:56:15, David Edelsohn wrote:

On Tue, Jun 17, 2014 at 6:44 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote:

Hi,

As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61542, a new test case (gcc.dg/vect/vect-nop-move.c) was added in 4.9. This exposes a bug on PowerPC little endian for extracting an element from a V4SF value that goes back to 4.8. The following patch fixes the problem.

Tested on powerpc64le-unknown-linux-gnu with no regressions. Ok to commit to trunk? I would also like to commit to 4.8 and 4.9 as soon as possible to be picked up by the distros.

This is okay everywhere.

I would also like to backport gcc.dg/vect/vect-nop-move.c to 4.8 to provide regression coverage.

You should ask Bernd and the RMs. Was the bug fix that prompted the new testcase backported to all targets?

Thanks, David

actually I only added the check_vect to that test case, but that exposed a bug on Solaris-9. See https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=207668.

That was in the -fdump-rtl-combine-details handling, where fprintf got a NULL value passed for %s, which ICEs on Solaris9. So if you backport that test case, be sure to check that one too.

Originally the test case seems to check something for the aarch64-target. See https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205712.

Obviously the patch in rtlanal.c (set_noop_p) was never backported to the 4.8 branch. Maybe Tejas who originally wrote that test case, can explain, if it makes sense to backport this fix too.

Thanks
Bernd.
Re: [PATCH] Fix parts of PR61607
On 06/26/14 02:58, Richard Biener wrote: On Thu, 26 Jun 2014, Richard Biener wrote: On Thu, 26 Jun 2014, Richard Biener wrote: On Wed, 25 Jun 2014, Jeff Law wrote: On 06/25/14 08:05, Richard Biener wrote: This removes restrictions in DOM cprop_operand that inhibit some optimizations. The volatile pointer thing is really, really old and no longer necessary while the loop-depth consideration is only valid for loop-closed PHI nodes (but we're not in loop-closed SSA in DOM) - the coalescing is handled in out-of-SSA phase by inserting copies appropriately. Bootstrapped on x86_64-unknown-linux-gnu, ok? Thanks, Richard. 2014-06-25 Richard Biener rguent...@suse.de PR tree-optimization/61607 * tree-ssa-dom.c (cprop_operand): Remove restriction on propagating volatile pointers and on loop depth. The first hunk is OK. I thought we had tests for the "do not copy propagate out of a loop nest" case in the suite. Did you check that tests in BZ 19038 still generate good code after this change? If we still generate good code for those tests, then this hunk is fine too. I have applied the first hunk and will investigate further. Testing didn't show any issue and I know how to retain the check but not cause the missed optimization shown in PR61607. Let's try to summarize what the restriction is supposed to avoid. It tries to avoid introducing uses of SSA names defined inside a loop outside of it because if the SSA name is live over the backedge we will then have an overlapping life-range which prevents out-of-SSA from coalescing it to a single register. Now, the existing test is not working in that way. Rather the best way we have to ensure this property (all outside uses go through a copy that is placed on exit edges rather than possibly on the backedge) is to go into loop-closed SSA form. This is also where the PHI nodes that confuse DOM in PR61607 come from in the first place.
Now as the existing measure is ineffective in some cases out-of-SSA has gotten the ability to deal with this (or a subset): /* If elimination of a PHI requires inserting a copy on a backedge, then we will have to split the backedge which has numerous undesirable performance effects. A significant number of such cases can be handled here by inserting copies into the loop itself. */ insert_backedge_copies (); now, this doesn't seem to deal with outside uses. But eventually the coalescing code already assigns proper cost to backedge copies so that we choose to place copies on the exit edges rather than the backedge ones - seems not so from looking at coalesce_cost_edge. So I think that we should remove the copy-propagation restrictions and instead address this in out-of-SSA. For now the following patch retains the exact same restriction in DOM as it is present in copyprop (but not in FRE - ok my recent fault, or in VRP). By avoiding to record the equivalency for PHIs (where we know that either all or no uses should be covered by the loop depth check) we retain the ability to record the equivalency for the two loop exit PHI nodes and thus the threading (if only on the false path). Bootstrap and regtest running on x86_64-unknown-linux-gnu. I'll try to see what happens to the PR19038 testcases (though that PR is a mess ...) I checked the very original one (thin6d.f from sixtrack) and the generated assembly for -Ofast is the same without any patch and with _all_ loop_depth_of_name restrictions removed from both DOM and copyprop (thus making loop_depth_of_name dead). The cost of out-of-SSA copies for backedges (or in the case of the PR, loop latch edges causing an edge split) is dealt with by /* Inserting copy on critical edge costs more than inserting it elsewhere. */ if (EDGE_CRITICAL_P (e)) mult = 2; in coalesce_cost_edge. So in the end, without a testcase to investigate, I'd propose to get rid of those restrictions. I'm still going forward with the patch below for now. 
Sounds good. Glad to see those hacks disappear. Jeff
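The cost rule quoted in this thread (a copy on a critical edge costs twice as much as a copy elsewhere) can be illustrated with a small standalone sketch. This is not the code of coalesce_cost_edge: struct edge_info, edge_critical_p and copy_cost_on_edge are hypothetical names invented here, and the real function takes more into account than frequency and criticality.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a CFG edge; not GCC's edge type.  */
struct edge_info
{
  int src_succs;   /* number of successor edges of the source block */
  int dest_preds;  /* number of predecessor edges of the destination block */
  int frequency;   /* estimated execution frequency of the edge */
};

/* An edge is critical when its source has several successors and its
   destination has several predecessors.  A copy cannot be placed in
   either block without affecting other paths, so the edge would have
   to be split.  */
static int
edge_critical_p (const struct edge_info *e)
{
  assert (e != NULL);
  return e->src_succs > 1 && e->dest_preds > 1;
}

/* Cost of inserting a copy on E: the frequency, doubled on a critical
   edge, mirroring the "mult = 2" rule quoted in the thread.  */
static int
copy_cost_on_edge (const struct edge_info *e)
{
  int mult = edge_critical_p (e) ? 2 : 1;
  return e->frequency * mult;
}
```

With equal frequencies, a latch edge that needs splitting comes out more expensive than a plain exit edge, which is the bias the thread relies on when it proposes dropping the DOM restriction.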
[PATCH][Ping v2] Add patch for debugging compiler ICEs
Ping. Original Message Subject:[PATCH][Ping] Add patch for debugging compiler ICEs Date: Wed, 11 Jun 2014 18:15:27 +0400 From: Maxim Ostapenko m.ostape...@partner.samsung.com To: GCC Patches gcc-patches@gcc.gnu.org CC: Yury Gribov y.gri...@samsung.com, Slava Garbuzov v.garbu...@samsung.com, Jakub Jelinek ja...@redhat.com, tsaund...@mozilla.com, chefm...@gmail.com Ping. Original Message Subject:[PATCH] Add patch for debugging compiler ICEs Date: Mon, 02 Jun 2014 19:21:14 +0400 From: Maxim Ostapenko m.ostape...@partner.samsung.com To: GCC Patches gcc-patches@gcc.gnu.org CC: Yury Gribov y.gri...@samsung.com, Slava Garbuzov v.garbu...@samsung.com, Jakub Jelinek ja...@redhat.com, tsaund...@mozilla.com, chefm...@gmail.com Hi, Years ago there was a discussion (https://gcc.gnu.org/ml/gcc-patches/2004-01/msg02437.html) about debugging compiler ICEs that resulted in a patch from Jakub, which dumps useful information into a temporary file, but for some reason this patch wasn't applied to trunk. This is the resurrected patch with added GCC version information into generated repro file. -Maxim 2014-06-02 Jakub Jelinek ja...@redhat.com Max Ostapenko m.ostape...@partner.samsung.com * diagnostic.c (diagnostic_action_after_output): Exit with ICE_EXIT_CODE instead of FATAL_EXIT_CODE. * gcc.c (execute): Don't free first string early, but at the end of the function. Call retry_ice if compiler exited with ICE_EXIT_CODE. (main): Factor out common code. (print_configuration): New function. (try_fork): Likewise. (redirect_stdout_stderr): Likewise. (files_equal_p): Likewise. (check_repro): Likewise. (run_attempt): Likewise. (generate_preprocessed_code): Likewise. (append_text): Likewise. (try_generate_repro): Likewise.
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 0cc7593..67b8c5b 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -492,7 +492,7 @@ diagnostic_action_after_output (diagnostic_context *context, real_abort (); diagnostic_finish (context); fnotice (stderr, compilation terminated.\n); - exit (FATAL_EXIT_CODE); + exit (ICE_EXIT_CODE); default: gcc_unreachable (); diff --git a/gcc/gcc.c b/gcc/gcc.c index 9ac18e6..86dce03 100644 --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -43,6 +43,13 @@ compilation is specified by a string called a spec. */ #include params.h #include vec.h #include filenames.h +#ifdef HAVE_UNISTD_H +#include unistd.h +#endif + +#if !(defined (__MSDOS__) || defined (OS2) || defined (VMS)) +#define RETRY_ICE_SUPPORTED +#endif /* By default there is no special suffix for target executables. */ /* FIXME: when autoconf is fixed, remove the host check - dj */ @@ -253,6 +260,9 @@ static void init_gcc_specs (struct obstack *, const char *, const char *, static const char *convert_filename (const char *, int, int); #endif +#ifdef RETRY_ICE_SUPPORTED +static void try_generate_repro (const char *prog, const char **argv); +#endif static const char *getenv_spec_function (int, const char **); static const char *if_exists_spec_function (int, const char **); static const char *if_exists_else_spec_function (int, const char **); @@ -2797,7 +2807,7 @@ execute (void) } } - if (string != commands[i].prog) + if (i string != commands[i].prog) free (CONST_CAST (char *, string)); } @@ -2850,6 +2860,16 @@ execute (void) else if (WIFEXITED (status) WEXITSTATUS (status) = MIN_FATAL_STATUS) { +#ifdef RETRY_ICE_SUPPORTED + /* For ICEs in cc1, cc1obj, cc1plus see if it is + reproducible or not. */ + const char *p; + if (WEXITSTATUS (status) == ICE_EXIT_CODE + i == 0 + (p = strrchr (commands[0].argv[0], DIR_SEPARATOR)) + ! 
strncmp (p + 1, cc1, 3)) + try_generate_repro (commands[0].prog, commands[0].argv); +#endif if (WEXITSTATUS (status) greatest_status) greatest_status = WEXITSTATUS (status); ret_code = -1; @@ -2907,6 +2927,9 @@ execute (void) } } + if (commands[0].argv[0] != commands[0].prog) + free (CONST_CAST (char *, commands[0].argv[0])); + return ret_code; } } @@ -6098,6 +6121,342 @@ give_switch (int switchnum, int omit_first_word) switches[switchnum].validated = true; } +static void +print_configuration (void) +{ + int n; + const char *thrmod; + + fnotice (stderr, Target: %s\n, spec_machine); + fnotice (stderr, Configured with: %s\n, configuration_arguments); + +#ifdef THREAD_MODEL_SPEC + /* We could have defined THREAD_MODEL_SPEC to %* by default, + but there's no point in doing all this processing just to get + thread_model back. */ + obstack_init (obstack); + do_spec_1 (THREAD_MODEL_SPEC, 0, thread_model); + obstack_1grow (obstack, '\0'); + thrmod = XOBFINISH (obstack, const char *); +#else + thrmod = thread_model; +#endif + + fnotice (stderr, Thread model: %s\n, thrmod); + + /* compiler_version is
Re: [PATCH v2] typeof: Remove type qualifiers for atomic types
On Thu, 26 Jun 2014, Sebastian Huber wrote: gcc/c/ChangeLog 2014-06-26 Sebastian Huber sebastian.hu...@embedded-brains.de * c-parser.c (c_parser_declaration_or_fndef): Discard all type qualifiers in __auto_type for atomic types. (c_parser_typeof_specifier): Discard all type qualifiers in __typeof__ for atomic types. gcc/testsuite/ChangeLog 2014-06-26 Sebastian Huber sebastian.hu...@embedded-brains.de * gcc.dg/typeof-2.c: New testcase. OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Devirtualization dump functions fix
On 06/26/2014 04:29 PM, Jakub Jelinek wrote: On Thu, Jun 26, 2014 at 04:27:49PM +0200, Martin Liška wrote: Well yes - it is of course similar broken in spirit but at least a lot simpler ;) I'd put a comment there why we do check g for NULL. But it increases overhead, there are hundreds of gimple_location calls and most of them will never pass NULL. Can't you simply do what you do in the inline here in the couple of spots where the stmt might be NULL? Sure, do you have any suggestion how should be called such function? Suggestion: gimple_location_or_unknown ? gimple_location_safe or gimple_safe_location? Jakub Thanks, there's new patch. Patch has been tested for Firefox with -flto -fdump-ipa-devirt. Bootstrap and regression tests have been running. Ready for trunk after regression tests? ChangeLog: 2014-06-26 Martin Liska mli...@suse.cz * gimple.h (gimple_safe_location): New function introduced. * cgraphunit.c (walk_polymorphic_call_targets): Usage of gimple_safe_location replaces gimple_location. (gimple_fold_call): Likewise. * ipa-devirt.c (ipa_devirt): Likewise. * ipa-prop.c (ipa_make_edge_direct_to_target): Likewise. * ipa.c (walk_polymorphic_call_targets): Likewise. * tree-ssa-pre.c (eliminate_dom_walker::before_dom_children): Likewise. 
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 76b2fda1..2bf5216 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -906,7 +906,7 @@ walk_polymorphic_call_targets (pointer_set_t *reachable_call_targets,
             }
           if (dump_enabled_p ())
             {
-              location_t locus = gimple_location (edge->call_stmt);
+              location_t locus = gimple_safe_location (edge->call_stmt);
               dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, locus,
                                "devirtualizing call in %s to %s\n",
                                edge->caller->name (), target->name ());
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 403dee7..ad230be 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -387,7 +387,7 @@ fold_gimple_assign (gimple_stmt_iterator *si)
                     fndecl = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
                     if (dump_enabled_p ())
                       {
-                        location_t loc = gimple_location (stmt);
+                        location_t loc = gimple_safe_location (stmt);
                         dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
                                          "resolving virtual function address "
                                          "reference to function %s\n",
@@ -1131,7 +1131,7 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
               tree lhs = gimple_call_lhs (stmt);
               if (dump_enabled_p ())
                 {
-                  location_t loc = gimple_location (stmt);
+                  location_t loc = gimple_safe_location (stmt);
                   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
                                    "folding virtual function call to %s\n",
                                    targets.length () == 1
diff --git a/gcc/gimple.h b/gcc/gimple.h
index ceefbc0..d401d47 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1501,6 +1501,15 @@ gimple_location (const_gimple g)
   return g->location;
 }
 
+/* Return location information for statement G if g is not NULL.
+   Otherwise, UNKNOWN_LOCATION is returned.  */
+
+static inline location_t
+gimple_safe_location (const_gimple g)
+{
+  return g ? gimple_location (g) : UNKNOWN_LOCATION;
+}
+
 /* Return pointer to location information for statement G.  */
 
 static inline const location_t *
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 21f4f11..4e5dae8 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -2080,7 +2080,7 @@ ipa_devirt (void)
               {
                 if (dump_enabled_p ())
                   {
-                    location_t locus = gimple_location (e->call_stmt);
+                    location_t locus = gimple_safe_location (e->call_stmt);
                     dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, locus,
                                      "speculatively devirtualizing call in %s/%i to %s/%i\n",
                                      n->name (), n->order,
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 1e10b53..c6967be 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -2673,17 +2673,11 @@ ipa_make_edge_direct_to_target (struct cgraph_edge *ie, tree target)
           if (dump_enabled_p ())
             {
-              const char *fmt = "discovered direct call to non-function in %s/%i, "
-                                "making it __builtin_unreachable\n";
-
-              if (ie->call_stmt)
-                {
-                  location_t loc = gimple_location (ie->call_stmt);
-                  dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc, fmt,
-                                   ie->caller->name (), ie->caller->order);
-                }
-              else if (dump_file)
-                fprintf (dump_file, fmt, ie->caller->name (), ie->caller->order);
+              location_t loc = gimple_safe_location (ie->call_stmt);
+              dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, loc,
+                               "discovered direct call to non-function in %s/%i, "
+                               "making it __builtin_unreachable\n",
+                               ie->caller->name (), ie->caller->order);
             }
           target = builtin_decl_implicit (BUILT_IN_UNREACHABLE);
@@ -2745,18 +2739,11 @@ ipa_make_edge_direct_to_target (struct cgraph_edge *ie, tree
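The design point debated in this thread (keep the hot accessor free of NULL checks, and add a separate null-tolerant wrapper for the few call sites that may see a NULL statement) can be shown in isolation. The following is a minimal sketch with stand-in types: struct stmt, stmt_location and stmt_safe_location are hypothetical names, and the location_t and UNKNOWN_LOCATION definitions here are local placeholders rather than GCC's own.

```c
#include <assert.h>
#include <stddef.h>

/* Local placeholders standing in for GCC's location_t and
   UNKNOWN_LOCATION; the real definitions live elsewhere.  */
typedef unsigned int location_t;
#define UNKNOWN_LOCATION ((location_t) 0)

/* Hypothetical statement type carrying only a location.  */
struct stmt
{
  location_t location;
};

/* Fast accessor: the caller guarantees S is non-NULL, so the common
   path pays for no extra check in release builds.  */
static inline location_t
stmt_location (const struct stmt *s)
{
  assert (s != NULL);
  return s->location;
}

/* Null-tolerant accessor for the few call sites (such as dump code)
   where the statement may legitimately be missing.  */
static inline location_t
stmt_safe_location (const struct stmt *s)
{
  return s ? stmt_location (s) : UNKNOWN_LOCATION;
}
```

Only the dump paths would call the safe variant, so the hundreds of ordinary accessor calls mentioned by Jakub keep their current cost.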
Re: [PATCH 3/5] IPA ICF pass
On 06/24/2014 10:31 PM, Jeff Law wrote: On 06/13/14 04:44, mliska wrote: Hello, this is core of IPA ICF patchset. It adds new pass and registers all needed stuff related to a newly introduced interprocedural optimization. Algorithm description: In LGEN, we visit all read-only variables and functions. For each symbol, a hash value based on e.g. number of arguments, number of BB, GIMPLE CODES is computed (similar hash is computed for read-only variables). This kind of information is streamed for LTO. In WPA, we build congruence classes for all symbols having a same hash value. For functions, these classes are subdivided in WPA by argument type comparison. Each reference (a call or a variable reference) to another semantic item candidate is marked and stored for further congruence class reduction (similar algorithm as Value Numbering: www.cs.ucr.edu/~gupta/teaching/553-07/Papers/value.pdf). For every congruence class of functions with more than one semantic function, we load function body. Having this information, we can process complete semantic function equality and subdivide such congruence class. Read-only variable class members are also deeply compared. After that, we process Value numbering algorithm to do a final subdivision. Finally, all items belonging to a congruence class with more than one item are merged. Martin Changelog: 2014-06-13 Martin Liska mli...@suse.cz Jan Hubicka hubi...@ucw.cz * Makefile.in: New pass object file added. * common.opt: New -fipa-icf flag introduced. * doc/invoke.texi: Documentation enhanced for the pass. * lto-section-in.c: New LTO section for a summary created by IPA-ICF. * lto-streamer.h: New section name introduced. * opts.c: Optimization is added to -O2. * passes.def: New pass added. * timevar.def: New time var for IPA-ICF. * tree-pass.h: Pass construction function. * ipa-icf.h: New pass header file added. * ipa-icf.c: New pass source file added. Hi Jeff, I must agree that the implementation of the patch is quite big. 
Suggested split makes sense, I'll do it. You'll note many of my comments are do you need to You may in fact be handling that stuff correctly, they're just things I'd like you to verify are properly handled. If they're properly handled just say so :-) At a high level, I think this needs to be broken down a bit more. We've got two high level concepts in ipa-icf. One is all the equivalence testing the other is using that information for the icf optimization. Splitting out the equivalence testing seems like a good thing to do as there's other contexts where it would be useful. Overall I think you're on the right path and we just need to iterate a bit on this part of the patchset. @@ -7862,6 +7863,14 @@ it may significantly increase code size (see @option{--param ipcp-unit-growth=@var{value}}). This flag is enabled by default at @option{-O3}. +@item -fipa-icf +@opindex fipa-icf +Perform Identical Code Folding for functions and read-only variables. +Behavior is similar to Gold Linker ICF optimization. Symbols proved +as semantically equivalent are redirected to corresponding symbol. The pass +sensitively decides for usage of alias, thunk or local redirection. +This flag is enabled by default at @option{-O2}. So you've added this at -O2, what is the general compile-time impact? Would it make more sense to instead have it be part of -O3, particularly since ICF is rarely going to improve performance (sans icache issues). This was Honza's idea to put the optimization for -O2, I'll measure compile-time impact. + +/* Interprocedural Identical Code Folding for functions and + read-only variables. + + The goal of this transformation is to discover functions and read-only + variables which do have exactly the same semantics. + + In case of functions, + we could either create a virtual clone or do a simple function wrapper + that will call equivalent function. If the function is just locally visible, + all function calls can be redirected. 
For read-only variables, we create + aliases if possible. + + Optimization pass arranges as follows: + 1) All functions and read-only variables are visited and internal + data structure, either sem_function or sem_variables is created. + 2) For every symbol from the previoues step, VAR_DECL and FUNCTION_DECL are + saved and matched to corresponding sem_items. s/previoues/previous/ + 3) These declaration are ignored for equality check and are solved + by Value Numbering algorithm published by Alpert, Zadeck in 1992. + 4) We compute hash value for each symbol. + 5) Congruence classes are created based on hash value. If hash value are + equal, equals function is called and symbols are deeply compared. + We must prove that all SSA names, declarations and other items + correspond. + 6) Value Numbering is executed for these classes.
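The first two stages described for the pass (bucket candidates by a cheap hash, then split each bucket by a deep comparison) can be sketched as follows. This is only an illustration of that idea under simplifying assumptions: struct sem_item here is a hypothetical two-field stand-in, "deep comparison" is reduced to a string compare of a body text, and the later value-numbering refinement is omitted entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical candidate: a precomputed hash plus a textual body that
   stands in for the real function or variable contents.  */
struct sem_item
{
  unsigned hash;
  const char *body;
};

/* Assign a congruence class to each of the N items and return the
   number of classes.  Two items share a class only when their hashes
   match (cheap filter) and their bodies compare equal (deep check).
   Quadratic on purpose, to keep the sketch short.  */
static int
assign_classes (const struct sem_item *items, size_t n, int *class_id)
{
  int next_class = 0;

  assert (items != NULL && class_id != NULL);
  for (size_t i = 0; i < n; i++)
    {
      class_id[i] = -1;
      for (size_t j = 0; j < i; j++)
        if (items[j].hash == items[i].hash
            && strcmp (items[j].body, items[i].body) == 0)
          {
            class_id[i] = class_id[j];
            break;
          }
      if (class_id[i] < 0)
        class_id[i] = next_class++;
    }
  return next_class;
}
```

Items that end up in a class with more than one member are the merge candidates; an equal hash with a different body simply opens a new class.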
Re: [PATCH] Fix PR c++/61537
On 06/26/2014 01:08 AM, Adam Butcher wrote: Do you want me to apply to 4.9 too? Please. Jason
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
On 06/26/2014 11:15 AM, Ed Smith-Rowland wrote: I, for one, would like gcc to bootstrap with c++11/c++14. I think we should be starting to shake down that path. I'm probably not alone in this. Agreed. On the other hand, I don't think c++-concepts branch should be the leader on this. We have our work cut out for us without fighting these bugs. Also agreed. Maybe a c++11-bootstrap branch could be started to work the c++1* bootstrap out. I don't think a separate branch is necessary; people can bootstrap with -std=c++11 locally and fix issues they find on the trunk. As for flags. I vote for concepts switched on for -std=c++1y. As for -fconcepts turning on c++1y I'm less sure. We could allow concepts for C++11 (I don't think c++98 would work because of constexpr and maybe new template syntax). I hadn't thought about that. Personally I leave -std=c++14 and use all the things... ;-) -fconcepts should not be implied by -std=c++1[4y], because concepts are not part of C++14. We could add -std=c++1z and add it to that, though. I lean weakly against having -fconcepts imply a particular -std level, but it should definitely require c++11 or higher. Jason
Re: [patch] c++/58051 Implement Core 1579
OK. Jason
Re: [PATCH] Improve -fdump-tree-all efficiency
On Thu, Jun 26, 2014 at 9:42 AM, Teresa Johnson tejohn...@google.com wrote: * c-family/c-common.h (get_dump_info): Declare. * c-family/c-gimplify.c (c_genericize): Use saved dump files. * c-family/c-opts.c (c_common_parse_file): Begin and end dumps once around parsing invocation. (get_dump_info): New function. * cp/class.c (dump_class_hierarchy): Use saved dump files. (dump_vtable): Ditto. (dump_vtt): Ditto. Looks fine. Diego.
Re: [PATCH] C++ thunk section names
Hi Honza, Could you review this patch when you find time? Thanks Sri On Tue, Jun 17, 2014 at 10:42 AM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, Jun 9, 2014 at 3:54 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, May 19, 2014 at 11:25 AM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, Apr 17, 2014 at 10:41 AM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Wed, Feb 5, 2014 at 4:31 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, I would like this patch reviewed and considered for commit when Stage 1 is active again. Patch Description: A C++ thunk's section name is set to be the same as the original function's section name for which the thunk was created in order to place the two together. This is done in cp/method.c in function use_thunk. However, with function reordering turned on, the original function's section name can change to something like .text.hot.original or .text.unlikely.original in function default_function_section in varasm.c based on the node count of that function. The thunk function's section name is not updated to match the original here, and it is also not always correct to do so, as the original function can be hotter than the thunk. I have created a patch to not name the thunk function's section to be the same as the original function when function reordering is enabled. Thanks Sri
Re: [AArch64,PATCH] Refactor acquire/release determination into output template
On Tue, Jun 3, 2014 at 5:07 PM, Jones, Joel joel.jo...@caviumnetworks.com wrote: There is duplicate code for determining whether a load or store instruction needs acquire or release semantics. This patch removes the duplicated code and uses a modifying operator to output a/l instead. Since the testsuite already contains tests for the atomic functions, no new testcases are needed. OK? Built and tested for aarch64-elf using Cavium's internal simulator with no regressions. Ping? Thanks, Joel Jones ChangeLog: * config/aarch64/aarch64.c (aarch64_print_operand): Add 'Q' and 'R' operator modifiers. * config/aarch64/atomics.md (atomic_loadmode): Use 'Q' instead of returning a different template for acquire. (aarch64_load_exclusive): Likewise. (atomic_storemode): Use 'R' instead of returning a different template for release. (aarch64_store_exclusive): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index c2f6c4f..56152a0 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -3931,6 +3931,50 @@ aarch64_print_operand (FILE *f, rtx x, char code) output_addr_const (asm_out_file, x); break; +case 'Q': + { +/* Print a if memory model requires ac'Q'uire semantics */ +if (GET_CODE (x) != CONST_INT) + { +output_operand_lossage (invalid operand for '%%%c', code); +return; + } +enum memmodel model = (enum memmodel) INTVAL (x); +bool is_acq = false; +switch (model) + { +default: is_acq = true; break; +case MEMMODEL_RELAXED: +case MEMMODEL_CONSUME: +case MEMMODEL_RELEASE: break; + } +if (is_acq) + fputc ('a', f); + } + break; + +case 'R': + { +/* Print l if memory model requires 'R'elease semantics */ +if (GET_CODE (x) != CONST_INT) + { +output_operand_lossage (invalid operand for '%%%c', code); +return; + } +enum memmodel model = (enum memmodel) INTVAL (x); +bool is_rel = false; +switch (model) + { +default: is_rel = true; break; +case MEMMODEL_RELAXED: +case MEMMODEL_CONSUME: +case MEMMODEL_ACQUIRE: break; + } +if (is_rel) + 
fputc ('l', f); + } + break; + default: output_operand_lossage (invalid operand prefix '%%%c', code); return; diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md index bffa465..55ba918 100644 --- a/gcc/config/aarch64/atomics.md +++ b/gcc/config/aarch64/atomics.md @@ -259,15 +259,7 @@ (match_operand:SI 2 const_int_operand)] ;; model UNSPECV_LDA))] - { -enum memmodel model = (enum memmodel) INTVAL (operands[2]); -if (model == MEMMODEL_RELAXED - || model == MEMMODEL_CONSUME - || model == MEMMODEL_RELEASE) - return ldratomic_sfx\t%w0, %1; -else - return ldaratomic_sfx\t%w0, %1; - } + ld%Q2ratomic_sfx\t%w0, %1 ) (define_insn atomic_storemode @@ -277,15 +269,7 @@ (match_operand:SI 2 const_int_operand)] ;; model UNSPECV_STL))] - { -enum memmodel model = (enum memmodel) INTVAL (operands[2]); -if (model == MEMMODEL_RELAXED - || model == MEMMODEL_CONSUME - || model == MEMMODEL_ACQUIRE) - return stratomic_sfx\t%w1, %0; -else - return stlratomic_sfx\t%w1, %0; - } + st%R2ratomic_sfx\t%w1, %0 ) (define_insn aarch64_load_exclusivemode @@ -296,15 +280,7 @@ (match_operand:SI 2 const_int_operand)] UNSPECV_LX)))] - { -enum memmodel model = (enum memmodel) INTVAL (operands[2]); -if (model == MEMMODEL_RELAXED - || model == MEMMODEL_CONSUME - || model == MEMMODEL_RELEASE) - return ldxratomic_sfx\t%w0, %1; -else - return ldaxratomic_sfx\t%w0, %1; - } + ld%Q2xratomic_sfx\t%w0, %1 ) (define_insn aarch64_load_exclusivemode @@ -314,15 +290,7 @@ (match_operand:SI 2 const_int_operand)] UNSPECV_LX))] - { -enum memmodel model = (enum memmodel) INTVAL (operands[2]); -if (model == MEMMODEL_RELAXED - || model == MEMMODEL_CONSUME - || model == MEMMODEL_RELEASE) - return ldxr\t%w0, %1; -else - return ldaxr\t%w0, %1; - } + ld%Q2xr\t%w0, %1 ) (define_insn aarch64_store_exclusivemode @@ -334,15 +302,7 @@ (match_operand:SI 3 const_int_operand)] UNSPECV_SX))] - { -enum memmodel model = (enum memmodel) INTVAL (operands[3]); -if (model == MEMMODEL_RELAXED - || model == 
MEMMODEL_CONSUME - || model == MEMMODEL_ACQUIRE) - return stxratomic_sfx\t%w0, %w2, %1; -else
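The idea behind the 'Q' and 'R' modifiers (derive an optional "a" or "l" letter from the memory model, then splice it into one mnemonic template instead of returning different templates) can be tried outside the compiler. The sketch below is standalone and hypothetical: enum mem_order and the helper names are invented here, and it builds the mnemonic with string functions rather than GCC's operand-printing machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local memory-order enumeration; the names follow the
   MEMMODEL_* values in the patch, the numeric values are arbitrary.  */
enum mem_order
{
  ORDER_RELAXED,
  ORDER_CONSUME,
  ORDER_ACQUIRE,
  ORDER_RELEASE,
  ORDER_ACQ_REL,
  ORDER_SEQ_CST
};

/* Same decision as the 'Q' case: everything except relaxed, consume
   and release needs acquire semantics.  */
static int
needs_acquire (enum mem_order m)
{
  switch (m)
    {
    case ORDER_RELAXED:
    case ORDER_CONSUME:
    case ORDER_RELEASE:
      return 0;
    default:
      return 1;
    }
}

/* Same decision as the 'R' case: everything except relaxed, consume
   and acquire needs release semantics.  */
static int
needs_release (enum mem_order m)
{
  switch (m)
    {
    case ORDER_RELAXED:
    case ORDER_CONSUME:
    case ORDER_ACQUIRE:
      return 0;
    default:
      return 1;
    }
}

/* Build "ldr" or "ldar" in BUF, the effect of "ld%Q2r".  */
static void
load_mnemonic (char *buf, size_t size, enum mem_order m)
{
  assert (buf != NULL && size >= sizeof "ldar");
  strcpy (buf, "ld");
  if (needs_acquire (m))
    strcat (buf, "a");
  strcat (buf, "r");
}

/* Build "str" or "stlr" in BUF, the effect of "st%R2r".  */
static void
store_mnemonic (char *buf, size_t size, enum mem_order m)
{
  assert (buf != NULL && size >= sizeof "stlr");
  strcpy (buf, "st");
  if (needs_release (m))
    strcat (buf, "l");
  strcat (buf, "r");
}
```

Keeping the decision in one helper per direction is what lets each insn pattern collapse to a single template string, as the patch does.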
Re: [PATCH] Add missing -fdump-* options
On Thu, Jun 26, 2014 at 12:40 AM, Richard Biener richard.guent...@gmail.com wrote: On Wed, Jun 25, 2014 at 4:21 PM, Teresa Johnson tejohn...@google.com wrote: On Tue, May 13, 2014 at 8:19 AM, Xinliang David Li davi...@google.com wrote: On Tue, May 13, 2014 at 1:39 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, May 9, 2014 at 5:54 PM, Teresa Johnson tejohn...@google.com wrote: I discovered that the support for the documented -fdump-* options optimized, missed, note and optall was missing. Added that and fixed a minor typo in the documentation. Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk? I'm not sure they were intended for user-consumption. ISTR they are just an implementation detail exposed by -fopt-info-X (which is where they are documented). The typo fix is ok, also adding a comment before the dump flags definition to the above fact. David, do I remember correctly? I remember we talked about content filtering dump flags. Things like -fdump-xxx-ir -- dump IR only -fdump-xxx-transformation -- optimization note -fdump-xxx-debug -- other debug traces Other than that, now I think 'details' and 'all' seem redundant. 'verbose' flag/modifier can achieve the same effect depending on the context. -fdump-xxx-ir-verbose -- dump IR, and turn on IR modifiers such as vops, lineno, etc -fdump-xxx-transformation-verbose -- dump transformations + missed optimizations + notes -fdump-xxx-debug-verbose -- turn on detailed trace. The above proposal seems fine to me as a longer-term direction, but also seems somewhat orthogonal to the issue my patch is trying to solve in the short term, namely inconsistent documentation and behavior: 1) optimized, missed, note and optall are documented as being sub-options for -fdump-tree-* in doc/invoke.texi, but not implemented.
2) optimized, missed, note and optall are however enabled via -fdump-tree-all

Could we at least fix these issues in the short term, as it doesn't affect the documented behavior (but rather adds the documented behavior)?

Sure. Richard.

Thanks, retested and committed as r212040. Teresa

Thanks, Teresa

thanks, David

Thanks, Richard.

Thanks, Teresa

2014-05-09 Teresa Johnson tejohn...@google.com

* doc/invoke.texi: Fix typo.
* dumpfile.c: Add support for documented -fdump-* options optimized/missed/note/optall.

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 210157)
+++ doc/invoke.texi (working copy)
@@ -6278,7 +6278,7 @@ passes).
 @item missed
 Enable showing missed optimization information (only available in certain
 passes).
-@item notes
+@item note
 Enable other detailed optimization information (only available in
 certain passes).
 @item =@var{filename}
Index: dumpfile.c
===================================================================
--- dumpfile.c (revision 210157)
+++ dumpfile.c (working copy)
@@ -107,6 +107,10 @@ static const struct dump_option_value_info dump_op
   {"nouid", TDF_NOUID},
   {"enumerate_locals", TDF_ENUMERATE_LOCALS},
   {"scev", TDF_SCEV},
+  {"optimized", MSG_OPTIMIZED_LOCATIONS},
+  {"missed", MSG_MISSED_OPTIMIZATION},
+  {"note", MSG_NOTE},
+  {"optall", MSG_ALL},
   {"all", ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_TREE | TDF_RTL | TDF_IPA
             | TDF_STMTADDR | TDF_GRAPH | TDF_DIAGNOSTIC | TDF_VERBOSE
             | TDF_RHS_ONLY | TDF_NOUID | TDF_ENUMERATE_LOCALS | TDF_SCEV)},

-- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
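The fix in this thread amounts to adding rows to a name-to-flag table, so that a sub-option such as note becomes recognised. A minimal sketch of that table-lookup pattern is shown below. The flag values and the names FLAG_*, struct dump_option and dump_option_flag are hypothetical and chosen for this example; they are not the TDF_* or MSG_* definitions from dumpfile.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for the four message kinds.  */
enum
{
  FLAG_OPTIMIZED = 1 << 0,
  FLAG_MISSED = 1 << 1,
  FLAG_NOTE = 1 << 2,
  FLAG_OPTALL = FLAG_OPTIMIZED | FLAG_MISSED | FLAG_NOTE
};

/* One row per recognised sub-option name.  */
struct dump_option
{
  const char *name;
  unsigned value;
};

/* NULL-terminated table; a name missing from this table is simply not
   recognised, which is the situation the patch addresses.  */
static const struct dump_option dump_options[] = {
  { "optimized", FLAG_OPTIMIZED },
  { "missed", FLAG_MISSED },
  { "note", FLAG_NOTE },
  { "optall", FLAG_OPTALL },
  { NULL, 0 }
};

/* Return the flag bits for NAME, or 0 when NAME is not in the table.  */
static unsigned
dump_option_flag (const char *name)
{
  assert (name != NULL);
  for (const struct dump_option *p = dump_options; p->name != NULL; p++)
    if (strcmp (p->name, name) == 0)
      return p->value;
  return 0;
}
```

The lookup is exact, so the documentation typo corrected in the same patch matters: "notes" would not match a table entry spelled "note".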
Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations
Hi Uros, Could you please review this patch? Thanks Sri On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam tmsri...@google.com wrote: Patch Updated. Sri On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam tmsri...@google.com wrote:

Optimize access to globals with -fpie, x86_64 only: Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module using the GOT. This is two instructions, one to get the address of the global from the GOT and the other to get the value. If it turns out that the global gets defined in the executable at link-time, it still needs to go through the GOT as it is too late then to generate a direct access.

Examples:

foo.cc
--
int a_glob;
int main () {
  return a_glob; // defined in this file
}

With -O2 -fpie -pie, the generated code directly accesses the global via PC-relative insn:

5e0 main:
  mov 0x165a(%rip),%eax # 1c40 a_glob

foo.cc
--
extern int a_glob;
int main () {
  return a_glob; // defined in another file
}

With -O2 -fpie -pie, the generated code accesses global via GOT using two memory loads:

6f0 main:
  mov 0x1609(%rip),%rax # 1d00 _DYNAMIC+0x230
  mov (%rax),%eax

This is true even if in the latter case the global was defined in the executable through a different file. Some experiments on google benchmarks show that the extra memory loads affect performance by 1% to 5%.

Solution - Copy Relocations: When the linker supports copy relocations, GCC can always assume that the global will be defined in the executable. For globals that are truly extern (come from shared objects), the linker will create copy relocations and have them defined in the executable. Result is that no global access needs to go through the GOT and hence improves performance.
This patch to the gold linker: https://sourceware.org/ml/binutils/2014-05/msg00092.html submitted recently allows gold to generate copy relocations for -pie mode when necessary. I have added option -mld-pie-copyrelocs which when combined with -fpie would do this. Note that the BFD linker does not support pie copyrelocs yet and this option cannot be used there. Please review. ChangeLog: * config/i386/i386.opt (mld-pie-copyrelocs): New option. * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this address is still legitimate in the presence of copy relocations and -fpie. * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test. * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test. Patch attached. Thanks Sri
C++ PATCH to add -std=c++1[7z]
Now that we're past C++14, C++17(?) features are starting to be added to the compiler, so we ought to have a switch for them. Tested x86_64-pc-linux-gnu, applying to trunk. commit a4480bed3c7aca47203e910dec52d80d61b96b2e Author: Jason Merrill ja...@redhat.com Date: Thu Jun 26 12:57:07 2014 -0400 * c-common.h (enum cxx_dialect): Add cxx1z. * c.opt (std=c++1z, std=c++17, std=gnu++1z, std=gnu++17): New. * c-opts.c (c_common_handle_option, set_std_cxx1z): Handle it. diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 6bf4051..cd8e42e 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -640,8 +640,10 @@ enum cxx_dialect { /* C++11 */ cxx0x, cxx11 = cxx0x, - /* C++1y (C++17?) */ - cxx1y + /* C++1y (C++14?) */ + cxx1y, + /* C++1z (C++17?) */ + cxx1z }; /* The C++ dialect being used. C++98 is the default. */ diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index fbbc80e..2e47676 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -109,6 +109,7 @@ static void handle_OPT_d (const char *); static void set_std_cxx98 (int); static void set_std_cxx11 (int); static void set_std_cxx1y (int); +static void set_std_cxx1z (int); static void set_std_c89 (int, int); static void set_std_c99 (int); static void set_std_c11 (int); @@ -695,6 +696,16 @@ c_common_handle_option (size_t scode, const char *arg, int value, } break; +case OPT_std_c__1z: +case OPT_std_gnu__1z: + if (!preprocessing_asm_p) + { + set_std_cxx1z (code == OPT_std_c__1z /* ISO */); + if (code == OPT_std_c__1z) + cpp_opts-ext_numeric_literals = 0; + } + break; + case OPT_std_c90: case OPT_std_iso9899_199409: if (!preprocessing_asm_p) @@ -1541,6 +1552,20 @@ set_std_cxx1y (int iso) cxx_dialect = cxx1y; } +/* Set the C++ 201z draft standard (without GNU extensions if ISO). */ +static void +set_std_cxx1z (int iso) +{ + cpp_set_lang (parse_in, iso ? 
CLK_CXX1Y: CLK_GNUCXX1Y); + flag_no_gnu_keywords = iso; + flag_no_nonansi_builtin = iso; + flag_iso = iso; + /* C++11 includes the C99 standard library. */ + flag_isoc94 = 1; + flag_isoc99 = 1; + cxx_dialect = cxx1z; +} + /* Args to -d specify what to dump. Silently ignore unrecognized options; they may be aimed at toplev.c. */ static void diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 73abd26..1d02bae 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1414,6 +1414,13 @@ Conform to the ISO 2014(?) C++ draft standard (experimental and incomplete suppo std=c++14 C++ ObjC++ Alias(std=c++1y) Undocumented +std=c++1z +C++ ObjC++ +Conform to the ISO 2017(?) C++ draft standard (experimental and incomplete support) + +std=c++17 +C++ ObjC++ Alias(std=c++1z) Undocumented + std=c11 C ObjC Conform to the ISO 2011 C standard (experimental and incomplete support) @@ -1458,11 +1465,18 @@ Deprecated in favor of -std=gnu++11 std=gnu++1y C++ ObjC++ -Conform to the ISO 201y(7?) C++ draft standard with GNU extensions (experimental and incomplete support) +Conform to the ISO 201y(4?) C++ draft standard with GNU extensions (experimental and incomplete support) std=gnu++14 C++ ObjC++ Alias(std=gnu++1y) Undocumented +std=gnu++1z +C++ ObjC++ +Conform to the ISO 201z(7?) C++ draft standard with GNU extensions (experimental and incomplete support) + +std=gnu++17 +C++ ObjC++ Alias(std=gnu++1y) Undocumented + std=gnu11 C ObjC Conform to the ISO 2011 C standard with GNU extensions (experimental and incomplete support)
C++ PATCH to implement N3994 (range-based for: the next generation)
N3994 proposes that people writing a range-based for should be able to leave out the type and have it default to auto. I expect the proposal to be voted into the working paper at the November standards meeting. The second patch changes the existing range-for diagnostic from error to pedwarn, for consistency with other C++11 features. Tested x86_64-pc-linux-gnu, applying to trunk. commit 00389876d06b03b2550f018e3f96a7b5525c9f38 Author: Jason Merrill ja...@redhat.com Date: Tue Jun 24 06:15:02 2014 -0400 N3994 Ranged-based for-loops: The Next Generation * parser.c (cp_lexer_nth_token_is): New. (cp_parser_for_init_statement): Allow for (id : init). diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index c440c99..426dca4 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -892,6 +892,12 @@ cp_lexer_next_token_is_keyword (cp_lexer* lexer, enum rid keyword) } static inline bool +cp_lexer_nth_token_is (cp_lexer* lexer, size_t n, enum cpp_ttype type) +{ + return cp_lexer_peek_nth_token (lexer, n)-type == type; +} + +static inline bool cp_lexer_nth_token_is_keyword (cp_lexer* lexer, size_t n, enum rid keyword) { return cp_lexer_peek_nth_token (lexer, n)-keyword == keyword; @@ -10607,6 +10613,23 @@ cp_parser_for_init_statement (cp_parser* parser, tree *decl) bool is_range_for = false; bool saved_colon_corrects_to_scope_p = parser-colon_corrects_to_scope_p; + if (cp_lexer_next_token_is (parser-lexer, CPP_NAME) + cp_lexer_nth_token_is (parser-lexer, 2, CPP_COLON)) + { + /* N3994 -- for (id : init) ... */ + if (cxx_dialect cxx1z) + pedwarn (input_location, 0, range-based for loop without a + type-specifier only available with + -std=c++1z or -std=gnu++1z); + tree name = cp_parser_identifier (parser); + tree type = cp_build_reference_type (make_auto (), /*rval*/true); + *decl = build_decl (input_location, VAR_DECL, name, type); + pushdecl (*decl); + cp_lexer_consume_token (parser-lexer); + return true; + } + + /* A colon is used in range-based for. 
*/ parser-colon_corrects_to_scope_p = false; /* We're going to speculatively look for a declaration, falling back diff --git a/gcc/testsuite/g++.dg/cpp1z/range-for1.C b/gcc/testsuite/g++.dg/cpp1z/range-for1.C new file mode 100644 index 000..7e6d055 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/range-for1.C @@ -0,0 +1,12 @@ +// { dg-options -std=c++1z -pedantic-errors } + +extern C int printf (const char *, ...); +#include initializer_list + +int main() +{ + for (i : {1,2}) +{ + printf (%d , i); +} +} commit 90ba192ca14292be71459b1ca8a85aadfe9832e1 Author: Jason Merrill ja...@redhat.com Date: Thu Jun 26 13:26:35 2014 -0400 * parser.c (cp_parser_for_init_statement): Change range-for error to pedwarn. diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 426dca4..a7edd41 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -10647,9 +10647,9 @@ cp_parser_for_init_statement (cp_parser* parser, tree *decl) is_range_for = true; if (cxx_dialect cxx11) { - error_at (cp_lexer_peek_token (parser-lexer)-location, - range-based %for% loops are not allowed - in C++98 mode); + pedwarn (cp_lexer_peek_token (parser-lexer)-location, 0, + range-based %for% loops only available with + -std=c++11 or -std=gnu++11); *decl = error_mark_node; } } diff --git a/gcc/testsuite/g++.dg/cpp0x/range-for9.C b/gcc/testsuite/g++.dg/cpp0x/range-for9.C index c51cbf9..6a50ec3 100644 --- a/gcc/testsuite/g++.dg/cpp0x/range-for9.C +++ b/gcc/testsuite/g++.dg/cpp0x/range-for9.C @@ -1,7 +1,6 @@ // Test for range-based for loop error in C++98 mode -// { dg-do compile } -// { dg-options -std=c++98 } +// { dg-do compile { target { ! c++11 } } } void test() {
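As a rough illustration of what the first patch does: it gives the loop variable the type auto&& (via cp_build_reference_type (make_auto (), /*rval*/true)), so the shorthand for (i : {1,2}) is handled like the explicit C++11 spelling. The sketch below uses only that explicit spelling, since the shorthand itself is accepted only by a compiler carrying this patch; the function name is made up for the example.

```cpp
#include <initializer_list>

// Hypothetical helper: sums a braced list using the explicit form
// `for (auto&& i : ...)`, which is the declaration the patch synthesizes
// for the N3994 shorthand `for (i : ...)`.
int sum_with_explicit_auto()
{
  int total = 0;
  for (auto&& i : {1, 2, 3})
    total += i;
  return total;
}
```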
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
On 06/26/2014 12:23 PM, Jason Merrill wrote: We could add -std=c++1z and add it to that, though. Added. Jason
Re: [PATCH 3/5] IPA ICF pass
Jeff, thanks for the review! I did some passes over the patch before it got to the ML, so I am happy to have an independent opinion. +@item -fipa-icf +@opindex fipa-icf +Perform Identical Code Folding for functions and read-only variables. +Behavior is similar to Gold Linker ICF optimization. Symbols proved +as semantically equivalent are redirected to corresponding symbol. The pass +sensitively decides for usage of alias, thunk or local redirection. +This flag is enabled by default at @option{-O2}. So you've added this at -O2, what is the general compile-time impact? Would it make more sense to instead have it be part of -O3, particularly since ICF is rarely going to improve performance (sans icache issues). I think code size optimizations that do not sacrifice any (or too much) performance are generally very welcome at -O2. Compared to LLVM and Microsoft compilers we are on the code bloat side at -O2. http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html has some numbers for -O2 GCC/LLVM. I believe this is a result of tuning for relatively small benchmarks (SPECS) and I hope we could revisit -O2 for code size considerations for 4.10 somewhat. If tuned well, ICF has no reason to be expensive wrt compile time. So let's shoot for that. The considerable downside of enabling ICF IMO should be only the disturbing effect on debug info. + return true; +} Isn't this really checking for equivalence? do correspond seems awkward here. The function makes the names equivalent on first invocation for a given name and later checks that this tentative equivalence holds. Not sure what the best name for it is (originally it was verify, which did not sound right to me either) + +/* Verification function for edges E1 and E2. */ + +bool +func_checker::compare_edge (edge e1, edge e2) +{ + if (e1->flags != e2->flags) +return false; Presumably there's no flags we can safely ignore. So absolute equality seems reasonable here. 
Yep +/* Compare two types if are same aliases in case of strict aliasing + is enabled. */ +bool +sem_item::compare_for_aliasing (tree t1, tree t2) +{ + if (flag_strict_aliasing) +{ + alias_set_type s1 = get_deref_alias_set (TREE_TYPE (t1)); + alias_set_type s2 = get_deref_alias_set (TREE_TYPE (t2)); + + return s1 == s2; +} + + return true; +} Is returning TRUE really the conservatively correct thing to do in the absence of aliasing information? Isn't that case really I don't know in which case the proper return value is FALSE? I think with -fno-strict-aliasing the set should be 0 (Richi?) and thus we can compare for equality. We probably can be on agressive side and let 0 alias set prevail the non-0. But that can be done incrementally. We also need to match type inheritance equality for polymorphic types. I will add function for that into ipa-devirt. +/* References independent hash function. */ + +hashval_t +sem_function::get_hash (void) +{ + if(!hash) +{ + hash = 177454; /* Random number for function type. */ + + hash = iterative_hash_object (arg_count, hash); + hash = iterative_hash_object (bb_count, hash); + hash = iterative_hash_object (edge_count, hash); + hash = iterative_hash_object (cfg_checksum, hash); Does CFG_CHECKSUM encompass the bb/edge counts? It is one used by profiling code to match profiles, so it should. +SE_EXIT_FALSE(); + + if (!equals_wpa (item)) +return false; + + /* Checking function arguments. */ + tree decl1 = DECL_ATTRIBUTES (decl); + tree decl2 = DECL_ATTRIBUTES (compared_func-decl); So are there any attributes we can safely ignore? Probably not. However, we ought to handle the case where the attributes appear in different orders. There are few, like we can ignore weak or visibility attribute because we do produce alias with proper visibility anyway. 
My plan is to start removing those attributes from declarations once they are turned into a suitable representation in the symbol table (or for attributes like const/noreturn/pure where we have explicit decl flags). This will make our life a bit easier later, too. We probably then can whitelist some attributes, but I would deal with this later. +/* Returns cgraph_node. */ + +struct cgraph_node * +sem_function::get_node (void) +{ + return cgraph (node); +} + +/* Initialize semantic item by info reachable during LTO WPA phase. */ + +void +sem_function::init_wpa (void) +{ + parse_tree_args (); +} inline? Worth or not worth the headache? We ought to autoinline simple wrappers even at -Os (for size) (I am not against explicit inline keywords here though) + +bool +sem_function::compare_bb (sem_bb_t *bb1, sem_bb_t *bb2, tree func1, tree func2) So this routine walks down the gimple statements and compares them for equality. Would it make sense to have the equality testing in gimple? That way if someone adds a new gimple code the places they need to
Re: [PATCH, alpha]: FIX PR61586, ICE in alpha_handle_trap_shadows
On 06/26/2014 02:43 AM, Uros Bizjak wrote: 2014-06-26 Uros Bizjak ubiz...@gmail.com PR target/61586 * config/alpha/alpha.c (alpha_handle_trap_shadows): Handle BARRIER RTX. testsuite/ChangeLog: 2014-06-26 Uros Bizjak ubiz...@gmail.com PR target/61586 * gcc.target/alpha/pr61586.c: New test. Ok. Thanks! r~
[C++ Patch] Small compound-literal parsing clean up
Hi, should we do something like this? Tested x86_64-linux. Thanks, Paolo. 2014-06-26 Paolo Carlini paolo.carl...@oracle.com * parser.c (cp_parser_compound_literal_p): New. (cp_parser_postfix_expression, cp_parser_sizeof_operand): Use it. Index: parser.c === --- parser.c(revision 212052) +++ parser.c(working copy) @@ -5609,6 +5609,30 @@ cp_parser_qualifying_entity (cp_parser *parser, return scope; } +/* Return true if we are looking at a compound-literal, false otherwise. */ + +static bool +cp_parser_compound_literal_p (cp_parser *parser) +{ + /* Consume the `('. */ + cp_lexer_consume_token (parser-lexer); + + cp_lexer_save_tokens (parser-lexer); + + /* Skip tokens until the next token is a closing parenthesis. + If we find the closing `)', and the next token is a `{', then + we are looking at a compound-literal. */ + bool compound_literal_p += (cp_parser_skip_to_closing_parenthesis (parser, false, false, + /*consume_paren=*/true) +cp_lexer_next_token_is (parser-lexer, CPP_OPEN_BRACE)); + + /* Roll back the tokens we skipped. */ + cp_lexer_rollback_tokens (parser-lexer); + + return compound_literal_p; +} + /* Parse a postfix-expression. postfix-expression: @@ -5917,25 +5941,12 @@ cp_parser_postfix_expression (cp_parser *parser, b cp_lexer_next_token_is (parser-lexer, CPP_OPEN_PAREN)) { tree initializer = NULL_TREE; - bool compound_literal_p; cp_parser_parse_tentatively (parser); - /* Consume the `('. */ - cp_lexer_consume_token (parser-lexer); /* Avoid calling cp_parser_type_id pointlessly, see comment in cp_parser_cast_expression about c++/29234. */ - cp_lexer_save_tokens (parser-lexer); - - compound_literal_p - = (cp_parser_skip_to_closing_parenthesis (parser, false, false, - /*consume_paren=*/true) - cp_lexer_next_token_is (parser-lexer, CPP_OPEN_BRACE)); - - /* Roll back the tokens we skipped. 
*/ - cp_lexer_rollback_tokens (parser-lexer); - - if (!compound_literal_p) + if (!cp_parser_compound_literal_p (parser)) cp_parser_simulate_error (parser); else { @@ -23966,31 +23977,15 @@ cp_parser_sizeof_operand (cp_parser* parser, enum if (cp_lexer_next_token_is (parser-lexer, CPP_OPEN_PAREN)) { tree type = NULL_TREE; - bool compound_literal_p; /* We can't be sure yet whether we're looking at a type-id or an expression. */ cp_parser_parse_tentatively (parser); - /* Consume the `('. */ - cp_lexer_consume_token (parser-lexer); /* Note: as a GNU Extension, compound literals are considered postfix-expressions as they are in C99, so they are valid arguments to sizeof. See comment in cp_parser_cast_expression for details. */ - cp_lexer_save_tokens (parser-lexer); - /* Skip tokens until the next token is a closing parenthesis. -If we find the closing `)', and the next token is a `{', then -we are looking at a compound-literal. */ - compound_literal_p - = (cp_parser_skip_to_closing_parenthesis (parser, false, false, - /*consume_paren=*/true) - cp_lexer_next_token_is (parser-lexer, CPP_OPEN_BRACE)); - /* Roll back the tokens we skipped. */ - cp_lexer_rollback_tokens (parser-lexer); - /* If we were looking at a compound-literal, simulate an error -so that the call to cp_parser_parse_definitely below will -fail. */ - if (compound_literal_p) + if (cp_parser_compound_literal_p (parser)) cp_parser_simulate_error (parser); else {
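The helper factored out above is a plain look-ahead: remember the position, skip to the matching `)', test whether a `{' follows, then roll back. The sketch below shows the same idea on a character string instead of GCC's token stream. The function name and the whitespace handling are assumptions made for the example, not part of the parser.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the look-ahead, working on characters rather
// than cpp tokens.  POS is passed by value, so the caller's position is
// left untouched, which plays the role of cp_lexer_rollback_tokens.
bool looks_like_compound_literal(const std::string& s, std::size_t pos)
{
  if (pos >= s.size() || s[pos] != '(')
    return false;

  // Skip to the parenthesis that closes the one at POS.
  std::size_t i = pos + 1;
  int depth = 1;
  while (i < s.size() && depth > 0)
    {
      if (s[i] == '(')
        ++depth;
      else if (s[i] == ')')
        --depth;
      ++i;
    }
  if (depth != 0)
    return false;   // No closing parenthesis found.

  // A compound literal continues with an opening brace.
  while (i < s.size() && s[i] == ' ')
    ++i;
  return i < s.size() && s[i] == '{';
}
```

Both call sites in the patch use the result the same way: cp_parser_postfix_expression simulates an error when the answer is false, cp_parser_sizeof_operand when it is true.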
[PATCH, rs6000] Remove XFAIL of gfortran.dg/nint_2.f90 for powerpc64le
Hi, The test case gfortran.dg/nint_2.f90 is XFAILed for certain platforms because glibc produces wrong results in some cases. The relatively new powerpc64le-unknown-linux-gnu platform does not have this problem, but the wild-carding causes it to be XFAILed incorrectly. This patch changes the wild-carding to exclude it. Is this OK for trunk, 4.9, and 4.8? Thanks, Bill 2014-06-26 Bill Schmidt wschm...@linux.vnet.ibm.com * gfortran.dg/nint_2.f90: Don't XFAIL for powerpc64le-*-linux*. Index: gcc/testsuite/gfortran.dg/nint_2.f90 === --- gcc/testsuite/gfortran.dg/nint_2.f90 (revision 212046) +++ gcc/testsuite/gfortran.dg/nint_2.f90 (working copy) @@ -4,7 +4,8 @@ ! http://gcc.gnu.org/ml/fortran/2005-04/msg00139.html ! ! { dg-do run } -! { dg-xfail-run-if "PR 33271, math library bug" { powerpc-ibm-aix* powerpc*-*-linux* *-*-mingw* } { "-O0" } { "" } } +! { dg-xfail-run-if "PR 33271, math library bug" { powerpc-ibm-aix* powerpc-*-linux* powerpc64-*-linux* *-*-mingw* } { "-O0" } { "" } } +! Note that this doesn't fail on powerpc64le-*-linux*. real(kind=8) :: a integer(kind=8) :: i1, i2 real :: b
Re: testsuite allocators patch
On 26/06/2014 12:33, Jonathan Wakely wrote: The _GLIBCXX_USE_NOEXCEPT macro expands to nothing in C++03 mode, so you might as well omit it in the #else branch. OK for trunk if you make the tracker_allocator comment correct. Thanks!

Committed with:

  // An allocator facade that intercepts allocate/deallocate/construct/destroy
  // calls and track them through the tracker_allocator_counter class. This
  // class is templated on the target object type, but tracker isn't.
  template<typename T, typename Alloc = std::allocator<T> >
    class tracker_allocator : public Alloc

Thanks for feedback. François
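To make the "facade" idea in that comment concrete, here is a minimal sketch of an allocator that forwards to std::allocator and records byte counts in a separate, non-template counter. It is not the testsuite's tracker_allocator: the names alloc_counter and counting_allocator are invented for the example, it assumes C++11 or later so that std::allocator_traits supplies the remaining members, and it uses composition rather than deriving from Alloc to keep rebinding simple.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical non-template counter shared by every instantiation.
struct alloc_counter
{
  static std::size_t& allocated()   { static std::size_t n = 0; return n; }
  static std::size_t& deallocated() { static std::size_t n = 0; return n; }
};

// Hypothetical minimal allocator: intercepts allocate/deallocate, forwards
// the real work to std::allocator<T> and records the byte counts.
template<typename T>
  struct counting_allocator
  {
    typedef T value_type;

    counting_allocator() { }

    template<typename U>
      counting_allocator(const counting_allocator<U>&) { }

    T* allocate(std::size_t n)
    {
      T* p = std::allocator<T>().allocate(n);
      alloc_counter::allocated() += n * sizeof(T);
      return p;
    }

    void deallocate(T* p, std::size_t n)
    {
      std::allocator<T>().deallocate(p, n);
      alloc_counter::deallocated() += n * sizeof(T);
    }
  };

// Stateless allocators always compare equal.
template<typename T, typename U>
  bool operator==(const counting_allocator<T>&, const counting_allocator<U>&)
  { return true; }

template<typename T, typename U>
  bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&)
  { return false; }

// Returns true when a short-lived vector released every byte it allocated.
bool vector_roundtrip_balanced()
{
  {
    std::vector<int, counting_allocator<int> > v;
    for (int i = 0; i < 100; ++i)
      v.push_back(i);
  }
  return alloc_counter::allocated() != 0
         && alloc_counter::allocated() == alloc_counter::deallocated();
}
```

Keeping the counter outside the template is what lets a test compare totals across containers with different element types.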
[PATCH, rtl]: Teach _.barriers and _.eh_range passes to not split a call and its corresponding CALL_ARG_LOCATION note.
Hello! Attached patch is needed to fix PR 56858 [1], where alpha used NOTE_INSN_EH_REGION_BEG and NOTE_INSN_EH_REGION_END notes to handle its FP traps. After the last RTL exceptions rewrite, those notes are generated after mach_reorg pass, effectively rendering existing approach unusable. In PR 56858, a new target-dependent pass was introduced that moved the traps generation just after eh_ranges pass. trapb insns were emitted at the point of NOTE_INSN_EH_REGION_END notes, and this insertion broke compilation due to ICE in dwarf2out_var_location, at dwarf2out.c. The problem was that trapb insn split a call and its corresponding CALL_ARG_LOCATION note. The patch fixes the splitting problem by teaching _.barriers and _.eh_range passes to not split a call and its corresponding CALL_ARG_LOCATION note. Apparently, a scary comment in jump.c: /* Some old code expects exactly one BARRIER as the NEXT_INSN of a non-fallthru insn. This is not generally true, as multiple barriers may have crept in, or the BARRIER may be separated from the last real insn by one or more NOTEs. This simple pass moves barriers and removes duplicates so that the old code is happy. */ does not apply anymore, and the old code tolerates additional note after the call just fine. The problematic test (g++.dg/torture/stackalign/eh-vararg-1.C) now compiles on x86_64-pc-linux-gnu to following _.final dump: ... 
(insn:TI 115 114 116 11 (set (reg:DI 5 di) (reg/f:DI 0 ax [126])) eh-vararg-1.C:62 89 {*movdi_internal} (expr_list:REG_DEAD (reg/f:DI 0 ax [126]) (nil))) (call_insn:TI 116 115 203 11 (call (mem:QI (symbol_ref:DI (__cxa_throw) [flags 0x41] function_decl 0x7fcf18d9d900 __cxa_throw) [0 __cxa_throw S1 A8]) (const_int 0 [0])) eh-vararg-1.C:62 642 {*call} (expr_list:REG_DEAD (reg:DI 5 di) (expr_list:REG_DEAD (reg:DI 4 si) (expr_list:REG_DEAD (reg:DI 1 dx) (expr_list:REG_EH_REGION (const_int 1 [0x1]) (expr_list:REG_CALL_DECL (symbol_ref:DI (__cxa_throw) [flags 0x41] function_decl 0x7fcf18d9d900 __cxa_throw) (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (expr_list:REG_NORETURN (const_int 0 [0]) (nil (expr_list:DI (use (reg:DI 5 di)) (expr_list:DI (use (reg:DI 4 si)) (expr_list:DI (use (reg:DI 1 dx)) (nil) (note 203 116 216 (expr_list:REG_DEP_TRUE (concat:DI (reg:DI 4 si) (symbol_ref/i:DI (_ZTI1A) var_decl 0x7fcf18d9b360 _ZTI1A)) (expr_list:REG_DEP_TRUE (concat:DI (reg:DI 1 dx) (const_int 0 [0])) (nil))) NOTE_INSN_CALL_ARG_LOCATION) (note 216 203 117 0 NOTE_INSN_EH_REGION_END) (barrier 117 216 202) ... Previously, both the barrier and the EH_REGION_END note were emitted above CALL_ARG_LOCATION note. 2014-06-26 Uros Bizjak ubiz...@gmail.com * except.c (emit_note_eh_region_end): New helper function. (convert_to_eh_region_ranges): Use emit_note_eh_region_end to emit EH_REGION_END note. * jump.c (cleanup_barriers): Do not split a call and its corresponding CALL_ARG_LOCATION note. The patch was successfully bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} for all default languages, obj-c++ and go. Additionally, the patch was bootstrapped and regression tested on alpha-linux-gnu (configured with --host=alpha-linux-gnu --build=alpha-linux-gnu --target=alpha-linux-gnu to make the trapb insertion pass effective) together with an updated trapb insertion patch [2], also for all default languages, obj-c++ and go. OK for mainline? 
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56858 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56858#c15 Uros. Index: except.c === --- except.c(revision 212052) +++ except.c(working copy) @@ -2466,6 +2466,20 @@ add_call_site (rtx landing_pad, int action, int se return call_site_base + crtl-eh.call_site_record_v[section]-length () - 1; } +static rtx +emit_note_eh_region_end (rtx insn) +{ + rtx next = NEXT_INSN (insn); + + /* Make sure we do not split a call and its corresponding + CALL_ARG_LOCATION note. */ + if (next NOTE_P (next) + NOTE_KIND (next) == NOTE_INSN_CALL_ARG_LOCATION) +insn = next; + + return emit_note_after (NOTE_INSN_EH_REGION_END, insn); +} + /* Turn REG_EH_REGION notes back into NOTE_INSN_EH_REGION notes. The new note numbers will not refer to region numbers, but instead to call site entries. */ @@ -2544,8 +2558,8 @@ convert_to_eh_region_ranges (void) note = emit_note_before (NOTE_INSN_EH_REGION_BEG, first_no_action_insn_before_switch); NOTE_EH_HANDLER (note) = call_site; - note = emit_note_after (NOTE_INSN_EH_REGION_END, - last_no_action_insn_before_switch);
Re: testsuite allocators patch
Hi, I'm afraid something went badly wrong with this commit, I'm seeing tens of fails. See eg: https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg02439.html Paolo.
Re: [PATH] Intel offload library
Some observations on this code (the first is about integration in GCC, the rest about general good practice for writing robust C libraries for use on GNU/Linux, where failure to follow such practice is likely to lead to problems and security advisories in future, so if the code goes in without these being fixed then I think it would be desirable to have clear plans for fixing them before 4.10 branches):

* Don't duplicate the logic for what's a hosted POSIX system; refactor it to a common fragment in config/ (I guess it needs to be a shell script fragment there rather than an actual autoconf macro, since you're using that logic in configure.tgt which is itself such a fragment).

* There seems to be lots of code here that calls malloc or strdup (maybe other allocation functions, but at least those two) then dereferences the result without checking for errors. You need to check for errors and then either return an error status or throw an exception (depending on what the API of the function doing the allocation says is the way it should indicate errors). (There could always be other functions being called without error checking; for pretty much any standard C library function that can return error status, you need to handle errors appropriately.)

* Code here is using getenv for various purposes. If it's to be safe to use this functionality in setuid programs (and I see no obvious reason why setuid programs shouldn't be able to use such offloading), then secure_getenv should be used when available (glibc 2.17 and later), with the relevant configuration being determined in some safe way when setuid or those environment variables aren't set.

* Some code is using strtok. Is there a reason there can only ever be one thread in the process when that code runs? If not, you shouldn't be using strtok; strtok_r, as used elsewhere, is OK.

* Another thread issue: is there a reason there can only be one thread when you call fopen? If not, with glibc you should specify "e" in the fopen mode so that the file is opened with O_CLOEXEC, to avoid leaking a file descriptor if another thread creates a process while the file is open. Similarly, calls to open should use O_CLOEXEC when available, unless you need file descriptors to stay open across exec.

* Generally it's not obvious to me whether the code using strcat, sprintf etc. is safe against buffer overruns given arbitrary environment variables, valid but extreme function arguments, etc. - I don't know what the trust boundaries are in this code, but it would be a good idea to make explicit, in code doing such buffer manipulations that are not obviously safe, where the responsibility lies for safety (e.g. if the API for a function is that arguments outside a certain range yield undefined behavior, and it's clear that only outside that range can there be a buffer overrun, make clear in any relevant documentation / comments that this is the API, and ensure that callers coming with GCC respect that API).

* MESSAGE_TABLE_NAME contains a string using 0x91 and 0x92 Windows-1252 quotes. It's wrong to use those unconditionally. If you want to use non-ASCII messages you need to respect the user's locale (and if you do, supporting translated messages via gettext is a good idea, not just making quotes follow LC_CTYPE).

* I suspect your uses of PATH_MAX will cause trouble to anyone trying to build this code for Hurd, though I think that can be left to Hurd porters to fix. (More significant is that before blindly putting things in a buffer of size PATH_MAX you need a reason that buffer can't overrun - it's one thing if you know the path in question has already been successfully opened, another if you're building it up by pieces without an API saying the caller must ensure those pieces add up to no more than PATH_MAX in size, and appropriate code in the callers to ensure this.)

-- 
Joseph S. Myers jos...@codesourcery.com
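Three of the points above (checking strdup, preferring strtok_r to strtok, and the "e" fopen mode) can be sketched briefly. This is only an illustration under assumptions: the function names are invented, none of it comes from the offload library, and the "e" mode letter is a glibc extension that other C libraries may not honor.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: counts the non-empty ':'-separated fields of LIST.
// Returns -1 if the working copy cannot be allocated.
int count_fields(const char *list)
{
  char *copy = strdup(list);   // strtok_r modifies its input, so work on a copy.
  if (copy == NULL)
    return -1;                 // Report the failure instead of dereferencing NULL.

  int n = 0;
  char *save = NULL;           // Per-call state, so concurrent callers don't clash.
  for (char *tok = strtok_r(copy, ":", &save);
       tok != NULL;
       tok = strtok_r(NULL, ":", &save))
    ++n;

  free(copy);
  return n;
}

// Hypothetical helper: opens PATH for reading with close-on-exec requested
// through the glibc "e" mode letter.  The caller must check for NULL.
FILE *open_for_reading_cloexec(const char *path)
{
  return fopen(path, "re");
}
```

Returning an error status from count_fields is one of the two options the review mentions; code with an exception-based API would throw instead.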
[C/C++ PATCH] Implement -Wsizeof-array-argument (PR c/6940)
The following is a revamped patch for -Wsizeof-array-argument. Almost two months back there was an initial attempt by Prathamesh: https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00142.html, but that patch never made it in. This version implements the warning for both C and C++ FEs, adds more testing, enables the warning by default (I can move it to -Wall, of course), makes the warning work properly even for multidimensional arrays, etc. Its purpose is to detect suspicious usage of the sizeof operator on an array function parameter. A few years back I fixed exactly this kind of bug in elfutils - so -Wsizeof-array-argument might be indeed useful. (The warning didn't trigger during GCC bootstrap though.) Jason/Joseph, could you please look at the C++, resp. C FE parts? Tested x86_64-unknown-linux-gnu, ok for trunk? 2014-06-26 Marek Polacek pola...@redhat.com PR c/6940 * doc/invoke.texi: Document -Wsizeof-array-argument. c-family/ * c.opt (Wsizeof-array-argument): New option. c/ * c-decl.c (grokdeclarator): Set C_ARRAY_PARAMETER. * c-tree.h (C_ARRAY_PARAMETER): Define. * c-typeck.c (c_expr_sizeof_expr): Warn when using sizeof on an array function parameter. cp/ * cp-tree.h (DECL_ARRAY_PARAMETER_P): Define. * decl.c (grokdeclarator): Set DECL_ARRAY_PARAMETER_P. * typeck.c (cxx_sizeof_expr): Warn when using sizeof on an array function parameter. testsuite/ * c-c++-common/Wsizeof-pointer-memaccess1.c: Use -Wno-sizeof-array-argument. * c-c++-common/Wsizeof-pointer-memaccess2.c: Likewise. * g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Likewise. * gcc.dg/Wsizeof-pointer-memaccess1.c: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess1.C: Likewise. * g++.dg/torture/Wsizeof-pointer-memaccess2.C: Likewise. * gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Likewise. * c-c++-common/sizeof-array-argument.c: New test. * gcc.dg/vla-5.c: Add dg-warnings. ../libgomp/ * testsuite/libgomp.c/appendix-a/a.29.1.c (f): Add dg-warnings. 
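For readers unfamiliar with the pitfall, here is a small sketch of the pattern the warning targets. A parameter declared as an array is adjusted to a pointer, so sizeof on it yields the pointer size. The function names are invented for the example, and the reference-to-array alternative is a C++-only idiom shown for contrast, not something the patch adds. A compiler with this warning enabled is expected to diagnose the first function.

```cpp
#include <cstddef>

// The parameter looks like an array of 10 ints but is really an int *,
// so this returns sizeof (int *), not 10 * sizeof (int).
std::size_t size_via_array_parameter(int a[10])
{
  return sizeof a;
}

// Passing the array by reference keeps its type, so sizeof sees all
// N elements.
template<std::size_t N>
  std::size_t size_via_array_reference(int (&a)[N])
  {
    return sizeof a;
  }
```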
diff --git gcc/gcc/c-family/c.opt gcc/gcc/c-family/c.opt index 1d02bae..3d3cf14 100644 --- gcc/gcc/c-family/c.opt +++ gcc/gcc/c-family/c.opt @@ -526,6 +526,10 @@ Wsizeof-pointer-memaccess C ObjC C++ ObjC++ Var(warn_sizeof_pointer_memaccess) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about suspicious length parameters to certain string functions if the argument uses sizeof +Wsizeof-array-argument +C ObjC C++ ObjC++ Var(warn_sizeof_array_argument) Warning Init(1) +Warn when sizeof is applied on a parameter declared as an array + Wsuggest-attribute=format C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning Warn about functions which might be candidates for format attributes diff --git gcc/gcc/c/c-decl.c gcc/gcc/c/c-decl.c index def10a2..a1bc114 100644 --- gcc/gcc/c/c-decl.c +++ gcc/gcc/c/c-decl.c @@ -6092,6 +6092,7 @@ grokdeclarator (const struct c_declarator *declarator, if (decl_context == PARM) { tree promoted_type; + bool array_parameter_p = false; /* A parameter declared as an array of T is really a pointer to T. One declared as a function is really a pointer to a function. */ @@ -6113,6 +6114,7 @@ grokdeclarator (const struct c_declarator *declarator, attributes in parameter array declarator ignored); size_varies = false; + array_parameter_p = true; } else if (TREE_CODE (type) == FUNCTION_TYPE) { @@ -6137,6 +6139,7 @@ grokdeclarator (const struct c_declarator *declarator, PARM_DECL, declarator-u.id, type); if (size_varies) C_DECL_VARIABLE_SIZE (decl) = 1; + C_ARRAY_PARAMETER (decl) = array_parameter_p; /* Compute the type actually passed in the parmlist, for the case where there is no prototype. diff --git gcc/gcc/c/c-tree.h gcc/gcc/c/c-tree.h index 133930f..f97d0d5 100644 --- gcc/gcc/c/c-tree.h +++ gcc/gcc/c/c-tree.h @@ -66,6 +66,9 @@ along with GCC; see the file COPYING3. If not see /* For a FUNCTION_DECL, nonzero if it was an implicit declaration. 
*/ #define C_DECL_IMPLICIT(EXP) DECL_LANG_FLAG_2 (EXP) +/* For a PARM_DECL, nonzero if it was declared as an array. */ +#define C_ARRAY_PARAMETER(NODE) DECL_LANG_FLAG_0 (NODE) + /* For FUNCTION_DECLs, evaluates true if the decl is built-in but has been declared. */ #define C_DECL_DECLARED_BUILTIN(EXP) \ diff --git gcc/gcc/c/c-typeck.c gcc/gcc/c/c-typeck.c index b62e830..0b63e98 100644 --- gcc/gcc/c/c-typeck.c +++ gcc/gcc/c/c-typeck.c @@ -2731,6 +2731,16 @@ c_expr_sizeof_expr (location_t loc, struct c_expr expr) else { bool expr_const_operands = true; + + if (TREE_CODE (expr.value) == PARM_DECL + C_ARRAY_PARAMETER (expr.value)) + { + if (warning_at (loc, OPT_Wsizeof_array_argument, +
Re: testsuite allocators patch
On 26/06/14 23:21 +0200, Paolo Carlini wrote: Hi, I'm afraid something went badly wrong with this commit, I'm seeing tens of fails. See eg: https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg02439.html It seems that uneq_allocator is no longer copy constructible.
Re: [PATCH, rs6000] Remove XFAIL of gfortran.dg/nint_2.f90 for powerpc64le
On Thu, Jun 26, 2014 at 4:44 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, The test case gfortran.dg/nint_2.f90 is XFAILed for certain platforms because glibc produces wrong results in some cases. The relatively new powerpc64le-unknown-linux-gnu platform does not have this problem, but the wild-carding causes it to be XFAILed incorrectly. This patch changes the wild-carding to exclude it. Is this OK for trunk, 4.9, and 4.8? LGTM. Thanks, David
[PATCH] remove broken and redundant diagnostic in i386_pe_section_type_flags
From: Trevor Saunders tsaund...@mozilla.com

Hi,

While fixing up the hash_table patch's bustedness here I noticed the code doesn't make any sense. What it inserts into the hash table will never match what we try to look up in it. If you want to use hash_table or htab as a map you need to deal with the keys yourself; it doesn't do it for you.

varasm.c is the only caller of this target hook, and it correctly uses a htab to check whether the flags returned by the hook are the same as the flags it has for the section, and emits an error if not. Therefore, if we fixed this machinery it would only ever emit redundant errors, so it seems to make sense to get rid of it.

I don't have a setup to test Windows targets at hand, but I checked that I can build a compiler targeting x86_64-cygwin with this patch. OK if someone can really test it and it passes?

Trev

gcc/

	* config/i386/winnt.c (i386_pe_section_type_flags): Remove
	redundant diagnostic machinery.

diff --git a/gcc/config/i386/winnt.c b/gcc/config/i386/winnt.c
index 56cd1b2..8a5d982 100644
--- a/gcc/config/i386/winnt.c
+++ b/gcc/config/i386/winnt.c
@@ -469,19 +469,12 @@ i386_pe_reloc_rw_mask (void)
 unsigned int
 i386_pe_section_type_flags (tree decl, const char *name, int reloc)
 {
-  static hash_table<pointer_hash<unsigned int> > *htab = NULL;
   unsigned int flags;
-  unsigned int **slot;

   /* Ignore RELOC, if we are allowed to put relocated
      const data into read-only section.  */
   if (!flag_writable_rel_rdata)
     reloc = 0;

-  /* The names we put in the hashtable will always be the unique
-     versions given to us by the stringtable, so we can just use
-     their addresses as the keys.  */
-  if (!htab)
-    htab = new hash_table<pointer_hash<unsigned int> > (31);
   if (decl && TREE_CODE (decl) == FUNCTION_DECL)
     flags = SECTION_CODE;
@@ -499,19 +492,6 @@ i386_pe_section_type_flags (tree decl, const char *name, int reloc)
   if (decl && DECL_P (decl) && DECL_ONE_ONLY (decl))
     flags |= SECTION_LINKONCE;

-  /* See if we already have an entry for this section.  */
-  slot = htab->find_slot ((unsigned int *)name, INSERT);
-  if (!*slot)
-    {
-      *slot = (unsigned int *) xmalloc (sizeof (unsigned int));
-      **slot = flags;
-    }
-  else
-    {
-      if (decl && **slot != flags)
-        error ("%q+D causes a section type conflict", decl);
-    }
-
   return flags;
 }
--
2.0.0
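To illustrate the point about keys, here is a minimal C++ sketch using standard containers rather than GCC's hash_table. The names BrokenFlagsTable and FlagsMap are hypothetical and exist only for this example. The first struct mimics the removed pattern (look up by the name pointer, store a pointer to a separate flags cell); the second keeps an explicit name-to-flags mapping. It is not a proposed replacement, since the patch argues the check is redundant with varasm.c anyway.

```cpp
#include <deque>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the removed logic: a pointer-keyed *set* used
// as if it were a map from section name to flags.
struct BrokenFlagsTable {
  std::deque<unsigned> storage;            // owns the flags cells
  std::unordered_set<const void *> cells;  // compares stored pointers by address

  // Returns true if a lookup by NAME finds an existing entry.
  bool seen_before(const char *name, unsigned flags) {
    if (cells.count(name))          // the lookup key is the name pointer...
      return true;
    storage.push_back(flags);
    cells.insert(&storage.back());  // ...but the stored element is a flags cell
    return false;
  }
};

// A map that handles the key explicitly: the name is the key, the flags
// are the value.
struct FlagsMap {
  std::unordered_map<std::string, unsigned> flags_by_name;

  // Returns true when NAME was already recorded with different flags.
  bool conflicts(const std::string &name, unsigned flags) {
    auto result = flags_by_name.emplace(name, flags);
    return !result.second && result.first->second != flags;
  }
};
```

With BrokenFlagsTable, seen_before never reports a hit, because the address stored in the set is never the address used for the lookup. FlagsMap reports a conflict only when the same name is seen again with different flags.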
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
So is C++14 a done deal with a __cplusplus date and all? I've been waiting for some news or a trip report from Rapperswil and have seen nothing on isocpp. I've been thinking of adding a thing or two to C++1z like clang has - The Disabling trigraph expansion by default looks easy. Ed
Re: [patch, libgfortran] [4.9/4.10 Regression] Internal read of negative integer broken
On 26/06/14 04:29, Paul Richard Thomas wrote:

Hi Jerry, The patch looks to be OK for trunk. Did you check it with the NIST by any chance?

Yes, tested fine.

Jerry
RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
Sent: Thursday, June 19, 2014 1:36 PM

Richard, given this issue, I think we should wait a few more days before I commit a backported (and fixed of course) version to 4.8 and 4.9.

No new issues have been reported since then. Is it OK to commit the backport (with Jakub's fix) now, or should we wait longer?

Best regards,

Thomas
Re: [C++ Patch] Small compound-literal parsing clean up
OK. Jason
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
On 06/26/2014 09:38 PM, Ed Smith-Rowland wrote:

So is C++14 a done deal with a __cplusplus date and all?

The C++14 draft was finalized at the February meeting in Issaquah; the ratification process isn't quite done, but I haven't heard any reason to doubt that it will be done soon. The __cplusplus date is 201402L.

I've been thinking of adding a thing or two to C++1z like clang has - The Disabling trigraph expansion by default looks easy.

Aren't trigraphs off by default already?

Jason
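As a small illustration of how the __cplusplus date mentioned above is typically used, the following sketch tests for C++14 or later at compile time. The function name compiled_as_cxx14_or_later is hypothetical; only the 201402L value comes from the message.

```cpp
// Returns true when this translation unit is compiled as C++14 or later.
// The function name is made up for this example; 201402L is the value
// quoted in the message above.
constexpr bool compiled_as_cxx14_or_later() {
  return __cplusplus >= 201402L;
}
```

The result depends on the language mode the compiler is invoked with, so the same source gives different answers under, say, a C++11 mode and a C++14 mode.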
[Patch, PR 61061] Add state limit for regex NFA
The limit can be customized by defining the macro _GLIBCXX_REGEX_STATE_LIMIT. The default value is 10.

The testcase can be handled if we optimize consecutive quantifiers (collapse them into one), but cases like (a{100}b){100} still can't be handled.

We implement the range quantifier (foo){n} by copying the state sequence (foo) n-1 times, which consumes more space. We may reimplement it (by adding a new _S_op*) someday.

Bootstrapped and tested. Thanks!

--
Regards, Tim Shen
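The space cost of the copying scheme can be sketched with a simplified counting model. This is an assumption made for illustration, not libstdc++ code: it charges one state per literal and ignores the bookkeeping states a real NFA would also need. The function names repeat_states and nested_example_states are hypothetical.

```cpp
#include <cstddef>

// Simplified model: expanding x{n} by copying keeps the original state
// sequence for x and adds n-1 copies of it, so the count is multiplied by n.
constexpr std::size_t repeat_states(std::size_t body_states, std::size_t n) {
  return body_states * n;
}

// (a{100}b){100}: the inner body is a{100} followed by the literal b,
// and the whole body is then repeated 100 times.
constexpr std::size_t nested_example_states() {
  return repeat_states(repeat_states(1, 100) + 1, 100);
}
```

In this model the nested example costs (100 + 1) * 100 states, and every extra level of nesting multiplies the count again. Collapsing consecutive quantifiers does not help here because the two quantifiers are not consecutive.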
Re: [PATCH v2] gcc/dwarf2asm.c: Add dw2_asm_voutput_delta() with var_list for dw2_asm_output_delta()
On 06/26/2014 06:25 AM, Chen Gang wrote:

BTW: a Linux kernel member found a GCC issue with the latest version (4.10.0 20140622 or later); older versions (e.g. 4.10.0 2014060*) are OK. It is my chance to fix it (I hope to finish by 2014-06-30).

For this issue, I have found the root cause so far: when duplicate decls are found, the new one is merged with the old one, and old and new share 'function_decl.f'. After the new one is freed, the old one is freed as well.

I shall continue analysing this issue, and welcome any members' suggestions or completions.

The related git commit is 71e19e54060804493e13748613077b0e69c0cfd9, and the related contents are below:

diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index 54d0de7..47cf3cc 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,8 @@
+2014-06-07  Jan Hubicka  hubi...@ucw.cz
+
+	* c-decl.c (merge_decls): Use set_decl_section_name.
+	(duplicate_decls): Remove node if it exists.
+
 2014-06-05  S. Gilles  sgil...@terpmail.umd.edu

 	PR c/53119
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 8fb3296..524b064 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -2304,8 +2304,10 @@ merge_decls (tree newdecl, tree olddecl, tree newtype, tree oldtype)
          We want to issue an error if the sections conflict but that
          must be done later in decl_attributes since we are called
          before attributes are assigned.  */
-      if (DECL_SECTION_NAME (newdecl) == NULL_TREE)
-        DECL_SECTION_NAME (newdecl) = DECL_SECTION_NAME (olddecl);
+      if ((DECL_EXTERNAL (olddecl) || TREE_PUBLIC (olddecl) || TREE_STATIC (olddecl))
+          && DECL_SECTION_NAME (newdecl) == NULL_TREE
+          && DECL_SECTION_NAME (olddecl))
+        set_decl_section_name (newdecl, DECL_SECTION_NAME (olddecl));

       /* Copy the assembler name.
          Currently, it can only be defined in the prototype.  */
@@ -2574,6 +2576,13 @@ duplicate_decls (tree newdecl, tree olddecl)
   merge_decls (newdecl, olddecl, newtype, oldtype);

   /* The NEWDECL will no longer be needed.  */
+  if (TREE_CODE (newdecl) == FUNCTION_DECL
+      || TREE_CODE (newdecl) == VAR_DECL)
+    {
+      struct symtab_node *snode = symtab_get_node (newdecl);
+      if (snode)
+        symtab_remove_node (snode);
+    }
[...]

The related operation:

root@gchen:/upstream/linux# cat elevator.i
extern int __attribute__ ((__section__(".init.text"))) elv_register(void)
{
  return 0;
}
extern typeof(elv_register) elv_register;
root@gchen:/upstream/linux# /usr/local/libexec/gcc/score-elf/4.10.0/cc1 elevator.i
 elv_register
Analyzing compilation unit
Segmentation fault (core dumped)
root@gchen:/upstream/linux# /usr/local/bin/score-elf-gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/bin/score-elf-gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/score-elf/4.10.0/lto-wrapper
Target: score-elf
Configured with: ../gcc/configure --without-header --disable-nls --enable-language=c --disable-threads --disable-shared --enable-werror=no target_configargs=enable_vtable_verify=yes --target=score-elf --enable-obsolete : (reconfigured) ../gcc/configure --without-header --disable-nls --enable-language=c --disable-threads --disable-shared --enable-werror=no target_configargs=enable_vtable_verify=yes --target=score-elf --enable-obsolete --enable-debug --disable-release
Thread model: single
gcc version 4.10.0 20140625 (experimental) (GCC)

Thanks.

--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed