[PATCH, PR 58398] Fix regression in gcc.dg/attr-ifunc-4.c
The attached patch fixes the regression in gcc.dg/attr-ifunc-4.c (PR 58398).

The problem is that the resolver function just looks like an alias, but it actually is something completely different. So inlining the resolver function has to be avoided.

The patch was bootstrapped and regression-tested without any problems on x86_64-unknown-linux-gnu.

OK for trunk?

Regards,
Bernd Edlinger

2013-09-17  Bernd Edlinger  bernd.edlin...@hotmail.de

    PR ipa/58398
    * cgraph.c (cgraph_function_body_availability): Check for ifunc
    attribute, and don't inline the resolver in this case.
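To illustrate why a resolver is not an ordinary alias target, here is a minimal sketch, assuming an ELF target with GNU indirect-function support (for example x86_64 GNU/Linux). The names `twice`, `twice_impl` and `resolve_twice` are made up for this illustration and are not taken from the testcase.

```c
/* Implementation that callers of twice() should end up in. */
static int twice_impl(int x)
{
    return 2 * x;
}

/* Resolver: the dynamic loader calls it once and binds `twice` to the
   pointer it returns.  Its body is never the body of `twice`, so it must
   not be inlined into callers as if it were an alias target. */
static int (*resolve_twice(void))(int)
{
    return twice_impl;
}

/* `twice` is an indirect function resolved through resolve_twice. */
int twice(int x) __attribute__((ifunc("resolve_twice")));
```

A call such as `twice(21)` is dispatched to `twice_impl`; substituting the resolver's body at the call site would instead hand back a function pointer.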
Re: [PATCH] Fix segfault with inlining
I've looked at the C++ testcase int foo (int x) { try { return x; } catch (...) { return 0; } } which exhibits exactly the behavior you quote - return x is considered throwing an exception. The C++ FE doesn't arrange for TREE_THIS_NOTRAP to be set here (maybe due to this issue you quote?). I presume that you compiled with -fnon-call-exceptions? Otherwise, I don't see how something that isn't a call can throw an exception in C++, it should be seen at most as possibly trapping, which is less blocking. Other than that the patch looks reasonable (I suppose you need is_parameter_of only because as we recursively handle the trees PARM_DECLs from the destination could already have leaked into the tree we recurse into?) Do you mean that the test on DECL_CONTEXT is superfluous? Possibly indeed, but with nested functions you can have PARM_DECLs of different origins in a given function body, although this may be irrelevant for tree-inline.c. -- Eric Botcazou
Re: [PATCH, PR 58398] Fix regression in gcc.dg/attr-ifunc-4.c
The attached patch fixes the regression in gcc.dg/attr-ifunc-4.c (PR 58398).

The problem is that the resolver function just looks like an alias, but it actually is something completely different. So inlining the resolver function has to be avoided.

The patch was bootstrapped and regression-tested without any problems on x86_64-unknown-linux-gnu.

OK for trunk?

Regards,
Bernd Edlinger

2013-09-17  Bernd Edlinger  bernd.edlin...@hotmail.de

    PR ipa/58398
    * cgraph.c (cgraph_function_body_availability): Check for ifunc
    attribute, and don't inline the resolver in this case.

OK, thanks!

Honza
[PATCH v3] Caller instrumentation with -finstrument-calls
Hello Jan,

the MAINTAINERS file reveals that you are the right person to contact for profile feedback related changes. This is the third iteration of the caller instrumentation patch originally posted and explained here:

http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01593.html

The hooks now conform to the naming scheme suggested by Andrew Pinski and the extra bitfield for the no_instrument_calls func attribute is now relocated to tree_decl_with_vis (it was in tree_function_decl before, but there is no room left there for another bitfield).

It would be great if this patch could make it into GCC 4.9.0.

Thanks,
Paul

Paul Woegerer (1):
  Caller instrumentation with -finstrument-calls.

 gcc/builtins.def                                |   5 ++
 gcc/c-family/c-common.c                         |  34 +++
 gcc/c/c-decl.c                                  |   2 +
 gcc/common.opt                                  |  20 -
 gcc/cp/decl.c                                   |   2 +
 gcc/doc/invoke.texi                             |  42 +
 gcc/function.c                                  |   3 +-
 gcc/gimplify.c                                  | 113 +++-
 gcc/ipa.c                                       |   1 +
 gcc/java/jcf-parse.c                            |   1 +
 gcc/libfuncs.h                                  |   6 ++
 gcc/optabs.c                                    |   6 ++
 gcc/opts.c                                      |  10 +++
 gcc/testsuite/g++.dg/other/instrument_calls-1.C |  14 +++
 gcc/testsuite/g++.dg/other/instrument_calls-2.C |  20 +
 gcc/testsuite/g++.dg/other/instrument_calls-3.C |  17
 gcc/testsuite/gcc.dg/instrument_calls-1.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-2.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-3.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-4.c       |   8 ++
 gcc/testsuite/gcc.dg/instrument_calls-5.c       |  11 +++
 gcc/testsuite/gcc.dg/instrument_calls-6.c       |  11 +++
 gcc/testsuite/gcc.dg/instrument_calls-7.c       |  13 +++
 gcc/testsuite/gcc.dg/instrument_calls-8.c       |   7 ++
 gcc/testsuite/gcc.dg/instrument_calls-9.c       |  12 +++
 gcc/tree-core.h                                 |   4 +-
 gcc/tree-streamer-in.c                          |   2 +
 gcc/tree-streamer-out.c                         |   1 +
 gcc/tree.h                                      |   6 ++
 29 files changed, 390 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-1.C
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-2.C
 create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-3.C
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-1.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-2.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-3.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-4.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-5.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-6.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-7.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-8.c
 create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-9.c

--
1.8.4
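As a rough illustration of what caller-side instrumentation means, the sketch below wraps one call site by hand with a hook before and a hook after the call. The hook names `hook_call_before` and `hook_call_after`, their single-argument signature and the counters are assumptions made up for this example; the actual signatures of the hooks added by the patch are not shown in this message.

```c
/* Generic function-pointer type used to pass a callee address around. */
typedef void (*fn_addr)(void);

/* Bookkeeping a profiler might keep (hypothetical). */
static unsigned before_count, after_count;
static fn_addr last_callee;

/* Hypothetical hook invoked in the caller right before a call. */
static void hook_call_before(fn_addr callee)
{
    ++before_count;
    last_callee = callee;
}

/* Hypothetical hook invoked in the caller right after the call returns. */
static void hook_call_after(fn_addr callee)
{
    ++after_count;
    last_callee = callee;
}

static int square(int x)
{
    return x * x;
}

/* What an instrumented call site could look like if written by hand. */
int caller(int x)
{
    hook_call_before((fn_addr)square);
    int r = square(x);
    hook_call_after((fn_addr)square);
    return r;
}
```

The point of placing the hooks in the caller rather than in the callee is that calls into code compiled without instrumentation are still observed.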
[PATCH] Caller instrumentation with -finstrument-calls.
2013-07-01 Paul Woegerer paul_woege...@mentor.com Caller instrumentation with -finstrument-calls. * gcc/builtins.def: Add call-hooks __gnu_profile_call_before and __gnu_profile_call_after. * gcc/libfuncs.h (enum libfunc_index): Likewise. * gcc/optabs.c (init_optabs): Likewise. * gcc/c-family/c-common.c (no_instrument_calls): Add attribute. (handle_no_instrument_calls_attribute): New. * gcc/common.opt (finstrument-calls): New option. (finstrument-calls-exclude-function-list): Likewise. (finstrument-calls-exclude-file-list): Likewise. * gcc/opts.c (common_handle_option): Handle new options. * gcc/tree-core.h (tree_decl_with_vis): Add bitfield no_instrument_calls_before_after. * gcc/tree.h: Macro for no_instrument_calls_before_after access. * gcc/c/c-decl.c (merge_decls): Handle tree_function_decl field. * gcc/cp/decl.c (duplicate_decls): Likewise. * gcc/function.c (expand_function_start): Likewise. * gcc/ipa.c: Likewise. * gcc/java/jcf-parse.c: Likewise. * gcc/tree-streamer-in.c: Likewise. * gcc/tree-streamer-out.c: Likewise. (finstrument-calls-exclude-function-list): Likewise. (finstrument-calls-exclude-file-list): Likewise. * gcc/gimplify.c (flag_instrument_calls_exclude_p): New. (addr_expr_for_call_instrumentation): New. (maybe_add_profile_call): New. (gimplify_call_expr): Add call-hooks insertion. (gimplify_modify_expr): Likewise. * gcc/doc/invoke.texi: Added documentation for -finstrument-calls-exclude-function-list and -finstrument-calls-exclude-file-list and -finstrument-calls. * gcc/testsuite/g++.dg/other/instrument_calls-1.C Added regression test for -finstrument-calls. * gcc/testsuite/g++.dg/other/instrument_calls-2.C: Likewise. * gcc/testsuite/g++.dg/other/instrument_calls-3.C: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-1.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-2.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-3.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-4.c: Likewise. 
* gcc/testsuite/gcc.dg/instrument_calls-5.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-6.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-7.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-8.c: Likewise. * gcc/testsuite/gcc.dg/instrument_calls-9.c: Likewise. Signed-off-by: Paul Woegerer paul_woege...@mentor.com --- gcc/builtins.def| 5 ++ gcc/c-family/c-common.c | 34 +++ gcc/c/c-decl.c | 2 + gcc/common.opt | 20 - gcc/cp/decl.c | 2 + gcc/doc/invoke.texi | 42 + gcc/function.c | 3 +- gcc/gimplify.c | 113 +++- gcc/ipa.c | 1 + gcc/java/jcf-parse.c| 1 + gcc/libfuncs.h | 6 ++ gcc/optabs.c| 6 ++ gcc/opts.c | 10 +++ gcc/testsuite/g++.dg/other/instrument_calls-1.C | 14 +++ gcc/testsuite/g++.dg/other/instrument_calls-2.C | 20 + gcc/testsuite/g++.dg/other/instrument_calls-3.C | 17 gcc/testsuite/gcc.dg/instrument_calls-1.c | 8 ++ gcc/testsuite/gcc.dg/instrument_calls-2.c | 8 ++ gcc/testsuite/gcc.dg/instrument_calls-3.c | 8 ++ gcc/testsuite/gcc.dg/instrument_calls-4.c | 8 ++ gcc/testsuite/gcc.dg/instrument_calls-5.c | 11 +++ gcc/testsuite/gcc.dg/instrument_calls-6.c | 11 +++ gcc/testsuite/gcc.dg/instrument_calls-7.c | 13 +++ gcc/testsuite/gcc.dg/instrument_calls-8.c | 7 ++ gcc/testsuite/gcc.dg/instrument_calls-9.c | 12 +++ gcc/tree-core.h | 4 +- gcc/tree-streamer-in.c | 2 + gcc/tree-streamer-out.c | 1 + gcc/tree.h | 6 ++ 29 files changed, 390 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-1.C create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-2.C create mode 100644 gcc/testsuite/g++.dg/other/instrument_calls-3.C create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-1.c create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-2.c create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-3.c create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-4.c create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-5.c create mode 100644 gcc/testsuite/gcc.dg/instrument_calls-6.c
Re: [gomp4, trunk] Two simd fixes
On Mon, 16 Sep 2013, Jakub Jelinek wrote:

Hi!

This patch fixes two issues I found on the pr58392.c testcase:

1) we weren't copying decl attributes, so e.g. inside #pragma omp parallel "omp simd array" temporary arrays lost their attribute and weren't adjusted because of that

2) DR_ALIGNED_TO wasn't reset after resetting DR_OFFSET on simd lane access DRs, which resulted in the vectorizer trying to peel for alignment on those. Those are always automatic vars that can be just aligned more.

Ok?

2013-09-16  Jakub Jelinek  ja...@redhat.com

    * omp-low.c (copy_var_decl): Copy DECL_ATTRIBUTES.
    * tree-vect-data-refs.c (vect_analyze_data_refs): For
    simd_lane_access drs, update also DR_ALIGNED_TO.

--- gcc/omp-low.c.jj 2013-09-16 10:08:43.0 +0200
+++ gcc/omp-low.c 2013-09-16 15:25:31.683903448 +0200
@@ -888,6 +888,7 @@ copy_var_decl (tree var, tree name, tree
   TREE_NO_WARNING (copy) = TREE_NO_WARNING (var);
   TREE_USED (copy) = 1;
   DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
+  DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
   return copy;
 }

Ok.

--- gcc/tree-vect-data-refs.c.jj 2013-09-13 16:48:28.0 +0200
+++ gcc/tree-vect-data-refs.c 2013-09-16 14:47:56.500538758 +0200
@@ -3039,6 +3039,9 @@ again:
   {
     DR_OFFSET (newdr) = ssize_int (0);
     DR_STEP (newdr) = step;
+    DR_ALIGNED_TO (newdr)
+      = size_int (highest_pow2_factor
+                  (DR_OFFSET (newdr)));

That looks odd - DR_OFFSET (newdr) is constant zero, so you can as well immediately use BIGGEST_ALIGNMENT here (that's what highest_pow2_factor does).

Ok with that change.

Thanks,
Richard.

     dr = newdr;
     simd_lane_access = true;
   }

Jakub
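The review comment rests on the fact that every power of two divides zero, so a "highest power-of-two factor" of a zero offset can only be the configured maximum. A minimal sketch of that idea follows; it is not GCC's implementation, and `SKETCH_BIGGEST_ALIGNMENT` is a hypothetical stand-in for the target-specific BIGGEST_ALIGNMENT.

```c
#include <stdint.h>

/* Hypothetical cap standing in for BIGGEST_ALIGNMENT; the real value is
   target-specific. */
#define SKETCH_BIGGEST_ALIGNMENT 128u

/* Largest power of two that divides OFFSET, capped at the maximum.
   Zero is divisible by every power of two, so it yields the cap. */
static unsigned pow2_factor_sketch(uint64_t offset)
{
    if (offset == 0)
        return SKETCH_BIGGEST_ALIGNMENT;
    /* Isolate the lowest set bit (two's complement trick). */
    uint64_t lowest_bit = offset & (~offset + 1);
    return lowest_bit > SKETCH_BIGGEST_ALIGNMENT
        ? SKETCH_BIGGEST_ALIGNMENT
        : (unsigned)lowest_bit;
}
```

With a constant zero argument the computation collapses to the cap, which is why passing the constant directly is equivalent.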
Commit: MSP430: Add support for interrupt handlers
Hi Guys,

I am applying the patch below to add support for interrupt handlers to the MSP430 backend. The patch also adds a couple of MSP430 specific builtin functions intended to be used inside interrupt handlers. In addition the patch adds support for naked functions, critical functions (which disable interrupts whilst they execute) and reentrant functions (which disable interrupts but always reenable them upon exit).

Tested with no regressions on an msp430-elf toolchain.

Cheers
Nick

gcc/ChangeLog
2013-09-17  Nick Clifton  ni...@redhat.com

    * config/msp430/msp430-protos.h: Add prototypes for new functions.
    * config/msp430/msp430.c (msp430_preserve_reg_p): Add support for
    interrupt handlers.
    (is_attr_func): New function.
    (msp430_is_interrupt_func): New function.
    (is_naked_func): New function.
    (is_reentrant_func): New function.
    (is_critical_func): New function.
    (msp430_start_function): Add annotations for function attributes.
    (msp430_attr): New function.
    (msp430_attribute_table): New.
    (msp430_function_section): New function.
    (TARGET_ASM_FUNCTION_SECTION): Define.
    (msp430_builtin): New enum.
    (msp430_init_builtins): New function.
    (msp430_builtin_devl): New function.
    (msp430_expand_builtin): New function.
    (TARGET_INIT_BUILTINS): Define.
    (TARGET_EXPAND_BUILTINS): Define.
    (TARGET_BUILTIN_DECL): Define.
    (msp430_expand_prologue): Add support for naked, interrupt,
    critical and reentrant functions.
    (msp430_expand_epilogue): Likewise.
    (msp430_print_operand): Handle 'O' character.
    * config/msp430/msp430.h (TARGET_CPU_CPP_BUILTINS): Define
    NO_TRAMPOLINES.
    * config/msp430/msp430.md (unspec): Add UNS_DINT, UNS_EINT,
    UNS_PUSH_INTR, UNS_POP_INTR, UNS_BIC_SR, UNS_BIS_SR.
    (pushm): Use a 'n' rather than an 'i' constraint.
    (msp_return): Add generation of the interrupt return instruction.
    (disable_interrupts): New pattern.
    (enable_interrupts): New pattern.
    (push_intr_state): New pattern.
    (pop_intr_state): New pattern.
    (bic_SR): New pattern.
    (bis_SR): New pattern.
    * doc/extend.texi: Document MSP430 function attributes and builtin
    functions.
Re: Dump framework newline cleanup
On Mon, Sep 16, 2013 at 8:36 PM, Teresa Johnson tejohn...@google.com wrote:

Yep, looked too quickly every time and thought the newline after "be zero" was applying. Here is the patch with the fix. Ok for trunk pending regression testing?

Ok.

Thanks,
Richard.

2013-09-16  Teresa Johnson  tejohn...@google.com

    * coverage.c (get_coverage_counts): Add missing newline.

Index: coverage.c
===================================================================
--- coverage.c (revision 202628)
+++ coverage.c (working copy)
@@ -347,7 +347,7 @@ get_coverage_counts (unsigned counter, unsigned ex
       if (!warned++ && dump_enabled_p ())
         dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, input_location,
                          (flag_guess_branch_prob
-                          ? "file %s not found, execution counts estimated"
+                          ? "file %s not found, execution counts estimated\n"
                           : "file %s not found, execution counts assumed to "
                             "be zero\n"),
                          da_file_name);

Thanks,
Teresa

On Mon, Sep 16, 2013 at 11:20 AM, Xinliang David Li davi...@google.com wrote:

Looks like there is one missing spot:

@@ -349,7 +349,7 @@ get_coverage_counts (unsigned counter, u
 (flag_guess_branch_prob
 ? "file %s not found, execution counts estimated"
 : "file %s not found, execution counts assumed to "
-"be zero"),
+"be zero\n"),
 da_file_name);
 return NULL;

I found this when testing interaction of -fprofile-use and -fno-tree-vectorize without a profile.

thanks,
David

On Mon, Sep 16, 2013 at 11:06 AM, Teresa Johnson tejohn...@google.com wrote:
On Mon, Sep 16, 2013 at 10:57 AM, Xinliang David Li davi...@google.com wrote:

I noticed there are a couple of dump_printf_loc instances in coverage.c not ended with newline. They should be fixed.

I committed this change this morning as r202628. I believe I fixed all the dump_printf_loc calls (just double-checked). Can you let me know if you see anymore after you update to this revision?

Thanks,
Teresa

David

On Tue, Sep 10, 2013 at 6:32 AM, Teresa Johnson tejohn...@google.com wrote:
On Mon, Sep 9, 2013 at 9:55 PM, Xinliang David Li davi...@google.com wrote:

looks fine to me. In the long run, I wonder if the machinery in diagnostic messages can be reused for opt-info dumping -- i.e., support different streams. It has many nice features including %qD specifier for printing tree decls.

Yes, this would have some advantages such as getting the function name emitted.

Teresa

David

On Mon, Sep 9, 2013 at 12:01 PM, Teresa Johnson tejohn...@google.com wrote:

I've attached a patch that implements the cleanup of newline emission by the new dump framework as discussed here:

http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01779.html

Essentially, I have removed the leading newline emission from dump_loc, and updated dump_printf_loc invocations to emit a trailing newline as necessary. This will remove unnecessary vertical space in the dump output.

I did not do any other cleanup of the existing vectorization messages - there are IMO a lot of messages being emitted by the vectorizer under MSG_NOTE (and probably MSG_MISSED_OPTIMIZATION) that should only be emitted to the dump file under -fdump-tree-... and not emitted under -fopt-info-all. The ones that stay under -fopt-info-all need some formatting/style cleanup. Leaving that for follow-on work.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
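The bug discussed in this thread is easy to miss because only one arm of a conditional expression lacked the trailing newline. A minimal standalone sketch of the fixed shape is below; `coverage_msg_format` and `format_coverage_msg` are hypothetical helpers written for this illustration and are not functions from coverage.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Pick the message format; both arms must end in '\n' now that the
   location printer no longer emits a leading newline. */
static const char *coverage_msg_format(int guess_branch_prob)
{
    return guess_branch_prob
        ? "file %s not found, execution counts estimated\n"
        : "file %s not found, execution counts assumed to "
          "be zero\n";
}

/* Format the message for FILE into BUF; returns what snprintf returns. */
static int format_coverage_msg(char *buf, size_t size,
                               int guess_branch_prob, const char *file)
{
    return snprintf(buf, size, coverage_msg_format(guess_branch_prob), file);
}
```

Checking that every selectable format string ends with a newline, rather than only the last one in the expression, is the kind of review step that would have caught both spots mentioned above.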
[PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes
Hi,

Here is a patch introducing new type and mode for bounds. It is a part of MPX ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).

Bootstrapped and tested on linux-x86_64. Is it OK for trunk?

Thanks,
Ilya
--

gcc/

2013-09-16  Ilya Enkovich  ilya.enkov...@intel.com

    * mode-classes.def (MODE_BOUND): New.
    * tree.def (BOUND_TYPE): New.
    * genmodes.c (complete_mode): Support MODE_BOUND.
    (BOUND_MODE): New.
    (make_bound_mode): New.
    * machmode.h (BOUND_MODE_P): New.
    * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
    (layout_type): Support BOUND_TYPE.
    * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
    * tree.c (build_int_cst_wide): Support BOUND_TYPE.
    (type_contains_placeholder_1): Likewise.
    * tree.h (BOUND_TYPE_P): New.
    * varasm.c (output_constant): Support BOUND_TYPE.
    * doc/rtl.texi (MODE_BOUND): New.

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1d62223..02b1214 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}.
 @xref{Jump Patterns},
 also see @ref{Condition Code}.

+@findex MODE_BOUND
+@item MODE_BOUND
+Bound modes class. Used to represent values of pointer bounds.
+
 @findex MODE_RANDOM
 @item MODE_RANDOM
 This is a catchall mode class for modes which don't fit into the above
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index dc38483..89174ec 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
       break;

     case MODE_INT:
+    case MODE_BOUND:
     case MODE_FLOAT:
     case MODE_DECIMAL_FLOAT:
     case MODE_FRACT:
@@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name,
   new_mode (cl, name, file, line);
 }

+#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__)
+
+static void ATTRIBUTE_UNUSED
+make_bound_mode (const char *name,
+                 unsigned int bytesize,
+                 const char *file, unsigned int line)
+{
+  struct mode_data *m = new_mode (MODE_BOUND, name, file, line);
+  m->bytesize = bytesize;
+}
+
+
 #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
 #define FRACTIONAL_INT_MODE(N, B, Y) \
   make_int_mode (#N, B, Y, __FILE__, __LINE__)
diff --git a/gcc/machmode.h b/gcc/machmode.h
index 981ee92..d4a20b2 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || CLASS == MODE_ACCUM \
    || CLASS == MODE_UACCUM)

+#define BOUND_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_BOUND)
+
 /* Get the size in bytes and bits of an object of mode MODE. */

 extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index 7207ef7..c5ea215 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see
   DEF_MODE_CLASS (MODE_RANDOM), /* other */ \
   DEF_MODE_CLASS (MODE_CC), /* condition code in a register */ \
   DEF_MODE_CLASS (MODE_INT), /* integer */ \
+  DEF_MODE_CLASS (MODE_BOUND), /* bounds */ \
   DEF_MODE_CLASS (MODE_PARTIAL_INT), /* integer with padding bits */ \
   DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \
   DEF_MODE_CLASS (MODE_UFRACT), /* unsigned fractional number */ \
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 6f6b310..82611c7 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode)
     case MODE_VECTOR_ACCUM:
     case MODE_VECTOR_UFRACT:
     case MODE_VECTOR_UACCUM:
+    case MODE_BOUND:
       mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
       break;

@@ -2135,6 +2136,13 @@ layout_type (tree type)
       SET_TYPE_MODE (type, VOIDmode);
       break;

+    case BOUND_TYPE:
+      SET_TYPE_MODE (type,
+                     mode_for_size (TYPE_PRECISION (type), MODE_BOUND, 0));
+      TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type)));
+      TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (TYPE_MODE (type)));
+      break;
+
     case OFFSET_TYPE:
       TYPE_SIZE (type) = bitsize_int (POINTER_SIZE);
       TYPE_SIZE_UNIT (type) = size_int (POINTER_SIZE / BITS_PER_UNIT);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 69e4006..8b0825c 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -697,6 +697,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
       break;

     case VOID_TYPE:
+    case BOUND_TYPE:
     case INTEGER_TYPE:
     case REAL_TYPE:
     case FIXED_POINT_TYPE:
diff --git a/gcc/tree.c b/gcc/tree.c
index b469b97..bbbe16e
Re: New GCC options for loop vectorization
On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com wrote:
On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener richard.guent...@gmail.com wrote:
On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com wrote:
On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener richard.guent...@gmail.com wrote:
On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com wrote:

Currently -ftree-vectorize turns on both loop and slp vectorizations, but there is no simple way to turn on loop vectorization alone. The logic for default O3 setting is also complicated. In this patch, two new options are introduced:

1) -ftree-loop-vectorize

This option is used to turn on loop vectorization only. option -ftree-slp-vectorize also becomes a first class citizen, and no funny business of Init(2) is needed. With this change, -ftree-vectorize becomes a simple alias to -ftree-loop-vectorize + -ftree-slp-vectorize.

For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

With the new change it becomes:

  -O3 -fno-loop-vectorize

To turn on only loop vectorize at O2, the old way is

  -O2 -ftree-vectorize -fno-slp-vectorize

The new way is

  -O2 -ftree-loop-vectorize

2) -ftree-vect-loop-peeling

This option is used to turn on/off loop peeling for alignment. In the long run, this should be folded into the cheap cost model proposed by Richard. This option is also useful in scenarios where peeling can introduce runtime problems:

http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html

which happens to be common in practice.

Patch attached. Compiler bootstrapped. Ok after testing?

I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2).

Ok. Can you also comment on 2) ?

I think we want to decide how granular we want to control the vectorizer and using which mechanism. My cost-model re-org makes ftree-vect-loop-version a no-op (basically removes it), so 2) looks like a step backwards in this context.

Using cost model to do a coarse grain control/configuration is certainly something we want, but having a fine grain control is still useful.

So, can you summarize what pieces (including versioning) of the vectorizer you'd want to be able to disable separately?

Loop peeling seems to be the main one. There is also a correctness issue related. For instance, the following code is common in practice, but loop peeling wrongly assumes initial base-alignment and generates aligned mov instruction after peeling, leading to SEGV. Peeling is not something we can blindly turn on -- even when it is on, there should be a way to turn it off explicitly:

char a[1];
void foo(int n)
{
  int* b = (int*)(a+n);
  int i = 0;
  for (; i < 1000; ++i)
    b[i] = 1;
}
int main(int argn, char** argv)
{
  foo(argn);
}

But that's just a bug that should be fixed (looking into it).

Just disabling peeling for alignment may get you into the versioning for alignment path (and thus an unvectorized loop at runtime).

This is not true for target supporting mis-aligned access. I have not seen a case where alignment driver loop version happens on x86.

Also it's known that the alignment peeling code needs some serious TLC (its outcome depends on the order of DRs, the cost model it uses leaves to be desired as we cannot distinguish between unaligned load and store costs).

Yet another reason to turn it off as it is not effective anyways?

As said I'll disable all remains of -ftree-vect-loop-version with the cost model patch because it wasn't guarding versioning for aliasing but only versioning for alignment. We have to be consistent here - if we add a way to disable peeling for alignment then we certainly don't want to remove the ability to disable versioning for alignment, no?

Richard.

thanks,
David

Richard.

I've stopped a quick try doing 1) myself because

@@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
       opts->x_flag_ipa_reference = false;
       break;
+    case OPT_ftree_vectorize:
+      if (!opts_set->x_flag_tree_loop_vectorize)
+        opts->x_flag_tree_loop_vectorize = value;
+      if (!opts_set->x_flag_tree_slp_vectorize)
+        opts->x_flag_tree_slp_vectorize = value;
+      break;

doesn't look obviously correct. Does that handle

  -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize

or

  -ftree-loop-vectorize -fno-tree-vectorize

properly? Currently at least

  -ftree-slp-vectorize -fno-tree-vectorize

doesn't work.

Right -- same is true for -fprofile-use option. FDO enables some passes, but can not re-enable them if they are flipped off before.

That said, the option machinery doesn't handle an option being an alias for two other options, so its mechanism to contract positives/negatives doesn't work here and the override hooks do not work reliably for repeated options. Or am I wrong here? Should
Re: [PATCH] Fix segfault with inlining
On Tue, Sep 17, 2013 at 9:03 AM, Eric Botcazou ebotca...@adacore.com wrote: I've looked at the C++ testcase int foo (int x) { try { return x; } catch (...) { return 0; } } which exhibits exactly the behavior you quote - return x is considered throwing an exception. The C++ FE doesn't arrange for TREE_THIS_NOTRAP to be set here (maybe due to this issue you quote?). I presume that you compiled with -fnon-call-exceptions? Otherwise, I don't see how something that isn't a call can throw an exception in C++, it should be seen at most as possibly trapping, which is less blocking. Yes, with -fnon-call-exceptions. Other than that the patch looks reasonable (I suppose you need is_parameter_of only because as we recursively handle the trees PARM_DECLs from the destination could already have leaked into the tree we recurse into?) Do you mean that the test on DECL_CONTEXT is superfluous? Possibly indeed, but with nested functions you can have PARM_DECLs of different origins in a given function body, although this may be irrelevant for tree-inline.c. Yeah, I thought testing for a PARM_DECL should be sufficient? For nested functions references to outer parms should have been lowered via the static chain at the point tree-inline.c sees them. So, if you agree that the DECL_CONTEXT test is superfluous the patch is ok with the is_parameter_of function removed. Thanks, Richard. -- Eric Botcazou
Fwd: GCC internals conditional execution macro?
Hi,

Let me suggest removing the section "Macros to control conditional execution" in GCC internals. I assume the section is obsolete given that it is empty.

Best,
Nicklas

2013-09-17  Nicklas Bo Jensen  nbjen...@gmail.com

    * doc/tm.texi (Macros to control conditional execution): Remove
    empty section.

Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi (revision 202626)
+++ gcc/doc/tm.texi (working copy)
@@ -6106,15 +6106,6 @@
 returns @code{VOIDmode}.
 @end deftypefn

-@node Cond Exec Macros
-@subsection Macros to control conditional execution
-@findex conditional execution
-@findex predication
-
-There is one macro that may need to be defined for targets
-supporting conditional execution, independent of how they
-represent conditional branches.
-
 @node Costs
 @section Describing Relative Costs of Operations
 @cindex costs of instructions

-- Forwarded message --
From: Andreas Schwab sch...@linux-m68k.org
Date: Mon, Sep 16, 2013 at 8:03 PM
Subject: Re: GCC internals conditional execution macro?
To: Nicklas Bo Jensen nbjen...@gmail.com
Cc: g...@gcc.gnu.org

Nicklas Bo Jensen nbjen...@gmail.com writes:

In GCC internals for GCC 4.8.1 and trunk the section "Macros to control conditional execution" mentions that there exists a macro, but does not name the macro? Which macro is thought of here?

The macro has been removed in r188983 without removing the now empty section.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [Patch] Implement regex_match and regex_search
Hi,

On 09/15/2013 03:45 AM, Tim Shen wrote:

...finally. This patch completes the flags specified in [28.5]. However, `optimize` and `match_any` are ignored, and `format_*` are not implemented yet. regex_iterator and regex_token_iterator should work now, but need more testcases.

Great. Tim, please complete the testing on -m32 etc; if everything goes well, just wait a day or so and commit.

Next, format string and regex_replace should be worked on.

I see... Thanks again!

Paolo.
Re: New GCC options for loop vectorization
On Tue, Sep 17, 2013 at 10:20 AM, Richard Biener richard.guent...@gmail.com wrote:
On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com wrote:
On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener richard.guent...@gmail.com wrote:
On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com wrote:
On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener richard.guent...@gmail.com wrote:
On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com wrote:

Currently -ftree-vectorize turns on both loop and slp vectorizations, but there is no simple way to turn on loop vectorization alone. The logic for default O3 setting is also complicated. In this patch, two new options are introduced:

1) -ftree-loop-vectorize

This option is used to turn on loop vectorization only. option -ftree-slp-vectorize also becomes a first class citizen, and no funny business of Init(2) is needed. With this change, -ftree-vectorize becomes a simple alias to -ftree-loop-vectorize + -ftree-slp-vectorize.

For instance, to turn on only slp vectorize at O3, the old way is:

  -O3 -fno-tree-vectorize -ftree-slp-vectorize

With the new change it becomes:

  -O3 -fno-loop-vectorize

To turn on only loop vectorize at O2, the old way is

  -O2 -ftree-vectorize -fno-slp-vectorize

The new way is

  -O2 -ftree-loop-vectorize

2) -ftree-vect-loop-peeling

This option is used to turn on/off loop peeling for alignment. In the long run, this should be folded into the cheap cost model proposed by Richard. This option is also useful in scenarios where peeling can introduce runtime problems:

http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html

which happens to be common in practice.

Patch attached. Compiler bootstrapped. Ok after testing?

I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2).

Ok. Can you also comment on 2) ?

I think we want to decide how granular we want to control the vectorizer and using which mechanism. My cost-model re-org makes ftree-vect-loop-version a no-op (basically removes it), so 2) looks like a step backwards in this context.

Using cost model to do a coarse grain control/configuration is certainly something we want, but having a fine grain control is still useful.

So, can you summarize what pieces (including versioning) of the vectorizer you'd want to be able to disable separately?

Loop peeling seems to be the main one. There is also a correctness issue related. For instance, the following code is common in practice, but loop peeling wrongly assumes initial base-alignment and generates aligned mov instruction after peeling, leading to SEGV. Peeling is not something we can blindly turn on -- even when it is on, there should be a way to turn it off explicitly:

char a[1];
void foo(int n)
{
  int* b = (int*)(a+n);
  int i = 0;
  for (; i < 1000; ++i)
    b[i] = 1;
}
int main(int argn, char** argv)
{
  foo(argn);
}

But that's just a bug that should be fixed (looking into it).

Bug in the testcase. b[i] asserts that b is aligned to 'int', so this invokes undefined behavior if peeling cannot reach an alignment of 16.

Richard.
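The last point is that dereferencing an `int *` promises the compiler that the pointer is suitably aligned for `int`. A minimal sketch of a store loop that makes no such promise is shown below; the helper names `is_int_aligned`, `store_ones` and `load_int` are made up for this illustration and the approach (byte-wise `memcpy`) is one common alternative, not something proposed in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if P is suitably aligned to be accessed through an int pointer. */
static int is_int_aligned(const void *p)
{
    return (uintptr_t)p % _Alignof(int) == 0;
}

/* Store COUNT ints with value 1 starting at BASE, which may be misaligned.
   memcpy makes no alignment assumption, unlike ((int *)base)[i] = 1. */
static void store_ones(char *base, size_t count)
{
    const int one = 1;
    for (size_t i = 0; i < count; ++i)
        memcpy(base + i * sizeof one, &one, sizeof one);
}

/* Read back the I-th int written by store_ones. */
static int load_int(const char *base, size_t i)
{
    int v;
    memcpy(&v, base + i * sizeof v, sizeof v);
    return v;
}
```

Written this way, the compiler may still vectorize the loop, but it has to use unaligned accesses or prove alignment itself instead of relying on the type of the pointer.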
Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions
On 16 Sep 11:24, Uros Bizjak wrote: On Fri, Sep 13, 2013 at 12:18 PM, Ilya Enkovich enkovich@gmail.com wrote: 2013/9/11 Uros Bizjak ubiz...@gmail.com: Hi Uros, Thanks a lot for the review! The x86 part looks mostly OK (I have a couple of comments bellow), but please first get target-independent changes reviewed and committed. Do you mean I should move bound type and mode declaration into a separate patch? Yes, target-independent part (middle end) has to go through the separate review to check if this part is OK. The target-dependent part uses the infrastructure from the middle end, so it can go into the code base only after target-independent parts are committed. I sent a separate patch for bound type and mode class (http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is target part of the patch with fixes you mentioned. Does it look OK? Bootstrapped and checked on linux-x86_64. Still shows incorrect length attribute computation (described here http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html). Thanks, Ilya -- 2013-09-16 Ilya Enkovich ilya.enkov...@intel.com * config/i386/constraints.md (B): New. (Ti): New. (Tb): New. * config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__. * config/i386/i386-modes.def (BND32): New. (BND64): New. * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.c (isa_opts): Add mmpx. (regclass_map): Add bound registers. (dbx_register_map): Likewise. (dbx64_register_map): Likewise. (svr4_dbx_register_map): Likewise. (PTA_MPX): New. (ix86_option_override_internal): Support MPX ISA. (ix86_conditional_register_usage): Support bound registers. (print_reg): Likewise. (ix86_code_end): Add MPX bnd prefix. (output_set_got): Likewise. (ix86_output_call_insn): Likewise. (ix86_print_operand): Add '!' (MPX bnd) print prefix support. (ix86_print_operand_punct_valid_p): Likewise. (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and UNSPEC_BNDMK_ADDR. (ix86_class_likely_spilled_p): Add bound regs support. 
(ix86_hard_regno_mode_ok): Likewise. (x86_order_regs_for_local_alloc): Likewise. (ix86_bnd_prefixed_insn_p): New. * config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value. (FIXED_REGISTERS): Add bound registers. (CALL_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (HARD_REGNO_NREGS): Likewise. (TARGET_MPX): New. (VALID_BND_REG_MODE): New. (FIRST_BND_REG): New. (LAST_BND_REG): New. (reg_class): Add BND_REGS. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (BND_REGNO_P): New. (ANY_BND_REG_P): New. (BNDmode): New. (HI_REGISTER_NAMES): Add bound registers. * config/i386/i386.md (UNSPEC_BNDMK): New. (UNSPEC_BNDMK_ADDR): New. (UNSPEC_BNDSTX): New. (UNSPEC_BNDLDX): New. (UNSPEC_BNDLDX_ADDR): New. (UNSPEC_BNDCL): New. (UNSPEC_BNDCU): New. (UNSPEC_BNDCN): New. (UNSPEC_MPX_FENCE): New. (BND0_REG): New. (BND1_REG): New. (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst. (length_immediate): Likewise. (prefix_0f): Likewise. (memory): Likewise. (prefix_rep): Check for bnd prefix. (BND): New. (bnd_ptr): New. (BNDCHECK): New. (bndcheck): New. (*jcc_1): Add MPX bnd prefix and fix length. (*jcc_2): Likewise. (jump): Likewise. (simple_return_internal): Likewise. (simple_return_pop_internal): Likewise. (*indirect_jump): Add MPX bnd prefix. (*tablejump_1): Likewise. (simple_return_internal_long): Likewise. (simple_return_indirect_internal): Likewise. (mode_mk): New. (*mode_mk): New. (movmode): New. (*movmode_internal_mpx): New. (mode_bndcheck): New. (*mode_bndcheck): New. (mode_ldx): New. (*mode_ldx): New. (mode_stx): New. (*mode_stx): New. * config/i386/predicates.md (lea_address_operand): Rename to... (address_no_seg_operand): ... this. (address_mpx_no_base_operand): New. (address_mpx_no_index_operand): New. (bnd_mem_operator): New. * config/i386/i386.opt (mmpx): New. * doc/invoke.texi: Add documentation for the flags -mmpx, -mno-mpx. * doc/rtl.texi Add documentation for BND32mode and BND64mode. 
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md index 28e626f..79d02f7 100644 --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -18,7 +18,7 @@ ;; http://www.gnu.org/licenses/. ;;; Unused letters: -;;; B H T +;;;
Re: [x86,PATCH] Simple fix for Atom LEA splitting.
Here is a final patch with fixed commentary.

2013/9/16 Uros Bizjak ubiz...@gmail.com: On Mon, Sep 16, 2013 at 5:01 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:

Does this comment look good to you:

  if (start != NULL_RTX)
    {
      bb = BLOCK_FOR_INSN (start);
      if (start != BB_HEAD (bb))
        /* Initialize prev to insn if insn and start belong to the same bb;
           in this case increase_distance can increment distance to 1. */
        prev = insn;

I'd say something along the lines of: If insn and start belong to the same bb, set prev to insn, so the call to increase_distance will increase the distance between insns by 1.

Best regards, Uros.

fixed_patch Description: Binary data
Re: [PATCH] Fix segfault with inlining
Yeah, I thought testing for a PARM_DECL should be sufficient? For nested functions references to outer parms should have been lowered via the static chain at the point tree-inline.c sees them. OK for the latter point, but are you sure for the former? My understanding is that we're already in SSA form, so parameters can be represented by SSA_NAMEs without defining statements. -- Eric Botcazou
Re: [PATCH] Fix segfault with inlining
On Fri, Sep 13, 2013 at 04:29:48PM +0200, Eric Botcazou wrote: @@ -4748,6 +4774,8 @@ copy_gimple_seq_and_replace_locals (gimp id.transform_call_graph_edges = CB_CGE_DUPLICATE; id.transform_new_cfg = false; id.transform_return_to_modify = false; + id.transform_parameter = false; + id.transform_parameter = false; id.transform_lang_insert_block = NULL; /* Walk the tree once to find local labels. */ Why are you storing the same thing twice? Jakub
Re: [PATCH] Fix segfault with inlining
On Tue, Sep 17, 2013 at 10:42 AM, Eric Botcazou ebotca...@adacore.com wrote: Yeah, I thought testing for a PARM_DECL should be sufficient? For nested functions references to outer parms should have been lowered via the static chain at the point tree-inline.c sees them. OK for the latter point, but are you sure for the former? My understanding is that we're already in SSA form, so parameters can be represented by SSA_NAMEs without defining statements. That's true... so you can only simplify is_parameter_of by dropping the context check. Richard. -- Eric Botcazou
Re: [PATCH] Handle loops with control flow in loop-distribution
Installed as obvious.

Andreas.

* gcc.dg/tree-ssa/ldist-22.c (main): Return zero.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
index f6fff77..afc792f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-22.c
@@ -25,7 +25,7 @@ int main()
       abort ();
   if (a[0] != 0 || a[101] != 0)
     abort ();
-  return;
+  return 0;
 }
 
 /* { dg-final { scan-tree-dump "generated memset zero" "ldist" } } */
-- 
1.8.4

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
RE: [PATCH, PR 57748] Check for out of bounds access
Hi Martin, On Tue, 17 Sep 2013 01:09:45, Martin Jambor wrote: Hi, On Sun, Sep 15, 2013 at 06:55:17PM +0200, Bernd Edlinger wrote: Hello Richard, attached is my second attempt at fixing PR 57748. This time the movmisalign path is completely removed and a similar bug in the read handling of misaligned structures with a non-BLKmode is fixed too. There are several new test cases for the different possible failure modes. I think the third and fourth testcases are undefined as the description of zero-length arrays extension clearly says the whole thing only makes sense when used as the last field of the outermost-aggregate type. I have not really understood what the third testcase is supposed to test but I did not try too much. Instead of the fourth testcase, you can demonstrate the need for your change in expand_expr_real_1 by augmenting the original testcase a little like in attached pr57748-m1.c. The third test case tries to demonstrate the possible write data store race (by checking the assembler output). But you are right, this example is probably not valid C at all. I was actually worried about unions with non-BLK mode and a movmisalign optab handler. When you look at stor-layout.c (compute_record_mode) you'll see, that in the case of a union usually an integer mode is chosen, which is exactly the same size as the whole union. And just by chance this does not have a movmisalign optab. Therefore I tried to cheat with that zero-sized array, which should probably be rejected at stor-layout.c in the first place. When I tried to make a test case out of it, the bug on the read side hit me as a total surprise... The hunk in expand_expr_real_1 can prove problematic if at any point we need to pass some other modifier to the expansion of tem. I'll try to see if I can come up with a testcase tomorrow. 
But perhaps we never do (and can hope we never will) and then it would be sort of OKish (note that I cannot approve anything) even though it can pessimize unaligned access paths (by not using movmisalign_optab even when perfectly possible - which is always when there is no zero sized array). It really just shows how evil non-BLKmode structures with zero-sized arrays are and how they complicate things. The expansion of component_refs is reasonably built around the assumption that we'd expand the structure in its mode in the most efficient manner and then chuck the correct part out of it, but here we need to tell the expansion of the structure to hold itself back because we'll be looking outside of the structure (as specified by mode). I too am under the very strong impression that this was not the intention of the design to use a non-BLKmode on a structure with zero-sized arrays. I'm not sure to what extent the hunk adding tests for bitregion_start and bitregion_end being zero is connected to this issue. I do not see any of the testcases exercising that path. If it is indeed another problem, I think it should be submitted (and potentially committed) as a separate patch, preferably with a testcase. Yes, you're probably right. I was unable to find a test case where this code path executes with bitregions. As I said, it maybe possible to prove that bitregion_start and bitregion_end == 0 if the other conditions are satisfied. What is obvious, that it would cause problems to set bitpos=0 when bitregion_start/end is pointing elsewhere. It is however much easier to prove that not going into that code path would not cause any problems if bitregion_start/end is not zero. So this was just for safer programming, but probably no real bug. Thanks, Bernd. Having said all that, I think that removing the misalignp path from expand_assignment altogether is a good idea. 
I have verified that when the expander is now presented with basically the same thing that 4.7 choked on, expand_expr (..., EXPAND_WRITE) can cope with it (see attached file c.c) and doing that simplifies this complex code path.

Thanks, Martin

This patch was bootstrapped and regression tested on x86_64-unknown-linux-gnu and i686-pc-linux-gnu. Additionally I generated eCos and an eCos-application (on ARMv5 using packed structures) with an arm-eabi cross compiler, and looked for differences in the disassembled code with and without this patch, but there were none. OK for trunk?

Regards Bernd.

2013-09-15 Bernd Edlinger bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.c (expand_assignment): Remove misalignp code path. Check for bitregion in offset arithmetic.
(expand_expr_real_1): Use EXPAND_MEMORY on base object.

testsuite:

PR middle-end/57748
* gcc.dg/torture/pr57748-1.c: New test.
* gcc.dg/torture/pr57748-2.c: New test.
* gcc.dg/torture/pr57748-3.c: New test.
* gcc.dg/torture/pr57748-3a.c: New test.
* gcc.dg/torture/pr57748-4.c: New test.
* gcc.dg/torture/pr57748-4a.c: New test.
Re: Fwd: GCC internals conditional execution macro?
On Tue, Sep 17, 2013 at 10:35:22AM +0200, Nicklas Bo Jensen wrote: Hi, Let me suggest to remove the section Macros to control conditional execution in GCC internals. I assume the section is obsolete given that it is empty. Best, Nicklas 2013-09-17 Nicklas Bo Jensen nbjen...@gmail.com * doc/tm.texi (Macros to control conditional execution): Hasn't this been already removed by http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01231.html ? Marek
Re: [PATCH, PR 57748] Check for out of bounds access
On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hello Richard, attached is my second attempt at fixing PR 57748. This time the movmisalign path is completely removed and a similar bug in the read handling of misaligned structures with a non-BLKmode is fixed too. There are several new test cases for the different possible failure modes. This patch was boot-strapped and regression tested on x86_64-unknown-linux-gnu and i686-pc-linux-gnu. Additionally I generated eCos and an eCos-application (on ARMv5 using packed structures) with an arm-eabi cross compiler, and looked for differences in the disassembled code with and without this patch, but there were none. OK for trunk?

I agree that the existing movmisalign path that you remove is simply bogus, so removing it looks fine to me. Can you give rationale to

@@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
       if (MEM_P (to_rtx)
           && GET_MODE (to_rtx) == BLKmode
           && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
+          && bitregion_start == 0
+          && bitregion_end == 0
           && bitsize > 0
           && (bitpos % bitsize) == 0
           && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

and especially to

@@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
                                   modifier != EXPAND_STACK_PARM
                                   ? target : NULL_RTX),
                                  VOIDmode,
-                                 modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
+                                 EXPAND_MEMORY);
 
         /* If the bitfield is volatile, we want to access it in the
            field's mode, not the computed mode.

which AFAIK makes memory expansion of loads/stores from/to registers change (fail? go through stack memory?) - see handling of non-MEM return values from that expand_expr call.

That is, do you see anything break with just removing the movmisalign path? I'd rather install that (with the new testcases that then pass) separately as this is a somewhat fragile area and being able to more selectively bisect/backport would be nice.

Thanks, Richard.

Regards Bernd.
Re: [PATCH] Don't always instrument shifts (PR sanitizer/58413)
On Mon, Sep 16, 2013 at 08:35:35PM +0200, Jakub Jelinek wrote: On Fri, Sep 13, 2013 at 08:01:36PM +0200, Marek Polacek wrote: I'd say the above is going to be a maintainance nightmare, with all the code duplication. And you are certainly going to miss cases that way, e.g. void foo (void) { int A[-2 / -1] = {}; } I'd say instead of adding all this, you should just at the right spot insert if (integer_zerop (t)) return NULL_TREE; or similar. For shift instrumentation, I guess you could add if (integer_zerop (t) (tt == NULL_TREE || integer_zerop (tt))) return NULL_TREE; right before: t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t); Yeah, this is _much_ better. I'm glad we can live without that ugliness. +/* PR sanitizer/58413 */ +/* { dg-do run } */ +/* { dg-options -fsanitize=shift -w } */ + +int x = 7; +int +main (void) +{ + /* All of the following should pass. */ + int A[128 5] = {}; + int B[128 5] = {}; + + static int e = +((int) + (0x | ((31 ((1 (4)) - 1)) (((15) + 6) + 4)) | + ((0) ((15) + 6)) | ((0) (15; This relies on int32plus, so needs to be /* { dg-do run { target int32plus } } */ Fixed. --- gcc/testsuite/c-c++-common/ubsan/shift-5.c.mp3 2013-09-13 18:25:06.195847144 +0200 +++ gcc/testsuite/c-c++-common/ubsan/shift-5.c 2013-09-13 19:16:38.990211229 +0200 @@ -0,0 +1,21 @@ +/* { dg-do compile} */ +/* { dg-options -fsanitize=shift -w } */ +/* { dg-shouldfail ubsan } */ + +int x; +int +main (void) +{ + /* None of the following should pass. */ + switch (x) +{ +case 1 -1: /* { dg-error } */ +case -1 -1: /* { dg-error } */ +case 1 -1: /* { dg-error } */ +case -1 -1: /* { dg-error } */ +case -1 200:/* { dg-error } */ +case 1 200: /* { dg-error } */ Can't you fill in the error you are expecting, or is the problem that the wording is very different between C and C++? I discovered { target c } stuff, so I filled in both error messages. This patch seems to work: bootstrap-ubsan passes + ubsan testsuite passes too. Ok for trunk? 
2013-09-17 Marek Polacek pola...@redhat.com Jakub Jelinek ja...@redhat.com PR sanitizer/58413 c-family/ * c-ubsan.c (ubsan_instrument_shift): Don't instrument an expression if we can prove it is correct. (ubsan_instrument_division): Likewise. Remove unnecessary check. testsuite/ * c-c++-common/ubsan/shift-4.c: New test. * c-c++-common/ubsan/shift-5.c: New test. * c-c++-common/ubsan/div-by-zero-5.c: New test. * gcc.dg/ubsan/c-shift-1.c: New test. --- gcc/c-family/c-ubsan.c.mp 2013-09-17 12:24:44.582835840 +0200 +++ gcc/c-family/c-ubsan.c 2013-09-17 12:24:48.772849823 +0200 @@ -51,14 +51,6 @@ ubsan_instrument_division (location_t lo if (TREE_CODE (type) != INTEGER_TYPE) return NULL_TREE; - /* If we *know* that the divisor is not -1 or 0, we don't have to - instrument this expression. - ??? We could use decl_constant_value to cover up more cases. */ - if (TREE_CODE (op1) == INTEGER_CST - integer_nonzerop (op1) - !integer_minus_onep (op1)) -return NULL_TREE; - t = fold_build2 (EQ_EXPR, boolean_type_node, op1, build_int_cst (type, 0)); @@ -74,6 +66,11 @@ ubsan_instrument_division (location_t lo t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t, x); } + /* If the condition was folded to 0, no need to instrument + this expression. */ + if (integer_zerop (t)) +return NULL_TREE; + /* In case we have a SAVE_EXPR in a conditional context, we need to make sure it gets evaluated before the condition. */ t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t); @@ -138,6 +135,11 @@ ubsan_instrument_shift (location_t loc, tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, x, tt); } + /* If the condition was folded to 0, no need to instrument + this expression. */ + if (integer_zerop (t) (tt == NULL_TREE || integer_zerop (tt))) +return NULL_TREE; + /* In case we have a SAVE_EXPR in a conditional context, we need to make sure it gets evaluated before the condition. 
*/ t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t); --- gcc/testsuite/c-c++-common/ubsan/shift-4.c.mp 2013-09-17 12:25:12.130931875 +0200 +++ gcc/testsuite/c-c++-common/ubsan/shift-4.c 2013-09-17 10:19:44.665199565 +0200 @@ -0,0 +1,30 @@ +/* PR sanitizer/58413 */ +/* { dg-do run { target int32plus } } */ +/* { dg-options -fsanitize=shift -w } */ + +int x = 7; +int +main (void) +{ + /* All of the following should pass. */ + int A[128 5] = {}; + int B[128 5] = {}; + + static int e = +((int) + (0x | ((31 ((1 (4)) - 1)) (((15) + 6) + 4)) | + ((0) ((15) + 6)) | ((0) (15; + + if (e != 503316480) +__builtin_abort (); + + switch (x) +
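As a standalone illustration of the kind of condition discussed in this thread (a shift count that is negative or not smaller than the width of the type, as in the shift-5.c cases), the range check can be sketched in ordinary C. This is a hedged sketch only: it is not the code in c-ubsan.c, the names shift_count_in_range and checked_shl are hypothetical, and the extra rule about negative values and overflow follows the C99/C11 wording for signed left shifts, which may be stricter than what a given language mode diagnoses.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if `amount` is a valid shift count for int,
   i.e. non-negative and smaller than the width of the type.  */
static bool shift_count_in_range (int amount)
{
  return amount >= 0 && amount < (int) (sizeof (int) * CHAR_BIT);
}

/* Hypothetical helper: perform value << amount only when the C99/C11 rules
   make the result well defined (count in range, value non-negative, result
   representable).  Returns false and leaves *result untouched otherwise.  */
static bool checked_shl (int value, int amount, int *result)
{
  assert (result);
  if (!shift_count_in_range (amount) || value < 0)
    return false;
  if (value > (INT_MAX >> amount))
    return false;
  *result = value << amount;
  return true;
}
```

When both operands are constants, a condition of this shape folds to a constant at compile time, which is the situation where the patch returns NULL_TREE instead of emitting instrumentation.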
Re: [PATCH, PR 57748] Check for out of bounds access
On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener richard.guent...@gmail.com wrote: On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hello Richard, attached is my second attempt at fixing PR 57748. This time the movmisalign path is completely removed and a similar bug in the read handling of misaligned structures with a non-BLKmode is fixed too. There are several new test cases for the different possible failure modes. This patch was boot-strapped and regression tested on x86_64-unknown-linux-gnu and i686-pc-linux-gnu. Additionally I generated eCos and an eCos-application (on ARMv5 using packed structures) with an arm-eabi cross compiler, and looked for differences in the disassembled code with and without this patch, but there were none. OK for trunk?

I agree that the existing movmisalign path that you remove is simply bogus, so removing it looks fine to me. Can you give rationale to

@@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
       if (MEM_P (to_rtx)
           && GET_MODE (to_rtx) == BLKmode
           && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
+          && bitregion_start == 0
+          && bitregion_end == 0
           && bitsize > 0
           && (bitpos % bitsize) == 0
           && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

and especially to

@@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
                                   modifier != EXPAND_STACK_PARM
                                   ? target : NULL_RTX),
                                  VOIDmode,
-                                 modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
+                                 EXPAND_MEMORY);
 
         /* If the bitfield is volatile, we want to access it in the
            field's mode, not the computed mode.

which AFAIK makes memory expansion of loads/stores from/to registers change (fail? go through stack memory?) - see handling of non-MEM return values from that expand_expr call.

In particular this seems to disable all movmisalign handling for MEM_REFs wrapped in component references which looks wrong.
I was playing with

typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
struct S { long long a[11]; V v; } __attribute__((aligned(8),packed));
struct S a, *b = &a;
V v, w;
int main()
{
  v = b->v;
  b->v = w;
  return 0;
}

(use -fno-common) and I see that we use unaligned stores too often (even with a properly aligned MEM). The above at least shows movmisalign opportunities wrapped in component-refs.

That is, do you see anything break with just removing the movmisalign path? I'd rather install that (with the new testcases that then pass) separately as this is a somewhat fragile area and being able to more selectively bisect/backport would be nice.

Thanks, Richard.

Regards Bernd.
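To make the layout in the testcase above concrete, here is a small sketch that computes where the vector member lands inside the packed structure. It is only an illustration under stated assumptions: it relies on the GCC/Clang vector_size, packed and aligned extensions, it assumes an 8-byte long long (as on x86_64), and the helper names vector_offset, vector_member_naturally_aligned and check_packed_layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same types as in the testcase: a two-lane vector of long long and a
   packed structure that is only guaranteed 8-byte alignment.  */
typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
struct S { long long a[11]; V v; } __attribute__ ((aligned (8), packed));

/* Hypothetical helper: byte offset of the vector member inside struct S.  */
static size_t vector_offset (void)
{
  return offsetof (struct S, v);
}

/* Hypothetical helper: nonzero if that offset is a multiple of the vector
   size, i.e. if the member would be naturally aligned whenever the
   structure itself starts on a vector-size boundary.  */
static int vector_member_naturally_aligned (void)
{
  return vector_offset () % sizeof (V) == 0;
}

/* Hypothetical helper: the member follows the eleven array elements
   directly, because packed removes any padding in front of it.  */
static void check_packed_layout (void)
{
  assert (vector_offset () == 11 * sizeof (long long));
}
```

Under the 8-byte long long assumption the offset is 88, which is not a multiple of 16, so accesses to the member through a struct S pointer cannot be assumed to be vector-aligned.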
Re: [PATCH] Fix segfault with inlining
That's true... so you can only simplify is_parameter_of by dropping the context check. OK, thanks, installed with this modification and the fix for the oversight spotted by Jakub, after retesting on x86-64/Linux. -- Eric Botcazou
Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.
This is no-op for usual GCC targets, because we don't pass any string to CreateSemaphore anyway. However this trivial change will help mingw-w64's efforts to support WinRT, where only unicode variant is available. libgcc/Changelog: config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA. config/i386/gthr-win32.h: Likewise.
Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.
2013/9/17 Jacek Caban cja...@gmail.com: This is no-op for usual GCC targets, because we don't pass any string to CreateSemaphore anyway. However this trivial change will help mingw-w64's efforts to support WinRT, where only unicode variant is available. libgcc/Changelog: config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA. config/i386/gthr-win32.h: Likewise. Please attach (or inline) patch. Thanks, Kai
Re: Fwd: GCC internals conditional execution macro?
Hasn't this been already removed by http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01231.html ? Yes. Okay. Please ignore then. Best, Nicklas
Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.
On 09/17/13 13:41, Kai Tietz wrote: 2013/9/17 Jacek Caban cja...@gmail.com: This is no-op for usual GCC targets, because we don't pass any string to CreateSemaphore anyway. However this trivial change will help mingw-w64's efforts to support WinRT, where only unicode variant is available. libgcc/Changelog: config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA. config/i386/gthr-win32.h: Likewise.

Please attach (or inline) patch.

It's attached now, sorry.

Jacek

commit eea3738e6103da1d1bc391b99734c93737d292a4
Author: Jacek Caban ja...@codeweavers.com
Date: Tue May 7 17:22:01 2013 +0200

    Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.

    libgcc/Changelog:
    config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
    config/i386/gthr-win32.h: Likewise.

diff --git a/libgcc/config/i386/gthr-win32.c b/libgcc/config/i386/gthr-win32.c
index f6f661a..f323031 100644
--- a/libgcc/config/i386/gthr-win32.c
+++ b/libgcc/config/i386/gthr-win32.c
@@ -147,7 +147,7 @@ void
 __gthr_win32_mutex_init_function (__gthread_mutex_t *mutex)
 {
   mutex->counter = -1;
-  mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 void
@@ -195,7 +195,7 @@ __gthr_win32_recursive_mutex_init_function (__gthread_recursive_mutex_t *mutex)
   mutex->counter = -1;
   mutex->depth = 0;
   mutex->owner = 0;
-  mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 int
diff --git a/libgcc/config/i386/gthr-win32.h b/libgcc/config/i386/gthr-win32.h
index d2e729a..1e437fc 100644
--- a/libgcc/config/i386/gthr-win32.h
+++ b/libgcc/config/i386/gthr-win32.h
@@ -635,7 +635,7 @@ static inline void
 __gthread_mutex_init_function (__gthread_mutex_t *__mutex)
 {
   __mutex->counter = -1;
-  __mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  __mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 static inline void
@@ -697,7 +697,7 @@ __gthread_recursive_mutex_init_function (__gthread_recursive_mutex_t *__mutex)
   __mutex->counter = -1;
   __mutex->depth = 0;
   __mutex->owner = 0;
-  __mutex->sema = CreateSemaphore (NULL, 0, 65535, NULL);
+  __mutex->sema = CreateSemaphoreW (NULL, 0, 65535, NULL);
 }
 
 static inline int
Re: Use CreateSemaphoreW instead of CreateSemaphoreA in libgcc.
Hi Jacek, I applied the patch at rev. 202648 with the following ChangeLog:

2013-09-17 Jacek Caban

* config/i386/gthr-win32.c: CreateSemaphoreW instead of CreateSemaphoreA.
* config/i386/gthr-win32.h: Likewise.

The wide variant is OK in general, because we no longer support any Windows version that lacks the wide API.

Thanks, Kai
Re: [x86,PATCH] Simple fix for Atom LEA splitting.
Hello, On 16 Sep 16:36, Uros Bizjak wrote: The patch with a fixed comment is OK otherwise. Checked into main trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-09/msg00512.html -- Thanks, K
RE: [PATCH, PR 57748] Check for out of bounds access
On Tue, 17 Sep 2013 12:45:40, Richard Biener wrote: On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener richard.guent...@gmail.com wrote: On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hello Richard, attached is my second attempt at fixing PR 57748. This time the movmisalign path is completely removed and a similar bug in the read handling of misaligned structures with a non-BLKmode is fixed too. There are several new test cases for the different possible failure modes. This patch was boot-strapped and regression tested on x86_64-unknown-linux-gnu and i686-pc-linux-gnu. Additionally I generated eCos and an eCos-application (on ARMv5 using packed structures) with an arm-eabi cross compiler, and looked for differences in the disassembled code with and without this patch, but there were none. OK for trunk?

I agree that the existing movmisalign path that you remove is simply bogus, so removing it looks fine to me. Can you give rationale to

@@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
       if (MEM_P (to_rtx)
           && GET_MODE (to_rtx) == BLKmode
           && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
+          && bitregion_start == 0
+          && bitregion_end == 0
           && bitsize > 0
           && (bitpos % bitsize) == 0
           && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

OK, as already said, I think it could be dangerous to set bitpos=0 without considering bitregion_start/end, but I think it may be possible that this cannot happen, because if bitsize is a multiple of ALIGNMENT, and bitpos is a multiple of bitsize, we probably do not have a bit-field at all. And of course I have no test case that fails without this hunk. Maybe it would be better to add an assertion here like:

  {
    gcc_assert (bitregion_start == 0 && bitregion_end == 0);
    to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
    bitpos = 0;
  }

and especially to

@@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
                                   modifier != EXPAND_STACK_PARM
                                   ? target : NULL_RTX),
                                  VOIDmode,
-                                 modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
+                                 EXPAND_MEMORY);
 
         /* If the bitfield is volatile, we want to access it in the
            field's mode, not the computed mode.

which AFAIK makes memory expansion of loads/stores from/to registers change (fail? go through stack memory?) - see handling of non-MEM return values from that expand_expr call.

I wanted to make the expansion of MEM_REF and TARGET_MEM_REF not go through the final misalign handling, which is guarded by

  if (modifier != EXPAND_WRITE && modifier != EXPAND_MEMORY ...

What we want here is most likely EXPAND_MEMORY, which returns a memory context if possible. Could you specify more explicitly what you mean by handling of non-MEM return values from that expand_expr call? Then I could try finding test cases for that.

In particular this seems to disable all movmisalign handling for MEM_REFs wrapped in component references which looks wrong. I was playing with

typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
struct S { long long a[11]; V v; } __attribute__((aligned(8),packed));
struct S a, *b = &a;
V v, w;
int main()
{
  v = b->v;
  b->v = w;
  return 0;
}

(use -fno-common) and I see that we use unaligned stores too often (even with a properly aligned MEM). The above at least shows movmisalign opportunities wrapped in component-refs.

Hmm, interesting. This does not compile differently with or without this patch. I have another observation, regarding the testcase pr50444.c:

method:
.LFB4:
        .cfi_startproc
        movq    32(%rdi), %rax
        testq   %rax, %rax
        jne     .L7
        addl    $1, 16(%rdi)
        movl    $3, %eax
        movq    %rax, 32(%rdi)
        movdqu  16(%rdi), %xmm0
        pxor    (%rdi), %xmm0
        movdqu  %xmm0, 40(%rdi)

Here the first movdqu could as well be movdqa, because 16+rdi is 128-bit aligned. In the ctor method a movdqa is used, but the SRA is very pessimistic and generates an unaligned MEM_REF. Also this example does not compile any differently with this patch.

That is, do you see anything break with just removing the movmisalign path?
I'd rather install that (with the new testcases that then pass) separately as this is a somewhat fragile area and being able to more selectively bisect/backport would be nice.

No, I think that is a good idea. Attached is the first part of the patch, which only removes the movmisalign path. Should I apply this one after regression testing?

Bernd.

Thanks, Richard.

Regards Bernd.

2013-09-17 Bernd Edlinger bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.c (expand_assignment): Remove misalignp code path.

testsuite:

PR middle-end/57748
* gcc.dg/torture/pr57748-1.c: New test.
* gcc.dg/torture/pr57748-2.c: New test.

patch-pr57748.diff Description: Binary data
[PATCH][RFC] teach loop distribution to distribute loop nests
This teaches loop distribution to distribute nested loops. I plan to commit the trivial bits of it but not the rest of the patch until I have an idea how to best limit the loop nest walk (it tries distributing nests from outer to inner loops, re-doing dependence analysis and RDG build). At this point loop distribution needs a better cost model, the ability to turn flow dependences into data dependences and turning data dependences into partition ordering dependences. Still the first thing for me to tackle is some more patterns to recognize. Bootstrapped with -ftree-loop-distribution and tested on x86_64-unknown-linux-gnu. Richard. 2013-09-17 Richard Biener rguent...@suse.de * tree-loop-distribution.c (ssa_name_has_uses_outside_loop_p): Properly handle loop nests. (classify_partition): Disable builtins for loop nests. (similar_memory_accesses): Refine cost model. (distribute_loop): Dump which loop we are trying to distribute. (tree_loop_distribution): Handle distribution of nested loops. * gfortran.dg/ldist-2.f: New testcase. * gcc.dg/tree-ssa/ldist-5.c: Adjust XFAIL reason. Index: trunk/gcc/testsuite/gfortran.dg/ldist-2.f === *** /dev/null 1970-01-01 00:00:00.0 + --- trunk/gcc/testsuite/gfortran.dg/ldist-2.f 2013-09-17 13:42:22.144740768 +0200 *** *** 0 --- 1,64 + ! { dg-do compile } + ! { dg-options -O3 -fno-tree-loop-im -ftree-loop-distribution -fdump-tree-ldist-details } + + ! 
Testcase from bwaves block_solver.f + subroutine mat_times_vec(y,x,a,axp,ayp,azp,axm,aym,azm, + $ nb,nx,ny,nz) + implicit none + integer nb,nx,ny,nz,i,j,k,m,l,kit,im1,ip1,jm1,jp1,km1,kp1 + + real*8 y(nb,nx,ny,nz),x(nb,nx,ny,nz) + + real*8 a(nb,nb,nx,ny,nz), + 1 axp(nb,nb,nx,ny,nz),ayp(nb,nb,nx,ny,nz),azp(nb,nb,nx,ny,nz), + 2 axm(nb,nb,nx,ny,nz),aym(nb,nb,nx,ny,nz),azm(nb,nb,nx,ny,nz) + + + do k=1,nz + c do j=1,ny + cdo i=1,nx + c do l=1,nb + c y(l,i,j,k)=0.0d0 + c enddo + cenddo + c enddo + + km1=mod(k+nz-2,nz)+1 + kp1=mod(k,nz)+1 + do j=1,ny + jm1=mod(j+ny-2,ny)+1 + jp1=mod(j,ny)+1 + do i=1,nx +im1=mod(i+nx-2,nx)+1 +ip1=mod(i,nx)+1 +do l=1,nb + y(l,i,j,k)=0.0d0 + do m=1,nb + y(l,i,j,k)=y(l,i,j,k)+ + 1 a(l,m,i,j,k)*x(m,i,j,k)+ + 2 axp(l,m,i,j,k)*x(m,ip1,j,k)+ + 3 ayp(l,m,i,j,k)*x(m,i,jp1,k)+ + 4 azp(l,m,i,j,k)*x(m,i,j,kp1)+ + 5 axm(l,m,i,j,k)*x(m,im1,j,k)+ + 6 aym(l,m,i,j,k)*x(m,i,jm1,k)+ + 7 azm(l,m,i,j,k)*x(m,i,j,km1) + enddo +enddo + enddo + enddo + enddo + + + + cy=x + cwhere (mask) y=tmp + return + end + + ! We fail to distribute the loop because the output dependence for the + ! two stores to y(l,i,j,k) forces them into the same partition. This is + ! because loop distribution does not promote such dependences into + ! constraints on partition ordering + + ! { dg-final { scan-tree-dump distributed: split to 2 loops ldist { xfail *-*-* } } } + ! { dg-final { cleanup-tree-dump ldist } } Index: trunk/gcc/tree-loop-distribution.c === *** trunk.orig/gcc/tree-loop-distribution.c 2013-09-17 11:51:49.0 +0200 --- trunk/gcc/tree-loop-distribution.c 2013-09-17 14:03:53.378065359 +0200 *** ssa_name_has_uses_outside_loop_p (tree d *** 624,630 { gimple use_stmt = USE_STMT (use_p); if (!is_gimple_debug (use_stmt) ! loop != loop_containing_stmt (use_stmt)) return true; } --- 624,631 { gimple use_stmt = USE_STMT (use_p); if (!is_gimple_debug (use_stmt) ! loop != loop_containing_stmt (use_stmt) ! 
!flow_loop_nested_p (loop, loop_containing_stmt (use_stmt))) return true; } *** classify_partition (loop_p loop, struct *** 1139,1149 if (stmt_has_scalar_dependences_outside_loop (loop, stmt)) { if (dump_file (dump_flags TDF_DETAILS)) ! fprintf (dump_file, not generating builtin, partition has scalar uses outside of the loop\n); partition-kind = PKIND_REDUCTION; return; } } /* Perform general partition disqualification for builtins. */ --- 1140,1162 if (stmt_has_scalar_dependences_outside_loop (loop, stmt)) { if (dump_file (dump_flags TDF_DETAILS)) ! fprintf
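For readers unfamiliar with the transformation, here is a minimal sketch in C of what distributing a loop nest means. It is not taken from the patch: the function names and the array layout are made up for illustration, and the two stores deliberately go to different arrays so that no output dependence ties them to one partition (unlike the two stores to y(l,i,j,k) in the bwaves testcase).

```c
#include <stddef.h>

/* Fused form: a single loop nest performs two unrelated stores.
   Hypothetical example, not code from tree-loop-distribution.c.  */
void fused_nest(double *y, double *z, const double *x,
                size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++) {
            y[i * cols + j] = 0.0;                    /* zeroing partition */
            z[i * cols + j] = 2.0 * x[i * cols + j];  /* compute partition */
        }
}

/* Distributed form: each partition gets its own copy of the nest.
   The result is the same as long as y, z and x do not overlap.  */
void distributed_nest(double *y, double *z, const double *x,
                      size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            y[i * cols + j] = 0.0;

    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            z[i * cols + j] = 2.0 * x[i * cols + j];
}
```

Once split, the first nest is a plain zero fill, which is the kind of pattern a compiler may be able to recognize separately.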
Re: [PATCH] Don't always instrument shifts (PR sanitizer/58413)
On Mon, Sep 16, 2013 at 03:59:12PM +, Joseph S. Myers wrote: On Mon, 16 Sep 2013, Marek Polacek wrote: On Fri, Sep 13, 2013 at 07:18:24PM +, Joseph S. Myers wrote: On Fri, 13 Sep 2013, Marek Polacek wrote: This is kind of fugly, but don't have anything better at the moment. 2013-09-13 Marek Polacek pola...@redhat.com PR sanitizer/58413 c-family/ * c-ubsan.c (ubsan_instrument_shift): Don't instrument an expression if we can prove it is correct. Shouldn't the conditions used here for an expression being proved correct match those for instrumentation, i.e. depend on flag_isoc99 and on (cxx_dialect == cxx11 || cxx_dialect == cxx1y)? I don't think so: for the unsigned case we could restrict it to C only, but it doesn't hurt doing it even for C++; in the signed case we care only about C, but we can't restrict it to flag_isoc99 only, since we need to prove the correctness even for ANSI C. I don't understand how this answers my question. I'm sorry. Please disregard the original (ugly) patch, the following applies to the new (pretty) patch http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01283.html.
* The following principle applies: for any command-line options, with ubsan enabled, if an integer operation with particular (non-constant) operands is accepted by the sanitization code at runtime, the same operation with the same operand values (and types) as constants should be accepted at compile time (and at runtime) in contexts where an integer constant expression is required. Does this patch make the compiler meet this principle, for all the different command-line options that vary what is accepted at runtime? I believe so. E.g. int i = 4, j = 3, k; k = i << j; is ok, thus the following is ok as well: case (4 << 3) (for C++/C with various -std=*).
* The following principle also applies: for any command-line options, with ubsan enabled, if an integer operation with particular (non-constant) operands is rejected by the sanitization code at runtime, the same operation with the same operand values (and types) as constants should be rejected at compile time (or at runtime) in contexts where an integer constant expression is required. Does this patch make the compiler meet this principle, for all the different command-line options that vary what is accepted at runtime? And I think this applies as well. At runtime we reject e.g. int i = 1, j = 120, k; k = i << j; and at compile-time we reject enum e { red = 0 << 120, }; Marek
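To make the accepted/rejected distinction concrete, here is a minimal sketch of a predicate for the signed left-shift case under C99 rules only. It is an illustration, not the logic in c-ubsan.c: the helper name is hypothetical, and it ignores the dependence on flag_isoc99 and the C++ dialect that the thread discusses.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when `lhs << amount` is defined
   for type int under C99 (6.5.7): the shift count must be in
   [0, width), lhs must be nonnegative, and lhs * 2^amount must be
   representable in int.  */
bool int_shl_is_defined(int lhs, int amount)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (amount < 0 || amount >= width)
        return false;          /* count out of range, e.g. 1 << 120 */
    if (lhs < 0)
        return false;          /* negative left operand */
    /* INT_MAX >> amount is floor(INT_MAX / 2^amount), so this checks
       that the shifted value does not overflow.  */
    return lhs <= (INT_MAX >> amount);
}
```

Under these assumptions a shift of 4 by 3 is accepted, while a shift by 120 is rejected regardless of the left operand, which is the consistency between run-time and constant-expression handling that the two principles ask for.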
Re: [C++ Patch] PR 58435
OK. Jason
Re: RFA: Testsuite: Add exceptions for MSP430
Hi Mike, Ok, I assume that the changes to hppa and return 0 are intentional and good… - || [istarget hppa64-hp-hpux11.23] } { - return 0; +|| [istarget hppa64-hp-hpux11.23] } { + return 0; Sorry - yes - they are just whitespace adjustments so that the entries line up. Cheers Nick
Re: [PATCH ARM]Extend thumb1_reorg to save more comparison instructions
On 17/09/13 03:16, bin.cheng wrote: -Original Message- From: Richard Earnshaw Sent: Thursday, September 12, 2013 11:24 PM To: Bin Cheng Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH ARM]Extend thumb1_reorg to save more comparison instructions On 18/04/13 06:34, Bin Cheng wrote: Sorry for the delay, I've been trying to get my head around this one. thumb1_reorg-20130417.txt Index: gcc/config/arm/arm.c == = --- gcc/config/arm/arm.c(revision 197562) +++ gcc/config/arm/arm.c(working copy) @@ -14026,6 +14026,7 @@ thumb1_reorg (void) rtx set, dest, src; rtx pat, op0; rtx prev, insn = BB_END (bb); + bool insn_clobbered = false; while (insn != BB_HEAD (bb) DEBUG_INSN_P (insn)) insn = PREV_INSN (insn); @@ -14034,12 +14035,29 @@ thumb1_reorg (void) if (INSN_CODE (insn) != CODE_FOR_cbranchsi4_insn) continue; - /* Find the first non-note insn before INSN in basic block BB. */ + /* Get the register with which we are comparing. */ + pat = PATTERN (insn); + op0 = XEXP (XEXP (SET_SRC (pat), 0), 0); + + /* Find the first flag setting insn before INSN in basic block + BB. */ gcc_assert (insn != BB_HEAD (bb)); - prev = PREV_INSN (insn); - while (prev != BB_HEAD (bb) (NOTE_P (prev) || DEBUG_INSN_P (prev))) - prev = PREV_INSN (prev); + for (prev = PREV_INSN (insn); + (!insn_clobbered +prev != BB_HEAD (bb) +(NOTE_P (prev) + || DEBUG_INSN_P (prev) + || (GET_CODE (prev) == SET This can't be right. prev is an insn of some form, so the test that it is a SET will always fail. What you need to do here is to initialize 'set' to null before the loop and then have something like || ((set = single_set (prev)) != NULL +get_attr_conds (prev) == CONDS_NOCOND))); + prev = PREV_INSN (prev)) + { + if (reg_set_p (op0, prev)) + insn_clobbered = true; + } + /* Skip if op0 is clobbered by insn other than prev. */ + if (insn_clobbered) + continue; + set = single_set (prev); This now becomes redundant and ... if (!set) continue; This will be based on the set you extracted above. 
Hi Richard, here is the updated patch according to your comments. Tested on thumb1, please review. OK. R.
Re: [PATCH][Resend][tree-optimization] Fix PR58088
On 09/09/13 10:56, Kyrylo Tkachov wrote: [Resending, since I was away and not pinging it] Hi all, In PR58088 the constant folder goes into an infinite recursion and runs out of stack space because of two conflicting optimisations: (X * C1) C2 plays dirty when nested inside an IOR expression like so: ((X * C1) C2) | C4. One can undo the other leading to an infinite recursion. Thanks to Marek for finding the IOR case. This patch fixes that by checking in the IOR case that the change to C2 will not conflict with the AND case transformation. Example testcases in the PR on bugzilla. This affects both trunk and 4.8 and regresses and bootstraps cleanly on both. Bootstrapped on x86_64-linux-gnu and tested arm-none-eabi on qemu. Ok for trunk and 4.8? Thanks, Kyrill 2013-09-09 Kyrylo Tkachov kyrylo.tkac...@arm.com PR tree-optimization/58088 * fold-const.c (mask_with_trailing_zeros): New function. (fold_binary_loc): Make sure we don't recurse infinitely when the X in (X C1) | C2 is a tree of the form (Y * K1) K2. Use mask_with_trailing_zeros where appropriate. 2013-09-09 Kyrylo Tkachov kyrylo.tkac...@arm.com PR tree-optimization/58088 * gcc.c-torture/compile/pr58088.c: New test.= pr58088.patch @@ -9942,6 +9942,22 @@ exact_inverse (tree type, tree cst) } } +/* Mask out the tz least significant bits of X of type TYPE where +tz is the number of trailing zeroes in Y. */ +static double_int +mask_with_tz (tree type, double_int x, double_int y) +{ + int tz = y.trailing_zeros (); + if (tz 0) blank line between declarations and statements. @@ -11266,6 +11282,7 @@ fold_binary_loc (location_t loc, { double_int c1, c2, c3, msk; int width = TYPE_PRECISION (type), w; + bool valid = true; c1 = tree_to_double_int (TREE_OPERAND (arg0, 1)); c2 = tree_to_double_int (arg1); blank line after declarations before code body. 
} - if (c3 != c1) + /* If X is a tree of the form (Y * K1) K2, this might conflict Should be a blank line before the comment as well +with that optimization from the BIT_AND_EXPR optimizations. +This could end up in an infinite recursion. */ + if (TREE_CODE (TREE_OPERAND (arg0, 0)) == MULT_EXPR + TREE_CODE (TREE_OPERAND (TREE_OPERAND (arg0, 0), 1)) + == INTEGER_CST) + { + tree t = TREE_OPERAND (TREE_OPERAND (arg0, 0), 1); + double_int masked = mask_with_tz (type, c3, tree_to_double_int (t)); + valid = masked != c1; blank line before statements after declarations. + } + + if (c3 != c1 valid) 'valid' should come before the comparison test. Furthermore, I think 'valid' is misleading; 'try_simplify' would be a more accurate description of what the test is about. OK with those changes. R.
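The helper under review can be illustrated on plain 64-bit integers. This is only a sketch of the idea in the patch comment ("mask out the tz least significant bits of X, where tz is the number of trailing zeroes in Y"); the function names are hypothetical, it uses uint64_t instead of double_int, and returning 0 when Y is zero is an assumption made here, not something the thread specifies.

```c
#include <stdint.h>

/* Number of trailing zero bits in y; 64 when y is 0 (assumption).  */
static int trailing_zeros64(uint64_t y)
{
    int n = 0;

    if (y == 0)
        return 64;
    while ((y & 1) == 0) {
        y >>= 1;
        n++;
    }
    return n;
}

/* Clear the tz least significant bits of x, where tz is the number
   of trailing zeros in y.  */
uint64_t mask_with_tz64(uint64_t x, uint64_t y)
{
    int tz = trailing_zeros64(y);

    if (tz >= 64)
        return 0;              /* avoid an out-of-range shift */
    if (tz > 0)
        x &= ~((UINT64_C(1) << tz) - 1);
    return x;
}
```

The intuition, as described in the submission, is that a product Y * K1 always has at least as many trailing zero bits as K1, so those low bits of a mask applied to it carry no information; comparing the masked constant with C1 is how the patch detects that the IOR transformation would fight the AND one.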
[C++ PATCH] demangler fix (take 2)
Hi all, This is a resubmission of my previous demangler fix [1] rewritten to avoid using hashtables and other libiberty features. From the above referenced email: d_print_comp maintains a certain amount of scope across calls (namely a stack of templates) which is used when evaluating references in template argument lists. If such a reference is later used from a subtitution then the scope in force at the time of the substitution is used. This appears to be wrong (I say appears because I couldn't find anything in the API [2] to clarify this). The attached patch causes the demangler to capture the scope the first time such a reference is traversed, and to use that captured scope on subsequent traversals. This fixes GDB PR 14963 [3] whereby a reference is resolved against the wrong template, causing an infinite loop and eventual stack overflow and segmentation fault. I've added the result to the demangler test suite, but I know of no way to check the validity of the demangled symbol other than by inspection (and I am no expert here!) If anybody knows a way to check this then please let me know! Otherwise, I hope this not-really-checked demangled version is acceptable. Thanks, Gary [1] http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00215.html [2] http://mentorembedded.github.io/cxx-abi/abi.html#mangling [3] http://sourceware.org/bugzilla/show_bug.cgi?id=14963 -- http://gbenson.net/ diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog index 89e108a..2ff8216 100644 --- a/libiberty/ChangeLog +++ b/libiberty/ChangeLog @@ -1,3 +1,20 @@ +2013-09-17 Gary Benson gben...@redhat.com + + * cp-demangle.c (struct d_saved_scope): New structure. + (struct d_print_info): New fields saved_scopes and + num_saved_scopes. + (d_print_init): Initialize the above. + (d_print_free): New function. + (cplus_demangle_print_callback): Call the above. + (d_copy_templates): New function. + (d_print_comp): New variables saved_templates and + need_template_restore. 
+ [DEMANGLE_COMPONENT_REFERENCE, + DEMANGLE_COMPONENT_RVALUE_REFERENCE]: Capture scope the first + time the component is traversed, and use the captured scope for + subsequent traversals. + * testsuite/demangle-expected: Add regression test. + 2013-09-10 Paolo Carlini paolo.carl...@oracle.com PR bootstrap/58386 diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index 70f5438..a199f6d 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -275,6 +275,18 @@ struct d_growable_string int allocation_failure; }; +/* A demangle component and some scope captured when it was first + traversed. */ + +struct d_saved_scope +{ + /* The component whose scope this is. */ + const struct demangle_component *container; + /* The list of templates, if any, that was current when this + scope was captured. */ + struct d_print_template *templates; +}; + enum { D_PRINT_BUFFER_LENGTH = 256 }; struct d_print_info { @@ -302,6 +314,10 @@ struct d_print_info int pack_index; /* Number of d_print_flush calls so far. */ unsigned long int flush_count; + /* Array of saved scopes for evaluating substitutions. */ + struct d_saved_scope *saved_scopes; + /* Number of saved scopes in the above array. */ + int num_saved_scopes; }; #ifdef CP_DEMANGLE_DEBUG @@ -3665,6 +3681,30 @@ d_print_init (struct d_print_info *dpi, demangle_callbackref callback, dpi-opaque = opaque; dpi-demangle_failure = 0; + + dpi-saved_scopes = NULL; + dpi-num_saved_scopes = 0; +} + +/* Free a print information structure. */ + +static void +d_print_free (struct d_print_info *dpi) +{ + int i; + + for (i = 0; i dpi-num_saved_scopes; i++) +{ + struct d_print_template *ts, *tn; + + for (ts = dpi-saved_scopes[i].templates; ts != NULL; ts = tn) + { + tn = ts-next; + free (ts); + } +} + + free (dpi-saved_scopes); } /* Indicate that an error occurred during printing, and test for error. 
*/ @@ -3749,6 +3789,7 @@ cplus_demangle_print_callback (int options, demangle_callbackref callback, void *opaque) { struct d_print_info dpi; + int success; d_print_init (dpi, callback, opaque); @@ -3756,7 +3797,9 @@ cplus_demangle_print_callback (int options, d_print_flush (dpi); - return ! d_print_saw_error (dpi); + success = ! d_print_saw_error (dpi); + d_print_free (dpi); + return success; } /* Turn components into a human readable string. OPTIONS is the @@ -3913,6 +3956,36 @@ d_print_subexpr (struct d_print_info *dpi, int options, d_append_char (dpi, ')'); } +/* Return a shallow copy of the current list of templates. + On error d_print_error is called and a partial list may + be returned. Whatever is returned must be freed. */ + +static struct d_print_template * +d_copy_templates (struct d_print_info *dpi) +{ + struct d_print_template *src, *result,
RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
Hello, Has anyone had a chance to look at this. The C++ part is only a week old, but the C part has been in review for ~3 weeks. I would greatly appreciate if someone could review this and approve for trunk if it is Ok for trunk. Thanks, Balaji V. Iyer. -Original Message- From: Iyer, Balaji V Sent: Wednesday, September 11, 2013 2:18 PM To: r...@redhat.com; Jason Merrill (ja...@redhat.com); Jeff Law; Aldy Hernandez (al...@redhat.com) Cc: gcc-patches@gcc.gnu.org Subject: RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++) Hello Everyone, Couple weeks back, I had submitted a patch for review that will implement Cilk keywords (_Cilk_spawn and _Cilk_sync) into the C compiler. I recently finished C++ implementation also. In this email, I am attaching 2 patches: 1 for C (and the common parts for C and C++) and 1 for C++. The C++ Changelog is labelled cp-ChangeLog.cilkplus and the other one is just ChangeLog.cilkplus. There isn't much changes in the C patch. Only noticeable changes would be moving functions to the common parts so that C++ can use them. It passes all the tests and does not affect (by affect I mean fail a passing test or pass a failing one) any of the other tests in the testsuite directory. Is this Ok for trunk? Thanks, Balaji V. Iyer. -Original Message- From: Iyer, Balaji V Sent: Friday, August 30, 2013 1:02 PM To: gcc-patches@gcc.gnu.org Subject: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C The email seem to be bouncing gcc-patches. I have gzipped my patch. Thanks, Balaji V. Iyer. -Original Message- From: Iyer, Balaji V Sent: Friday, August 30, 2013 11:42 AM To: 'Aldy Hernandez' Cc: r...@redhat.com; Jeff Law; gcc-patches@gcc.gnu.org Subject: RE: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C Hi Aldy, Attached, please find a fixed patch and the changelog entries. 
-Original Message- From: Aldy Hernandez [mailto:al...@redhat.com] Sent: Wednesday, August 28, 2013 2:36 PM To: Iyer, Balaji V Cc: r...@redhat.com; Jeff Law; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C On 08/27/13 16:27, Iyer, Balaji V wrote: Hello Aldy, I went through all the emails and here are the major issues that I could gather (other than lowering the keywords after gimplification, which I am skipping since it is more of an optimization for now). Ok, for now I am fine with delaying handling all this as a gimple tuple since most of your code lives in it's only little world :). But I will go on record saying that part of the reason that you have to handle CALL_EXPR, MODIFY_EXPR, INIT_EXPR and such is because you don't have easy gimplified code to examine. Anyways, agreed, you can do this later. 1. Calling the gimplify_cilk_spawn on top of the gimplify_expr before the switch-statement could slow the compiler down 2. I need a CILK_SPAWN_STMT case in the switch statement in gimplify_expr (). 3. No test for catching the suspicious spawned function warning 4. Reasoning for expanding the 2 builtin functions in builtins.c instead of just inserting the appropriate expanded-code when I am inserting the function call. Did I miss anything else (or misunderstand anything you pointed out)? Here are my answers to those questions above and am attaching a fixed patch with the changelog entries: 1 2(partial): There are 3 places where we could have _Cilk_spawn: INIT_EXPR, CALL_EXPR and MODIFY_EXPR. INIT_EXPR and MODIFY_EXPRS are both gimplified using gimplify_modify_expr. I have moved the cilk_detect_spawn into this function. We will go into the cilk_detect_spawn if cilk plus is enabled, and if there is a cilk_frame (meaning the function has a Cilk_spawn in it) thereby reducing the number of hits into this function significantly. 
Inside this function, it will go into the function that has a spawned function call and then unwrap the CILK_SPAWN_STMT wrapper and returns true. This shouldn't cause a huge compilation time hit. 2. To handle CALL_EXPR (e.g. _Cilk_spawn foo (x), where foo returns a void or the return value of it is ignored), I have added a CILK_SPAWN_STMT case. Again, I am calling the detect_cilk_spawn and we will only step into this function if Cilk Plus is enabled and if there is a cilk-frame (i.e saying the function has a cilk spawn in it). If there is an error (seen_error () == true), then it just falls through into CALL_EXPR and is handled like a normal call expr not spawned expression. 3. This warning rarely get
[PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
This patch adds the no_sanitize_undefined attribute, so the user can tell that a particular function should be ignored by ubsan. Ran ubsan testsuite/bootstrap-ubsan on x86_64-linux, ok for trunk? 2013-09-17 Marek Polacek pola...@redhat.com PR sanitizer/58411 * doc/extend.texi: Document no_sanitize_undefined attribute. * builtins.c (fold_builtin_0): Don't sanitize function if it has the no_sanitize_undefined attribute. c-family/ * c-common.c (handle_no_sanitize_undefined_attribute): New function. Declare it. (struct attribute_spec c_common_att): Add no_sanitize_undefined. cp/ * typeck.c (cp_build_binary_op): Don't sanitize function if it has the no_sanitize_undefined attribute. c/ * c-typeck.c (build_binary_op): Don't sanitize function if it has the no_sanitize_undefined attribute. testsuite/ * c-c++-common/ubsan/attrib-1.c: New test. --- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200 +++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200 @@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int, + bool *); static tree handle_noinline_attribute (tree *, tree, tree, int, bool *); static tree handle_noclone_attribute (tree *, tree, tree, int, bool *); static tree handle_leaf_attribute (tree *, tree, tree, int, bool *); @@ -722,6 +724,9 @@ const struct attribute_spec c_common_att { no_sanitize_address,0, 0, true, false, false, handle_no_sanitize_address_attribute, false }, + { no_sanitize_undefined, 0, 0, true, false, false, + handle_no_sanitize_undefined_attribute, + false }, { warning, 1, 1, true, false, false, handle_error_attribute, false }, { error, 1, 1, true, false, false, @@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib return NULL_TREE; } +/* Handle a no_sanitize_undefined attribute; arguments as in + struct attribute_spec.handler. 
*/ + +static tree +handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, + bool *no_add_attrs) +{ + if (TREE_CODE (*node) != FUNCTION_DECL) +{ + warning (OPT_Wattributes, %qE attribute ignored, name); + *no_add_attrs = true; +} + + return NULL_TREE; +} + /* Handle a noinline attribute; arguments as in struct attribute_spec.handler. */ --- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200 +++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200 @@ -2136,6 +2136,7 @@ attributes are currently defined for fun @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline}, @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial}, @code{no_sanitize_address}, @code{no_address_safety_analysis}, +@code{no_sanitize_undefined}, @code{error} and @code{warning}. Several other attributes are defined for functions on particular target systems. Other attributes, including @code{section} are @@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is @code{no_sanitize_address} attribute, new code should use @code{no_sanitize_address}. +@item no_sanitize_undefined +@cindex @code{no_sanitize_undefined} function attribute +The @code{no_sanitize_undefined} attribute on functions is used +to inform the compiler that it should not check for undefined behavior +in the function when compiling with the @option{-fsanitize=undefined} option. + @item regparm (@var{number}) @cindex @code{regparm} attribute @cindex functions that are passed arguments in registers on the 386 --- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200 +++ gcc/cp/typeck.c 2013-09-17 16:11:20.601743694 +0200 @@ -4887,6 +4887,8 @@ cp_build_binary_op (location_t location, if ((flag_sanitize SANITIZE_UNDEFINED) !processing_template_decl current_function_decl != 0 + !lookup_attribute (no_sanitize_undefined, + DECL_ATTRIBUTES (current_function_decl)) (doing_div_or_mod || doing_shift)) { /* OP0 and/or OP1 might have side-effects. 
*/ --- gcc/c/c-typeck.c.mp22013-09-17 16:09:31.423381687 +0200 +++ gcc/c/c-typeck.c2013-09-17 16:10:00.626476422 +0200 @@ -10498,6 +10498,8 @@ build_binary_op (location_t location, en if (flag_sanitize SANITIZE_UNDEFINED current_function_decl != 0 + !lookup_attribute (no_sanitize_undefined, + DECL_ATTRIBUTES (current_function_decl)) (doing_div_or_mod ||
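A short usage sketch of the attribute documented above, assuming a GCC that already contains this patch. The wrapper macro and the function names are hypothetical; the macro only keeps the example compiling quietly with compilers that do not know the attribute.

```c
/* Hypothetical portability wrapper around the proposed attribute.  */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_SANITIZE_UNDEFINED __attribute__ ((no_sanitize_undefined))
#else
#define NO_SANITIZE_UNDEFINED
#endif

/* With -fsanitize=undefined, the division here is left
   uninstrumented because the enclosing function carries the
   attribute.  */
NO_SANITIZE_UNDEFINED
int unchecked_div(int a, int b)
{
    return a / b;
}

/* The same operation without the attribute stays instrumented.  */
int checked_div(int a, int b)
{
    return a / b;
}
```

Per the handler in the patch, putting the attribute on anything other than a function declaration yields an "attribute ignored" warning under -Wattributes.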
Disable creation of local aliases on targets w/o alias support
Hi,
this patch should fix the HP-PA bootstrap issue where we create local aliases but the target has no support for them.

Bootstrapped/regtested x86_64-linux (with aliases disabled) and committed.

PR middle-end/58329
* ipa-devirt.c (ipa_devirt): Be ready for symtab_nonoverwritable_alias to return NULL.
* ipa.c (function_and_variable_visibility): Likewise.
* ipa-profile.c (ipa_profile): Likewise.

Index: ipa-devirt.c
===================================================================
--- ipa-devirt.c (revision 202650)
+++ ipa-devirt.c (working copy)
@@ -1098,7 +1098,13 @@ ipa_devirt (void)
          cgraph_node_name (likely_target),
          likely_target->symbol.order);
       if (!symtab_can_be_discarded ((symtab_node) likely_target))
-        likely_target = cgraph (symtab_nonoverwritable_alias ((symtab_node)likely_target));
+        {
+          cgraph_node *alias;
+          alias = cgraph (symtab_nonoverwritable_alias
+                          ((symtab_node)likely_target));
+          if (alias)
+            likely_target = alias;
+        }
       nconverted++;
       update = true;
       cgraph_turn_edge_to_speculative
Index: ipa.c
===================================================================
--- ipa.c (revision 202650)
+++ ipa.c (working copy)
@@ -998,7 +998,7 @@ function_and_variable_visibility (bool w
       {
         struct cgraph_node *alias = cgraph (symtab_nonoverwritable_alias ((symtab_node) node));
-        if (alias != node)
+        if (alias && alias != node)
           {
             while (node->callers)
               {
Index: ipa-profile.c
===================================================================
--- ipa-profile.c (revision 202650)
+++ ipa-profile.c (working copy)
@@ -625,7 +625,13 @@ ipa_profile (void)
          of N2.  Speculate on the local alias to allow inlining.  */
       if (!symtab_can_be_discarded ((symtab_node) n2))
-        n2 = cgraph (symtab_nonoverwritable_alias ((symtab_node)n2));
+        {
+          cgraph_node *alias;
+          alias = cgraph (symtab_nonoverwritable_alias
+                          ((symtab_node)n2));
+          if (alias)
+            n2 = alias;
+        }
       nconverted++;
       cgraph_turn_edge_to_speculative (e, n2,
Index: symtab.c
===================================================================
--- symtab.c (revision 202650)
+++ symtab.c (working copy)
@@ -1083,6 +1083,10 @@ symtab_nonoverwritable_alias (symtab_nod
                                     (void *)&new_node, true);
   if (new_node)
     return new_node;
+#ifndef ASM_OUTPUT_DEF
+  /* If aliases aren't supported by the assembler, fail.  */
+  return NULL;
+#endif
   /* Otherwise create a new one.  */
   new_decl = copy_node (node->symbol.decl);
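The caller-side pattern in this patch (ask for a local alias, keep the original node when none can be made) can be sketched outside GCC as follows. All types and names below are mock-ups for illustration; none of them are the real symtab or cgraph API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock symbol node: just a name and an optional local alias.  */
struct sym_node {
    const char *name;
    struct sym_node *local_alias;
};

/* Stand-in for an alias lookup that may fail: returns NULL when the
   target cannot emit aliases, mirroring the new early return.  */
static struct sym_node *
nonoverwritable_alias(struct sym_node *node, bool target_has_aliases)
{
    if (!target_has_aliases)
        return NULL;
    return node->local_alias;
}

/* Pick the node to speculate on: prefer the alias when the symbol
   cannot be discarded, otherwise (or on failure) keep the node.  */
struct sym_node *
speculation_target(struct sym_node *likely, bool can_be_discarded,
                   bool target_has_aliases)
{
    if (!can_be_discarded) {
        struct sym_node *alias =
            nonoverwritable_alias(likely, target_has_aliases);
        if (alias)
            likely = alias;
    }
    return likely;
}
```

The point of the NULL check is that every caller degrades to the original target instead of dereferencing a missing alias.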
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 05:24:22PM +0200, Marek Polacek wrote: This patch adds the no_sanitize_undefined attribute, so the user can tell that a particular function should be ignored by ubsan. Does this correspond to some llvm attribute? --- gcc/builtins.c.mp22013-09-17 16:13:26.623161281 +0200 +++ gcc/builtins.c2013-09-17 16:15:20.846557451 +0200 @@ -10313,7 +10313,9 @@ fold_builtin_0 (location_t loc, tree fnd return fold_builtin_classify_type (NULL_TREE); case BUILT_IN_UNREACHABLE: - if (flag_sanitize SANITIZE_UNREACHABLE) + if (flag_sanitize SANITIZE_UNREACHABLE +!lookup_attribute (no_sanitize_undefined, + DECL_ATTRIBUTES (current_function_decl))) return ubsan_instrument_unreachable (loc); break; I wonder if current_function_decl couldn't be NULL here, say if __builtin_unreachable () appears in C++ global var initializers or similar. Jakub
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 05:37:51PM +0200, Jakub Jelinek wrote: On Tue, Sep 17, 2013 at 05:24:22PM +0200, Marek Polacek wrote: This patch adds the no_sanitize_undefined attribute, so the user can tell that a particular function should be ignored by ubsan. Does this correspond to some llvm attribute? No, it seems they don't have a flag for disabling the ubsan; they only have flags for disabling asan/tsan/msan. --- gcc/builtins.c.mp2 2013-09-17 16:13:26.623161281 +0200 +++ gcc/builtins.c 2013-09-17 16:15:20.846557451 +0200 @@ -10313,7 +10313,9 @@ fold_builtin_0 (location_t loc, tree fnd return fold_builtin_classify_type (NULL_TREE); case BUILT_IN_UNREACHABLE: - if (flag_sanitize SANITIZE_UNREACHABLE) + if (flag_sanitize SANITIZE_UNREACHABLE + !lookup_attribute (no_sanitize_undefined, + DECL_ATTRIBUTES (current_function_decl))) return ubsan_instrument_unreachable (loc); break; I wonder if current_function_decl couldn't be NULL here, say if __builtin_unreachable () appears in C++ global var initializers or similar. Well I wonder too ;) I thought it can't be NULL, and tried this struct C { C() { __builtin_unreachable (); } }; C c; int main () { return 0; } and here everything looks ok. Or is this not the proper way of checking that? Surely, I can add the check for current_function_decl != NULL just to be on the safe side... Marek
Fix PR58332
Hi,
this patch makes the inliner not inline functions with the -O0 optimize attribute, and also not inline into such functions.

Bootstrapped/regtested x86_64-linux, committed.

PR middle-end/58332
* gcc.c-torture/compile/pr58332.c: New testcase.
* cif-code.def (FUNCTION_NOT_OPTIMIZED): New CIF code.
* ipa-inline.c (can_inline_edge_p): Do not downgrade FUNCTION_NOT_OPTIMIZED.
* ipa-inline-analysis.c (compute_inline_parameters): Function not optimized is not inlinable unless it is alwaysinline.
(inline_analyze_function): Force calls in not optimized function not inlinable.

Index: testsuite/gcc.c-torture/compile/pr58332.c
===================================================================
--- testsuite/gcc.c-torture/compile/pr58332.c (revision 0)
+++ testsuite/gcc.c-torture/compile/pr58332.c (revision 0)
@@ -0,0 +1,2 @@
+static inline int foo (int x) { return x + 1; }
+__attribute__ ((__optimize__ (0))) int bar (void) { return foo (100); }
Index: cif-code.def
===================================================================
--- cif-code.def (revision 202656)
+++ cif-code.def (working copy)
@@ -37,6 +37,9 @@ DEFCIFCODE(UNSPECIFIED , "")
    functions that have not been rejected for inlining yet.  */
 DEFCIFCODE(FUNCTION_NOT_CONSIDERED, N_("function not considered for inlining"))
+/* Caller is compiled with optimizations disabled.  */
+DEFCIFCODE(FUNCTION_NOT_OPTIMIZED, N_("caller is not optimized"))
+
 /* Inlining failed owing to unavailable function body.  */
 DEFCIFCODE(BODY_NOT_AVAILABLE, N_("function body not available"))
Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 202656)
+++ ipa-inline.c (working copy)
@@ -275,7 +275,8 @@ can_inline_edge_p (struct cgraph_edge *e
     }
   else if (e->call_stmt_cannot_inline_p)
     {
-      e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
+      if (e->inline_failed != CIF_FUNCTION_NOT_OPTIMIZED)
+        e->inline_failed = CIF_MISMATCHED_ARGUMENTS;
       inlinable = false;
     }
   /* Don't inline if the functions have different EH personalities.  */
Index: ipa-inline-analysis.c
===================================================================
--- ipa-inline-analysis.c (revision 202656)
+++ ipa-inline-analysis.c (working copy)
@@ -2664,7 +2664,11 @@ compute_inline_parameters (struct cgraph
   info->stack_frame_offset = 0;
   /* Can this function be inlined at all?  */
-  info->inlinable = tree_inlinable_function_p (node->symbol.decl);
+  if (!optimize && !lookup_attribute ("always_inline",
+                                      DECL_ATTRIBUTES (node->symbol.decl)))
+    info->inlinable = false;
+  else
+    info->inlinable = tree_inlinable_function_p (node->symbol.decl);
   /* Type attributes can use parameter indices to describe them.  */
   if (TYPE_ATTRIBUTES (TREE_TYPE (node->symbol.decl)))
@@ -3678,6 +3682,22 @@ inline_analyze_function (struct cgraph_n
   if (optimize && !node->thunk.thunk_p)
     inline_indirect_intraprocedural_analysis (node);
   compute_inline_parameters (node, false);
+  if (!optimize)
+    {
+      struct cgraph_edge *e;
+      for (e = node->callees; e; e = e->next_callee)
+        {
+          if (e->inline_failed == CIF_FUNCTION_NOT_CONSIDERED)
+            e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
+          e->call_stmt_cannot_inline_p = true;
+        }
+      for (e = node->indirect_calls; e; e = e->next_callee)
+        {
+          if (e->inline_failed == CIF_FUNCTION_NOT_CONSIDERED)
+            e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
+          e->call_stmt_cannot_inline_p = true;
+        }
+    }
   pop_cfun ();
 }
Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)
On 09/17/2013 08:50 AM, Iyer, Balaji V wrote: Hello, Has anyone had a chance to look at this. The C++ part is only a week old, but the C part has been in review for ~3 weeks. I would greatly appreciate if someone could review this and approve for trunk if it is Ok for trunk. Obviously not yet. Everyone is pretty busy right now. jeff
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 06:45:25PM +0200, Marek Polacek wrote:
> --- gcc/builtins.c.mp2 2013-09-17 16:13:26.623161281 +0200
> +++ gcc/builtins.c 2013-09-17 18:42:11.338273135 +0200
> @@ -10313,7 +10313,10 @@ fold_builtin_0 (location_t loc, tree fnd
>        return fold_builtin_classify_type (NULL_TREE);
>      case BUILT_IN_UNREACHABLE:
> -      if (flag_sanitize & SANITIZE_UNREACHABLE)
> +      if (flag_sanitize & SANITIZE_UNREACHABLE
> +          && current_function_decl != 0
> +          && !lookup_attribute ("no_sanitize_undefined",
> +                                DECL_ATTRIBUTES (current_function_decl)))
>          return ubsan_instrument_unreachable (loc);
>        break;

I'd say you should instead use (current_function_decl == NULL || !lookup_attribute (...)) so that you instrument even outside of fn bodies, just with no way to turn it off in the code (only command line options).

Ok with that change.

Jakub
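The difference between the two conditions is easy to see in a small stand-alone model. Everything below is a mock-up for illustration (the struct, the attribute list, and both function names are hypothetical, not GCC internals); only the shape of the boolean expression follows the suggestion to use (current_function_decl == NULL || !lookup_attribute (...)).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock function declaration: a list of attribute names.  */
struct fn_decl {
    const char *const *attrs;
    size_t n_attrs;
};

/* Mock attribute lookup over the list above.  */
static bool has_attribute(const struct fn_decl *fn, const char *name)
{
    for (size_t i = 0; i < fn->n_attrs; i++)
        if (strcmp(fn->attrs[i], name) == 0)
            return true;
    return false;
}

/* Instrument when the sanitizer is on and either there is no
   enclosing function (current_fn == NULL) or the enclosing function
   does not opt out.  A plain "current_fn != NULL && ..." test would
   instead skip instrumentation outside function bodies.  */
bool should_instrument_unreachable(bool sanitize_unreachable,
                                   const struct fn_decl *current_fn)
{
    return sanitize_unreachable
           && (current_fn == NULL
               || !has_attribute(current_fn, "no_sanitize_undefined"));
}
```

With this form, code outside any function body is still instrumented and can only be exempted through command-line options.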
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 06:51:59PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 17, 2013 at 06:45:25PM +0200, Marek Polacek wrote:
> > --- gcc/builtins.c.mp2        2013-09-17 16:13:26.623161281 +0200
> > +++ gcc/builtins.c    2013-09-17 18:42:11.338273135 +0200
> > @@ -10313,7 +10313,10 @@ fold_builtin_0 (location_t loc, tree fnd
> >        return fold_builtin_classify_type (NULL_TREE);
> >
> >      case BUILT_IN_UNREACHABLE:
> > -      if (flag_sanitize & SANITIZE_UNREACHABLE)
> > +      if (flag_sanitize & SANITIZE_UNREACHABLE
> > +          && current_function_decl != 0
> > +          && !lookup_attribute ("no_sanitize_undefined",
> > +                                DECL_ATTRIBUTES (current_function_decl)))
> >          return ubsan_instrument_unreachable (loc);
> >        break;
> >
>
> I'd say you should instead use
> (current_function_decl == NULL || !lookup_attribute (...))
> so that you instrument even outside of fn bodies, just with no way to turn
> it off in the code (only command line options).
>
> Ok with that change.

Thanks, will commit the following tomorrow if no one objects...

2013-09-17  Marek Polacek  pola...@redhat.com

        PR sanitizer/58411
        * doc/extend.texi: Document no_sanitize_undefined attribute.
        * builtins.c (fold_builtin_0): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

c-family/
        * c-common.c (handle_no_sanitize_undefined_attribute): New function.
        Declare it.
        (struct attribute_spec c_common_att): Add no_sanitize_undefined.

cp/
        * typeck.c (cp_build_binary_op): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

c/
        * c-typeck.c (build_binary_op): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

testsuite/
        * c-c++-common/ubsan/attrib-1.c: New test.

--- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200 +++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200 @@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int, + bool *); static tree handle_noinline_attribute (tree *, tree, tree, int, bool *); static tree handle_noclone_attribute (tree *, tree, tree, int, bool *); static tree handle_leaf_attribute (tree *, tree, tree, int, bool *); @@ -722,6 +724,9 @@ const struct attribute_spec c_common_att { no_sanitize_address,0, 0, true, false, false, handle_no_sanitize_address_attribute, false }, + { no_sanitize_undefined, 0, 0, true, false, false, + handle_no_sanitize_undefined_attribute, + false }, { warning, 1, 1, true, false, false, handle_error_attribute, false }, { error, 1, 1, true, false, false, @@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib return NULL_TREE; } +/* Handle a no_sanitize_undefined attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, + bool *no_add_attrs) +{ + if (TREE_CODE (*node) != FUNCTION_DECL) +{ + warning (OPT_Wattributes, %qE attribute ignored, name); + *no_add_attrs = true; +} + + return NULL_TREE; +} + /* Handle a noinline attribute; arguments as in struct attribute_spec.handler. */ --- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200 +++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200 @@ -2136,6 +2136,7 @@ attributes are currently defined for fun @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline}, @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial}, @code{no_sanitize_address}, @code{no_address_safety_analysis}, +@code{no_sanitize_undefined}, @code{error} and @code{warning}. 
Several other attributes are defined for functions on particular target systems. Other attributes, including @code{section} are @@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is @code{no_sanitize_address} attribute, new code should use @code{no_sanitize_address}. +@item no_sanitize_undefined +@cindex @code{no_sanitize_undefined} function attribute +The @code{no_sanitize_undefined} attribute on functions is used +to inform the compiler that it should not check for undefined behavior +in the function when compiling with the @option{-fsanitize=undefined} option. + @item regparm (@var{number}) @cindex @code{regparm} attribute @cindex functions that are passed arguments in registers on the 386 --- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200 +++ gcc/cp/typeck.c
Re: New GCC options for loop vectorization
On Tue, Sep 17, 2013 at 1:20 AM, Richard Biener richard.guent...@gmail.com wrote: On Mon, Sep 16, 2013 at 10:24 PM, Xinliang David Li davi...@google.com wrote: On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li davi...@google.com wrote: On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li davi...@google.com wrote: Currently -ftree-vectorize turns on both loop and slp vectorizations, but there is no simple way to turn on loop vectorization alone. The logic for default O3 setting is also complicated. In this patch, two new options are introduced: 1) -ftree-loop-vectorize This option is used to turn on loop vectorization only. option -ftree-slp-vectorize also becomes a first class citizen, and no funny business of Init(2) is needed. With this change, -ftree-vectorize becomes a simple alias to -ftree-loop-vectorize + -ftree-slp-vectorize. For instance, to turn on only slp vectorize at O3, the old way is: -O3 -fno-tree-vectorize -ftree-slp-vectorize With the new change it becomes: -O3 -fno-loop-vectorize To turn on only loop vectorize at O2, the old way is -O2 -ftree-vectorize -fno-slp-vectorize The new way is -O2 -ftree-loop-vectorize 2) -ftree-vect-loop-peeling This option is used to turn on/off loop peeling for alignment. In the long run, this should be folded into the cheap cost model proposed by Richard. This option is also useful in scenarios where peeling can introduce runtime problems: http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html which happens to be common in practice. Patch attached. Compiler boostrapped. Ok after testing? I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2). Ok. Can you also comment on 2) ? I think we want to decide how granular we want to control the vectorizer and using which mechanism. 
My cost-model re-org makes ftree-vect-loop-version a no-op (basically removes it), so 2) looks like a step backwards in this context. Using cost model to do a coarse grain control/configuration is certainly something we want, but having a fine grain control is still useful. So, can you summarize what pieces (including versioning) of the vectorizer you'd want to be able to disable separately? Loop peeling seems to be the main one. There is also a correctness issue related. For instance, the following code is common in practice, but loop peeling wrongly assumes initial base-alignment and generates aligned mov instruction after peeling, leading to SEGV. Peeling is not something we can blindly turned on -- even when it is on, there should be a way to turn it off explicitly: char a[1]; void foo(int n) { int* b = (int*)(a+n); int i = 0; for (; i 1000; ++i) b[i] = 1; } int main(int argn, char** argv) { foo(argn); } But that's just a bug that should be fixed (looking into it). This kind of code is not uncommon for certain applications (e.g, group varint decoding). Besides, the code like this may be built with -fno-strict-aliasing. Just disabling peeling for alignment may get you into the versioning for alignment path (and thus an unvectorized loop at runtime). This is not true for target supporting mis-aligned access. I have not seen a case where alignment driver loop version happens on x86. Also it's know that the alignment peeling code needs some serious TLC (it's outcome depends on the order of DRs, the cost model it uses leaves to be desired as we cannot distinguish between unaligned load and store costs). Yet another reason to turn it off as it is not effective anyways? As said I'll disable all remains of -ftree-vect-loop-version with the cost model patch because it wasn't guarding versioning for aliasing but only versioning for alignment. 
We have to be consistent here - if we add a way to disable peeling for alignment then we certainly don't want to remove the ability to disable versioning for alignment, no? yes, for consistency, the version control flag may also be useful to be kept. David Richard. thanks, David Richard. I've stopped a quick try doing 1) myself because @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options opts-x_flag_ipa_reference = false; break; +case OPT_ftree_vectorize: + if (!opts_set-x_flag_tree_loop_vectorize) + opts-x_flag_tree_loop_vectorize = value; + if (!opts_set-x_flag_tree_slp_vectorize) + opts-x_flag_tree_slp_vectorize = value; + break; doesn't look obviously correct. Does that handle -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize or -ftree-loop-vectorize -fno-tree-vectorize properly? Currently at least -ftree-slp-vectorize -fno-tree-vectorize doesn't work. Right -- same is true for -fprofile-use option. FDO
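
The question about option ordering raised for the OPT_ftree_vectorize hunk can be explored with a simplified model. This is only a sketch, assuming that an explicitly given sub-option records itself in an "explicitly set" structure the way opts_set does; struct vect_state, enum vect_option and handle_option are hypothetical names, not GCC's option machinery.

```c
#include <stdbool.h>

/* Hypothetical option identifiers for the three flags under discussion.  */
enum vect_option {
    OPT_TREE_VECTORIZE,
    OPT_TREE_LOOP_VECTORIZE,
    OPT_TREE_SLP_VECTORIZE
};

/* Simplified model of the flag values plus the "explicitly set" record.  */
struct vect_state {
    int loop_vectorize;
    int slp_vectorize;
    bool loop_set;   /* -f[no-]tree-loop-vectorize was given explicitly */
    bool slp_set;    /* -f[no-]tree-slp-vectorize was given explicitly */
};

/* Process one option in command-line order.  VALUE is 1 for -fFOO and
   0 for -fno-FOO.  The umbrella option only fills in sub-flags that were
   not explicitly set, mirroring the quoted hunk.  */
static void handle_option(struct vect_state *st, enum vect_option opt,
                          int value)
{
    switch (opt) {
    case OPT_TREE_LOOP_VECTORIZE:
        st->loop_vectorize = value;
        st->loop_set = true;
        break;
    case OPT_TREE_SLP_VECTORIZE:
        st->slp_vectorize = value;
        st->slp_set = true;
        break;
    case OPT_TREE_VECTORIZE:
        if (!st->loop_set)
            st->loop_vectorize = value;
        if (!st->slp_set)
            st->slp_vectorize = value;
        break;
    }
}
```

Under this model an explicit sub-option wins regardless of position, so -ftree-loop-vectorize -fno-tree-vectorize leaves loop vectorization on. Whether that is the desired semantics is exactly what the thread is debating.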
Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS
On Sep 16, 2013, at 8:41 PM, DJ Delorie d...@redhat.com wrote: m32c's PSImode is 24-bits, why does it have 32 in the macro? /* 24-bit pointers, in 32-bit units */ -PARTIAL_INT_MODE (SI); +PARTIAL_INT_MODE_NAME (SI, 32, PSI); Sorry, fingers copied the wrong number. Thanks for the catch. Index: gcc/config/msp430/msp430-modes.def === --- gcc/config/msp430/msp430-modes.def (revision 202634) +++ gcc/config/msp430/msp430-modes.def (working copy) @@ -1,3 +1,3 @@ /* 20-bit address */ -PARTIAL_INT_MODE (SI); +PARTIAL_INT_MODE_NAME (SI, 20, PSI); Index: gcc/config/bfin/bfin-modes.def === --- gcc/config/bfin/bfin-modes.def (revision 202634) +++ gcc/config/bfin/bfin-modes.def (working copy) @@ -19,7 +19,7 @@ http://www.gnu.org/licenses/. */ /* PDImode for the 40-bit accumulators. */ -PARTIAL_INT_MODE (DI); +PARTIAL_INT_MODE_NAME (DI, 40, PDI); /* Two of those - covering both accumulators for vector multiplications. */ VECTOR_MODE (INT, PDI, 2); Index: gcc/config/m32c/m32c-modes.def === --- gcc/config/m32c/m32c-modes.def (revision 202634) +++ gcc/config/m32c/m32c-modes.def (working copy) @@ -22,7 +22,7 @@ /*INT_MODE (PI, 3);*/ /* 24-bit pointers, in 32-bit units */ -PARTIAL_INT_MODE (SI); +PARTIAL_INT_MODE_NAME (SI, 24, PSI); /* 48-bit MULEX result */ /* INT_MODE (MI, 6); */ Index: gcc/config/rs6000/rs6000-modes.def === --- gcc/config/rs6000/rs6000-modes.def (revision 202634) +++ gcc/config/rs6000/rs6000-modes.def (working copy) @@ -45,4 +45,4 @@ VECTOR_MODES (FLOAT, 32); /* V /* Replacement for TImode that only is allowed in GPRs. We also use PTImode for quad memory atomic operations to force getting an even/odd register combination. */ -PARTIAL_INT_MODE (TI); +PARTIAL_INT_MODE_NAME (TI, 128, PTI); Index: gcc/config/sh/sh-modes.def === --- gcc/config/sh/sh-modes.def (revision 202634) +++ gcc/config/sh/sh-modes.def (working copy) @@ -18,9 +18,9 @@ along with GCC; see the file COPYING3. http://www.gnu.org/licenses/. 
*/ /* The SH uses a partial integer mode to represent the FPSCR register. */ -PARTIAL_INT_MODE (SI); +PARTIAL_INT_MODE_NAME (SI, 32, PSI); /* PDI mode is used to represent a function address in a target register. */ -PARTIAL_INT_MODE (DI); +PARTIAL_INT_MODE_NAME (DI, 64, PDI); /* Vector modes. */ VECTOR_MODE (INT, QI, 2);/* V2QI */ Index: gcc/genmodes.c === --- gcc/genmodes.c (revision 202634) +++ gcc/genmodes.c (working copy) @@ -629,10 +629,14 @@ reset_float_format (const char *name, co m-format = format; } -/* Partial integer modes are specified by relation to a full integer mode. - For now, we do not attempt to narrow down their bit sizes. */ -#define PARTIAL_INT_MODE(M) \ - make_partial_integer_mode (#M, P #M, -1U, __FILE__, __LINE__) +/* Partial integer modes are specified by relation to a full integer + mode. */ +#define PARTIAL_INT_MODE(M,PREC) \ + make_partial_integer_mode (#M, P #PREC #M, PREC, __FILE__, __LINE__) +/* Partial integer modes are specified by relation to a full integer + mode. */ +#define PARTIAL_INT_MODE_NAME(M,PREC,NAME) \ + make_partial_integer_mode (#M, #NAME, PREC, __FILE__, __LINE__) static void ATTRIBUTE_UNUSED make_partial_integer_mode (const char *base, const char *name, unsigned int precision, @@ -669,7 +673,7 @@ make_vector_mode (enum mode_class bclass struct mode_data *v; enum mode_class vclass = vector_class (bclass); struct mode_data *component = find_mode (base); - char namebuf[8]; + char namebuf[16]; if (vclass == MODE_RANDOM) return; @@ -917,7 +921,7 @@ enum machine_mode\n{); end will try to use it for bitfields in structures and the like, which we do not want. Only the target md file should generate BImode widgets. */ - if (first first-precision == 1) + if (first first-precision == 1 c == MODE_INT) first = first-next; if (first last) @@ -1187,7 +1191,7 @@ emit_class_narrowest_mode (void) /* Bleah, all this to get the comment right for MIN_MODE_INT. */ tagged_printf (MIN_%s, mode_class_names[c], modes[c] - ? 
(modes[c]-precision != 1 + ? ((c != MODE_INT || modes[c]-precision != 1) ? modes[c]-name : (modes[c]-next ? modes[c]-next-name Index: gcc/machmode.def === --- gcc/machmode.def
Re: New GCC options for loop vectorization
On Tue, Sep 17, 2013 at 8:45 AM, Jakub Jelinek ja...@redhat.com wrote:
> On Tue, Sep 17, 2013 at 08:37:57AM -0700, Xinliang David Li wrote:
> > > > char a[1];
> > > >
> > > > void foo(int n)
> > > > {
> > > >   int* b = (int*)(a+n);
> > > >   int i = 0;
> > > >   for (; i < 1000; ++i)
> > > >     b[i] = 1;
> > > > }
> > > >
> > > > int main(int argn, char** argv)
> > > > {
> > > >   foo(argn);
> > > > }
> > >
> > > But that's just a bug that should be fixed (looking into it).
> >
> > This kind of code is not uncommon for certain applications (e.g, group
> > varint decoding). Besides, the code like this may be built with
>
> That is irrelevant to the fact that it is invalid.
>
> > -fno-strict-aliasing.
>
> It isn't invalid because of aliasing violations, but because of unaligned
> access without saying that it is unaligned (say accessing through
> aligned(1) type, or packed struct or similar, or doing memcpy).
> On various architectures unaligned accesses don't cause faults, so it
> may appear to work, and even on i?86/x86_64 often appears to work, as
> long as you aren't trying to vectorize code (which doesn't change anything
> on the fact that it is undefined behavior).

ok, undefined behavior it is.

By the way, ICC does loop versioning on the case and therefore has no
problem. Clang/LLVM vectorizes it with neither peeling nor versioning,
and it works fine too. For legacy code like this, GCC is less tolerant.

thanks,

David

>
>         Jakub
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 06:26:52PM +0200, Marek Polacek wrote:
> Well I wonder too ;) I thought it can't be NULL, and tried this
> struct C { C() { __builtin_unreachable (); } };

I was more wondering about stuff like:
int a = (__builtin_unreachable (), 1);
or similar.

        Jakub
Re: New GCC options for loop vectorization
On Tue, Sep 17, 2013 at 08:37:57AM -0700, Xinliang David Li wrote:
> > > char a[1];
> > >
> > > void foo(int n)
> > > {
> > >   int* b = (int*)(a+n);
> > >   int i = 0;
> > >   for (; i < 1000; ++i)
> > >     b[i] = 1;
> > > }
> > >
> > > int main(int argn, char** argv)
> > > {
> > >   foo(argn);
> > > }
> >
> > But that's just a bug that should be fixed (looking into it).
>
> This kind of code is not uncommon for certain applications (e.g, group
> varint decoding). Besides, the code like this may be built with

That is irrelevant to the fact that it is invalid.

> -fno-strict-aliasing.

It isn't invalid because of aliasing violations, but because of unaligned
access without saying that it is unaligned (say accessing through
aligned(1) type, or packed struct or similar, or doing memcpy).
On various architectures unaligned accesses don't cause faults, so it
may appear to work, and even on i?86/x86_64 often appears to work, as
long as you aren't trying to vectorize code (which doesn't change anything
on the fact that it is undefined behavior).

        Jakub
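
Of the alternatives listed in this message, the memcpy route is the one that needs no compiler extension. The following is a minimal sketch of it; fill_unaligned and load_unaligned are hypothetical helper names, and the buffer is assumed to be large enough for the requested number of elements.

```c
#include <stddef.h>
#include <string.h>

/* Store VALUE into COUNT consecutive int-sized slots starting at DST,
   which may have any alignment.  Going through memcpy tells the compiler
   that no alignment may be assumed for the destination.  */
static void fill_unaligned(unsigned char *dst, size_t count, int value)
{
    for (size_t i = 0; i < count; ++i)
        memcpy(dst + i * sizeof value, &value, sizeof value);
}

/* Read back the int stored in slot INDEX of an arbitrarily aligned buffer.  */
static int load_unaligned(const unsigned char *src, size_t index)
{
    int v;
    memcpy(&v, src + index * sizeof v, sizeof v);
    return v;
}
```

A caller can pass buffer + 1 or any other odd offset. Compilers commonly turn these fixed-size memcpy calls into plain loads and stores on targets that allow unaligned access, though that is an optimization and not a guarantee.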
Re: [PATCH] Add no_sanitize_undefined attribute (PR sanitizer/58411)
On Tue, Sep 17, 2013 at 06:34:38PM +0200, Jakub Jelinek wrote:
> On Tue, Sep 17, 2013 at 06:26:52PM +0200, Marek Polacek wrote:
> > Well I wonder too ;) I thought it can't be NULL, and tried this
> > struct C { C() { __builtin_unreachable (); } };
>
> I was more wondering about stuff like:
> int a = (__builtin_unreachable (), 1);
> or similar.

Oh yeah, that would segfault, so I added the check that c_f_d is non-NULL.
Ok now?

2013-09-17  Marek Polacek  pola...@redhat.com

        PR sanitizer/58411
        * doc/extend.texi: Document no_sanitize_undefined attribute.
        * builtins.c (fold_builtin_0): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

c-family/
        * c-common.c (handle_no_sanitize_undefined_attribute): New function.
        Declare it.
        (struct attribute_spec c_common_att): Add no_sanitize_undefined.

cp/
        * typeck.c (cp_build_binary_op): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

c/
        * c-typeck.c (build_binary_op): Don't sanitize function if it has the
        no_sanitize_undefined attribute.

testsuite/
        * c-c++-common/ubsan/attrib-1.c: New test.

--- gcc/c-family/c-common.c.mp2 2013-09-17 15:55:56.417946667 +0200 +++ gcc/c-family/c-common.c 2013-09-17 15:58:55.905513029 +0200 @@ -311,6 +311,8 @@ static tree handle_no_sanitize_address_a int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_undefined_attribute (tree *, tree, tree, int, + bool *); static tree handle_noinline_attribute (tree *, tree, tree, int, bool *); static tree handle_noclone_attribute (tree *, tree, tree, int, bool *); static tree handle_leaf_attribute (tree *, tree, tree, int, bool *); @@ -722,6 +724,9 @@ const struct attribute_spec c_common_att { no_sanitize_address,0, 0, true, false, false, handle_no_sanitize_address_attribute, false }, + { no_sanitize_undefined, 0, 0, true, false, false, + handle_no_sanitize_undefined_attribute, + false }, { warning, 1, 1, true, false, false, handle_error_attribute, false }, { error, 1, 1, true, false, false, @@ -6575,6 +6580,22 @@ handle_no_address_safety_analysis_attrib return NULL_TREE; } +/* Handle a no_sanitize_undefined attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, + bool *no_add_attrs) +{ + if (TREE_CODE (*node) != FUNCTION_DECL) +{ + warning (OPT_Wattributes, %qE attribute ignored, name); + *no_add_attrs = true; +} + + return NULL_TREE; +} + /* Handle a noinline attribute; arguments as in struct attribute_spec.handler. */ --- gcc/doc/extend.texi.mp2 2013-09-17 15:55:44.250907707 +0200 +++ gcc/doc/extend.texi 2013-09-17 16:06:21.439974916 +0200 @@ -2136,6 +2136,7 @@ attributes are currently defined for fun @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline}, @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial}, @code{no_sanitize_address}, @code{no_address_safety_analysis}, +@code{no_sanitize_undefined}, @code{error} and @code{warning}. 
Several other attributes are defined for functions on particular target systems. Other attributes, including @code{section} are @@ -3500,6 +3501,12 @@ The @code{no_address_safety_analysis} is @code{no_sanitize_address} attribute, new code should use @code{no_sanitize_address}. +@item no_sanitize_undefined +@cindex @code{no_sanitize_undefined} function attribute +The @code{no_sanitize_undefined} attribute on functions is used +to inform the compiler that it should not check for undefined behavior +in the function when compiling with the @option{-fsanitize=undefined} option. + @item regparm (@var{number}) @cindex @code{regparm} attribute @cindex functions that are passed arguments in registers on the 386 --- gcc/cp/typeck.c.mp2 2013-09-17 16:10:49.935644344 +0200 +++ gcc/cp/typeck.c 2013-09-17 16:11:20.601743694 +0200 @@ -4887,6 +4887,8 @@ cp_build_binary_op (location_t location, if ((flag_sanitize SANITIZE_UNDEFINED) !processing_template_decl current_function_decl != 0 + !lookup_attribute (no_sanitize_undefined, + DECL_ATTRIBUTES (current_function_decl)) (doing_div_or_mod || doing_shift)) { /* OP0 and/or OP1 might have side-effects. */ --- gcc/c/c-typeck.c.mp22013-09-17 16:09:31.423381687 +0200 +++ gcc/c/c-typeck.c2013-09-17 16:10:00.626476422 +0200 @@ -10498,6 +10498,8 @@ build_binary_op (location_t
Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS
Mike Stump mikest...@comcast.net writes:
> +/* Partial integer modes are specified by relation to a full integer
> +   mode.  */
> +#define PARTIAL_INT_MODE(M,PREC) \
> +  make_partial_integer_mode (#M, "P" #PREC #M, PREC, __FILE__, __LINE__)
> +/* Partial integer modes are specified by relation to a full integer
> +   mode.  */
> +#define PARTIAL_INT_MODE_NAME(M,PREC,NAME) \
> +  make_partial_integer_mode (#M, #NAME, PREC, __FILE__, __LINE__)

Sorry for the bikeshedding, but I think it'd better to have a single macro:

#define PARTIAL_INT_MODE(M, PREC, NAME)

You can easily add an explicit Pnmode if the port happens to want that name.

Thanks,
Richard
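
The single-macro form proposed in this message could look roughly like the sketch below. It only illustrates the preprocessor stringification involved: record_partial_mode and the modes table are hypothetical stand-ins for make_partial_integer_mode and its bookkeeping, and the third example name (P20SI) is made up to show an explicit "Pn" name.

```c
#include <stddef.h>

/* Hypothetical record of one partial integer mode definition.  */
struct partial_mode {
    const char *base;        /* full mode it is derived from, e.g. "SI" */
    const char *name;        /* name of the partial mode, e.g. "PSI" */
    unsigned int precision;  /* number of significant bits */
};

static struct partial_mode modes[8];
static size_t n_modes;

/* Hypothetical stand-in for make_partial_integer_mode: just remember
   the arguments, ignoring definitions beyond the table size.  */
static void record_partial_mode(const char *base, const char *name,
                                unsigned int precision)
{
    if (n_modes < sizeof modes / sizeof modes[0]) {
        modes[n_modes].base = base;
        modes[n_modes].name = name;
        modes[n_modes].precision = precision;
        n_modes++;
    }
}

/* One macro for every port: the name is always spelled out by the caller.  */
#define PARTIAL_INT_MODE(M, PREC, NAME) \
    record_partial_mode(#M, #NAME, PREC)

static void define_example_modes(void)
{
    PARTIAL_INT_MODE(SI, 24, PSI);    /* 24-bit pointers, as in m32c */
    PARTIAL_INT_MODE(DI, 40, PDI);    /* 40-bit accumulators, as in bfin */
    PARTIAL_INT_MODE(SI, 20, P20SI);  /* explicit "Pn" style name */
}
```

Since the name is always passed in, the "P" #PREC #M concatenation from the two-macro version is no longer needed.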
Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
The new test gcc.dg/tree-ssa/slsr-39.c fails in 64 bit mode (see
http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ).
Looking for MEM in the dump returns

  _12 = MEM[(int[50] *)_17];
  MEM[(int[50] *)_20] = _13;

TIA

Dominique
Re: [PATCH] manage dom-walk_data initialization and finalization with constructors and destructors
On 09/17/2013 12:39 PM, Trevor Saunders wrote:
> > I'd like to go ahead and get your patch installed -- do you have a GCC
> > copyright assignment on file with the FSF? Your change is large enough
> > to require one.
>
> Its my understanding that Mozilla has one covering work done by
> employees which would include me.

OK. Corporate blanket assignment works for me.

> sorry about the formatting issues.

No worries. It takes time to get up to speed on all the niggling details.

I'll throw it into a build/regression test cycle, assuming nothing bad
pops out, I'll get it installed.

jeff
Re: [PATCH] RTEMS: Add LEON3/SPARC multilibs
Committed to the head.

Is this too radical to also go on the 4.8 branch? We would need to
discuss it on the RTEMS side but it only impacts us if the multilib
is there for sparc-elf on 4.8.

Thanks Sebastian.

On 8/30/2013 6:58 AM, Daniel Hellstrom wrote:
> Hello Sebastian,
>
> That seems like a good idea.
>
> Thanks,
> Daniel
>
> On 08/29/2013 01:04 PM, Sebastian Huber wrote:
> > Recently support for LEON3 specific instructions were added to GCC.
> > Make this support available for RTEMS.
> >
> > This patch should be committed to GCC 4.9.
> >
> > gcc/ChangeLog
> > 2013-08-29  Sebastian Huber  sebastian.hu...@embedded-brains.de
> >
> >         * config/sparc/t-rtems: Add leon3 multilibs.
> > ---
> >  gcc/config/sparc/t-rtems |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/sparc/t-rtems b/gcc/config/sparc/t-rtems
> > index 63d0217..f1a3d84 100644
> > --- a/gcc/config/sparc/t-rtems
> > +++ b/gcc/config/sparc/t-rtems
> > @@ -17,6 +17,6 @@
> >  # http://www.gnu.org/licenses/.
> >  #
> >
> > -MULTILIB_OPTIONS = msoft-float mcpu=v8
> > -MULTILIB_DIRNAMES = soft v8
> > +MULTILIB_OPTIONS = msoft-float mcpu=v8/mcpu=leon3
> > +MULTILIB_DIRNAMES = soft v8 leon3
> >  MULTILIB_MATCHES = msoft-float=mno-fpu

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
[v3] More noexcept for lists
Hello,

after vectors, lists. I didn't touch the throw we were discussing earlier
today for now. There will be an inconsistency with debug list iterators
because they use a general wrapper:
- I would need François to tell if that wrapper is ever used with
  iterators that can throw,
- the same wrapper is used for several containers, so unless we change
  all containers at once it can't stay consistent.

Bootstrap+testsuite ok.

2013-09-18  Marc Glisse  marc.gli...@inria.fr

        PR libstdc++/58338
        * include/bits/list.tcc (_List_base::_M_clear, list::erase): Mark as
        noexcept.
        * include/bits/stl_list.h (_List_iterator) [_List_iterator,
        _M_const_cast, operator*, operator->, operator++, operator--,
        operator==, operator!=]: Likewise.
        (_List_const_iterator) [_List_const_iterator, _M_const_cast, operator*,
        operator->, operator++, operator--, operator==, operator!=]: Likewise.
        (operator==(const _List_iterator, const _List_const_iterator),
        operator!=(const _List_iterator, const _List_const_iterator)): Likewise.
        (_List_impl) [_List_impl(const _Node_alloc_type),
        _List_impl(_Node_alloc_type)]: Likewise.
        (_List_base) [_M_put_node, _List_base(const _Node_alloc_type),
        _List_base(_List_base), _M_clear, _M_init]: Likewise.
        (list) [list(), list(const allocator_type)]: Merge.
        (list) [list(const allocator_type), front, back, pop_front, pop_back,
        erase, _M_erase]: Mark as noexcept.
        * include/debug/list (list) [list(const _Allocator), front, back,
        pop_front, pop_back, _M_erase, erase]: Likewise.
        * include/profile/list (list) [list(const _Allocator), front, back,
        pop_front, pop_back, erase]: Likewise.
        * testsuite/23_containers/list/requirements/dr438/assign_neg.cc:
        Adjust line number.
        * testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc:
        Likewise.
        * testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc:
        Likewise.
        * testsuite/23_containers/list/requirements/dr438/insert_neg.cc:
        Likewise.

-- Marc GlisseIndex: include/bits/list.tcc === --- include/bits/list.tcc (revision 202655) +++ include/bits/list.tcc (working copy) @@ -56,21 +56,21 @@ #ifndef _LIST_TCC #define _LIST_TCC 1 namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_CONTAINER templatetypename _Tp, typename _Alloc void _List_base_Tp, _Alloc:: -_M_clear() +_M_clear() _GLIBCXX_NOEXCEPT { typedef _List_node_Tp _Node; _Node* __cur = static_cast_Node*(_M_impl._M_node._M_next); while (__cur != _M_impl._M_node) { _Node* __tmp = __cur; __cur = static_cast_Node*(__cur-_M_next); #if __cplusplus = 201103L _M_get_Node_allocator().destroy(__tmp); #else @@ -138,21 +138,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER return __it; } return __position._M_const_cast(); } #endif templatetypename _Tp, typename _Alloc typename list_Tp, _Alloc::iterator list_Tp, _Alloc:: #if __cplusplus = 201103L -erase(const_iterator __position) +erase(const_iterator __position) noexcept #else erase(iterator __position) #endif { iterator __ret = iterator(__position._M_node-_M_next); _M_erase(__position._M_const_cast()); return __ret; } #if __cplusplus = 201103L Index: include/bits/stl_list.h === --- include/bits/stl_list.h (revision 202655) +++ include/bits/stl_list.h (working copy) @@ -126,76 +126,76 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER { typedef _List_iterator_Tp_Self; typedef _List_node_Tp_Node; typedef ptrdiff_t difference_type; typedef std::bidirectional_iterator_tagiterator_category; typedef _Tpvalue_type; typedef _Tp* pointer; typedef _Tp reference; - _List_iterator() + _List_iterator() _GLIBCXX_NOEXCEPT : _M_node() { } explicit - _List_iterator(__detail::_List_node_base* __x) + _List_iterator(__detail::_List_node_base* __x) _GLIBCXX_NOEXCEPT : _M_node(__x) { } _Self - _M_const_cast() const + _M_const_cast() const _GLIBCXX_NOEXCEPT { return *this; } // Must downcast from _List_node_base to _List_node to get to _M_data. 
reference - operator*() const + operator*() const _GLIBCXX_NOEXCEPT { return static_cast_Node*(_M_node)-_M_data; } pointer - operator-() const + operator-() const _GLIBCXX_NOEXCEPT { return
patch to canonize small wide-ints.
Richi,

This patch canonizes the bits above the precision for wide ints with
types or modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away. In particular,
when we decide that we can depend on Richard's patch to clean up rtl
constants, then the only thing that will be left will be the addition of
the TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than
what physicists generally consider: it is ugliness. There is a lot of
truth and beauty in the patch, but in truth there are a lot of places
where the ugliness is just moved someplace else.

In the pushing-the-ugly-around department, trees and wide-ints are not
canonized the same way. I spent several days going down the road where I
tried to have them be the same, but it got very ugly having 32 bit
unsigned int csts have the upper 32 bits set. So now wide_int_to_tree
and the wide-int constructors from tree-cst are more complex.

I think that I am in favor of this patch, especially in conjunction with
Richard's cleanup, but only mildly.

There is also some cleanup where Richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

kenny Index: gcc/emit-rtl.c === --- gcc/emit-rtl.c (revision 202389) +++ gcc/emit-rtl.c (working copy) @@ -579,8 +579,6 @@ immed_wide_int_const (const wide_int v, if (len 2 || prec = HOST_BITS_PER_WIDE_INT) return gen_int_mode (v.elt (0), mode); - wide_int copy = v; - wi::clear_undef (copy, SIGNED); #if TARGET_SUPPORTS_WIDE_INT { unsigned int i; @@ -599,12 +597,12 @@ immed_wide_int_const (const wide_int v, CWI_PUT_NUM_ELEM (value, len); for (i = 0; i len; i++) - CONST_WIDE_INT_ELT (value, i) = copy.elt (i); + CONST_WIDE_INT_ELT (value, i) = v.elt (i); return lookup_const_wide_int (value); } #else - return immed_double_const (copy.elt (0), copy.elt (1), mode); + return immed_double_const (v.elt (0), v.elt (1), mode); #endif } Index: gcc/lto-streamer-in.c === --- gcc/lto-streamer-in.c (revision 202389) +++ gcc/lto-streamer-in.c (working copy) @@ -1273,7 +1273,7 @@ lto_input_tree_1 (struct lto_input_block for (i = 0; i len; i++) a[i] = streamer_read_hwi (ib); result = wide_int_to_tree (type, wide_int::from_array - (a, len, TYPE_PRECISION (type), false)); + (a, len, TYPE_PRECISION (type))); streamer_tree_cache_append (data_in-reader_cache, result, hash); } else if (tag == LTO_tree_scc) Index: gcc/real.c === --- gcc/real.c (revision 202389) +++ gcc/real.c (working copy) @@ -2248,7 +2248,6 @@ real_from_integer (REAL_VALUE_TYPE *r, e /* Clear out top bits so elt will work with precisions that aren't a multiple of HOST_BITS_PER_WIDE_INT. 
*/ val = wide_int::from (val, len, sgn); - wi::clear_undef (val, sgn); len = len / HOST_BITS_PER_WIDE_INT; SET_REAL_EXP (r, len * HOST_BITS_PER_WIDE_INT + e); Index: gcc/rtl.h === --- gcc/rtl.h (revision 202389) +++ gcc/rtl.h (working copy) @@ -1422,6 +1422,7 @@ wi::int_traits rtx_mode_t::get_precisi return GET_MODE_PRECISION (x.second); } +#if 0 inline wi::storage_ref wi::int_traits rtx_mode_t::decompose (HOST_WIDE_INT *, unsigned int precision, @@ -1437,13 +1438,57 @@ wi::int_traits rtx_mode_t::decompose ( return wi::storage_ref (CONST_WIDE_INT_ELT (x.first, 0), CONST_WIDE_INT_NUNITS (x.first), precision); +#if TARGET_SUPPORTS_WIDE_INT != 0 case CONST_DOUBLE: return wi::storage_ref (CONST_DOUBLE_LOW (x.first), 2, precision); +#endif default: gcc_unreachable (); } } +#else +/* For now, assume that the storage is not canonical, i.e. that there + are bits above the precision that are not all zeros or all ones. + If this is fixed in rtl, then we will not need the calls to + force_to_size. */ +inline wi::storage_ref +wi::int_traits rtx_mode_t::decompose (HOST_WIDE_INT *scratch, + unsigned int precision, + const rtx_mode_t x) +{ + int len; + int small_prec = precision (HOST_BITS_PER_WIDE_INT - 1); + + gcc_checking_assert (precision == get_precision (x)); + switch (GET_CODE (x.first)) +{ +case CONST_INT: + len = 1; + if (small_prec) + scratch[0] = sext_hwi (INTVAL (x.first), precision); + else + scratch = INTVAL (x.first); + break; + +case CONST_WIDE_INT: + len = CONST_WIDE_INT_NUNITS (x.first); + scratch = CONST_WIDE_INT_ELT (x.first, 0); + break; + +#if TARGET_SUPPORTS_WIDE_INT == 0 +case CONST_DOUBLE: + len = 2; + scratch = CONST_DOUBLE_LOW (x.first); + break; +#endif + +
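
The sext_hwi call in the decompose code quoted in this message suggests that the canonical form fills the bits above the precision with copies of the sign bit. The sketch below shows that operation on a single 64-bit element. It is an illustration under that assumption and not GCC's sext_hwi: sign_extend_to_precision is a hypothetical name, int64_t stands in for HOST_WIDE_INT, and prec is assumed to be between 1 and 64.

```c
#include <stdint.h>

/* Return SRC with every bit at position PREC and above replaced by a
   copy of bit PREC - 1.  PREC is assumed to be in the range 1..64.
   The arithmetic is done on unsigned values so that no signed shift
   is needed; the final conversion back to int64_t relies on the usual
   two's complement behaviour.  */
static int64_t sign_extend_to_precision(int64_t src, unsigned int prec)
{
    if (prec >= 64)
        return src;

    uint64_t low_mask = (UINT64_C(1) << prec) - 1;    /* bits below PREC */
    uint64_t sign_bit = UINT64_C(1) << (prec - 1);    /* bit PREC - 1 */
    uint64_t low = (uint64_t) src & low_mask;         /* drop stale high bits */

    /* Flipping the sign bit and then subtracting it propagates it upward.  */
    return (int64_t) ((low ^ sign_bit) - sign_bit);
}
```

With prec equal to 8, an element holding 0xff becomes -1 and an element holding 0x17f becomes 127, so two elements that agree in their low 8 bits compare equal after the call.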
[rl78] optimize prologues
Committed.

2013-09-17  Nick Clifton  ni...@redhat.com

        * config/rl78/rl78.c (need_to_save): Change return type to bool.
        For interrupt functions: save all call clobbered registers if the
        interrupt handler is not a leaf function.
        (rl78_expand_prologue): Always recompute the frame information.
        For interrupt functions: only select bank 0 if one of the bank 0
        registers is going to be pushed.

Index: config/rl78/rl78.c
===================================================================
--- config/rl78/rl78.c (revision 202666)
+++ config/rl78/rl78.c (working copy)
@@ -537,40 +537,45 @@ rl78_force_nonfar_3 (rtx *operands, rtx
 static bool
 rl78_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to ATTRIBUTE_UNUSED)
 {
   return true;
 }
 
-/* Returns nonzero if the given register needs to be saved by the
+/* Returns true if the given register needs to be saved by the
    current function.  */
-static int
-need_to_save (int regno)
+static bool
+need_to_save (unsigned int regno)
 {
   if (is_interrupt_func (cfun->decl))
     {
-      if (regno < 8)
-        return 1; /* don't know what devirt will need */
+      /* We don't need to save registers that have
+         been reserved for interrupt handlers.  */
       if (regno > 23)
-        return 0; /* don't need to save interrupt registers */
-      if (crtl->is_leaf)
-        {
-          return df_regs_ever_live_p (regno);
-        }
-      else
-        return 1;
+        return false;
+
+      /* If the handler is a non-leaf function then it may call
+         non-interrupt aware routines which will happily clobber
+         any call_used registers, so we have to preserve them.  */
+      if (!crtl->is_leaf && call_used_regs[regno])
+        return true;
+
+      /* Otherwise we only have to save a register, call_used
+         or not, if it is used by this handler.  */
+      return df_regs_ever_live_p (regno);
     }
+
   if (regno == FRAME_POINTER_REGNUM && frame_pointer_needed)
-    return 1;
+    return true;
   if (fixed_regs[regno])
-    return 0;
+    return false;
   if (crtl->calls_eh_return)
-    return 1;
+    return true;
   if (df_regs_ever_live_p (regno) && !call_used_regs[regno])
-    return 1;
-  return 0;
+    return true;
+  return false;
 }
 
 /* We use this to wrap all emitted insns in the prologue.  */
 static rtx
 F (rtx x)
 {
@@ -1023,20 +1028,26 @@ rl78_expand_prologue (void)
   rtx sp = gen_rtx_REG (HImode, STACK_POINTER_REGNUM);
   int rb = 0;
 
   if (rl78_is_naked_func ())
     return;
 
-  if (!cfun->machine->computed)
-    rl78_compute_frame_info ();
+  /* Always re-compute the frame info - the register usage may have changed.  */
+  rl78_compute_frame_info ();
 
   if (flag_stack_usage_info)
     current_function_static_stack_size = cfun->machine->framesize;
 
   if (is_interrupt_func (cfun->decl) && !TARGET_G10)
-    emit_insn (gen_sel_rb (GEN_INT (0)));
+    for (i = 0; i < 4; i++)
+      if (cfun->machine->need_to_push [i])
+        {
+          /* Select Bank 0 if we are using any registers from Bank 0.  */
+          emit_insn (gen_sel_rb (GEN_INT (0)));
+          break;
+        }
 
   for (i = 0; i < 16; i++)
     if (cfun->machine->need_to_push [i])
       {
         if (TARGET_G10)
           {
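The decision order of the new need_to_save can be restated as a standalone sketch. Everything below is an illustration, not the committed code: `struct reg_query` and `sketch_need_to_save` are hypothetical names, the booleans stand in for state that the real function reads from cfun, crtl and the register tables, and the `regno > 23` comparison assumes that the operator lost from the archived diff was `>`.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the inputs; the real code queries
   cfun, crtl, fixed_regs[], call_used_regs[] and the dataflow info.  */
struct reg_query
{
  unsigned int regno;
  bool in_interrupt_handler;
  bool is_leaf;
  bool call_used;
  bool ever_live;
  bool fixed;
  bool is_needed_frame_pointer;
  bool calls_eh_return;
};

static bool
sketch_need_to_save (const struct reg_query *q)
{
  if (q->in_interrupt_handler)
    {
      /* Registers reserved for interrupt handlers are never saved
         (assumed comparison, see the note above).  */
      if (q->regno > 23)
        return false;

      /* A non-leaf handler may call code that clobbers any
         call-used register, so those must be preserved.  */
      if (!q->is_leaf && q->call_used)
        return true;

      /* Otherwise only registers the handler itself uses.  */
      return q->ever_live;
    }

  if (q->is_needed_frame_pointer)
    return true;
  if (q->fixed)
    return false;
  if (q->calls_eh_return)
    return true;
  return q->ever_live && !q->call_used;
}
```

The visible effect compared with the old logic is that a leaf interrupt handler saves only what it touches, and a non-leaf handler saves the call-used set plus what it touches, rather than everything.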
Re: [PATCH v2 1/6] Convert symtab, cgraph and varpool nodes into a real class hierarchy
On Tue, 2013-09-10 at 15:34 +0200, Jan Hubicka wrote: Thanks for reviewing this, and sorry for the late response (I lost most of last week to illness). Some questions inline below... This patch is the handwritten part of the conversion of these types to C++; it requires the followup patch, which is autogenerated. It converts: struct GTY(()) symtab_node_base to: class GTY((user)) symtab_node_base and converts: struct GTY(()) cgraph_node to: struct GTY((user)) cgraph_node : public symtab_node_base and: struct GTY(()) varpool_node to: class GTY((user)) varpool_node : public symtab_node_base dropping the symtab_node_def union. Since gengtype is unable to cope with inheritance, we have to mark the types with GTY((user)), and handcode the gty field-visiting functions. Given the simple hierarchy, we don't need virtual functions for this. Unfortunately doing so runs into various bugs in gengtype's handling of GTY((user)), so the patch also includes workarounds for these bugs. gengtype walks the graph of the *types* in the code, and produces functions in gtype-desc.[ch] for all types that are reachable from a GTY root. However, it ignores the contents of GTY((user)) types when walking this graph. Hence if you have a subgraph of types that are only reachable via fields in GTY((user)) types, gengtype won't generate helper code for those types. Ideally there would be some way to mark a GTY((user)) type to say which types it references, to avoid having to mark these types as GTY((user)). For now, work around this issue by providing explicit roots of the missing types, of dummy variables (see the bottom of cgraph.c) [..] diff --git a/gcc/cgraph.c b/gcc/cgraph.c index f12bf1b..4b12163 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -2994,4 +2994,222 @@ cgraph_get_body (struct cgraph_node *node) return true; } +/* GTY((user)) hooks for symtab_node_base (and its subclasses). 
+ We could use virtual functions for this, but given the presence of the + type field and the trivial size of the class hierarchy, switches are + perhaps simpler and faster. */ + +void gt_ggc_mx (symtab_node_base *x) +{ + /* Hand-written equivalent of the chain_next/chain_prev machinery, to + avoid deep call-stacks. + + Locate the neighbors of x (within the linked-list) that haven't been + marked yet, so that x through xlimit give a range suitable for marking. + Note that x (on entry) itself has already been marked by the + gtype-desc.c code, so we first try its successor. + */ + symtab_node_base * xlimit = x ? x-next : NULL; + while (ggc_test_and_set_mark (xlimit)) + xlimit = xlimit-next; + if (x != xlimit) +for (;;) + { +symtab_node_base * const xprev = x-previous; +if (xprev == NULL) break; +x = xprev; +(void) ggc_test_and_set_mark (xprev); + } + while (x != xlimit) +{ + /* Code common to all symtab nodes. */ + gt_ggc_m_9tree_node (x-decl); + gt_ggc_mx_symtab_node_base (x-next); Aren't you marking next twice? Once by xlimit walk and one by recursion? Good catch; removed. + gt_ggc_mx_symtab_node_base (x-previous); The comment above also applies to previous, so I've removed this also. + gt_ggc_mx_symtab_node_base (x-next_sharing_asm_name); + gt_ggc_mx_symtab_node_base (x-previous_sharing_asm_name); + gt_ggc_mx_symtab_node_base (x-same_comdat_group); You can skip marking these. They only point within symbol table and not externally. OK; removed. + gt_ggc_m_20vec_ipa_ref_t_va_gc_ (x-ref_list.references); + gt_ggc_m_9tree_node (x-alias_target); + gt_ggc_m_18lto_file_decl_data (x-lto_file_data); + + /* Extra code, per subclass. */ + switch (x-type) Didn't we agreed on the is_a API? There's just one interesting subclass here, so I've converted this to: if (cgraph_node *cgn = dyn_cast cgraph_node * (x)) { eliminating the switch and static_cast. 
+{ +case SYMTAB_FUNCTION: + { +cgraph_node *cgn = static_cast cgraph_node * (x); +gt_ggc_m_11cgraph_edge (cgn-callees); +gt_ggc_m_11cgraph_edge (cgn-callers); +gt_ggc_m_11cgraph_edge (cgn-indirect_calls); +gt_ggc_m_11cgraph_node (cgn-origin); +gt_ggc_m_11cgraph_node (cgn-nested); +gt_ggc_m_11cgraph_node (cgn-next_nested); +gt_ggc_m_11cgraph_node (cgn-next_sibling_clone); +gt_ggc_m_11cgraph_node (cgn-prev_sibling_clone); +gt_ggc_m_11cgraph_node (cgn-clones); +gt_ggc_m_11cgraph_node (cgn-clone_of); Same as here. Sorry, it's not clear to me what you meant by Same as here. here. Do you mean that I can skip marking them because they're
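The hand-written chain walk discussed in this review can be tried out in isolation. The sketch below is only a model of the idea: `struct node`, `test_and_set_mark` and `mark_chain` are hypothetical stand-ins, and the helper is assumed to behave like ggc_test_and_set_mark in returning nonzero only for a non-null node that had not been marked before.

```c
#include <stddef.h>

struct node
{
  struct node *next;
  struct node *previous;
  int marked;   /* set once the collector has seen the node */
  int visited;  /* counts how often the node's fields were walked */
};

/* Assumed semantics: nonzero only for a non-null, previously
   unmarked node, which is marked as a side effect.  */
static int
test_and_set_mark (struct node *n)
{
  if (n == NULL || n->marked)
    return 0;
  n->marked = 1;
  return 1;
}

/* Walk the list iteratively instead of recursing through
   next/previous; X itself is expected to be marked already.  */
static void
mark_chain (struct node *x)
{
  struct node *xlimit = x ? x->next : NULL;

  /* Extend the range forward over unmarked successors.  */
  while (test_and_set_mark (xlimit))
    xlimit = xlimit->next;

  /* Extend the range backward to the head of the list.  */
  if (x != xlimit)
    for (;;)
      {
        struct node *const xprev = x->previous;
        if (xprev == NULL)
          break;
        x = xprev;
        (void) test_and_set_mark (xprev);
      }

  /* Visit every node in [x, xlimit) exactly once.  */
  while (x != xlimit)
    {
      x->visited++;  /* stands in for marking the node's other fields */
      x = x->next;
    }
}
```

Because the loops already cover the whole chain, marking `next` and `previous` again from inside the per-node body would be redundant, which is the point raised in the review.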
[rl78] Add -mallregs
GCC typically avoids using virtual registers $r24 through $r31, as this register bank (bank 3) is reserved for hand-written assembly interrupt handlers. If unneeded for that, this new option lets gcc use those registers also. Committed. * config/rl78/constraints.md (Wcv): Allow up to $r31. * config/rl78/rl78.c (rl78_asm_file_start: Likewise. (rl78_option_override): Likewise, if -mallregs. (is_virtual_register): Likewise. * config/rl78/rl78.h (reg_class): Extend VREGS to $r31. (REGNO_OK_FOR_BASE_P): Likewise. * config/rl78/rl78.opt (-mallregs): New. Index: config/rl78/rl78.h === --- config/rl78/rl78.h (revision 202668) +++ config/rl78/rl78.h (working copy) @@ -262,13 +262,13 @@ enum reg_class { 0x000c, 0x }, /* B and C - index regs. */\ { 0x00ff, 0x }, /* all real registers. */ \ { 0x, 0x0001 }, /* SP */\ { 0x0300, 0x }, /* R8 - HImode */ \ { 0x0c00, 0x }, /* R10 - HImode */ \ { 0xff00, 0x }, /* INT - HImode */ \ - { 0x007fff00, 0x }, /* Virtual registers. */ \ + { 0xff7fff00, 0x }, /* Virtual registers. */ \ { 0xff7f, 0x0002 }, /* General registers. */ \ { 0x0400, 0x0004 }, /* PSW. */ \ { 0xff7f, 0x001f } /* All registers. 
*/ \ } #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true @@ -349,13 +349,13 @@ enum reg_class (IN_RANGE ((REGNO), (MIN), (MAX))\ || (reg_renumber != NULL\ reg_renumber[(REGNO)] = (MIN) \ reg_renumber[(REGNO)] = (MAX))) #ifdef REG_OK_STRICT -#define REGNO_OK_FOR_BASE_P(regno) REGNO_IN_RANGE (regno, 16, 23) +#define REGNO_OK_FOR_BASE_P(regno) REGNO_IN_RANGE (regno, 16, 31) #else #define REGNO_OK_FOR_BASE_P(regno) 1 #endif #define REGNO_OK_FOR_INDEX_P(regno)REGNO_OK_FOR_BASE_P (regno) Index: config/rl78/constraints.md === --- config/rl78/constraints.md (revision 202668) +++ config/rl78/constraints.md (working copy) @@ -260,16 +260,16 @@ es:[AX..HL] for calls (match_test rl78_es_addr (op) satisfies_constraint_Cca (rl78_es_base (op)) || satisfies_constraint_Cca (op)) ) (define_memory_constraint Ccv - [AX..HL,r8-r23] for calls + [AX..HL,r8-r31] for calls (and (match_code mem) (and (match_code reg 0) - (match_test REGNO (XEXP (op, 0)) 24))) + (match_test REGNO (XEXP (op, 0)) 31))) ) (define_memory_constraint Wcv es:[AX..HL,r8-r23] for calls (match_test rl78_es_addr (op) satisfies_constraint_Ccv (rl78_es_base (op)) || satisfies_constraint_Ccv (op)) ) Index: config/rl78/rl78.c === --- config/rl78/rl78.c (revision 202668) +++ config/rl78/rl78.c (working copy) @@ -269,12 +269,13 @@ rl78_asm_file_start (void) else { for (i = 0; i 8; i++) { fprintf (asm_out_file, r%d\t=\t0x%x\n, 8 + i, 0xffef0 + i); fprintf (asm_out_file, r%d\t=\t0x%x\n, 16 + i, 0xffee8 + i); + fprintf (asm_out_file, r%d\t=\t0x%x\n, 24 + i, 0xffee0 + i); } } opt_pass *rl78_devirt_pass = make_pass_rl78_devirt (g); static struct register_pass_info rl78_devirt_info = { @@ -306,12 +307,19 @@ rl78_option_override (void) { flag_omit_frame_pointer = 1; flag_no_function_cse = 1; flag_split_wide_types = 0; init_machine_status = rl78_init_machine_status; + + if (TARGET_ALLREGS) +{ + int i; + for (i=24; i32; i++) + fixed_regs[i] = 0; +} } /* Most registers are 8 bits. 
Some are 16 bits because, for example, gcc doesn't like dealing with $FP as a register pair. This table maps register numbers to size in bytes. */ static const int register_sizes[] = @@ -2212,13 +2220,13 @@ insn_ok_now (rtx insn) /* Returns TRUE if R is a virtual register. */ static bool is_virtual_register (rtx r) { return (GET_CODE (r) == REG REGNO (r) = 8 - REGNO (r) 24); + REGNO (r) 32); } /* In all these alloc routines, we expect the following: the insn pattern is unshared, the insn was previously recognized and failed due to predicates or constraints, and the operand data is in recog_data. */ Index: config/rl78/rl78.opt === --- config/rl78/rl78.opt(revision 202668) +++ config/rl78/rl78.opt(working copy) @@ -39,12 +39,16 @@ Enum(rl78_mul_types) String(none) Value( EnumValue Enum(rl78_mul_types) String(rl78) Value(MUL_RL78) EnumValue Enum(rl78_mul_types)
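To see what the VREGS mask change in rl78.h means in terms of register numbers, the two first-word masks from the hunk can be decoded bit by bit. This is a small sketch under the usual register-class convention that bit N of the first word stands for hard register N; `in_mask` and `count_regs` are hypothetical helpers, not part of the port.

```c
#include <stdint.h>

/* First word of the VREGS class contents before and after the patch.  */
#define VREGS_OLD UINT32_C (0x007fff00)
#define VREGS_NEW UINT32_C (0xff7fff00)

/* Nonzero if register REGNO (0..31) is a member of MASK.  */
static int
in_mask (uint32_t mask, unsigned int regno)
{
  return regno < 32 && ((mask >> regno) & 1u) != 0;
}

/* Number of registers in MASK.  */
static unsigned int
count_regs (uint32_t mask)
{
  unsigned int n = 0;
  for (unsigned int r = 0; r < 32; r++)
    n += in_mask (mask, r);
  return n;
}
```

Decoded this way, the old mask covers bits 8 through 22 and the new one additionally sets bits 24 through 31; bit 23 is clear in both values.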
libgo patch committed: Fix reflect bug in method calls
This patch to libgo fixes a bug when calling a method when the reflect.Value object holds a pointer to the actual value.  The code was calling iword which tests v.kind, but for a method value that is always Func.  This fixes the code to implement iword directly using v.typ.Kind().

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline and 4.8 branch.

I added a test to the reflect testsuite in the master sources, and it will be imported into the gccgo sources in due course.

Ian

Index: libgo/go/reflect/value.go
===================================================================
--- libgo/go/reflect/value.go (revision 202233)
+++ libgo/go/reflect/value.go (working copy)
@@ -611,7 +611,13 @@ func methodReceiver(op string, v Value,
 		}
 		fn = unsafe.Pointer(m.tfn)
 		t = m.mtyp
-		rcvr = v.iword()
+		// Can't call iword here, because it checks v.kind,
+		// and that is always Func.
+		if v.flag&flagIndir != 0 && (v.typ.Kind() == Ptr || v.typ.Kind() == UnsafePointer) {
+			rcvr = loadIword(v.val, v.typ.size)
+		} else {
+			rcvr = iword(v.val)
+		}
 	}
 	return
 }
[PATCH]Fix missed propagation opportunity in DOM
This is a repost with fixes to avoid the phase-ordering problem exposed by 58387 and 58340.  I've included the testcase for 58387.

--

I recently noticed that we were failing to propagate edge equivalences into PHI arguments in non-dominated successors.  The case looks like this:

;;   basic block 11, loop depth 0, count 0, freq 160, maybe hot
;;    prev block 10, next block 12, flags: (NEW, REACHABLE)
;;    pred:       10 [50.0%]  (FALSE_VALUE,EXECUTABLE)
  _257 = di_13(D)->comps;
  _258 = (long unsigned int) _255;
  _259 = _258 * 24;
  p_260 = _257 + _259;
  _261 = _255 + 1;
  di_13(D)->next_comp = _261;
  if (p_260 != 0B)
    goto <bb 12>;
  else
    goto <bb 13>;
;;    succ:       12 [100.0%]  (TRUE_VALUE,EXECUTABLE)
;;                13 (FALSE_VALUE,EXECUTABLE)

;;   basic block 12, loop depth 0, count 0, freq 272, maybe hot
;;   Invalid sum of incoming frequencies 160, should be 272
;;    prev block 11, next block 13, flags: (NEW, REACHABLE)
;;    pred:       11 [100.0%]  (TRUE_VALUE,EXECUTABLE)
  p_260->type = 37;
  p_260->u.s_builtin.type = _139;
;;    succ:       13 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 13, loop depth 0, count 0, freq 319, maybe hot
;;   Invalid sum of incoming frequencies 432, should be 319
;;    prev block 12, next block 14, flags: (NEW, REACHABLE)
;;    pred:       110 [100.0%]  (FALLTHRU)
;;                12 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                11 (FALSE_VALUE,EXECUTABLE)
  # _478 = PHI <0B(110), p_260(12), p_260(11)>
  ret = _478;
  _142 = di_13(D)->expansion;
  _143 = _478->u.s_builtin.type;

In particular note block 11 does *not* dominate block 13.  However, we know that when we traverse the edge 11->13, p_260 will have the value zero, which should be propagated into the PHI node.

After fixing the propagation with the attached patch we have

  _478 = PHI <0B(110), p_260(12), 0B(11)>

I have other code which then discovers the unconditional NULL pointer dereferences when we traverse 110->13 or 11->13 and isolates those paths.  That in turn allows blocks 12 and 13 to be combined, which in turn allows discovery of additional CSE opportunities.
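A rough source-level analogue of the situation, for readers who prefer C to GIMPLE dumps. This is a hypothetical reduction written for this note (`struct comp` and `tag_component` are made-up names), not the code the dump above came from, and whether a given compiler version builds exactly this PHI for it is not claimed.

```c
#include <stddef.h>

struct comp
{
  int type;
};

/* The "else" arm plays the role of edge 11->13: the test p != NULL
   failed there, so the value flowing into RET on that path is known
   to be NULL even though the code still names P.  */
struct comp *
tag_component (struct comp *p)
{
  struct comp *ret;

  if (p != NULL)
    {
      p->type = 37;   /* like block 12 */
      ret = p;
    }
  else
    ret = p;          /* could equally be written as ret = NULL */

  return ret;         /* the join corresponds to the PHI in block 13 */
}
```

Replacing the argument on the failing edge by the constant is what lets later passes see that anything dereferencing the result on that path is dereferencing NULL.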
Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Applied to the trunk. * gcc.c-torture/execute/pr58387.c: New test. * tree-ssa-dom.c (cprop_into_successor_phis): Also propagate edge implied equivalences into successor phis. * tree-ssa-threadupdate.c (phi_args_equal_on_edges): Moved into here from tree-ssa-threadedge.c. (mark_threaded_blocks): When threading through a joiner, if both successors of the joiner's clone reach the same block, verify the PHI arguments are equal. If not, cancel the jump threading request. * tree-ssa-threadedge.c (phi_args_equal_on_edges): Moved into tree-ssa-threadupdate.c (thread_across_edge): Don't check PHI argument equality when threading through joiner block here. diff --git a/gcc/testsuite/gcc.c-torture/execute/pr58387.c b/gcc/testsuite/gcc.c-torture/execute/pr58387.c new file mode 100644 index 000..74c32df --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr58387.c @@ -0,0 +1,11 @@ +extern void abort(void); + +int a = -1; + +int main () +{ + int b = a == 0 ? 0 : -a; + if (b 1) +abort (); + return 0; +} diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index e02a566..bf75135 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1642,6 +1642,28 @@ cprop_into_successor_phis (basic_block bb) if (gsi_end_p (gsi)) continue; + /* We may have an equivalence associated with this edge. While +we can not propagate it into non-dominated blocks, we can +propagate them into PHIs in non-dominated blocks. */ + + /* Push the unwind marker so we can reset the const and copies +table back to its original state after processing this edge. */ + const_and_copies_stack.safe_push (NULL_TREE); + + /* Extract and record any simple NAME = VALUE equivalences. + +Don't bother with [01] = COND equivalences, they're not useful +here. 
*/ + struct edge_info *edge_info = (struct edge_info *) e-aux; + if (edge_info) + { + tree lhs = edge_info-lhs; + tree rhs = edge_info-rhs; + + if (lhs TREE_CODE (lhs) == SSA_NAME) + record_const_or_copy (lhs, rhs); + } + indx = e-dest_idx; for ( ; !gsi_end_p (gsi); gsi_next (gsi)) { @@ -1667,6 +1689,8 @@ cprop_into_successor_phis (basic_block bb) may_propagate_copy (orig_val, new_val)) propagate_value (orig_p, new_val); } + + restore_vars_to_original_value (); } } diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index 42474f1..47db280 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -841,28 +841,6 @@ thread_around_empty_blocks (edge taken_edge, return false; } -/*
Merge from gcc 4.8 branch to gccgo branch
I merged the GCC 4.8 branch to the gccgo branch.

Ian
[C++ Patch] PR 58448
Hi, this ICE is caused by error_mark_node as TREE_TYPE of a TYPE_DECL, which leads to a crash at the beginning of the TYPE_DECL case of tsubst_decl. I tried various approaches - for example turning all error_operand_p (t) == true arguments passes to tsubst into error_mark_nodes also works - but I think I have a weak preference for the solution below, because conceptually matches the section of grokdeclarator beginning with: /* If this is declaring a typedef name, return a TYPE_DECL. */ if (typedef_p decl_context != TYPENAME) which seems rather special in terms of producing such TYPE_DECLs in case of errors (it does that for error recovery reasons, I suppose: just returning error_mark_node leads to worse diagnostic for eg, parse/error32.C). Tested x86_64-linux. Thanks, Paolo. /// /cp 2013-09-17 Paolo Carlini paolo.carl...@oracle.com PR c++/58448 * pt.c (tsubst_decl, [TYPE_DECL]): Check TREE_TYPE (t) for error_mark_node. /testsuite 2013-09-17 Paolo Carlini paolo.carl...@oracle.com PR c++/58448 * g++.dg/template/crash117.C: New. Index: cp/pt.c === --- cp/pt.c (revision 202668) +++ cp/pt.c (working copy) @@ -10741,19 +10741,23 @@ tsubst_decl (tree t, tree args, tsubst_flags_t com tree type = NULL_TREE; bool local_p; - if (TREE_CODE (t) == TYPE_DECL -t == TYPE_MAIN_DECL (TREE_TYPE (t))) + if (TREE_CODE (t) == TYPE_DECL) { - /* If this is the canonical decl, we don't have to - mess with instantiations, and often we can't (for - typename, template type parms and such). Note that - TYPE_NAME is not correct for the above test if - we've copied the type for a typedef. */ - type = tsubst (TREE_TYPE (t), args, complain, in_decl); - if (type == error_mark_node) + if (TREE_TYPE (t) == error_mark_node) RETURN (error_mark_node); - r = TYPE_NAME (type); - break; + else if (t == TYPE_MAIN_DECL (TREE_TYPE (t))) + { + /* If this is the canonical decl, we don't have to + mess with instantiations, and often we can't (for + typename, template type parms and such). 
Note that + TYPE_NAME is not correct for the above test if + we've copied the type for a typedef. */ + type = tsubst (TREE_TYPE (t), args, complain, in_decl); + if (type == error_mark_node) + RETURN (error_mark_node); + r = TYPE_NAME (type); + break; + } } /* Check to see if we already have the specialization we Index: testsuite/g++.dg/template/crash117.C === --- testsuite/g++.dg/template/crash117.C(revision 0) +++ testsuite/g++.dg/template/crash117.C(working copy) @@ -0,0 +1,6 @@ +// PR c++/58448 + +class SmallVector; struct Types4; +template typename, typename, typename, typename struct Types { + typedef Types4::Constructable // { dg-error template|typedef|expected } +} TypesSmallVector, SmallVector, SmallVector, SmallVector:: // { dg-error expected }
[rl78] fix far address optimizations
Track both parts of far addresses so they don't get optimized away. Committed. * config/rl78/constraints.md: For each W* constraint, rename to C* and create a W* constraint that checks for an optional ES: prefix pattern also. * config/rl78/rl78.md (UNS_ES_ADDR): New. (es_addr): New. Used to wrap far addresses. * config/rl78/rl78-protos.h (rl78_es_addr): New. (rl78_es_base): New. * config/rl78/rl78.c (rl78_as_legitimate_address): Accept unspec wrapped far addresses. (rl78_print_operand_1): Unwrap far addresses before processing. (rl78_lo16): Wrap far addresses in unspecs. (rl78_es_addr): New. (rl78_es_base): New. (insn_ok_now): Check for not-yet-wrapped far addresses. (transcode_memory_rtx): Properly re-wrap far addresses. Index: config/rl78/constraints.md === --- config/rl78/constraints.md (revision 202665) +++ config/rl78/constraints.md (working copy) @@ -200,103 +200,155 @@ (define_register_constraint Zint INT_REGS The interrupt registers.) ; All the memory addressing schemes the RL78 supports ; of the form W {register} {bytes of offset} ; or W {register} {register} +; Additionally, the Cxx forms are the same as the Wxx forms, but without +; the ES: override. 
; absolute address -(define_memory_constraint Wab +(define_memory_constraint Cab [addr] (and (match_code mem) (ior (match_test CONSTANT_P (XEXP (op, 0))) (match_test GET_CODE (XEXP (op, 0)) == PLUS GET_CODE (XEXP (XEXP (op, 0), 0)) == SYMBOL_REF)) ) ) +(define_memory_constraint Wab + es:[addr] + (match_test rl78_es_addr (op) satisfies_constraint_Cab (rl78_es_base (op)) + || satisfies_constraint_Cab (op)) + ) -(define_memory_constraint Wbc +(define_memory_constraint Cbc word16[BC] (and (match_code mem) (ior (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) == BC_REG)) (and (match_code plus 0) (and (and (match_code reg 00) (match_test REGNO (XEXP (XEXP (op, 0), 0)) == BC_REG)) (match_test uword_operand (XEXP (XEXP (op, 0), 1), VOIDmode) ) ) +(define_memory_constraint Wbc + es:word16[BC] + (match_test rl78_es_addr (op) satisfies_constraint_Cbc (rl78_es_base (op)) + || satisfies_constraint_Cbc (op)) + ) -(define_memory_constraint Wde +(define_memory_constraint Cde [DE] (and (match_code mem) (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) == DE_REG))) ) +(define_memory_constraint Wde + es:[DE] + (match_test rl78_es_addr (op) satisfies_constraint_Cde (rl78_es_base (op)) + || satisfies_constraint_Cde (op)) + ) -(define_memory_constraint Wca +(define_memory_constraint Cca [AX..HL] for calls (and (match_code mem) (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) = HL_REG))) ) +(define_memory_constraint Wca + es:[AX..HL] for calls + (match_test rl78_es_addr (op) satisfies_constraint_Cca (rl78_es_base (op)) + || satisfies_constraint_Cca (op)) + ) -(define_memory_constraint Wcv +(define_memory_constraint Ccv [AX..HL,r8-r23] for calls (and (match_code mem) (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) 24))) ) +(define_memory_constraint Wcv + es:[AX..HL,r8-r23] for calls + (match_test rl78_es_addr (op) satisfies_constraint_Ccv (rl78_es_base (op)) + || satisfies_constraint_Ccv (op)) + ) -(define_memory_constraint Wd2 +(define_memory_constraint 
Cd2 word16[DE] (and (match_code mem) (ior (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) == DE_REG)) (and (match_code plus 0) (and (and (match_code reg 00) (match_test REGNO (XEXP (XEXP (op, 0), 0)) == DE_REG)) (match_test uword_operand (XEXP (XEXP (op, 0), 1), VOIDmode) ) ) +(define_memory_constraint Wd2 + es:word16[DE] + (match_test rl78_es_addr (op) satisfies_constraint_Cd2 (rl78_es_base (op)) + || satisfies_constraint_Cd2 (op)) + ) -(define_memory_constraint Whl +(define_memory_constraint Chl [HL] (and (match_code mem) (and (match_code reg 0) (match_test REGNO (XEXP (op, 0)) == HL_REG))) ) +(define_memory_constraint Whl + es:[HL] + (match_test rl78_es_addr (op) satisfies_constraint_Chl (rl78_es_base (op)) + || satisfies_constraint_Chl (op)) + ) -(define_memory_constraint Wh1 +(define_memory_constraint Ch1 byte8[HL] (and (match_code mem) (and (match_code plus 0) (and (and (match_code reg 00) (match_test REGNO (XEXP (XEXP (op, 0), 0)) == HL_REG)) (match_test ubyte_operand
[GOOGLE] AutoFDO should honor system paths in the profile
This patch makes AutoFDO honor system paths stored in the profile.

Bootstrapped and passed regression tests.

OK for google-4_8 branch?

Thanks,
Dehao

Index: gcc/auto-profile.c
===================================================================
--- gcc/auto-profile.c (revision 202672)
+++ gcc/auto-profile.c (working copy)
@@ -616,11 +616,11 @@ bool autofdo_module_profile::read ()
     {
       char *name = xstrdup (gcov_read_string ());
       unsigned total_num = 0;
-      unsigned num_array[6];
+      unsigned num_array[7];
       unsigned exported = gcov_read_unsigned ();
       unsigned lang = gcov_read_unsigned ();
       unsigned ggc_memory = gcov_read_unsigned ();
-      for (unsigned j = 0; j < 6; j++)
+      for (unsigned j = 0; j < 7; j++)
         {
           num_array[j] = gcov_read_unsigned ();
           total_num += num_array[j];
@@ -638,9 +638,10 @@ bool autofdo_module_profile::read ()
       module->ggc_memory = ggc_memory;
       module->num_quote_paths = num_array[1];
       module->num_bracket_paths = num_array[2];
-      module->num_cpp_defines = num_array[3];
-      module->num_cpp_includes = num_array[4];
-      module->num_cl_args = num_array[5];
+      module->num_system_paths = num_array[3];
+      module->num_cpp_defines = num_array[4];
+      module->num_cpp_includes = num_array[5];
+      module->num_cl_args = num_array[6];
       module->source_filename = name;
       module->is_primary = strcmp (name, in_fnames[0]) == 0;
       module->flags = module->is_primary ? exported : 1;
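The effect of the patch on the record layout can be shown with a small standalone sketch: seven counts are read instead of six, and everything from index 3 onward shifts by one to make room for the system paths. `struct module_counts` and `decode_counts` are hypothetical names for illustration; the quoted hunk does not show what slot 0 is assigned to, so the sketch only sums it.

```c
#include <stddef.h>

enum { NUM_COUNTS = 7 };  /* was 6 before the patch */

/* Hypothetical subset of the per-module fields touched by the hunk.  */
struct module_counts
{
  unsigned num_quote_paths;
  unsigned num_bracket_paths;
  unsigned num_system_paths;
  unsigned num_cpp_defines;
  unsigned num_cpp_includes;
  unsigned num_cl_args;
};

/* Map the raw counts onto named fields and return their sum.  */
static unsigned
decode_counts (const unsigned raw[NUM_COUNTS], struct module_counts *out)
{
  unsigned total = 0;

  for (size_t j = 0; j < NUM_COUNTS; j++)
    total += raw[j];

  /* raw[0] is not assigned in the quoted hunk; it is only summed here.  */
  out->num_quote_paths = raw[1];
  out->num_bracket_paths = raw[2];
  out->num_system_paths = raw[3];   /* new slot */
  out->num_cpp_defines = raw[4];    /* shifted up from 3 */
  out->num_cpp_includes = raw[5];   /* shifted up from 4 */
  out->num_cl_args = raw[6];        /* shifted up from 5 */
  return total;
}
```

A reader and writer that disagree on the number of counts would misread every field after the bracket paths, which is why the array size, the loop bound and the index mapping change together.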
Re: [GOOGLE] AutoFDO should honor system paths in the profile
ok.

David

On Tue, Sep 17, 2013 at 4:53 PM, Dehao Chen de...@google.com wrote:
> This patch makes AutoFDO honor system paths stored in the profile.
>
> Bootstrapped and passed regression tests.
>
> OK for google-4_8 branch?
>
> Thanks,
> Dehao
[PATCH], PR target/58452, Fix gcc 4.8/trunk linuxpaired breakage
While doing some work on power8, I wanted to make sure that for existing systems, I was generating the same code.  So I built some code and ran it through various -mcpu= options.  When I built a powerpc-linuxpaired compiler, the compiler had trouble with a simple loop that should be vectorized.

I traced the code to changes in the vectorizer that required the predicates for movmisalign* to accept memory operands.  In the main part of the powerpc compiler, we made this change in April, 2011, but we missed the paired floating point support, since you need to use special configuration options to enable it.

2011-04-01  Andrew Pinski  pins...@gmail.com
            Michael Meissner  meiss...@linux.vnet.ibm.com

        PR target/48262
        * config/rs6000/vector.md (movmisalign<mode>): Allow for memory
        operands, as per the specifications.

        * config/rs6000/altivec.md (vec_extract_evenv4si): Correct modes.
        (vec_extract_evenv4sf): Ditto.
        (vec_extract_evenv8hi): Ditto.
        (vec_extract_evenv16qi): Ditto.
        (vec_extract_oddv4si): Ditto.

I will do the usual bootstrap/make check tomorrow.  Assuming it has no regressions, can I check this patch in to both the 4.8 branch and trunk?

2013-09-17  Michael Meissner  meiss...@linux.vnet.ibm.com

        PR target/58452
        * config/rs6000/paired.md (movmisalignv2sf): Fix to allow memory
        operands.

Index: gcc/config/rs6000/paired.md
===================================================================
--- gcc/config/rs6000/paired.md (revision 202632)
+++ gcc/config/rs6000/paired.md (working copy)
@@ -462,8 +462,8 @@ (define_expand "reduc_splus_v2sf"
 })
 
 (define_expand "movmisalignv2sf"
-  [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f")
-        (match_operand:V2SF 1 "gpc_reg_operand" "f"))]
+  [(set (match_operand:V2SF 0 "nonimmediate_operand" "")
+        (match_operand:V2SF 1 "any_operand" ""))]
   "TARGET_PAIRED_FLOAT"
 {
   paired_expand_vector_move (operands);

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Re: Using gen_int_mode instead of GEN_INT minor testsuite fallout on MIPS
On Sep 17, 2013, at 10:24 AM, Mike Stump mikest...@comcast.net wrote:
> On Sep 16, 2013, at 8:41 PM, DJ Delorie d...@redhat.com wrote:
>> m32c's PSImode is 24-bits, why does it have 32 in the macro?
>>
>>  /* 24-bit pointers, in 32-bit units */
>> -PARTIAL_INT_MODE (SI);
>> +PARTIAL_INT_MODE_NAME (SI, 32, PSI);
>
> Sorry, fingers copied the wrong number.  Thanks for the catch.
>
> partial-1.diffs.txt

p7 bootstrap test complete:

New tests that PASS:

gcc.dg/simulate-thread/atomic-other-short.c  -O3 -g  thread simulation test

It seems someone doesn't flush or wait; I don't think my patch actually fixed this.
Re: [go-nuts] Solaris gccgo http.Get error?
On Tue, Sep 17, 2013 at 12:28 PM, ernie.hers...@10gen.com wrote:
> If you don't mind explaining, can you tell me why you didn't apply the
> change to the 4.7 branch?

I'm not maintaining Go on the 4.7 branch.  I don't object to somebody else doing it, I'm just not doing it myself.  My time is limited and I have to draw the line somewhere.

Ian

> On Friday, August 9, 2013 4:53:30 PM UTC-4, Ian Lance Taylor wrote:
>> On Thu, Aug 8, 2013 at 11:22 PM, Jakob Borg ja...@nym.se wrote:
>>> But, adding a
>>>
>>>   hints.ai_socktype = SOCK_STREAM;
>>>
>>> gives me
>>>
>>>   jb@zlogin2:~ $ ./test
>>>   canonical name: www.google.com
>>>   26 2 6
>>>   26 2 6
>>>   26 2 6
>>>   26 2 6
>>>   26 2 6
>>>   26 2 6
>>>
>>> It seems we might need a tweak to support Solaris... :/
>>
>> Looks like it.  I committed a patch to the master repository.  This
>> patch copies it over to gccgo.  Bootstrapped and ran Go testsuite on
>> x86_64-unknown-linux-gnu.  Committed to mainline and 4.8 branch.
>>
>> Note that I have not made the change on the 4.7 branch which is what
>> you are using.  The same patch should work for the 4.7 sources,
>> though, if you want to copy it over.
>>
>> Ian
[rl78] add bit test/branch insns
A few new patterns. Committed. 2013-09-17 Nick Clifton ni...@redhat.com * config/rl78/rl78-real.md (bf): New pattern. (bt): New pattern. * config/rl78/rl78.c (rl78_print_operand_1): Handle %B. (rl78_print_operand): Do not put a # before a %B. * config/rl78/rl78.opt: Tweak doc strings. Index: config/rl78/rl78-real.md === --- config/rl78/rl78-real.md(revision 202675) +++ config/rl78/rl78-real.md(working copy) @@ -456,6 +456,61 @@ (set (reg:HI AX_REG) (match_dup 0))] [(set (match_dup 0) (reg:HI AX_REG))] ) +;; Bit test and branch insns. + +;; NOTE: These patterns will work for bits in other places, not just A. + +(define_insn bf + [(set (pc) + (if_then_else (eq (and (reg:QI A_REG) + (match_operand 0 immediate_operand n)) + (const_int 0)) + (label_ref (match_operand 1 )) + (pc)))] + + bf\tA.%B0, $%1 +) + +(define_insn bt + [(set (pc) + (if_then_else (ne (and (reg:QI A_REG) + (match_operand 0 immediate_operand n)) + (const_int 0)) + (label_ref (match_operand 1 )) + (pc)))] + + bt\tA.%B0, $%1 +) + +;; NOTE: These peepholes are fragile. They rely upon GCC generating +;; a specific sequence on insns, based upon examination of test code. +;; Improvements to GCC or using code other than the test code can result +;; in the peephole not matching and the optimization being missed. 
+ +(define_peephole2 + [(set (match_operand:QI 1 register_operand) (reg:QI A_REG)) + (set (match_dup 1) (and:QI (match_dup 1) (match_operand 2 immediate_operand))) + (set (pc) (if_then_else (eq (match_dup 1) (const_int 0)) + (label_ref (match_operand 3 )) + (pc)))] + peep2_regno_dead_p (3, REGNO (operands[1])) +exact_log2 (INTVAL (operands[2])) = 0 + [(set (pc) (if_then_else (eq (and (reg:QI A_REG) (match_dup 2)) (const_int 0)) + (label_ref (match_dup 3)) (pc)))] + ) + +(define_peephole2 + [(set (match_operand:QI 1 register_operand) (reg:QI A_REG)) + (set (match_dup 1) (and:QI (match_dup 1) (match_operand 2 immediate_operand))) + (set (pc) (if_then_else (ne (match_dup 1) (const_int 0)) + (label_ref (match_operand 3 )) + (pc)))] + peep2_regno_dead_p (3, REGNO (operands[1])) +exact_log2 (INTVAL (operands[2])) = 0 + [(set (pc) (if_then_else (ne (and (reg:QI A_REG) (match_dup 2)) (const_int 0)) + (label_ref (match_dup 3)) (pc)))] + ) + Index: config/rl78/rl78.c === --- config/rl78/rl78.c (revision 202675) +++ config/rl78/rl78.c (working copy) @@ -1283,12 +1283,13 @@ rl78_function_arg_boundary (enum machine m - minus - negative of CONST_INT value. c - inverse of a conditional (NE vs EQ for example) z - collapsed conditional s - shift count mod 8 S - shift count mod 16 r - reverse shift count (8-(count mod 8)) + B - bit position h - bottom HI of an SI H - top HI of an SI q - bottom QI of an HI Q - top QI of an HI e - third QI of an SI (i.e. 
where the ES register gets values from) @@ -1409,12 +1410,14 @@ rl78_print_operand_1 (FILE * file, rtx o else if (letter == 'q') fprintf (file, %ld, INTVAL (op) 0xff); else if (letter == 'h') fprintf (file, %ld, INTVAL (op) 0x); else if (letter == 'e') fprintf (file, %ld, (INTVAL (op) 16) 0xff); + else if (letter == 'B') + fprintf (file, %d, exact_log2 (INTVAL (op))); else if (letter == 'E') fprintf (file, %ld, (INTVAL (op) 24) 0xff); else if (letter == 'm') fprintf (file, %ld, - INTVAL (op)); else if (letter == 's') fprintf (file, %ld, INTVAL (op) % 8); @@ -1602,13 +1605,13 @@ rl78_print_operand_1 (FILE * file, rtx o #undef TARGET_PRINT_OPERAND #define TARGET_PRINT_OPERAND rl78_print_operand static void rl78_print_operand (FILE * file, rtx op, int letter) { - if (CONSTANT_P (op) letter != 'u' letter != 's' letter != 'r' letter != 'S') + if (CONSTANT_P (op) letter != 'u' letter != 's' letter != 'r' letter != 'S' letter != 'B') fprintf (file, #); rl78_print_operand_1 (file, op, letter); } #undef TARGET_TRAMPOLINE_INIT #define TARGET_TRAMPOLINE_INIT rl78_trampoline_init Index: config/rl78/rl78.opt === --- config/rl78/rl78.opt(revision 202675) +++ config/rl78/rl78.opt(working copy) @@ -20,13 +20,13 @@ ;--- HeaderInclude config/rl78/rl78-opts.h msim -Target +Target Report Use the simulator
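The new %B operand modifier prints the bit position of a single-bit immediate, and the peepholes only fire when the AND mask is a power of two. A minimal sketch of that computation, assuming the comparison lost from the archived peephole condition was `>= 0`; `bit_position` is a hypothetical stand-in that mirrors the documented behaviour of GCC's exact_log2 (index of the set bit, or -1 if the value is not a power of two).

```c
/* Return the index of the single set bit in MASK, or -1 if MASK is
   zero or has more than one bit set.  Hypothetical stand-in for
   exact_log2, written for illustration.  */
static int
bit_position (unsigned int mask)
{
  int pos = 0;

  if (mask == 0 || (mask & (mask - 1)) != 0)
    return -1;

  while ((mask >>= 1) != 0)
    pos++;

  return pos;
}
```

For example, a test of `A & 0x10` maps to bit 4, which is the operand that ends up after the dot in `bt A.4, $label` or `bf A.4, $label`; a mask such as 0x06 gives -1 and cannot use the bit-test forms.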
RE: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction
-----Original Message-----
From: Dominique Dhumieres [mailto:domi...@lps.ens.fr]
Sent: Wednesday, September 18, 2013 1:47 AM
To: gcc-patches@gcc.gnu.org
Cc: hjl.to...@gmail.com; Bin Cheng
Subject: Re: [PATCH GCC]Catch more MEM_REFs sharing common addressing part in gimple strength reduction

The new test gcc.dg/tree-ssa/slsr-39.c fails in 64 bit mode (see http://gcc.gnu.org/ml/gcc-regression/2013-09/msg00455.html ). Looking for MEM in the dump returns

_12 = MEM[(int[50] *)_17];
MEM[(int[50] *)_20] = _13;

Thanks for reporting, I think this can be fixed by patch:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00761.html

Thanks.
bin
Re: [ping][PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL
Thanks Richard for the review.

On 16/09/13 23:43, Richard Biener wrote:
On Mon, 16 Sep 2013, Kugan wrote:

Hi,

Updated the patch to the latest changes in trunk that splits tree.h. I also noticed an error in printing double_int and fixed it. Is this OK?

   print_gimple_stmt (dump_file, stmt, 0,
-                     TDF_SLIM | (dump_flags & TDF_LINENO));
+                     TDF_SLIM | TDF_RANGE | (dump_flags & TDF_LINENO));

this should be

  (dump_flags & (TDF_LINENO|TDF_RANGE))

do not always dump range info. I'd have simply re-used TDF_ALIAS (and interpret it as SSA annotation info), adding -range in dump file modifiers is ok with me.

+static void
+print_double_int (pretty_printer *buffer, double_int cst)
+{
+  tree node = double_int_to_tree (integer_type_node, cst);
+  if (TREE_INT_CST_HIGH (node) == 0)
+    pp_printf (buffer, HOST_WIDE_INT_PRINT_UNSIGNED, TREE_INT_CST_LOW (node));
+  else if (TREE_INT_CST_HIGH (node) == -1
+           && TREE_INT_CST_LOW (node) != 0)
+    pp_printf (buffer, "-" HOST_WIDE_INT_PRINT_UNSIGNED,
+               -TREE_INT_CST_LOW (node));
+  else
+    sprintf (pp_buffer (buffer)->digit_buffer,
+             HOST_WIDE_INT_PRINT_DOUBLE_HEX,
+             (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (node),
+             (unsigned HOST_WIDE_INT) TREE_INT_CST_LOW (node));

using sprintf here looks like a layering violation to me. You probably want to factor out code from the INTEGER_CST handling of tree-pretty-print.c:dump_generic_node into a pp_double_int function in pretty-print.[ch] instead.

@@ -1628,6 +1647,27 @@ dump_gimple_phi (pretty_printer *buffer, gimple phi, int spc, int flags)
       pp_string (buffer, "# ");
     }
+  if ((flags & TDF_RANGE)
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && SSA_NAME_RANGE_INFO (lhs))
+    {
+      double_int min, max;
+      value_range_type range_type;

I realize the scheme is pre-existing but can you try factoring out the dumping of SSA_NAME_PTR_INFO / SSA_NAME_RANGE_INFO into a separate routine that can be shared by dump_gimple_phi and pp_gimple_stmt_1?

+get_range_info (tree name, double_int &min, double_int &max,
+                enum value_range_type &range_type)
+{
+  gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  range_info_def *ri = SSA_NAME_RANGE_INFO (name);

the TREE_CODE (name) == SSA_NAME assert is redundant with the tree-checking performed by SSA_NAME_RANGE_INFO. Likewise in the other functions.

+void
+get_range_info (tree name, double_int &min, double_int &max,
+                enum value_range_type &range_type)

I'm not sure we want to use references. Well - first time.

+  /* If min > max, it is VR_ANTI_RANGE.  */
+  if (ri->min.scmp (ri->max) == 1)
+    {

I think that's wrong and needs to be conditional on TYPE_UNSIGNED of the SSA name.

+  else if (vr_value[i]->type == VR_ANTI_RANGE)
+    {
+      /* VR_ANTI_RANGE ~[min, max] is encoded compactly as
+         [max + 1, min - 1] without additional attributes.
+         When min value > max value, we know that it is
+         VR_ANTI_RANGE; it is VR_RANGE otherwise.  */
+      set_range_info (name,
+                      tree_to_double_int (vr_value[i]->max)
+                      + double_int_one,
+                      tree_to_double_int (vr_value[i]->min)
+                      - double_int_one);

there is a complication for when max + 1 or min - 1 overflow - those should be non-canonical ranges I think, but double-check this (check set_and_canonicalize_value_range).

I have now added a check for min == 0 for unsigned type. AFAIK, for double_int type, this is the only case we should check. I have also made the other changes you have asked me to do. Please find the modified patch and ChangeLog.

Bootstrapped and regtested for x86_64-unknown-linux-gnu. Is this OK?

Thanks,
Kugan

+2013-09-17  Kugan Vivekanandarajah  kug...@linaro.org
+
+   * gimple-pretty-print.c (dump_ssaname_info) : New function.
+   * gimple-pretty-print.c (dump_gimple_phi) : Dump range info.
+   * (pp_gimple_stmt_1) : Likewise.
+   * tree-pretty-print.c (dump_intger_cst_node) : New function.
+   * (dump_generic_node) : Call dump_intger_cst_node for INTEGER_CST.
+   * tree-ssa-alias.c (dump_alias_info) : Check pointer type.
+   * tree-ssa-copy.c (fini_copy_prop) : Check pointer type and copy
+   range info.
+   * tree-ssanames.c (make_ssa_name_fn) : Check pointer type in
+   initialize.
+   * (set_range_info) : New function.
+   * (get_range_info) : Likewise.
+   * (duplicate_ssa_name_range_info) : Likewise.
+   * (duplicate_ssa_name_fn) : Check pointer type and call correct
+   duplicate function.
+   * tree-vrp.c (vrp_finalize): Call set_range_info to update
+   value range of SSA_NAMEs.
+   * tree.h (SSA_NAME_PTR_INFO) : changed to access via union
+   * tree.h