Re: [tsan] ThreadSanitizer instrumentation part
--fno-default-inline @gol +-fmove-loop-invariants -fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol +-ftsan -ftsan-ignore -fno-default-inline @gol Change -ftsan to -fthread-sanitizer -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol -fno-inline -fno-math-errno -fno-peephole -fno-peephole2 @gol -fno-sched-interblock -fno-sched-spec -fno-signed-zeros @gol @@ -5956,6 +5957,11 @@ appending @file{.dce} to the source file Dump each function after adding mudflap instrumentation. The file name is made by appending @file{.mudflap} to the source file name. + @item sra @opindex fdump-tree-sra Dump each function after performing scalar replacement of aggregates. The @@ -6798,6 +6804,12 @@ instrumentation (and therefore faster ex some protection against outright memory corrupting writes, but allows erroneously read data to propagate within a program. +@item -ftsan -ftsan-ignore +@opindex ftsan +@opindex ftsan-ignore +Add ThreadSanitizer instrumentation. Use @option{-ftsan-ignore} to specify +an ignore file. Refer to http://go/tsan for details. + -ftsan -- -fthread-sanitizer + * + * IT is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 3, or (at your option) any later + * version. See http://www.gnu.org/licenses/ + / You can copy this part of header from asan.c +static int func_calls; + +/* Returns a definition of a runtime functione with type TYP and name NAME. */ s/functione/function/ + +static tree +build_func_decl (tree typ, const char *name) +/* Builds the following decl + void __tsan_read/writeX (void *addr); */ + +static tree +get_memory_access_decl (int is_write, unsigned size) Change is_write to type bool. +{ + tree typ, *decl; + char fname [64]; + static tree cache [2][17]; There is no need to make this function local. define a macro for value 17. + + is_write = !!is_write; No need for this after making is_write bool. + if (size = 1) +size = 1; + else if (size = 3) +size = 2; + else if (size = 7) +size = 4; Missing function comment: /* This function returns the function decl for __tsan_init. */ +static tree +get_init_decl (void) +{ + tree typ; + return decl; +} + +/* Builds the following gimple sequence: + __tsan_read/writeX (EXPR); */ + +static gimple_seq +instr_memory_access (tree expr, int is_write) +{ + tree addr_expr, expr_type, call_expr, fdecl; + gimple_seq gs; + unsigned size; + + gcc_assert (is_gimple_addressable (expr)); + addr_expr = build_addr (unshare_expr (expr), current_function_decl); + expr_type = TREE_TYPE (expr); + while (TREE_CODE (expr_type) == ARRAY_TYPE) +expr_type = TREE_TYPE (expr_type); + size = (TREE_INT_CST_LOW (TYPE_SIZE (expr_type))) / BITS_PER_UNIT; + fdecl = get_memory_access_decl (is_write, size); + call_expr = build_call_expr (fdecl, 1, addr_expr); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; Use gimple creator API: gimple_build_call -- see examples in tree-profile.c. Return the gimple_stmt_iterator for the newly created call stmt. +} + +/* Builds the following gimple sequence: + __tsan_vptr_update (EXPR, RHS); */ + +static gimple_seq +instr_vptr_update (tree expr, tree rhs) +{ + tree expr_ptr, call_expr, fdecl; + gimple_seq gs; + + expr_ptr = build_addr (unshare_expr (expr), current_function_decl); + fdecl = get_vptr_update_decl (); + call_expr = build_call_expr (fdecl, 2, expr_ptr, rhs); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; +} Same as above -- directly use gimple_build_call interface. + +/* Returns gimple seq that needs to be inserted at function entry. */ + +static gimple_seq +instr_func_entry (void) +{ + tree retaddr_decl, pc_addr, fdecl, call_expr; + gimple_seq gs; + + retaddr_decl = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS); + pc_addr = build_call_expr (retaddr_decl, 1, integer_zero_node); + fdecl = get_func_entry_decl (); + call_expr = build_call_expr (fdecl, 1, pc_addr); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; +} Same as above. + +/* Returns gimple seq that needs to be inserted before function exit. */ + +static gimple_seq +instr_func_exit (void) +{ + tree fdecl, call_expr; + gimple_seq gs; + + fdecl = get_func_exit_decl (); + call_expr = build_call_expr (fdecl, 0); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; +} + Same as above. +/* Sets location LOC for all gimples in the SEQ. */ + +static void +set_location (gimple_seq seq, location_t loc) +{ + gimple_seq_node n; + + for (n = gimple_seq_first (seq); n != NULL; n = n-gsbase.next) +gimple_set_location (n, loc); +} Is there a need for this function? The
Re: [tsan] ThreadSanitizer instrumentation part
On Wed, Oct 31, 2012 at 4:10 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! Just a couple of random comments: On Wed, Oct 31, 2012 at 11:34:10AM -0700, Wei Mi wrote: gcc/ChangeLog: 2012-10-31 Wei Mi w...@gmail.com If Dmitry wrote parts of the patch, it would be nice to mention him in the ChangeLog too. * Makefile.in (tsan.o): New * passes.c (init_optimization_passes): Add tsan passes * tree-pass.h (register_pass_info): Ditto * cfghooks.h (GCC_CFGHOOKS_H): Avoid including duplicate headers * doc/invoke.texi: Document tsan related options * toplev.c (compile_file): Add tsan pass in driver * gcc.c (LINK_COMMAND_SPEC): Add -lasan in link command if there -ftsan is on. * tsan.c: New file about tsan * tsan.h: Ditto All ChangeLog entries should end with a dot. --- gcc/cfghooks.h(revision 193016) +++ gcc/cfghooks.h(working copy) @@ -19,6 +19,9 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ +#ifndef GCC_CFGHOOKS_H +#define GCC_CFGHOOKS_H + /* Only basic-block.h includes this. */ struct cfg_hooks @@ -219,3 +222,4 @@ extern void gimple_register_cfg_hooks (v extern struct cfg_hooks get_cfg_hooks (void); extern void set_cfg_hooks (struct cfg_hooks); +#endif /* GCC_CFGHOOKS_H */ Why this? Simply don't include that header in tsan.c, it is already included by basic-block.h. --- gcc/tsan.c(revision 0) +++ gcc/tsan.c(revision 0) @@ -0,0 +1,534 @@ +/* GCC instrumentation plugin for ThreadSanitizer. + * Copyright (c) 2012, Google Inc. All rights reserved. + * Author: Dmitry Vyukov (dvyukov) + * + * IT is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 3, or (at your option) any later + * version. See http://www.gnu.org/licenses/ + */ Can't google just assign the code to FSF, and use a standard boilerplate as everything else in gcc/ ? +/* Builds the following decl + void __tsan_vptr_update (void *vptr, void *val); */ + +static tree +get_vptr_update_decl (void) +{ + tree typ; + static tree decl; + + if (decl != NULL) +return decl; + typ = build_function_type_list (void_type_node, + ptr_type_node, ptr_type_node, NULL_TREE); + decl = build_func_decl (typ, __tsan_vptr_update); + return decl; +} ... Instead of this (but same applies to asan), I think we should just consider putting it into builtins.def (or have sanitizer.def like there is sync.def or omp-builtins.def). The problem might be non-C/C++ family frontends though. This sounds good -- may be better to defer this and combine the effort with asan. David +static gimple_seq +instr_memory_access (tree expr, int is_write) +{ + tree addr_expr, expr_type, call_expr, fdecl; + gimple_seq gs; + unsigned size; + + gcc_assert (is_gimple_addressable (expr)); + addr_expr = build_addr (unshare_expr (expr), current_function_decl); + expr_type = TREE_TYPE (expr); + while (TREE_CODE (expr_type) == ARRAY_TYPE) +expr_type = TREE_TYPE (expr_type); + size = (TREE_INT_CST_LOW (TYPE_SIZE (expr_type))) / BITS_PER_UNIT; int_size_in_bytes. + fdecl = get_memory_access_decl (is_write, size); + call_expr = build_call_expr (fdecl, 1, addr_expr); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; I think it is weird to return gimple_seqs, it would be better to just emit the code at some stmt iterator, and preferrably without building everything as trees, then gimplifying it. +} + +/* Builds the following gimple sequence: + __tsan_vptr_update (EXPR, RHS); */ + +static gimple_seq +instr_vptr_update (tree expr, tree rhs) +{ + tree expr_ptr, call_expr, fdecl; + gimple_seq gs; + + expr_ptr = build_addr (unshare_expr (expr), current_function_decl); + fdecl = get_vptr_update_decl (); + call_expr = build_call_expr (fdecl, 2, expr_ptr, rhs); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; +} + +/* Returns gimple seq that needs to be inserted at function entry. */ + +static gimple_seq +instr_func_entry (void) +{ + tree retaddr_decl, pc_addr, fdecl, call_expr; + gimple_seq gs; + + retaddr_decl = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS); + pc_addr = build_call_expr (retaddr_decl, 1, integer_zero_node); + fdecl = get_func_entry_decl (); + call_expr = build_call_expr (fdecl, 1, pc_addr); + gs = NULL; + force_gimple_operand (call_expr, gs, true, 0); + return gs; +} + +/* Returns gimple seq that needs to be inserted before function exit. */ + +static gimple_seq +instr_func_exit (void) +{ + tree fdecl, call_expr; + gimple_seq gs; + + fdecl = get_func_exit_decl (); + call_expr = build_call_expr
Re: [tsan] ThreadSanitizer instrumentation part
On Wed, Oct 31, 2012 at 11:00:17PM -0700, Xinliang David Li wrote: + /* Below are things we do not instrument + (no possibility of races or not implemented yet). */ + if (/* Compiler-emitted artificial variables. */ + (DECL_P (expr) DECL_ARTIFICIAL (expr)) + /* The var does not live in memory - no possibility of races. */ + || (tcode == VAR_DECL + TREE_ADDRESSABLE (expr) == 0 == !TREE_ADDRESSABLE (expr) + TREE_STATIC (expr) == 0) To detect stack varaible, TREE_STATIC can not be used. Use !DECL_EXTERNAL instead. TREE_STATIC is right for that. Jakub
Committed: Fix sh regression for target/55160
gen_doloop_end_split creates a pattern that sets pc, hence emit_jump_insn has to be used instead of emit_insn. Committed as obvious. 2012-11-01 Joern Rennecke joern.renne...@embecosm.com PR target/55160 * config/sh/sh.md (doloop_end): Use emit_jump_insn. Index: config/sh/sh.md === --- config/sh/sh.md (revision 193029) +++ config/sh/sh.md (working copy) @@ -8479,7 +8479,7 @@ (define_expand doloop_end { if (GET_MODE (operands[0]) != SImode) FAIL; - emit_insn (gen_doloop_end_split (operands[0], operands[4], operands[0])); + emit_jump_insn (gen_doloop_end_split (operands[0], operands[4], operands[0])); DONE; })
Re: [tsan] ThreadSanitizer instrumentation part
On Wed, Oct 31, 2012 at 11:34:10AM -0700, Wei Mi wrote: +static tree +get_init_decl (void) +{ + tree typ; + static tree decl; + + if (decl != NULL) +return decl; + typ = build_function_type_list (void_type_node, NULL_TREE); + decl = build_func_decl (typ, __tsan_init); + return decl; +} The above can crash the compiler btw, as that static tree decl (in many other functions) is not GTY(()) marked (must be file scope for that), thus ggc_collect might free it. Also, please use type instead of typ for variable names. + /* Instrumentation for assignment of a function result + must be inserted after the call. Instrumentation for + reads of function arguments must be inserted before the call. + That's because the call can contain synchronization. */ + if (is_gimple_call (stmt) is_write) +gsi_insert_seq_after (gsi, gs, GSI_NEW_STMT); Inserting stmts after a call may or may not work. E.g. if the call can throw, it must be the last stmt in a basic block, so then the stmts need to be inserted on a successor edge. Similarly noreturn call must be last (but in that case it shouldn't have lhs). + gcode = gimple_code (stmt); + if (gcode == GIMPLE_CALL) is_gimple_call (stmt) +{ + if (gimple_call_fndecl (stmt) != get_init_decl ()) +func_calls++; +} + else if (gcode == GIMPLE_ASSIGN) is_gimple_assign (stmt) +{ + /* Handle assignment lhs as store. */ + lhs = gimple_assign_lhs (stmt); + instrument_expr (gsi, lhs, 1); To find what a store or load is, you can just use the new gimple_store_p (stmt) and gimple_assign_load_p (stmt) predicates, or at least just do gimple_assign_single_p (stmt) to guard instrument_expr calls on both lhs and rhs1. No need to scan all operands, only single rhs assignments can be loads. And as David said, use true/false instead of 1/0. Jakub
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 10:01:31AM +0400, Dmitry Vyukov wrote: For gimplification context, see above, would be better to just emit gimple directly. For func_calls and func_mops, I believe why you need two variables instead of just one, and why the function can't just return a bool whether entry/exit needs to be instrumented or not. It was useful for debugging and stats collection. We currently have something similar in llvm passes as well. E.g. when you add an additional optimization that eliminates some instrumentation, you need to see how much it helps. But you are not using it anywhere but to decide if entry/exit should be instrumented. Either you should put some message into the pass dump if -fdump-tree-tsan{,0}, then you can just use that option and grep, or for one time statistics gathering I think people are usually using one-off hacks, some statistics variables, and fopen (/tmp/somefile, a); fprintf (, ...); them either before exit or upon each change, then gather that across say bootstrap/regtest and build of some interesting packages. But such hacks shouldn't remain in gcc sources. For how many __tsan_* calls have been emitted IMHO the easiest is just to objdump -dr the resulting binary or shared library. Jakub
Re: [PATCH][RFC] Sanity checking for -freorder-blocks-and-partition failures
On 31 October 2012 20:06, Teresa Johnson tejohn...@google.com wrote: On Wed, Oct 31, 2012 at 12:58 PM, Christophe Lyon christophe.l...@st.com wrote: On 30.10.2012 17:59, Teresa Johnson wrote: On Tue, Oct 30, 2012 at 9:26 AM, Steven Bosscher stevenb@gmail.com wrote: Hello, Hot/cold partitioning is apparently a hot topic all of a sudden, which is a good thing of course, because it's in need of some TLC. The attached patch adds another check the RTL cfg checking (verify_flow_info) for the partitioning: A hot block can never be dominated by a cold block (because the dominated block must also be cold). This trips in PR55121. I haven't tested this with any profiling tests, but it's bound to break things. From my POV, whatever gets broken by this patch was already broken to begin with :-) If you're in CC, it's because I hope you can help test this patch. I will try testing your patch on top of mine with our fdo benchmarks. For the others on the cc list, you may need to include my patch as well for testing. Without it, -freorder-blocks-and-partition was DOA for me. For my patch, see http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02692.html Teresa I have tried Steven's patch an indeed it reported numerous errors while compiling spec2k. I tried Teresa's patch too, but it changed nothing in my tests. The patches already posted by Matt are still necessary and Teresa's patch does not improve my tests. With checking enabled I am seeing additional failures that my fixes are not addressing. Looking into those now. Can someone point me to Matt's patches? Is that this one: http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00274.html or are there others? That is one of two. The other one is: http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00275.html I'd be careful with the second one - the original patch posted (in the link) fixes the issue I was seeing, but Stephen Bosscher suggested I made some changes, and I reposted a patch with some incorrect logic (which unfortunately also fixes the issue but more by luck than judgement). I haven't had the time to fully work through updating and testing a reworked patch yet. Thanks, Matt -- Matthew Gretton-Dann Linaro Toolchain Working Group matthew.gretton-d...@linaro.org
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai sing...@google.com wrote: Hi Jakub, My -fopt-info pass filtering patch (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being reviewed and I hope to get this in by Nov. 5 for inclusion in gcc 4.8.0. I just committed -fopt-info pass filtering patch as r193061. Thanks, Sharad Thanks, Sharad On Mon, Oct 29, 2012 at 10:56 AM, Jakub Jelinek ja...@redhat.com wrote: Status == I'd like to close the stage 1 phase of GCC 4.8 development on Monday, November 5th. If you have still patches for new features you'd like to see in GCC 4.8, please post them for review soon. Patches posted before the freeze, but reviewed shortly after the freeze, may still go in, further changes should be just bugfixes and documentation fixes. Quality Data Priority # Change from Last Report --- --- P1 23 + 23 P2 77 + 8 P3 85 + 84 --- --- Total 185 +115 Previous Report === http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html The next report will be sent by me again, announcing end of stage 1.
Re: [PR54693] loss of debug info in jump threading and loop ivopts
On Wed, Oct 31, 2012 at 01:13:28AM -0200, Alexandre Oliva wrote: From: Alexandre Oliva aol...@redhat.com, Jakub Jelinek ja...@redhat.com for gcc/ChangeLog PR debug/54693 * tree-ssa-loop-ivopts.c (remove_unused_ivs): Emit debug temps for dropped IV sets. Ok, thanks. Jakub
Re: [PATCH] Fix PR53501
It was OK until revision 187042: 2012-05-02 Richard Guenther rguent...@suse.de * tree.c (valid_constant_size_p): New function. * tree.h (valid_constant_size_p): Declare. * cfgexpand.c (expand_one_var): Adjust check for too large variables by using valid_constant_size_p. * varasm.c (assemble_variable): Likewise. c/ * c-decl.c (grokdeclarator): Properly check for sizes that cover more than half of the address-space. cp/ * decl.c (grokdeclarator): Properly check for sizes that cover more than half of the address-space. 2012-05-02 Richard Guenther rguent...@suse.de * fold-const.c (div_if_zero_remainder): sizetypes no longer sign-extend. (int_const_binop_1): New worker for int_const_binop with overflowable parameter. Pass it through to force_fit_type_double. (int_const_binop): Wrap around int_const_binop_1 with overflowable equal to one. (size_binop_loc): Call int_const_binop_1 with overflowable equal to minus one, forcing overflow detection for even unsigned types. (extract_muldiv_1): Remove bogus TYPE_IS_SIZETYPE special-casing. (fold_binary_loc): Call try_move_mult_to_index with signed offset. * stor-layout.c (initialize_sizetypes): sizetypes no longer sign-extend. (layout_type): For zero-sized arrays ignore overflow on the size calculations. * tree-ssa-ccp.c (bit_value_unop_1): Likewise. (bit_value_binop_1): Likewise. * tree.c (double_int_to_tree): Likewise. (double_int_fits_to_tree_p): Likewise. (force_fit_type_double): Likewise. (host_integerp): Likewise. (int_fits_type_p): Likewise. * varasm.c (output_constructor_regular_field): Sign-extend the field offset to cater for negative offsets produced by the Ada frontend. * omp-low.c (extract_omp_for_data): Convert the loop step to signed for pointer adjustments. OK, thanks for digging. I'll have a look. -- Eric Botcazou
Re: Committed: Fix sh regression for target/55160
On Thu, 2012-11-01 at 02:31 -0400, Joern Rennecke wrote: gen_doloop_end_split creates a pattern that sets pc, hence emit_jump_insn has to be used instead of emit_insn. Committed as obvious. I'd like to add a test case for this. Attached patch was tested with make -k check-gcc RUNTESTFLAGS=sh.exp=pr55160.c --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} OK to install? Cheers, Oleg testsuite/ChangeLog: PR target/55160 * gcc.target/sh/pr55160.c: New. Index: gcc/testsuite/gcc.target/sh/pr55160.c === --- gcc/testsuite/gcc.target/sh/pr55160.c (revision 0) +++ gcc/testsuite/gcc.target/sh/pr55160.c (revision 0) @@ -0,0 +1,25 @@ +/* Check that the decrement-and-test instruction is generated. */ +/* { dg-do compile { target sh*-*-* } } */ +/* { dg-options -O1 } */ +/* { dg-skip-if { sh*-*-* } { -m5*} { } } */ +/* { dg-final { scan-assembler-times dt\tr 2 } } */ + +int +test_00 (int* x, int c) +{ + int s = 0; + int i; + for (i = 0; i c; ++i) +s += x[i]; + return s; +} + +int +test_01 (int* x, int c) +{ + int s = 0; + int i; + for (i = 0; i c; ++i) +s += *--x; + return s; +}
Re: [PATCH] Reset source location for instructions moved out of its original residing basic block
This patch tries to fix this problem by resetting the source location when moving instructions to another BB. This can greatly improve the debuggability of optimized code. For the attached unittest. Without the patch, the debugger will always jump into line 14 even when the branch at line 13 is not taken. With the patch, the problem is fixed. But this breaks coverage info! You cannot change the source location of instructions randomly like that; the rule should be that, once a location is set, it cannot be changed. At most it can be cleared. gcc/ChangeLog: 2012-10-31 Dehao Chen de...@google.com * emit-rtl.c (reorder_insns): Reset the source location for instructions moved out of its original residing basic block. No, that isn't acceptable. -- Eric Botcazou
Re: Committed: Fix sh regression for target/55160
Quoting Oleg Endo oleg.e...@t-online.de: I'd like to add a test case for this. Attached patch was tested with make -k check-gcc RUNTESTFLAGS=sh.exp=pr55160.c --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} OK to install? That'll be for the SH maintainers to decide. However, I note that dg-skip-if is used with negated logic from what is described in: http://gcc.gnu.org/onlinedocs/gccint/Directives.html Is that a documentation bug?
Re: [PR54693] loss of debug info in jump threading and loop ivopts
On Fri, Oct 26, 2012 at 04:30:41AM -0200, Alexandre Oliva wrote: From: Alexandre Oliva lxol...@fsfla.org for gcc/ChangeLog PR debug/54693 * tree-ssa-threadedge.c (thread_around_empty_block): Copy debug temps from predecessor before threading. As can be seen on test-tgmath2.i, unfortunately many failed attempts to thread can lead into big DEBUG stmt duplication this way. The following patch avoids that by avoiding to copy debug stmts for vars that already have a debug stmt at the beginning of the successor bb. That can happen either from earlier thread_around_empty_block calls, or just happen by accident, or even if the predecessor bb has several debug stmts for the same var like DEBUG i = i_7 i_8 = i_7 + 1 DEBUG i = i_8 i_9 = i_9 + 1 DEBUG i = i_9 We'd needlessly copy over all debug stmts DEBUG i = i_7 DEBUG i = i_8 DEBUG i = i_9 to the succ bb. Even for stmt frontiers it is IMHO undesirable to duplicate (perhaps many times) the whole sequence, as one would then reply the var changing sequence in the debugger perhaps once in the original bb, then once or many times again in the successor bb. The patch unfortunately doesn't speed test-tgmath2.i compilation significantly, but decreases number of debug stmts from ~ 36000 to ~ 16000, the 2+ were clearly redundant. 2012-11-01 Jakub Jelinek ja...@redhat.com PR debug/54402 * tree-ssa-threadedge.c (thread_around_empty_block): Don't copy over debug stmts if the successor bb already starts with a debug stmt for the same var. --- gcc/tree-ssa-threadedge.c.jj2012-10-30 09:01:15.0 +0100 +++ gcc/tree-ssa-threadedge.c 2012-11-01 10:07:29.114499218 +0100 @@ -1,5 +1,5 @@ /* SSA Jump Threading - Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 + Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc. Contributed by Jeff Law l...@redhat.com @@ -641,18 +641,61 @@ thread_around_empty_block (edge taken_ed we don't lose the bindings as we redirect the edges. */ if (MAY_HAVE_DEBUG_STMTS) { + struct pointer_set_t *vars = NULL; + gsi = gsi_after_labels (bb); for (gimple_stmt_iterator si = gsi_last_bb (taken_edge-src); !gsi_end_p (si); gsi_prev (si)) { + tree var; + stmt = gsi_stmt (si); if (!is_gimple_debug (stmt)) continue; + var = NULL_TREE; + if (gimple_debug_bind_p (stmt)) + var = gimple_debug_bind_get_var (stmt); + else if (gimple_debug_source_bind_p (stmt)) + var = gimple_debug_source_bind_get_var (stmt); + + /* Don't insert debug stmts for vars which will be immediately +overridden by debug stmts at the start of bb (either preexisting +or from earlier thread_around_empty_block attempt), without +any intervening real stmts. */ + if (var != NULL_TREE + is_gimple_debug (gsi_stmt (gsi))) + { + if (vars == NULL) + { + gimple_stmt_iterator gsi2; + vars = pointer_set_create (); + for (gsi2 = gsi; !gsi_end_p (gsi2); gsi_next (gsi2)) + { + gimple stmt2 = gsi_stmt (gsi2); + tree var2; + if (!is_gimple_debug (stmt2)) + break; + if (gimple_debug_bind_p (stmt2)) + var2 = gimple_debug_bind_get_var (stmt2); + else if (gimple_debug_source_bind_p (stmt2)) + var2 = gimple_debug_source_bind_get_var (stmt2); + else + continue; + pointer_set_insert (vars, var2); + } + } + if (pointer_set_insert (vars, var)) + continue; + } + stmt = gimple_copy (stmt); /* ??? Should we drop the location of the copy? */ gsi_insert_before (gsi, stmt, GSI_NEW_STMT); } + + if (vars != NULL) + pointer_set_destroy (vars); } /* This block must have more than one successor. */ Jakub
Re: Committed: Fix sh regression for target/55160
On Thu, 2012-11-01 at 05:18 -0400, Joern Rennecke wrote: Quoting Oleg Endo oleg.e...@t-online.de: I'd like to add a test case for this. Attached patch was tested with make -k check-gcc RUNTESTFLAGS=sh.exp=pr55160.c --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} OK to install? That'll be for the SH maintainers to decide. However, I note that dg-skip-if is used with negated logic from what is described in: http://gcc.gnu.org/onlinedocs/gccint/Directives.html Is that a documentation bug? I'm using: /* { dg-skip-if { sh*-*-* } { -m5*} { } } */ Documentation says: { dg-skip-if comment { selector } [{ include-opts } [{ exclude-opts }]] } For example, to skip a test if option -Os is present: /* { dg-skip-if { *-*-* } { -Os } { } } */ Has been working for me as described in the docs so far (there are more SH tests that use dg-skip-if, e.g. to skip SH2A tests when not compiling for SH2A etc). Cheers, Oleg
Re: Committed: Fix sh regression for target/55160
Quoting Oleg Endo oleg.e...@t-online.de: I'm using: /* { dg-skip-if { sh*-*-* } { -m5*} { } } */ Sorry, I mentally mixed up the arguments.
Re: Committed: Fix sh regression for target/55160
Oleg Endo oleg.e...@t-online.de wrote: I'd like to add a test case for this. Attached patch was tested with make -k check-gcc RUNTESTFLAGS=sh.exp=pr55160.c --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} OK to install? OK. Regards, kaz
Re: RFA: hookize ADJUST_INSN_LENGTH
Joern Rennecke joern.renne...@embecosm.com writes: Quoting Richard Sandiford rdsandif...@googlemail.com: But what I'm trying to get at is: why can't the backend tell shorten_branches about the amount of alignment/misalignment that the target wants, and where? Via an attribute or a hook, I don't mind which. But it should be declarative, rather than a part of the shorten_branches algorithm. So, if we want to go with a declarative approach, each instruction can have a plurality of variants, each described by a size, an instruction fetch alignment base, an instruction fetch alignment offset set, a cache line base, a cache line offset set, a cost as a branch destination - cached, a cost as a branch destination - uncached, a fall-through predicted cost, a fall-through mispredicted cost, a branch-taken-predicted cost, a branch-taken-mispredicted cost. Or, if we want to get into more detail, we should actually have a scheduler instruction reservation descriptions instead of singular costs. A variants availability and costs will depend on branch offsets, the presence, kind and size of delay slot instructions and further length-sensitive context (like the set of paths from the function entry, with aggregate instruction fetch sizes and probabilities), so you have to call a target hook each iteration to get the currently available set of variants. Then you have to see this in the context of relative execution frequencies (for cached/uncached branch destinations), and branch probabilities and the reliability of the probability information, and to-be-added annotations about branch predictability. to arrive at a near-optimal solution in reasonable time. Hey, I wasn't suggesting anything like that :-) I was just asking about alignment. It seemed to me that part of the problem is that ARC wants to increase the length of an instruction in order to align a following instruction (which I agree is a useful thing to support). But as things stand, shorten_branches has no way of distinguing that from an instruction that was forced to grow because of offsets being out of range. If instead shorten_branches knows that an instruction prefers a particular alignment/misalignment, it can cooperate with the backend to extend earlier instructions to make that possible. If we do it that way, shorten_branches knows what's going on. You quote lots of complications above, that's also my concern. Directly hooking into ADJUST_INSN_LENGTH restricts us to a very peephole optimisation with no view wider than the current instruction. E.g. for: insn1 insn2 insn3 -- wants to be misaligned by 2 bytes hooking directly into ADJUST_INSN_LENGTH doesn't make it easy to consider using insn1 rather than insn2 to provide the misalignment. ...it looks like arc_reorg already runs shorten_branches in a loop. Is that right? So in a sense we already have iteration involving shorten_branches and the backend. Actually, that is mostly for instruction selection/splitting/combination for compare-and-branch/test-bit-and-branch instructions, which have very stringent offset requirements; these decisions need to be made (mostly; there is code to handle rare out-of-range offsets later, but it makes code quality worse) before delayed branch scheduling and final conditional execution decisions. But the fine-tuning of instruction sizes and alignments happens later in the original branch shortening pass. OK, but the same kind of iteration seems acceptable after delayed-branch scheduling too. I realise we want to cut down the number of iterations where possible, but at the same time, it seems like you're trying to do conditional execution conversion on the fly during shorten_branches, such that the rtl doesn't actually represent the instructions we output. And the performance problem you're seeing is that those behind-the-scenes changes are too confusing for the current shorten_branches code to handle well. E.g. sometimes you decide to delete branches that stay in the rtl stream as branches, and that got a length in a previous iteration. No, not really. Within each branch shortening pass, these decisions are static; they may change when the branches are modified after early branch shortening. They may also change after delay slot scheduling. If we want to get really precise with rtl, I have to alter it in every branch shortening iteration when the decision changes if a compare-and-branch/ test-bit-and branch is within range or not. I was suggesting that once you've committed to something (which I think in your proposal corresponds to the iteration count becoming too high) that commitment should be reflected in the rtl. That also stops the decision from being accidentally reversed. At some point, as you say, you have have to commit to using a particular form (conditionally executed or not) in order to avoid cycles. But that seems like a conditional execution decision
Re: [PATCH, generic] New RTL primitive: `define_subst'
Sounds great, (I think) this is something I've longed to see ever since I saw the iterator machinery (thanks, Richard S!) and wanted to do structural replacements just as easily. Thanks! But... I don't really understand it, so here's some feedback on the documentation: Regarding the language, a definite article is ... You're right, documentation must be improved and it will today/tomorrow after English native-speakers review it. We are looking for more feedbak on what can be understanded :) Thanks, K
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote: On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai sing...@google.com wrote: Hi Jakub, My -fopt-info pass filtering patch (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being reviewed and I hope to get this in by Nov. 5 for inclusion in gcc 4.8.0. I just committed -fopt-info pass filtering patch as r193061. How was that change tested? I'm seeing thousands of new UNRESOLVED failures, of the form: spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s PASS: gcc.target/i386/branch-cost1.c (test for excess errors) gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple if 2 gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html The difference is just your patch and unrelated sh backend change. Jakub
Silence inline_call sanity check for now
Hi, this check I added into inline_call last week has cought among other interesting bugs an an ipa-cp/ipa-prop issue that Martin will fix only later this or next week (tracked as PR55078). I am disabling the sanity check for time being because it also triggers with -O3 bootstrap on some configurations that is not good. Comitted as obvoius. Honza * ipa-inline-transform.c (inline_call): Silence sanity check until ipa-cp issue is fixed. Index: ipa-inline-transform.c === --- ipa-inline-transform.c (revision 192989) +++ ipa-inline-transform.c (working copy) @@ -211,7 +211,8 @@ inline_call (struct cgraph_edge *e, bool struct cgraph_node *callee = cgraph_function_or_thunk_node (e-callee, NULL); bool new_edges_found = false; -#ifdef ENABLE_CHECKING + /* FIXME: re-enable once ipa-cp problem is fixed. */ +#if 0 int estimated_growth = estimate_edge_growth (e); bool predicated = inline_edge_summary (e)-predicate != NULL; #endif @@ -259,7 +260,8 @@ inline_call (struct cgraph_edge *e, bool if (update_overall_summary) inline_update_overall_summary (to); new_size = inline_summary (to)-size; -#ifdef ENABLE_CHECKING + /* FIXME: re-enable once ipa-cp problem is fixed. */ +#if 0 /* Verify that estimated growth match real growth. Allow off-by-one error due to INLINE_SIZE_SCALE roudoff errors. */ gcc_assert (!update_overall_summary || !overall_size
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
richi, I would like you to respond to at least point 1 of this email. In it there is code from the rtl level that was written twice, once for the case when the size of the mode is less than the size of a HWI and once for the case where the size of the mode is less that 2 HWIs. my patch changes this to one instance of the code that works no matter how large the data passed to it is. you have made a specific requirement for wide int to be a template that can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I would like to know how this particular fragment is to be rewritten in this model? It seems that I would have to retain the structure where there is one version of the code for each size that the template is instantiated. I would like to point out that there are about 125 places where we have two copies of the code for some operation. Many of these places are smaller than this, but some are larger. There are also at least several hundred places where the code only was written for the 1 hwi case. These are harder to find with simple greps. I am very concerned about this particular aspect of your comments because it seems to doom us to write the same code over and over again. kenny On 10/31/2012 02:19 PM, Kenneth Zadeck wrote: Jakub, it is hard from all of the threads to actually distill what the real issues are here. So let me start from a clean slate and state them simply. Richi has three primary objections: 1) that we can do all of this with a templated version of double-int. 2) that we should not be passing in a precision and bitsize into the interface. 3) that the interface is too large. I have attached a fragment of my patch #5 to illustrate the main thrust of my patches and to illustrate the usefulness to gcc right now. In the current trunk, we have code that does simplification when the mode fits in an HWI and we have code that does the simplification if the mode fits in two HWIs. if the mode does not fit in two hwi's the code does not do the simplification. Thus here and in a large number of other places we have two copies of the code.Richi wants there to be multiple template instantiations of double-int.This means that we are now going to have to have 3 copies of this code to support oi mode on a 64 bit host and 4 copies on a 32 bit host. Further note that there are not as many cases for the 2*hwi in the code as their are for the hwi case and in general this is true through out the compiler. (CLRSB is missing from the 2hwi case in the patch) We really did not write twice the code when we stated supporting 2 hwi, we added about 1.5 times the code (simplify-rtx is better than most of the rest of the compiler). I am using the rtl level as an example here because i have posted all of those patches, but the tree level is no better. I do not want to write this code a third time and certainly not a fourth time. Just fixing all of this is quite useful now: it fills in a lot of gaps in our transformations and it removes many edge case crashes because ti mode really is lightly tested. However, this patch becomes crucial as the world gets larger. Richi's second point is that we should be doing everything at infinite precision and not passing in an explicit bitsize and precision. That works ok (sans the issues i raised with it in tree-vpn earlier) when the largest precision on the machine fits in a couple of hwis.However, for targets that have large integers or cross compilers, this becomes expensive.The idea behind my set of patches is that for the transformations that can work this way, we do the math in the precision of the type or mode. In general this means that almost all of the math will be done quickly, even on targets that support really big integers. For passes like tree-vrp, the math will be done at some multiple of the largest type seen in the actual program.The amount of the multiple is a function of the optimization, not the target or the host. Currently (on my home computer) the wide-int interface allows the optimization to go 4x the largest mode on the target. I can get rid of this bound at the expense of doing an alloca rather than stack allocating a fixed sized structure.However, given the extremely heavy use of this interface, that does not seem like the best of tradeoffs. The truth is that the vast majority of the compiler actually wants to see the math done the way that it is going to be done on the machine. Tree-vrp and the gimple constant prop do not. But i have made accommodations to handle both needs.I believe that the reason that double-int was never used at the rtl level is that it does not actually do the math in a way that is useful to the target. Richi's third objection is that the interface is too large. I disagree. It was designed based on the actual usage of the interface. When i found places where i was
[patch, fortran] PR 30146, errors for INTENT(OUT) and INTENT(INOUT) for DO loop variables
Hello world, after the dicsussion on c.l.f, it become clear that passing a DO loop variable to an INTENT(OUT) or INTENT(INOUT) dummy argument is an error. The attached patch throws an error for both cases. I chose to issue the errors as a front-end pass because we cannot check for formal arguments during parsing (where the other checks are implemented). Regression-tested. OK for trunk? Thomas 2012-11-01 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/30146 * frontend-passes.c (do_warn): New function. (do_list): New static variable. (do_size): New static variable. (do_list): New static variable. (gfc_run_passes): Call do_warn. (do_code): New function. (do_function): New function. (gfc_code_walker): Keep track fo DO level. 2012-11-01 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/30146 * gfortran.dg/do_check_6.f90: New test. Index: frontend-passes.c === --- frontend-passes.c (Revision 192894) +++ frontend-passes.c (Arbeitskopie) @@ -39,6 +39,7 @@ static bool optimize_trim (gfc_expr *); static bool optimize_lexical_comparison (gfc_expr *); static void optimize_minmaxloc (gfc_expr **); static bool empty_string (gfc_expr *e); +static void do_warn (gfc_namespace *); /* How deep we are inside an argument list. */ @@ -76,12 +77,30 @@ static bool in_omp_workshare; static int iterator_level; -/* Entry point - run all passes for a namespace. So far, only an - optimization pass is run. */ +/* Keep track of DO loop levels. */ +static gfc_code **do_list; +static int do_size, do_level; + +/* Vector of gfc_expr * to keep track of DO loops. */ + +struct my_struct *evec; + +/* Entry point - run all passes for a namespace. */ + void gfc_run_passes (gfc_namespace *ns) { + + /* Warn about dubious DO loops where the index might + change. */ + + do_size = 20; + do_level = 0; + do_list = XNEWVEC(gfc_code *, do_size); + do_warn (ns); + XDELETEVEC (do_list); + if (gfc_option.flag_frontend_optimize) { expr_size = 20; @@ -1225,6 +1244,160 @@ optimize_minmaxloc (gfc_expr **e) mpz_set_ui (a-expr-value.integer, 1); } +/* Callback function for code checking that we do not pass a DO variable to an + INTENT(OUT) or INTENT(INOUT) dummy variable. */ + +static int +do_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + gfc_code *co; + int i; + gfc_formal_arglist *f; + gfc_actual_arglist *a; + + co = *c; + + switch (co-op) +{ +case EXEC_DO: + + /* Grow the temporary storage if necessary. */ + if (do_level = do_size) + { + do_size = 2 * do_size; + do_list = XRESIZEVEC (gfc_code *, do_list, do_size); + } + + /* Mark the DO loop variable if there is one. */ + if (co-ext.iterator co-ext.iterator-var) + do_list[do_level] = co; + else + do_list[do_level] = NULL; + break; + +case EXEC_CALL: + f = co-symtree-n.sym-formal; + + /* Withot a formal arglist, there is only unknown INTENT, + which we don't check for. */ + if (f == NULL) + break; + + a = co-ext.actual; + + while (a f) + { + for (i=0; ido_level; i++) + { + gfc_symbol *do_sym; + + if (do_list[i] == NULL) + break; + + do_sym = do_list[i]-ext.iterator-var-symtree-n.sym; + + if (a-expr a-expr-symtree + a-expr-symtree-n.sym == do_sym) + { + if (f-sym-attr.intent == INTENT_OUT) + gfc_error_now(Variable '%s' at %L set to undefined value + inside loop beginning at %L as INTENT(OUT) + argument to subroutine '%s', do_sym-name, + a-expr-where, do_list[i]-loc, + co-symtree-n.sym-name); + else if (f-sym-attr.intent == INTENT_INOUT) + gfc_error_now(Variable '%s' at %L not definable inside loop + beginning at %L as INTENT(INOUT) argument to + subroutine '%s', do_sym-name, + a-expr-where, do_list[i]-loc, + co-symtree-n.sym-name); + } + } + a = a-next; + f = f-next; + } + break; + +default: + break; +} + return 0; +} + +/* Callback function for functions checking that we do not pass a DO variable + to an INTENt(OUT) or INTENT(INOUT) dummy variable. */ + +static int +do_function (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + gfc_formal_arglist *f; + gfc_actual_arglist *a; + gfc_expr *expr; + int i; + + expr = *e; + if (expr-expr_type != EXPR_FUNCTION) +return 0; + + /* Intrinsic functions don't modify their arguments. */ + + if (expr-value.function.isym) +return 0; + + f = expr-symtree-n.sym-formal; + + /* Withot a formal arglist, there is only unknown INTENT, + which we don't check for. */ + if (f == NULL) +return 0; + + a = expr-value.function.actual; + + while (a f) +{ + for (i=0; ido_level; i++) + { + gfc_symbol *do_sym; +
Re: [patch, fortran] PR 30146, errors for INTENT(OUT) and INTENT(INOUT) for DO loop variables
Hi Thomas, there are already lots of places that check for do variables, gfc_check_do_variable() does the hard work. Couldn't the same result be achieved in a much simpler way during resolution? Cheers, - Tobi On 2012-11-01 13:58, Thomas Koenig wrote: Hello world, after the dicsussion on c.l.f, it become clear that passing a DO loop variable to an INTENT(OUT) or INTENT(INOUT) dummy argument is an error. The attached patch throws an error for both cases. I chose to issue the errors as a front-end pass because we cannot check for formal arguments during parsing (where the other checks are implemented). Regression-tested. OK for trunk? Thomas 2012-11-01 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/30146 * frontend-passes.c (do_warn): New function. (do_list): New static variable. (do_size): New static variable. (do_list): New static variable. (gfc_run_passes): Call do_warn. (do_code): New function. (do_function): New function. (gfc_code_walker): Keep track fo DO level. 2012-11-01 Thomas Koenig tkoe...@gcc.gnu.org PR fortran/30146 * gfortran.dg/do_check_6.f90: New test.
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On Thu, Nov 1, 2012 at 8:28 AM, Jakub Jelinek ja...@redhat.com wrote: How was that change tested? I'm seeing thousands of new UNRESOLVED failures, of the form: spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s PASS: gcc.target/i386/branch-cost1.c (test for excess errors) gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple if 2 gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html The difference is just your patch and unrelated sh backend change. I'm seeing the same failures. Sharad, could you fix them or revert your change? Thanks. Diego.
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
Kenneth Zadeck zad...@naturalbridge.com writes: I would like you to respond to at least point 1 of this email. In it there is code from the rtl level that was written twice, once for the case when the size of the mode is less than the size of a HWI and once for the case where the size of the mode is less that 2 HWIs. my patch changes this to one instance of the code that works no matter how large the data passed to it is. you have made a specific requirement for wide int to be a template that can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I would like to know how this particular fragment is to be rewritten in this model? It seems that I would have to retain the structure where there is one version of the code for each size that the template is instantiated. I think richi's argument was that wide_int should be split into two. There should be a bare-metal class that just has a length and HWIs, and the main wide_int class should be an extension on top of that that does things to a bit precision instead. Presumably with some template magic so that the length (number of HWIs) is a constant for: typedef foo2 double_int; and a variable for wide_int (because in wide_int the length would be the number of significant HWIs rather than the size of the underlying array). wide_int would also record the precision and apply it after the full HWI operation. So the wide_int class would still provide as wide as we need arithmetic, as in your rtl patch. I don't think he was objecting to that. As is probably obvious, I don't agree FWIW. It seems like an unnecessary complication without any clear use. Especially since the number of significant HWIs in a wide_int isn't always going to be the same for both operands to a binary operation, and it's not clear to me whether that should be handled in the base class or wide_int. Richard
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On 11/01/2012 09:10 AM, Richard Sandiford wrote: Kenneth Zadeck zad...@naturalbridge.com writes: I would like you to respond to at least point 1 of this email. In it there is code from the rtl level that was written twice, once for the case when the size of the mode is less than the size of a HWI and once for the case where the size of the mode is less that 2 HWIs. my patch changes this to one instance of the code that works no matter how large the data passed to it is. you have made a specific requirement for wide int to be a template that can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I would like to know how this particular fragment is to be rewritten in this model? It seems that I would have to retain the structure where there is one version of the code for each size that the template is instantiated. I think richi's argument was that wide_int should be split into two. There should be a bare-metal class that just has a length and HWIs, and the main wide_int class should be an extension on top of that that does things to a bit precision instead. Presumably with some template magic so that the length (number of HWIs) is a constant for: typedef foo2 double_int; and a variable for wide_int (because in wide_int the length would be the number of significant HWIs rather than the size of the underlying array). wide_int would also record the precision and apply it after the full HWI operation. So the wide_int class would still provide as wide as we need arithmetic, as in your rtl patch. I don't think he was objecting to that. As is probably obvious, I don't agree FWIW. It seems like an unnecessary complication without any clear use. Especially since the number of significant HWIs in a wide_int isn't always going to be the same for both operands to a binary operation, and it's not clear to me whether that should be handled in the base class or wide_int. Richard There is a certain amount of surprise about all of this on my part.I thought that i was doing such a great thing by looking at the specific port that you are building to determine how to size these data structures.You would think from the response that i am getting that i had murdered some one. do you think that when he gets around to reading the patch for simplify-rtx.c that he is going to object to this frag? @@ -5179,13 +4815,11 @@ static rtx simplify_immed_subreg (enum machine_mode outermode, rtx op, enum machine_mode innermode, unsigned int byte) { - /* We support up to 512-bit values (for V8DFmode). */ enum { -max_bitsize = 512, value_bit = 8, value_mask = (1 value_bit) - 1 }; - unsigned char value[max_bitsize / value_bit]; + unsigned char value [MAX_BITSIZE_MODE_ANY_MODE/value_bit]; int value_start; int i; int elem;
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
anyway richard, it does not answer the question as to what you are going to do with a typedef foo2. the point of all of this work by me was to leave no traces of the host in the way the compiler works. instantiating a specific size of the double-ints is not going to get you there. kenny On 11/01/2012 09:10 AM, Richard Sandiford wrote: Kenneth Zadeck zad...@naturalbridge.com writes: I would like you to respond to at least point 1 of this email. In it there is code from the rtl level that was written twice, once for the case when the size of the mode is less than the size of a HWI and once for the case where the size of the mode is less that 2 HWIs. my patch changes this to one instance of the code that works no matter how large the data passed to it is. you have made a specific requirement for wide int to be a template that can be instantiated in several sizes, one for 1 HWI, one for 2 HWI. I would like to know how this particular fragment is to be rewritten in this model? It seems that I would have to retain the structure where there is one version of the code for each size that the template is instantiated. I think richi's argument was that wide_int should be split into two. There should be a bare-metal class that just has a length and HWIs, and the main wide_int class should be an extension on top of that that does things to a bit precision instead. Presumably with some template magic so that the length (number of HWIs) is a constant for: typedef foo2 double_int; and a variable for wide_int (because in wide_int the length would be the number of significant HWIs rather than the size of the underlying array). wide_int would also record the precision and apply it after the full HWI operation. So the wide_int class would still provide as wide as we need arithmetic, as in your rtl patch. I don't think he was objecting to that. As is probably obvious, I don't agree FWIW. It seems like an unnecessary complication without any clear use. Especially since the number of significant HWIs in a wide_int isn't always going to be the same for both operands to a binary operation, and it's not clear to me whether that should be handled in the base class or wide_int. Richard
Re: Parts 3 and 4 to the vxworks/fixincludes patches
On 10/29/2012 10:07 PM, rbmj wrote: I get a clean build on my end... no stdarg.h issues. Build characteristics are given in the previous email. On 10/29/2012 4:26 PM, rbmj wrote: The build does eventually die in libstdc++-v3, but that's not due to these changes (it gives me an internal compiler error while compiling complex_io.cc inside of complex, see bug 55126) This is fixed by the patch in the comments of bug 54957. With this applied (along with a very minor gthread fix at 54805) I get a good build. This fix is applied in 193052, and now my patches will build on top of trunk. Bruce, can you give steps on how to reproduce the error you reported?
Re: [PATCH] skip gcc.target/i386/pr53249.c on darwin
On Fri, Aug 24, 2012 at 04:13:20PM +0200, Rainer Orth wrote: Jack Howarth howa...@bromo.med.uc.edu writes: Currently the new testcase for gcc.target/i386/pr53249.c is failing on darwin due to the absence of -mx32 support on that target. The following patch skips this testcase on darwin. Tested on x86_64-apple-darwin12... This also fails on Solaris/x86 (cf. PR testsuite/53365) and i686-unknown-linux-gnu. I'd strongly prefer if HJ could devise a real fix instead of just skipping the test on an explicit list of systems. Rainer Rainer, What about using... Index: gcc/testsuite/gcc.target/i386/pr53249.c === --- gcc/testsuite/gcc.target/i386/pr53249.c (revision 193061) +++ gcc/testsuite/gcc.target/i386/pr53249.c (working copy) @@ -1,4 +1,4 @@ -/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-do compile { target { ! { ia32 || llp64 } } } } */ /* { dg-options -O2 -mx32 -ftls-model=initial-exec -maddress-mode=short } */ struct gomp_task This converts the failure at -m64 into an unsupported testcase on x86_64-apple-darwin12. Jack -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch, fortran] PR 30146, errors for INTENT(OUT) and INTENT(INOUT) for DO loop variables
Hi Tobias, Hi Thomas, there are already lots of places that check for do variables, gfc_check_do_variable() does the hard work. gfc_check_do_uses gfc_state_stack, which is built up (and destroyed) during parsing. This is too early because we need to have the formal arglists resolved before we can do the checks. Couldn't the same result be achieved in a much simpler way during resolution? We would have to do the same work, building up a stack of DO loops, during resolution. I don't think there is an advantage in simplicity doing this in resolution; we would also have to make sure that the subroutine calls within the DO loops and the loop variables are resolved before doing the check on the INTENT. Regards Thomas
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
I am really sorry about that. I am looking and will fix the breakage or revert the patch shortly. Thanks, Sharad On Thu, Nov 1, 2012 at 5:28 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 12:52:04AM -0700, Sharad Singhai wrote: On Tue, Oct 30, 2012 at 4:04 PM, Sharad Singhai sing...@google.com wrote: Hi Jakub, My -fopt-info pass filtering patch (http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02704.html) is being reviewed and I hope to get this in by Nov. 5 for inclusion in gcc 4.8.0. I just committed -fopt-info pass filtering patch as r193061. How was that change tested? I'm seeing thousands of new UNRESOLVED failures, of the form: spawn -ignore SIGHUP /usr/src/gcc/obj415/gcc/xgcc -B/usr/src/gcc/obj415/gcc/ /usr/src/gcc/gcc/testsuite/gcc.target/i386/branch-cost1.c -fno-diagnostics-show-caret -O2 -fdump-tree-gimple -mbranch-cost=0 -S -o branch-cost1.s PASS: gcc.target/i386/branch-cost1.c (test for excess errors) gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-times gimple if 2 gcc.target/i386/branch-cost1.c: dump file does not exist UNRESOLVED: gcc.target/i386/branch-cost1.c scan-tree-dump-not gimple See http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00033.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00034.html, compare that to http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00025.html or http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg00026.html The difference is just your patch and unrelated sh backend change. Jakub
Re: [patch] Remove unused ebitmap and unused sbitmap functions.
On 2012-10-31 18:47 , Lawrence Crowl wrote: 2012-10-31 Lawrence Crowl cr...@google.com * ebitmap.h: Remove unused. * ebitmap.c: Remove unused. * Makefile.in: Remove ebitmap.h and ebitmap.c. * sbitmap.h (SBITMAP_SIZE_BYTES): Move to source file. (SET_BIT_WITH_POPCOUNT): Remove unused. (RESET_BIT_WITH_POPCOUNT): Remove unused. (bitmap_copy_n): Remove unused. (bitmap_range_empty_p): Remove unused. (sbitmap_popcount): Remove unused. (sbitmap_verify_popcount): Make private to source file. * sbitmap.c (SBITMAP_SIZE_BYTES): Move here from header. (bitmap_copy_n): Remove unused. (bitmap_range_empty_p): Remove unused. (sbitmap_popcount): Remove unused. (sbitmap_verify_popcount): Make private to source file. Index: gcc/sbitmap.c === --- gcc/sbitmap.c (revision 193049) +++ gcc/sbitmap.c (working copy) @@ -22,6 +22,9 @@ along with GCC; see the file COPYING3. #include coretypes.h #include sbitmap.h +/* Return the set size needed for N elements. */ +#define SBITMAP_SIZE_BYTES(BITMAP) ((BITMAP)-size * sizeof (SBITMAP_ELT_TYPE)) + Not terribly important, but maybe take advantage of the change and convert it into a static inline function? OK, otherwise. Diego.
Re: [patch] Normalize bitmap iteration.
On 2012-10-31 13:43 , Lawrence Crowl wrote: Rename the EXECUTE_IF_SET_IN_SBITMAP macro to EXECUTE_IF_SET_IN_BITMAP. Its implementation is now identical to that in bitmap.h. To prevent redefinition errors, both definitions are now guarded by #ifndef. An alternate strategy is to simply include bitmap.h from sbitmap.h. As this would increase build time, I have elected to use the #ifndef version. I do not have a strong preference here. Me neither. This seems easy enough. static inline void -sbitmap_iter_init (sbitmap_iterator *i, const_sbitmap bmp, unsigned int min) +bmp_iter_set_init (sbitmap_iterator *i, const_sbitmap bmp, + unsigned int min, unsigned *bit_no ATTRIBUTE_UNUSED) So, we'll be changing this again, right? Once we introduce the various iterator types? Index: gcc/bitmap.h === --- gcc/bitmap.h(revision 193006) +++ gcc/bitmap.h(working copy) @@ -682,10 +682,13 @@ bmp_iter_and_compl (bitmap_iterator *bi, should be treated as a read-only variable as it contains loop state. */ +#ifndef EXECUTE_IF_SET_IN_BITMAP +/* See sbitmap.h for the other definition of EXECUTE_IF_SET_IN_BITMAP. */ Ah... if we could only overload macro defintions ;) The patch is OK. Thanks. Diego.
Re: [patch] Normalize more bitmap function names.
On 2012-10-31 13:41 , Lawrence Crowl wrote: This patch normalizes more bitmap function names. sbitmap.h TEST_BIT - bitmap_bit_p SET_BIT - bitmap_set_bit I wonder if it wouldn't it be more consistent if TEST_BIT became bitmap_test_bit() ? I never particularly liked the lispy *_p convention. But I don't really care much one way or the other. The patch is OK. Diego.
Re: [PATCH] Reset source location for instructions moved out of its original residing basic block
On Thu, Nov 1, 2012 at 1:52 AM, Eric Botcazou ebotca...@adacore.com wrote: This patch tries to fix this problem by resetting the source location when moving instructions to another BB. This can greatly improve the debuggability of optimized code. For the attached unittest. Without the patch, the debugger will always jump into line 14 even when the branch at line 13 is not taken. With the patch, the problem is fixed. But this breaks coverage info! You cannot change the source location of For optimized code, there are many optimizations that can break coverage info. Code motion is one of them. This patch actually tries to fix the broken coverage info, as illustrated by the unittest. instructions randomly like that; the rule should be that, once a location is set, it cannot be changed. At most it can be cleared. If we clear the debug info for instructions moved to other BB, is it acceptable? Thanks, Dehao gcc/ChangeLog: 2012-10-31 Dehao Chen de...@google.com * emit-rtl.c (reorder_insns): Reset the source location for instructions moved out of its original residing basic block. No, that isn't acceptable. -- Eric Botcazou
Re: Parts 3 and 4 to the vxworks/fixincludes patches
Hi Robert, On Thu, Nov 1, 2012 at 6:35 AM, rbmj r...@verizon.net wrote: and now my patches will build on top of trunk. Bruce, can you give steps on how to reproduce the error you reported? rm -rf gcc-bld gcc-ins cp -l gcc-svn gcc-bld pfx=$PWD/gcc-ins cd gcc-bld ./configure --enable-languages=c,c++ --prefix=$pfx CFLAGS='-g2 -Wall' make I never build in my source tree.
Re: [PATCH] Reset source location for instructions moved out of its original residing basic block
For optimized code, there are many optimizations that can break coverage info. Code motion is one of them. This patch actually tries to fix the broken coverage info, as illustrated by the unittest. No, you seems to be misunderstanding what coverage means. For coverage to be correct, it is necessary that every insn generated for a particular source construct be associated with this source construct in the debug info. If you associate an insn with another source construct and it is executed, then you'll have the other source construct marked as covered, although it might not be actually covered during the execution. Code motion (as any other optimization passes) doesn't break coverage per se. For example, there is nothing wrong with hoisting a instruction out of a loop and keeping its source location if it is always executed; coverage info will be correct after the hoisting. In more complex cases, it might be necessary to clear the source location of the hoisted instruction. If we clear the debug info for instructions moved to other BB, is it acceptable? Yes, clearing is acceptable in principle, but should be done with extreme care since you drop information, so reorder_insns isn't the appropriate place as it's too big a hammer. FWIW, we have related patchlets in our internal tree, like: * loop-invariant.c (move_invariant_reg): Clear the locator of the invariant's insn after it has been moved. * tree-ssa-loop-im.c (move_computations_stmt): Clear the location and block of the invariant's statement after it has been moved. As for the more general problem of jumpiness in GDB for highly optimized code, this should be changed in GDB if your users cannot deal with it. The debug info describes the relationship between the generated code and the source code and, at high optimization levels, this relationship is not isomorphic at all. It's up to the source level debugger to filter out the non-isomorphic part of the mapping if it deems desirable to do so. -- Eric Botcazou
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
Richard Sandiford rdsandif...@googlemail.com writes: As is probably obvious, I don't agree FWIW. It seems like an unnecessary complication without any clear use. Especially since the number of significant HWIs in a wide_int isn't always going to be the same for both operands to a binary operation, and it's not clear to me whether that should be handled in the base class or wide_int. ...and the number of HWIs in the result might be different again. Whether that's true depends on the value as well as the (HWI) size of the operands. Richard
Lower unsupported VEC_COND_EXPR
Hello, this patch adds support for VEC_COND_EXPR to tree-vect-generic.c. It doesn't try to use vectors of smaller size and jumps straight to elementwise operations, so there is margin for improvements, but it seemed better to have something asap that at least compiles. For the testsuite: the restrictions in vector19.C were useless, they dated from a version that didn't test for errors. The new testcase is for my previous patch, but needed this one. I noticed 2 things while working on this patch: * there is no dead code elimination after the last forwprop pass * constant propagation doesn't really work for vectors bootstrap+testsuite on x86_64-linux, some manual tests with -m32 -march=i486. 2012-11-01 Marc Glisse marc.gli...@inria.fr PR middle-end/55001 gcc/ tree-vect-generic.c (expand_vector_condition): New function. (expand_vector_operations_1): Call it. testsuite/ g++.dg/ext/vector19.C: Remove target restrictions. gcc.dg/fold-compare-7.c: New testcase. -- Marc GlisseIndex: gcc/testsuite/gcc.dg/fold-compare-7.c === --- gcc/testsuite/gcc.dg/fold-compare-7.c (revision 0) +++ gcc/testsuite/gcc.dg/fold-compare-7.c (revision 0) @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +typedef float vecf __attribute__((vector_size(8*sizeof(float; + +long f(vecf *f1, vecf *f2){ + return ((*f1 == *f2) 0)[2]; +} Property changes on: gcc/testsuite/gcc.dg/fold-compare-7.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/testsuite/g++.dg/ext/vector19.C === --- gcc/testsuite/g++.dg/ext/vector19.C (revision 193060) +++ gcc/testsuite/g++.dg/ext/vector19.C (working copy) @@ -1,15 +1,12 @@ -/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ -/* { dg-options -std=c++11 -mavx2 } */ - -// The target restrictions and the -mavx2 flag are meant to disappear -// once vector lowering is in place. +/* { dg-do compile } */ +/* { dg-options -std=c++11 } */ typedef double vec __attribute__((vector_size(2*sizeof(double; typedef signed char vec2 __attribute__((vector_size(16))); typedef unsigned char vec2u __attribute__((vector_size(16))); void f (vec *x, vec *y, vec *z) { *x = (*y *z) ? *x : *y; } Index: gcc/tree-vect-generic.c === --- gcc/tree-vect-generic.c (revision 193060) +++ gcc/tree-vect-generic.c (working copy) @@ -861,20 +861,86 @@ expand_vector_divmod (gimple_stmt_iterat || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) return NULL_TREE; tem = gimplify_build2 (gsi, MULT_EXPR, type, cur_op, op1); op = optab_for_tree_code (MINUS_EXPR, type, optab_default); if (op == unknown_optab || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) return NULL_TREE; return gimplify_build2 (gsi, MINUS_EXPR, type, op0, tem); } +/* Expand a vector condition to scalars, by using many conditions + on the vector's elements. */ +static void +expand_vector_condition (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree type = gimple_expr_type (stmt); + tree a = gimple_assign_rhs1 (stmt); + tree a1 = a; + tree a2; + bool a_is_comparison = false; + tree b = gimple_assign_rhs2 (stmt); + tree c = gimple_assign_rhs3 (stmt); + VEC(constructor_elt,gc) *v; + tree constr; + tree inner_type = TREE_TYPE (type); + tree cond_type = TREE_TYPE (TREE_TYPE (a)); + tree comp_inner_type = cond_type; + tree width = TYPE_SIZE (inner_type); + tree index = bitsize_int (0); + int nunits = TYPE_VECTOR_SUBPARTS (type); + int i; + location_t loc = gimple_location (gsi_stmt (*gsi)); + + if (TREE_CODE (a) != SSA_NAME) +{ + gcc_assert (COMPARISON_CLASS_P (a)); + a_is_comparison = true; + a1 = TREE_OPERAND (a, 0); + a2 = TREE_OPERAND (a, 1); + comp_inner_type = TREE_TYPE (TREE_TYPE (a1)); +} + + if (expand_vec_cond_expr_p (type, TREE_TYPE (a1))) +return; + + /* TODO: try and find a smaller vector type. */ + + warning_at (loc, OPT_Wvector_operation_performance, + vector condition will be expanded piecewise); + + v = VEC_alloc(constructor_elt, gc, nunits); + for (i = 0; i nunits; + i++, index = int_const_binop (PLUS_EXPR, index, width)) +{ + tree aa, result; + tree bb = tree_vec_extract (gsi, inner_type, b, width, index); + tree cc = tree_vec_extract (gsi, inner_type, c, width, index); + if (a_is_comparison) + { + tree aa1 = tree_vec_extract (gsi, comp_inner_type, a1, width, index); + tree aa2 = tree_vec_extract (gsi, comp_inner_type, a2, width, index); + aa = build2 (TREE_CODE (a), cond_type, aa1, aa2); + } + else + aa =
Re: [PATCH] Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
I would like to work on debugging this, but it's hard without test cases... Maybe the files I attached to my PR55121 could help you in this respect? Your sanity checking patching does complain with these input files. Christophe.
Re: [PATCH, generic] Support printing of escaped curly braces and vertical bar in assembler output
If the macro ASSEMBLER_DIALECT is defined, characters '{', '}' and '|' in assembler code (both in asm() and output templates of machine description) are handled as delimiters of alternative assembler syntax variants. This patch enables printing '{', '}' and '|' characters in assembler code by escaping them with %. It`s needed for future extensions of Intel ISA. There are four in-tree target architectures that already use %|. I think it would be better if you made these new escapes target-specific. For the logic to find the end of an alternative, you can simply always skip over the next char after any percent sign (well, check for end of string, of course); there is no need to count percent signs. Segher
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai sing...@google.com wrote: I found the problem and the following patch fixes it. The issue with my testing was that I was only looking at 'FAIL' lines but forgot to tally the 'UNRESOLVED' test cases, the real symptoms of my test problems. In any case, I am rerunning the whole testsuite just to be sure. Assuming tests pass, is it okay to commit the following? Thanks, Sharad 2012-11-01 Sharad Singhai sing...@google.com PR other/55164 * dumpfile.h (struct dump_file_info): Fix order of flags. OK (remember to insert a tab at the start of each ChangeLog line). Diego.
Re: [PATCH] Reset source location for instructions moved out of its original residing basic block
Hi, Eric, I see your point. How about we guard these changes with a flag, say -gless-jumpy, so that people can always choose between better coverage and less jumpy gdb behavior (it's also important for some other clients like AutoFDO). I will have a series of patches to follow soon that can be guarded by this flag. Thanks, Dehao
Re: [PATCH] Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote: Sure, I will give this a try after your verification patch tests complete. Does this mean that the patch you posted above to force_nonfallthru_and_redirect is no longer needed either? I'll see if I can avoid the need for some of my fixes, although I believe at least the function.c one will still be needed. I'll check. The force_nonfallthru change is still necessary, because force_nonfallthru should be almost a no-op in cfglayout mode. The whole concept of a fallthru edge doesn't exist in cfglayout mode: any single_succ edge is a fallthru edge until the order of the basic blocks has been determined and the insn chain is re-linked (cfglayout mode originally was developed for bb-reorder, to move blocks around more easily). So the correct patch would actually be: Index: cfgrtl.c === --- cfgrtl.c(revision 193046) +++ cfgrtl.c(working copy) @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = { cfg_layout_split_edge, rtl_make_forwarder_block, NULL, /* tidy_fallthru_edge */ - rtl_force_nonfallthru, + NULL, /* force_nonfallthru */ rtl_block_ends_with_call_p, rtl_block_ends_with_condjump_p, rtl_flow_call_edges_add, (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge hooks, they are cfgrtl-only.) But obviously that won't work because bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in cfglayout mode. That is a bug. The call to force_nonfallthru results in a dangling barrier: cfgrtl.c:1523 emit_barrier_after (BB_END (jump_block)); In cfglayout mode, barriers don't exist in the insns chain, and they don't have to because every edge is a fallthru edge. If there are barriers before cfglayout mode, they are either removed or linked in the basic block footer, and fixup_reorder_chain restores or inserts barriers where necessary to drop out of cfglayout mode. This emit_barrier_after call hangs a barrier after BB_END but not in the footer, and I'm pretty sure the result will be that the barrier is lost in fixup_reorder_chain. See also emit_barrier_after_bb for how inserting a barrier should be done in cfglayout mode. So in short, bbpart doesn't know what it wants to be: a cfgrtl or a cfglayout pass. It doesn't work without cfglayout but it's doing things that are only correct in the cfgrtl world and Very Wrong Indeed in cfglayout-land. Regarding your earlier question about why we needed to add the barrier, I need to dig up the details again but essentially I found that the barriers were being added by bbpart, but bbro was reordering things and the block that ended up at the border between the hot and cold section didn't necessarily have a barrier on it because it was not previously at the region boundary. That doesn't sound right. bbpart doesn't actually re-order the basic blocks, it only marks the blocks with the partition they will be assigned to. Whatever ends up at the border between the two partitions is not relevant: the hot section cannot end in a fall-through edge to the cold section (verify_flow_info even checks for that, see fallthru edge crosses section boundary (bb %i)) so it must end in some explicit jump. Such jumps are always followed by a barrier. The only reason I can think of why there might be a missing barrier, is because fixup_reorder_chain has a bug and forgets to insert the barrier in some cases (and I suspect this may be the case for return patterns, or the a.m. issue of a dropper barrier). I would like to work on debugging this, but it's hard without test cases... I'm working on trying to reproduce some of these failures in a test case I can share. So far, I have only been able to reproduce the failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still working on getting a smaller/shareable test case for the other 2 issues. The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I had fixed with my original changes to cfgrtl.c. Need to understand why there is a reg crossing note between 2 bbs in the same partition. In the hmmer test case I also hit a failures in rtl_verify_flow_info and rtl_verify_flow_info_1: gcc -c -o sre_math.o -DSPEC_CPU -D NDEBUG-fprofile-use -freorder-blocks-and-partition -O2 sre_math.c sre_math.c: In function ‘Gammln’: sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section } ^ sre_math.c:161:1: error: missing barrier after block 6 sre_math.c:161:1: internal compiler error: verify_flow_info failed This was due to some code in thread_prologue_and_epilogue_insns that duplicated tail blocks: if (e) { copy_bb = create_basic_block (NEXT_INSN (BB_END (e-src)), NULL_RTX, e-src);
Re: [patch] Normalize more bitmap function names.
On 11/1/12, Diego Novillo dnovi...@google.com wrote: On 2012-10-31 13:41 , Lawrence Crowl wrote: This patch normalizes more bitmap function names. sbitmap.h TEST_BIT - bitmap_bit_p SET_BIT - bitmap_set_bit I wonder if it wouldn't it be more consistent if TEST_BIT became bitmap_test_bit() ? I never particularly liked the lispy *_p convention. But I don't really care much one way or the other. The plan was to simply use the bitmap names, and given the schedule I would prefer to leave it as is. The patch is OK. Okay. I will merge and commit. -- Lawrence Crowl
Fwd/Re: Ping / update: RFA: replace #ifdef with if/#if for HAVE_ATTR_*
Quoting Richard Sandiford rdsandif...@googlemail.com: OK with those changes for the rtl bits. Can't approve the generator stuff though. Can you build machinery maintainers also review gen* and *.def patches, of this patch ( http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02897.html ). or are you strictly limited to *.in files as hinted in the MAINTAINERS file?
Re: [patch] Normalize bitmap iteration.
On 11/1/12, Diego Novillo dnovi...@google.com wrote: On 2012-10-31 13:43 , Lawrence Crowl wrote: Rename the EXECUTE_IF_SET_IN_SBITMAP macro to EXECUTE_IF_SET_IN_BITMAP. Its implementation is now identical to that in bitmap.h. To prevent redefinition errors, both definitions are now guarded by #ifndef. An alternate strategy is to simply include bitmap.h from sbitmap.h. As this would increase build time, I have elected to use the #ifndef version. I do not have a strong preference here. Me neither. This seems easy enough. static inline void -sbitmap_iter_init (sbitmap_iterator *i, const_sbitmap bmp, unsigned int min) +bmp_iter_set_init (sbitmap_iterator *i, const_sbitmap bmp, + unsigned int min, unsigned *bit_no ATTRIBUTE_UNUSED) So, we'll be changing this again, right? Once we introduce the various iterator types? If you mean C++ style iterators, yes. If you mean the 'computing' bitmap iterators, no. Index: gcc/bitmap.h === --- gcc/bitmap.h (revision 193006) +++ gcc/bitmap.h (working copy) @@ -682,10 +682,13 @@ bmp_iter_and_compl (bitmap_iterator *bi, should be treated as a read-only variable as it contains loop state. */ +#ifndef EXECUTE_IF_SET_IN_BITMAP +/* See sbitmap.h for the other definition of EXECUTE_IF_SET_IN_BITMAP. */ Ah... if we could only overload macro defintions ;) I had the same though. The patch is OK. Thanks. Okay, I'll merge and commit. -- Lawrence Crowl
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
On Thu, Nov 1, 2012 at 9:44 AM, Diego Novillo dnovi...@google.com wrote: On Thu, Nov 1, 2012 at 12:40 PM, Sharad Singhai sing...@google.com wrote: I found the problem and the following patch fixes it. The issue with my testing was that I was only looking at 'FAIL' lines but forgot to tally the 'UNRESOLVED' test cases, the real symptoms of my test problems. In any case, I am rerunning the whole testsuite just to be sure. Assuming tests pass, is it okay to commit the following? Thanks, Sharad 2012-11-01 Sharad Singhai sing...@google.com PR other/55164 * dumpfile.h (struct dump_file_info): Fix order of flags. OK (remember to insert a tab at the start of each ChangeLog line). Fixed tab chars. (they were really there, but gmail ate them! :)) Retested and found all my 'UNRESOLVED' problems were gone. Hence committed the fix as r193064. Thanks, Sharad Diego.
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
Hi Jakub, I would like to get the fission implementation in before stage 1. It has been under review for some time, and is awaiting another round of review now. More info here: http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02684.html Sterling
Re: [patch] Remove unused ebitmap and unused sbitmap functions.
On 11/1/12, Diego Novillo dnovi...@google.com wrote: On 2012-10-31 18:47 , Lawrence Crowl wrote: 2012-10-31 Lawrence Crowl cr...@google.com * ebitmap.h: Remove unused. * ebitmap.c: Remove unused. * Makefile.in: Remove ebitmap.h and ebitmap.c. * sbitmap.h (SBITMAP_SIZE_BYTES): Move to source file. (SET_BIT_WITH_POPCOUNT): Remove unused. (RESET_BIT_WITH_POPCOUNT): Remove unused. (bitmap_copy_n): Remove unused. (bitmap_range_empty_p): Remove unused. (sbitmap_popcount): Remove unused. (sbitmap_verify_popcount): Make private to source file. * sbitmap.c (SBITMAP_SIZE_BYTES): Move here from header. (bitmap_copy_n): Remove unused. (bitmap_range_empty_p): Remove unused. (sbitmap_popcount): Remove unused. (sbitmap_verify_popcount): Make private to source file. Index: gcc/sbitmap.c === --- gcc/sbitmap.c(revision 193049) +++ gcc/sbitmap.c(working copy) @@ -22,6 +22,9 @@ along with GCC; see the file COPYING3. #include coretypes.h #include sbitmap.h +/* Return the set size needed for N elements. */ +#define SBITMAP_SIZE_BYTES(BITMAP) ((BITMAP)-size * sizeof (SBITMAP_ELT_TYPE)) + Not terribly important, but maybe take advantage of the change and convert it into a static inline function? Sure. OK, otherwise. I will merge and commit. -- Lawrence Crowl
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 1, 2012 at 10:06 AM, Dmitry Vyukov dvyu...@google.com wrote: On Thu, Nov 1, 2012 at 7:47 PM, Xinliang David Li davi...@google.com wrote: On Wed, Oct 31, 2012 at 11:27 PM, Jakub Jelinek ja...@redhat.com wrote: On Wed, Oct 31, 2012 at 11:00:17PM -0700, Xinliang David Li wrote: + /* Below are things we do not instrument + (no possibility of races or not implemented yet). */ + if (/* Compiler-emitted artificial variables. */ + (DECL_P (expr) DECL_ARTIFICIAL (expr)) + /* The var does not live in memory - no possibility of races. */ + || (tcode == VAR_DECL + TREE_ADDRESSABLE (expr) == 0 == !TREE_ADDRESSABLE (expr) + TREE_STATIC (expr) == 0) To detect stack varaible, TREE_STATIC can not be used. Use !DECL_EXTERNAL instead. TREE_STATIC is right for that. Hmm, what does this condition try to capture then? As far as I remember, I've seen global variables marked as !ADDRESSABLE. And this condition is to instrument global vars in any case. But it was a long time ago. But it skips those globals without static storage and marked as not addressable. It seems to me you want to skip all stack local variables that are not address escaped. Without address escape analysis, the address taken bit (not the same as addressable attribute) should be used. As far as I can tell, such bit is not available in var_decl. The varpool_node has one, but it is only for var_decls with static storage. It is also unfortunate that there is no single bit to test if a variable is function auto, though there is an interface to call which is 'auto_var_in_fn_p (...)'. The condition to skip such variable references are: if (tcode == VAR_DECL auto_var_in_fn_p (expr, ..) !address_taken (expr)) The TREE_ADDRESSABLE check seems redundant -- if the var_decl (instead of ssa name) appears in the assignment, why would it not be 'addressable'? And being addressable does not mean it is address taken either. Diego, is there a way to check address taken attribute for auto vars? thanks, David
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 1, 2012 at 11:16 AM, Dmitry Vyukov dvyu...@google.com wrote: On Thu, Nov 1, 2012 at 10:14 PM, Dmitry Vyukov dvyu...@google.com wrote: On Wed, Oct 31, 2012 at 11:00:17PM -0700, Xinliang David Li wrote: + /* Below are things we do not instrument + (no possibility of races or not implemented yet). */ + if (/* Compiler-emitted artificial variables. */ + (DECL_P (expr) DECL_ARTIFICIAL (expr)) + /* The var does not live in memory - no possibility of races. */ + || (tcode == VAR_DECL + TREE_ADDRESSABLE (expr) == 0 == !TREE_ADDRESSABLE (expr) + TREE_STATIC (expr) == 0) To detect stack varaible, TREE_STATIC can not be used. Use !DECL_EXTERNAL instead. TREE_STATIC is right for that. Hmm, what does this condition try to capture then? As far as I remember, I've seen global variables marked as !ADDRESSABLE. And this condition is to instrument global vars in any case. But it was a long time ago. But it skips those globals without static storage and marked as not addressable. It seems to me you want to skip all stack local variables that are not address escaped. Yes, that's correct. I am pretty sure I implemented it incorrectly :) The high-level intention is to skip memory accesses that can't constitute a data race. Originally there were some other checks, e.g. check for loads from vtable. That should be caught by the is_load_const check there. David Without address escape analysis, the address taken bit (not the same as addressable attribute) should be used. As far as I can tell, such bit is not available in var_decl. The varpool_node has one, but it is only for var_decls with static storage. It is also unfortunate that there is no single bit to test if a variable is function auto, though there is an interface to call which is 'auto_var_in_fn_p (...)'. The condition to skip such variable references are: if (tcode == VAR_DECL auto_var_in_fn_p (expr, ..) !address_taken (expr)) The TREE_ADDRESSABLE check seems redundant -- if the var_decl (instead of ssa name) appears in the assignment, why would it not be 'addressable'? And being addressable does not mean it is address taken either. Diego, is there a way to check address taken attribute for auto vars? thanks, David
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 1, 2012 at 2:11 PM, Xinliang David Li davi...@google.com wrote: The TREE_ADDRESSABLE check seems redundant -- if the var_decl (instead of ssa name) appears in the assignment, why would it not be 'addressable'? And being addressable does not mean it is address taken either. Yes, it does mean that. Diego, is there a way to check address taken attribute for auto vars? TREE_ADDRESSABLE is it. If that bit is set, the address of the variable has been taken. Diego.
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 11:11:13AM -0700, Xinliang David Li wrote: But it skips those globals without static storage and marked as not addressable. It seems to me you want to skip all stack local variables that are not address escaped. Without address escape analysis, the address taken bit (not the same as addressable attribute) should be used. As far as I can tell, such bit is not available in var_decl. The varpool_node has one, but it is only for var_decls with static storage. It is also unfortunate that there is no single bit to test if a variable is function auto, though there is an interface to call which is 'auto_var_in_fn_p (...)'. The condition to skip such variable references are: if (tcode == VAR_DECL auto_var_in_fn_p (expr, ..) !address_taken (expr)) The TREE_ADDRESSABLE check seems redundant -- if the var_decl (instead of ssa name) appears in the assignment, why would it not be 'addressable'? And being addressable does not mean it is address taken either. TREE_ADDRESSABLE really is a flag whether the address of the decl is ever taken (or addresses of its fields etc.), and it is updated from time to time (execute_update_addresses_taken) after certain passes. Don't understand why you think it is redundant. Jakub
[wwwdocs] benchmarks/index.html -- Richi related updates
I applied the change below, adjusting to Richi's name change and some (announced?) change on the SUSE web infrastructure side (which does a permanent redirect). On a related note, Tobias, http://users.physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/ now gives an Internal Server Error. Gerald Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/benchmarks/index.html,v retrieving revision 1.28 diff -u -3 -p -r1.28 index.html --- index.html 31 Dec 2008 08:48:50 - 1.28 +++ index.html 1 Nov 2012 18:16:00 - @@ -65,8 +65,8 @@ http://annwm.lbl.gov/bench//a. /p p -Richard Guuml;nther runs -a href=http://www.suse.de/~rguenther/tramp3d/tramp3d-v4.cpp.gz; +Richard Biener runs +a href=http://users.suse.com/~rguenther/tramp3d/; TraMP3d-v4/a tracking mainline GCC compile and runtime performance and its memory usage. Various other C++ benchmarks and Polyhedron are run also, results can be found at a href=http://gcc.opensuse.org/c++bench/;
Re: [tsan] ThreadSanitizer instrumentation part
For the following case: int foo() { int a[100]; // use 'a' as a[i] ... } Assuming there is no assignment of the form p = a[i] in the source, variable 'a' still will be allocated in stack and its address is is needed. Is the addressable bit set for 'a'? If yes, then it is not the same as address taken. David On Thu, Nov 1, 2012 at 11:24 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 11:11:13AM -0700, Xinliang David Li wrote: But it skips those globals without static storage and marked as not addressable. It seems to me you want to skip all stack local variables that are not address escaped. Without address escape analysis, the address taken bit (not the same as addressable attribute) should be used. As far as I can tell, such bit is not available in var_decl. The varpool_node has one, but it is only for var_decls with static storage. It is also unfortunate that there is no single bit to test if a variable is function auto, though there is an interface to call which is 'auto_var_in_fn_p (...)'. The condition to skip such variable references are: if (tcode == VAR_DECL auto_var_in_fn_p (expr, ..) !address_taken (expr)) The TREE_ADDRESSABLE check seems redundant -- if the var_decl (instead of ssa name) appears in the assignment, why would it not be 'addressable'? And being addressable does not mean it is address taken either. TREE_ADDRESSABLE really is a flag whether the address of the decl is ever taken (or addresses of its fields etc.), and it is updated from time to time (execute_update_addresses_taken) after certain passes. Don't understand why you think it is redundant. Jakub
[wwwdocs] testing/testing-blitz.html update
Something tells me, nobody(?) is really doing the kind of regular testing that we once envisioned, but while we have the page, let's at least make sure the link works. Anyone has been using a newer version and wants to update the description? Gerald Index: testing-blitz.html === RCS file: /cvs/gcc/wwwdocs/htdocs/testing/testing-blitz.html,v retrieving revision 1.5 diff -u -3 -p -r1.5 testing-blitz.html --- testing-blitz.html 9 Jan 2010 15:13:06 - 1.5 +++ testing-blitz.html 1 Nov 2012 18:36:53 - @@ -8,7 +8,7 @@ h1Blitz++ build and test guide/h1 pThis page is a guide to building the -a href=http://www.oonumerics.org/blitz/;Blitz++/a scientific +a href=http://sourceforge.net/projects/blitz/;Blitz++/a scientific computing class library as part of GCC integration testing./p h2Resource usage/h2
Re: RFA: hookize ADJUST_INSN_LENGTH
Quoting Richard Sandiford rdsandif...@googlemail.com: But I think extending the FSM from just being an on-the-fly transformatino during final (as for ARM) to something that happens earlier is a pretty major step in terms of the interface between the backend and the rest of GCC. There are then undocumented restrictions on what hooks can be called when, Indeed. I found I have to call extract_constrain_insn_cached to restore recog_data (the struct, even though gdb will never believe this and insist on that being a type) if I have recognized other insns (by evaluating an attribute) from a splitter predicate. which if broken could lead to the FSM producing different results during final than it did earlier. Well, actually, I did find that this happens, and added code to either compensate for changes or inhibit them. For the most part, these changes are wanted, we just need to manage them to avoid out-of-range branches or ADDR_DIFF_VECs. You could think of this as simulated annealing for ice: some provisions need to be made to avoid rupturing the vessel, either by anticipating size increases, containing pressure to limit size increase, or a combination of both. E.g., personally, I don't think it's acceptable for ADJUST_INSN_LENGTH to hard-code an assumption that all instructions are processed in assembly order, which the ARC one seems to (irrespective of this patch). It doesn't, at least not for correctness; you might argue that it depends on this for compilation time performance. It does a scan in round-robin fashion from the last insn it processed till it reaches the one that is to be processed. But I did develop this stuff when passes were much more rigid than they are today; I'll look into the feasibility of running the ccfsm in a machine-specific pass to finalize in rtl whatever COND_EXEC opportunities are visible at this stage, and calling it after preliminary compare-and-branch determination, and after delay branch scheduling. Maybe at some other places too. One complication is that sometimes a compare-and-branch decision has to be reversed, so the exact ccfsm description before instruction output is that it is unlikely to clobber the flags. Thus, a conditionalizing that depends on the flags surviving through a compare-and-branch cannot be finalized till the branch is safe. Still, the conditionalizing is likely and we should take length changes into account, provided that the compare-and-branch stays in range. I suppose I could add a note REG_DEP_TRUE (unspec [list of brcc insn for required to be short for conditionalizing] UNSPEC_CCFSM) for any instructions subject to such uncertain conditionalizations. Or probably rather three UNSPEC kinds: one to indicate ccfsm-predicated deletion (for a branch), one for general conditionalizing (for arithmetic, branches and calls), and one for equality conditionalizing (for branches, wider offset ranges are available for short variants, and for register zeroing, which has a not-equal conditionalized short form).
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 11:32:01AM -0700, Xinliang David Li wrote: For the following case: int foo() { int a[100]; // use 'a' as a[i] ... } Assuming there is no assignment of the form p = a[i] in the source, variable 'a' still will be allocated in stack and its address is is needed. Is the addressable bit set for 'a'? If yes, then it is not the same as address taken. But how is that relevant to tsan? If TREE_ADDRESSABLE isn't set on a, then yes, you can still use a[i] = 6; or return a[j]; on it, and it will be allocated on stack, but different threads can't access the array, so there is no data race possible. Only if a is static int a[100]; above, or otherwise is_global_var (decl). Jakub
Re: [tsan] ThreadSanitizer instrumentation part
That is exactly related to tsan -- it should skip local variable that is not address taken (though still addressable, which by addressable, it means the variable has a memory home). The problem is that if 'addressable' bit is not equivalent to 'address taken', it will be too conservative to use it --- as addressable variables may not be address taken. Note that even 'address taken' is too conservative to use here. Only when it is escaped the thread (via non TLS global pointers or other escaped local, heap memories), should it be considered to be instrumented. David On Thu, Nov 1, 2012 at 11:49 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 11:32:01AM -0700, Xinliang David Li wrote: For the following case: int foo() { int a[100]; // use 'a' as a[i] ... } Assuming there is no assignment of the form p = a[i] in the source, variable 'a' still will be allocated in stack and its address is is needed. Is the addressable bit set for 'a'? If yes, then it is not the same as address taken. But how is that relevant to tsan? If TREE_ADDRESSABLE isn't set on a, then yes, you can still use a[i] = 6; or return a[j]; on it, and it will be allocated on stack, but different threads can't access the array, so there is no data race possible. Only if a is static int a[100]; above, or otherwise is_global_var (decl). Jakub
patch to fix PR55150
The following patch fixes the second test of PR55150 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55150 The patch was successfully bootstrapped on x86/x86-64. Committed as rev. 193065. 2012-11-01 Vladimir Makarov vmaka...@redhat.com PR middle-end/55150 * lra-constraints.c (lra_constraints): Check only pseudos with equivalences. Add insns with equivalence pseudos. 2012-11-01 Vladimir Makarov vmaka...@redhat.com PR middle-end/55150 * gcc.dg/pr55150.c: Rename to gcc.dg/pr55150-1.c. * gcc.dg/pr55150-2.c: New test. Index: lra-constraints.c === --- lra-constraints.c (revision 193064) +++ lra-constraints.c (working copy) @@ -3244,9 +3244,11 @@ lra_constraints (bool first_p) { bool changed_p; int i, hard_regno, new_insns_num; - unsigned int min_len, new_min_len; - rtx set, x, dest_reg; + unsigned int min_len, new_min_len, uid; + rtx set, x, reg, dest_reg; basic_block last_bb; + bitmap_head equiv_insn_bitmap; + bitmap_iterator bi; lra_constraint_iter++; if (lra_dump_file != NULL) @@ -3261,10 +3263,12 @@ lra_constraints (bool first_p) lra_risky_transformations_p = false; new_insn_uid_start = get_max_uid (); new_regno_start = first_p ? lra_constraint_new_regno_start : max_reg_num (); + bitmap_initialize (equiv_insn_bitmap, reg_obstack); for (i = FIRST_PSEUDO_REGISTER; i new_regno_start; i++) if (lra_reg_info[i].nrefs != 0) { ira_reg_equiv[i].profitable_p = true; + reg = regno_reg_rtx[i]; if ((hard_regno = lra_get_regno_hard_regno (i)) = 0) { int j, nregs = hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (i)]; @@ -3272,7 +3276,7 @@ lra_constraints (bool first_p) for (j = 0; j nregs; j++) df_set_regs_ever_live (hard_regno + j, true); } - else if ((x = get_equiv_substitution (regno_reg_rtx[i])) != NULL_RTX) + else if ((x = get_equiv_substitution (reg)) != reg) { bool pseudo_p = contains_reg_p (x, false, false); rtx set, insn; @@ -3310,8 +3314,15 @@ lra_constraints (bool first_p) ira_reg_equiv[i].defined_p = false; if (contains_reg_p (x, false, true)) ira_reg_equiv[i].profitable_p = false; + if (get_equiv_substitution (reg) != reg) + bitmap_ior_into (equiv_insn_bitmap, lra_reg_info[i].insn_bitmap); } } + /* We should add all insns containing pseudos which should be + substituted by their equivalences. */ + EXECUTE_IF_SET_IN_BITMAP (equiv_insn_bitmap, 0, uid, bi) +lra_push_insn_by_uid (uid); + bitmap_clear (equiv_insn_bitmap); lra_eliminate (false); min_len = lra_insn_stack_length (); new_insns_num = 0; Index: testsuite/gcc.dg/pr55150-1.c === --- testsuite/gcc.dg/pr55150-1.c (revision 0) +++ testsuite/gcc.dg/pr55150-1.c (working copy) @@ -0,0 +1,72 @@ +/* PR middle-end/55150 */ +/* { dg-do compile } */ +/* { dg-options -Os -g } */ + +typedef unsigned int KEY_TABLE_TYPE[(272 / 4)]; + typedef unsigned int u32; + typedef unsigned char u8; + static const u32 Camellia_SBOX[][256] = { + }; + static const u32 SIGMA[] = { +0xa09e667f, 0x3bcc908b, 0xb67ae858, 0x4caa73b2, 0xc6ef372f, 0xe94f82be, 0x54ff53a5, 0xf1d36f1c, 0x10e527fa, 0xde682d1d, 0xb05688c2, 0xb3e6c1fd }; + int Camellia_Ekeygen (int keyBitLength, const u8 * rawKey, KEY_TABLE_TYPE k) { +register u32 s0, s1, s2, s3; +k[0] = s0 = ( { + u32 r = *(const u32 *) (rawKey); + r; + } +); +k[2] = s2 = ( { + u32 r = *(const u32 *) (rawKey + 8); + r; + } +); +k[3] = s3 = ( { + u32 r = *(const u32 *) (rawKey + 12); + r; + } +); +if (keyBitLength != 128) { +k[8] = s0 = ( { + u32 r = *(const u32 *) (rawKey + 16); + r; + } +); +if (keyBitLength == 192) { + k[10] = s2 = ~s0; + k[11] = s3 = ~s1; + } + } +s0 ^= k[0], s1 ^= k[1], s2 ^= k[2], s3 ^= k[3]; +if (keyBitLength == 128) { +k[4] = s0, k[5] = s1, k[6] = s2, k[7] = s3; + } +else { +k[12] = s0, k[13] = s1, k[14] = s2, k[15] = s3; +s0 ^= k[8], s1 ^= k[9], s2 ^= k[10], s3 ^= k[11]; +do{ + register u32 _t0, _t1, _t2, _t3; + _t0 = s2 ^ ((SIGMA + 10))[0]; + _t3 ^= Camellia_SBOX[3][(_t0 8) 0xff]; + s1 ^= _t3; + } +while (0); +do{ + u32 _t0 = s0 (32 - 30); + s2 = (s2 30) | (s3 (32 - 30)); + s3 = (s3 30) | _t0; + } +while (0); +k[40] = s0, k[41] = s1, k[42] = s2, k[43] = s3; +k[64] = s1, k[65] = s2, k[66] = s3, k[67] = s0; +s0 = k[8], s1 = k[9], s2 = k[10], s3 = k[11]; +k[36] = s0, k[37] = s1, k[38] = s2, k[39] = s3; +s0 = k[12], s1 = k[13], s2 = k[14], s3 = k[15]; +do { + s1 = (s1 15) | (s2 (32 - 15)); + } +while (0); +k[12] = s0, k[13] = s1, k[14] =
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 11:57:21AM -0700, Xinliang David Li wrote: That is exactly related to tsan -- it should skip local variable that is not address taken (though still addressable, which by addressable, it means the variable has a memory home). The problem is that if 'addressable' bit is not equivalent to 'address taken', it will be too conservative to use it --- as addressable variables may not be address taken. But TREE_ADDRESSABLE is equivalent to address taken. Note that even 'address taken' is too conservative to use here. Only when it is escaped the thread (via non TLS global pointers or other escaped local, heap memories), should it be considered to be instrumented. If you need it even less conservative, I guess you could do say: struct ptr_info_def pi; memset (pi, 0, sizeof (pi)); pi.escaped = 1; pi.nonlocal = 1; return pt_solution_includes (pi, decl); (the nonlocal in pt_solution_includes_1 performs the is_global_var check that needs to be done, and escaped checks whether decl is in the escaped set). Then say for int foo(int i) { int a[100], b[100], c[100], d[100], *p; if (i 10) p = a; else if (i 60) p = b; else if (i = 65) p = c; else p = d; p[i] = 1; p[2 * i] = 2; return p[i + 1]; } still none of a, b, c, and d vars will be included, even when they are TREE_ADDRESSABLE, but if you say pass p to another function, or set a global var to it, it will already return true for them. Jakub
[wwwdocs,Java] remove sources.redhat.com reference from papers/nosb.html
Applied. Gerald 2012-10-31 Gerald Pfeifer ger...@pfeifer.com * papers/nosb.html: Remove reference to sources.redhat.com. Index: papers/nosb.html === RCS file: /cvs/gcc/wwwdocs/htdocs/java/papers/nosb.html,v retrieving revision 1.4 diff -u -3 -p -r1.4 nosb.html --- papers/nosb.html4 Jan 2003 18:34:19 - 1.4 +++ papers/nosb.html1 Nov 2012 19:04:42 - @@ -15,8 +15,8 @@ No Silver Bullet - Garbage Collection for Java in Embedded Systems/H1 /TD -TD ALIGN=RIGHT WIDTH=400A HREF=http://sources.redhat.com/;Red Hat Sources site/A -| A HREF=../Java Page/A/TD +TD ALIGN=RIGHT WIDTH=400 + A HREF=../Java Page/A/TD /TR /TABLE
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 1, 2012 at 12:16 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 11:57:21AM -0700, Xinliang David Li wrote: That is exactly related to tsan -- it should skip local variable that is not address taken (though still addressable, which by addressable, it means the variable has a memory home). The problem is that if 'addressable' bit is not equivalent to 'address taken', it will be too conservative to use it --- as addressable variables may not be address taken. But TREE_ADDRESSABLE is equivalent to address taken. It seems not -- unless our definition of them is different. I debugged the following program: int foo() { int a[1000]; int i, s; for (i = 0; i 1000; i++) a[i] = i; for (i = 0; i 1000; i++) s += a[i]; return s; } a's address is not taken (at source level), but it is marked as addressable. Note that even 'address taken' is too conservative to use here. Only when it is escaped the thread (via non TLS global pointers or other escaped local, heap memories), should it be considered to be instrumented. If you need it even less conservative, I guess you could do say: struct ptr_info_def pi; memset (pi, 0, sizeof (pi)); pi.escaped = 1; pi.nonlocal = 1; return pt_solution_includes (pi, decl); That will be nice. Are points-to info exposed to client code like this? Are there standard aliaser interfaces for them? thanks, David (the nonlocal in pt_solution_includes_1 performs the is_global_var check that needs to be done, and escaped checks whether decl is in the escaped set). Then say for int foo(int i) { int a[100], b[100], c[100], d[100], *p; if (i 10) p = a; else if (i 60) p = b; else if (i = 65) p = c; else p = d; p[i] = 1; p[2 * i] = 2; return p[i + 1]; } still none of a, b, c, and d vars will be included, even when they are TREE_ADDRESSABLE, but if you say pass p to another function, or set a global var to it, it will already return true for them. Jakub
[PATCH 01/13] Initial import of asan from the Google branch
From: dnovillo dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4 This patch merely imports the initial state of asan as it was in the Google branch. It provides basic infrastructure for asan to instrument memory accesses on the heap, at -O3. Note that it supports neither stack nor global variable protection. The rest of the patches of the set is intended to further improve this base. * Makefile.in: Add tree-asan.c. * common.opt: Add -fasan option. * invoke.texi: Document the new flag. * passes.c: Add the asan pass. * toplev.c (compile_file): Call asan_finish_file. * tree-asan.c: New file. * tree-asan.h: New file. * tree-pass.h: Declare pass_asan. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192328 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 10 ++ gcc/Makefile.in | 5 + gcc/common.opt | 4 + gcc/doc/invoke.texi | 8 +- gcc/passes.c| 1 + gcc/toplev.c| 5 + gcc/tree-asan.c | 403 gcc/tree-asan.h | 26 gcc/tree-pass.h | 1 + 9 files changed, 462 insertions(+), 1 deletion(-) create mode 100644 gcc/ChangeLog.asan create mode 100644 gcc/tree-asan.c create mode 100644 gcc/tree-asan.h diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan new file mode 100644 index 000..40299e2 --- /dev/null +++ b/gcc/ChangeLog.asan @@ -0,0 +1,10 @@ +2012-10-10 Wei Mi w...@google.com + + * Makefile.in: Add tree-asan.c. + * common.opt: Add -fasan option. + * invoke.texi: Document the new flag. + * passes.c: Add the asan pass. + * toplev.c (compile_file): Call asan_finish_file. + * tree-asan.c: New file. + * tree-asan.h: New file. + * tree-pass.h: Declare pass_asan. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 3a8ffbe..e8c4a19 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1350,6 +1350,7 @@ OBJS = \ tracer.o \ trans-mem.o \ tree-affine.o \ + tree-asan.o \ tree-call-cdce.o \ tree-cfg.o \ tree-cfgcleanup.o \ @@ -2209,6 +2210,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \ $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \ $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H) +tree-asan.o : tree-asan.c tree-asan.h $(CONFIG_H) pointer-set.h \ + $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \ + output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \ + tree-pretty-print.h tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \ $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \ $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \ diff --git a/gcc/common.opt b/gcc/common.opt index 5b69aff..eca0740 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -849,6 +849,10 @@ fargument-noalias-anything Common Ignore Does nothing. Preserved for backward compatibility. +fasan +Common RejectNegative Report Var(flag_asan) +Enable AddressSanitizer, a memory error detector + fasynchronous-unwind-tables Common Report Var(flag_asynchronous_unwind_tables) Optimization Generate unwind tables that are exact at each instruction boundary diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index ff0c87d..8292ef1 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -353,7 +353,7 @@ Objective-C and Objective-C++ Dialects}. @item Optimization Options @xref{Optimize Options,,Options that Control Optimization}. @gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol --falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol +-falign-labels[=@var{n}] -falign-loops[=@var{n}] -fasan -fassociative-math @gol -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol @@ -6822,6 +6822,12 @@ assumptions based on that. The default is @option{-fzero-initialized-in-bss}. +@item -fasan +Enable AddressSanitizer, a fast memory error detector. +Memory access instructions will be instrumented to detect +out-of-bounds and use-after-free bugs. So far only heap bugs will be detected. +See @uref{http://code.google.com/p/address-sanitizer/} for more details. + @item -fmudflap -fmudflapth -fmudflapir @opindex fmudflap @opindex fmudflapth diff --git a/gcc/passes.c b/gcc/passes.c index 67aae52..66a2f74 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1456,6 +1456,7 @@ init_optimization_passes (void) NEXT_PASS (pass_split_crit_edges); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); + NEXT_PASS (pass_asan); NEXT_PASS (pass_tree_loop); { struct opt_pass **p = pass_tree_loop.pass.sub; diff --git a/gcc/toplev.c b/gcc/toplev.c index 5cbb364..b1aff0c 100644
[PATCH 03/13] Initial asan cleanups
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 This patch defines a new asan_shadow_offset target macro, instead of having a mere macro in the asan.c file. It becomes thus cleaner to define the target macro for targets that supports asan, namely x86 for now. The ASAN_SHADOW_SHIFT (which, along with the asan_shadow_offset constant, is used to compute the address of the shadow memory byte for a given memory address) is defined in asan.h. * toplev.c (process_options): Warn and turn off -fasan if not supported by target. * asan.c: Include target.h. (asan_scale, asan_offset_log_32, asan_offset_log_64, asan_offset_log): Removed. (build_check_stmt): Use ASAN_SHADOW_SHIFT and targetm.asan_shadow_offset (). (asan_instrument): Don't initialize asan_offset_log. * asan.h (ASAN_SHADOW_SHIFT): Define. * target.def (TARGET_ASAN_SHADOW_OFFSET): New hook. * doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it. * doc/tm.texi: Regenerated. * Makefile.in (asan.o): Depend on $(TARGET_H). * config/i386/i386.c (ix86_asan_shadow_offset): New function. (TARGET_ASAN_SHADOW_OFFSET): Define. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192372 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 18 ++ gcc/Makefile.in| 2 +- gcc/asan.c | 25 ++--- gcc/asan.h | 6 +- gcc/config/i386/i386.c | 11 +++ gcc/doc/tm.texi| 6 ++ gcc/doc/tm.texi.in | 2 ++ gcc/target.def | 11 +++ gcc/toplev.c | 7 +++ 9 files changed, 67 insertions(+), 21 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index c196bfe..0bc9420 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,21 @@ +2012-10-11 Jakub Jelinek ja...@redhat.com + + * toplev.c (process_options): Warn and turn off -fasan + if not supported by target. + * asan.c: Include target.h. + (asan_scale, asan_offset_log_32, asan_offset_log_64, + asan_offset_log): Removed. + (build_check_stmt): Use ASAN_SHADOW_SHIFT and + targetm.asan_shadow_offset (). + (asan_instrument): Don't initialize asan_offset_log. + * asan.h (ASAN_SHADOW_SHIFT): Define. + * target.def (TARGET_ASAN_SHADOW_OFFSET): New hook. + * doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it. + * doc/tm.texi: Regenerated. + * Makefile.in (asan.o): Depend on $(TARGET_H). + * config/i386/i386.c (ix86_asan_shadow_offset): New function. + (TARGET_ASAN_SHADOW_OFFSET): Define. + 2012-10-10 Diego Novillo dnovi...@google.com * asan.c: Rename from tree-asan.c. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index a9da161..bdc5afb 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2213,7 +2213,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \ $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \ output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \ - tree-pretty-print.h + tree-pretty-print.h $(TARGET_H) tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \ $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \ $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \ diff --git a/gcc/asan.c b/gcc/asan.c index a6ceb57..e95be47 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -1,5 +1,5 @@ /* AddressSanitizer, a fast memory error detector. - Copyright (C) 2011 Free Software Foundation, Inc. + Copyright (C) 2011, 2012 Free Software Foundation, Inc. Contributed by Kostya Serebryany k...@google.com This file is part of GCC. @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3. If not see #include gimple.h #include asan.h #include gimple-pretty-print.h +#include target.h /* AddressSanitizer finds out-of-bounds and use-after-free bugs @@ -78,15 +79,6 @@ along with GCC; see the file COPYING3. If not see to create redzones for stack and global object and poison them. */ -/* The shadow address is computed as (Xasan_scale) + (1asan_offset_log). - We may want to add command line flags to change these values. */ - -static const int asan_scale = 3; -static const int asan_offset_log_32 = 29; -static const int asan_offset_log_64 = 44; -static int asan_offset_log; - - /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}. IS_STORE is either 1 (for a store) or 0 (for a load). SIZE_IN_BYTES is one of 1, 2, 4, 8, 16. */ @@ -202,15 +194,13 @@ build_check_stmt (tree base, gimple_set_location (g, location); gimple_seq_add_stmt (seq, g); - /* Build (base_addr asan_scale) + (1 asan_offset_log). */ + /* Build + (base_addr ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset (). */ t = build2 (RSHIFT_EXPR, uintptr_type, base_addr, - build_int_cst (uintptr_type, asan_scale)); + build_int_cst
[PATCH 09/13] Don't forget to protect 32 bytes aligned global variables.
From: wmi wmi@138bc75d-0d04-0410-961f-82ee72b054a4 It appeared that we were forgetting to protect global variables that are already 32 bytes aligned. Fixed thus. * varasm.c (assemble_variable): Set asan_protected even for decls that are already ASAN_RED_ZONE_SIZE or more bytes aligned. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192830 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 15 +++ gcc/varasm.c | 6 +++--- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 3da0a0b..57670f7 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,18 @@ +2012-10-25 Wei Mi w...@google.com + + * varasm.c (assemble_variable): Set asan_protected even + for decls that are already ASAN_RED_ZONE_SIZE or more + bytes aligned. + +2012-10-19 Diego Novillo dnovi...@google.com + + Merge from trunk rev 192612. + +2012-10-18 Xinliang David Li davi...@google.com + + * asan.c (asan_init_shadow_ptr_types): change shadow type + to signed type. + 2012-10-18 Jakub Jelinek ja...@redhat.com * asan.c (build_check_stmt): Unshare base. diff --git a/gcc/varasm.c b/gcc/varasm.c index 8a533ed..641ce0c 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -1991,11 +1991,11 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED, align_variable (decl, dont_output_data); if (flag_asan - asan_protect_global (decl) - DECL_ALIGN (decl) ASAN_RED_ZONE_SIZE * BITS_PER_UNIT) + asan_protect_global (decl)) { asan_protected = true; - DECL_ALIGN (decl) = ASAN_RED_ZONE_SIZE * BITS_PER_UNIT; + DECL_ALIGN (decl) = MAX (DECL_ALIGN (decl), + ASAN_RED_ZONE_SIZE * BITS_PER_UNIT); } set_mem_align (decl_rtl, DECL_ALIGN (decl)); -- 1.7.11.7
[PATCH 02/13] Rename tree-asan.[ch] to asan.[ch]
From: dnovillo dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4 Following a discussion we had on this list, this patch renames the file tree-asan.* into asan.*. * asan.c: Rename from tree-asan.c. Update all users. * asan.h: Rename from tree-asan.h Update all users. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192360 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 7 + gcc/Makefile.in| 4 +- gcc/asan.c | 403 + gcc/asan.h | 26 gcc/toplev.c | 2 +- gcc/tree-asan.c| 403 - gcc/tree-asan.h| 26 7 files changed, 439 insertions(+), 432 deletions(-) create mode 100644 gcc/asan.c create mode 100644 gcc/asan.h delete mode 100644 gcc/tree-asan.c delete mode 100644 gcc/tree-asan.h diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 40299e2..c196bfe 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,10 @@ +2012-10-10 Diego Novillo dnovi...@google.com + + * asan.c: Rename from tree-asan.c. + Update all users. + * asan.h: Rename from tree-asan.h + Update all users. + 2012-10-10 Wei Mi w...@google.com * Makefile.in: Add tree-asan.c. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index e8c4a19..a9da161 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1350,7 +1350,7 @@ OBJS = \ tracer.o \ trans-mem.o \ tree-affine.o \ - tree-asan.o \ + asan.o \ tree-call-cdce.o \ tree-cfg.o \ tree-cfgcleanup.o \ @@ -2210,7 +2210,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \ $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \ $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H) -tree-asan.o : tree-asan.c tree-asan.h $(CONFIG_H) pointer-set.h \ +asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \ $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \ output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \ tree-pretty-print.h diff --git a/gcc/asan.c b/gcc/asan.c new file mode 100644 index 000..a6ceb57 --- /dev/null +++ b/gcc/asan.c @@ -0,0 +1,403 @@ +/* AddressSanitizer, a fast memory error detector. + Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Kostya Serebryany k...@google.com + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + + +#include config.h +#include system.h +#include coretypes.h +#include tm.h +#include tree.h +#include tm_p.h +#include basic-block.h +#include flags.h +#include function.h +#include tree-inline.h +#include gimple.h +#include tree-iterator.h +#include tree-flow.h +#include tree-dump.h +#include tree-pass.h +#include diagnostic.h +#include demangle.h +#include langhooks.h +#include ggc.h +#include cgraph.h +#include gimple.h +#include asan.h +#include gimple-pretty-print.h + +/* + AddressSanitizer finds out-of-bounds and use-after-free bugs + with 2x slowdown on average. + + The tool consists of two parts: + instrumentation module (this file) and a run-time library. + The instrumentation module adds a run-time check before every memory insn. + For a 8- or 16- byte load accessing address X: + ShadowAddr = (X 3) + Offset + ShadowValue = *(char*)ShadowAddr; // *(short*) for 16-byte access. + if (ShadowValue) + __asan_report_load8(X); + For a load of N bytes (N=1, 2 or 4) from address X: + ShadowAddr = (X 3) + Offset + ShadowValue = *(char*)ShadowAddr; + if (ShadowValue) + if ((X 7) + N - 1 ShadowValue) + __asan_report_loadN(X); + Stores are instrumented similarly, but using __asan_report_storeN functions. + A call too __asan_init() is inserted to the list of module CTORs. + + The run-time library redefines malloc (so that redzone are inserted around + the allocated memory) and free (so that reuse of free-ed memory is delayed), + provides __asan_report* and __asan_init functions. + + Read more: + http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm + + Future work: + The current implementation supports only detection of out-of-bounds and + use-after-free bugs in heap. + In order to support out-of-bounds for stack and globals we will need +
[PATCH 11/13] Factorize condition insertion code out of build_check_stmt
From: dodji dodji@138bc75d-0d04-0410-961f-82ee72b054a4 This patch splits a new create_cond_insert_point_before_iter function out of build_check_stmt, to be used by a later patch. Tested by running cc1 -fasan on the test program below with and without the patch and by inspecting the gimple output to see that there is no change. void foo () { char foo[1] = {0}; foo[0] = 1; } gcc/ * asan.c (create_cond_insert_point_before_iter): Factorize out of ... (build_check_stmt): ... here. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192844 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 3 ++ gcc/asan.c | 120 + 2 files changed, 79 insertions(+), 44 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 9159b3f..0e0b9b8 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,5 +1,8 @@ 2012-10-26 Dodji Seketeli do...@redhat.com + * asan.c (create_cond_insert_point_before_iter): Factorize out of ... + (build_check_stmt): ... here. + * asan.c (build_check_stmt): Accept the memory access to be represented by an SSA_NAME. diff --git a/gcc/asan.c b/gcc/asan.c index b43f03b..736286e 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -397,6 +397,75 @@ asan_init_func (void) #define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1) #define PROB_ALWAYS(REG_BR_PROB_BASE) +/* Split the current basic block and create a condition statement + insertion point right before the statement pointed to by ITER. + Return an iterator to the point at which the caller might safely + insert the condition statement. + + THEN_BLOCK must be set to the address of an uninitialized instance + of basic_block. The function will then set *THEN_BLOCK to the + 'then block' of the condition statement to be inserted by the + caller. + + Similarly, the function will set *FALLTRHOUGH_BLOCK to the 'else + block' of the condition statement to be inserted by the caller. + + Note that *FALLTHROUGH_BLOCK is a new block that contains the + statements starting from *ITER, and *THEN_BLOCK is a new empty + block. + + *ITER is adjusted to still point to the same statement it was + *pointing to initially. */ + +static gimple_stmt_iterator +create_cond_insert_point_before_iter (gimple_stmt_iterator *iter, + bool then_more_likely_p, + basic_block *then_block, + basic_block *fallthrough_block) +{ + gimple_stmt_iterator gsi = *iter; + + if (!gsi_end_p (gsi)) +gsi_prev (gsi); + + basic_block cur_bb = gsi_bb (*iter); + + edge e = split_block (cur_bb, gsi_stmt (gsi)); + + /* Get a hold on the 'condition block', the 'then block' and the + 'else block'. */ + basic_block cond_bb = e-src; + basic_block fallthru_bb = e-dest; + basic_block then_bb = create_empty_bb (cond_bb); + + /* Set up the newly created 'then block'. */ + e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); + int fallthrough_probability = +then_more_likely_p +? PROB_VERY_UNLIKELY +: PROB_ALWAYS - PROB_VERY_UNLIKELY; + e-probability = PROB_ALWAYS - fallthrough_probability; + make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU); + + /* Set up the fallthrough basic block. */ + e = find_edge (cond_bb, fallthru_bb); + e-flags = EDGE_FALSE_VALUE; + e-count = cond_bb-count; + e-probability = fallthrough_probability; + + /* Update dominance info for the newly created then_bb; note that + fallthru_bb's dominance info has already been updated by + split_bock. */ + if (dom_info_available_p (CDI_DOMINATORS)) +set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); + + *then_block = then_bb; + *fallthrough_block = fallthru_bb; + *iter = gsi_start_bb (fallthru_bb); + + return gsi_last_bb (cond_bb); +} + /* Instrument the memory access instruction BASE. Insert new statements before ITER. @@ -411,8 +480,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, int size_in_bytes) { gimple_stmt_iterator gsi; - basic_block cond_bb, then_bb, else_bb; - edge e; + basic_block then_bb, else_bb; tree t, base_addr, shadow; gimple g; tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0]; @@ -421,51 +489,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter, = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1); tree base_ssa = base; - /* We first need to split the current basic block, and start altering - the CFG. This allows us to insert the statements we're about to - construct into the right basic blocks. */ - - cond_bb = gimple_bb (gsi_stmt (*iter)); - gsi = *iter; - gsi_prev (gsi); - if (!gsi_end_p (gsi)) -e = split_block (cond_bb, gsi_stmt (gsi)); - else -e = split_block_after_labels (cond_bb); - cond_bb = e-src; - else_bb =
[PATCH 05/13] Allow asan at -O0
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 This patch defines a new asan pass gate that is activated at -O0, in addition to the -O3 level at which it was initially activated. The patch also does some comment cleanups here and there. * asan.c (build_check_stmt): Rename join_bb variable to else_bb. (gate_asan_O0): New function. (pass_asan_O0): New variable. * passes.c (init_optimization_passes): Add pass_asan_O0. * tree-pass.h (pass_asan_O0): New declaration. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192415 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 8 gcc/asan.c | 44 +++- gcc/passes.c | 1 + gcc/tree-pass.h| 1 + 4 files changed, 45 insertions(+), 9 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 9bfccd7..505bce9 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,11 @@ +2012-10-12 Jakub Jelinek ja...@redhat.com + + * asan.c (build_check_stmt): Rename join_bb variable to else_bb. + (gate_asan_O0): New function. + (pass_asan_O0): New variable. + * passes.c (init_optimization_passes): Add pass_asan_O0. + * tree-pass.h (pass_asan_O0): New declaration. + 2012-10-11 Jakub Jelinek ja...@redhat.com * Makefile.in (GTFILES): Add $(srcdir)/asan.c. diff --git a/gcc/asan.c b/gcc/asan.c index 2e7d4d6..66dc571 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -137,7 +137,7 @@ build_check_stmt (tree base, location_t location, bool is_store, int size_in_bytes) { gimple_stmt_iterator gsi; - basic_block cond_bb, then_bb, join_bb; + basic_block cond_bb, then_bb, else_bb; edge e; tree t, base_addr, shadow; gimple g; @@ -158,23 +158,23 @@ build_check_stmt (tree base, else e = split_block_after_labels (cond_bb); cond_bb = e-src; - join_bb = e-dest; + else_bb = e-dest; - /* A recap at this point: join_bb is the basic block at whose head + /* A recap at this point: else_bb is the basic block at whose head is the gimple statement for which this check expression is being built. cond_bb is the (possibly new, synthetic) basic block the end of which will contain the cache-lookup code, and a conditional that jumps to the cache-miss code or, much more - likely, over to join_bb. */ + likely, over to else_bb. */ /* Create the bb that contains the crash block. */ then_bb = create_empty_bb (cond_bb); e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE); e-probability = PROB_VERY_UNLIKELY; - make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU); + make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU); - /* Mark the pseudo-fallthrough edge from cond_bb to join_bb. */ - e = find_edge (cond_bb, join_bb); + /* Mark the pseudo-fallthrough edge from cond_bb to else_bb. */ + e = find_edge (cond_bb, else_bb); e-flags = EDGE_FALSE_VALUE; e-count = cond_bb-count; e-probability = PROB_ALWAYS - PROB_VERY_UNLIKELY; @@ -184,7 +184,7 @@ build_check_stmt (tree base, if (dom_info_available_p (CDI_DOMINATORS)) { set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb); - set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb); + set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb); } gsi = gsi_last_bb (cond_bb); @@ -305,7 +305,7 @@ build_check_stmt (tree base, gimple_set_location (g, location); gsi_insert_after (gsi, g, GSI_NEW_STMT); - *iter = gsi_start_bb (join_bb); + *iter = gsi_start_bb (else_bb); } /* If T represents a memory access, add instrumentation code before ITER. @@ -447,4 +447,30 @@ struct gimple_opt_pass pass_asan = } }; +static bool +gate_asan_O0 (void) +{ + return flag_asan != 0 !optimize; +} + +struct gimple_opt_pass pass_asan_O0 = +{ + { + GIMPLE_PASS, + asan0, /* name */ + gate_asan_O0,/* gate */ + asan_instrument, /* execute */ + NULL,/* sub */ + NULL,/* next */ + 0, /* static_pass_number */ + TV_NONE, /* tv_id */ + PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_verify_flow | TODO_verify_stmts + | TODO_update_ssa/* todo_flags_finish */ + } +}; + #include gt-asan.h diff --git a/gcc/passes.c b/gcc/passes.c index 66a2f74..d4115b3 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1562,6 +1562,7 @@ init_optimization_passes (void) NEXT_PASS (pass_tm_edges); } NEXT_PASS (pass_lower_complex_O0); + NEXT_PASS (pass_asan_O0);
[PATCH 08/13] Fix a couple of ICEs.
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 After the previous patches uncovered the fact a NOTE_INSN_BASIC_BLOCK could show up in the middle of a basic block and thus violating an important invariant. THe cfgexpand.c hunk fixes that. Then it appeared that we could get tree sharing violation if build_check_stmt doesn't unshare its base memory parameter before building an ssa name for it. The last hunk fixes a crash that happens because cgraph_build_static_cdtor can call ggc_collect so holding trees around in automatic (thus visited by the gc marker routines) could lead to these tree behind free-ed underneath us. So the patch adds new gc roots for these trees. * asan.c (build_check_stmt): Unshare base. * asan.c (asan_ctor_statements): New variable. (asan_finish_file): Use asan_ctor_statements instead of ctor_statements. * cfgexpand.c (gimple_expand_cfg): If return_label is followed by NOTE_INSN_BASIC_BLOCK, emit var_ret_seq after the note instead of before it. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192567 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 12 gcc/asan.c | 13 + gcc/cfgexpand.c| 8 +++- 3 files changed, 28 insertions(+), 5 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 971de42..3da0a0b 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,15 @@ +2012-10-18 Jakub Jelinek ja...@redhat.com + + * asan.c (build_check_stmt): Unshare base. + + * asan.c (asan_ctor_statements): New variable. + (asan_finish_file): Use asan_ctor_statements instead + of ctor_statements. + + * cfgexpand.c (gimple_expand_cfg): If return_label is + followed by NOTE_INSN_BASIC_BLOCK, emit var_ret_seq + after the note instead of before it. + 2012-10-17 Jakub Jelinek ja...@redhat.com * varasm.c: Include asan.h. diff --git a/gcc/asan.c b/gcc/asan.c index c435d35..6715e51 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -459,6 +459,8 @@ build_check_stmt (tree base, set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb); } + base = unshare_expr (base); + gsi = gsi_last_bb (cond_bb); g = gimple_build_assign_with_ops (TREE_CODE (base), make_ssa_name (TREE_TYPE (base), NULL), @@ -748,6 +750,10 @@ asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v) CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init); } +/* Needs to be GTY(()), because cgraph_build_static_cdtor may + invoke ggc_collect. */ +static GTY(()) tree asan_ctor_statements; + /* Module-level instrumentation. - Insert __asan_init() into the list of CTORs. - TODO: insert redzones around globals. @@ -756,12 +762,11 @@ asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v) void asan_finish_file (void) { - tree ctor_statements = NULL_TREE; struct varpool_node *vnode; unsigned HOST_WIDE_INT gcount = 0; append_to_statement_list (build_call_expr (asan_init_func (), 0), - ctor_statements); + asan_ctor_statements); FOR_EACH_DEFINED_VARIABLE (vnode) if (asan_protect_global (vnode-symbol.decl)) ++gcount; @@ -799,7 +804,7 @@ asan_finish_file (void) append_to_statement_list (build_call_expr (decl, 2, build_fold_addr_expr (var), build_int_cst (uptr, gcount)), - ctor_statements); + asan_ctor_statements); decl = build_fn_decl (__asan_unregister_globals, type); TREE_NOTHROW (decl) = 1; @@ -810,7 +815,7 @@ asan_finish_file (void) cgraph_build_static_cdtor ('D', dtor_statements, MAX_RESERVED_INIT_PRIORITY - 1); } - cgraph_build_static_cdtor ('I', ctor_statements, + cgraph_build_static_cdtor ('I', asan_ctor_statements, MAX_RESERVED_INIT_PRIORITY - 1); } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 67cf902..16fd0fb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -4638,7 +4638,13 @@ gimple_expand_cfg (void) insn_locations_finalize (); if (var_ret_seq) -emit_insn_after (var_ret_seq, return_label); +{ + rtx after = return_label; + rtx next = NEXT_INSN (after); + if (next NOTE_INSN_BASIC_BLOCK_P (next)) + after = next; + emit_insn_after (var_ret_seq, after); +} /* Zap the tree EH table. */ set_eh_throw_stmt_table (cfun, NULL); -- 1.7.11.7
[PATCH 00/13] Request to merge Address Sanitizer in
From: Dodji Seketeli do...@seketeli.org Hello, The set of patches following this message represents the work that happened on the asan branch to build up the Address Sanitizer work started in the Google branch. Address Sanitizer (aka asan) is a memory error detector. It finds use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++ programs. One can learn about the way it works by reading the pdf slides at [1], or by reading the documentation on the wiki page of the project at [2]. To make a long story short, it works by associating each memory region of eight consecutive bytes with a shadow byte that tells whether if each byte of the memory region is addressable or not. So, conceptually, there is a function 'MemToShadow' which, for each set of contiguous eight bytes of memory returns a shadow byte that tells whether if each byte is accessible or not. Then, each memory access is instrumented by the asan pass to retrieve the shadow byte of the accessed memory; if the access is to a memory address that is deemed non-accessible, a call to an asan runtime library function is issued to report a meaningful error to the user, and the access is performed, letting the user program proceed despite the error. The advantage of this approach, compared to say, Valgrind[4] is the lower time and space overhead. Eventually, when this tool becomes more solid, it'll become complementary to Valgrind. Apart from the compiler components, asan needs a runtime library to function. We share that library with the LLVM implementation of asan that is described at [3]. The last patch of the set imports this library in its pristine form into our tree. The plan is to regularly synchronize it with its LLVM upstream repository. On behalf of the GCC asan developers listed below, I am thus proposing these patches for inclusion into trunk. I chose to follow the chronological commits that happened on the [asan] branch, to ease the authorship propagation. Except for some few exceptions, each of these commits are reasonably logically atomic, so they hopefully shouldn't be too hard to review. The first patch is the initial import of the asan state from the Google branch into the [asan] branch. Subsequent patches clean the code up, add features like protection of stack and global variables, instrumentation of memory access through built-in functions, and, last but not least, the import of the runtime library. Please note that the ChangeLog.asan is meant to disappear at commit time, as its content will be updated (for the dates) and prepended to the normal ChangeLog file. One noticeable shortcoming that we have at the moment is the lack of a DejaGNU test harness for this. This is planned to be addressed as soon as possible. Please find below is a summary of the patches of the set. Thanks. [1]: http://gcc.gnu.org/wiki/cauldron2012?action=AttachFiledo=gettarget=kcc.pdf [2]: http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm [3]: http://code.google.com/p/address-sanitizer/w/list [4]: http://www.valgrind.org Diego Novillo (2): Initial import of asan from the Google branch Rename tree-asan.[ch] to asan.[ch] Dodji Seketeli (3): Make build_check_stmt accept an SSA_NAME for its base Factorize condition insertion code out of build_check_stmt Instrument built-in memory access function calls Jakub Jelinek (6): Initial asan cleanups Emit GIMPLE directly instead of gimplifying GENERIC. Allow asan at -O0 Implement protection of stack variables Implement protection of global variables Fix a couple of ICEs. Wei Mi (2): Don't forget to protect 32 bytes aligned global variables. Import the asan runtime library into GCC tree ChangeLog.asan | 7 + Makefile.def | 2 + Makefile.in| 487 +- configure | 1 + configure.ac | 1 + gcc/ChangeLog.asan | 175 + gcc/Makefile.in|10 +- gcc/asan.c | 1495 ++ gcc/asan.h |70 + gcc/cfgexpand.c| 165 +- gcc/common.opt | 4 + gcc/config/i386/i386.c |11 + gcc/doc/invoke.texi| 8 +- gcc/doc/tm.texi| 6 + gcc/doc/tm.texi.in | 2 + gcc/gcc.c | 1 + gcc/passes.c | 2 + gcc/target.def |11 + gcc/toplev.c |14 + gcc/tree-pass.h| 2 + gcc/varasm.c
[PATCH 10/13] Make build_check_stmt accept an SSA_NAME for its base
From: dodji dodji@138bc75d-0d04-0410-961f-82ee72b054a4 This patch makes build_check_stmt accept its memory access parameter to be an SSA name. This is useful for a subsequent patch that will re-use. Tested by running cc1 -fasan on the program below with and without the patch and inspecting the gimple output to see that there is no change. void foo () { char foo[1] = {0}; foo[0] = 1; } gcc/ * asan.c (build_check_stmt): Accept the memory access to be represented by an SSA_NAME. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192843 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 5 + gcc/asan.c | 36 +++- 2 files changed, 28 insertions(+), 13 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 57670f7..9159b3f 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,3 +1,8 @@ +2012-10-26 Dodji Seketeli do...@redhat.com + + * asan.c (build_check_stmt): Accept the memory access to be + represented by an SSA_NAME. + 2012-10-25 Wei Mi w...@google.com * varasm.c (assemble_variable): Set asan_protected even diff --git a/gcc/asan.c b/gcc/asan.c index 6715e51..b43f03b 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -397,16 +397,18 @@ asan_init_func (void) #define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1) #define PROB_ALWAYS(REG_BR_PROB_BASE) -/* Instrument the memory access instruction BASE. - Insert new statements before ITER. - LOCATION is source code location. - IS_STORE is either 1 (for a store) or 0 (for a load). +/* Instrument the memory access instruction BASE. Insert new + statements before ITER. + + Note that the memory access represented by BASE can be either an + SSA_NAME, or a non-SSA expression. LOCATION is the source code + location. IS_STORE is TRUE for a store, FALSE for a load. SIZE_IN_BYTES is one of 1, 2, 4, 8, 16. */ static void -build_check_stmt (tree base, - gimple_stmt_iterator *iter, - location_t location, bool is_store, int size_in_bytes) +build_check_stmt (tree base, gimple_stmt_iterator *iter, + location_t location, bool is_store, + int size_in_bytes) { gimple_stmt_iterator gsi; basic_block cond_bb, then_bb, else_bb; @@ -417,6 +419,7 @@ build_check_stmt (tree base, tree shadow_type = TREE_TYPE (shadow_ptr_type); tree uintptr_type = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1); + tree base_ssa = base; /* We first need to split the current basic block, and start altering the CFG. This allows us to insert the statements we're about to @@ -462,15 +465,22 @@ build_check_stmt (tree base, base = unshare_expr (base); gsi = gsi_last_bb (cond_bb); - g = gimple_build_assign_with_ops (TREE_CODE (base), - make_ssa_name (TREE_TYPE (base), NULL), - base, NULL_TREE); - gimple_set_location (g, location); - gsi_insert_after (gsi, g, GSI_NEW_STMT); + + /* BASE can already be an SSA_NAME; in that case, do not create a + new SSA_NAME for it. */ + if (TREE_CODE (base) != SSA_NAME) +{ + g = gimple_build_assign_with_ops (TREE_CODE (base), + make_ssa_name (TREE_TYPE (base), NULL), + base, NULL_TREE); + gimple_set_location (g, location); + gsi_insert_after (gsi, g, GSI_NEW_STMT); + base_ssa = gimple_assign_lhs (g); +} g = gimple_build_assign_with_ops (NOP_EXPR, make_ssa_name (uintptr_type, NULL), - gimple_assign_lhs (g), NULL_TREE); + base_ssa, NULL_TREE); gimple_set_location (g, location); gsi_insert_after (gsi, g, GSI_NEW_STMT); base_addr = gimple_assign_lhs (g); -- 1.7.11.7
[PATCH 04/13] Emit GIMPLE directly instead of gimplifying GENERIC.
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 This patch cleanups the instrumentation code generation by emitting GIMPLE directly, as opposed to emitting GENERIC tree and then gimplifying them. It also does some cleanups here and there * Makefile.in (GTFILES): Add $(srcdir)/asan.c. * asan.c (shadow_ptr_types): New variable. (report_error_func): Change is_store argument to bool, don't append newline to function name. (PROB_VERY_UNLIKELY, PROB_ALWAYS): Define. (build_check_stmt): Change is_store argument to bool. Emit GIMPLE directly instead of creating trees and gimplifying them. Mark the error reporting function as very unlikely. (instrument_derefs): Change is_store argument to bool. Use int_size_in_bytes to compute size_in_bytes, simplify size check. Use build_fold_addr_expr instead of build_addr. (transform_statements): Adjust instrument_derefs caller. Use gimple_assign_single_p as stmt test. Don't look at MEM refs in rhs2. (asan_init_shadow_ptr_types): New function. (asan_instrument): Don't push/pop gimplify context. Call asan_init_shadow_ptr_types if not yet initialized. * asan.h (ASAN_SHADOW_SHIFT): Adjust comment. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192375 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 19 gcc/Makefile.in| 1 + gcc/asan.c | 268 - gcc/asan.h | 2 +- 4 files changed, 185 insertions(+), 105 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 0bc9420..9bfccd7 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,5 +1,24 @@ 2012-10-11 Jakub Jelinek ja...@redhat.com + * Makefile.in (GTFILES): Add $(srcdir)/asan.c. + * asan.c (shadow_ptr_types): New variable. + (report_error_func): Change is_store argument to bool, don't append + newline to function name. + (PROB_VERY_UNLIKELY, PROB_ALWAYS): Define. + (build_check_stmt): Change is_store argument to bool. Emit GIMPLE + directly instead of creating trees and gimplifying them. Mark + the error reporting function as very unlikely. + (instrument_derefs): Change is_store argument to bool. Use + int_size_in_bytes to compute size_in_bytes, simplify size check. + Use build_fold_addr_expr instead of build_addr. + (transform_statements): Adjust instrument_derefs caller. + Use gimple_assign_single_p as stmt test. Don't look at MEM refs + in rhs2. + (asan_init_shadow_ptr_types): New function. + (asan_instrument): Don't push/pop gimplify context. + Call asan_init_shadow_ptr_types if not yet initialized. + * asan.h (ASAN_SHADOW_SHIFT): Adjust comment. + * toplev.c (process_options): Warn and turn off -fasan if not supported by target. * asan.c: Include target.h. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index bdc5afb..2ab1ca9 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -3726,6 +3726,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \ $(srcdir)/lto-streamer.h \ $(srcdir)/target-globals.h \ $(srcdir)/ipa-inline.h \ + $(srcdir)/asan.c \ @all_gtfiles@ # Compute the list of GT header files from the corresponding C sources, diff --git a/gcc/asan.c b/gcc/asan.c index e95be47..2e7d4d6 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -79,18 +79,22 @@ along with GCC; see the file COPYING3. If not see to create redzones for stack and global object and poison them. */ +/* Pointer types to 1 resp. 2 byte integers in shadow memory. A separate + alias set is used for all shadow memory accesses. */ +static GTY(()) tree shadow_ptr_types[2]; + /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}. IS_STORE is either 1 (for a store) or 0 (for a load). SIZE_IN_BYTES is one of 1, 2, 4, 8, 16. */ static tree -report_error_func (int is_store, int size_in_bytes) +report_error_func (bool is_store, int size_in_bytes) { tree fn_type; tree def; char name[100]; - sprintf (name, __asan_report_%s%d\n, + sprintf (name, __asan_report_%s%d, is_store ? store : load, size_in_bytes); fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); def = build_fn_decl (name, fn_type); @@ -118,6 +122,9 @@ asan_init_func (void) } +#define PROB_VERY_UNLIKELY (REG_BR_PROB_BASE / 2000 - 1) +#define PROB_ALWAYS(REG_BR_PROB_BASE) + /* Instrument the memory access instruction BASE. Insert new statements before ITER. LOCATION is source code location. @@ -127,21 +134,17 @@ asan_init_func (void) static void build_check_stmt (tree base, gimple_stmt_iterator *iter, - location_t location, int is_store, int size_in_bytes) + location_t location,
[PATCH 06/13] Implement protection of stack variables
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 This patch implements the protection of stack variables. To understand how this works, lets look at this example on x86_64 where the stack grows downward: int foo () { char a[23] = {0}; int b[2] = {0}; a[5] = 1; b[1] = 2; return a[5] + b[1]; } For this function, the stack protected by asan will be organized as follows, from the top of the stack to the bottom: Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone'] Slot 2/ [24 bytes for variable 'a'] Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make the next slot be 32 bytes aligned; this one is called Partial Redzone; this 32 bytes alignment is an asan constraint] Slot 4/ [red zone of 32 bytes called 'Middle RedZone'] Slot 5/ [8 bytes for variable 'b'] Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3] Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT RedZone'] [A cultural question I've kept asking myself is Why has address sanitizer authors called these red zones (LEFT, MIDDLE, RIGHT) instead of e.g, (BOTTOM, MIDDLE, TOP). Maybe they can step up and educate me so that I get less confused in the future. :-)] The 32 bytes of LEFT red zone at the bottom of the stack can be decomposed as such: 1/ The first 8 bytes contain a magical asan number that is always 0x41B58AB3. 2/ The following 8 bytes contains a pointer to a string (to be parsed at runtime by the runtime asan library), which format is the following: function-name space num-of-variables-on-the-stack (32-bytes-aligned-offset-in-bytes-of-variable space length-of-var-in-bytes ){n} where '(...){n}' means the content inside the parenthesis occurs 'n' times, with 'n' being the number of variables on the stack. 3/ The following 16 bytes of the red zone have no particular format. The shadow memory for that stack layout is going to look like this: - content of shadow memory 8 bytes for slot 7: 0xF1F1F1F1. The F1 byte pattern is a magic number called ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that the memory for that shadow byte is part of a the LEFT red zone intended to seat at the bottom of the variables on the stack. - content of shadow memory 8 bytes for slots 6 and 5: 0xF4F4F400. The F4 byte pattern is a magic number called ASAN_STACK_MAGIC_PARTIAL. It flags the fact that the memory region for this shadow byte is a PARTIAL red zone intended to pad a variable A, so that the slot following {A,padding} is 32 bytes aligned. Note that the fact that the least significant byte of this shadow memory content is 00 means that 8 bytes of its corresponding memory (which corresponds to the memory of variable 'b') is addressable. - content of shadow memory 8 bytes for slot 4: 0xF2F2F2F2. The F2 byte pattern is a magic number called ASAN_STACK_MAGIC_MIDDLE. It flags the fact that the memory region for this shadow byte is a MIDDLE red zone intended to seat between two 32 aligned slots of {variable,padding}. - content of shadow memory 8 bytes for slot 3 and 2: 0xF400. This represents is the concatenation of variable 'a' and the partial red zone following it, like what we had for variable 'b'. The least significant 3 bytes being 00 means that the 3 bytes of variable 'a' are addressable. - content of shadow memory 8 bytes for slot 1: 0xF3F3F3F3. The F3 byte pattern is a magic number called ASAN_STACK_MAGIC_RIGHT. It flags the fact that the memory region for this shadow byte is a RIGHT red zone intended to seat at the top of the variables of the stack. So, the patch lays out stack variables as well as the different red zones, emits some prologue code to populate the shadow memory as to poison (mark as non-accessible) the regions of the red zones and mark the regions of stack variables as accessible, and emit some epilogue code to un-poison (mark as accessible) the regions of red zones right before the function exits. * Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H). (cfgexpand.o): Depend on asan.h. * asan.c: Include expr.h and optabs.h. (asan_shadow_set): New variable. (asan_shadow_cst, asan_emit_stack_protection): New functions. (asan_init_shadow_ptr_types): Initialize also asan_shadow_set. * cfgexpand.c: Include asan.h. Define HOST_WIDE_INT heap vector. (partition_stack_vars): If i is large alignment and j small alignment or vice versa, break out of the loop instead of continue, and put the test earlier. If flag_asan, break out of the loop if for small alignment size is different. (struct stack_vars_data): New type. (expand_stack_vars): Add
[PATCH 07/13] Implement protection of global variables
From: jakub jakub@138bc75d-0d04-0410-961f-82ee72b054a4 This patch implements the protection of global variables. The basic idea is to insert a red zone between two global variables and install a constructor function that calls the asan runtime to do the populating of the relevant shadow memory regions at load time. So the patch lays out the global variables as to insert a red zone between them. The size of the red zones is so that each variable starts on a 32 bytes boundary. Then it installs a constructor function that, for each global variable, calls the runtime asan library function __asan_register_globals_with an instance of this type: struct __asan_global { /* Address of the beginning of the global variable. */ const void *__beg; /* Initial size of the global variable. */ uptr __size; /* Size of the global variable + size of the red zone. This size is 32 bytes aligned. */ uptr __size_with_redzone; /* Name of the global variable. */ const void *__name; /* This is always set to NULL for now. */ uptr __has_dynamic_init; } The patch also installs a destructor function that calls the runtime asan library function _asan_unregister_globals. * varasm.c: Include asan.h. (assemble_noswitch_variable): Grow size by asan_red_zone_size if decl is asan protected. (place_block_symbol): Likewise. (assemble_variable): If decl is asan protected, increase DECL_ALIGN if needed, and for decls emitted using assemble_variable_contents append padding zeros after it. * Makefile.in (varasm.o): Depend on asan.h. * asan.c: Include output.h. (asan_pp, asan_pp_initialized): New variables. (asan_pp_initialize, asan_pp_string): New functions. (asan_emit_stack_protection): Use asan_pp{,_initialized} instead of local pp{,_initialized} vars, use asan_pp_initialize and asan_pp_string helpers. (asan_needs_local_alias, asan_protect_global, asan_global_struct, asan_add_global): New functions. (asan_finish_file): Protect global vars that can be protected. * asan.h (asan_protect_global): New prototype. (asan_red_zone_size): New inline function. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192541 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.asan | 20 gcc/Makefile.in| 2 +- gcc/asan.c | 299 +++-- gcc/asan.h | 11 ++ gcc/varasm.c | 22 5 files changed, 319 insertions(+), 35 deletions(-) diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan index 23454f3..971de42 100644 --- a/gcc/ChangeLog.asan +++ b/gcc/ChangeLog.asan @@ -1,5 +1,25 @@ 2012-10-17 Jakub Jelinek ja...@redhat.com + * varasm.c: Include asan.h. + (assemble_noswitch_variable): Grow size by asan_red_zone_size + if decl is asan protected. + (place_block_symbol): Likewise. + (assemble_variable): If decl is asan protected, increase + DECL_ALIGN if needed, and for decls emitted using + assemble_variable_contents append padding zeros after it. + * Makefile.in (varasm.o): Depend on asan.h. + * asan.c: Include output.h. + (asan_pp, asan_pp_initialized): New variables. + (asan_pp_initialize, asan_pp_string): New functions. + (asan_emit_stack_protection): Use asan_pp{,_initialized} + instead of local pp{,_initialized} vars, use asan_pp_initialize + and asan_pp_string helpers. + (asan_needs_local_alias, asan_protect_global, + asan_global_struct, asan_add_global): New functions. + (asan_finish_file): Protect global vars that can be protected. + * asan.h (asan_protect_global): New prototype. + (asan_red_zone_size): New inline function. + * Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H). (cfgexpand.o): Depend on asan.h. * asan.c: Include expr.h and optabs.h. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 2743e24..c6a9825 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2721,7 +2721,7 @@ varasm.o : varasm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \ output.h $(DIAGNOSTIC_CORE_H) xcoffout.h debug.h $(GGC_H) $(TM_P_H) \ $(HASHTAB_H) $(TARGET_H) langhooks.h gt-varasm.h $(BASIC_BLOCK_H) \ $(CGRAPH_H) $(TARGET_DEF_H) tree-mudflap.h \ - pointer-set.h $(COMMON_TARGET_H) + pointer-set.h $(COMMON_TARGET_H) asan.h function.o : function.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \ $(TREE_H) $(GIMPLE_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) \ $(OPTABS_H) $(LIBFUNCS_H) $(REGS_H) hard-reg-set.h insn-config.h $(RECOG_H) \ diff --git a/gcc/asan.c b/gcc/asan.c index fe0e9a8..c435d35 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3. If not see #include target.h #include expr.h #include optabs.h
[PATCH 12/13] Instrument built-in memory access function calls
From: dodji dodji@138bc75d-0d04-0410-961f-82ee72b054a4 This patch instruments many memory access patterns through builtins. Basically, for a call like: __builtin_memset (from, 0, n_bytes); the patch would only instrument the accesses at the beginning and at the end of the memory region [from, from + n_bytes]. This is the strategy used by the llvm implementation of asan. This instrumentation is done for all the memory access builtin functions that expose a well specified memory region -- one that explicitly states the number of bytes accessed in the region. A special treatment is used for __builtin_strlen. The patch instruments the access to the first byte of its argument, as well as the access to the byte (of the argument) at the offset returned by strlen. For the __sync_* and __atomic* calls the patch instruments the access to the bytes pointed to by the argument. While doing this, I have added a new parameter to build_check_stmt to decide whether to insert the instrumentation code before or after the statement iterator. This allows us to do away with the gsi_{next,prev} dance we were doing in the callers of this function. Tested by running cc1 -fasan on variations of simple programs like: int foo () { char foo[10] = {0}; foo[0] = 't'; foo[1] = 'e'; foo[2] = 's'; foo[3] = 't'; int l = __builtin_strlen (foo); int n = sizeof (foo); __builtin_memset (foo[4], 0, n - 4); __sync_fetch_and_add (foo[11], 1); return l; } and by starring at the gimple output which for this function is: ;; Function foo (foo, funcdef_no=0, decl_uid=1714, cgraph_uid=0) foo () { int n; int l; char foo[10]; int D.1725; char * D.1724; int D.1723; long unsigned int D.1722; int D.1721; long unsigned int D.1720; long unsigned int _1; int _4; long unsigned int _5; int _6; char * _7; int _8; char * _9; unsigned long _10; unsigned long _11; unsigned long _12; signed char * _13; signed char _14; _Bool _15; unsigned long _16; signed char _17; _Bool _18; _Bool _19; char * _20; unsigned long _21; unsigned long _22; unsigned long _23; signed char * _24; signed char _25; _Bool _26; unsigned long _27; signed char _28; _Bool _29; _Bool _30; char * _31; unsigned long _32; unsigned long _33; unsigned long _34; signed char * _35; signed char _36; _Bool _37; unsigned long _38; signed char _39; _Bool _40; _Bool _41; char * _42; unsigned long _43; unsigned long _44; unsigned long _45; signed char * _46; signed char _47; _Bool _48; unsigned long _49; signed char _50; _Bool _51; _Bool _52; char * _53; unsigned long _54; unsigned long _55; unsigned long _56; signed char * _57; signed char _58; _Bool _59; unsigned long _60; signed char _61; _Bool _62; _Bool _63; char[10] * _64; unsigned long _65; unsigned long _66; unsigned long _67; signed char * _68; signed char _69; _Bool _70; unsigned long _71; signed char _72; _Bool _73; _Bool _74; unsigned long _75; unsigned long _76; unsigned long _77; signed char * _78; signed char _79; _Bool _80; unsigned long _81; signed char _82; _Bool _83; _Bool _84; long unsigned int _85; long unsigned int _86; char * _87; char * _88; unsigned long _89; unsigned long _90; unsigned long _91; signed char * _92; signed char _93; _Bool _94; unsigned long _95; signed char _96; _Bool _97; _Bool _98; char * _99; unsigned long _100; unsigned long _101; unsigned long _102; signed char * _103; signed char _104; _Bool _105; unsigned long _106; signed char _107; _Bool _108; _Bool _109; bb 2: foo = {}; _9 = foo[0]; _10 = (unsigned long) _9; _11 = _10 3; _12 = _11 + 17592186044416; _13 = (signed char *) _12; _14 = *_13; _15 = _14 != 0; _16 = _10 7; _17 = (signed char) _16; _18 = _17 = _14; _19 = _15 _18; if (_19 != 0) goto bb 5; else goto bb 4; bb 5: __asan_report_store1 (_10); bb 4: foo[0] = 116; _20 = foo[1]; _21 = (unsigned long) _20; _22 = _21 3; _23 = _22 + 17592186044416; _24 = (signed char *) _23; _25 = *_24; _26 = _25 != 0; _27 = _21 7; _28 = (signed char) _27; _29 = _28 = _25; _30 = _26 _29; if (_30 != 0) goto bb 7; else
[PATCH] Poison loc_t in system.h
The name loc_t is too enticing for developers to avoid, but it unfortunately conflicts with a visible symbol in an AIX header file, which repeatedly causes conflicts. This patch poisons the symbols in GCC system.h to prevent developers from using it. Bootstrapped on powerpc-ibm-aix7.1.0.0 Okay? Thanks, David * system.h (loc_t): Poison. Index: system.h === --- system.h(revision 193045) +++ system.h(working copy) @@ -774,6 +774,11 @@ #undef strerror #pragma GCC poison strerror +/* loc_t is defined on some systems and too inviting for some + programmers to avoid. */ +#undef loc_t + #pragma GCC poison loc_t + /* Old target macros that have moved to the target hooks structure. */ #pragma GCC poison ASM_OPEN_PAREN ASM_CLOSE_PAREN \ FUNCTION_PROLOGUE FUNCTION_EPILOGUE \
[asan] change libasan to libsanitizer
Hi, Here is the patch to change libasan to libsanitizer and reorganize the directory. I divided the patch into three parts for review. patch.part1.txt: Contains the changes in the outermost level. patch.part2.txt.bz2: Remove libasan patch.part3.txt.bz2: Add libsanitizer Is it ok for asan branch? 2012-11-1 Wei Mi w...@google.com * configure.ac: Change target-libasan to target-libsanitizer. * configure.in: Regenerate. * Makefile.def: Change libasan module to libsanitizer. * Makefile.in: Regenerate. * libsanitizer: Change libasan to libsanitizer and add an empty tsan directory under libsanitizer. Thanks, Wei. Index: configure === --- configure (revision 193063) +++ configure (working copy) @@ -2771,7 +2771,7 @@ target_libraries=target-libgcc \ target-libitm \ target-libstdc++-v3 \ target-libmudflap \ - target-libasan \ + target-libsanitizer \ target-libssp \ target-libquadmath \ target-libgfortran \ Index: Makefile.in === --- Makefile.in (revision 193063) +++ Makefile.in (working copy) @@ -575,7 +575,7 @@ all: # This is the list of directories that may be needed in RPATH_ENVVAR # so that programs built for the target machine work. -TARGET_LIB_PATH = $(TARGET_LIB_PATH_libstdc++-v3)$(TARGET_LIB_PATH_libmudflap)$(TARGET_LIB_PATH_libasan)$(TARGET_LIB_PATH_libssp)$(TARGET_LIB_PATH_libgomp)$(TARGET_LIB_PATH_libitm)$(TARGET_LIB_PATH_libatomic)$(HOST_LIB_PATH_gcc) +TARGET_LIB_PATH = $(TARGET_LIB_PATH_libstdc++-v3)$(TARGET_LIB_PATH_libmudflap)$(TARGET_LIB_PATH_libsanitizer)$(TARGET_LIB_PATH_libssp)$(TARGET_LIB_PATH_libgomp)$(TARGET_LIB_PATH_libitm)$(TARGET_LIB_PATH_libatomic)$(HOST_LIB_PATH_gcc) @if target-libstdc++-v3 TARGET_LIB_PATH_libstdc++-v3 = $$r/$(TARGET_SUBDIR)/libstdc++-v3/src/.libs: @@ -585,9 +585,9 @@ TARGET_LIB_PATH_libstdc++-v3 = $$r/$(TAR TARGET_LIB_PATH_libmudflap = $$r/$(TARGET_SUBDIR)/libmudflap/.libs: @endif target-libmudflap -@if target-libasan -TARGET_LIB_PATH_libasan = $$r/$(TARGET_SUBDIR)/libasan/.libs: -@endif target-libasan +@if target-libsanitizer +TARGET_LIB_PATH_libsanitizer = $$r/$(TARGET_SUBDIR)/libsanitizer/.libs: +@endif target-libsanitizer @if target-libssp TARGET_LIB_PATH_libssp = $$r/$(TARGET_SUBDIR)/libssp/.libs: @@ -924,7 +924,7 @@ configure-host: \ configure-target: \ maybe-configure-target-libstdc++-v3 \ maybe-configure-target-libmudflap \ -maybe-configure-target-libasan \ +maybe-configure-target-libsanitizer \ maybe-configure-target-libssp \ maybe-configure-target-newlib \ maybe-configure-target-libgcc \ @@ -1073,7 +1073,7 @@ all-host: maybe-all-lto-plugin all-target: maybe-all-target-libstdc++-v3 @endif target-libstdc++-v3-no-bootstrap all-target: maybe-all-target-libmudflap -all-target: maybe-all-target-libasan +all-target: maybe-all-target-libsanitizer all-target: maybe-all-target-libssp all-target: maybe-all-target-newlib @if target-libgcc-no-bootstrap @@ -1164,7 +1164,7 @@ info-host: maybe-info-lto-plugin info-target: maybe-info-target-libstdc++-v3 info-target: maybe-info-target-libmudflap -info-target: maybe-info-target-libasan +info-target: maybe-info-target-libsanitizer info-target: maybe-info-target-libssp info-target: maybe-info-target-newlib info-target: maybe-info-target-libgcc @@ -1246,7 +1246,7 @@ dvi-host: maybe-dvi-lto-plugin dvi-target: maybe-dvi-target-libstdc++-v3 dvi-target: maybe-dvi-target-libmudflap -dvi-target: maybe-dvi-target-libasan +dvi-target: maybe-dvi-target-libsanitizer dvi-target: maybe-dvi-target-libssp dvi-target: maybe-dvi-target-newlib dvi-target: maybe-dvi-target-libgcc @@ -1328,7 +1328,7 @@ pdf-host: maybe-pdf-lto-plugin pdf-target: maybe-pdf-target-libstdc++-v3 pdf-target: maybe-pdf-target-libmudflap -pdf-target: maybe-pdf-target-libasan +pdf-target: maybe-pdf-target-libsanitizer pdf-target: maybe-pdf-target-libssp pdf-target: maybe-pdf-target-newlib pdf-target: maybe-pdf-target-libgcc @@ -1410,7 +1410,7 @@ html-host: maybe-html-lto-plugin html-target: maybe-html-target-libstdc++-v3 html-target: maybe-html-target-libmudflap -html-target: maybe-html-target-libasan +html-target: maybe-html-target-libsanitizer html-target: maybe-html-target-libssp html-target: maybe-html-target-newlib html-target: maybe-html-target-libgcc @@ -1492,7 +1492,7 @@ TAGS-host: maybe-TAGS-lto-plugin TAGS-target: maybe-TAGS-target-libstdc++-v3 TAGS-target: maybe-TAGS-target-libmudflap -TAGS-target: maybe-TAGS-target-libasan +TAGS-target: maybe-TAGS-target-libsanitizer TAGS-target: maybe-TAGS-target-libssp TAGS-target: maybe-TAGS-target-newlib TAGS-target: maybe-TAGS-target-libgcc @@ -1574,7 +1574,7 @@ install-info-host: maybe-install-info-lt install-info-target:
Re: [asan] change libasan to libsanitizer
Will it be easier if you just rolled back your previous libasan library changes, and resubmit it with the restructured directory? David On Thu, Nov 1, 2012 at 1:17 PM, Wei Mi w...@google.com wrote: patch.part2.txt.bz2 and patch.part3.txt.bz2 are still too big. Divide patch.part2.txt.bz2 into two parts: patch.part2-1.txt.bz2 + patch.part2-2.txt.bz2 Divide patch.part3.txt.bz2 into two parts: patch.part3-1.txt.bz2 + patch.part3-2.txt.bz2 This is patch.part2-1.txt.bz2. Thanks, Wei. On Thu, Nov 1, 2012 at 1:02 PM, Wei Mi w...@google.com wrote: Hi, Here is the patch to change libasan to libsanitizer and reorganize the directory. I divided the patch into three parts for review. patch.part1.txt: Contains the changes in the outermost level. patch.part2.txt.bz2: Remove libasan patch.part3.txt.bz2: Add libsanitizer Is it ok for asan branch? 2012-11-1 Wei Mi w...@google.com * configure.ac: Change target-libasan to target-libsanitizer. * configure.in: Regenerate. * Makefile.def: Change libasan module to libsanitizer. * Makefile.in: Regenerate. * libsanitizer: Change libasan to libsanitizer and add an empty tsan directory under libsanitizer. Thanks, Wei.
Re: [PATCH] Fix CDDCE miscompilation (PR tree-optimization/55018)
From: Steven Bosscher stevenb@gmail.com Date: Sun, 28 Oct 2012 19:33:29 +0100 On Mon, Oct 22, 2012 at 11:09 PM, Jakub Jelinek wrote: On Mon, Oct 22, 2012 at 10:51:43PM +0200, Steven Bosscher wrote: Wouldn't it be way cheaper to just export dfs_find_deadend from cfganal.c and call it in calc_dfs_tree on each unconnected bb? I.e. (untested with the exception of the testcase): 2012-10-22 Jakub Jelinek ja...@redhat.com PR tree-optimization/55018 * cfganal.c (dfs_find_deadend): No longer static. * basic-block.h (dfs_find_deadend): New prototype. * dominance.c (calc_dfs_tree): If saw_unconnected, traverse from dfs_find_deadend of unconnected b instead of b directly. * gcc.dg/torture/pr55018.c: New test. Attached patch was bootstrappedtested on gcc/ PR tree-optimization/55018 * basic-block.h (dfs_find_deadend): New prototype. * cfganal.c (dfs_find_deadend): No longer static. Use bitmap instead of sbitmap for visited. (flow_dfs_compute_reverse_execute): Use dfs_find_deadend here, too. * dominance.c (calc_dfs_tree): If saw_unconnected, traverse from dfs_find_deadend of unconnected b instead of b directly. It seems this caused PR55168, ICE. brgds, H-P
Re: [asan] change libasan to libsanitizer
Yes. That will be easier and clearer. The patch is too big. Thanks, Wei. On Thu, Nov 1, 2012 at 1:19 PM, Xinliang David Li davi...@google.com wrote: Will it be easier if you just rolled back your previous libasan library changes, and resubmit it with the restructured directory? David On Thu, Nov 1, 2012 at 1:17 PM, Wei Mi w...@google.com wrote: patch.part2.txt.bz2 and patch.part3.txt.bz2 are still too big. Divide patch.part2.txt.bz2 into two parts: patch.part2-1.txt.bz2 + patch.part2-2.txt.bz2 Divide patch.part3.txt.bz2 into two parts: patch.part3-1.txt.bz2 + patch.part3-2.txt.bz2 This is patch.part2-1.txt.bz2. Thanks, Wei. On Thu, Nov 1, 2012 at 1:02 PM, Wei Mi w...@google.com wrote: Hi, Here is the patch to change libasan to libsanitizer and reorganize the directory. I divided the patch into three parts for review. patch.part1.txt: Contains the changes in the outermost level. patch.part2.txt.bz2: Remove libasan patch.part3.txt.bz2: Add libsanitizer Is it ok for asan branch? 2012-11-1 Wei Mi w...@google.com * configure.ac: Change target-libasan to target-libsanitizer. * configure.in: Regenerate. * Makefile.def: Change libasan module to libsanitizer. * Makefile.in: Regenerate. * libsanitizer: Change libasan to libsanitizer and add an empty tsan directory under libsanitizer. Thanks, Wei.
Re: [asan] change libasan to libsanitizer
On Thu, Nov 01, 2012 at 01:19:42PM -0700, Xinliang David Li wrote: Will it be easier if you just rolled back your previous libasan library changes, and resubmit it with the restructured directory? I think better would be if you didn't apply it as a patch with lots of svn add/svn rm commands, but instead just svn mv the directory or files. So it would be better if you could post the planned svn commands and the patch that would be applied on top of that. Jakub
Re: [asan] change libasan to libsanitizer
that sounds good to me. David On Thu, Nov 1, 2012 at 1:31 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 01:19:42PM -0700, Xinliang David Li wrote: Will it be easier if you just rolled back your previous libasan library changes, and resubmit it with the restructured directory? I think better would be if you didn't apply it as a patch with lots of svn add/svn rm commands, but instead just svn mv the directory or files. So it would be better if you could post the planned svn commands and the patch that would be applied on top of that. Jakub
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 11:35:42PM +0400, Dmitry Vyukov wrote: Wow! It's cool! Is it costly to compute? No, it is just a couple of instructions and bitmap_bit_p test. Jakub
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 12:34:54PM -0700, Xinliang David Li wrote: On Thu, Nov 1, 2012 at 12:16 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 11:57:21AM -0700, Xinliang David Li wrote: That is exactly related to tsan -- it should skip local variable that is not address taken (though still addressable, which by addressable, it means the variable has a memory home). The problem is that if 'addressable' bit is not equivalent to 'address taken', it will be too conservative to use it --- as addressable variables may not be address taken. But TREE_ADDRESSABLE is equivalent to address taken. It seems not -- unless our definition of them is different. I debugged the following program: int foo() { int a[1000]; int i, s; for (i = 0; i 1000; i++) a[i] = i; for (i = 0; i 1000; i++) s += a[i]; return s; } a's address is not taken (at source level), but it is marked as addressable. The question is when you are looking at TREE_ADDRESSABLE. Sure, the FE marks it as TREE_ADDRESSABLE conservatively, but if you compile with -O2, in execute_update_addresses_taken called after the ccp pass TREE_ADDRESSABLE is cleared. I thought the tsan pass was scheduled much later than that (around PRE pass). If you need it even less conservative, I guess you could do say: struct ptr_info_def pi; memset (pi, 0, sizeof (pi)); pi.escaped = 1; pi.nonlocal = 1; return pt_solution_includes (pi, decl); That will be nice. Are points-to info exposed to client code like this? Are there standard aliaser interfaces for them? pt_solution_includes is exported interface, though we'd need to ask richi when he returns whether he is fine with such an use. Jakub
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 1, 2012 at 1:47 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 12:34:54PM -0700, Xinliang David Li wrote: On Thu, Nov 1, 2012 at 12:16 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 11:57:21AM -0700, Xinliang David Li wrote: That is exactly related to tsan -- it should skip local variable that is not address taken (though still addressable, which by addressable, it means the variable has a memory home). The problem is that if 'addressable' bit is not equivalent to 'address taken', it will be too conservative to use it --- as addressable variables may not be address taken. But TREE_ADDRESSABLE is equivalent to address taken. It seems not -- unless our definition of them is different. I debugged the following program: int foo() { int a[1000]; int i, s; for (i = 0; i 1000; i++) a[i] = i; for (i = 0; i 1000; i++) s += a[i]; return s; } a's address is not taken (at source level), but it is marked as addressable. The question is when you are looking at TREE_ADDRESSABLE. Sure, the FE marks it as TREE_ADDRESSABLE conservatively, but if you compile with -O2, in execute_update_addresses_taken called after the ccp pass TREE_ADDRESSABLE is cleared. I thought the tsan pass was scheduled much later than that (around PRE pass). I looked at it pretty late in the pipeline -- during cfgexpand and I had specified -O2 in the command line. If you need it even less conservative, I guess you could do say: struct ptr_info_def pi; memset (pi, 0, sizeof (pi)); pi.escaped = 1; pi.nonlocal = 1; return pt_solution_includes (pi, decl); That will be nice. Are points-to info exposed to client code like this? Are there standard aliaser interfaces for them? pt_solution_includes is exported interface, though we'd need to ask richi when he returns whether he is fine with such an use. yes, It is better to wrap it up in a simpler interface. David Jakub
Re: [PATCH] Fix CDDCE miscompilation (PR tree-optimization/55018)
On Thu, Nov 01, 2012 at 09:26:25PM +0100, Hans-Peter Nilsson wrote: Attached patch was bootstrappedtested on gcc/ PR tree-optimization/55018 * basic-block.h (dfs_find_deadend): New prototype. * cfganal.c (dfs_find_deadend): No longer static. Use bitmap instead of sbitmap for visited. (flow_dfs_compute_reverse_execute): Use dfs_find_deadend here, too. * dominance.c (calc_dfs_tree): If saw_unconnected, traverse from dfs_find_deadend of unconnected b instead of b directly. It seems this caused PR55168, ICE. As Honza said, it was likely latent, and furthermore seems to be related to the flow_dfs_compute_reverse_execute change from Steven (mentioning it primarily what should we consider backporting if anything to release branches). Jakub
Re: Optimize hashtable allocator
Attached patch applied. 2012-11-01 François Dumont fdum...@gcc.gnu.org * include/bits/hashtable_policy.h (__details::_Before_begin): New, combine a base node instance and an allocator. * include/bits/hashtable.h (_Hashtable::_M_node_allocator): Remove. (_Hashtable::_M_before_begin): Rename into _M_bbegin and type modified to __detail::_Before_begin. (_Hashtable::_M_node_allocator()): New, get the node allocator part of _M_bbegin. (_Hashtable::_M_before_begin()): New, get the before begin node part of _M_bbegin. (_Hashtable): Adapt to use latter. François On 11/01/2012 12:02 AM, Jonathan Wakely wrote: On 31 October 2012 22:46, Marc Glisse wrote: On Wed, 31 Oct 2012, Jonathan Wakely wrote: On 31 October 2012 22:14, François Dumont wrote: Here is the patch I came to. I use the 'universal reference' like you propose but some tests started to fail because I think gcc called it instead of the move constructor. Ah of course. The defaulted copy/move constructors you added are the right solution for that. Assuming you never copy a non-const lvalue or a const rvalue. The current use seems ok, I just wanted to mention that it is rather fragile. Yes, true. The code that uses that constructor is entirely under the implementation's control so it should be avoidable. If it's ever a problem we could change the constructor to a non-template then explicitly construct it with the right type: _M_bbegin(_Node_allocator_type(__a)), Index: include/bits/hashtable_policy.h === --- include/bits/hashtable_policy.h (revision 192992) +++ include/bits/hashtable_policy.h (working copy) @@ -1708,6 +1708,24 @@ return true; } + /** + * This type is to combine a _Hash_node_base instance with an allocator + * instance through inheritance to benefit from EBO when possible. + */ + templatetypename _NodeAlloc +struct _Before_begin : public _NodeAlloc +{ + _Hash_node_base _M_node; + + _Before_begin(const _Before_begin) = default; + _Before_begin(_Before_begin) = default; + + templatetypename _Alloc + _Before_begin(_Alloc __a) + : _NodeAlloc(std::forward_Alloc(__a)) + { } +}; + //@} hashtable-detail _GLIBCXX_END_NAMESPACE_VERSION } // namespace __detail Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 192992) +++ include/bits/hashtable.h (working copy) @@ -99,7 +99,7 @@ * Each _Hashtable data structure has: * * - _Bucket[] _M_buckets - * - _Hash_node_base _M_before_begin + * - _Hash_node_base _M_bbegin * - size_type _M_bucket_count * - size_type _M_element_count * @@ -302,17 +302,31 @@ _Node_allocator_type; typedef typename _Alloc::template rebind__bucket_type::other _Bucket_allocator_type; - typedef typename _Alloc::template rebindvalue_type::other - _Value_allocator_type; + using __before_begin = __detail::_Before_begin_Node_allocator_type; - _Node_allocator_type _M_node_allocator; __bucket_type* _M_buckets; size_type _M_bucket_count; - __node_base _M_before_begin; + __before_begin _M_bbegin; size_type _M_element_count; _RehashPolicy _M_rehash_policy; + _Node_allocator_type + _M_node_allocator() + { return _M_bbegin; } + + const _Node_allocator_type + _M_node_allocator() const + { return _M_bbegin; } + + __node_base + _M_before_begin() + { return _M_bbegin._M_node; } + + const __node_base + _M_before_begin() const + { return _M_bbegin._M_node; } + templatetypename... _Args __node_type* _M_allocate_node(_Args... __args); @@ -337,7 +351,7 @@ __node_type* _M_begin() const - { return static_cast__node_type*(_M_before_begin._M_nxt); } + { return static_cast__node_type*(_M_before_begin()._M_nxt); } public: // Constructor, destructor, assignment, swap @@ -455,11 +469,11 @@ allocator_type get_allocator() const noexcept - { return allocator_type(_M_node_allocator); } + { return allocator_type(_M_node_allocator()); } size_type max_size() const noexcept - { return _M_node_allocator.max_size(); } + { return _M_node_allocator().max_size(); } // Observers key_equal @@ -685,15 +699,15 @@ _H1, _H2, _Hash, _RehashPolicy, _Traits:: _M_allocate_node(_Args... __args) { - __node_type* __n = _M_node_allocator.allocate(1); + __node_type* __n = _M_node_allocator().allocate(1); __try { - _M_node_allocator.construct(__n, std::forward_Args(__args)...); + _M_node_allocator().construct(__n, std::forward_Args(__args)...); return __n; } __catch(...) { - _M_node_allocator.deallocate(__n, 1); +
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 01:57:40PM -0700, Xinliang David Li wrote: I looked at it pretty late in the pipeline -- during cfgexpand and I had specified -O2 in the command line. That was too late, ivopts pass (after pre where tsan pass was put around) marks it TREE_ADDRESSABLE again, because it creates IV with initial value a[0]. Jakub
Tighten checking of vector comparisons
Hello, this patch makes gimple checking of vector comparisons more strict by ensuring that it doesn't return a boolean. It also fixes a bug that this check detected in fold-const.c: for (void)(x0), the C front-end was calling fold_unary_loc on the conversion to void, which was then converted to if(x0)(void)0. 2012-11-02 Marc Glisse marc.gli...@inria.fr * tree-cfg.c (verify_gimple_comparison): Verify that vector comparison returns a vector. * fold-const.c (fold_unary_loc): Disable conversion optimization for void type. -- Marc GlisseIndex: gcc/fold-const.c === --- gcc/fold-const.c(revision 193060) +++ gcc/fold-const.c(working copy) @@ -7742,21 +7742,22 @@ fold_unary_loc (location_t loc, enum tre /* If we have (type) (a CMP b) and type is an integral type, return new expression involving the new type. Canonicalize (type) (a CMP b) to (a CMP b) ? (type) true : (type) false for non-integral type. Do not fold the result as that would not simplify further, also folding again results in recursions. */ if (TREE_CODE (type) == BOOLEAN_TYPE) return build2_loc (loc, TREE_CODE (op0), type, TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1)); - else if (!INTEGRAL_TYPE_P (type) TREE_CODE (type) != VECTOR_TYPE) + else if (!INTEGRAL_TYPE_P (type) !VOID_TYPE_P (type) + TREE_CODE (type) != VECTOR_TYPE) return build3_loc (loc, COND_EXPR, type, op0, constant_boolean_node (true, type), constant_boolean_node (false, type)); } /* Handle cases of two conversions in a row. */ if (CONVERT_EXPR_P (op0)) { tree inside_type = TREE_TYPE (TREE_OPERAND (op0, 0)); tree inter_type = TREE_TYPE (op0); Index: gcc/tree-cfg.c === --- gcc/tree-cfg.c (revision 193060) +++ gcc/tree-cfg.c (working copy) @@ -3263,21 +3263,30 @@ verify_gimple_comparison (tree type, tre error (mismatching comparison operand types); debug_generic_expr (op0_type); debug_generic_expr (op1_type); return true; } /* The resulting type of a comparison may be an effective boolean type. */ if (INTEGRAL_TYPE_P (type) (TREE_CODE (type) == BOOLEAN_TYPE || TYPE_PRECISION (type) == 1)) -; +{ + if (TREE_CODE (op0_type) == VECTOR_TYPE + || TREE_CODE (op1_type) == VECTOR_TYPE) +{ + error (vector comparison returning a boolean); + debug_generic_expr (op0_type); + debug_generic_expr (op1_type); + return true; +} +} /* Or an integer vector type with the same size and element count as the comparison operand types. */ else if (TREE_CODE (type) == VECTOR_TYPE TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE) { if (TREE_CODE (op0_type) != VECTOR_TYPE || TREE_CODE (op1_type) != VECTOR_TYPE) { error (non-vector operands in vector comparison); debug_generic_expr (op0_type);
Re: [tsan] ThreadSanitizer instrumentation part
Right -- I found the same thing, it is the ivopt that resets it -- the IV opt pass should have a TODO_update_address_taken. thanks, David On Thu, Nov 1, 2012 at 2:07 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 01:57:40PM -0700, Xinliang David Li wrote: I looked at it pretty late in the pipeline -- during cfgexpand and I had specified -O2 in the command line. That was too late, ivopts pass (after pre where tsan pass was put around) marks it TREE_ADDRESSABLE again, because it creates IV with initial value a[0]. Jakub
Re: [asan] change libasan to libsanitizer
On Thu, Nov 1, 2012 at 2:17 PM, Wei Mi w...@google.com wrote: Thanks for the suggestion! The planned svn commands will be: svn mv libasan libsanitizer svn add libsanitizer/asan svn add libsanitizer/tsan Probably keep the tsan creation out of this patch. David cd libsanitizer for i in `ls asan_*`; do svn mv $i asan/$i done Then apply the two patches attached on top of that. patch.1.txt is to handle the toplevel configure and Makefile changes. patch.2.txt is to handle the configure and Makefile changes in libsanitizer. Thanks, Wei. On Thu, Nov 1, 2012 at 1:34 PM, Xinliang David Li davi...@google.com wrote: that sounds good to me. David On Thu, Nov 1, 2012 at 1:31 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 01:19:42PM -0700, Xinliang David Li wrote: Will it be easier if you just rolled back your previous libasan library changes, and resubmit it with the restructured directory? I think better would be if you didn't apply it as a patch with lots of svn add/svn rm commands, but instead just svn mv the directory or files. So it would be better if you could post the planned svn commands and the patch that would be applied on top of that. Jakub
Re: [tsan] ThreadSanitizer instrumentation part
On Thu, Nov 01, 2012 at 02:19:43PM -0700, Xinliang David Li wrote: Right -- I found the same thing, it is the ivopt that resets it -- the IV opt pass should have a TODO_update_address_taken. That wouldn't change anything. The IVopts pass adds some_ptr = a[0]; to the IL, so a is then address taken. Jakub
Re: [PATCH] Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Thu, Nov 1, 2012 at 10:19 AM, Teresa Johnson tejohn...@google.com wrote: On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote: Sure, I will give this a try after your verification patch tests complete. Does this mean that the patch you posted above to force_nonfallthru_and_redirect is no longer needed either? I'll see if I can avoid the need for some of my fixes, although I believe at least the function.c one will still be needed. I'll check. The force_nonfallthru change is still necessary, because force_nonfallthru should be almost a no-op in cfglayout mode. The whole concept of a fallthru edge doesn't exist in cfglayout mode: any single_succ edge is a fallthru edge until the order of the basic blocks has been determined and the insn chain is re-linked (cfglayout mode originally was developed for bb-reorder, to move blocks around more easily). So the correct patch would actually be: Index: cfgrtl.c === --- cfgrtl.c(revision 193046) +++ cfgrtl.c(working copy) @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = { cfg_layout_split_edge, rtl_make_forwarder_block, NULL, /* tidy_fallthru_edge */ - rtl_force_nonfallthru, + NULL, /* force_nonfallthru */ rtl_block_ends_with_call_p, rtl_block_ends_with_condjump_p, rtl_flow_call_edges_add, (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge hooks, they are cfgrtl-only.) But obviously that won't work because bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in cfglayout mode. That is a bug. The call to force_nonfallthru results in a dangling barrier: cfgrtl.c:1523 emit_barrier_after (BB_END (jump_block)); In cfglayout mode, barriers don't exist in the insns chain, and they don't have to because every edge is a fallthru edge. If there are barriers before cfglayout mode, they are either removed or linked in the basic block footer, and fixup_reorder_chain restores or inserts barriers where necessary to drop out of cfglayout mode. This emit_barrier_after call hangs a barrier after BB_END but not in the footer, and I'm pretty sure the result will be that the barrier is lost in fixup_reorder_chain. See also emit_barrier_after_bb for how inserting a barrier should be done in cfglayout mode. So in short, bbpart doesn't know what it wants to be: a cfgrtl or a cfglayout pass. It doesn't work without cfglayout but it's doing things that are only correct in the cfgrtl world and Very Wrong Indeed in cfglayout-land. Regarding your earlier question about why we needed to add the barrier, I need to dig up the details again but essentially I found that the barriers were being added by bbpart, but bbro was reordering things and the block that ended up at the border between the hot and cold section didn't necessarily have a barrier on it because it was not previously at the region boundary. That doesn't sound right. bbpart doesn't actually re-order the basic blocks, it only marks the blocks with the partition they will be assigned to. Whatever ends up at the border between the two partitions is not relevant: the hot section cannot end in a fall-through edge to the cold section (verify_flow_info even checks for that, see fallthru edge crosses section boundary (bb %i)) so it must end in some explicit jump. Such jumps are always followed by a barrier. The only reason I can think of why there might be a missing barrier, is because fixup_reorder_chain has a bug and forgets to insert the barrier in some cases (and I suspect this may be the case for return patterns, or the a.m. issue of a dropper barrier). I would like to work on debugging this, but it's hard without test cases... I'm working on trying to reproduce some of these failures in a test case I can share. So far, I have only been able to reproduce the failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still working on getting a smaller/shareable test case for the other 2 issues. The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I had fixed with my original changes to cfgrtl.c. Need to understand why there is a reg crossing note between 2 bbs in the same partition. Interestingly, this turned out to be the same root cause as the verify_flow_info failures below. It is fixed by the same fix to thread_prologue_and_epilogue_insns. When the code below created the copy_bb and put it in e-src's partition, it made it insufficient for the merge blocks routine to check if the two bbs were in the same partition, because they were in the same partition but separated by the region crossing jump. I'll do some testing of the fix below, but do you have any comments on the correctness or the potential issue I raised (see my note just below the patch)? Do you recommend pursuing the move of the bb partition phase
Re: [PATCH] Fix CDDCE miscompilation (PR tree-optimization/55018)
On Thu, Nov 01, 2012 at 09:26:25PM +0100, Hans-Peter Nilsson wrote: Attached patch was bootstrappedtested on gcc/ PR tree-optimization/55018 * basic-block.h (dfs_find_deadend): New prototype. * cfganal.c (dfs_find_deadend): No longer static. Use bitmap instead of sbitmap for visited. (flow_dfs_compute_reverse_execute): Use dfs_find_deadend here, too. * dominance.c (calc_dfs_tree): If saw_unconnected, traverse from dfs_find_deadend of unconnected b instead of b directly. It seems this caused PR55168, ICE. As Honza said, it was likely latent, and furthermore seems to be related to the flow_dfs_compute_reverse_execute change from Steven (mentioning it primarily what should we consider backporting if anything to release branches). Indeed, this patch should not affect loop predictions. Perhaps it prevents CD-DCE from removing a loop that it previously removed and thus the loop survives till predict.c and triggers the previously latent bugs? The patch I attached to the PR is actually incorrect - estimated_stmt_executions_int already adds the +1 and thus should never return 0. I am looking into proper fix. There is obvious overflow on the other computation patch, so my best guess is that fixing it will cure the issue. Honza Jakub
Re: [PATCH] Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)
On Tue, Oct 30, 2012 at 10:48 AM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Oct 30, 2012 at 6:20 AM, Teresa Johnson wrote: Index: bb-reorder.c === --- bb-reorder.c(revision 192692) +++ bb-reorder.c(working copy) @@ -2188,6 +2188,8 @@ insert_section_boundary_note (void) first_partition = BB_PARTITION (bb); if (BB_PARTITION (bb) != first_partition) { + /* There should be a barrier between text sections. */ + emit_barrier_after (BB_END (bb-prev_bb)); So why isn't there one? There can't be a fall-through edge from one section to the other, so cfgrtl.c:fixup_reorder_chain should have added a barrier here already in the code under the comment: /* Now add jumps and labels as needed to match the blocks new outgoing edges. */ Why isn't it doing that for you? BTW, something else I noted in cfgrtl.c: NOTE_INSN_SWITCH_TEXT_SECTIONS shouldn't be copied in duplicate_insn_chain, so the following is necessary for robustness: Index: cfgrtl.c === --- cfgrtl.c(revision 191819) +++ cfgrtl.c(working copy) @@ -3615,7 +3615,6 @@ break; case NOTE_INSN_EPILOGUE_BEG: - case NOTE_INSN_SWITCH_TEXT_SECTIONS: emit_note_copy (insn); break; Shouldn't the patch be: @@ -3630,10 +3631,11 @@ duplicate_insn_chain (rtx from, rtx to) case NOTE_INSN_FUNCTION_BEG: /* There is always just single entry to function. */ case NOTE_INSN_BASIC_BLOCK: + /* We should only switch text sections once. */ + case NOTE_INSN_SWITCH_TEXT_SECTIONS: break; case NOTE_INSN_EPILOGUE_BEG: - case NOTE_INSN_SWITCH_TEXT_SECTIONS: emit_note_copy (insn); break; i.e. move the NOTE above to where we will ignore it. Otherwise, we would fall into the default case which is listed as unreachable. Teresa There can be only one! One note to rule them all! etc. Index: cfgrtl.c === --- cfgrtl.c(revision 192692) +++ cfgrtl.c(working copy) @@ -912,7 +912,8 @@ rtl_can_merge_blocks (basic_block a, basic_block b partition boundaries). See the comments at the top of bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ - if (BB_PARTITION (a) != BB_PARTITION (b)) + if (find_reg_note (BB_END (a), REG_CROSSING_JUMP, NULL_RTX) + || BB_PARTITION (a) != BB_PARTITION (b)) return false; My dislike for this whole scheme just continues to grow... How can there be a REG_CROSSING_JUMP note if BB_PARTITION(a)==BB_PARTITION(b)? That is a bug. We should not need the notes here. As long as we have the CFG, BB_PARTITION(a)==BB_PARTITION(b) should be the canonical way to check whether two blocks are in the same partition, and the EDGE_CROSSING flag should be set iff an edge crosses from one section to another. The REG_CROSSING_JUMP note should only be used to see if a JUMP_INSN may jump to another section, without having to check all successor edges. Any place where we have to check the BB_PARTITION or edge-flagsEDGE_CROSSING *and* REG_CROSSING_JUMP indicates a bug in the partitioning updating. Another BTW: sched-vis.c doesn't handle REG_CROSSING_JUMP notes so that slim RTL dumping breaks. I need this patchlet to make things work: Index: sched-vis.c === --- sched-vis.c (revision 191819) +++ sched-vis.c (working copy) @@ -553,6 +553,11 @@ { char t1[BUF_LEN], t2[BUF_LEN], t3[BUF_LEN]; + if (! x) +{ + sprintf (buf, (nil)); + return; +} switch (GET_CODE (x)) { case SET: Ciao! Steven -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH] skip gcc.target/i386/pr53249.c on darwin
On Thu, Nov 1, 2012 at 6:47 AM, Jack Howarth howa...@bromo.med.uc.edu wrote: On Fri, Aug 24, 2012 at 04:13:20PM +0200, Rainer Orth wrote: Jack Howarth howa...@bromo.med.uc.edu writes: Currently the new testcase for gcc.target/i386/pr53249.c is failing on darwin due to the absence of -mx32 support on that target. The following patch skips this testcase on darwin. Tested on x86_64-apple-darwin12... This also fails on Solaris/x86 (cf. PR testsuite/53365) and i686-unknown-linux-gnu. I'd strongly prefer if HJ could devise a real fix instead of just skipping the test on an explicit list of systems. Rainer Rainer, What about using... Index: gcc/testsuite/gcc.target/i386/pr53249.c === --- gcc/testsuite/gcc.target/i386/pr53249.c (revision 193061) +++ gcc/testsuite/gcc.target/i386/pr53249.c (working copy) @@ -1,4 +1,4 @@ -/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-do compile { target { ! { ia32 || llp64 } } } } */ /* { dg-options -O2 -mx32 -ftls-model=initial-exec -maddress-mode=short } */ struct gomp_task This converts the failure at -m64 into an unsupported testcase on x86_64-apple-darwin12. This will disable test on Linux/x86-64. -- H.J.
Re: [tsan] ThreadSanitizer instrumentation part
It is not always true though -- only when the array address is picked by the pass to be the induction variable. David On Thu, Nov 1, 2012 at 2:24 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Nov 01, 2012 at 02:19:43PM -0700, Xinliang David Li wrote: Right -- I found the same thing, it is the ivopt that resets it -- the IV opt pass should have a TODO_update_address_taken. That wouldn't change anything. The IVopts pass adds some_ptr = a[0]; to the IL, so a is then address taken. Jakub