Re: Missing guard in ira-color.c ?
On May 10, 2012, at 10:41 PM, Vladimir Makarov wrote: On 05/10/2012 09:10 AM, Tristan Gingold wrote:

Hi, I am getting a segfault in ira-color.c:2945 on the trunk:

Program received signal SIGSEGV, Segmentation fault.
0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945
2945	      || ira_reg_equiv_const[regno] != NULL_RTX
(gdb) l
2940	  /* don't do the optimization because it can create
2941	     copies and the reload pass can spill the allocno set
2942	     by copy although the allocno will not get memory
2943	     slot.  */
2944	  || ira_reg_equiv_invariant_p[regno]
2945	  || ira_reg_equiv_const[regno] != NULL_RTX
2946	  || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
2947	continue;
2948	      mode = ALLOCNO_MODE (a);
2949	      rclass = ALLOCNO_CLASS (a);

while building gcc (the gnatcmd.adb file) for ia64-vms using a cross compiler (target=ia64-vms, host=x86_64-linux). The reason looks to be an out-of-bounds access:

(gdb) print regno
$10 = 18476
(gdb) print ira_reg_equiv_len
$11 = 17984

(I suppose this setup is not easy at all to reproduce, but I can provide any files, if necessary.)

Tristan, thanks for reporting this. Wild guess, as I don't know IRA at all: it looks like in this file most accesses to ira_reg_equiv_* are guarded. Is it expected that they aren't at this point?

Yes, I guess. It is possible to have pseudos which are out of the range of ira_reg_equiv_const. It should be hard to reproduce such an error because they are generated when we need to break a circular dependence (e.g. when hard register 1 should be moved to hard register 2 and hard register 2 to hard register 1). Your solution is perfectly fine. So you can commit the patch into the trunk as pre-approved.

Thank you for your prompt answer. I will commit it after regtesting. Tristan.
Re: [PATCH, 4.7] Backport fix to [un]signed_type_for
On Thu, 10 May 2012, William J. Schmidt wrote: On Thu, 2012-05-10 at 18:49 +0200, Jakub Jelinek wrote: On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:

Backporting this patch to 4.7 fixes a problem building Fedora 17. Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is the backport OK?

For 4.7 I'd very much prefer a less intrusive change (i.e. change the java langhook) instead, but I'll defer to Richard if he prefers this over that.

OK. If that's desired, this is the possible change to the langhook:

Index: gcc/java/typeck.c
===
--- gcc/java/typeck.c	(revision 187158)
+++ gcc/java/typeck.c	(working copy)
@@ -189,6 +189,12 @@ java_type_for_size (unsigned bits, int unsignedp)
     return unsignedp ? unsigned_int_type_node : int_type_node;
   if (bits <= TYPE_PRECISION (long_type_node))
     return unsignedp ? unsigned_long_type_node : long_type_node;
+  /* A 64-bit target with TImode requires 128-bit type definitions
+     for bitsizetype.  */
+  if (int128_integer_type_node
+      && bits == TYPE_PRECISION (int128_integer_type_node))
+    return (unsignedp ? int128_unsigned_type_node
+	    : int128_integer_type_node);
   return 0;
 }

which also fixed the problem and bootstraps without regressions. Whichever you guys prefer is fine with me. I prefer the java variant, too. Thanks, Richard.
Re: [C++ Patch] PR 53305
On 05/11/2012 04:48 AM, Gabriel Dos Reis wrote: On Thu, May 10, 2012 at 6:40 PM, Paolo Carlini <paolo.carl...@oracle.com> wrote:

Hi, an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next DECL_CONTEXT (r) can only crash. Plus a garbled error message, because pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM. Tested x86_64-linux. Thanks, Paolo.

Stylistically, I would write if (r == NULL) or if (r == NULL_TREE). Patch OK with that change.

Thanks Gaby. Then I guess I'm going to commit the variant with NULL_TREE; I like it a tad better because, after all, formally these functions return trees. And for consistency I change another instance only a few lines above. Thanks, Paolo.

Index: testsuite/g++.dg/cpp0x/variadic132.C
===
--- testsuite/g++.dg/cpp0x/variadic132.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/variadic132.C	(revision 0)
@@ -0,0 +1,27 @@
+// PR c++/53305
+// { dg-do compile { target c++11 } }
+
+template<class... Ts> struct tuple { };
+
+struct funct
+{
+  template<class... argTs>
+  int operator()(argTs...);
+};
+
+template<class...> class test;
+
+template<template<class...> class tp,
+	 class... arg1Ts, class... arg2Ts>
+class test<tp<arg1Ts...>, tp<arg2Ts...>>
+{
+  template<class func, class... arg3Ts>
+  auto test2(func fun, arg1Ts... arg1s, arg3Ts... arg3s)
+    -> decltype(fun(arg1s..., arg3s...));
+};
+
+int main()
+{
+  test<tuple<>, tuple<char,int>> t2;
+  t2.test2(funct(), 'a', 2); // { dg-error "no matching function" }
+}

Index: cp/cxx-pretty-print.c
===
--- cp/cxx-pretty-print.c	(revision 187376)
+++ cp/cxx-pretty-print.c	(working copy)
@@ -1261,6 +1261,7 @@ pp_cxx_simple_type_specifier (cxx_pretty_printer *
     case TEMPLATE_TYPE_PARM:
     case TEMPLATE_TEMPLATE_PARM:
     case TEMPLATE_PARM_INDEX:
+    case BOUND_TEMPLATE_TEMPLATE_PARM:
       pp_cxx_unqualified_id (pp, t);
       break;

Index: cp/pt.c
===
--- cp/pt.c	(revision 187376)
+++ cp/pt.c	(working copy)
@@ -12066,7 +12066,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
     case PARM_DECL:
       r = retrieve_local_specialization (t);

-      if (r == NULL)
+      if (r == NULL_TREE)
	{
	  tree c;

@@ -12084,6 +12084,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
	     not the following PARM_DECLs that are chained to T.  */
	  c = copy_node (t);
	  r = tsubst_decl (c, args, complain);
+	  if (r == NULL_TREE)
+	    return error_mark_node;
	  /* Give it the template pattern as its context; its true context
	     hasn't been instantiated yet and this is good enough for
	     mangling.  */
[Patch, Fortran, committed] PR53310 - Fix multi-malloc memory leak in EOSHIFT
I have committed the attached patch as obvious. libgfortran (since GCC 4.5) allocates the memory rank-times instead of only once in eoshift2. For the test case of the PR, gfortran leaked 6 MB per call! Built and regtested on x86-64-linux and committed as Rev. 187395 to the 4.8 trunk. I intend to backport the patch, but I have not yet decided whether it will go down to 4.5 or not. Tobias

PS: In the ChangeLog I have converted some spaces into tabs.

Index: libgfortran/intrinsics/eoshift2.c
===
--- libgfortran/intrinsics/eoshift2.c	(revision 187394)
+++ libgfortran/intrinsics/eoshift2.c	(working copy)
@@ -77,6 +77,10 @@ eoshift2 (gfc_array_char *ret, const gfc_array_cha
       ret->offset = 0;
       ret->dtype = array->dtype;
+
+      /* xmalloc allocates a single byte for zero size.  */
+      ret->base_addr = xmalloc (size * arraysize);
+
       for (i = 0; i < GFC_DESCRIPTOR_RANK (array); i++)
	{
	  index_type ub, str;
@@ -90,10 +94,6 @@ eoshift2 (gfc_array_char *ret, const gfc_array_cha
		  * GFC_DESCRIPTOR_STRIDE(ret,i-1);
	  GFC_DIMENSION_SET(ret->dim[i], 0, ub, str);
-
-	  /* xmalloc allocates a single byte for zero size.  */
-	  ret->base_addr = xmalloc (size * arraysize);
-
	}
     }
   else if (unlikely (compile_options.bounds_check))

Index: libgfortran/ChangeLog
===
--- libgfortran/ChangeLog	(revision 187394)
+++ libgfortran/ChangeLog	(working copy)
@@ -1,17 +1,23 @@
+2012-05-11  Tobias Burnus  bur...@net-b.de
+
+	PR fortran/53310
+	* intrinsics/eoshift2.c (eoshift2): Do not leak
+	memory by allocating it in the loop.
+
 2012-05-05  Janne Blomqvist  j...@gcc.gnu.org

-* config.h.in: Regenerated.
-* configure: Regenerated.
-* configure.ac: Add checks for getegid and __secure_getenv.
-* io/unix.c (P_tmpdir): Fallback definition for macro.
-(tempfile_open): New function.
-(tempfile): Use secure_getenv, call tempfile_open to try each
-directory in turn.
-* libgfortran.h (DEFAULT_TMPDIR): Remove macro.
-(secure_getenv): New macro/prototype.
-* runtime/environ.c (secure_getenv): New function.
-(variable_table): Rename GFORTRAN_TMPDIR to TMPDIR. -* runtime/main.c (find_addr2line): Use secure_getenv. + * config.h.in: Regenerated. + * configure: Regenerated. + * configure.ac: Add checks for getegid and __secure_getenv. + * io/unix.c (P_tmpdir): Fallback definition for macro. + (tempfile_open): New function. + (tempfile): Use secure_getenv, call tempfile_open to try each + directory in turn. + * libgfortran.h (DEFAULT_TMPDIR): Remove macro. + (secure_getenv): New macro/prototype. + * runtime/environ.c (secure_getenv): New function. + (variable_table): Rename GFORTRAN_TMPDIR to TMPDIR. + * runtime/main.c (find_addr2line): Use secure_getenv. 2012-04-22 Tobias Burnus bur...@net-b.de @@ -479,24 +485,24 @@ 2012-03-15 Janne Blomqvist j...@gcc.gnu.org -PR libfortran/52434 -PR libfortran/48878 -PR libfortran/38199 -* io/unit.c (get_internal_unit): Default to ROUND_UNSPECIFIED. -(init_units): Likewise. -* io/write_float.def (determine_precision): New function. -(output_float): Take into account buffer with %f format, no need -for our own rounding if unspecified or processor specified -rounding. -(DTOA): Simplify format string, add parameters. -(FDTOA): New macros similar to DTOA, but using %f format. -(OUTPUT_FLOAT_FMT_G): Stack allocate newf, determine correct -precision and fill buffer. -(EN_PREC): New macro. -(determine_en_precision): New function. -(WRITE_FLOAT): For G format, move buffer filling into -output_float_FMT_G, use FDTOA for F format. -(write_float): Increase buffer due to F format. + PR libfortran/52434 + PR libfortran/48878 + PR libfortran/38199 + * io/unit.c (get_internal_unit): Default to ROUND_UNSPECIFIED. + (init_units): Likewise. + * io/write_float.def (determine_precision): New function. + (output_float): Take into account buffer with %f format, no need + for our own rounding if unspecified or processor specified + rounding. + (DTOA): Simplify format string, add parameters. + (FDTOA): New macros similar to DTOA, but using %f format. 
+ (OUTPUT_FLOAT_FMT_G): Stack allocate newf, determine correct + precision and fill buffer. + (EN_PREC): New macro. + (determine_en_precision): New function. + (WRITE_FLOAT): For G format, move buffer filling into + output_float_FMT_G, use FDTOA for F format. + (write_float): Increase buffer due to F format. 2012-03-14 Rainer Orth r...@cebitec.uni-bielefeld.de
Re: [C++ Patch] PR 53301
Hi, On 05/11/2012 06:41 AM, Jason Merrill wrote: On 05/10/2012 05:31 PM, Paolo Carlini wrote: Let's see if we can do something *now* ;) My concrete proposal would be: rename TYPE_PTRMEM_P to TYPE_PTRDATAMEM_P (consistent with TYPE_PTRMEMFUNC_P); rename TYPE_PTR_TO_MEMBER_P to TYPE_PTRMEM_P; and then finally #define TYPE_PTR_OR_PTRMEM_P(NODE) \ (TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE)) and use it everywhere. Sounds like an improvement? Sounds pretty good. But I suspect a lot of places want to check TYPE_PTR_P || TYPE_PTRDATAMEM_P (because you can't just use TREE_TYPE to get the function type of a PMF), so this new macro doesn't help with your desire to avoid writing ||. Well, I wanted to avoid writing it somewhere *else* ;) But, if we want, we could add a new predicate for your case too, maybe in a following step. What do you think? Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P. No, that flag applies to both varieties of members. I see. Then while we are at it we could probably tweak the comment a bit, I find it a tad misleading: /* Indicates when overload resolution may resolve to a pointer to member function. [expr.unary.op]/3 */ ??? Thanks, Paolo.
Re: PATCH: Add RTM support to -march=native
On Fri, May 11, 2012 at 4:51 AM, H.J. Lu hongjiu...@intel.com wrote: This patch adds RTM support to -march=native. Tested on Linux/x86-64. OK for trunk? 2012-05-10 H.J. Lu hongjiu...@intel.com * config/i386/driver-i386.c (host_detect_local_cpu): Support RTM. OK. Thanks, Uros.
Re: [C++ Patch] PR 53305
On Fri, May 11, 2012 at 3:07 AM, Paolo Carlini paolo.carl...@oracle.com wrote: Thanks Gaby. Then I guess I'm going to commit the variant with NULL_TREE, I like it a tad better because after all formally these functions return trees. And consistently I change another instance only a few lines above. OK. Thanks; -- Gaby
[RFC GCC/patch] Support sinking loads from memory in tree-ssa-sink.c if possible
Hi, I previously noticed from testcases that gcc currently does not sink loads from memory in tree-ssa-sink.c. After some discussion I worked out a patch to support this in gcc. Please refer to http://gcc.gnu.org/ml/gcc/2012-04/msg00404.html for some background. I think it is a trivial optimization, but it might be wanted in GCC since fewer instructions are executed with it. One potential issue: should I disable it when optimizing for size, if sinking hurts size? I bootstrapped on x86 and tested the patch on x86/arm; no new failures introduced. Is it OK for trunk? Comments are welcome. Thanks.

gcc/ChangeLog:
2012-05-11  Bin Cheng  bin.ch...@arm.com

	* tree-ssa-sink.c (pred_blocks, pred_visited): New static vars.
	(init_sink_load): New function initializes pred_blocks and pred_visited.
	(free_sink_load): New function frees pred_blocks and pred_visited.
	(sink_load_p): New function checks whether load operation should be sunk.
	(execute_sink_code): Call init_sink_load and free_sink_load.
	(statement_sink_location): Sink loads from memory if possible.

gcc/testsuite/ChangeLog:
2012-05-11  Bin Cheng  bin.ch...@arm.com

	* testsuite/gcc.dg/tree-ssa/ssa-sink-10.c: New test.
	* testsuite/gcc.dg/tree-ssa/ssa-sink-11.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c	(revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+extern int b;
+int
+foo (int a, int c)
+{
+  int x = b;
+  if (c == 1)
+    return x;
+  else
+    return a;
+}
+/* We should sink the load from b into the branch that returns x.  */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { cleanup-tree-dump "sink" } } */

Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-11.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+extern int b;
+void bar();
+int
+foo (int a, int c)
+{
+  int x = b;
+  bar();
+  if (c == 1)
+    return x;
+  else
+    return a;
+}
+/* We should not sink the load of b into the branch that returns x,
+   because it might be clobbered by the call to bar.  */
+/* { dg-final { scan-tree-dump-times "Sunk statements:" 0 "sink" } } */
+/* { dg-final { cleanup-tree-dump "sink" } } */

Index: gcc/tree-ssa-sink.c
===
--- gcc/tree-ssa-sink.c	(revision 186736)
+++ gcc/tree-ssa-sink.c	(working copy)
@@ -77,6 +77,115 @@
 } sink_stats;

+/* Stores reversely breadth-first searched block pointer, starting at
+   the potential destination block of sinking from load.  */
+static basic_block *pred_blocks;
+/* Bitmap records the visited basic blocks in breadth-first search.  */
+static bitmap pred_visited;
+
+static void
+init_sink_load (void)
+{
+  pred_blocks = XCNEWVEC (basic_block, last_basic_block);
+  pred_visited = BITMAP_ALLOC (NULL);
+}
+
+static void
+free_sink_load (void)
+{
+  free (pred_blocks);
+  BITMAP_FREE (pred_visited);
+}
+
+/* Given a gimple STMT in basic block FROM, which is a load operation,
+   check whether it should be sunk to TOGSI in basic block TO.
+   Return TRUE if it should be sunk, otherwise return FALSE.
+
+   Loads should not be sunk across stmts that might clobber the reference;
+   for simplicity, we just stop checking at any store.  */
+
+static bool
+sink_load_p (gimple stmt, basic_block from, basic_block to,
+	     gimple_stmt_iterator *togsi)
+{
+  int i, vc;
+  basic_block bb;
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+
+  /* Stmt should be a load.  */
+  gcc_assert (gimple_vuse (stmt) && !gimple_vdef (stmt));
+
+  /* Sinking conflicts with the if-conversion pass, which speculates
+     loads.  So do not sink a possibly trapping stmt, which cannot be
+     speculated by the if-conversion pass afterward.  */
+  if (gimple_assign_rhs_could_trap_p (stmt))
+    return false;
+
+  /* Sink to not post-dominated basic block, unless it's in outer loop.  */
+  if (dominated_by_p (CDI_POST_DOMINATORS, from, to)
+      && (from->loop_depth <= to->loop_depth))
+    return false;
+
+  /* If the latch block is empty, don't make it non-empty by sinking
+     a load into it.  */
+  if (to == from->loop_father->latch && empty_block_p (to))
+    return false;
+
+  /* Don't sink stmt if the reference might be clobbered by successor
+     stmts.  */
+  for (gsi_next (&gsi); !gsi_end_p (gsi); gsi_next (&gsi))
+    if (gimple_vdef (gsi_stmt (gsi)))
+      return false;
+
+  /* Don't sink stmt if the reference might be clobbered by predecessors
+     of TOGSI.  */
+  gsi = *togsi;
+  if (!gsi_end_p (gsi))
+    gsi_prev (&gsi);
+
+  for (;
[testsuite] Allow for ! comments in g++.dg/debug/dwarf2/nested-3.C
It turns out that g++.dg/debug/dwarf2/nested-3.C was still failing on Solaris/SPARC: with both as and gas, ! is used as a comment character. This patch accounts for that. Tested with the appropriate runtest invocation on sparc-sun-solaris2.11 with both as and gas, installed on mainline. Rainer

2012-05-11  Rainer Orth  r...@cebitec.uni-bielefeld.de

	* g++.dg/debug/dwarf2/nested-3.C: Allow for ! comments.

# HG changeset patch
# Parent 00aab0518e96468d2a15d919ee91176c075ea95e
Allow for ! comments in g++.dg/debug/dwarf2/nested-3.C

diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C b/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
--- a/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/nested-3.C
@@ -59,4 +59,4 @@ main ()
 //
 // Hence the scary regexp:
 //
-// { dg-final { scan-assembler \[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_namespace\\)\[\n\r\]+\[^\n\r\]*\thread\[\^\n\r]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_class_type\\)(\[\n\r\]+\[^\n\r\]*)+\Executor\[^\n\r\]+\[\n\r\]+\[^\n\r\]*DW_AT_declaration\[\n\r\]+\[^\n\r\]*DW_AT_signature\[^#/\]*\[#/\] \[^\n\r\]*\\(DIE\[^\n\r\]*DW_TAG_subprogram\\)\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\CurrentExecutor\[^\n\r\]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*end of children of DIE 0x\\3\[\n\r]+\[^\n\r\]*end of children of DIE 0x\\1\[\n\r]+ } }
+// { dg-final { scan-assembler \[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_namespace\\)\[\n\r\]+\[^\n\r\]*\thread\[\^\n\r]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\\(DIE \\(0x(\[0-9a-f\]+)\\) DW_TAG_class_type\\)(\[\n\r\]+\[^\n\r\]*)+\Executor\[^\n\r\]+\[\n\r\]+\[^\n\r\]*DW_AT_declaration\[\n\r\]+\[^\n\r\]*DW_AT_signature\[^#/!\]*\[#/!\] \[^\n\r\]*\\(DIE\[^\n\r\]*DW_TAG_subprogram\\)\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*\CurrentExecutor\[^\n\r\]+\[\n\r\]+(\[^\n\r\]*\[\n\r\]+)+(\[^\n\r\]*\[\n\r\]+)+\[^\n\r\]*end of children of DIE 0x\\3\[\n\r]+\[^\n\r\]*end of children of DIE 0x\\1\[\n\r]+ } }

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [RFC GCC/patch] Support sinking loads from memory in tree-ssa-sink.c if possible
On Fri, May 11, 2012 at 10:53 AM, Bin Cheng bin.ch...@arm.com wrote:

Hi, I previously noticed from testcases that gcc currently does not sink loads from memory in tree-ssa-sink.c. After some discussion I worked out a patch to support this in gcc. Please refer to http://gcc.gnu.org/ml/gcc/2012-04/msg00404.html for some background. I think it is a trivial optimization, but it might be wanted in GCC since fewer instructions are executed with it. One potential issue: should I disable it when optimizing for size, if sinking hurts size? I bootstrapped on x86 and tested the patch on x86/arm; no new failures introduced. Is it OK for trunk? Comments are welcome. Thanks.

First of all, can you do your patches with -p? That makes them easier to review.

I think you should increase testing coverage - interesting cases include

+extern int b;
+void bar();
+int
+foo (int a, int c)
+{
+  int x = b;
+  if (c == 1)
+    return x;
+  else
+    {
+      bar ();
+      return a;
+    }
+}

where bar () should not prevent sinking.

I don't like the way you perform the checking whether there is a store along the paths from the source to the destination location - we have virtual operands in SSA form that should enable a more efficient way to do this. (In general SSU form would be the best here, as the whole sinking is really partial dead code elimination.) Basically you are looking for the active VUSE operand at the insert location and want to make sure it matches that of the load you want to sink. Or the other way around, you are looking for all VDEF operands that make the load's VUSE operand dead and want to make sure those are not on any path from the sink source to its destination. Walking upwards matches what SSA provides the best (but doesn't help very much, as as soon as you've found the SSA name to start walking you are finished). But it boils down to recording the active VUSE at basic-block entry (something that's the same for all loads, no need to compute that per load).
But, as we are dealing with simple cases (any store will block sinking of loads) we can as well limit the CFG structure we can sink over to only consider immediate successors of the sink source. As sink_code_in_bb already walks backwards over the stmts of the sink source it's easy to remember whether we've seen a store here or not (also, if there is a relevant store in the source BB then the vuses SSA uses are all in BB or its dominators and the single store reachable that way is in BB). Thanks, Richard. gcc/ChangeLog: 2012-05-11 Bin Cheng bin.ch...@arm.com * tree-ssa-sink.c (pred_blocks, pred_visited): New static vars. (init_sink_load): New function initializes pred_blocks and pred_visited. (free_sink_load): New function frees pred_blocks and pred_visited. (sink_load_p): New function checks whether load operation should be sunk. (execute_sink_code): Call init_sink_load and free_sink_load. (statement_sink_location): Sink loads from memory if possible. gcc/testsuite/ChangeLog: 2012-05-11 Bin Cheng bin.ch...@arm.com * testsuite/gcc.dg/tree-ssa/ssa-sink-10.c: New test. * testsuite/gcc.dg/tree-ssa/ssa-sink-11.c: New test.
Re: [patch] support for multiarch systems
On 11/05/2012 07:13, Matthias Klose wrote: ok, I did clarify it in the existing documentation of MULTIARCH_DIRNAME in fragments.texi, detailing the search order for the files. Should the search order be mentioned in some user documentation as well? If yes, where? Thanks! I don't think the search order should be mentioned specially, since anyway *_INCLUDE_PATH and LIBRARY_PATH are mentioned. However, under MULTILIB_DIRNAMES I would add something like this: @code{MULTILIB_DIRNAMES} describes the multilib directories using GCC conventions and is applied to directories that are part of the GCC installation. When multilib-enabled, the compiler will add a subdirectory of the form @var{prefix}/@var{multilib} before each directory in the search path for libraries and crt files. +@findex MULTILIB_OSDIRNAMES +@item MULTILIB_OSDIRNAMES +If @code{MULTILIB_OPTIONS} is used, this variable specifies ... a list of subdirectory names, that are used to modify the search path depending on the chosen multilib. Unlike @code{MULTILIB_DIRNAMES}, @code{MULTILIB_OSDIRNAMES} describes the multilib directories using operating system conventions, and is applied to directories such as @code{lib} or those in the @env{LIBRARY_PATH} environment variable. The format is either the same as of
Otherwise, the compiler will examine the search path for libraries and crt files twice; the first time it will add @var{multilib} to each directory in the search path, the second it will not. Use the mapping when there is +no one-to-one equivalence between GCC levels and the OS. I'm not sure what you mean here? +For multilib enabled configurations (see @code{MULTIARCH_DIRNAME}) +below), the multilib name is appended to each directory name, separated +by a colon (e.g. @samp{../lib:x86_64-linux-gnu}). For configurations that support both multilib and multiarch, @code{MULTILIB_OSDIRNAMES} also encodes the multiarch name, thus subsuming @code{MULTIARCH_DIRNAME}. The multiarch name is appended to each directory name, separated by a colon (e.g. @samp{../lib32:i386-linux-gnu}). Each multiarch subdirectory will be searched before the corresponding OS multilib directory, for example @samp{/lib/i386-linux-gnu} before @samp{/lib/..lib32}. The multiarch name will also be used to modify the system header search path, as explained for @code{MULTIARCH_DIRNAME}. +@findex MULTIARCH_DIRNAME +@item MULTIARCH_DIRNAME +If @code{MULTIARCH_DIRNAME} is used, this variable specifies the +multiarch name for this configuration. +Libraries and crt files are searched first in +@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix} +added by @code{add_prefix} or @code{add_sysrooted_prefix}. +System header files are searched first in +@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before +@code{LOCAL_INCLUDE_DIR}, and in +@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before +@code{NATIVE_SYSTEM_HEADER_DIR}. +@code{MULTIARCH_DIRNAME} is not used for multilib enabled +configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead. This sounds simpler, and doesn't refer to GCC implementation details such add_prefix/add_sysrooted_prefix: This variable specifies the multiarch name for configurations that are multiarch-enabled but not multilibbed configurations. 
The multiarch name is used to augment the search path for libraries, crt files and system header files with additional locations. The compiler will add a multiarch subdirectory of the form @var{prefix}/@var{multiarch} before each directory in the library and crt search path. It will also add two directories, @code{LOCAL_INCLUDE_DIR}/@var{multiarch} and @code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch}, to the system header search path, respectively before @code{LOCAL_INCLUDE_DIR} and @code{NATIVE_SYSTEM_HEADER_DIR}. @code{MULTIARCH_DIRNAME} is not used for configurations that support both multilib and multiarch. In that case, multiarch names are encoded in @code{MULTILIB_OSDIRNAMES} instead. +The multiarch tuples are defined +in @uref{http://wiki.debian.org/Multiarch/Tuples}. Is this needed? Aren't they defined simply by the GCC source code? Surely if some other OS implements multiarch with different tuples (no matter how undesirable that would be) Debian would not be an authoritative source. Paolo
Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as
Hi Uros, there, an alternative patch might be to remove it from the port now instead of the patch. Right, that's why I was asking for review rather than just installing on my own. I'd rather see that we remove semicolon in this case, but please note that xchg with memory operand doesn't need lock prefix. If it is not too much trouble... I saw your message only after I had committed the patch. I'll try to have a look this weekend, but it's quite unclear if I have any time. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes
On Thu, May 10, 2012 at 1:19 PM, Jan Hubicka hubi...@ucw.cz wrote: Hi, after some thought, the changes to omp-low are not as obviously harmless as I originally thought. So I decided to handle this by a separate patch. This patch simply makes cgraph not release bodies of artificial functions, which papers over the problem in an easier way. Bootstrapped/regtested x86_64-linux, committed. Honza

	* cgraph.h (cgraph_remove_unreachable_nodes): Rename to ...
	(symtab_remove_unreachable_nodes): ... this one.
	* ipa-cp.c (ipcp_driver): Do not remove unreachable nodes.
	* cgraphunit.c (ipa_passes): Update.
	* cgraphclones.c (cgraph_materialize_all_clones): Update.
	* cgraph.c (cgraph_release_function_body): Only turn initial
	into error mark when initial was previously set.
	* ipa-inline.c (ipa_inline): Update.
	* ipa.c: Include ipa-inline.h
	(enqueue_cgraph_node, enqueue_varpool_node): Remove.
	(enqueue_node): New function.
	(process_references): Update.
	(symtab_remove_unreachable_nodes): Cleanup.
	* passes.c (execute_todo, execute_one_pass): Update.

This may have caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53321 -- H.J.
[PATCH] Fix PR53295
This fixes the dependency of vectorization of strided loads on gather support. For that to work we need to lift the restriction in data-ref analysis that requires a constant DR_STEP. Fortunately fallout is small. As a side-effect the data-reference for strided loads now makes sense (rather than being gobbled over by gather analysis' BB-level DR). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard.

2012-05-11  Richard Guenther  rguent...@suse.de

	* tree-data-ref.h (stride_of_unit_type_p): Handle non-constant
	strides.
	* tree-data-ref.c (dr_analyze_innermost): Allow non-constant
	strides when analyzing data-references in a loop context.
	* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Reject
	non-constant strides for now.
	(vect_enhance_data_refs_alignment): Ignore data references
	that are strided loads.
	(vect_analyze_data_ref_access): Handle non-constant strides.
	(vect_check_strided_load): Verify the data-reference is a load.
	(vect_analyze_data_refs): Restructure to make strided load
	support not dependent on gather support.
	* tree-vect-stmts.c (vectorizable_load): Avoid useless work
	when doing strided or gather loads.
	* tree-vect-loop-manip.c (vect_vfa_segment_size): Use
	integer_zerop to compare stride with zero.

Index: gcc/tree-data-ref.h
===
*** gcc/tree-data-ref.h	(revision 187396)
--- gcc/tree-data-ref.h	(working copy)
*** bool stmt_with_adjacent_zero_store_dr_p
*** 618,626 ****
  static inline bool
  stride_of_unit_type_p (tree stride, tree type)
  {
!   return tree_int_cst_equal (fold_unary (ABS_EXPR, TREE_TYPE (stride),
! 					 stride),
! 			     TYPE_SIZE_UNIT (type));
  }
  
  /* Determines whether RDG vertices V1 and V2 access to similar memory
--- 618,627 ----
  static inline bool
  stride_of_unit_type_p (tree stride, tree type)
  {
!   return (TREE_CODE (stride) == INTEGER_CST
! 	  && tree_int_cst_equal (fold_unary (ABS_EXPR, TREE_TYPE (stride),
! 					     stride),
! 				 TYPE_SIZE_UNIT (type)));
  }
  
  /* Determines whether RDG vertices V1 and V2 access to similar memory

Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c	(revision 187396)
--- gcc/tree-data-ref.c	(working copy)
*** dr_analyze_innermost (struct data_refere
*** 736,742 ****
    if (in_loop)
      {
        if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv,
! 		      false))
  	{
  	  if (nest)
  	    {
--- 736,742 ----
    if (in_loop)
      {
        if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv,
! 		      nest ? true : false))
  	{
  	  if (nest)
  	    {
*** dr_analyze_innermost (struct data_refere
*** 773,779 ****
  	  offset_iv.step = ssize_int (0);
  	}
        else if (!simple_iv (loop, loop_containing_stmt (stmt),
! 			   poffset, &offset_iv, false))
  	{
  	  if (nest)
  	    {
--- 773,780 ----
  	  offset_iv.step = ssize_int (0);
  	}
        else if (!simple_iv (loop, loop_containing_stmt (stmt),
! 			   poffset, &offset_iv,
! 			   nest ? true : false))
  	{
  	  if (nest)
  	    {

Index: gcc/tree-vect-data-refs.c
===
*** gcc/tree-vect-data-refs.c	(revision 187396)
--- gcc/tree-vect-data-refs.c	(working copy)
*** vect_mark_for_runtime_alias_test (ddr_p
*** 542,547 ****
--- 542,558 ----
        return false;
      }
  
+   /* FORNOW: We don't support creating runtime alias tests for non-constant
+      step.  */
+   if (TREE_CODE (DR_STEP (DDR_A (ddr))) != INTEGER_CST
+       || TREE_CODE (DR_STEP (DDR_B (ddr))) != INTEGER_CST)
+     {
+       if (vect_print_dump_info (REPORT_DR_DETAILS))
+ 	fprintf (vect_dump, "versioning not yet supported for non-constant "
+ 		 "step");
+       return false;
+     }
+ 
    VEC_safe_push (ddr_p, heap, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo), ddr);
    return true;
  }
*** vect_enhance_data_refs_alignment (loop_v
*** 1522,1527 ****
--- 1533,1543 ----
        if (integer_zerop (DR_STEP (dr)))
  	continue;
  
+       /* Strided loads perform only component accesses, alignment is
+ 	 irrelevant for them.  */
+       if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
+ 	continue;
+ 
        supportable_dr_alignment = vect_supportable_dr_alignment (dr, true);
        do_peeling = vector_alignment_reachable_p (dr);
        if (do_peeling)
*** vect_enhance_data_refs_alignment (loop_v
*** 1779,1784
Re: [PATCH, alpha]: Fix ICE in alpha_emit_conditional_move, at config/alpha/alpha.c:2649
On Fri, May 11, 2012 at 1:54 AM, Uros Bizjak ubiz...@gmail.com wrote: Hello! Recently testsuite/gcc.c-torture/execute/ieee/pr50310.c started to ICE when compiled with -O3 -mieee on alphaev68-pc-linux-gnu: $ ~/gcc-build-alpha/gcc/cc1 -O3 -mieee -quiet pr50310.c pr50310.c: In function ‘foo’: pr50310.c:31:20: internal compiler error: in alpha_emit_conditional_move, at config/alpha/alpha.c:2649 s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0; ^ Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. It turned out that UNORDERED and ORDERED RTX codes are not handled in alpha_emit_conditional_move. Attached patch fixes this oversight. 2012-05-11 Uros Bizjak ubiz...@gmail.com * config/alpha/alpha.c (alpha_emit_conditional_branch): Handle ORDERED and UNORDERED conditions. Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu. OK for mainline SVN and release branches? Oops, wrong patch was attached. Attached is the correct one. Uros. Index: config/alpha/alpha.c === --- config/alpha/alpha.c(revision 187394) +++ config/alpha/alpha.c(working copy) @@ -2335,7 +2335,7 @@ alpha_emit_conditional_branch (rtx operands[], enu { case EQ: case LE: case LT: case LEU: case LTU: case UNORDERED: - /* We have these compares: */ + /* We have these compares. */ cmp_code = code, branch_code = NE; break; @@ -2572,13 +2572,15 @@ alpha_emit_conditional_move (rtx cmp, enum machine switch (code) { case EQ: case LE: case LT: case LEU: case LTU: + case UNORDERED: /* We have these compares. */ cmp_code = code, code = NE; break; case NE: - /* This must be reversed. */ - cmp_code = EQ, code = EQ; + case ORDERED: + /* These must be reversed. 
*/ + cmp_code = reverse_condition (code), code = EQ; break; case GE: case GT: case GEU: case GTU: @@ -2598,6 +2600,14 @@ alpha_emit_conditional_move (rtx cmp, enum machine gcc_unreachable (); } + if (cmp_mode == DImode) + { + if (!reg_or_0_operand (op0, DImode)) + op0 = force_reg (DImode, op0); + if (!reg_or_8bit_operand (op1, DImode)) + op1 = force_reg (DImode, op1); + } + tem = gen_reg_rtx (cmp_mode); emit_insn (gen_rtx_SET (VOIDmode, tem, gen_rtx_fmt_ee (cmp_code, cmp_mode, @@ -2609,6 +2619,14 @@ alpha_emit_conditional_move (rtx cmp, enum machine local_fast_math = 1; } + if (cmp_mode == DImode) +{ + if (!reg_or_0_operand (op0, DImode)) + op0 = force_reg (DImode, op0); + if (!reg_or_8bit_operand (op1, DImode)) + op1 = force_reg (DImode, op1); +} + /* We may be able to use a conditional move directly. This avoids emitting spurious compares. */ if (signed_comparison_operator (cmp, VOIDmode) @@ -2627,11 +2645,13 @@ alpha_emit_conditional_move (rtx cmp, enum machine switch (code) { case EQ: case LE: case LT: case LEU: case LTU: +case UNORDERED: /* We have these compares: */ break; case NE: - /* This must be reversed. */ +case ORDERED: + /* These must be reversed. */ code = reverse_condition (code); cmov_code = EQ; break;
[PATCH] More pass cleanups, make phinodes cache global
This makes some more pass structs internal to passes.c. It also makes the phinodes cache global (which it is already - sort of, it gets cleaned after each early optimization but is kept after IPA opts). Bootstrapped on x86_64-unknown-linux-gnu. Richard. 2012-05-11 Richard Guenther rguent...@suse.de * tree-pass.h (pass_rest_of_compilation, pass_all_optimizations, pass_postreload, pass_all_early_optimizations): Remove. * passes.c (pass_all_optimizations, pass_postreload, pass_all_early_optimizations): Make static. (pass_rest_of_compilation): Likewise. Make it an RTL_PASS. * tree-phinodes.c (init_phinodes, fini_phinodes): Remove. * tree-ssa.c (init_tree_ssa): Do not call init_phinodes. (delete_tree_ssa): Do not call fini_phinodes. * tree-flow.h (init_phinodes, fini_phinodes): Remove. Index: gcc/tree-pass.h === *** gcc/tree-pass.h (revision 187396) --- gcc/tree-pass.h (working copy) *** extern struct gimple_opt_pass pass_tree_ *** 430,436 extern struct gimple_opt_pass pass_dse; extern struct gimple_opt_pass pass_nrv; extern struct gimple_opt_pass pass_rename_ssa_copies; - extern struct gimple_opt_pass pass_rest_of_compilation; extern struct gimple_opt_pass pass_sink_code; extern struct gimple_opt_pass pass_fre; extern struct gimple_opt_pass pass_check_data_deps; --- 430,435 *** extern struct simple_ipa_opt_pass pass_i *** 477,483 extern struct ipa_opt_pass_d pass_ipa_profile; extern struct ipa_opt_pass_d pass_ipa_cdtor_merge; - extern struct gimple_opt_pass pass_all_optimizations; extern struct gimple_opt_pass pass_cleanup_cfg_post_optimizing; extern struct gimple_opt_pass pass_init_datastructures; extern struct gimple_opt_pass pass_fixup_cfg; --- 476,481 *** extern struct rtl_opt_pass pass_sms; *** 535,541 extern struct rtl_opt_pass pass_sched; extern struct rtl_opt_pass pass_ira; extern struct rtl_opt_pass pass_reload; - extern struct rtl_opt_pass pass_postreload; extern struct rtl_opt_pass pass_clean_state; extern struct rtl_opt_pass pass_branch_prob; extern struct 
rtl_opt_pass pass_value_profile_transformations; --- 533,538 *** extern struct rtl_opt_pass pass_rtl_seqa *** 576,582 extern struct gimple_opt_pass pass_release_ssa_names; extern struct gimple_opt_pass pass_early_inline; extern struct gimple_opt_pass pass_inline_parameters; - extern struct gimple_opt_pass pass_all_early_optimizations; extern struct gimple_opt_pass pass_update_address_taken; extern struct gimple_opt_pass pass_convert_switch; --- 573,578 Index: gcc/passes.c === *** gcc/passes.c(revision 187396) --- gcc/passes.c(working copy) *** gate_all_early_optimizations (void) *** 334,340 !seen_error ()); } ! struct gimple_opt_pass pass_all_early_optimizations = { { GIMPLE_PASS, --- 334,340 !seen_error ()); } ! static struct gimple_opt_pass pass_all_early_optimizations = { { GIMPLE_PASS, *** gate_all_optimizations (void) *** 364,370 (!seen_error () || gimple_in_ssa_p (cfun))); } ! struct gimple_opt_pass pass_all_optimizations = { { GIMPLE_PASS, --- 364,370 (!seen_error () || gimple_in_ssa_p (cfun))); } ! static struct gimple_opt_pass pass_all_optimizations = { { GIMPLE_PASS, *** gate_rest_of_compilation (void) *** 391,400 return !(rtl_dump_and_exit || flag_syntax_only || seen_error ()); } ! struct gimple_opt_pass pass_rest_of_compilation = { { ! GIMPLE_PASS, *rest_of_compilation, /* name */ gate_rest_of_compilation, /* gate */ NULL, /* execute */ --- 391,400 return !(rtl_dump_and_exit || flag_syntax_only || seen_error ()); } ! static struct rtl_opt_pass pass_rest_of_compilation = { { ! RTL_PASS, *rest_of_compilation, /* name */ gate_rest_of_compilation, /* gate */ NULL, /* execute */ *** gate_postreload (void) *** 416,422 return reload_completed; } ! struct rtl_opt_pass pass_postreload = { { RTL_PASS, --- 416,422 return reload_completed; } ! 
static struct rtl_opt_pass pass_postreload = { { RTL_PASS, *** init_optimization_passes (void) *** 1377,1382 --- 1377,1383 p = all_late_ipa_passes; NEXT_PASS (pass_ipa_pta); *p = NULL; + /* These passes are run after IPA passes on every function that is being output to the assembler file. */ p = all_passes; Index: gcc/tree-phinodes.c
Re: [Patch, fortran] PR 52428 Reading of large negative values and range checking
PING #2! On Wed, May 2, 2012 at 10:22 PM, Janne Blomqvist blomqvist.ja...@gmail.com wrote: PING On Thu, Apr 26, 2012 at 12:08 AM, Janne Blomqvist blomqvist.ja...@gmail.com wrote: Hi, currently when -frange-check is enabled, we check for overflow when doing a formatted read of an integer value. This check, however, is against the Fortran numerical model (see 13.4 in F2008), which defines a symmetric interval [-huge(), huge()], whereas all targets gfortran supports use a two's complement representation with a range of [-huge()-1, huge()]. However, there is no checking against the numerical model when doing arithmetic, and thus we can generate and write the value -huge()-1, but we cannot read it back in! With the -fno-range-check option, this overflow checking can be disabled, but at the cost of disabling all overflow checking, which leads to reading nonsense values if the hardware supported range overflows. The attached patch changes this logic such that overflow checking against the hardware supported range [-huge()-1, huge()] is always done when reading, regardless of the -frange-check flag setting. This also seems to be what ifort 12.0 does, I haven't checked other compilers. For some more arguments back and forth, see the PR. Regtested on x86_64-unknown-linux-gnu, Ok for trunk? gcc/fortran ChangeLog: 2012-04-25 Janne Blomqvist j...@gcc.gnu.org PR fortran/52428 * gfortran.texi: Update _gfortran_set_options documentation. * invoke.texi: Remove runtime behavior description of -fno-range-check. * trans-decl.c (create_main_function): Don't pass the range-check setting to the library. libgfortran ChangeLog: 2012-04-25 Janne Blomqvist j...@gcc.gnu.org PR fortran/52428 * io/io.h (max_value): Rename to si_max, remove second argument. * io/list_read.c (convert_integer): Use unsigned types when parsing the digits, set max value depending on the sign. * io/read.c (max_value): Rename to si_max, remove second argument, simplify. 
(read_decimal): Set max value depending on sign, always check overflow. (read_radix): Calculate max unsigned value directly. * libgfortran.h (struct compile_options_t): Remove range_check field. * runtime/compile_options.c (set_options): Skip handling options[7]. (init_compile_options): Don't set removed field. gcc/testsuite ChangeLog: 2012-04-25 Janne Blomqvist j...@gcc.gnu.org PR fortran/52428 * gfortran.dg/int_range_io_1.f90: New test. -- Janne Blomqvist
Re: [PATCH] Take branch misprediction effects into account when RTL loop unrolling (issue6099055)
Ping? Teresa On Fri, May 4, 2012 at 3:41 PM, Teresa Johnson tejohn...@google.com wrote: On David's suggestion, I have removed the changes that rename niter_desc to loop_desc from this patch to focus the patch on the unrolling changes. I can submit a cleanup patch to do the renaming as soon as this one goes in. Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Teresa Here is the new description of improvements from the original patch: Improved patch based on feedback. Main changes are: 1) Improve efficiency by caching loop analysis results in the loop auxiliary info structure hanging off the loop structure. Added a new routine, analyze_loop_insns, to fill in information about the average and total number of branches, as well as whether there are any floating point set and call instructions in the loop. The new routine is invoked when we first create a loop's niter_desc struct, and the caller (get_simple_loop_desc) has been modified to handle creating a niter_desc for the fake outermost loop. 2) Improvements to max_unroll_with_branches: - Treat the fake outermost loop (the procedure body) as we would a hot outer loop, i.e. compute the max unroll looking at its nested branches, instead of shutting off unrolling when we reach the fake outermost loop. - Pull the checks previously done in the caller into the routine (e.g. whether the loop iterates frequently or contains fp instructions). - Fix a bug in the previous version that sometimes caused overflow in the new unroll factor. 3) Remove float variables, and use integer computation to compute the average number of branches in the loop. 4) Detect more types of floating point computations in the loop by walking all set instructions, not just single sets. 2012-05-04 Teresa Johnson tejohn...@google.com * doc/invoke.texi: Update the documentation with new params. * loop-unroll.c (max_unroll_with_branches): New function. 
(decide_unroll_constant_iterations, decide_unroll_runtime_iterations): Add heuristic to avoid increasing branch mispredicts when unrolling. (decide_peel_simple, decide_unroll_stupid): Retrieve number of branches from niter_desc instead of via function that walks loop. * loop-iv.c (get_simple_loop_desc): Invoke new analyze_loop_insns function, and add guards to enable this function to work for the outermost loop. * cfgloop.c (insn_has_fp_set, analyze_loop_insns): New functions. (num_loop_branches): Remove. * cfgloop.h (struct loop_desc): Added new fields to cache additional loop analysis information. (num_loop_branches): Remove. (analyze_loop_insns): Declare. * params.def (PARAM_MIN_ITER_UNROLL_WITH_BRANCHES): New param. (PARAM_UNROLL_OUTER_LOOP_BRANCH_BUDGET): Ditto. Index: doc/invoke.texi === --- doc/invoke.texi (revision 187013) +++ doc/invoke.texi (working copy) @@ -8842,6 +8842,12 @@ The maximum number of insns of an unswitched loop. @item max-unswitch-level The maximum number of branches unswitched in a single loop. +@item min-iter-unroll-with-branches +Minimum iteration count to ignore branch effects when unrolling. + +@item unroll-outer-loop-branch-budget +Maximum number of branches allowed in hot outer loop region after unroll. + @item lim-expensive The minimum cost of an expensive expression in the loop invariant motion. Index: loop-unroll.c === --- loop-unroll.c (revision 187013) +++ loop-unroll.c (working copy) @@ -152,6 +152,99 @@ static void combine_var_copies_in_loop_exit (struc basic_block); static rtx get_expansion (struct var_to_expand *); +/* Compute the maximum number of times LOOP can be unrolled without exceeding + a branch budget, which can increase branch mispredictions. The number of + branches is computed by weighting each branch with its expected execution + probability through the loop based on profile data. If no profile feedback + data exists, simply return the current NUNROLL factor. 
*/ + +static unsigned +max_unroll_with_branches(struct loop *loop, unsigned nunroll) +{ + struct loop *outer; + struct niter_desc *outer_desc = 0; + int outer_niters = 1; + int frequent_iteration_threshold; + unsigned branch_budget; + struct niter_desc *desc = get_simple_loop_desc (loop); + + /* Ignore loops with FP computation as these tend to benefit much more + consistently from unrolling. */ + if (desc->has_fp) + return nunroll; + + frequent_iteration_threshold = PARAM_VALUE (PARAM_MIN_ITER_UNROLL_WITH_BRANCHES); + if (expected_loop_iterations (loop) >= (unsigned) frequent_iteration_threshold) + return nunroll; + + /* If there was no profile
unwinding fallbacks for powerpc32 on aix 5.2 and 5.3
Hello, We have been using these for a few years now. I have just verified that they do allow exception propagation from signal for Ada on aix 5.3 with a recent mainline. Posting here in case that is of interest, even though it was only exercised on aix 5.2 and 5.3, for 32bit code generation (tested on 64bit 5.2 kernel nevertheless). Thanks in advance for your feedback, Cheers, Olivier 2012-05-11 Olivier Hainque hain...@adacore.com libgcc/ * config/rs6000/aix-unwind.h (ucontext_for): Helper for ... (ppc_aix_fallback_frame_state): New, implementation of ... (MD_FALLBACK_FRAME_STATE_FOR): Define. aix-unw.dif
Re: [PATCH] Remove TYPE_IS_SIZETYPE
On May 10, 2012, at 14:00 , Richard Guenther wrote: Of course the mere existence of DECL_OFFSET_ALIGN complicates matters for no good reasons (well, at least I did not find a good use of it until now ...). I remember an old discussion about it ... hmm ... http://gcc.gnu.org/ml/gcc/2006-10/msg00019.html JIC that could be of use. Olivier
[PATCH] Remove find_new_referenced_vars
This removes the renaming variant of find_referenced_vars_in. I'm working towards removing the mark_symbols_for_renaming hammer. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-05-11 Richard Guenther rguent...@suse.de * tree-flow.h (referenced_var_check_and_insert): Remove. (find_new_referenced_vars): Likewise. * tree-dfa.c (referenced_var_check_and_insert): Make static. (find_new_referenced_vars_1, find_new_referenced_vars): Remove. * tree-inline.c (copy_bb): Use find_referenced_vars_in instead of find_new_referenced_vars. * gimple-fold.c (gimplify_and_update_call_from_tree): Likewise. Index: gcc/tree-flow.h === *** gcc/tree-flow.h (revision 187401) --- gcc/tree-flow.h (working copy) *** typedef struct *** 332,338 (VAR) = next_referenced_var ((ITER))) extern tree referenced_var_lookup (struct function *, unsigned int); - extern bool referenced_var_check_and_insert (tree); #define num_referenced_vars htab_elements (gimple_referenced_vars (cfun)) #define num_ssa_names (VEC_length (tree, cfun-gimple_df-ssa_names)) --- 332,337 *** extern tree get_virtual_var (tree); *** 496,502 extern bool add_referenced_var (tree); extern void remove_referenced_var (tree); extern void mark_symbols_for_renaming (gimple); - extern void find_new_referenced_vars (gimple); extern tree make_rename_temp (tree, const char *); extern void set_default_def (tree, tree); extern tree gimple_default_def (struct function *, tree); --- 495,500 Index: gcc/tree-dfa.c === *** gcc/tree-dfa.c (revision 187401) --- gcc/tree-dfa.c (working copy) *** find_vars_r (tree *tp, int *walk_subtree *** 462,470 return NULL_TREE; } ! /* Find referenced variables in STMT. In contrast with !find_new_referenced_vars, this function will not mark newly found !variables for renaming. */ void find_referenced_vars_in (gimple stmt) --- 462,468 return NULL_TREE; } ! /* Find referenced variables in STMT. 
*/ void find_referenced_vars_in (gimple stmt) *** referenced_var_lookup (struct function * *** 505,511 /* Check if TO is in the referenced_vars hash table and insert it if not. Return true if it required insertion. */ ! bool referenced_var_check_and_insert (tree to) { tree h, *loc; --- 503,509 /* Check if TO is in the referenced_vars hash table and insert it if not. Return true if it required insertion. */ ! static bool referenced_var_check_and_insert (tree to) { tree h, *loc; *** mark_symbols_for_renaming (gimple stmt) *** 667,704 } - /* Find all variables within the gimplified statement that were not -previously visible to the function and add them to the referenced -variables list. */ - - static tree - find_new_referenced_vars_1 (tree *tp, int *walk_subtrees, - void *data ATTRIBUTE_UNUSED) - { - tree t = *tp; - - if (TREE_CODE (t) == VAR_DECL !var_ann (t)) - { - add_referenced_var (t); - mark_sym_for_renaming (t); - } - - if (IS_TYPE_OR_DECL_P (t)) - *walk_subtrees = 0; - - return NULL; - } - - - /* Find any new referenced variables in STMT. */ - - void - find_new_referenced_vars (gimple stmt) - { - walk_gimple_op (stmt, find_new_referenced_vars_1, NULL); - } - - /* If EXP is a handled component reference for a structure, return the base variable. The access range is delimited by bit positions *POFFSET and *POFFSET + *PMAX_SIZE. The access size is *PSIZE bits. If either --- 657,662 Index: gcc/tree-inline.c === *** gcc/tree-inline.c (revision 187401) --- gcc/tree-inline.c (working copy) *** copy_bb (copy_body_data *id, basic_block *** 1794,1800 ssa_op_iter i; tree def; ! find_new_referenced_vars (gsi_stmt (copy_gsi)); FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF) if (TREE_CODE (def) == SSA_NAME) SSA_NAME_DEF_STMT (def) = stmt; --- 1794,1800 ssa_op_iter i; tree def; ! 
find_referenced_vars_in (gsi_stmt (copy_gsi)); FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF) if (TREE_CODE (def) == SSA_NAME) SSA_NAME_DEF_STMT (def) = stmt; Index: gcc/gimple-fold.c === *** gcc/gimple-fold.c (revision 187401) --- gcc/gimple-fold.c (working copy) *** gimplify_and_update_call_from_tree (gimp *** 625,631 new_stmt = gsi_stmt (i); /* The replacement can
[Fortran, (RFC) patch] PR49110/51055 Assignment to alloc. deferred-length character vars
Dear all, the following patch fixes character(len=:), allocatable :: str str = repeat('X', n) Before the patch, the (re)allocation happened before the RHS was evaluated. Thus, the string length wasn't known! (Workaround: Add parentheses around the function call.) Note that the patch assumes that the function's result variable's length specification expression is completely known to the caller. I think that's always the case in gfortran - or is it not? I am not sure whether one can construct a valid program using INTERFACE where that's not the case. The answer is probably no, given: "If a type parameter of a function result or a bound of a function result array is not a constant expression, the exact dependence on the entities in the expression is a characteristic." (F2008, 12.3.3). -- Nevertheless, one has to think about whether there could be some issue. If there is, I have no idea how one could handle it instead. Thus, there had better be no issue! I am also not positive that len=: pointer results are properly handled, but probably they are. (Untested; character array results are also untested - and array constructors are known to be broken.) Comments? The attached patch builds and regtests. OK for the trunk? Tobias 2012-05-11 Tobias Burnus bur...@net-b.de PR fortran/49110 PR fortran/51055 * trans-expr.c (gfc_trans_assignment_1): Fix allocation handling for assignment of function results to allocatable deferred-length strings. 2012-05-11 Tobias Burnus bur...@net-b.de PR fortran/49110 PR fortran/51055 * gfortran.dg/deferred_type_param_3.f90: New. * gfortran.dg/deferred_type_param_4.f90: New.
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 8045b1f..8126697 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -7002,17 +7003,18 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, { tmp = gfc_deallocate_alloc_comp (expr2->ts.u.derived, rse.expr, 0); gfc_add_expr_to_block (&loop.post, tmp); } - /* For a deferred character length function, the function call must - happen before the (re)allocation of the lhs, otherwise the character - length of the result is not known. */ - def_clen_func = (((expr2->expr_type == EXPR_FUNCTION) - || (expr2->expr_type == EXPR_COMPCALL) - || (expr2->expr_type == EXPR_PPC)) - && expr2->ts.deferred); + /* When assigning a character function result to a deferred-length variable, + the function call must happen before the (re)allocation of the lhs - + otherwise the character length of the result is not known. + NOTE: This relies on having the exact dependence of the length type + parameter available to the caller; gfortran saves it in the .mod files. */ + def_clen_func = (expr2->expr_type == EXPR_FUNCTION + || expr2->expr_type == EXPR_COMPCALL + || expr2->expr_type == EXPR_PPC); if (gfc_option.flag_realloc_lhs && expr2->ts.type == BT_CHARACTER && (def_clen_func || expr2->expr_type == EXPR_OP) && expr1->ts.deferred) gfc_add_block_to_block (&block, &rse.pre); --- /dev/null 2012-05-11 07:40:56.923831265 +0200 +++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_3.f90 2012-05-11 12:18:46.0 +0200 @@ -0,0 +1,33 @@ +! { dg-do run } +! +! PR fortran/51055 +! PR fortran/49110 +! 
+ +subroutine test() + implicit none + integer :: i = 5 + character(len=:), allocatable :: s1 + call sub(s1, i) + if (len(s1) /= 5) call abort() + if (s1 /= "ZZZZZ") call abort() +contains + subroutine sub(str,j) +character(len=:), allocatable :: str +integer :: j +str = REPEAT("Z",j) +if (len(str) /= 5) call abort() +if (str /= "ZZZZZ") call abort() + end subroutine sub +end subroutine test + +program a + character(len=:),allocatable :: s + integer :: j=2 + s = repeat ('x', j) + if (len(repeat(' ',j)) /= 2) call abort() + if (repeat('y',j) /= "yy") call abort() + if (len(s) /= 2) call abort() + if (s /= "xx") call abort() + call test() +end program a --- /dev/null 2012-05-11 07:40:56.923831265 +0200 +++ gcc/gcc/testsuite/gfortran.dg/deferred_type_param_4.f90 2012-05-11 12:22:30.0 +0200 @@ -0,0 +1,33 @@ +! { dg-do run } +! +! PR fortran/51055 +! PR fortran/49110 +! +! +program test + implicit none + character(len=:), allocatable :: str + integer :: i + i = 5 + str = f() + call printIt () + i = 7 + str = repeat('X', i) + call printIt () +contains + function f() +character(len=i) :: f +f = '1234567890' + end function f + subroutine printIt +!print *, len(str) +!print '(3a)', '"',str,'"' +if (i == 5) then + if (str /= "12345" .or. len(str) /= 5) call abort () +else if (i == 7) then + if (str /= "XXXXXXX" .or. len(str) /= 7) call abort () +else + call abort () +end if + end subroutine +end
Re: patch for PR53125
On 05/11/2012 08:17 AM, Richard Earnshaw wrote: On 10/05/12 20:58, Vladimir Makarov wrote: The following patch is for PR53125. The PR is described on http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53125. The patch improves the compilation speed by 35% for the case. The patch was successfully bootstrapped on x86-64. Committed as rev. 187373. 2012-05-10 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/53125 * ira.c (ira): Call find_moveable_pseudos and move_unallocated_pseudos only if ira_conflicts_p is true. ENOPATCH? Sorry for omitting the patch. I added it to the attachment. Index: ira.c === --- ira.c (revision 187372) +++ ira.c (working copy) @@ -4125,7 +4125,12 @@ ira (FILE *f) } allocated_reg_info_size = max_reg_num (); - find_moveable_pseudos (); + + /* It is not worth to do such improvement when we use a simple + allocation because of -O0 usage or because the function is too + big. */ + if (ira_conflicts_p) +find_moveable_pseudos (); max_regno_before_ira = max_reg_num (); ira_setup_eliminable_regset (); @@ -4234,7 +4239,10 @@ ira (FILE *f) max_regno * sizeof (struct ira_spilled_reg_stack_slot)); } allocate_initial_values (reg_equivs); - move_unallocated_pseudos (); + + /* See comment for find_moveable_pseudos call. */ + if (ira_conflicts_p) +move_unallocated_pseudos (); } static void
Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin
You also don't mention how this patch was tested. I used the testsuite I developed some time ago to test all the Neon builtins, which I posted last year on the qemu mailing-list. With the current GCCs, this bug is the only remaining one I could detect. Fair enough. Alternatively it might be worth splitting the vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg) 0 ) followed by a subreg to subreg move which should end up having the same effect. That splitting would allow for better instruction scheduling. Are you aware of examples of similar cases I could use as a model? I would change the iterator from VQX to VQ in the pattern above (you can also simplify the setting of neon_type in that case as well as change that to be a vec_duplicate as below and get rid of any lingering definitions of UNSPEC_VLD1_DUP if they exist), define a separate pattern that expressed this as a define_insn_and_split as below. (define_insn_and_split "neon_vld1_dupv2di" [(set (match_operand:V2DI 0 "s_register_operand" "=w") (vec_duplicate:V2DI (match_operand:DI 1 "neon_struct_operand" "Um")))] "TARGET_NEON" "#" "&& reload_completed" [(const_int 0)] { rtx tmprtx = gen_lowpart (DImode, operands[0]); emit_insn (gen_neon_vld1_dupdi (tmprtx, operands[1])); emit_move_insn (gen_highpart (DImode, operands[0]), tmprtx ); DONE; } (set_attr "length" "8") (set_attr "neon_type" "fromearlierpattern") ) Do you want to try this and see what you get ? In addition it would be nice to have a testcase in gcc.target/arm . Well. Prior to sending my patch I did look at that directory, but I supposed that such a test ought to belong to the neon/ subdir where the tests are described as autogenerated. Any doc on how to do that? I'd rather have an extra regression test in gcc.target/arm that was a run time test. For e.g. take a look at gcc.target/arm/neon-vadds64.c . Ramana Thanks, Christophe.
Re: [RFC] PR 53063 encode group options in .opt files
On 10 May 2012 16:05, Joseph S. Myers jos...@codesourcery.com wrote: On Wed, 9 May 2012, Manuel López-Ibáñez wrote: 2012-05-09 Manuel López-Ibáñez m...@gcc.gnu.org PR 53063 gcc/ * doc/options.texi (EnabledBy): Document. * opts.c: Include opts.h and options.h before tm.h. (finish_options): Do not handle some sub-options here... (common_handle_option): ... instead call common_handle_option_auto here. * optc-gen.awk: Handle EnabledBy. * opth-gen.awk: Declare common_handle_option_auto. * common.opt (Wuninitialized): Use EnabledBy. Delete Init. (Wmaybe-uninitialized): Likewise. (Wunused-but-set-variable): Likewise. (Wunused-function): Likewise. (Wunused-label): Likewise. (Wunused-value): Likewise. (Wunused-variable): Likewise. * opt-read.awk: Create opt_numbers array. ada/ * gcc-interface/misc.c (gnat_parse_file): Move before ... (gnat_handle_option): ... this. Use handle_generated_option. c-family/ * c-opts.c (c_common_handle_option): Use handle_generated_option to enable sub-options. fortran/ * options.c: Include diagnostic.h instead of diagnostic-core.h. (set_Wall): Do not set warn_unused here. (gfc_handle_option): Set it here using handle_generated_option. OK with the "\ No newline at end of file" fixed (i.e. a newline added to the end of optc-gen.awk) and 2012 added to copyright dates of modified files if not already there. Great! Now we have EnabledBy for common options. Now, what should we do with language-specific settings? One idea could be to have something like: LangEnabledBy(Fortran Ada, Wall) and then auto-generate something like: ada_handle_option_auto(); fortran_handle_option_auto() which would be called by the *_handle_option() of each front-end. Does this sound reasonable? Cheers, Manuel.
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Fri, May 11, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, May 10, 2012 at 6:28 PM, Xinliang David Li davi...@google.com wrote: I like your suggestion and support the end goal you have. I don't like the -fopt-info behavior to interfere with regular -fdump-xxx options either. I think we should stage the changes in multiple steps as originally planned. Is Sharad's change good to be checked in for the first stage? Well - I don't think the change walks in the direction we want to go, so I don't see a good reason to make that change. Is your direction misunderstood or you want everything to be changed in one single patch (including replacement of all fprintf (dump_file, ...)? If the former, please clarify. If the latter, it can be done. thanks, David After this one is checked in, the new dump interfaces will be worked on (and to allow multiple streams). Most of the remaining changes will be massive text replacement. thanks, David On Thu, May 10, 2012 at 1:18 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote: Bummer. I was thinking to reserve '=' for selective dumping: -fdump-tree-pre=func_list_regexp I guess this can be achieved via @ -fdump-tree-pre@func_list -fdump-tree-pre=file_name@func_list Another issue -- I don't think the current precedence rule is correct. Consider that -fopt-info=2 will be mapped to -fdump-tree-all-transform-verbose2=stderr -fdump-rtl-all-transform-verbose2=stderr then the current precedence rule will cause surprise when the following is used -fopt-info -fdump-tree-pre The PRE dump will be emitted to stderr which is not what user wants. In short, special streams should be treated as 'weak' the same way as your previous implementation. Hm, this raises a similar concern I have with the -fvectorizer-verbose flag. With -fopt-info -fdump-tree-pre I do not want some information to be present only on stderr or in the dump file! 
I want it in _both_ places! (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less information :() Thus, the information where dumping goes has to be done differently (which is why I asked for some re-org originally, so that passes no longer explicitely reference dump_file - dump_file may be different for different kind of information it dumps!). Passes should, instead of fprintf (dump_file, ..., ...) do dump_printf (TDF_scev, ..., ...) thus, specify the kind of information they dump (would be mostly TDF_details vs. 0 today I guess). The dump_printf routine would then properly direct to one or more places to dump at. I realize this needs some more dispatchers for dumping expressions and statements (but it should not be too many). Dumping to dump_file would in any case dump to the passes private dump file only (unqualified stuff would never be useful for -fopt-info). The perfect candidate to convert to this kind of scheme is obviously the vectorizer with its existing -fvectorizer-verbose. If the patch doesn't work towards this kind of end-result I'd rather not have it. Thanks, Richard. thanks, David On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote: Thanks for your suggestions/comments. I have updated the patch and documentation. It supports the following usage: gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout -fdump-rtl-ira=ira.dump Here all tree dumps except the PRE are output into tree.dump, PRE dump goes to stdout and the IRA dump goes to ira.dump. Thanks, Sharad 2012-05-09 Sharad Singhai sing...@google.com * doc/invoke.texi: Add documentation for the new option. * tree-dump.c (dump_get_standard_stream): New function. (dump_files): Update for new field. (dump_switch_p_1): Handle dump filenames. (dump_begin): Likewise. (get_dump_file_name): Likewise. (dump_end): Remove attribute. (dump_enable_all): Add new parameter FILENAME. All callers updated. (enable_rtl_dump_file): * tree-pass.h (enum tree_dump_index): Add new constant. 
(struct dump_file_info): Add new field FILENAME. * testsuite/g++.dg/other/dump-filename-1.C: New test. Index: doc/invoke.texi === --- doc/invoke.texi (revision 187265) +++ doc/invoke.texi (working copy) @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio @item -d@var{letters} @itemx -fdump-rtl-@var{pass} +@itemx -fdump-rtl-@var{pass}=@var{filename} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}, and the files are -created in the
Re: PING: PATCH: Backport x32 support to libtool
On Mon, Apr 16, 2012 at 10:47 AM, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Apr 3, 2012 at 7:49 AM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Mar 29, 2012 at 7:34 AM, H.J. Lu hjl.to...@gmail.com wrote: On Sat, Mar 3, 2012 at 9:54 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch backports x32 support to libtool: http://git.savannah.gnu.org/cgit/libtool.git/commit/?id=88992fe6771ec3258bde1b03314ce579da0ac2d5 OK to install? Thanks. H.J. --- commit 0d8c092cac25c3bce5dbfc1981b84df91b3f6086 Author: H.J. Lu hjl.to...@gmail.com Date: Mon Dec 12 13:03:14 2011 -0800 Add x32 support to libtool.m4 2011-12-12 H.J. Lu hongjiu...@intel.com * libtool.m4 (_LT_ENABLE_LOCK): Support x32. diff --git a/ChangeLog.x32 b/ChangeLog.x32 new file mode 100644 index 000..b6e01ee --- /dev/null +++ b/ChangeLog.x32 @@ -0,0 +1,3 @@ +2011-12-12 H.J. Lu hongjiu...@intel.com + + * libtool.m4 (_LT_ENABLE_LOCK): Support x32. diff --git a/libtool.m4 b/libtool.m4 index 67321a7..a7f99ac 100644 --- a/libtool.m4 +++ b/libtool.m4 @@ -1232,7 +1232,14 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*) LD="${LD-ld} -m elf_i386_fbsd" ;; x86_64-*linux*) - LD="${LD-ld} -m elf_i386" + case `/usr/bin/file conftest.o` in + *x86-64*) + LD="${LD-ld} -m elf32_x86_64" + ;; + *) + LD="${LD-ld} -m elf_i386" + ;; + esac ;; ppc64-*linux*|powerpc64-*linux*) LD="${LD-ld} -m elf32ppclinux" Hi Ralf, Can you review this patch? Thanks. PING. PING. PING. -- H.J.
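The case statement in the hunk above keys off the output of `/usr/bin/file conftest.o`, but the underlying distinction is just two fields of the ELF header. The sketch below (illustrative, not libtool's actual code) shows the decision table: an object that is 32-bit (ELFCLASS32) yet targets the x86-64 machine is an x32 object and needs the elf32_x86_64 linker emulation.

```c
#include <assert.h>
#include <string.h>

/* Values from the ELF specification; defined here so the sketch is
   self-contained rather than pulling in <elf.h>.  */
#define EM_386     3
#define EM_X86_64  62
#define ELFCLASS32 1
#define ELFCLASS64 2

/* Pick the GNU ld "-m" emulation the way the libtool hunk does, but from
   the ELF header fields instead of parsing `file` output.  Illustrative
   helper, not part of libtool.  */
static const char *
linker_emulation (int ei_class, int e_machine)
{
  if (e_machine == EM_X86_64 && ei_class == ELFCLASS32)
    return "elf32_x86_64";   /* x32: 32-bit ABI on the x86-64 ISA */
  if (e_machine == EM_386)
    return "elf_i386";       /* classic 32-bit ia32 object */
  return "elf_x86_64";       /* plain 64-bit object */
}
```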
Re: PATCH: Add x32 support to boehm-gc
On Sun, Apr 29, 2012 at 10:27 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, This patch adds x32 support to boehm-gc. The same patch has been sent to the boehm-gc mailing list. Tested on Linux/x32 and Linux/x86-64. OK for trunk? Thanks. H.J. - Forwarded message from H.J. Lu hongjiu...@intel.com - Date: Mon, 16 Apr 2012 09:39:20 -0700 From: H.J. Lu hongjiu...@intel.com To: H.J. Lu hjl.to...@gmail.com Cc: hans_bo...@hp.com, g...@linux.hpl.hp.com Subject: [bdwgc] PATCH: Add x32 support User-Agent: Mutt/1.5.21 (2010-09-15) Hi, Here are 2 small patches to add x32 support to bdwgc. X32 info can be found at https://sites.google.com/site/x32abi/ They are fully tested on Linux/x32 and Linux/x86-64. Thanks. H.J. --- From 16ea9de35f16f0859c40862f8ef310c0dde6082c Mon Sep 17 00:00:00 2001 From: H.J. Lu hjl.to...@gmail.com Date: Mon, 16 Apr 2012 09:26:07 -0700 Subject: [PATCH 1/2] Define ALIGNMENT and CPP_WORDSZ for x32 * include/private/gcconfig.h: (ALIGNMENT): Set to 4 for x32. (CPP_WORDSZ): Set to 32 for x32. diff --git a/include/private/gcconfig.h b/include/private/gcconfig.h index a163e47..918d100 100644 --- a/include/private/gcconfig.h +++ b/include/private/gcconfig.h @@ -2117,8 +2117,13 @@ # ifdef X86_64 # define MACH_TYPE X86_64 -# define ALIGNMENT 8 -# define CPP_WORDSZ 64 +# ifdef __ILP32__ +# define ALIGNMENT 4 +# define CPP_WORDSZ 32 +# else +# define ALIGNMENT 8 +# define CPP_WORDSZ 64 +# endif # ifndef HBLKSIZE # define HBLKSIZE 4096 # endif -- 1.7.6.5 Hi Uros, This patch has been checked into upstream: https://github.com/ivmai/bdwgc/commit/936c1d5f7b8e8e91f7263bbff884a9d2377951f2 Is this OK for trunk? Thanks. -- H.J.
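The gcconfig.h hunk above works because ALIGNMENT and CPP_WORDSZ describe the ABI rather than the ISA: under x32 the compiler defines both `__x86_64__` and `__ILP32__`, pointers are 4 bytes, and both constants must drop to their 32-bit values. A compile-time sketch of that reasoning (not boehm-gc code; macro names are invented):

```c
#include <assert.h>
#include <stddef.h>

/* boehm-gc scans memory in pointer-sized, pointer-aligned words, so both
   constants follow sizeof (void *).  On x32 an x86-64 compiler defines
   __ILP32__ and pointers shrink to 4 bytes, which is exactly what the
   #ifdef __ILP32__ split in gcconfig.h encodes.  */

#if defined(__x86_64__) && defined(__ILP32__)
# define SKETCH_ALIGNMENT 4
# define SKETCH_WORDSZ    32
#else
# define SKETCH_ALIGNMENT ((int) sizeof (void *))
# define SKETCH_WORDSZ    ((int) sizeof (void *) * 8)
#endif

static int cpp_wordsz (void)   { return (int) sizeof (void *) * 8; }
static int gc_alignment (void) { return (int) sizeof (void *); }
```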
[Patch,AVR]: ad PR49868: assemble 3-byte symbols
Currently avr_assemble_integer emits an assembler warning to make gas complain about the missing feature http://sourceware.org/PR13503 if a 3-byte address must be assembled. As PR13503 is implemented now, avr-gcc can use this feature. It's only needed for the new __memx address space. Ok to install? The feature is new in gcc and binutils and only rarely used, IMHO. The current solution will throw a comprehensible warning but generate bad code; with the patch, old binutils will throw a not really helpful diagnostic but work with new binutils... Thus okay to backport? Or leave 4.7 as is? Johann PR target/46898 * config/avr/avr.c (avr_const_address_lo16): Remove. (avr_assemble_integer): Print .byte lo8(x), .byte hi8(x), .byte hh8(x) instead of emitting an assembler .warning if a 3-byte address is assembled. * doc/extend.texi (AVR Named Address Spaces): Document that binutils 2.23 is needed to assemble 3-byte addresses. Index: config/avr/avr.c === --- config/avr/avr.c (revision 187411) +++ config/avr/avr.c (working copy) @@ -6639,48 +6639,6 @@ _reg_unused_after (rtx insn, rtx reg) } -/* Return RTX that represents the lower 16 bits of a constant address. - Unfortunately, simplify_gen_subreg does not handle this case. */ - -static rtx -avr_const_address_lo16 (rtx x) -{ - rtx lo16; - - switch (GET_CODE (x)) -{ -default: - break; - -case CONST: - if (PLUS == GET_CODE (XEXP (x, 0)) - && SYMBOL_REF == GET_CODE (XEXP (XEXP (x, 0), 0)) - && CONST_INT_P (XEXP (XEXP (x, 0), 1))) -{ - HOST_WIDE_INT offset = INTVAL (XEXP (XEXP (x, 0), 1)); - const char *name = XSTR (XEXP (XEXP (x, 0), 0), 0); - - lo16 = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name)); - lo16 = gen_rtx_CONST (Pmode, plus_constant (Pmode, lo16, offset)); - - return lo16; -} - - break; - -case SYMBOL_REF: - { -const char *name = XSTR (x, 0); - -return gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name)); - } -} - - avr_edump ("\n%?: %r\n", x); - gcc_unreachable(); -} - - /* Target hook for assembling integer objects.
The AVR version needs special handling for references to certain labels. */ @@ -6688,7 +6646,7 @@ static bool avr_assemble_integer (rtx x, unsigned int size, int aligned_p) { if (size == POINTER_SIZE / BITS_PER_UNIT && aligned_p - && text_segment_operand (x, VOIDmode) ) + && text_segment_operand (x, VOIDmode)) { fputs ("\t.word\tgs(", asm_out_file); output_addr_const (asm_out_file, x); @@ -6698,15 +6656,17 @@ avr_assemble_integer (rtx x, unsigned in } else if (GET_MODE (x) == PSImode) { - default_assemble_integer (avr_const_address_lo16 (x), -GET_MODE_SIZE (HImode), aligned_p); + /* This needs binutils 2.23+, see PR binutils/13503 */ + + fputs ("\t.byte\tlo8(", asm_out_file); + output_addr_const (asm_out_file, x); + fputs (")\n", asm_out_file); - fputs ("\t.warning\t\"assembling 24-bit address needs binutils" - " extension for hh8(", asm_out_file); + fputs ("\t.byte\thi8(", asm_out_file); output_addr_const (asm_out_file, x); - fputs (")\"\n", asm_out_file); + fputs (")\n", asm_out_file); - fputs ("\t.byte\t0\t" ASM_COMMENT_START " hh8(", asm_out_file); + fputs ("\t.byte\thh8(", asm_out_file); output_addr_const (asm_out_file, x); fputs (")\n", asm_out_file); Index: doc/extend.texi === --- doc/extend.texi (revision 187411) +++ doc/extend.texi (working copy) @@ -1273,6 +1273,7 @@ If the high bit of the address is set, d RAM using the lower two bytes as RAM address. If the high bit of the address is clear, data is read from flash with @code{RAMPZ} set according to the high byte of the address. +@xref{AVR Built-in Functions,,@code{__builtin_avr_flash_segment}}. Objects in this address space will be located in @code{.progmem.data}. @end table @@ -1302,6 +1303,7 @@ int main (void) @} @end example +@noindent For each named address space supported by avr-gcc there is an equally named but uppercase built-in macro defined.
The purpose is to facilitate testing if respective address space @@ -1327,7 +1329,8 @@ int read_var (void) #endif /* __FLASH */ @end example -Notice that attribute @ref{AVR Variable Attributes,@code{progmem}} +@noindent +Notice that attribute @ref{AVR Variable Attributes,,@code{progmem}} locates data in flash but accesses to these data will read from generic address space, i.e.@: from RAM, @@ -1335,6 +1338,7 @@ so that you need special accessors like from @w{@uref{http://nongnu.org/avr-libc/user-manual,AVR-LibC}} together with attribute @code{progmem}. +@noindent @b{Limitations and
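What the new avr_assemble_integer output above means: a PSImode (24-bit) address is emitted as three `.byte` directives whose lo8/hi8/hh8 operators select the low, middle and high byte of the symbol value once the linker knows it. The helpers below only mimic those binutils relocation operators for illustration; they are not gas itself.

```c
#include <assert.h>
#include <stdint.h>

/* Byte selection performed by the gas relocation operators used in
   ".byte lo8(x) / hi8(x) / hh8(x)"; AVR is little-endian, so the three
   directives store the 24-bit address low byte first.  */
static uint8_t lo8 (uint32_t addr) { return addr & 0xff; }
static uint8_t hi8 (uint32_t addr) { return (addr >> 8) & 0xff; }
static uint8_t hh8 (uint32_t addr) { return (addr >> 16) & 0xff; }
```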
Re: [RFC] PR 53063 encode group options in .opt files
On Fri, 11 May 2012, Manuel López-Ibáñez wrote: Great! Now we have EnabledBy for common options. Now, what should we do with language specific settings? One idea could be to have something like: LangEnabledBy(Fortran Ada, Wall) and then auto-generate something like: ada_handle_option_auto(); fortran_handle_option_auto() which would be called by the *_handle_option() of each front-end. Does this sound reasonable? What cases do we have where a language-independent option enables another language-independent option only for some front ends? That's the only case that should need a language-dependent generated function here. -- Joseph S. Myers jos...@codesourcery.com
Re: [RFC] PR 53063 encode group options in .opt files
On 11 May 2012 19:04, Joseph S. Myers jos...@codesourcery.com wrote: On Fri, 11 May 2012, Manuel López-Ibáñez wrote: Great! Now we have EnabledBy for common options. Now, what should we do with language specific settings? One idea could be to have something like: LangEnabledBy(Fortran Ada, Wall) and then auto-generate something like: ada_handle_option_auto(); fortran_handle_option_auto() which would be called by the *_handle_option() of each front-end. Does this sound reasonable? What cases do we have where a language-independent option enables another language-independent option only for some front ends? That's the only case that should need a language-dependent generated function here. Wall enables Wunused in Ada Fortran and C family. Cheers, Manuel.
Re: [RFC] PR 53063 encode group options in .opt files
On 11 May 2012 19:09, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 11 May 2012 19:04, Joseph S. Myers jos...@codesourcery.com wrote: On Fri, 11 May 2012, Manuel López-Ibáñez wrote: Great! Now we have EnabledBy for common options. Now, what should we do with language specific settings? One idea could be to have something like: LangEnabledBy(Fortran Ada, Wall) and then auto-generate something like: ada_handle_option_auto(); fortran_handle_option_auto() which would be called by the *_handle_option() of each front-end. Does this sound reasonable? What cases do we have where a language-independent option enables another language-independent option only for some front ends? That's the only case that should need a language-dependent generated function here. Wall enables Wunused in Ada Fortran and C family. But the case of lang-indep enables lang-dependent is far more common: Wextra enables some C-only options. Cheers, Manuel.
Re: [Fortran] Patch ping
On 18 April 2012 at 18:57, Bernhard Reutner-Fischer wrote: On Tue, Apr 17, 2012 at 12:47:48AM +0200, Tobias Burnus wrote: Approved but not yet committed: Bernhard: - [PATCH] gfortran testsuite: implicitly cleanup-modules, part 2 http://gcc.gnu.org/ml/fortran/2012-04/msg00065.html Before actually pushing this, I meant to ask if we *want* to make sure that we do not add superfluous cleanup-module calls in the future (which would slow down testing needlessly)? If so, we would either have to manually reject occurrences of those during patch review or install a warning or the like if there is an explicit cleanup-modules call yielding the same set as the now automatically determined set. I would go for the manual method: As cleanup-modules is something which developers tend to forget, I do not think that many patches will include them. One then simply tries to reduce those by patch review. - If patch developers do not see it in other files, the chance is high that they do not even know (or remember) about that feature in a few months. And after some time (1/2 year, 1 year?), one can check whether a spurious cleanup-modules has slipped in - or whether some cleanup-modules call is missing. I expect that there will be none or very, very few cases. Tobias
Re: PATCH: Add x32 support to boehm-gc
On Fri, May 11, 2012 at 6:55 PM, H.J. Lu hjl.to...@gmail.com wrote: This patch adds x32 support to boehm-gc. The same patch has been sent to the boehm-gc mailing list. Tested on Linux/x32 and Linux/x86-64. OK for trunk? This patch has been checked into upstream: https://github.com/ivmai/bdwgc/commit/936c1d5f7b8e8e91f7263bbff884a9d2377951f2 Is this OK for trunk? OK, we sync with upstream. Thanks, Uros.
Re: [PATCH] Add option for dumping to stderr (issue6190057)
To be more specific, does the following match what you envisioned? 1) When multiple streams are specified for dumping, the information will be dumped to all streams IF the new dumping interfaces are used (see below). For legacy code, the default dump file will still be used and the user-specified file (including stderr) will be silently ignored. This is of course temporary until all dumping sites are converted. 2) Define the following new dumping interfaces which will honor all streams specified: dump_printf (const char*, ...); // for raw string dumping dump_generic_expr (...); // wrapper to print_generic_expr without taking FILE* dump_node (...); // wrapper to print_node dump_gimple_stmt(...); // wrapper to print_gimple_stmt dump_inform (...); // wrapper to the inform (..) method, but is aware of the dumping streams and modifies global_dc appropriately. 3) Implement -fopt-info=N, and take -ftree-vectorizer-verbose as the first guinea pig to use the new dumping interfaces. After the infrastructure is ready, gradually deprecate the use of the original dumper interfaces. What do you think? thanks, David On Fri, May 11, 2012 at 9:06 AM, Xinliang David Li davi...@google.com wrote: On Fri, May 11, 2012 at 1:49 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, May 10, 2012 at 6:28 PM, Xinliang David Li davi...@google.com wrote: I like your suggestion and support the end goal you have. I don't like the -fopt-info behavior to interfere with regular -fdump-xxx options either. I think we should stage the changes in multiple steps as originally planned. Is Sharad's change good to be checked in for the first stage? Well - I don't think the change walks in the direction we want to go, so I don't see a good reason to make that change. Is your direction misunderstood, or do you want everything to be changed in one single patch (including replacement of all fprintf (dump_file, ...))? If the former, please clarify. If the latter, it can be done.
thanks, David After this one is checked in, the new dump interfaces will be worked on (and to allow multiple streams). Most of the remaining changes will be massive text replacement. thanks, David On Thu, May 10, 2012 at 1:18 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote: Bummer. I was thinking to reserve '=' for selective dumping: -fdump-tree-pre=func_list_regexp I guess this can be achieved via @ -fdump-tree-pre@func_list -fdump-tree-pre=file_name@func_list Another issue -- I don't think the current precedence rule is correct. Consider that -fopt-info=2 will be mapped to -fdump-tree-all-transform-verbose2=stderr -fdump-rtl-all-transform-verbose2=stderr then the current precedence rule will cause surprise when the following is used -fopt-info -fdump-tree-pre The PRE dump will be emitted to stderr, which is not what the user wants. In short, special streams should be treated as 'weak' the same way as in your previous implementation. Hm, this raises a similar concern I have with the -fvectorizer-verbose flag. With -fopt-info -fdump-tree-pre I do not want some information to be present only on stderr or in the dump file! I want it in _both_ places! (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less information :() Thus, the information where dumping goes has to be done differently (which is why I asked for some re-org originally, so that passes no longer explicitly reference dump_file - dump_file may be different for different kinds of information it dumps!). Passes should, instead of fprintf (dump_file, ..., ...) do dump_printf (TDF_scev, ..., ...) thus, specify the kind of information they dump (would be mostly TDF_details vs. 0 today I guess). The dump_printf routine would then properly direct to one or more places to dump at. I realize this needs some more dispatchers for dumping expressions and statements (but it should not be too many).
Dumping to dump_file would in any case dump to the passes private dump file only (unqualified stuff would never be useful for -fopt-info). The perfect candidate to convert to this kind of scheme is obviously the vectorizer with its existing -fvectorizer-verbose. If the patch doesn't work towards this kind of end-result I'd rather not have it. Thanks, Richard. thanks, David On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote: Thanks for your suggestions/comments. I have updated the patch and documentation. It supports the following usage: gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout -fdump-rtl-ira=ira.dump Here all tree dumps except the PRE are output into tree.dump, PRE dump goes to stdout and the IRA dump goes to ira.dump. Thanks, Sharad 2012-05-09 Sharad Singhai sing...@google.com * doc/invoke.texi: Add documentation for the new option. *
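The scheme Richard asks for, reduced to a sketch: a pass tags each message with a kind, and a single dispatcher fans the text out to every sink whose filter matches, so -fopt-info and a -fdump-tree-* file can both receive it. All names here are invented for illustration; the interface GCC eventually grew in dumpfile.c differs in detail.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define MY_TDF_DETAILS  0x1   /* -fdump-...-details style output    */
#define MY_TDF_OPTINFO  0x2   /* messages interesting to -fopt-info */

struct dump_sink { FILE *stream; int mask; };

static struct dump_sink sinks[4];
static int n_sinks;

/* Instead of fprintf (dump_file, ...), a pass states the kind of
   information it is dumping; the dispatcher decides which of the
   registered streams receive it.  */
static void
dump_printf_kind (int kind, const char *fmt, ...)
{
  for (int i = 0; i < n_sinks; i++)
    if (sinks[i].mask & kind)
      {
        va_list ap;
        va_start (ap, fmt);
        vfprintf (sinks[i].stream, fmt, ap);
        va_end (ap);
      }
}
```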
PATCH: PR target/53315: simple xtest program generates ICE
Hi, This patch uses + in constraint and match_dup in xbegin_1. OK for trunk? Andi, can you provide a run-time testcase patch with proper cpuid check? Thanks. H.J. --- 2012-05-11 Andrew Pinski apin...@cavium.com H.J. Lu hongjiu...@intel.com PR target/53315 * config/i386/i386.md (xbegin_1): Use + in constraint and match_dup. (xbegin): Updated. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c825579..b64d1d3 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -18361,9 +18361,9 @@ { rtx label = gen_label_rtx (); - operands[1] = force_reg (SImode, constm1_rtx); + operands[0] = force_reg (SImode, constm1_rtx); - emit_jump_insn (gen_xbegin_1 (operands[0], operands[1], label)); + emit_jump_insn (gen_xbegin_1 (operands[0], label)); emit_label (label); LABEL_NUSES (label) = 1; @@ -18375,13 +18375,12 @@ [(set (pc) (if_then_else (ne (unspec [(const_int 0)] UNSPEC_XBEGIN_ABORT) (const_int 0)) - (label_ref (match_operand 2)) + (label_ref (match_operand 1)) (pc))) - (set (match_operand:SI 0 "register_operand" "=a") - (unspec_volatile:SI [(match_operand:SI 1 "register_operand" "0")] - UNSPECV_XBEGIN))] + (set (match_operand:SI 0 "register_operand" "+a") + (unspec_volatile:SI [(match_dup 0)] UNSPECV_XBEGIN))] TARGET_RTM - "xbegin\t%l2" + "xbegin\t%l1" [(set_attr "type" "other") (set_attr "length" "6")])
Re: [patch] support for multiarch systems
On 11.05.2012 12:51, Paolo Bonzini wrote: Il 11/05/2012 07:13, Matthias Klose ha scritto: ok, I did clarify it in the existing documentation of MULTIARCH_DIRNAME in fragments.texi, detailing the search order for the files. Should the search order be mentioned in some user documentation as well? if yes, where? Thanks! I don't think the search order should be mentioned specially, since anyway *_INCLUDE_PATH and LIBRARY_PATH are mentioned. However under MULTILIB_DIRNAMES I would add something like this: @code{MULTILIB_DIRNAMES} describes the multilib directories using GCC conventions and is applied to directories that are part of the GCC installation. When multilib-enabled, the compiler will add a subdirectory of the form @var{prefix}/@var{multilib} before each directory in the search path for libraries and crt files. done. +@findex MULTILIB_OSDIRNAMES +@item MULTILIB_OSDIRNAMES +If @code{MULTILIB_OPTIONS} is used, this variable specifies ... a list of subdirectory names, that are used to modify the search path depending on the chosen multilib. Unlike @code{MULTILIB_DIRNAMES}, @code{MULTILIB_OSDIRNAMES} describes the multilib directories using operating systems conventions, and is applied to the directories such as @code{lib} or those in the @env{LIBRARY_PATH} environment variable. The format is either the same as of +@code{MULTILIB_DIRNAMES}, or a set of mappings. When it is the same +as @code{MULTILIB_DIRNAMES}, it describes the multilib directories +using OS conventions, rather than GCC conventions. s/OS/operating system/ done. When it is a set +of mappings of the form @var{gccdir}=@var{osdir}, the left side gives +the GCC convention and the right gives the equivalent OS defined +location. If the @var{osdir} part begins with a @samp{!}, ... GCC will not search in the non-multilib directory and use exclusively the multilib directory. 
Otherwise, the compiler will examine the search path for libraries and crt files twice; the first time it will add @var{multilib} to each directory in the search path, the second it will not. Use the mapping when there is +no one-to-one equivalence between GCC levels and the OS. I'm not sure what you mean here? The whole paragraph was taken from the genmultilib script. I left it out for this version. I think I'll update the genmultilib script when these docs are settled. +For multilib enabled configurations (see @code{MULTIARCH_DIRNAME}) +below), the multilib name is appended to each directory name, separated +by a colon (e.g. @samp{../lib:x86_64-linux-gnu}). For configurations that support both multilib and multiarch, @code{MULTILIB_OSDIRNAMES} also encodes the multiarch name, thus subsuming @code{MULTIARCH_DIRNAME}. The multiarch name is appended to each directory name, separated by a colon (e.g. @samp{../lib32:i386-linux-gnu}). Each multiarch subdirectory will be searched before the corresponding OS multilib directory, for example @samp{/lib/i386-linux-gnu} before @samp{/lib/..lib32}. The multiarch name will also be used to modify the system header search path, as explained for @code{MULTIARCH_DIRNAME}. done. +@findex MULTIARCH_DIRNAME +@item MULTIARCH_DIRNAME +If @code{MULTIARCH_DIRNAME} is used, this variable specifies the +multiarch name for this configuration. +Libraries and crt files are searched first in +@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix} +added by @code{add_prefix} or @code{add_sysrooted_prefix}. +System header files are searched first in +@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before +@code{LOCAL_INCLUDE_DIR}, and in +@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before +@code{NATIVE_SYSTEM_HEADER_DIR}. +@code{MULTIARCH_DIRNAME} is not used for multilib enabled +configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead. 
This sounds simpler, and doesn't refer to GCC implementation details such add_prefix/add_sysrooted_prefix: This variable specifies the multiarch name for configurations that are multiarch-enabled but not multilibbed configurations. The multiarch name is used to augment the search path for libraries, crt files and system header files with additional locations. The compiler will add a multiarch subdirectory of the form @var{prefix}/@var{multiarch} before each directory in the library and crt search path. It will also add two directories @code{LOCAL_INCLUDE_DIR}/@var{multiarch} and @code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch}) to the system header search path, respectively before @code{LOCAL_INCLUDE_DIR} and @code{NATIVE_SYSTEM_HEADER_DIR}. @code{MULTIARCH_DIRNAME} is not used for configurations that support both multilib and multiarch. In that case, multiarch names are encoded in @code{MULTILIB_OSDIRNAMES} instead. done. thanks for the wording. +The multiarch tuples are defined +in @uref{http://wiki.debian.org/Multiarch/Tuples}. Is this needed? Aren't they defined
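The search order being documented above (the multiarch subdirectory before each plain prefix directory) can be made concrete. The helper below merely materialises that ordering for a list of prefixes; it is an illustration of the documented behaviour, not GCC's add_prefix machinery.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* For a multiarch (non-multilib) configuration, every library/crt search
   prefix P is preceded by P/<multiarch>.  Writes the ordered list into OUT
   and returns the number of entries written.  */
static int
multiarch_search_path (const char **prefixes, int n, const char *multiarch,
                       char out[][128], int max_out)
{
  int k = 0;
  for (int i = 0; i < n && k + 1 < max_out; i++)
    {
      snprintf (out[k++], 128, "%s/%s", prefixes[i], multiarch);
      snprintf (out[k++], 128, "%s", prefixes[i]);
    }
  return k;
}
```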
Re: patch for PR53125
2012-05-10 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/53125 * ira.c (ira): Call find_moveable_pseudos and move_unallocated_pseudos if only ira_conflicts_p is true. And the attached patch fixes the reginfo slowness. The reginfo pass was doing: for each bb: for each insn in bb, from BB_END(bb) to BB_HEAD(bb): for each reg that died between BB_END(bb) and insn: // i.e. register is live at insn REG_LIVE_LENGTH(reg)++ With very large basic blocks, the kind of code in the test case of the PR results in many live registers on SPARC (for x86_64 there are far fewer live registers). For SPARC, there are O(1e5) insns and O(1e4) registers live for most of the basic block. That is effectively almost quadratic behavior in the number of insns per basic block. But the above loop is a bit silly. It is much easier and computationally cheaper to just remember at what insn reg died (last used), and add to REG_LIVE_LENGTH the distance from the insn that sets reg to the insn that used reg. It turns out that (before or after the patch) partial or conditional sets never kill a register, so that REG_LIVE_LENGTH for registers set by partial or conditional stores is not very accurate. I have a few ideas for how to improve that situation a bit, but I'm not sure if it's worth the trouble. I also think that the computation of REG_FREQ and REG_FREQ_CALLS_CROSSED has been wrong for some time. The maximum REG_FREQ is REG_FREQ_MAX (see regs.h) but regstat.c just accumulates REG_FREQ_FROM_BB. I've added a few lines to limit the REG_FREQ to REG_FREQ_MAX, e.g.: REG_FREQ (dregno) += REG_FREQ_FROM_BB (bb); + REG_FREQ (dregno) = + MIN (REG_FREQ (dregno), REG_FREQ_MAX); With this patch, the reginfo pass disappears from the time report for the test case attached to PR53125, compiled at -O0. Bootstrapped and tested on x86_64-unknown-linux-gnu, and I verified for a few simple test cases that the computed REG_LIVE_LENGTHs are unchanged. OK for trunk? Ciao! Steven
Re: patch for PR53125
Now with patch On Fri, May 11, 2012 at 8:42 PM, Steven Bosscher stevenb@gmail.com wrote: 2012-05-10 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/53125 * ira.c (ira): Call find_moveable_pseudos and move_unallocated_pseudos if only ira_conflicts_p is true. And the attached patch fixes the reginfo slowness. The reginfo pass was doing: for each bb: for each insn in bb, from BB_END(bb) to BB_HEAD(bb): for each reg that died between BB_END(bb) and insn: // i.e. register is live at insn REG_LIVE_LENGTH(reg)++ With very large basic blocks, the kind of code in the test case of the PR results in many live registers on SPARC (for x86_64 there are far fewer live registers). For SPARC, there are O(1e5) insns and O(1e4) registers live for most of the basic block. That is effectively almost quadratic behavior in the number of insns per basic block. But the above loop is a bit silly. It is much easier and computationally cheaper to just remember at what insn reg died (last used), and add to REG_LIVE_LENGTH the distance from the insn that sets reg to the insn that used reg. It turns out that (before or after the patch) partial or conditional sets never kill a register, so that REG_LIVE_LENGTH for registers set by partial or conditional stores is not very accurate. I have a few ideas for how to improve that situation a bit, but I'm not sure if it's worth the trouble. I also think that the computation of REG_FREQ and REG_FREQ_CALLS_CROSSED has been wrong for some time. The maximum REG_FREQ is REG_FREQ_MAX (see regs.h) but regstat.c just accumulates REG_FREQ_FROM_BB. I've added a few lines to limit the REG_FREQ to REG_FREQ_MAX, e.g.: REG_FREQ (dregno) += REG_FREQ_FROM_BB (bb); + REG_FREQ (dregno) = + MIN (REG_FREQ (dregno), REG_FREQ_MAX); With this patch, the reginfo pass disappears from the time report for the test case attached to PR53125, compiled at -O0.
Bootstrapped and tested on x86_64-unknown-linux-gnu, and I verified for a few simple test cases that the computed REG_LIVE_LENGTHs are unchanged. OK for trunk? Ciao! Steven regstat_smarter_local_live_v2.diff Description: Binary data
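The two algorithms described above, on a toy block representation (one optional reg set and up to two uses per insn; real GCC works on RTL). The old reginfo loop bumps the live length of every live reg at every insn; the replacement records each reg's death point and adds the whole set-to-death distance in one step. Illustrative code, not the patch itself.

```c
#include <assert.h>
#include <string.h>

#define NREGS  4
#define NINSNS 6

struct insn { int sets; int uses[2]; int n_uses; };  /* sets == -1: no set */

/* Old scheme: walk backwards, bump REG_LIVE_LENGTH of every live reg at
   every insn -- O(insns * live regs).  */
static void
live_length_old (const struct insn *bb, int n, int len[NREGS])
{
  int live[NREGS] = { 0 };
  memset (len, 0, NREGS * sizeof (int));
  for (int i = n - 1; i >= 0; i--)
    {
      for (int u = 0; u < bb[i].n_uses; u++)
        live[bb[i].uses[u]] = 1;          /* first use seen backwards = death */
      for (int r = 0; r < NREGS; r++)     /* the quadratic inner loop */
        len[r] += live[r];
      if (bb[i].sets >= 0)
        live[bb[i].sets] = 0;
    }
}

/* New scheme: remember where each reg died and add the whole distance when
   its set is reached.  */
static void
live_length_new (const struct insn *bb, int n, int len[NREGS])
{
  int death[NREGS];
  for (int r = 0; r < NREGS; r++) death[r] = -1;
  memset (len, 0, NREGS * sizeof (int));
  for (int i = n - 1; i >= 0; i--)
    {
      for (int u = 0; u < bb[i].n_uses; u++)
        if (death[bb[i].uses[u]] < 0)
          death[bb[i].uses[u]] = i;
      if (bb[i].sets >= 0 && death[bb[i].sets] >= 0)
        {
          len[bb[i].sets] += death[bb[i].sets] - i + 1;
          death[bb[i].sets] = -1;
        }
    }
  for (int r = 0; r < NREGS; r++)   /* regs still live at the block head */
    if (death[r] >= 0)
      len[r] += death[r] + 1;
}
```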
Re: [google/gcc-4_6] Port arm hardfp patch from Linaro (issue 6206055)
Please add ChangeLog and the trunk patch revision. http://codereview.appspot.com/6206055/
Re: [patch, fortran] PR fortran/52537 Optimize string comparisons against empty strings
Hi Tobias, Hello Thomas, below a very timely review - your patch is not even a month old and was never pinged, besides, you have chosen an unlucky day. (In other words: Sorry for the slow review.) As a matter of fact, I had become 3/4 convinced that I had already committed this patch, which is why I didn't ping it. So, thanks for remembering! Thomas Koenig wrote on Fri, 13 Apr 2012: this patch replaces a != '' with len_trim(a) != 0, to speed up the comparison. I wonder how much it helps - especially for the real world code. Let's see whether the bug reporter will report back. We'll see. Can you also check kind=4 string in the test case? Added (in the second revision of the test case, I also had a flag wrong). Otherwise, the patch is OK. The whitespace fixes you suggested have been applied. Thanks for the review! Thomas
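Why the transformation above is semantics-preserving: Fortran string comparison blank-pads the shorter operand, so a character variable equals '' exactly when it contains only blanks, i.e. when LEN_TRIM of it is 0, and LEN_TRIM only has to scan back to the last non-blank. Toy C helpers (not gfortran's runtime) making the equivalence explicit:

```c
#include <assert.h>
#include <stddef.h>

/* LEN_TRIM: length of S (of declared length LEN) ignoring trailing blanks.  */
static size_t
len_trim (const char *s, size_t len)
{
  while (len > 0 && s[len - 1] == ' ')
    len--;
  return len;
}

/* Fortran-style comparison of S against the empty string '': the empty
   operand is blank-padded to LEN, so equality means "all blanks".  */
static int
equals_empty (const char *s, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (s[i] != ' ')
      return 0;
  return 1;
}
```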
Re: [RFC] PR 53063 encode group options in .opt files
On Fri, 11 May 2012, Manuel López-Ibáñez wrote: What cases do we have where a language-independent option enables another language-independent option only for some front ends? That's the only case that should need a language-dependent generated function here. Wall enables Wunused in Ada Fortran and C family. Is there a reason for it not to do so for all languages? You only really need a front-end-specific generated function if there's something about the semantics that needs to be different in different front ends. But the case of lang-indep enables lang-dependent is far more common: Wextra enables some C-only options. That shouldn't be a problem. You can mark those options as enabled by -Wextra; at most you might need to ensure that enabling a wrong-language option this way doesn't generate the usual warning you get for using a wrong-language option on the command line. There's nothing wrong with having separate autogenerated functions for each language if you want to split things out that way, but it would seem simpler just to have one function called somewhere central, whatever option it is (common or not) that is enabling other options. -- Joseph S. Myers jos...@codesourcery.com
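The mechanism being discussed in this thread, reduced to a sketch: each option record names the option that implies it plus a language mask, and the implied value is applied only when the user did not set the option explicitly. Field and function names are invented for illustration; the optc-gen code GCC generates differs.

```c
#include <assert.h>

enum { CL_C = 1, CL_Fortran = 2, CL_Ada = 4 };
enum { OPT_Wall, OPT_Wunused, N_OPTS };

struct opt
{
  int value;
  int explicit_set;   /* user passed -W[no-]... on the command line */
  int enabled_by;     /* index of the enabling option, or -1        */
  int lang_mask;      /* languages the implication applies to       */
};

static struct opt opts[N_OPTS] = {
  [OPT_Wall]    = { 0, 0, -1, CL_C | CL_Fortran | CL_Ada },
  [OPT_Wunused] = { 0, 0, OPT_Wall, CL_C | CL_Fortran | CL_Ada },
};

/* Apply EnabledBy-style implications for the given language: an implied
   option is switched on only if its enabler is on and the user did not
   decide it explicitly.  */
static void
apply_implications (int lang)
{
  for (int i = 0; i < N_OPTS; i++)
    if (opts[i].enabled_by >= 0
        && !opts[i].explicit_set
        && (opts[i].lang_mask & lang)
        && opts[opts[i].enabled_by].value)
      opts[i].value = 1;
}
```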
Re: [v3] fix libstdc++/53263
Attached patch applied to trunk. 2012-05-11 François Dumont fdum...@gcc.gnu.org PR libstdc++/53263 * include/debug/safe_iterator.h (__gnu_debug::__base): Move... * include/debug/functions.h: ... Here. Add debug function overloads to perform checks on normal iterators when possible. * include/debug/macros.h (__glibcxx_check_heap) (__glibcxx_check_heap_pred): Use __gnu_debug::__base on iterator range. I checked my Bugzilla account's Permissions page and it says: "There are no permission bits set on your account." I simply don't know whom to ask for permissions. Should I file a Bugzilla entry for that? François On 05/10/2012 11:18 PM, Paolo Carlini wrote: Hi, On 05/09/2012 11:02 PM, François Dumont wrote: Here is a patch for PR 53263. Tested under Linux x86_64 debug mode. Ok for trunk and the 4.7 branch? Thanks. Considering that this isn't a regression and also that nobody reported the issue for so many years, the patch seems a bit largish to me to go into the branch. Thus, let's apply to mainline only and consider the issue closed. If people insist, seriously insist ;) we may reconsider for 4.7.2. Thanks again! Paolo. PS: are you finally able to manage Bugzilla, yes?
Index: include/debug/functions.h === --- include/debug/functions.h (revision 187292) +++ include/debug/functions.h (working copy) @@ -31,7 +31,8 @@ #define _GLIBCXX_DEBUG_FUNCTIONS_H 1 #include <bits/c++config.h> -#include <bits/stl_iterator_base_types.h> // for iterator_traits, categories +#include <bits/stl_iterator_base_types.h> // for iterator_traits, categories and + // _Iter_base #include <bits/cpp_type_traits.h> // for __is_integer #include <debug/formatter.h> @@ -118,11 +119,8 @@ inline bool __valid_range_aux(const _InputIterator __first, const _InputIterator __last, std::__false_type) - { -typedef typename std::iterator_traits<_InputIterator>::iterator_category - _Category; -return __valid_range_aux2(__first, __last, _Category()); - } + { return __valid_range_aux2(__first, __last, + std::__iterator_category(__first)); } /** Don't know what these iterators are, or if they are even * iterators (we may get an integral type for InputIterator), so @@ -214,6 +212,15 @@ return true; } + // For performance reasons, as the iterator range has been validated, the check on + // random access safe iterators is done using the base iterator. + template<typename _Iterator, typename _Sequence> +inline bool +__check_sorted_aux(const _Safe_iterator<_Iterator, _Sequence>& __first, + const _Safe_iterator<_Iterator, _Sequence>& __last, + std::random_access_iterator_tag __tag) + { return __check_sorted_aux(__first.base(), __last.base(), __tag); } + // Can't check if an input iterator sequence is sorted, because we can't step // through the sequence. template<typename _InputIterator, typename _Predicate> @@ -240,19 +247,28 @@ return true; } + // For performance reasons, as the iterator range has been validated, the check on + // random access safe iterators is done using the base iterator.
+  template<typename _Iterator, typename _Sequence,
+	   typename _Predicate>
+    inline bool
+    __check_sorted_aux(const _Safe_iterator<_Iterator, _Sequence>& __first,
+		       const _Safe_iterator<_Iterator, _Sequence>& __last,
+		       _Predicate __pred,
+		       std::random_access_iterator_tag __tag)
+    { return __check_sorted_aux(__first.base(), __last.base(), __pred, __tag); }
+
   // Determine if a sequence is sorted.
   template<typename _InputIterator>
     inline bool
     __check_sorted(const _InputIterator& __first, const _InputIterator& __last)
     {
-      typedef typename std::iterator_traits<_InputIterator>::iterator_category
-	_Category;
-
       // Verify that the < operator for elements in the sequence is a
       // StrictWeakOrdering by checking that it is irreflexive.
       __glibcxx_assert(__first == __last || !(*__first < *__first));

-      return __check_sorted_aux(__first, __last, _Category());
+      return __check_sorted_aux(__first, __last,
+				std::__iterator_category(__first));
     }

   template<typename _InputIterator, typename _Predicate>
@@ -260,14 +276,12 @@
     __check_sorted(const _InputIterator& __first, const _InputIterator& __last,
		    _Predicate __pred)
     {
-      typedef typename std::iterator_traits<_InputIterator>::iterator_category
-	_Category;
-
       // Verify that the predicate is StrictWeakOrdering by checking that it
       // is irreflexive.
       __glibcxx_assert(__first == __last || !__pred(*__first, *__first));

-      return __check_sorted_aux(__first, __last, __pred, _Category());
+      return __check_sorted_aux(__first, __last, __pred,
+				std::__iterator_category(__first));
     }

   template<typename _InputIterator>
@@ -332,13 +346,11 @@
       return
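The dispatch trick the patch relies on can be illustrated outside libstdc++. The sketch below is not the actual libstdc++ code (all names are invented): a wrapper iterator that mimics a debug-mode safe iterator, whose sorted-check overload unwraps to the cheaper base iterator once the range is known to be valid.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical wrapper standing in for a debug-mode "safe" iterator: it
// forwards to an underlying iterator that is cheaper to use directly.
template <typename It>
struct checked_iter {
  using iterator_category = typename std::iterator_traits<It>::iterator_category;
  using value_type = typename std::iterator_traits<It>::value_type;
  using difference_type = typename std::iterator_traits<It>::difference_type;
  using pointer = typename std::iterator_traits<It>::pointer;
  using reference = typename std::iterator_traits<It>::reference;

  It cur;
  It base() const { return cur; }
};

// Generic check: walk the range, comparing each element with its successor.
template <typename FwdIt>
bool check_sorted_aux(FwdIt first, FwdIt last, std::forward_iterator_tag) {
  if (first == last) return true;
  FwdIt next = first;
  for (++next; next != last; ++first, ++next)
    if (*next < *first) return false;
  return true;
}

// Overload mirroring the patch: for random access wrappers, run the
// check on the base iterators instead of the wrappers.
template <typename It>
bool check_sorted_aux(checked_iter<It> first, checked_iter<It> last,
                      std::random_access_iterator_tag tag) {
  return check_sorted_aux(first.base(), last.base(), tag);
}

// Entry point: dispatch on the iterator category, much like
// std::__iterator_category does in the patch.
template <typename It>
bool check_sorted(It first, It last) {
  return check_sorted_aux(
      first, last, typename std::iterator_traits<It>::iterator_category());
}
```

Tag dispatch picks the wrapper overload because its tag parameter is an exact match, while the generic overload only accepts the tag through a derived-to-base conversion.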
Re: [PATCH, Android] Stack protector enabling for Android target
Hi!

Please look at the modified patch in the attachment. The ChangeLog remains the same. Tested in an Android environment (x86_64-*-linux-android), also bootstrapped on x86_64-unknown-linux-gnu. I also started regtesting on Linux. Is it ok after successful regtesting?

Thanks,
Igor

On Mon, May 7, 2012 at 10:08 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Mon, May 7, 2012 at 11:04 AM, Maxim Kuvyrkov ma...@codesourcery.com wrote:
On 6/05/2012, at 7:02 PM, Igor Zamyatin wrote:

Hi!

The patch enables the stack protector for Android. Android targets don't contain the necessary information in features.h, so we explicitly enable the stack protector for Android targets. Bootstrapped and regtested on x86_64. Ok to commit?

Thanks,
Igor

2012-05-06  Igor Zamyatin  igor.zamya...@intel.com

	* configure.ac: Stack protector enabling for Android targets.
	* configure: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 86b4bea..c1012d6 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4545,6 +4545,8 @@ AC_CACHE_CHECK(__stack_chk_fail in target C library, gcc_cv_libc_provides_ssp,
       [gcc_cv_libc_provides_ssp=no
     case "$target" in
+       *-android*)
+	 gcc_cv_libc_provides_ssp=yes;;
        *-*-linux* | *-*-kfreebsd*-gnu | *-*-knetbsd*-gnu)
       [# glibc 2.4 and later provides __stack_chk_fail and
       # either __stack_chk_guard, or TLS access to stack guard canary.

What you really want is to enable the stack protector for Bionic libc, which is often synonymous with a *-android* target, but not always. Let's enable ssp based on whether __BIONIC__ is defined in the libc headers (i.e., add a grep test for __BIONIC__ in sys/cdefs.h). Also please add a comment along the lines of "all versions of Bionic support stack protector".

Which exact target did you test this on? x86_64-*-* is a pretty broad definition.

We are working on the x86_64-*-linux-android target, which uses x32.

--
H.J.

stack_protector.patch
Description: Binary data
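The suggestion above, keying the check on __BIONIC__ in the target libc headers rather than on the *-android* triplet, boils down to a grep over sys/cdefs.h. A rough sketch of what such a configure fragment would do (target_header_dir and the temporary header layout are stand-ins for illustration, not the real configure machinery):

```shell
# Simulate a target header directory whose sys/cdefs.h marks a Bionic libc.
target_header_dir=$(mktemp -d)
mkdir -p "$target_header_dir/sys"
printf '#define __BIONIC__ 1\n' > "$target_header_dir/sys/cdefs.h"

# All versions of Bionic support the stack protector, so detecting
# __BIONIC__ is enough to enable it.
gcc_cv_libc_provides_ssp=no
if grep -q __BIONIC__ "$target_header_dir/sys/cdefs.h" 2>/dev/null; then
  gcc_cv_libc_provides_ssp=yes
fi
echo "$gcc_cv_libc_provides_ssp"
rm -r "$target_header_dir"
```

A glibc-based sysroot would have no __BIONIC__ in its headers, so the variable would stay at its previous value and fall through to the existing glibc version check.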
Re: [PATCH, tree-optimization] Fix for PR 52868
Ping?

On Fri, Apr 27, 2012 at 4:42 PM, Igor Zamyatin izamya...@gmail.com wrote:
On Wed, Apr 25, 2012 at 6:41 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Wed, Apr 25, 2012 at 4:32 PM, Igor Zamyatin izamya...@gmail.com wrote:
On Wed, Apr 25, 2012 at 1:14 PM, Richard Guenther richard.guent...@gmail.com wrote:
On Wed, Apr 25, 2012 at 10:56 AM, Igor Zamyatin izamya...@gmail.com wrote:

Hi all!

I'd like to post for review the patch which adjusts some costs in the get_computation_cost_at routine in the ivopts part. As mentioned in the PR, the changes also fix the bwaves regression from PR 52272. The changes introduce no degradations on SPEC2000/2006 and EEMBC1.1/2.0 (this was measured on Atom) on x86. Bootstrapped and regtested on x86. Ok to commit?

I can't make sense of the patch and the comment does not help.

+  diff_cost = cost.cost;
   cost.cost /= avg_loop_niter (data->current_loop);
+  add_cost_val = add_cost (TYPE_MODE (ctype), data->speed);
+  /* Do cost correction if address cost is small enough
+     and difference cost is high enough.  */
+  if (address_p && diff_cost > add_cost_val
+      && get_address_cost (symbol_present, var_present,
+			   offset, ratio, cstepi,
+			   TYPE_MODE (TREE_TYPE (utype)),
+			   TYPE_ADDR_SPACE (TREE_TYPE (utype)),
+			   speed, stmt_is_after_inc,
+			   can_autoinc).cost <= add_cost_val)
+    cost.cost += add_cost_val;

Please explain more thoroughly.
It also would seem to be better to add an extra case, as later code does

For example, for code such as

  for (j = 0; j < M; j++)
    {
      for (i = 0; i < N; i++)
        sum += ptr->a[j][i] * ptr->c[k][i];
    }

we currently have the following gimple on the x86 target (I provide a piece of the phase output):

  # ivtmp.13_30 = PHI <ivtmp.13_31(3), ivtmp.13_33(7)>
  D.1748_34 = (void *) ivtmp.13_30;
  D.1722_7 = MEM[base: D.1748_34, offset: 0B];
  D.1750_36 = ivtmp.27_28;
  D.1751_37 = D.1750_36 + ivtmp.13_30;  <-- we got a non-invariant add which is
                                            not taken into account currently in
                                            the cost model
  D.1752_38 = (void *) D.1751_37;
  D.1753_39 = (sizetype) k_8(D);
  D.1754_40 = D.1753_39 * 800;
  D.1723_9 = MEM[base: D.1752_38, index: D.1754_40, offset: 16000B];
  ...

With the proposed fix we produce:

  # ivtmp.14_30 = PHI <ivtmp.14_31(3), 0(7)>
  D.1749_34 = (struct S *) ivtmp.25_28;
  D.1722_7 = MEM[base: D.1749_34, index: ivtmp.14_30, offset: 0B];
  D.1750_35 = (sizetype) k_8(D);
  D.1751_36 = D.1750_35 * 800;
  D.1752_37 = ptr_6(D) + D.1751_36;
  D.1723_9 = MEM[base: D.1752_37, index: ivtmp.14_30, offset: 16000B];

which is more effective on platforms where address computation is cheaper than the cost of an addition operation. That's basically what this adjustment is for.

If we generally miss to account for the add, then why is the adjustment conditional on diff_cost > add_cost and address_cost <= add_cost? Is this a new heuristic or a fix for not accurately computing the cost for the stmts we generate?

I'd say this is closer to a heuristic, since diff_cost > add_cost is an attempt to catch the case with a non-invariant add produced by a pointer difference, and address_cost <= add_cost leaves out the cases with cheap address operations.

Richard.

So the comment in the source code now looks as follows:

  /* Do cost correction when address difference produces
     additional non-invariant add operation which is less
     profitable if address cost is cheaper than cost of add.  */

  /* Now the computation is in shape symbol + var1 + const + ratio * var2.
     (symbol/var1/const parts may be omitted).
     If we are looking for an address, find the cost of addressing this.  */
  if (address_p)
    return add_costs (cost,
		      get_address_cost (symbol_present, var_present,
					offset, ratio, cstepi,
					TYPE_MODE (TREE_TYPE (utype)),
					TYPE_ADDR_SPACE (TREE_TYPE (utype)),
					speed, stmt_is_after_inc,
					can_autoinc));

thus refactoring the code a bit would make it possible to CSE the get_address_cost call and eventually make it clearer what the code does.

'offset' could be changed between the two calls of get_address_cost, so such refactoring looks useless.

New patch (only the comment was changed) attached. The ChangeLog was changed as well.

Richard.

Changelog:

2012-04-26  Yuri Rumyantsev  yuri.rumyant...@intel.com

	* tree-ssa-loop-ivopts.c (get_computation_cost_at): Adjust
	cost model when address difference produces additional
	non-invariant add operation which is less profitable if
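For reference, the loop under discussion can be written as a standalone function. The names M, N, K and the struct layout below are invented for illustration; only the access pattern matters, namely two row walks sharing one induction variable, which is where ivopts chooses between "base + index" addressing and a separate pointer addition.

```c
#include <assert.h>

#define M 4
#define N 5
#define K 3

struct S {
  double a[M][N];
  double c[K][N];
};

/* The loop from the discussion: the inner induction variable i indexes
   both ptr->a[j] and ptr->c[k], so both accesses can reuse one IV as a
   scaled index instead of materializing a pointer sum each iteration.  */
double row_dot_sum(struct S *ptr, int k)
{
  double sum = 0.0;
  for (int j = 0; j < M; j++)
    for (int i = 0; i < N; i++)
      sum += ptr->a[j][i] * ptr->c[k][i];
  return sum;
}
```

Compiling this at -O2 and inspecting the ivopts dump (-fdump-tree-ivopts) is how one would reproduce gimple like the fragments quoted above.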
Re: [google/gcc-4_6] Backport arm hardfp patch from trunk (issue 6206055)
On 2012/05/11 18:48:40, carrot wrote:

Please add ChangeLog and the trunk patch revision.

Thanks. Trunk patch revision is 186859. ChangeLog entry added.

Patch started.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2f85855..588fa04 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2012-05-11  Han Shen  shen...@google.com
+
+	Backport from mainline.
+	2012-04-26  Michael Hope  michael.h...@linaro.org
+		    Richard Earnshaw  rearn...@arm.com
+
+	* config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_SOFT_FLOAT): Define.
+	(GLIBC_DYNAMIC_LINKER_HARD_FLOAT): Define.
+	(GLIBC_DYNAMIC_LINKER_DEFAULT): Define.
+	(GLIBC_DYNAMIC_LINKER): Redefine to use the hard float path.
+
 2012-04-30  Uros Bizjak  ubiz...@gmail.com

 	Backport from mainline
diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
index 3a32188..58d06c2 100644
--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -32,7 +32,8 @@
    while (false)

 /* We default to a soft-float ABI so that binaries can run on all
-   target hardware.  */
+   target hardware.  If you override this to use the hard-float ABI then
+   change the setting of GLIBC_DYNAMIC_LINKER_DEFAULT as well.  */
 #undef  TARGET_DEFAULT_FLOAT_ABI
 #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_SOFT

@@ -59,10 +60,26 @@
 #undef  SUBTARGET_EXTRA_LINK_SPEC
 #define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION

-/* Use ld-linux.so.3 so that it will be possible to run classic
-   GNU/Linux binaries on an EABI system.  */
+/* GNU/Linux on ARM currently supports three dynamic linkers:
+   - ld-linux.so.2 - for the legacy ABI
+   - ld-linux.so.3 - for the EABI-derived soft-float ABI
+   - ld-linux-armhf.so.3 - for the EABI-derived hard-float ABI.
+   All the dynamic linkers live in /lib.
+   We default to soft-float, but this can be overridden by changing both
+   GLIBC_DYNAMIC_LINKER_DEFAULT and TARGET_DEFAULT_FLOAT_ABI.
+*/
+
 #undef  GLIBC_DYNAMIC_LINKER
-#define GLIBC_DYNAMIC_LINKER RUNTIME_ROOT_PREFIX "/lib/ld-linux.so.3"
+#define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT \
+  RUNTIME_ROOT_PREFIX "/lib/ld-linux.so.3"
+#define GLIBC_DYNAMIC_LINKER_HARD_FLOAT \
+  RUNTIME_ROOT_PREFIX "/lib/ld-linux-armhf.so.3"
+#define GLIBC_DYNAMIC_LINKER_DEFAULT \
+  RUNTIME_ROOT_PREFIX GLIBC_DYNAMIC_LINKER_SOFT_FLOAT
+
+#define GLIBC_DYNAMIC_LINKER \
+  "%{mfloat-abi=hard:" GLIBC_DYNAMIC_LINKER_HARD_FLOAT "} \
+   %{mfloat-abi=soft*:" GLIBC_DYNAMIC_LINKER_SOFT_FLOAT "} \
+   %{!mfloat-abi=*:" GLIBC_DYNAMIC_LINKER_DEFAULT "}"

 /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
    use the GNU/Linux version, not the generic BPABI version.  */

http://codereview.appspot.com/6206055/
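The effect of the new GLIBC_DYNAMIC_LINKER spec can be summarized as a simple mapping from -mfloat-abi values to dynamic linker paths. This is only a sketch of what the spec selects (paths taken from the patch, with RUNTIME_ROOT_PREFIX assumed empty; it is not how the GCC driver actually evaluates specs):

```shell
# Mirror the three %{...} alternatives in the GLIBC_DYNAMIC_LINKER spec:
# hard-float gets the armhf loader, soft/softfp get the EABI soft-float
# loader, and no -mfloat-abi option falls back to the default (soft).
linker_for_float_abi() {
  case "$1" in
    hard)  echo /lib/ld-linux-armhf.so.3 ;;  # %{mfloat-abi=hard:...}
    soft*) echo /lib/ld-linux.so.3 ;;        # %{mfloat-abi=soft*:...}
    *)     echo /lib/ld-linux.so.3 ;;        # %{!mfloat-abi=*:...} default
  esac
}

linker_for_float_abi hard
linker_for_float_abi softfp
linker_for_float_abi ""
```

On a real toolchain the selected path ends up as the ELF interpreter, visible with readelf -l on a linked binary.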
Re: [google/gcc-4_6] Backport arm hardfp patch from trunk (issue 6206055)
http://codereview.appspot.com/6206055/diff/4001/gcc/ChangeLog
File gcc/ChangeLog (right):

http://codereview.appspot.com/6206055/diff/4001/gcc/ChangeLog#newcode11
gcc/ChangeLog:11: It seems you also merged in another patch, r187012; please mention it.

http://codereview.appspot.com/6206055/diff/4001/gcc/config/arm/linux-eabi.h
File gcc/config/arm/linux-eabi.h (right):

http://codereview.appspot.com/6206055/diff/4001/gcc/config/arm/linux-eabi.h#newcode77
gcc/config/arm/linux-eabi.h:77: RUNTIME_ROOT_PREFIX GLIBC_DYNAMIC_LINKER_SOFT_FLOAT
Here the RUNTIME_ROOT_PREFIX may be redundant.

http://codereview.appspot.com/6206055/
Re: [Patch / RFC] Improving more locations for binary expressions
On 05/10/2012 04:28 PM, Jason Merrill wrote:

Looks good.

Thanks Jason. The below is the idea fully implemented. The call.c bits are exactly in the form I had in mind a couple of days ago. The parser.c bits, the ones I actually preliminarily posted, are now a bit different: I noticed that in the middle of the function we were reading / writing all the cp_parser_expression_stack_entry fields by hand (and now there would be one more), thus I tweaked the algorithm to use a cp_parser_expression_stack_entry as a local variable. In my opinion this tidies the code quite a bit and also makes it manifest that tree_type and loc are always updated at the same time, as they should be.

Booted and tested x86_64-linux. Ok?

Thanks,
Paolo.

//

/cp
2012-05-12

	* parser.c (struct cp_parser_expression_stack_entry): Add
	location_t field.
	(cp_parser_binary_expression): Rework to always update at the same
	time tree_type and loc.
	* call.c (print_z_candidate): Add location_t parameter.
	(print_z_candidates, convert_like_real, joust): Adjust.

/testsuite
2012-05-12

	* g++.dg/parse/error47.C: New.

Index: testsuite/g++.dg/parse/error47.C
===================================================================
--- testsuite/g++.dg/parse/error47.C	(revision 0)
+++ testsuite/g++.dg/parse/error47.C	(revision 0)
@@ -0,0 +1,9 @@
+struct T { };
+
+T foo();
+
+void bar(int a, int b)
+{
+  if (foo() < a < b) // { dg-error "13:no match for 'operator<'" }
+    ;
+}
Index: cp/parser.c
===================================================================
--- cp/parser.c	(revision 187414)
+++ cp/parser.c	(working copy)
@@ -1621,6 +1621,8 @@ typedef struct cp_parser_expression_stack_entry
   enum tree_code tree_type;
   /* Precedence of the binary operation we are parsing.  */
   enum cp_parser_prec prec;
+  /* Location of the binary operation we are parsing.  */
+  location_t loc;
 } cp_parser_expression_stack_entry;

 /* The stack for storing partial expressions.
    We only need NUM_PREC_VALUES
@@ -7275,30 +7277,33 @@ cp_parser_binary_expression (cp_parser* parser, bo
 {
   cp_parser_expression_stack stack;
   cp_parser_expression_stack_entry *sp = &stack[0];
-  tree lhs, rhs;
+  cp_parser_expression_stack_entry sentry;
+  tree rhs;
   cp_token *token;
-  location_t loc;
-  enum tree_code tree_type, lhs_type, rhs_type;
+  enum tree_code rhs_type;
   enum cp_parser_prec new_prec, lookahead_prec;
   tree overload;

   /* Parse the first expression.  */
-  lhs = cp_parser_cast_expression (parser, /*address_p=*/false, cast_p, pidk);
-  lhs_type = ERROR_MARK;
+  sentry.lhs = cp_parser_cast_expression (parser, /*address_p=*/false,
+					  cast_p, pidk);
+  sentry.lhs_type = ERROR_MARK;
+  sentry.prec = prec;

   for (;;)
     {
       /* Get an operator token.  */
       token = cp_lexer_peek_token (parser->lexer);
-      loc = token->location;

       if (warn_cxx0x_compat
	   && token->type == CPP_RSHIFT
	   && !parser->greater_than_is_operator_p)
	 {
-	   if (warning_at (loc, OPT_Wc__0x_compat, "%<>>%> operator is treated"
+	   if (warning_at (token->location, OPT_Wc__0x_compat,
+			   "%<>>%> operator is treated"
			   " as two right angle brackets in C++11"))
-	     inform (loc, "suggest parentheses around %<>>%> expression");
+	     inform (token->location,
+		     "suggest parentheses around %<>>%> expression");
	 }

       new_prec = TOKEN_PRECEDENCE (token);
@@ -7310,7 +7315,7 @@
	 - or, we found an operator which has lower priority.  This is the case
	   where the recursive descent *ascends*, as in `3 * 4 + 5' after
	   parsing `3 * 4'.  */
-      if (new_prec <= prec)
+      if (new_prec <= sentry.prec)
	 {
	   if (sp == stack)
	     break;
@@ -7319,17 +7324,18 @@
	 }

      get_rhs:
-      tree_type = binops_by_token[token->type].tree_type;
+      sentry.tree_type = binops_by_token[token->type].tree_type;
+      sentry.loc = token->location;

       /* We used the operator token.  */
       cp_lexer_consume_token (parser->lexer);

       /* For "false && x" or "true || x", x will never be executed;
	  disable warnings while evaluating it.
*/
-      if (tree_type == TRUTH_ANDIF_EXPR)
-	c_inhibit_evaluation_warnings += lhs == truthvalue_false_node;
-      else if (tree_type == TRUTH_ORIF_EXPR)
-	c_inhibit_evaluation_warnings += lhs == truthvalue_true_node;
+      if (sentry.tree_type == TRUTH_ANDIF_EXPR)
+	c_inhibit_evaluation_warnings += sentry.lhs == truthvalue_false_node;
+      else if (sentry.tree_type == TRUTH_ORIF_EXPR)
+	c_inhibit_evaluation_warnings += sentry.lhs == truthvalue_true_node;

       /* Extract another operand.  It may be the RHS of this expression
	  or the LHS of a new,
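The restructuring in this patch, keeping lhs, tree_type, prec and loc together in one stack entry, can be mimicked in a toy precedence-climbing parser. Everything below (the single-digit grammar, all names) is invented for illustration; the point is only that carrying the operator's location inside the entry makes later diagnostics easy to aim at the operator.

```c
#include <assert.h>
#include <ctype.h>

/* One entry per pending binary operation, mirroring the grouping that
   cp_parser_expression_stack_entry gains in the patch.  */
struct stack_entry {
  int lhs;   /* partial result */
  char op;   /* pending operator */
  int prec;  /* precedence saved when we descended */
  int loc;   /* column of the operator, for diagnostics */
};

static int op_prec(char c)
{
  switch (c) {
  case '+': case '-': return 1;
  case '*': return 2;
  default: return 0;
  }
}

static int apply(char op, int a, int b)
{
  return op == '+' ? a + b : op == '-' ? a - b : a * b;
}

/* Parse single-digit operands joined by + - *.  On success return 0 and
   store the value in *result; on a malformed input return -1 and store
   the column of the offending operator in *err_loc.  */
int parse_binary(const char *src, int *result, int *err_loc)
{
  struct stack_entry stack[16];
  int sp = 0, pos = 0, lhs, prec = 0;

  if (!isdigit((unsigned char) src[pos])) { *err_loc = pos; return -1; }
  lhs = src[pos++] - '0';

  for (;;) {
    int new_prec = op_prec(src[pos]);
    if (new_prec <= prec) {
      if (sp == 0)
        break;
      /* Ascend: fold lhs into the entry below us.  All four fields of
         the entry, including the saved location, travel together.  */
      --sp;
      lhs = apply(stack[sp].op, stack[sp].lhs, lhs);
      prec = stack[sp].prec;
      continue;
    }
    stack[sp].lhs = lhs;
    stack[sp].op = src[pos];
    stack[sp].prec = prec;
    stack[sp].loc = pos;          /* remember where the operator was */
    ++sp;
    ++pos;                        /* consume the operator */
    if (!isdigit((unsigned char) src[pos])) {
      *err_loc = stack[sp - 1].loc;   /* diagnostic points at the operator */
      return -1;
    }
    lhs = src[pos++] - '0';
    prec = new_prec;
  }
  if (src[pos] != '\0') { *err_loc = pos; return -1; }
  *result = lhs;
  return 0;
}
```

The ascend/descend shape of the loop loosely follows cp_parser_binary_expression: a lower-or-equal precedence token folds the top entry back in, a higher one pushes the current state and descends into the right-hand side.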
Re: [patch] Fix debug info of nested inline functions
On 03/02/2012 03:29 PM, Eric Botcazou wrote:

I notice that D.7 seems to suggest that if the nested function is not inlinable and shared between all instances of the containing function, we put a normal (non-abstract/concrete) instance of the nested function inside the abstract function for the containing function. But I agree that it's cleaner your way: put an abstract instance there instead, so that the abstract instance of the containing function is all abstract.

+  /* Emit an abstract instance of nested functions within an abstract instance
+     of their parent.  */
+  int declaration = ((decl != current_function_decl
+		      && !(DECL_INITIAL (decl) != NULL_TREE
+			   && DECL_ABSTRACT (decl)
+			   && current_function_decl
+			   && DECL_ABSTRACT (current_function_decl)))
+		     || class_or_namespace_scope_p (context_die));

Hmm, why isn't current_function_decl == decl when we're trying to emit the abstract instance of a nested function?

+  /* Do not emit concrete instances of abstracted nested functions without
+     actual instances.  */
+  else if (TREE_CODE (decl_or_origin) == FUNCTION_DECL
+	   && die
+	   && get_AT (die, DW_AT_inline))
+    ;

Should "without actual instances" be something like "within concrete instances of containing functions"?

-  if (origin && origin->die_parent)
+  if (origin
+      && origin->die_parent
+      /* Skip an abstract parent instance.  */
+      && !(origin->die_parent->die_tag == DW_TAG_subprogram
+	   && get_AT (origin->die_parent, DW_AT_inline)))
     add_child_die (origin->die_parent, die);

What if the immediate parent is a DW_TAG_lexical_block or some other thing nested inside an abstract subprogram?

and you suggested to iterate over DECL_CONTEXT instead of die_parent to find an appropriate parent to attach the DIE on the limbo list to.

In my earlier comments I seem to have been wrong about the behavior of gen_subprogram_die; now I see that if there is an abstract instance, the concrete out-of-line instance is not associated with the decl number.
So I guess your earlier limbo handling code was fine apart from the lexical_block issue above. Jason
Re: [C++ Patch] PR 53301
On 05/11/2012 06:03 AM, Paolo Carlini wrote:

     if (TYPE_PTR_P (type) && !TYPE_PTRFN_P (type)
-	 && !TYPE_PTR_TO_MEMBER_P (type))
+	 && !TYPE_PTRMEM_P (type))

The check for "not a pointer to member" is no longer necessary, since pointers to members don't use POINTER_TYPE. OK with that change.

Jason
Re: [Patch / RFC] Improving more locations for binary expressions
I find the name "sentry" confusing; I don't see how it applies. Perhaps "current" instead? Otherwise, the patch is OK.

Jason