Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls
On 27/01/12 21:37, Tom de Vries wrote:
On 24/01/12 11:40, Richard Guenther wrote:
On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com wrote:

Richard, Jakub,

the following patch fixes PR51879. Consider the following test-case:
...
int bar (int);
void baz (int);

void
foo (int y)
{
  int a;
  if (y == 6)
    a = bar (7);
  else
    a = bar (7);
  baz (a);
}
...

after compiling at -O2, the representation looks like this before tail-merging:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%] (true,exec)
  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%] (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%] (false,exec)
  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%] (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec)
  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE <.MEMD.1714_9>
  return;
...

the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

The patch makes sure non-const/pure call results (gimple_vdef and gimple_call_lhs) are properly value numbered.

Bootstrapped and reg-tested on x86_64. OK for stage1?

The following cannot really work:

@@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
       result = vn_reference_lookup_1 (&vr1, NULL);
       if (result)
         {
-          changed = set_ssa_val_to (lhs, result);
+          tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
+          if (vdef)
+            changed |= set_ssa_val_to (vdef, result_vdef);
+          changed |= set_ssa_val_to (lhs, result);

because 'result' may be not an SSA name. It might also not have a proper definition statement (if VN_INFO (result)->needs_insertion is true). So you at least need to guard things properly.

Right.
And that also doesn't work if the function is without lhs, such as in the new test-case pr51879-6.c. I fixed this by storing both lhs and vdef, such that I don't have to derive the vdef from the lhs.

(On a side-note - I _did_ want to remove value-numbering virtual operands at some point ...)

Doing so will hurt performance of tail-merging in its current form. OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that value numbering as used in tail-merging has its limitations too. Do you have any ideas how to address that one?

@@ -3359,8 +3366,10 @@ visit_use (tree use)
 	  /* ??? We should handle stores from calls.  */
 	  else if (TREE_CODE (lhs) == SSA_NAME)
 	    {
+	      tree vuse = gimple_vuse (stmt);
 	      if (!gimple_call_internal_p (stmt)
-		  && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
+		  && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
+		      || (vuse && SSA_VAL (vuse) != VN_TOP)))
 		changed = visit_reference_op_call (lhs, stmt);
 	      else
 		changed = defs_to_varying (stmt);

... exactly because of the issue that a stmt has multiple defs. Btw, vuse should have been visited here or be part of our SCC, so why do you need this check?

Removed now; that was a workaround for a bug in an earlier version of the patch that I didn't need anymore.

Bootstrapped and reg-tested on x86_64. OK for stage1?

Richard, quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
...
I think these fixes hint at that we should use structural equality as fallback if value-numbering doesn't equate two stmt effects. Thus, treat two stmts with exactly the same operands and flags as equal, using value-numbering to canonicalize operands (when they are SSA names) for that comparison, or use VN entirely if there are no side-effects on the stmt. Changing value-numbering of virtual operands, even if it looks correct in the simple cases you change, doesn't look like a general solution for the missed tail merging opportunities.
...
The test-case pr51879-6.c shows a case where improving value numbering will help tail-merging, but structural equality comparison will not:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%] (true,exec)
  # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF <.MEMD.1717_7>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%] (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%] (false,exec)
  # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_10 = VDEF <.MEMD.1717_9>
  # USE = nonlocal
  # CLB = nonlocal
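Not part of the original mail: a self-contained C sketch of the shape of test-case being discussed, so the tail-merging opportunity is visible at the source level. The stub bodies of bla and bar are my assumption (added only so the sketch runs); the actual pr51879-6.c only declares them. The point is that both arms of the if perform the same two non-const/pure calls, so blocks 3 and 4 in the dump above are mergeable once their call results are value-numbered equal.

```c
#include <assert.h>

/* Stub definitions are assumptions added so this sketch is runnable;
   the real test-case only declares bla and bar externally.  */
static void bla (int x) { (void) x; }
static int  bar (int x) { return x + 1; }

int
foo (int y)
{
  int a;
  if (y == 6)
    {
      bla (5);       /* same call sequence ...  */
      a = bar (7);
    }
  else
    {
      bla (5);       /* ... as in the other arm, so the blocks can be
                        tail-merged once the vdefs/lhs are VN-equal.  */
      a = bar (7);
    }
  return a;
}
```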
Re: [PATCH] Remove dead labels to increase superblock scope
On 09/12/11 14:59, Eric Botcazou wrote:

2011-12-07  Tom de Vries  t...@codesourcery.com

	* cfgcleanup.c (try_optimize_cfg): Replace call to delete_isns_chain
	by call to delete_insn.  Remove code to reorder BASIC_BLOCK note and
	DELETED_LABEL note, and move it to ...
	* cfg_rtl.c (delete_insn): Change return type to void.

The file is cfgrtl.c without underscore. When you have a "to..." somewhere, you need to have a "...here" somewhere else:

	* cfgrtl.c (delete_insn): ...here.  Change return type to void.

The hunk for delete_insn contains a line with trailing TABs and spaces.

	(delete_insn_and_edges): Likewise.
	(delete_insn_chain): Handle new return type of delete_insn.  Delete
	chain backwards rather than forwards.
	* rtl.h (delete_insn, delete_insn_and_edges): Change return type to
	void.
	* cfglayout.c (fixup_reorder_chain): Delete unused label.
	* gcc.dg/superblock.c: New test.

OK for stage1 when it re-opens (modulo the above nits). Nice cleanup!

nits fixed and committed.

Thanks,
- Tom

2012-04-14  Tom de Vries  t...@codesourcery.com

	* cfgcleanup.c (try_optimize_cfg): Replace call to delete_insn_chain
	by call to delete_insn.  Remove code to reorder BASIC_BLOCK note and
	DELETED_LABEL note, and move it to ...
	* cfgrtl.c (delete_insn): ... here.  Change return type to void.
	(delete_insn_and_edges): Likewise.
	(delete_insn_chain): Handle new return type of delete_insn.  Delete
	chain backwards rather than forwards.
	* rtl.h (delete_insn, delete_insn_and_edges): Change return type to
	void.
	* cfglayout.c (fixup_reorder_chain): Delete unused label.
	* gcc.dg/superblock.c: New test.

Index: gcc/cfgcleanup.c
===================================================================
--- gcc/cfgcleanup.c	(revision 181652)
+++ gcc/cfgcleanup.c	(working copy)
@@ -2632,20 +2632,7 @@ try_optimize_cfg (int mode)
		  || ! label_is_jump_target_p (BB_HEAD (b),
					       BB_END (single_pred (b)))))
	    {
-	      rtx label = BB_HEAD (b);
-
-	      delete_insn_chain (label, label, false);
-	      /* If the case label is undeletable, move it after the
-		 BASIC_BLOCK note.  */
-	      if (NOTE_KIND (BB_HEAD (b)) == NOTE_INSN_DELETED_LABEL)
-		{
-		  rtx bb_note = NEXT_INSN (BB_HEAD (b));
-
-		  reorder_insns_nobb (label, label, bb_note);
-		  BB_HEAD (b) = bb_note;
-		  if (BB_END (b) == bb_note)
-		    BB_END (b) = label;
-		}
+	      delete_insn (BB_HEAD (b));
	      if (dump_file)
		fprintf (dump_file, "Deleted label in block %i.\n",
			 b->index);
Index: gcc/cfglayout.c
===================================================================
--- gcc/cfglayout.c	(revision 181652)
+++ gcc/cfglayout.c	(working copy)
@@ -857,6 +857,9 @@ fixup_reorder_chain (void)
						 (e_taken->src, e_taken->dest));
		  e_taken->flags |= EDGE_FALLTHRU;
		  update_br_prob_note (bb);
+		  if (LABEL_NUSES (ret_label) == 0
+		      && single_pred_p (e_taken->dest))
+		    delete_insn (ret_label);
		  continue;
		}
	    }
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 181652)
+++ gcc/rtl.h	(working copy)
@@ -2460,12 +2460,12 @@ extern void add_insn_before (rtx, rtx, s
 extern void add_insn_after (rtx, rtx, struct basic_block_def *);
 extern void remove_insn (rtx);
 extern rtx emit (rtx);
-extern rtx delete_insn (rtx);
+extern void delete_insn (rtx);
 extern rtx entry_of_function (void);
 extern void emit_insn_at_entry (rtx);
 extern void delete_insn_chain (rtx, rtx, bool);
 extern rtx unlink_insn_chain (rtx, rtx);
-extern rtx delete_insn_and_edges (rtx);
+extern void delete_insn_and_edges (rtx);
 extern rtx gen_lowpart_SUBREG (enum machine_mode, rtx);
 extern rtx gen_const_mem (enum machine_mode, rtx);
 extern rtx gen_frame_mem (enum machine_mode, rtx);
Index: gcc/cfgrtl.c
===================================================================
--- gcc/cfgrtl.c	(revision 181652)
+++ gcc/cfgrtl.c	(working copy)
@@ -111,12 +111,11 @@ can_delete_label_p (const_rtx label)
	  && !in_expr_list_p (forced_labels, label));
 }

-/* Delete INSN by patching it out.  Return the next insn.  */
+/* Delete INSN by patching it out.  */

-rtx
+void
 delete_insn (rtx insn)
 {
-  rtx next = NEXT_INSN (insn);
   rtx note;
   bool really_delete = true;

@@ -128,11 +127,22 @@ delete_insn (rtx insn)
       if (! can_delete_label_p (insn))
	{
	  const char *name = LABEL_NAME (insn);
+	  basic_block bb = BLOCK_FOR_INSN (insn);
+	  rtx bb_note = NEXT_INSN (insn);

	  really_delete = false;
	  PUT_CODE (insn, NOTE);
	  NOTE_KIND (insn) = NOTE_INSN_DELETED_LABEL;
	  NOTE_DELETED_LABEL_NAME (insn) = name;
+
+	  if (bb_note != NULL_RTX && NOTE_INSN_BASIC_BLOCK_P (bb_note)
+	      && BLOCK_FOR_INSN (bb_note) == bb)
+	    {
+	      reorder_insns_nobb (insn, insn, bb_note);
+	      BB_HEAD (bb) = bb_note;
+	      if
Re: [PATCH][ARM] NEON DImode neg
On 12/04/12 16:48, Richard Earnshaw wrote:

If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first:

	vmov Dd, #0
	vsub Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you want three.

This patch implements the changes you suggested. I've done a full bootstrap and test and found no regressions. OK?

Andrew

P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that, the load of #0 needs a constant pool, and loading constants this late has a bug at -O0.)
Re: [PATCH][ARM] NEON DImode neg
And now with the patch. :(

On 14/04/12 13:48, Andrew Stubbs wrote:
On 12/04/12 16:48, Richard Earnshaw wrote:

If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first:

	vmov Dd, #0
	vsub Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you want three.

This patch implements the changes you suggested. I've done a full bootstrap and test and found no regressions. OK?

Andrew

P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that, the load of #0 needs a constant pool, and loading constants this late has a bug at -O0.)

2012-04-12  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* config/arm/arm.md (negdi2): Use gen_negdi2_neon.
	* config/arm/neon.md (negdi2_neon): New insn.
	Also add splitters for core and NEON registers.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 751997f..f1dbbf7 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4048,7 +4048,13 @@
	  (neg:DI (match_operand:DI 1 "s_register_operand" "")))
    (clobber (reg:CC CC_REGNUM))])]
   "TARGET_EITHER"
-  ""
+  {
+    if (TARGET_NEON)
+      {
+        emit_insn (gen_negdi2_neon (operands[0], operands[1]));
+        DONE;
+      }
+  }
 )

 ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 3c88568..8c8b02d 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -922,6 +922,45 @@
	 (const_string "neon_int_3")))]
 )

+(define_insn "negdi2_neon"
+  [(set (match_operand:DI 0 "s_register_operand"	 "=&w, w,r,&r")
+	(neg:DI (match_operand:DI 1 "s_register_operand" "  w, w,0, r")))
+   (clobber (match_scratch:DI 2				 "= X,&w,X, X"))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON"
+  "#"
+  [(set_attr "length" "8")]
+)
+
+; Split negdi2_neon for vfp registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (const_int 0))
+   (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1)))
+	      (clobber (reg:CC CC_REGNUM))])]
+  {
+    if (!REG_P (operands[2]))
+      operands[2] = operands[0];
+  }
+)
+
+; Split negdi2_neon for core registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_32BIT && reload_completed
+   && arm_general_register_operand (operands[0], DImode)"
+  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
+	      (clobber (reg:CC CC_REGNUM))])]
+  ""
+)
+
 (define_insn "*umin<mode>3_neon"
   [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
	(umin:VDQIW (match_operand:VDQIW 1 "s_register_operand" "w")
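Not part of the original mail: a minimal C sketch of the operation the negdi2 pattern implements, with Richard's suggested two-register NEON sequence noted in a comment. The function name is mine, for illustration only; what the compiler actually emits depends on target flags and register allocation.

```c
#include <stdint.h>

/* DImode (64-bit) negation, the operation expanded by negdi2.  With the
   NEON variant suggested above the compiler can emit, sketchily:
       vmov  Dd, #0        @ zero the (early-clobbered) destination
       vsub  Dd, Dd, Dm    @ Dd = 0 - Dm
   i.e. 0 - x, using only two D registers instead of three.  */
int64_t
negdi2 (int64_t x)
{
  return 0 - x;
}
```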
Re: _GLIBCXX_ATOMIC_BUILTINS too coarse
This patch partially reverts the change made on 2012-02-10 that partially disabled builtin atomics on powerpc, resulting in inconsistent locking (mix of atomics and pthread mutexes) and an ABI incompatibility with previous versions of libstdc++. See the PR for all the gory details.

Applying to mainline with Jonathan's approval, and in a few days to the 4.7 branch assuming no problems appear due to this change on mainline.

	PR libstdc++/52839
	* acinclude.m4 (_GLIBCXX_ATOMIC_BUILTINS): Do not depend on
	glibcxx_cv_atomic_long_long.
	* configure: Regenerate.

Index: libstdc++-v3/acinclude.m4
===================================================================
--- libstdc++-v3/acinclude.m4	(revision 186130)
+++ libstdc++-v3/acinclude.m4	(working copy)
@@ -2861,11 +2861,10 @@
   CXXFLAGS="$old_CXXFLAGS"
   AC_LANG_RESTORE

-  # Set atomicity_dir to builtins if all of above tests pass.
+  # Set atomicity_dir to builtins if all but the long long test above passes.
   if test "$glibcxx_cv_atomic_bool" = yes \
      && test "$glibcxx_cv_atomic_short" = yes \
-     && test "$glibcxx_cv_atomic_int" = yes \
-     && test "$glibcxx_cv_atomic_long_long" = yes ; then
+     && test "$glibcxx_cv_atomic_int" = yes; then
     AC_DEFINE(_GLIBCXX_ATOMIC_BUILTINS, 1,
	      [Define if the compiler supports C++11 atomics.])
     atomicity_dir=cpu/generic/atomicity_builtins

-- 
Alan Modra
Australia Development Lab, IBM
Re: [PATCH] Fix PRs c/52283/37985
As far as I know, this patch hasn't been reviewed: http://patchwork.ozlabs.org/patch/150636/ Cheers, Manuel. On 4 April 2012 10:17, Christian Bruel christian.br...@st.com wrote: Hello, Is it OK to push the cleaning of TREE_NO_WARNING to fix the constant expressions errors discrepancies, as discussed in bugzilla #52283, now that the trunk is open ? Many thanks,
[PATCH] Fix PR52976
This patch corrects two errors in reassociating expressions with repeated factors.

First, undistribution needs to recognize repeated factors. For now, repeated factors will be ineligible for this optimization. In the future, this can be improved.

Second, when a __builtin_powi call is introduced, its target SSA name must be given a rank higher than other operands in the operand list. Otherwise, uses of the call result may be introduced prior to the call.

Bootstrapped and regression tested on powerpc64-linux. Confirmed that cpu2000 and cpu2006 SPEC tests build cleanly. OK for trunk?

Thanks, Bill

2012-04-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

	PR tree-optimization/52976
	* tree-ssa-reassoc.c (add_to_ops_vec_max_rank): New function.
	(undistribute_ops_list): Ops with repeat counts aren't eligible for
	undistribution.
	(attempt_builtin_powi): Call add_to_ops_vec_max_rank.

Index: gcc/tree-ssa-reassoc.c
===================================================================
--- gcc/tree-ssa-reassoc.c	(revision 186393)
+++ gcc/tree-ssa-reassoc.c	(working copy)
@@ -544,6 +544,28 @@ add_repeat_to_ops_vec (VEC(operand_entry_t, heap)
   reassociate_stats.pows_encountered++;
 }

+/* Add an operand entry to *OPS for the tree operand OP, giving the
+   new entry a larger rank than any other operand already in *OPS.  */
+
+static void
+add_to_ops_vec_max_rank (VEC(operand_entry_t, heap) **ops, tree op)
+{
+  operand_entry_t oe = (operand_entry_t) pool_alloc (operand_entry_pool);
+  operand_entry_t oe1;
+  unsigned i;
+  unsigned max_rank = 0;
+
+  FOR_EACH_VEC_ELT (operand_entry_t, *ops, i, oe1)
+    if (oe1->rank > max_rank)
+      max_rank = oe1->rank;
+
+  oe->op = op;
+  oe->rank = max_rank + 1;
+  oe->id = next_operand_entry_id++;
+  oe->count = 1;
+  VEC_safe_push (operand_entry_t, heap, *ops, oe);
+}
+
 /* Return true if STMT is reassociable operation containing a binary
    operation with tree code CODE, and is inside LOOP.  */

@@ -1200,6 +1222,7 @@ undistribute_ops_list (enum tree_code opcode,
       dcode = gimple_assign_rhs_code (oe1def);
       if ((dcode != MULT_EXPR
	   && dcode != RDIV_EXPR)
+	  || oe1->count != 1
	  || !is_reassociable_op (oe1def, dcode, loop))
	continue;

@@ -1243,6 +1266,8 @@ undistribute_ops_list (enum tree_code opcode,
	  oecount c;
	  void **slot;
	  size_t idx;
+	  if (oe1->count != 1)
+	    continue;
	  c.oecode = oecode;
	  c.cnt = 1;
	  c.id = next_oecount_id++;

@@ -1311,7 +1336,7 @@ undistribute_ops_list (enum tree_code opcode,
       FOR_EACH_VEC_ELT (operand_entry_t, subops[i], j, oe1)
	{
-	  if (oe1->op == c->op)
+	  if (oe1->op == c->op && oe1->count == 1)
	    {
	      SET_BIT (candidates2, i);
	      ++nr_candidates2;

@@ -3275,8 +3300,10 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
	  gsi_insert_before (&gsi, pow_stmt, GSI_SAME_STMT);
	}

-      /* Append the result of this iteration to the ops vector.  */
-      add_to_ops_vec (ops, iter_result);
+      /* Append the result of this iteration to the ops vector.
+	 Give it a rank higher than all other ranks in the ops vector
+	 so that all uses of it will be forced to come after it.  */
+      add_to_ops_vec_max_rank (ops, iter_result);

       /* Decrement the occurrence count of each element in the product
	 by the count found above, and remove this many copies of each
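Not part of the original mail: a minimal C example of the kind of expression attempt_builtin_powi targets, for readers unfamiliar with the pass. The repeated factor x*x*x can be reassociated into a __builtin_powi (x, 3) call; the fix above ensures the SSA name holding that call's result outranks the remaining operands, so no use of it is emitted before the call. The function name is mine, for illustration.

```c
/* Repeated-factor multiplication: with -O2 -ffast-math GCC's reassociation
   pass may rewrite x*x*x as __builtin_powi (x, 3) and then multiply the
   result by y.  The result is numerically the same either way.  */
double
repeated_factor (double x, double y)
{
  return x * x * x * y;
}
```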
Re: [PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion
Jason Merrill ja...@redhat.com writes:

On 04/10/2012 10:55 AM, Dodji Seketeli wrote:
+ if (CPP_OPTION (pfile, track_macro_expansion))

I think this should check context->tokens_kind rather than the compiler flag.

OK. Below is the updated patch that does that. Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

libcpp/

	* macro.c (cpp_sys_macro_p): Support -ftrack-macro-expansion.
---
 libcpp/macro.c | 7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/libcpp/macro.c b/libcpp/macro.c
index 54de3e3..4f8e52f 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -2436,7 +2436,12 @@ cpp_get_token_with_location (cpp_reader *pfile, source_location *loc)
 int
 cpp_sys_macro_p (cpp_reader *pfile)
 {
-  cpp_hashnode *node = pfile->context->c.macro;
+  cpp_hashnode *node = NULL;
+
+  if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED)
+    node = pfile->context->c.mc->macro_node;
+  else
+    node = pfile->context->c.macro;

   return node && node->value.macro && node->value.macro->syshdr;
 }

-- 
		Dodji
Re: [Patch, Fortran] PR52916 - fix TREE_PUBLIC() = 0 for module procedures
* PING *

It is a rather serious rejects-valid regression. It also affects SPEC CPU 2006, and the patch has been confirmed (cf. PR) to fix the regression.

Tobias

Tobias Burnus wrote:

Dear all,

my recent patch for setting PRIVATE module variables and procedures to TREE_PUBLIC()=0 had a flaw: I completely forgot about generic interfaces. Even if the specific name is PRIVATE, the specific function is still callable through a (public) generic name. Thanks to HJ for the report. (The bug causes failures in SPEC CPU 2006.)

I think the handling of type-bound procedures is correct. However, I wouldn't mind if someone could confirm it. I only check for the specific entries as GENERIC, OPERATOR and ASSIGNMENT use a type-bound-proc name, which is already handled. I also didn't try to optimize for private DT, private generics etc. First, I think it is not needed. And secondly, through inheritance, it can get extremely complicated.

Built and regtested on x86-64-linux. OK for the trunk?

Tobias
Re: [Patch, Fortran] PR52916 - fix TREE_PUBLIC() = 0 for module procedures
Hi Tobias, * PING * It is a rather serious rejects-valid regression. It also affects SPEC CPU 2006 and the patch has been confirmed (cf. PR) to fix the regression. OK for trunk. Thanks for the patch! Thomas
Re: [patch, fortran] Trim spaces on list-directed reads
On 04/10/2012 08:32 AM, Thomas Koenig wrote: Hello world, this patch effectively trims the spaces from the string on list-directed reads. This avoids the large overhead on processing these spaces when reading from long lines. I didn't do this for internal units which are arrays because this would need a separate calculation for each record, and would mean a major change to the way the code is structured. The running time for the test case from PR 50673 is reduced from ~8 s to 0.1 s by this patch. Regression-tested. OK for trunk? yes, very good, OK for trunk. Sorry for delay, time is squeezed badly here. Regards, Jerry
Re: [PATCH] Prevent 'control reaches end of non-void function' warning for DO_WHILE
Jason,

On 18/02/12 09:33, Jason Merrill wrote:
On 01/22/2012 03:38 AM, Tom de Vries wrote:

Sorry I didn't notice this patch until now; please CC me on C++ patches, or at least mention C++ in the subject line.

OK, will do.

+ tree expr = NULL;
+ append_to_statement_list (*block, &expr);
+ *block = expr;

Rather than doing this dance here, I think it would be better to enhance append_to_statement_list to handle the case of the list argument being a non-list.

Added return value to append_to_statement_list, so now it's:

  *block = append_to_statement_list (*block, NULL);

+ cp_walk_tree (&incr, cp_genericize_r, data, NULL);
+ if (incr && EXPR_P (incr))
+   SET_EXPR_LOCATION (incr, start_locus);

It might be better to set the location on incr before genericizing, so that the location can trickle down.

I see. I reordered the code.

+ t = build1 (GOTO_EXPR, void_type_node, get_bc_label (bc_break));
+ SET_EXPR_LOCATION (t, start_locus);

Here you can use build1_loc instead of two separate statements.

Done.

- /* If we use a LOOP_EXPR here, we have to feed the whole thing
-    back through the main gimplifier to lower it.  Given that we
-    have to gimplify the loop body NOW so that we can resolve
-    break/continue stmts, seems easier to just expand to gotos.  */

Since we're now lowering the loop at genericize time rather than gimplify, what's the rationale for not using LOOP_EXPR/EXIT_EXPR? We should still say something here. I suppose the rationale is that the C front end currently goes straight to gotos.

I think we could indeed use LOOP_EXPR/EXIT_EXPR. I'm not sure what the gain would be, since at least for C it would be a construct alive only between genericize and gimplify. But I guess it makes sense from an orthogonality point of view. Added a comment with todo.

  if (cond != error_mark_node)
    {
-     gimplify_expr (&cond, &exit_seq, NULL, is_gimple_val, fb_rvalue);
-     stmt = gimple_build_cond (NE_EXPR, cond,
-			       build_int_cst (TREE_TYPE (cond), 0),
-			       gimple_label_label (top),
-			       get_bc_label (bc_break));
-     gimple_seq_add_stmt (&exit_seq, stmt);
+     cond = build2 (NE_EXPR, boolean_type_node, cond,
+		    build_int_cst (TREE_TYPE (cond), 0));

I don't think we still need this extra comparison to 0, that seems like a gimple-specific thing.

OK, removed.

  if (FOR_INIT_STMT (stmt))
+   append_to_statement_list (FOR_INIT_STMT (stmt), &expr);
+ genericize_cp_loop (&loop, EXPR_LOCATION (stmt), FOR_COND (stmt),
+		     FOR_BODY (stmt), FOR_EXPR (stmt), 1, walk_subtrees, data);

The call to genericize_cp_loop will clear *walk_subtrees, which means we don't genericize the FOR_INIT_STMT.

I see. Added a call to cp_genericize_r.

+ tree jump = build_and_jump (&label);

Again, let's use build1_loc.

Done.

+ *stmt_p = build_and_jump (&label);
+ SET_EXPR_LOCATION (*stmt_p, location);

And here.

Done.

+ stmt = make_node (OMP_FOR);

Why make a new OMP_FOR rather than repurpose the one we already have? We've already modified its operands.

Done.

Bootstrapped and reg-tested on x86_64. OK for trunk?

Thanks,
- Tom

Jason

2012-04-14  Tom de Vries  t...@codesourcery.com

	* tree-iterator.c (append_to_statement_list_1, append_to_statement_list)
	(append_to_statement_list_force): Return resulting list.  Handle
	list_p == NULL.
	* tree-iterator.h (append_to_statement_list)
	(append_to_statement_list_force): Add tree return type.
	* cp-gimplify.c (begin_bc_block): Add location parameter and use
	as location argument to create_artificial_label.
	(finish_bc_block): Change return type to void.  Remove body_seq
	parameter, and add block parameter.  Append label to STMT_LIST and
	return in block.
	(gimplify_cp_loop, gimplify_for_stmt, gimplify_while_stmt)
	(gimplify_do_stmt, gimplify_switch_stmt): Remove function.
(genericize_cp_loop, genericize_for_stmt, genericize_while_stmt) (genericize_do_stmt, genericize_switch_stmt, genericize_continue_stmt) (genericize_break_stmt, genericize_omp_for_stmt): New function. (cp_gimplify_omp_for): Remove bc_continue processing. (cp_gimplify_expr): Genericize VEC_INIT_EXPR. (cp_gimplify_expr): Mark FOR_STMT, WHILE_STMT, DO_STMT, SWITCH_STMT, CONTINUE_STMT, and BREAK_STMT as unreachable. (cp_genericize_r): Genericize FOR_STMT, WHILE_STMT, DO_STMT, SWITCH_STMT, CONTINUE_STMT, BREAK_STMT and OMP_FOR. (cp_genericize_tree): New function, factored out of ... (cp_genericize): ... this function. * g++.dg/pr51264-4.C: New test. Index: gcc/tree-iterator.c
[v3] libstdc++/52699
Hi,

this is what I'm going to commit to mainline (and likely 4.7.1 too) to solve a number of issues affecting the implementation of independent_bits_engine::operator()() as originally contributed: essentially we want to be very careful with overflows (+ other lesser issues and small optimizations). Many thanks to Marc Glisse for off-line reviewing and helping over the last day or so!!

Tested x86_64-linux multilib.

Paolo.

2012-04-14  Paolo Carlini  paolo.carl...@oracle.com

	PR libstdc++/52699
	* include/bits/random.tcc (independent_bits_engine::operator()()):
	Avoid various overflows; use common_type on result_type and
	_RandomNumberEngine::result_type; avoid floating point computations;
	other smaller tweaks.
	* include/bits/random.tcc (uniform_int_distribution::operator()):
	Use common_type; assume _UniformRandomNumberGenerator::result_type
	unsigned; tidy.
	* include/bits/stl_algobase.h (__lg(unsigned), __lg(unsigned long),
	__lg(unsigned long long)): Add.

Index: include/bits/stl_algobase.h
===================================================================
--- include/bits/stl_algobase.h	(revision 186448)
+++ include/bits/stl_algobase.h	(working copy)
@@ -989,14 +989,26 @@
     __lg(int __n)
     { return sizeof(int) * __CHAR_BIT__ - 1 - __builtin_clz(__n); }

+  inline unsigned
+  __lg(unsigned __n)
+  { return sizeof(int) * __CHAR_BIT__ - 1 - __builtin_clz(__n); }
+
   inline long
   __lg(long __n)
   { return sizeof(long) * __CHAR_BIT__ - 1 - __builtin_clzl(__n); }

+  inline unsigned long
+  __lg(unsigned long __n)
+  { return sizeof(long) * __CHAR_BIT__ - 1 - __builtin_clzl(__n); }
+
   inline long long
   __lg(long long __n)
   { return sizeof(long long) * __CHAR_BIT__ - 1 - __builtin_clzll(__n); }

+  inline unsigned long long
+  __lg(unsigned long long __n)
+  { return sizeof(long long) * __CHAR_BIT__ - 1 - __builtin_clzll(__n); }
+
 _GLIBCXX_END_NAMESPACE_VERSION

 _GLIBCXX_BEGIN_NAMESPACE_ALGO
Index: include/bits/random.tcc
===================================================================
--- include/bits/random.tcc	(revision 186448)
+++ include/bits/random.tcc	(working copy)
@@ -730,40 +730,65 @@
     independent_bits_engine<_RandomNumberEngine, __w, _UIntType>::
     operator()()
     {
-      const long double __r = static_cast<long double>(_M_b.max())
-			    - static_cast<long double>(_M_b.min()) + 1.0L;
-      const result_type __m = std::log(__r) / std::log(2.0L);
-      result_type __n, __n0, __y0, __y1, __s0, __s1;
+      typedef typename _RandomNumberEngine::result_type _Eresult_type;
+      const _Eresult_type __r
+	= (_M_b.max() - _M_b.min() < std::numeric_limits<_Eresult_type>::max()
+	   ? _M_b.max() - _M_b.min() + 1 : 0);
+      const unsigned __edig = std::numeric_limits<_Eresult_type>::digits;
+      const unsigned __m = __r ? std::__lg(__r) : __edig;
+
+      typedef typename std::common_type<_Eresult_type, result_type>::type
+	__ctype;
+      const unsigned __cdig = std::numeric_limits<__ctype>::digits;
+
+      unsigned __n, __n0;
+      __ctype __s0, __s1, __y0, __y1;
+
       for (size_t __i = 0; __i < 2; ++__i)
	{
	  __n = (__w + __m - 1) / __m + __i;
	  __n0 = __n - __w % __n;
-	  const result_type __w0 = __w / __n;
-	  const result_type __w1 = __w0 + 1;
-	  __s0 = result_type(1) << __w0;
-	  __s1 = result_type(1) << __w1;
-	  __y0 = __s0 * (__r / __s0);
-	  __y1 = __s1 * (__r / __s1);
-	  if (__r - __y0 <= __y0 / __n)
+	  const unsigned __w0 = __w / __n;  // __w0 <= __m
+
+	  __s0 = 0;
+	  __s1 = 0;
+	  if (__w0 < __cdig)
+	    {
+	      __s0 = __ctype(1) << __w0;
+	      __s1 = __s0 << 1;
+	    }
+
+	  __y0 = 0;
+	  __y1 = 0;
+	  if (__r)
+	    {
+	      __y0 = __s0 * (__r / __s0);
+	      if (__s1)
+		__y1 = __s1 * (__r / __s1);
+
+	      if (__r - __y0 <= __y0 / __n)
+		break;
+	    }
+	  else
	    break;
	}

       result_type __sum = 0;
       for (size_t __k = 0; __k < __n0; ++__k)
	{
-	  result_type __u;
+	  __ctype __u;
	  do
	    __u = _M_b() - _M_b.min();
-	  while (__u >= __y0);
-	  __sum = __s0 * __sum + __u % __s0;
+	  while (__y0 && __u >= __y0);
+	  __sum = __s0 * __sum + (__s0 ? __u % __s0 : __u);
	}
       for (size_t __k = __n0; __k < __n; ++__k)
	{
-	  result_type __u;
+	  __ctype __u;
	  do
	    __u = _M_b() - _M_b.min();
-	  while (__u >= __y1);
-	  __sum = __s1 * __sum + __u % __s1;
+	  while (__y1 && __u >= __y1);
+	  __sum = __s1 * __sum + (__s1 ? __u % __s1 : __u);
	}
       return __sum;
     }

@@ -840,12 +865,11 @@
       operator()(_UniformRandomNumberGenerator& __urng, const
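Not part of the patch: a minimal usage sketch of the adaptor being fixed. Each call to independent_bits_engine's operator() must return exactly __w (here 24) freshly generated bits, i.e. a value below 2^24, regardless of the base engine's range; that is the invariant the overflow-hardening above protects. The helper function name is mine, for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>

// Draws n values from a 24-bit independent_bits_engine over mt19937 and
// returns the maximum seen; every draw must be < 2^24.
std::uint32_t
max_of_n_draws (int n)
{
  std::independent_bits_engine<std::mt19937, 24, std::uint32_t> eng;
  std::uint32_t m = 0;
  while (n-- > 0)
    m = std::max (m, static_cast<std::uint32_t> (eng ()));
  return m;
}
```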
[cxx-conversion] COMPILER_FOR_BUILD in C++ mode
2012-04-14  Pedro Lamarão  pedro.lama...@gmail.com

	* Makefile.in: Define COMPILER_FOR_BUILD etc. conditionally,
	according to ENABLE_BUILD_WITH_CXX.

--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -731,12 +731,22 @@ CC_FOR_BUILD = @CC_FOR_BUILD@
 BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE

 # Native compiler that we use.  This may be C++ some day.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 COMPILER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_COMPILERFLAGS = $(BUILD_CFLAGS)
+else
+COMPILER_FOR_BUILD = $(COMPILER)
+BUILD_COMPILERFLAGS = $(COMPILER_FLAGS) -DIN_GCC -DGENERATOR_FILE
+endif

 # Native linker that we use.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 LINKER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_LINKERFLAGS = $(BUILD_CFLAGS)
+else
+LINKER_FOR_BUILD = $(LINKER)
+BUILD_LINKERFLAGS = $(LINKER_FLAGS)
+endif

 # Native linker and preprocessor flags.  For x-fragment overrides.
 BUILD_LDFLAGS=@BUILD_LDFLAGS@
Re: [PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion
OK. Jason