Re: RFC: Faster for_each_rtx-like iterators
Trevor Saunders tsaund...@mozilla.com writes:
> On Wed, May 07, 2014 at 09:52:49PM +0100, Richard Sandiford wrote:
>> I noticed for_each_rtx showing up in profiles and thought I'd have a go
>> at using worklist-based iterators instead.  So far I have three:
>>
>>   FOR_EACH_SUBRTX: iterates over const_rtx subrtxes of a const_rtx
>>   FOR_EACH_SUBRTX_VAR: iterates over rtx subrtxes of an rtx
>>   FOR_EACH_SUBRTX_PTR: iterates over subrtx pointers of an rtx *
>>
>> with FOR_EACH_SUBRTX_PTR being the direct for_each_rtx replacement.
>>
>> I made FOR_EACH_SUBRTX the default (unsuffixed) version because most
>> walks really don't modify the structure.  I think we should encourage
>> const_rtxes to be used wherever possible.  E.g. it might make it
>> easier to have non-GC storage for temporary rtxes in future.
>>
>> I've locally replaced all for_each_rtx calls in the generic code with
>> these iterators and they make things reproducibly faster.  The speed-up
>> on full --enable-checking=release ./cc1 and ./cc1plus times is only
>> about 1%, but maybe that's enough to justify the churn.
>
> seems pretty nice, and it seems like it'll make code a little more
> readable too :)
>
>> Implementation-wise, the main observation is that most subrtxes are
>> part of a single contiguous sequence of "e" fields.  E.g. when compiling
>> an oldish combine.ii on x86_64-linux-gnu with -O2, we iterate over the
>> subrtxes of 7,636,542 rtxes.  Of those:
>>
>>   (A) 4,459,135 (58.4%) are leaf rtxes with no "e" or "E" fields,
>>   (B) 3,133,875 (41.0%) are rtxes with a single block of "e" fields and
>>       no "E" fields, and
>>   (C)    43,532 (0.6%) are more complicated.
>>
>> (A) is really a special case of (B) in which the block has zero length.
>> Those are the only two cases that really need to be handled inline.
>> The implementation does this by having a mapping from an rtx code to the
>> bounds of its "e" sequence, in the form of a start index and count.
>>
>> Out of (C), the vast majority (43,509) are PARALLELs.  However, as you'd
>> probably expect, bloating the inline code with that case made things
>> slower rather than faster.
>>
>> The vast majority (in fact all in the combine.ii run above) of iterations
>> can be done with a 16-element stack worklist.  We obviously still need a
>> heap fallback for the pathological cases though.
>>
>> I spent a bit of time trying different iterator implementations and
>> seeing which produced the best code.  Specific results from that were:
>>
>> - The storage used for the worklist is separate from the iterator,
>>   in order to avoid capturing iterator fields.
>>
>> - Although the natural type of the storage would be auto_vec <..., 16>,
>>   that produced some overhead compared with a separate stack array and
>>   heap vector pointer.  With the heap vector pointer, the only overhead
>>   is an assignment in the constructor and an "if (x) release (x)"-style
>>   sequence in the destructor.  I think the extra complication over
>>   auto_vec is worth it because in this case the heap version is so very
>>   rarely needed.
>
> hm, where does the overhead come from exactly? it seems like if it's
> faster to use vec<T, va_heap, vl_embed> *foo; we should fix something
> about vectors since this isn't the only place it could matter.  does it
> matter if you use vec<T, va_heap, vl_embed> * or vec<T>?  the second is
> basically just a wrapper around the former, which I'd expect to have no
> effect.  I'm not saying you're doing the wrong thing here, but if we can
> make generic vectors faster we probably should ;) or is the issue the
> __builtin_expect()s you can add?

Part of the problem is that by having an array in the vec itself,
the other fields effectively have their address taken too.
So m_alloc, m_num and m_using_auto_storage need to be set up
and maintained on the stack, even though we're almost sure that they
will never be used.

>> - The maximum number of fields in (B)-type rtxes is 3.  We get better
>>   code by making that explicit rather than having a general loop.
>>
>> - (C) codes map to an "e" count of UCHAR_MAX, so we can use a single
>>   check to test for that and for cases where the stack worklist is
>>   too small.
>
> can we use uint8_t?

We don't really use that in GCC yet.  I don't mind setting a precedent
though :-)

>> To give an example:
>>
>> /* Callback for for_each_rtx, that returns 1 upon encountering a VALUE
>>    whose UID is greater than the int uid that D points to.  */
>>
>> static int
>> refs_newer_value_cb (rtx *x, void *d)
>> {
>>   if (GET_CODE (*x) == VALUE && CSELIB_VAL_PTR (*x)->uid > *(int *)d)
>>     return 1;
>>
>>   return 0;
>> }
>>
>> /* Return TRUE if EXPR refers to a VALUE whose uid is greater than
>>    that of V.  */
>>
>> static bool
>> refs_newer_value_p (rtx expr, rtx v)
>> {
>>   int minuid = CSELIB_VAL_PTR (v)->uid;
>>
>>   return for_each_rtx (&expr, refs_newer_value_cb, &minuid);
>> }
>>
>> becomes:
>>
>> /* Return TRUE if EXPR refers to a VALUE whose uid is greater than
>>    that of V.  */
>>
>> static bool
>> refs_newer_value_p (const_rtx expr, rtx v)
>> {
>>   int minuid = CSELIB_VAL_PTR (v)->uid;
>>   subrtx_iterator::array_type
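The worklist idea discussed in this thread (a small fixed stack array, with a heap vector that is only allocated in the rare overflow case) can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct node`, `struct worklist`, `wl_push`, `wl_pop` and `count_greater` are hypothetical names, and a node with up to three children stands in for a (B)-type rtx.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LOCAL_SLOTS 16  /* size of the stack worklist mentioned in the thread */
#define MAX_KIDS 3      /* (B)-type rtxes have at most 3 "e" fields */

/* Hypothetical stand-in for an rtx: a value plus a few child pointers.  */
struct node
{
  int value;
  int n_kids;
  const struct node *kid[MAX_KIDS];
};

/* Worklist storage, kept separate from the traversal loop itself.  */
struct worklist
{
  const struct node *local[LOCAL_SLOTS];
  const struct node **heap;  /* NULL until the local array overflows */
  size_t heap_cap;
  size_t len;
};

static void
wl_push (struct worklist *wl, const struct node *n)
{
  if (!wl->heap && wl->len < LOCAL_SLOTS)
    {
      /* Common case: everything fits in the stack array.  */
      wl->local[wl->len++] = n;
      return;
    }
  if (!wl->heap)
    {
      /* Rare case: first overflow, so migrate the entries to the heap.  */
      wl->heap_cap = LOCAL_SLOTS * 2;
      wl->heap = malloc (wl->heap_cap * sizeof *wl->heap);
      if (!wl->heap)
        abort ();
      memcpy (wl->heap, wl->local, wl->len * sizeof *wl->heap);
    }
  else if (wl->len == wl->heap_cap)
    {
      const struct node **bigger
        = realloc (wl->heap, wl->heap_cap * 2 * sizeof *wl->heap);
      if (!bigger)
        abort ();
      wl->heap = bigger;
      wl->heap_cap *= 2;
    }
  wl->heap[wl->len++] = n;
}

static const struct node *
wl_pop (struct worklist *wl)
{
  const struct node **base = wl->heap ? wl->heap : wl->local;
  return base[--wl->len];
}

/* Count the nodes reachable from ROOT whose value exceeds LIMIT,
   visiting them in pre-order without recursion or callbacks.  */
static int
count_greater (const struct node *root, int limit)
{
  struct worklist wl = { .heap = NULL, .heap_cap = 0, .len = 0 };
  int count = 0;

  wl_push (&wl, root);
  while (wl.len > 0)
    {
      const struct node *n = wl_pop (&wl);
      if (n->value > limit)
        count++;
      /* Push children in reverse so the first child is visited first.  */
      for (int i = n->n_kids - 1; i >= 0; i--)
        wl_push (&wl, n->kid[i]);
    }
  free (wl.heap);  /* free (NULL) is a no-op in the common case */
  return count;
}
```

The only per-traversal cost of the fallback is the NULL initialisation and the final free, which mirrors the "assignment in the constructor" and "if (x) release (x)" overhead described above.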
Re: [PATCH] Windows libiberty: Don't quote args unnecessarily (v2)
2014-05-07 8:55 GMT+02:00 Ray Donnelly mingw.andr...@gmail.com:
> We only quote arguments that contain spaces, \t or " characters to
> prevent wasting 2 characters per argument of the CreateProcess()
> 32,768 limit.
>
> libiberty/
>         * pex-win32.c (argv_to_cmdline): Don't quote args unnecessarily
>
> Ray Donnelly (1):
>   Windows libiberty: Don't quote args unnecessarily
>
>  libiberty/pex-win32.c | 46 +-
>  1 file changed, 37 insertions(+), 9 deletions(-)
>
> --
> 1.9.2

Did you forget to attach the patch?  Or is that my mail client?

Kai
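The quoting rule described in the patch summary can be sketched as a small predicate. This is a hypothetical helper written for illustration, not the code in pex-win32.c; it assumes the characters that force quoting are space, tab and the double quote, and it additionally assumes that an empty argument must be quoted so that it does not disappear from the command line.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not the actual argv_to_cmdline logic: return true
   if ARG has to be wrapped in double quotes on a Windows command line.  */
static bool
needs_quoting (const char *arg)
{
  /* Assumption: an empty argument would vanish unless it is quoted.  */
  if (arg[0] == '\0')
    return true;

  /* Space, tab and double quote are the characters that force quoting.  */
  return strpbrk (arg, " \t\"") != NULL;
}
```

Skipping the quotes for arguments such as -O2 saves two characters each, which is the saving against the 32,768-character CreateProcess() limit that the patch is after.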
Re: [PATCH] Windows libiberty: Don't quote args unnecessarily
Ah, that was my mail client.  The patch is OK IMO.  I've added Ian to CC
for a second look at it.

Cheers,
Kai
Re: [committed] PR 61095: tsan fallout from wide-int merge
Richard Sandiford rdsandif...@googlemail.com writes:
> This PR was due to code in which -(int) foo was supposed to be
> sign-extended, but was being ORed with an unsigned int and so ended
> up being zero-extended.  Fixed by using the proper-width type.

As Kostya rightly said in the PR, this should have had a testcase too.

Tested on x86_64-linux-gnu.  It failed before the patch on x86_64,
passes after it, and is skipped for -m32.  OK to install?

Thanks,
Richard


gcc/testsuite/
        PR tree-optimization/61095
        * gcc.dg/torture/pr61095.c: New test.

Index: gcc/testsuite/gcc.dg/torture/pr61095.c
===================================================================
--- /dev/null   2014-05-03 11:58:38.033951363 +0100
+++ gcc/testsuite/gcc.dg/torture/pr61095.c      2014-05-08 08:46:01.203827892 +0100
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+/* { dg-require-effective-target lp64 } */
+
+extern void __attribute__ ((noreturn)) abort (void);
+
+int __attribute__ ((noinline, noclone))
+foo (unsigned long addr) {
+    unsigned long *p = (unsigned long*)((addr & 0x83f8UL) * 4);
+    unsigned long xxx = (unsigned long)(p + 1);
+    return xxx >= 0x3c0UL;
+}
+
+int
+main (void)
+{
+  if (foo (0))
+    abort ();
+  if (foo (0x7c00UL))
+    abort ();
+  if (!foo (0xfc00UL))
+    abort ();
+  return 0;
+}
Re: [committed] PR 61095: tsan fallout from wide-int merge
On Thu, May 8, 2014 at 9:48 AM, Richard Sandiford rdsandif...@googlemail.com wrote:
> Richard Sandiford rdsandif...@googlemail.com writes:
>> This PR was due to code in which -(int) foo was supposed to be
>> sign-extended, but was being ORed with an unsigned int and so ended
>> up being zero-extended.  Fixed by using the proper-width type.
>
> As Kostya rightly said in the PR, this should have had a testcase too.
>
> Tested on x86_64-linux-gnu.  It failed before the patch on x86_64,
> passes after it, and is skipped for -m32.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/testsuite/
>         PR tree-optimization/61095
>         * gcc.dg/torture/pr61095.c: New test.
>
> Index: gcc/testsuite/gcc.dg/torture/pr61095.c
> ===================================================================
> --- /dev/null   2014-05-03 11:58:38.033951363 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr61095.c      2014-05-08 08:46:01.203827892 +0100
> @@ -0,0 +1,23 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target lp64 } */
> +
> +extern void __attribute__ ((noreturn)) abort (void);
> +
> +int __attribute__ ((noinline, noclone))
> +foo (unsigned long addr) {
> +    unsigned long *p = (unsigned long*)((addr & 0x83f8UL) * 4);
> +    unsigned long xxx = (unsigned long)(p + 1);
> +    return xxx >= 0x3c0UL;
> +}
> +
> +int
> +main (void)
> +{
> +  if (foo (0))
> +    abort ();
> +  if (foo (0x7c00UL))
> +    abort ();
> +  if (!foo (0xfc00UL))
> +    abort ();
> +  return 0;
> +}
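The class of bug described in this thread (a negated int that should be sign-extended but gets zero-extended because it is combined with a narrower unsigned value first) can be reproduced in a few lines of C. This is a minimal sketch with hypothetical function names, not the code that was fixed in GCC; fixed-width types are used so that the behaviour does not depend on the size of long.

```c
#include <stdint.h>

/* Buggy shape: the negated value is combined with a 32-bit unsigned
   operand, so the result is 32 bits wide and the later widening
   zero-extends it.  */
static uint64_t
widen_via_unsigned32 (int32_t foo, uint32_t flags)
{
  return (uint64_t) ((uint32_t) -foo | flags);
}

/* Fixed shape: widen to the full-width signed type before negating,
   so the high bits are filled with copies of the sign bit.  */
static uint64_t
widen_via_int64 (int32_t foo, uint32_t flags)
{
  return (uint64_t) -(int64_t) foo | flags;
}
```

For foo = 8 the first function yields 0x00000000fffffff8 while the second yields 0xfffffffffffffff8, which is the difference between a zero-extended and a sign-extended mask.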
[PATCH, 1/2] shrink wrap a function with a single loop: copy propagation
Hi,

A similar issue was discussed in the thread
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01145.html.  The patches
are close to Jeff's suggestion: sink just the moves out of the
incoming argument registers.

This patch and the following one try to shrink-wrap a function with a
single loop, which cannot be handled by
split_live_ranges_for_shrink_wrap and prepare_shrink_wrap, since the
induction variable has more than one definition.  Taking the test case
in the patch as an example, the pseudo code before shrink-wrap looks
like:

     p = p2
     if (!p) goto return
 L1:
     ...
     p = ...
     ...
     goto L1
     ...
 return:

Function prepare_shrink_wrap does a PRE-like optimization to sink some
copies from the entry block to the live block.  The patches enhance
prepare_shrink_wrap to:
(1) Replace references to p with p2 in the entry block.  (This patch)
(2) Create a new basic block on the live edge to hold the copy
    "p = p2".  (Next patch)

After shrink-wrap, the pseudo code would look like:

     if (!p2) goto return
     p = p2
 L1:
     ...
     p = ...
     ...
     goto L1
 return:

Bootstrapped with no make check regressions on X86-64 and ARM.
No Spec2k INT performance regression for X86-64 and ARM with -O3.

With the two patches, for X86-64 Spec2K INT, the number of functions
shrink-wrapped increases from 619 to 671.  For 453.povray in Spec2006,
X86-64 is ~1% better and ARM THUMB mode is ~4% faster.  There is no
performance improvement for ARM mode since it uses the mov (subs) to
set the CC, and there is no way to sink it out of the entry block.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-05-08  Zhenqiang Chen  zhenqiang.c...@linaro.org

        * function.c (last_or_compare_p, try_copy_prop): New functions.
        (move_insn_for_shrink_wrap): Try copy propagation.
        (prepare_shrink_wrap): Separate last_uses from uses.

testsuite/ChangeLog:
2014-05-08  Zhenqiang Chen  zhenqiang.c...@linaro.org

        * shrink-wrap-loop.c: New test case.

diff --git a/gcc/function.c b/gcc/function.c
index 383a52a..764ac82 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5421,14 +5421,139 @@ next_block_for_reg (basic_block bb, int regno, int end_regno)
   return live_edge->dest;
 }
 
-/* Try to move INSN from BB to a successor.  Return true on success.
-   USES and DEFS are the set of registers that are used and defined
-   after INSN in BB.  */
+/* Check whether INSN is the last insn in BB or
+   a COMPARE for the last insn in BB.  */
+
+static bool
+last_or_compare_p (basic_block bb, rtx insn)
+{
+  rtx x = single_set (insn);
+
+  if ((insn == BB_END (bb))
+       || ((insn == PREV_INSN (BB_END (bb)))
+           && x && REG_P (SET_DEST (x))
+           && GET_MODE_CLASS (GET_MODE (SET_DEST (x))) == MODE_CC))
+    return true;
+
+  return false;
+}
+
+/* Try to copy propagate INSN with SRC and DEST in BB to the last COMPARE
+   or JUMP insn, which use registers in LAST_USES.  */
+
+static bool
+try_copy_prop (basic_block bb, rtx insn, rtx src, rtx dest,
+               HARD_REG_SET *last_uses)
+{
+  bool ret = false;
+  bool changed, is_asm;
+  unsigned i, alt, n_ops, dregno, sregno;
+
+  rtx x, r, n, tmp;
+
+  if (GET_CODE (dest) == SUBREG || GET_CODE (src) == SUBREG
+      || insn == BB_END (bb))
+    return false;
+
+  x = NEXT_INSN (insn);
+  sregno = REGNO (src);
+  dregno = REGNO (dest);
+
+  while (x != NULL_RTX)
+    {
+      tmp = NEXT_INSN (x);
+
+      if (BARRIER_P(x))
+        return false;
+
+      /* Skip other insns since dregno is not referred according to
+         previous checks.  */
+      if (!last_or_compare_p (bb, x))
+        {
+          x = tmp;
+          continue;
+        }
+      changed = 0;
+      extract_insn (x);
+      if (!constrain_operands (1))
+        fatal_insn_not_found (x);
+      preprocess_constraints ();
+      alt = which_alternative;
+      n_ops = recog_data.n_operands;
+
+      is_asm = asm_noperands (PATTERN (x)) >= 0;
+      if (is_asm)
+        return false;
+
+      for (i = 0; i < n_ops; i ++)
+        {
+          r = recog_data.operand [i];
+          if (REG_P (r) && REGNO (r) == dregno)
+            {
+              enum reg_class cl = recog_op_alt[i][alt].cl;
+              if (GET_MODE (r) != GET_MODE (src)
+                  || !in_hard_reg_set_p (reg_class_contents[cl],
+                                         GET_MODE (r), sregno)
+                  || recog_op_alt[i][alt].earlyclobber)
+                {
+                  if (changed)
+                    cancel_changes (0);
+                  return false;
+                }
+              n = gen_rtx_raw_REG (GET_MODE (r), sregno);
+              if (!validate_unshare_change (x, recog_data.operand_loc[i],
+                                            n, true))
+                {
+                  cancel_changes (0);
+                  return false;
+                }
+
+              ORIGINAL_REGNO (n) = ORIGINAL_REGNO (r);
+              REG_ATTRS (n) = REG_ATTRS (r);
+              REG_POINTER (n) = REG_POINTER (r);
+              changed = true;
+            }
+        }
+
+
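To make the pseudo code in the patch description more concrete, here is a small C function with the same shape: an incoming argument is copied, tested for an early exit, and then redefined inside a single loop. It is a hypothetical illustration only, not the shrink-wrap-loop.c test case from the patch; `struct node` and `sum_list` are made-up names.

```c
#include <stddef.h>

struct node
{
  struct node *next;
  int value;
};

/* HEAD plays the role of p2 and P is the induction variable, which has
   two definitions: the copy from HEAD and the update inside the loop.  */
static int
sum_list (struct node *head)
{
  struct node *p = head;   /* p = p2 */
  int sum = 0;

  if (!p)                  /* if (!p) goto return */
    return 0;

  do                       /* L1: */
    {
      sum += p->value;
      p = p->next;         /* p = ... */
    }
  while (p);

  return sum;
}
```

If the early-exit test can use the incoming argument directly and the copy is sunk onto the edge into the loop, the path taken for a null argument no longer needs anything that would require a prologue, which is the situation the two patches aim to enable.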
[PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
Hi,

The patch splits the live_edge for move_insn_for_shrink_wrap to sink
the copy out of the entry block.

Bootstrapped with no make check regressions on X86-64 and ARM.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-05-08  Zhenqiang Chen  zhenqiang.c...@linaro.org

        * function.c (next_block_for_reg): Allow live_edge->dest to have
        two predecessors.
        (move_insn_for_shrink_wrap): Split live_edge.
        (prepare_shrink_wrap): One more parameter for
        move_insn_for_shrink_wrap.

diff --git a/gcc/function.c b/gcc/function.c
index 764ac82..0be58e2 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5381,7 +5381,7 @@ requires_stack_frame_p (rtx insn, HARD_REG_SET prologue_used,
    and if BB is its only predecessor.  Return that block if so,
    otherwise return null.  */
 
-static basic_block
+static edge
 next_block_for_reg (basic_block bb, int regno, int end_regno)
 {
   edge e, live_edge;
@@ -5415,10 +5415,12 @@ next_block_for_reg (basic_block bb, int regno, int end_regno)
   if (live_edge->flags & EDGE_ABNORMAL)
     return NULL;
 
-  if (EDGE_COUNT (live_edge->dest->preds) > 1)
+  /* When live_edge->dest->preds == 2, we can create a new block on
+     the edge to make it meet the requirement.  */
+  if (EDGE_COUNT (live_edge->dest->preds) > 2)
     return NULL;
 
-  return live_edge->dest;
+  return live_edge;
 }
 
 /* Check whether INSN is the last insn in BB or
@@ -5545,20 +5547,25 @@ try_copy_prop (basic_block bb, rtx insn, rtx src, rtx dest,
   return ret;
 }
 
- /* Try to move INSN from BB to a successor.  Return true on success.
-    USES and DEFS are the set of registers that are used and defined
-    after INSN in BB.  */
+/* Try to move INSN from BB to a successor.  Return true on success.
+   LAST_USES is the set of registers that are used by the COMPARE or JUMP
+   instructions in the block.  USES is the set of registers that are used
+   by others after INSN except COMARE and JUMP.  DEFS are the set of registers
+   that are used and defined others after INSN.  SPLIT_P indicates whether
+   a live edge from BB is splitted or not.  */
 
 static bool
 move_insn_for_shrink_wrap (basic_block bb, rtx insn,
                            const HARD_REG_SET uses,
                            const HARD_REG_SET defs,
-                           HARD_REG_SET *last_uses)
+                           HARD_REG_SET *last_uses,
+                           bool *split_p)
 {
   rtx set, src, dest;
   bitmap live_out, live_in, bb_uses, bb_defs;
   unsigned int i, dregno, end_dregno, sregno, end_sregno;
   basic_block next_block;
+  edge live_edge;
 
   /* Look for a simple register copy.  */
   set = single_set (insn);
@@ -5582,17 +5589,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
       || overlaps_hard_reg_set_p (defs, GET_MODE (dest), dregno))
     return false;
 
-  /* See whether there is a successor block to which we could move INSN.  */
-  next_block = next_block_for_reg (bb, dregno, end_dregno);
-  if (!next_block)
+  live_edge = next_block_for_reg (bb, dregno, end_dregno);
+  if (!live_edge)
     return false;
 
+  next_block = live_edge->dest;
+
   /* If the destination register is referred in later insn,
      try to forward it.  */
   if (overlaps_hard_reg_set_p (*last_uses, GET_MODE (dest), dregno)
       && !try_copy_prop (bb, insn, src, dest, last_uses))
     return false;
 
+  /* Create a new basic block on the edge.  */
+  if (EDGE_COUNT (next_block->preds) == 2)
+    {
+      next_block = split_edge (live_edge);
+
+      bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb));
+      df_set_bb_dirty (next_block);
+
+      /* We should not split more than once for a function.  */
+      gcc_assert (!(*split_p));
+      *split_p = true;
+    }
+
   /* At this point we are committed to moving INSN, but let's try to
      move it as far as we can.  */
   do
@@ -5610,7 +5631,10 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
         {
           for (i = dregno; i < end_dregno; i++)
             {
-              if (REGNO_REG_SET_P (bb_uses, i) || REGNO_REG_SET_P (bb_defs, i)
+
+              if (*split_p
+                  || REGNO_REG_SET_P (bb_uses, i)
+                  || REGNO_REG_SET_P (bb_defs, i)
                   || REGNO_REG_SET_P (&DF_LIVE_BB_INFO (bb)->gen, i))
                 next_block = NULL;
               CLEAR_REGNO_REG_SET (live_out, i);
@@ -5621,7 +5645,8 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
              Either way, SRC is now live on entry.  */
           for (i = sregno; i < end_sregno; i++)
             {
-              if (REGNO_REG_SET_P (bb_defs, i)
+              if (*split_p
+                  || REGNO_REG_SET_P (bb_defs, i)
                   || REGNO_REG_SET_P (&DF_LIVE_BB_INFO (bb)->gen, i))
                 next_block = NULL;
               SET_REGNO_REG_SET (live_out, i);
@@ -5650,21 +5675,31 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn,
       /* If we don't need to add the move to BB, look for a single
          successor block.  */
       if
[ping] [PATCH] config-list.mk: `show' target to show all considered targets
On Mon, 2014-05-05 14:05:40 +0200, Jan-Benedict Glaw jbg...@lug-owl.de wrote:
> I'd like to install this patch, which would help me to run the build
> robot (http://toolchain.lug-owl.de/buildbot/):
>
> 2014-05-05  Jan-Benedict Glaw  jbg...@lug-owl.de
>
> contrib/
>         * config-list.mk (show): New target.

Ping?

http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00203.html

Best regards, JBG
Re: [PATCH, Pointer Bounds Checker 1/x] Pointer bounds type and mode
2014-05-06 19:09 GMT+04:00 Jeff Law l...@redhat.com:
On 05/06/14 07:31, Richard Biener wrote:
On Tue, May 6, 2014 at 2:10 PM, Ilya Enkovich enkovich@gmail.com wrote:
2014-04-16 15:00 GMT+04:00 Ilya Enkovich enkovich@gmail.com:

Hi,

This patch restarts the series for introducing Pointer Bounds Checker
instrumentation and supporting Intel Memory Protection Extension (MPX)
technology.  Detailed description is on GCC Wiki page:
http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler.

The first patch introduces pointer bounds type and mode.  It was
approved earlier for 4.9 and had no significant changes since then.
I'll assume patch is OK if no objections arise.

Patch was bootstrapped and tested for linux-x86_64.

Thanks,
Ilya
--
gcc/

2014-04-16  Ilya Enkovich  ilya.enkov...@intel.com

        * mode-classes.def (MODE_POINTER_BOUNDS): New.
        * tree.def (POINTER_BOUNDS_TYPE): New.
        * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
        (POINTER_BOUNDS_MODE): New.
        (make_pointer_bounds_mode): New.
        * machmode.h (POINTER_BOUNDS_MODE_P): New.
        * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
        (layout_type): Support POINTER_BOUNDS_TYPE.
        * tree-pretty-print.c (dump_generic_node): Support
        POINTER_BOUNDS_TYPE.
        * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
        (type_contains_placeholder_1): Likewise.
        * tree.h (POINTER_BOUNDS_TYPE_P): New.
        * varasm.c (output_constant): Support POINTER_BOUNDS_TYPE.
        * doc/rtl.texi (MODE_POINTER_BOUNDS): New.

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 20b7187..3a1014d 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,12 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}.
 @xref{Jump Patterns},
 also see @ref{Condition Code}.
 
+@findex MODE_POINTER_BOUNDS
+@item MODE_POINTER_BOUNDS
+Pointer bounds modes.  Used to represent values of pointer bounds type.
+Operations in these modes may be executed as NOPs depending on hardware
+features and environment setup.
+
 @findex MODE_RANDOM
 @item MODE_RANDOM
 This is a catchall mode class for modes which don't fit into the above
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index 8cc3cde..9d0b413 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
       break;
 
     case MODE_INT:
+    case MODE_POINTER_BOUNDS:
     case MODE_FLOAT:
     case MODE_DECIMAL_FLOAT:
     case MODE_FRACT:
@@ -534,6 +535,19 @@ make_special_mode (enum mode_class cl, const char *name,
   new_mode (cl, name, file, line);
 }
 
+#define POINTER_BOUNDS_MODE(N, Y) \
+  make_pointer_bounds_mode (#N, Y, __FILE__, __LINE__)
+
+static void ATTRIBUTE_UNUSED
+make_pointer_bounds_mode (const char *name,
+                          unsigned int bytesize,
+                          const char *file, unsigned int line)
+{
+  struct mode_data *m = new_mode (MODE_POINTER_BOUNDS, name, file, line);
+  m->bytesize = bytesize;
+}
+
+
 #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
 #define FRACTIONAL_INT_MODE(N, B, Y) \
   make_int_mode (#N, B, Y, __FILE__, __LINE__)
diff --git a/gcc/machmode.h b/gcc/machmode.h
index bc5d901..cbe5042 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || CLASS == MODE_ACCUM                      \
    || CLASS == MODE_UACCUM)
 
+#define POINTER_BOUNDS_MODE_P(MODE)      \
+  (GET_MODE_CLASS (MODE) == MODE_POINTER_BOUNDS)
+
 /* Get the size in bytes and bits of an object of mode MODE.  */
 
 extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index 9c6a8bb..b645484 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_MODE_CLASS (MODE_CC),             /* condition code in a register */ \
   DEF_MODE_CLASS (MODE_INT),            /* integer */                      \
   DEF_MODE_CLASS (MODE_PARTIAL_INT),    /* integer with padding bits */    \
+  DEF_MODE_CLASS (MODE_POINTER_BOUNDS), /* bounds */                       \
   DEF_MODE_CLASS (MODE_FRACT),          /* signed fractional number */     \
   DEF_MODE_CLASS (MODE_UFRACT),         /* unsigned fractional number */   \
   DEF_MODE_CLASS (MODE_ACCUM),          /* signed accumulator */           \
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 084d195..af0ab88 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -386,6 +386,7 @@ int_mode_for_mode (enum machine_mode mode)
     case MODE_VECTOR_ACCUM:
     case MODE_VECTOR_UFRACT:
     case MODE_VECTOR_UACCUM:
+    case MODE_POINTER_BOUNDS:
       mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
       break;
 
@@ -2124,6 +2125,11 @@
Re: [PATCH, 2/2] shrink wrap a function with a single loop: split live_edge
On Thu, May 8, 2014 at 10:07 AM, Zhenqiang Chen wrote:
> The patch splits the live_edge for move_insn_for_shrink_wrap to sink
> the copy out of the entry block.

Maybe it's also time to take the shrink-wrapping code out of function.c
and put it in its own file?

Ciao!
Steven
Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
On Tue, May 6, 2014 at 6:44 PM, Richard Biener richard.guent...@gmail.com wrote:
> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng amker.ch...@gmail.com wrote:
>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener richard.guent...@gmail.com wrote:
>>
>> Hi,
>> I split the patch into two and updated the test case.
>> The patches pass bootstrap/tests on x86/x86_64, and also pass tests on
>> arm cortex-m3.
>> Is it OK?
>>
>> Thanks,
>> bin
>>
>> PATCH 1:
>>
>> 2014-05-06  Bin Cheng  bin.ch...@arm.com
>>
>>         * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>         core part of address expressions.
>
> No gcc/ in the changelog
>
> Simplify that by using aff_combination_add_cst:
>
> +      if (TREE_CODE (core) == MEM_REF)
> +        {
> +          aff_combination_add_cst (comb, mem_ref_offset (core));
> +          core = TREE_OPERAND (core, 0);
>
> patch 1 is ok with that change.

Installed with the change below because of the wide-int merge:
-      core = build_fold_addr_expr (core);
+      if (TREE_CODE (core) == MEM_REF)
+        {
+          aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 1)));
+          core = TREE_OPERAND (core, 0);

>> PATCH 2:
>>
>> 2014-05-06  Bin Cheng  bin.ch...@arm.com
>>
>>         * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>         (alloc_iv): Lower base expressions containing ADDR_EXPR.
>
> So this lowers addresses(?) which are based on not-decl,
> like &a[0] + 4 but not &a + 4.  I question this odd choice.

&a+4 is already in its lowered form; what we want to handle is
&EXPR + 4, in which EXPR is MEM_REF/ARRAY_REF, etc.

> ISTR when originally introducing address lowering (in rev. 204497) I was
> concerned about the cost?

Yes, you did.  I still think the number of ivs is relatively small for
each loop, so the change won't cause a compile-time issue given the
existing tree-affine transformation in ivopt.  I would like to collect
more accurate time information for ivopt in a gcc bootstrap.  Should I
use -ftime-report and then add up all time slices in the
TV_TREE_LOOP_IVOPTS category?  Is there a better approach?

Thanks.

> Now I wonder why we bother to convert the lowered form back
> to trees to store it in ->base and not simply keep (and always
> compute) the affine expansion only.  Thus, change struct iv
> to have aff_tree base instead of tree base?
>
> Can you see what it takes to do such a change?  At the point
> we replace uses we go into affine representation again anyway.

Good idea, I may have a look.

Thanks,
bin

--
Best Regards.
[gomp4] Mark __OPENMP_TARGET__ as addressable (was: Offloading patches (2/3): Add tables generation)
Hi!

On Tue, 17 Dec 2013 15:39:57 +0400, Michael V. Zolotukhin michael.v.zolotuk...@gmail.com wrote:
> in this patch we start to pass '__OPENMP_TARGET__' symbol to GOMP_target calls.

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -8371,19 +8372,22 @@ expand_omp_target (struct omp_region *region)
>      }
>    gimple g;
> -  /* FIXME: This will be address of
> -     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
> -     symbol, as soon as the linker plugin is able to create it for us.  */
> -  tree openmp_target = build_zero_cst (ptr_type_node);
> +  tree openmp_target
> +    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +                  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
> +  TREE_PUBLIC (openmp_target) = 1;
> +  DECL_EXTERNAL (openmp_target) = 1;
>    if (kind == GF_OMP_TARGET_KIND_REGION)
>      {
>        tree fnaddr = build_fold_addr_expr (child_fn);
> -      g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
> -                             device, fnaddr, openmp_target, t1, t2, t3, t4);
> +      g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
> +                             fnaddr, build_fold_addr_expr (openmp_target),
> +                             t1, t2, t3, t4);

In the trunk into gomp-4_0-branch merge that I'm currently preparing,
all offloading usage results in an ICE like:

    spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/target-1.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -O2 -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./target-1.exe
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c: In function 'fn2':
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:46:3: error: address taken, but ADDRESSABLE bit not set
    __OPENMP_TARGET__
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:35:11: note: in statement
    __builtin_GOMP_target_data (-1, __OPENMP_TARGET__, 1, .omp_data_arr.12, .omp_data_sizes.13, .omp_data_kinds.14);
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:46:3: error: address taken, but ADDRESSABLE bit not set
    __OPENMP_TARGET__
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:37:13: note: in statement
    __builtin_GOMP_target (-1, fn2._omp_fn.0, __OPENMP_TARGET__, 6, .omp_data_arr.6, .omp_data_sizes.7, .omp_data_kinds.8);
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:46:3: error: address taken, but ADDRESSABLE bit not set
    __OPENMP_TARGET__
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:44:13: note: in statement
    __builtin_GOMP_target_update (-1, __OPENMP_TARGET__, 2, .omp_data_arr.9, .omp_data_sizes.10, .omp_data_kinds.11);
    [...]/source/libgomp/testsuite/libgomp.c/target-1.c:46:3: internal compiler error: verify_gimple failed
    0xa149ec verify_gimple_in_cfg(function*, bool)
            ../../source/gcc/tree-cfg.c:4954
    0x93a407 execute_function_todo
            ../../source/gcc/passes.c:1777
    0x93ad63 execute_todo
            ../../source/gcc/passes.c:1834

In r210207, I committed the following patch; should we also be setting
any additional flags, such as DECL_ARTIFICIAL?

commit aaf964a67612f5aa50b405d2aa7998ed3b5d5ac6
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu May 8 09:41:33 2014 +

    Mark __OPENMP_TARGET__ as addressable.

        gcc/
        * omp-low.c (get_offload_symbol_decl): Mark __OPENMP_TARGET__ as
        addressable.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@210207 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1bd1f51..b1e73c0 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-05-08  Thomas Schwinge  tho...@codesourcery.com
+
+        * omp-low.c (get_offload_symbol_decl): Mark __OPENMP_TARGET__ as
+        addressable.
+
 2014-04-04  Bernd Schmidt  ber...@codesourcery.com
 
         * lto-wrapper.c (replace_special_characters): Remove functions and
diff --git gcc/omp-low.c gcc/omp-low.c
index de00516..5e90ce3 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -233,6 +233,7 @@ get_offload_symbol_decl (void)
       tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
                               get_identifier ("__OPENMP_TARGET__"),
                               ptr_type_node);
+      TREE_ADDRESSABLE (decl) = 1;
       TREE_PUBLIC (decl) = 1;
       DECL_EXTERNAL (decl) = 1;
       DECL_WEAK (decl) = 1;

Regards,
 Thomas
Re: [gomp4] Add tables generation
On 05/06/2014 05:32 PM, Ilya Verbin wrote:
> On 05 Apr 17:22, Bernd Schmidt wrote:
>> Things seemed to work over here, but now I'm not certain whether the
>> __start_/__stop_ functionality is GNU ld specific? Maybe we should just
>> go back to the previous version of this patch which didn't try to use
>> this.
>>
>> Bernd
>
> This approach does not work with shared libraries. The automatically
> inserted symbols have GLOBAL binding, therefore the __start_/__stop_ from
> the executable overwrite the respective symbols in DSO.

Ok, I guess we should just go back to what we had previously. Here's what I intend to commit if there are no objections.

Bernd

Index: gcc/lto-wrapper.c
===
--- gcc/lto-wrapper.c (revision 210170)
+++ gcc/lto-wrapper.c (working copy)
@@ -66,7 +66,7 @@ static unsigned int nr;
 static char **input_names;
 static char **output_names;
 static char **offload_names;
-static const char *ompend;
+static const char *ompbegin, *ompend;
 static char *makefile;

 const char tool_name[] = "lto-wrapper";
@@ -554,30 +554,40 @@ copy_file (const char *dest, const char
     }
 }

-/* Find the crtompend.o file in LIBRARY_PATH, make a copy and store
-   the name of the copy in ompend.  */
+/* Find the omp_begin.o and omp_end.o files in LIBRARY_PATH, make copies
+   and store the names of the copies in ompbegin and ompend.  */

 static void
-find_ompend (void)
+find_ompbeginend (void)
 {
   char **paths;
   const char *library_path = getenv ("LIBRARY_PATH");
   if (library_path == NULL)
     return;
-  int n_paths = parse_env_var (library_path, &paths, "/crtompend.o");
+  int n_paths = parse_env_var (library_path, &paths, "/crtompbegin.o");

-  for (int i = 0; i < n_paths; i++)
+  int i;
+  for (i = 0; i < n_paths; i++)
     if (access_check (paths[i], R_OK) == 0)
       {
+        size_t len = strlen (paths[i]);
+        char *tmp = xstrdup (paths[i]);
+        strcpy (paths[i] + len - 7, "end.o");
+        if (access_check (paths[i], R_OK) != 0)
+          fatal ("installation error, can't find crtompend.o");
         /* The linker will delete the filenames we give it, so make
            copies.  */
-        const char *omptmp = make_temp_file (".o");
-        copy_file (omptmp, paths[i]);
-        ompend = omptmp;
+        const char *omptmp1 = make_temp_file (".o");
+        const char *omptmp2 = make_temp_file (".o");
+        copy_file (omptmp1, tmp);
+        ompbegin = omptmp1;
+        copy_file (omptmp2, paths[i]);
+        ompend = omptmp2;
+        free (tmp);
         break;
       }
-  if (ompend == 0)
-    fatal ("installation error, can't find crtompend.o");
+  if (i == n_paths)
+    fatal ("installation error, can't find crtompbegin.o");

   free_array_of_ptrs ((void**) paths, n_paths);
 }
@@ -1073,7 +1083,7 @@ cont:
           compile_images_for_openmp_targets (argc, argv);
           if (offload_names)
             {
-              find_ompend ();
+              find_ompbeginend ();
               for (i = 0; offload_names[i]; i++)
                 {
                   fputs (offload_names[i], stdout);
@@ -1082,6 +1092,11 @@ cont:
               free_array_of_ptrs ((void **)offload_names, i);
             }
         }
+      if (ompbegin)
+        {
+          fputs (ompbegin, stdout);
+          putc ('\n', stdout);
+        }

       for (i = 0; i < nr; ++i)
         {
Index: libgcc/Makefile.in
===
--- libgcc/Makefile.in (revision 210170)
+++ libgcc/Makefile.in (working copy)
@@ -975,6 +975,9 @@ crtbegin$(objext): $(srcdir)/crtstuff.c
 crtend$(objext): $(srcdir)/crtstuff.c
        $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END

+crtompbegin$(objext): $(srcdir)/ompstuff.c
+       $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN
+
 crtompend$(objext): $(srcdir)/ompstuff.c
        $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END

Index: libgcc/configure
===
--- libgcc/configure (revision 210170)
+++ libgcc/configure (working copy)
@@ -4397,7 +4397,7 @@ fi


 if test x$offload_targets != x; then
-  extra_parts="${extra_parts} crtompend.o"
+  extra_parts="${extra_parts} crtompbegin.o crtompend.o"
 fi

 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
Index: libgcc/configure.ac
===
--- libgcc/configure.ac (revision 210170)
+++ libgcc/configure.ac (working copy)
@@ -336,7 +336,7 @@ AC_ARG_ENABLE(offload-targets,
 ], [enable_accelerator=no])
 AC_SUBST(enable_accelerator)
 if test x$offload_targets != x; then
-  extra_parts="${extra_parts} crtompend.o"
+  extra_parts="${extra_parts} crtompbegin.o crtompend.o"
 fi

 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
Index: libgcc/ompstuff.c
===
--- libgcc/ompstuff.c (revision 210170)
+++ libgcc/ompstuff.c (working copy)
@@ -39,14 +39,35 @@ see the files COPYING3 and COPYING.RUNTI
 #include "tm.h"
 #include "libgcc_tm.h"

+#ifdef CRT_BEGIN
+
 #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-extern void __start___gnu_offload_funcs;
-extern void __stop___gnu_offload_funcs;
[PATCH] Remove old pointer conversion special-case
This removes special-casing of pointer conversion handling in gimple_fold_stmt_to_constant_1 - this was from times where most pointer conversions were not useless (only conversions to function/method pointers from other kinds are considered not useless - which I'll change with a followup).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-05-08  Richard Biener  <rguent...@suse.de>

        * gimple-fold.c (gimple_fold_stmt_to_constant_1): Remove
        pointer propagation special-case.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c (revision 210148)
+++ gcc/gimple-fold.c (working copy)
@@ -2638,20 +2638,6 @@ gimple_fold_stmt_to_constant_1 (gimple s
               tree lhs = gimple_assign_lhs (stmt);
               tree op0 = (*valueize) (gimple_assign_rhs1 (stmt));

-              /* Conversions are useless for CCP purposes if they are
-                 value-preserving.  Thus the restrictions that
-                 useless_type_conversion_p places for restrict qualification
-                 of pointer types should not apply here.
-                 Substitution later will only substitute to allowed places.  */
-              if (CONVERT_EXPR_CODE_P (subcode)
-                  && POINTER_TYPE_P (TREE_TYPE (lhs))
-                  && POINTER_TYPE_P (TREE_TYPE (op0))
-                  && TYPE_ADDR_SPACE (TREE_TYPE (lhs))
-                     == TYPE_ADDR_SPACE (TREE_TYPE (op0))
-                  && TYPE_MODE (TREE_TYPE (lhs))
-                     == TYPE_MODE (TREE_TYPE (op0)))
-                return op0;
-
               return
                 fold_unary_ignore_overflow_loc (loc, subcode,
                                                 gimple_expr_type (stmt), op0);
[PATCH] Fix oversight in call gimplification
When gimplifying a call we now remember the original function type used and record it in gimple_call_fntype. But we fail to use exactly that type for looking at TYPE_ARG_TYPES. The following fixes that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-05-08  Richard Biener  <rguent...@suse.de>

        * gimplify.c (gimplify_call_expr): Use saved fnptrtype for
        looking at TYPE_ARG_TYPES.

Index: gcc/gimplify.c
===
--- gcc/gimplify.c (revision 210207)
+++ gcc/gimplify.c (working copy)
@@ -2329,8 +2329,8 @@ gimplify_call_expr (tree *expr_p, gimple
   parms = NULL_TREE;
   if (fndecl)
     parms = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
-  else if (POINTER_TYPE_P (TREE_TYPE (CALL_EXPR_FN (*expr_p))))
-    parms = TYPE_ARG_TYPES (TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (*expr_p))));
+  else
+    parms = TYPE_ARG_TYPES (TREE_TYPE (fnptrtype));

   if (fndecl && DECL_ARGUMENTS (fndecl))
     p = DECL_ARGUMENTS (fndecl);
[PATCH] Improve and simplify VN expression combining
The following gets rid of SCCVN's valueize_expr, which was used on GENERIC expressions built via vn_get_expr_for, which in turn is used for stmt combining via fold (yeah, I know ...). The odd part was that it first folded and built the expression and then valueized it (without folding again), resulting in uncanonicalized (if not unsimplified - but that's unlikely) trees which may lead to missed foldings when combining two of those later.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

(yeah, the gimple folding via pattern description gets rid of all this)

Richard.

2014-05-08  Richard Biener  <rguent...@suse.de>

        * tree-ssa-sccvn.c (vn_get_expr_for): Valueize operands before
        folding the expression.
        (valueize_expr): Remove.
        (visit_reference_op_load): Do not valueize the result of
        vn_get_expr_for.
        (simplify_binary_expression): Likewise.
        (simplify_unary_expression): Likewise.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 210202)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -414,8 +416,8 @@ vn_get_expr_for (tree name)
   if (!is_gimple_assign (def_stmt))
     return vn->valnum;

-  /* FIXME tuples.  This is incomplete and likely will miss some
-     simplifications.  */
+  /* Note that we can valueize here because we clear the cached
+     simplified expressions after each optimistic iteration.  */
   code = gimple_assign_rhs_code (def_stmt);
   switch (TREE_CODE_CLASS (code))
     {
@@ -427,20 +429,21 @@ vn_get_expr_for (tree name)
                                       0)) == SSA_NAME)
         expr = fold_build1 (code,
                             gimple_expr_type (def_stmt),
-                            TREE_OPERAND (gimple_assign_rhs1 (def_stmt), 0));
+                            vn_valueize (TREE_OPERAND
+                                           (gimple_assign_rhs1 (def_stmt), 0)));
       break;

     case tcc_unary:
       expr = fold_build1 (code,
                           gimple_expr_type (def_stmt),
-                          gimple_assign_rhs1 (def_stmt));
+                          vn_valueize (gimple_assign_rhs1 (def_stmt)));
       break;

     case tcc_binary:
       expr = fold_build2 (code,
                           gimple_expr_type (def_stmt),
-                          gimple_assign_rhs1 (def_stmt),
-                          gimple_assign_rhs2 (def_stmt));
+                          vn_valueize (gimple_assign_rhs1 (def_stmt)),
+                          vn_valueize (gimple_assign_rhs2 (def_stmt)));
       break;

     case tcc_exceptional:
@@ -2759,7 +2762,6 @@ defs_to_varying (gimple stmt)
 }

 static bool expr_has_constants (tree expr);
-static tree valueize_expr (tree expr);

 /* Visit a copy between LHS and RHS, return true if the value number
    changed.  */
@@ -2900,7 +2902,7 @@ visit_reference_op_load (tree lhs, tree
            || TREE_CODE (val) == VIEW_CONVERT_EXPR)
           && TREE_CODE (TREE_OPERAND (val, 0)) == SSA_NAME)
         {
-          tree tem = valueize_expr (vn_get_expr_for (TREE_OPERAND (val, 0)));
+          tree tem = vn_get_expr_for (TREE_OPERAND (val, 0));
           if ((CONVERT_EXPR_P (tem)
                || TREE_CODE (tem) == VIEW_CONVERT_EXPR)
               && (tem = fold_unary_ignore_overflow (TREE_CODE (val),
@@ -3210,26 +3214,6 @@ stmt_has_constants (gimple stmt)
   return false;
 }

-/* Replace SSA_NAMES in expr with their value numbers, and return the
-   result.
-   This is performed in place. */
-
-static tree
-valueize_expr (tree expr)
-{
-  switch (TREE_CODE_CLASS (TREE_CODE (expr)))
-    {
-    case tcc_binary:
-      TREE_OPERAND (expr, 1) = vn_valueize (TREE_OPERAND (expr, 1));
-      /* Fallthru.  */
-    case tcc_unary:
-      TREE_OPERAND (expr, 0) = vn_valueize (TREE_OPERAND (expr, 0));
-      break;
-    default:;
-    }
-  return expr;
-}
-
 /* Simplify the binary expression RHS, and return the result if
    simplified. */

@@ -3250,7 +3234,7 @@ simplify_binary_expression (gimple stmt)
       if (VN_INFO (op0)->has_constants
           || TREE_CODE_CLASS (code) == tcc_comparison
           || code == COMPLEX_EXPR)
-        op0 = valueize_expr (vn_get_expr_for (op0));
+        op0 = vn_get_expr_for (op0);
       else
         op0 = vn_valueize (op0);
     }
@@ -3259,7 +3243,7 @@ simplify_binary_expression (gimple stmt)
     {
       if (VN_INFO (op1)->has_constants
           || code == COMPLEX_EXPR)
-        op1 = valueize_expr (vn_get_expr_for (op1));
+        op1 = vn_get_expr_for (op1);
       else
         op1 = vn_valueize (op1);
     }
@@ -3321,7 +3305,7 @@ simplify_unary_expression (gimple stmt)
   orig_op0 = op0;

   if (VN_INFO (op0)->has_constants)
-    op0 = valueize_expr (vn_get_expr_for (op0));
+    op0 = vn_get_expr_for (op0);
   else if (CONVERT_EXPR_CODE_P (code)
            || code == REALPART_EXPR
            || code == IMAGPART_EXPR
@@ -3330,7 +3314,7 @@ simplify_unary_expression
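Why the order "valueize, then fold" matters can be shown with a toy model outside GCC. The sketch below is only an illustration under stated assumptions: names are small integers, value_of[] is a hypothetical name-to-value-number table, and the only "folding" rule is x - x == 0. None of these helpers exist in tree-ssa-sccvn.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression "lhs - rhs" over integer-named SSA names
   (hypothetical stand-in for a GENERIC MINUS_EXPR).  */
struct minus_expr
{
  int lhs;
  int rhs;
};

/* Toy folder: the only simplification it knows is x - x == 0.
   Returns true if the expression folds to the constant zero.  */
static bool
fold_minus_is_zero (struct minus_expr e)
{
  return e.lhs == e.rhs;
}

/* Replace both operands by their value numbers, looked up in
   VALUE_OF (indexed by name).  */
static struct minus_expr
valueize (struct minus_expr e, const int *value_of)
{
  assert (value_of != NULL);
  e.lhs = value_of[e.lhs];
  e.rhs = value_of[e.rhs];
  return e;
}

/* Old order: fold first, valueize afterwards without folding again.
   The equality of the operands is discovered too late to help.  */
static bool
old_order_simplifies (struct minus_expr e, const int *value_of)
{
  bool folded = fold_minus_is_zero (e);
  e = valueize (e, value_of);
  (void) e;
  return folded;
}

/* New order: valueize the operands, then fold.  */
static bool
new_order_simplifies (struct minus_expr e, const int *value_of)
{
  return fold_minus_is_zero (valueize (e, value_of));
}
```

With names 0 and 1 sharing a value number, only the second ordering sees the expression as 0; that is the kind of missed folding the message describes, reduced to a single rule.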
[C++ Patch] PR 13981 (resend)
Hi,

some time ago I worked a bit on this very old diagnostic enhancement issue:

http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01588.html

and yesterday we got a duplicate. Adding an inform to convert_for_assignment still makes sense to me, thus I resurrected the patch, tightened the check a bit and retested it. What do you think? Should the inform maybe cover more cases, and if so, which?

Thanks,
Paolo.

/cp
2014-05-08  Paolo Carlini  <paolo.carl...@oracle.com>

        PR c++/13981
        * typeck.c (convert_for_assignment): Provide an inform for pointers
        to incomplete class types.

/testsuite
2014-05-08  Paolo Carlini  <paolo.carl...@oracle.com>

        PR c++/13981
        * g++.dg/diagnostic/pr13981.C: New.

Index: cp/typeck.c
===
--- cp/typeck.c (revision 210205)
+++ cp/typeck.c (working copy)
@@ -8094,6 +8094,13 @@ convert_for_assignment (tree type, tree rhs,
                 default:
                   gcc_unreachable();
               }
+          if (TYPE_PTR_P (rhstype)
+              && TYPE_PTR_P (type)
+              && CLASS_TYPE_P (TREE_TYPE (rhstype))
+              && CLASS_TYPE_P (TREE_TYPE (type))
+              && !COMPLETE_TYPE_P (TREE_TYPE (rhstype)))
+            inform (input_location, "class type %qT is incomplete",
+                    TREE_TYPE (rhstype));
         }
       return error_mark_node;
     }
Index: testsuite/g++.dg/diagnostic/pr13981.C
===
--- testsuite/g++.dg/diagnostic/pr13981.C (revision 0)
+++ testsuite/g++.dg/diagnostic/pr13981.C (working copy)
@@ -0,0 +1,12 @@
+// PR c++/13981
+
+struct A {};
+struct B;
+
+void func( A *a );
+
+int main()
+{
+  B *b = 0;
+  func(b); // { dg-error "cannot convert" }
+} // { dg-message "is incomplete" "is incomplete" { target *-*-* } 11 }
Re: RFC: Faster for_each_rtx-like iterators
On Thu, May 08, 2014 at 07:25:50AM +0100, Richard Sandiford wrote:
> Trevor Saunders <tsaund...@mozilla.com> writes:
> > On Wed, May 07, 2014 at 09:52:49PM +0100, Richard Sandiford wrote:
> > > I noticed for_each_rtx showing up in profiles and thought I'd have a go
> > > at using worklist-based iterators instead. So far I have three:
> > >
> > >   FOR_EACH_SUBRTX: iterates over const_rtx subrtxes of a const_rtx
> > >   FOR_EACH_SUBRTX_VAR: iterates over rtx subrtxes of an rtx
> > >   FOR_EACH_SUBRTX_PTR: iterates over subrtx pointers of an rtx *
> > >
> > > with FOR_EACH_SUBRTX_PTR being the direct for_each_rtx replacement.
> > >
> > > I made FOR_EACH_SUBRTX the default (unsuffixed) version because
> > > most walks really don't modify the structure. I think we should
> > > encourage const_rtxes to be used wherever possible. E.g. it might
> > > make it easier to have non-GC storage for temporary rtxes in future.
> > >
> > > I've locally replaced all for_each_rtx calls in the generic code with
> > > these iterators and they make things reproducibly faster. The speed-up
> > > on full --enable-checking=release ./cc1 and ./cc1plus times is only
> > > about 1%, but maybe that's enough to justify the churn.
> >
> > seems pretty nice, and it seems like it'll make code a little more
> > readable too :)
> >
> > > Implementation-wise, the main observation is that most subrtxes are part
> > > of a single contiguous sequence of "e" fields. E.g. when compiling an
> > > oldish combine.ii on x86_64-linux-gnu with -O2, we iterate over the
> > > subrtxes of 7,636,542 rtxes. Of those:
> > >
> > > (A) 4,459,135 (58.4%) are leaf rtxes with no "e" or "E" fields,
> > > (B) 3,133,875 (41.0%) are rtxes with a single block of "e" fields and
> > >     no "E" fields, and
> > > (C) 43,532 (00.6%) are more complicated.
> > >
> > > (A) is really a special case of (B) in which the block has zero length.
> > > Those are the only two cases that really need to be handled inline.
> > > The implementation does this by having a mapping from an rtx code to the
> > > bounds of its "e" sequence, in the form of a start index and count.
> > >
> > > Out of (C), the vast majority (43,509) are PARALLELs. However, as you'd
> > > probably expect, bloating the inline code with that case made things
> > > slower rather than faster.
> > >
> > > The vast majority (in fact all in the combine.ii run above) of iterations
> > > can be done with a 16-element stack worklist. We obviously still need a
> > > heap fallback for the pathological cases though.
> > >
> > > I spent a bit of time trying different iterator implementations and
> > > seeing which produced the best code. Specific results from that were:
> > >
> > > - The storage used for the worklist is separate from the iterator,
> > >   in order to avoid capturing iterator fields.
> > >
> > > - Although the natural type of the storage would be auto_vec <..., 16>,
> > >   that produced some overhead compared with a separate stack array and heap
> > >   vector pointer. With the heap vector pointer, the only overhead is an
> > >   assignment in the constructor and an "if (x) release (x)"-style sequence
> > >   in the destructor. I think the extra complication over auto_vec is worth
> > >   it because in this case the heap version is so very rarely needed.
> >
> > hm, where does the overhead come from exactly? it seems like if it's
> > faster to use vec<T, va_heap, vl_embed> *foo; we should fix something
> > about vectors since this isn't the only place it could matter. does it
> > matter if you use vec<T, va_heap, vl_embed> * or vec<T> ? the second
> > is basically just a wrapper around the former I'd expect has no effect.
> > I'm not saying you're doing the wrong thing here, but if we can make
> > generic vectors faster we probably should ;) or is the issue the
> > __builtin_expect()s you can add?
>
> Part of the problem is that by having an array in the vec itself,
> the other fields effectively have their address taken too.
> So m_alloc, m_num and m_using_auto_storage need to be set up
> and maintained on the stack, even though we're almost sure that they
> will never be used.

ok

> > > - The maximum number of fields in (B)-type rtxes is 3. We get better
> > >   code by making that explicit rather than having a general loop.
> > >
> > > - (C) codes map to an "e" count of UCHAR_MAX, so we can use a single
> > >   check to test for that and for cases where the stack worklist is
> > >   too small.
> >
> > can we use uint8_t?
>
> We don't really use that in GCC yet. I don't mind setting a precedent
> though :-)
>
> > > To give an example:
> > >
> > > /* Callback for for_each_rtx, that returns 1 upon encountering a VALUE
> > >    whose UID is greater than the int uid that D points to.  */
> > >
> > > static int
> > > refs_newer_value_cb (rtx *x, void *d)
> > > {
> > >   if (GET_CODE (*x) == VALUE && CSELIB_VAL_PTR (*x)->uid > *(int *)d)
> > >     return 1;
> > >
> > >   return 0;
> > > }
> > >
> > > /* Return TRUE if EXPR refers to a VALUE whose uid is greater than
> > >    that of V.  */
> > >
> > > static bool
> > > refs_newer_value_p (rtx expr, rtx v)
> > > {
> > >   int minuid = CSELIB_VAL_PTR (v)->uid;
> > >
> > >   return for_each_rtx (&expr, refs_newer_value_cb, &minuid);
> > > }
> > >
> > > becomes:
> > >
> > > /* Return TRUE if EXPR refers to a
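The "stack array plus rarely-used heap pointer" worklist discussed in this thread can be sketched outside GCC roughly as follows. This is a minimal illustration, not the iterator from the patch: struct node, struct worklist, LOCAL_SLOTS and any_value_above are hypothetical names, nodes have at most three children to mirror the "(B)-type" remark, and the 16-slot size simply reuses the figure quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an rtx: a value plus up to 3 children.  */
struct node
{
  int value;
  size_t n_children;
  struct node *child[3];
};

#define LOCAL_SLOTS 16

/* Worklist whose first LOCAL_SLOTS entries live in an inline array;
   HEAP stays NULL unless that array overflows.  */
struct worklist
{
  const struct node *local[LOCAL_SLOTS];
  const struct node **heap;
  size_t heap_cap;
  size_t len;
};

static void
worklist_push (struct worklist *w, const struct node *n)
{
  if (w->heap == NULL && w->len < LOCAL_SLOTS)
    {
      /* Common case: no allocation at all.  */
      w->local[w->len++] = n;
      return;
    }
  if (w->heap == NULL)
    {
      /* Rare case: migrate the inline entries to the heap.  */
      w->heap_cap = LOCAL_SLOTS * 2;
      w->heap = malloc (w->heap_cap * sizeof *w->heap);
      if (w->heap == NULL)
        abort ();
      memcpy (w->heap, w->local, w->len * sizeof *w->heap);
    }
  else if (w->len == w->heap_cap)
    {
      w->heap_cap *= 2;
      w->heap = realloc (w->heap, w->heap_cap * sizeof *w->heap);
      if (w->heap == NULL)
        abort ();
    }
  w->heap[w->len++] = n;
}

static const struct node *
worklist_pop (struct worklist *w)
{
  assert (w->len > 0);
  w->len--;
  return w->heap ? w->heap[w->len] : w->local[w->len];
}

/* Return 1 if any node reachable from ROOT has a value greater than
   LIMIT.  Stopping early plays the role of a callback that returns
   nonzero.  */
static int
any_value_above (const struct node *root, int limit)
{
  struct worklist w;
  int found = 0;

  w.heap = NULL;
  w.heap_cap = 0;
  w.len = 0;
  worklist_push (&w, root);
  while (w.len > 0 && !found)
    {
      const struct node *n = worklist_pop (&w);
      if (n->value > limit)
        found = 1;
      for (size_t i = 0; i < n->n_children; i++)
        worklist_push (&w, n->child[i]);
    }
  /* The only teardown cost in the common case is this one test.  */
  if (w.heap)
    free (w.heap);
  return found;
}
```

The walk needs no callback and no void * cookie; the caller's state (here LIMIT) is just a local variable, which is the readability gain mentioned earlier in the thread.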
Re: [C++ Patch] PR 13981 (resend)
... Manuel suggested to also use DECL_SOURCE_LOCATION, while we are at it. Thus I'm retesting the below.

Thanks,
Paolo.

//

Index: cp/typeck.c
===
--- cp/typeck.c (revision 210205)
+++ cp/typeck.c (working copy)
@@ -8094,6 +8094,14 @@ convert_for_assignment (tree type, tree rhs,
                 default:
                   gcc_unreachable();
               }
+          if (TYPE_PTR_P (rhstype)
+              && TYPE_PTR_P (type)
+              && CLASS_TYPE_P (TREE_TYPE (rhstype))
+              && CLASS_TYPE_P (TREE_TYPE (type))
+              && !COMPLETE_TYPE_P (TREE_TYPE (rhstype)))
+            inform (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL
+                                          (TREE_TYPE (rhstype))),
+                    "class type %qT is incomplete", TREE_TYPE (rhstype));
         }
       return error_mark_node;
     }
Index: testsuite/g++.dg/diagnostic/pr13981.C
===
--- testsuite/g++.dg/diagnostic/pr13981.C (revision 0)
+++ testsuite/g++.dg/diagnostic/pr13981.C (working copy)
@@ -0,0 +1,12 @@
+// PR c++/13981
+
+struct A {};
+struct B; // { dg-message "is incomplete" }
+
+void func( A *a );
+
+int main()
+{
+  B *b = 0;
+  func(b); // { dg-error "cannot convert" }
+}
[PATCH][1/n][RFC] Make FRE/PRE somewhat predicate aware
Ok, not really predicate aware, but this makes value-numbering pessimistically handle non-executable edges. In the following patch groundwork is laid and PHI value-numbering is adjusted to take advantage of edges known to be not executable.

SCCVN is not well-suited to be control aware, but we still can see if value-numbering allows us to mark edges as not executable by looking at control statements. Value-numbering of PHI nodes is one obvious consumer of such information and it also gives a natural order to do that (pessimistic) edge executability computation - dominator order.

Thus the following adds a pass over all control statements, trying to simplify them after value-numbering their operands (and all uses recursively, as SCCVN does).

With followup patches I will try to use this information to reduce the amount of work done (also improving optimization, of course). One other obvious candidate is the alias walker which doesn't have to consider unreachable paths when walking into virtual PHIs.

The patch will likely get some more cleanups (due to the hack in set_ssa_val_to).

Comments still welcome.

Thanks,
Richard.

2014-05-08  Richard Biener  <rguent...@suse.de>

        * tree-ssa-sccvn.c: Include tree-cfg.h and domwalk.h.
        (set_ssa_val_to): Handle unexpected sets to VN_TOP.
        (visit_phi): Ignore edges marked as not executable.
        (class cond_dom_walker): New.
        (cond_dom_walker::before_dom_children): Value-number
        control statements and mark successor edges as not
        executable if possible.
        (run_scc_vn): First walk all control statements in
        dominator order, marking edges as not executable.

        * gcc.dg/tree-ssa/ssa-fre-39.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c.orig 2014-05-08 12:22:58.926026518 +0200
--- gcc/tree-ssa-sccvn.c 2014-05-08 13:42:50.646696614 +0200
*************** along with GCC; see the file COPYING3.
*** 51,56 ****
--- 51,58 ----
  #include "params.h"
  #include "tree-ssa-propagate.h"
  #include "tree-ssa-sccvn.h"
+ #include "tree-cfg.h"
+ #include "domwalk.h"

  /* This algorithm is based on the SCC algorithm presented by Keith
     Cooper and L. Taylor Simpson in "SCC-Based Value numbering"
*************** set_ssa_val_to (tree from, tree to)
*** 2661,2666 ****
--- 2663,2687 ----
    tree currval = SSA_VAL (from);
    HOST_WIDE_INT toff, coff;

+   /* The only thing we allow as value numbers are ssa_names
+      and invariants.  So assert that here.  We don't allow VN_TOP
+      as visiting a stmt should produce a value-number other than
+      that.
+      ???  Still VN_TOP can happen for unreachable code, so force
+      it to varying in that case.  Not all code is prepared to
+      get VN_TOP on valueization.  */
+   if (to == VN_TOP)
+     {
+       if (dump_file && (dump_flags & TDF_DETAILS))
+         fprintf (dump_file, "Forcing value number to varying on "
+                  "receiving VN_TOP\n");
+       to = from;
+     }
+
+   gcc_assert (to != NULL_TREE
+               && (TREE_CODE (to) == SSA_NAME
+                   || is_gimple_min_invariant (to)));
+
    if (from != to)
      {
        if (currval == from)
*************** set_ssa_val_to (tree from, tree to)
*** 2680,2692 ****
          to = from;
      }

-   /* The only thing we allow as value numbers are VN_TOP, ssa_names
-      and invariants.  So assert that here.  */
-   gcc_assert (to != NULL_TREE
-               && (to == VN_TOP
-                   || TREE_CODE (to) == SSA_NAME
-                   || is_gimple_min_invariant (to)));
-
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file, "Setting value number of ");
--- 2701,2706 ----
*************** visit_phi (gimple phi)
*** 3071,3077 ****
    tree result;
    tree sameval = VN_TOP;
    bool allsame = true;
-   unsigned i;

    /* TODO: We could check for this in init_sccvn, and replace this
       with a gcc_assert.  */
--- 3085,3090 ----
*************** visit_phi (gimple phi)
*** 3080,3106 ****

    /* See if all non-TOP arguments have the same value.  TOP is
       equivalent to everything, so we can ignore it.  */
!   for (i = 0; i < gimple_phi_num_args (phi); i++)
!     {
!       tree def = PHI_ARG_DEF (phi, i);
!       if (TREE_CODE (def) == SSA_NAME)
!         def = SSA_VAL (def);
!       if (def == VN_TOP)
!         continue;
!       if (sameval == VN_TOP)
!         {
!           sameval = def;
!         }
!       else
!         {
!           if (!expressions_equal_p (def, sameval))
!             {
!               allsame = false;
!               break;
!             }
!         }
!     }

    /* If all value numbered to the same value, the phi node has that
       value.  */
--- 3093,3122 ----

    /* See if all non-TOP arguments have the same value.  TOP is
       equivalent to everything, so we can ignore it.  */
!   edge_iterator ei;
!   edge e;
!   FOR_EACH_EDGE (e, ei, gimple_bb (phi)->preds)
!
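The effect of ignoring non-executable edges when value-numbering a PHI can be sketched in isolation. The code below is an assumption-laden toy, not the visit_phi from the patch: value numbers are plain ints, VN_TOP_ID and VN_VARYING_ID are hypothetical markers, and struct phi_arg bundles an argument with the executability of its incoming edge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical markers: TOP means "not yet known", VARYING means
   "no single value".  Real value numbers are non-negative here.  */
enum { VN_TOP_ID = -1, VN_VARYING_ID = -2 };

struct phi_arg
{
  int value_number;      /* Value number of the incoming argument.  */
  bool edge_executable;  /* False if the incoming edge is known dead.  */
};

/* Return the common value number of all arguments that arrive over
   executable edges, ignoring TOP.  Return VN_VARYING_ID if two such
   arguments differ and VN_TOP_ID if nothing is known.  */
static int
phi_value (const struct phi_arg *args, size_t n_args)
{
  int sameval = VN_TOP_ID;

  assert (args != NULL || n_args == 0);
  for (size_t i = 0; i < n_args; i++)
    {
      if (!args[i].edge_executable)
        continue;
      if (args[i].value_number == VN_TOP_ID)
        continue;
      if (sameval == VN_TOP_ID)
        sameval = args[i].value_number;
      else if (sameval != args[i].value_number)
        return VN_VARYING_ID;
    }
  return sameval;
}
```

A PHI whose arguments are 7, 9, 7 is varying in general, but if the edge carrying 9 is known not to be executable, the PHI collapses to 7. That is the extra precision the patch description is after.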
Re: [Patch, Fortran] Add support for TS18508's CO_SUM/MAX/MIN (part 1/2)
Dear Tobias,

OK for trunk with one slight quibble: I would rather see

+  /* Generate the function call.  */
+  if (code->resolved_isym->id == GFC_ISYM_CO_MAX)
+    fndecl = gfor_fndecl_co_max;
+  else if (code->resolved_isym->id == GFC_ISYM_CO_MIN)
+    fndecl = gfor_fndecl_co_min;
+  else if (code->resolved_isym->id == GFC_ISYM_CO_SUM)
+    fndecl = gfor_fndecl_co_sum;
+  else
+    gcc_unreachable ();
+

but your version is obviously functionally OK.

Thanks for the patch

Paul
[libgcc, build] Don't build libgcc-unwind.map with --disable-shared (PR libgcc/61097)
As reported in the PR, libgcc fails to build on Solaris with --disable-shared: the creation of libgcc-unwind.map depends on libgcc-std.ver which isn't built in this case.

Fixed as follows, tested by verifying that a --disable-shared i386-pc-solaris2.10 build gets into stage2 without trying to build map files, while a default (i.e. --enable-shared) build still correctly builds the maps.

Installed on mainline; will backport to the 4.9 branch in a few days.

        Rainer

2014-05-08  Rainer Orth  <r...@cebitec.uni-bielefeld.de>

        PR libgcc/61097
        * config/t-slibgcc-sld: Only build and install libgcc-unwind.map if
        --enable-shared.

# HG changeset patch
# Parent 8b4f4776ed04d118977a300b92559035f3b7a49b
Don't build libgcc-unwind.map with --disable-shared (PR libgcc/61097)

diff --git a/libgcc/config/t-slibgcc-sld b/libgcc/config/t-slibgcc-sld
--- a/libgcc/config/t-slibgcc-sld
+++ b/libgcc/config/t-slibgcc-sld
@@ -4,6 +4,8 @@
 SHLIB_LDFLAGS = -Wl,-h,$(SHLIB_SONAME) -Wl,-z,text -Wl,-z,defs \
        -Wl,-M,$(SHLIB_MAP)

+ifeq ($(enable_shared),yes)
+
 # Linker mapfile to enforce direct binding to libgcc_s unwinder
 # (PR target/59788).
 libgcc-unwind.map: libgcc-std.ver
@@ -26,3 +28,5 @@ install-libgcc-unwind-map: libgcc-unwind
        $(INSTALL_DATA) $< $(DESTDIR)$(slibdir)

 install: install-libgcc-unwind-map
+
+endif

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Wed, Dec 4, 2013 at 12:56 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
> Richard Sandiford <rdsandif...@googlemail.com> writes:
>> This patch handles multiplications using a single HWIxHWI->2HWI multiplication
>> on hosts that have one. This removes all uses of the slow (half-HWI) path
>> for insn-recog.ii. The slow path is still used 58 times for cp/parser.ii
>> and 168 times for fold-const.ii, but at that kind of level it shouldn't
>> matter much.
>>
>> I followed Joseph's suggestion and reused longlong.h. I copied it from
>> libgcc rather than glibc since it seemed better for GCC to have a single
>> version across both gcc/ and libgcc/. I can put it in include/ if that
>> seems better.
>
> I've committed the patch to move longlong.h to trunk and merged back
> to the branch, so all that's left is the wide-int.cc patch. OK to install?
>
> Thanks,
> Richard
>
>
> Index: gcc/wide-int.cc
> ===
> --- gcc/wide-int.cc 2013-12-03 23:59:08.133658567 +
> +++ gcc/wide-int.cc 2013-12-04 12:55:28.466895358 +
> @@ -27,6 +27,16 @@ along with GCC; see the file COPYING3.
>  #include "tree.h"
>  #include "dumpfile.h"
>
> +#if GCC_VERSION >= 3000
> +#define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
> +typedef unsigned HOST_HALF_WIDE_INT UHWtype;
> +typedef unsigned HOST_WIDE_INT UWtype;
> +typedef unsigned int UQItype __attribute__ ((mode (QI)));
> +typedef unsigned int USItype __attribute__ ((mode (SI)));
> +typedef unsigned int UDItype __attribute__ ((mode (DI)));
> +#include "longlong.h"

We also need something like the attached patch to handle architectures which use UDWtype in longlong.h. I noticed this when trying rth's patch stack http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00391.html to improve longlong.h for AArch64.

It's needed when rth's patches go in finally for aarch64 but could probably go in now - after all the comment in longlong.h says you need to define UDWtype ...

Don't mind either way.

regards
Ramana

DATE  Ramana Radhakrishnan  <ramana.radhakrish...@arm.com>

        * wide-int.cc (UTItype): Define.
        (UDWtype): Define for appropriate W_TYPE_SIZE.

> +#endif
> +
>  /* This is the maximal size of the buffer needed for dump.  */
>  const unsigned int MAX_SIZE = (4 * (MAX_BITSIZE_MODE_ANY_INT / 4
>                                      + (MAX_BITSIZE_MODE_ANY_INT
> @@ -1255,8 +1265,8 @@ wi_pack (unsigned HOST_WIDE_INT *result,
>     record in *OVERFLOW whether the result overflowed.  SGN controls
>     the signedness and is used to check overflow or if HIGH is set.  */
>  unsigned int
> -wi::mul_internal (HOST_WIDE_INT *val, const HOST_WIDE_INT *op1,
> -                  unsigned int op1len, const HOST_WIDE_INT *op2,
> +wi::mul_internal (HOST_WIDE_INT *val, const HOST_WIDE_INT *op1val,
> +                  unsigned int op1len, const HOST_WIDE_INT *op2val,
>                    unsigned int op2len, unsigned int prec, signop sgn,
>                    bool *overflow, bool high)
>  {
> @@ -1285,24 +1295,53 @@ wi::mul_internal (HOST_WIDE_INT *val, co
>    if (needs_overflow)
>      *overflow = false;
>
> +  wide_int_ref op1 = wi::storage_ref (op1val, op1len, prec);
> +  wide_int_ref op2 = wi::storage_ref (op2val, op2len, prec);
> +
>    /* This is a surprisingly common case, so do it first.  */
> -  if ((op1len == 1 && op1[0] == 0) || (op2len == 1 && op2[0] == 0))
> +  if (op1 == 0 || op2 == 0)
>      {
>        val[0] = 0;
>        return 1;
>      }
>
> +#ifdef umul_ppmm
> +  if (sgn == UNSIGNED)
> +    {
> +      /* If the inputs are single HWIs and the output has room for at
> +         least two HWIs, we can use umul_ppmm directly.  */
> +      if (prec >= HOST_BITS_PER_WIDE_INT * 2
> +          && wi::fits_uhwi_p (op1)
> +          && wi::fits_uhwi_p (op2))
> +        {
> +          umul_ppmm (val[1], val[0], op1.ulow (), op2.ulow ());
> +          return 1 + (val[1] != 0 || val[0] < 0);
> +        }
> +      /* Likewise if the output is a full single HWI, except that the
> +         upper HWI of the result is only used for determining overflow.
> +         (We handle this case inline when overflow isn't needed.)  */
> +      else if (prec == HOST_BITS_PER_WIDE_INT)
> +        {
> +          unsigned HOST_WIDE_INT upper;
> +          umul_ppmm (upper, val[0], op1.ulow (), op2.ulow ());
> +          if (needs_overflow)
> +            *overflow = (upper != 0);
> +          return 1;
> +        }
> +    }
> +#endif
> +
>    /* Handle multiplications by 1.  */
> -  if (op1len == 1 && op1[0] == 1)
> +  if (op1 == 1)
>      {
>        for (i = 0; i < op2len; i++)
> -        val[i] = op2[i];
> +        val[i] = op2val[i];
>        return op2len;
>      }
> -  if (op2len == 1 && op2[0] == 1)
> +  if (op2 == 1)
>      {
>        for (i = 0; i < op1len; i++)
> -        val[i] = op1[i];
> +        val[i] = op1val[i];
>        return op1len;
>      }
>
> @@ -1316,13 +1355,13 @@ wi::mul_internal (HOST_WIDE_INT *val, co
>
>        if (sgn == SIGNED)
>          {
> -          o0 = sext_hwi (op1[0], prec);
> -          o1 = sext_hwi (op2[0], prec);
> +          o0 = op1.to_shwi ();
> +          o1 =
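The difference between the half-word slow path and a single widening multiplication can be illustrated at a smaller scale. The sketch below assumes a 32-bit "word" so that standard uint64_t can play the role of the double-width result; mul_halves and mul_widening are hypothetical helpers written for this note, not umul_ppmm from longlong.h and not code from wide-int.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Slow path: schoolbook multiplication on 16-bit halves, producing
   the 64-bit product of A and B as a (HI, LO) pair of 32-bit words.  */
static void
mul_halves (uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
  assert (hi && lo);

  uint32_t a0 = a & 0xffff, a1 = a >> 16;
  uint32_t b0 = b & 0xffff, b1 = b >> 16;

  /* Four partial products, each of which fits in 32 bits.  */
  uint32_t p00 = a0 * b0;
  uint32_t p01 = a0 * b1;
  uint32_t p10 = a1 * b0;
  uint32_t p11 = a1 * b1;

  /* Middle column: at most 0xffff + 0xffff + 0xffff, so no overflow.  */
  uint32_t mid = (p00 >> 16) + (p01 & 0xffff) + (p10 & 0xffff);

  *lo = (p00 & 0xffff) | (mid << 16);
  *hi = p11 + (p01 >> 16) + (p10 >> 16) + (mid >> 16);
}

/* Fast path: one widening multiplication, available whenever the
   host has a type twice as wide as the word.  */
static void
mul_widening (uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
  assert (hi && lo);

  uint64_t p = (uint64_t) a * b;
  *hi = (uint32_t) (p >> 32);
  *lo = (uint32_t) p;
}

/* Unsigned overflow check for a single-word result: the product
   overflows exactly when the upper word is nonzero.  */
static int
mul_overflows (uint32_t a, uint32_t b)
{
  uint32_t hi, lo;
  mul_widening (a, b, &hi, &lo);
  return hi != 0;
}
```

Both paths give the same (hi, lo) pair; the second needs one multiply instead of four plus carry handling, which is the saving the quoted patch aims for on hosts that provide umul_ppmm.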
Re: [Fortran, Patch] Some prep patches for coarray communication
Dear Tobias, This one is fine for trunk. Thanks for the patch. Paul
[patch,avr] Fix PR61055: Wrong branch instruction after INC, DEC.
Some instructions like INC, DEC, NEG currently set cc0 to set_zn which is not the whole story because they also set the V flag. This leads to a wrong branch instruction in the remainder as cc0 is used. Fix is to do same as clobber cc0. For the matter of clarity, I added a new cc0 alternative set_vzn for that case. Moreover, ADIW sets cc0 to set_czn rather than set_zn. This is the same as the action of a single ADD and like ADIW was modeled the old days (before avr_out_plus_1 was introduced to print the output). No new regressions. Ok to apply? Johann gcc/config/avr PR target/61055 * config/avr/avr.md (cc): Add new attribute set_vzn. (addqi3, addqq3, adduqq3, subqi3, subqq3, subuqq3, negqi2) [cc]: Set cc insn attribute to set_vzn instead of set_zn for alternatives with INC, DEC or NEG. * config/avr/avr.c (avr_notice_update_cc): Handle SET_VZN. (avr_out_plus_1): ADIW sets cc0 to CC_SET_CZN. INC, DEC and ADD+ADC set cc0 to CC_CLOBBER. gcc/testsuite/ PR target/61055 * gcc.target/avr/torture/pr61055.c: New test. Index: config/avr/avr.c === --- config/avr/avr.c (revision 210209) +++ config/avr/avr.c (working copy) @@ -2359,6 +2359,12 @@ avr_notice_update_cc (rtx body ATTRIBUTE } break; +case CC_SET_VZN: + /* Insn like INC, DEC, NEG that set Z,N,V. We currently don't make use + of this combination, cf. also PR61055. */ + CC_STATUS_INIT; + break; + case CC_SET_CZN: /* Insn sets the Z,N,C flags of CC to recog_operand[0]. The V flag may or may not be known but that's ok because @@ -6290,7 +6296,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu if (REG_P (xop[2])) { - *pcc = MINUS == code ? (int) CC_SET_CZN : (int) CC_SET_N; + *pcc = MINUS == code ? (int) CC_SET_CZN : (int) CC_CLOBBER; for (i = 0; i n_bytes; i++) { @@ -6399,7 +6405,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu op, plen, 1); if (n_bytes == 2 PLUS == code) -*pcc = CC_SET_ZN; +*pcc = CC_SET_CZN; } i++; @@ -6422,6 +6428,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu { avr_asm_len ((code == PLUS) ^ (val8 == 1) ? 
dec %0 : inc %0, op, plen, 1); + *pcc = CC_CLOBBER; break; } Index: config/avr/avr.md === --- config/avr/avr.md (revision 210209) +++ config/avr/avr.md (working copy) @@ -90,7 +90,7 @@ (define_c_enum unspecv (include constraints.md) ;; Condition code settings. -(define_attr cc none,set_czn,set_zn,set_n,compare,clobber, +(define_attr cc none,set_czn,set_zn,set_vzn,set_n,compare,clobber, plus,ldi (const_string none)) @@ -1098,7 +1098,7 @@ (define_insn addmode3 inc %0\;inc %0 dec %0\;dec %0 [(set_attr length 1,1,1,1,2,2) - (set_attr cc set_czn,set_czn,set_zn,set_zn,set_zn,set_zn)]) + (set_attr cc set_czn,set_czn,set_vzn,set_vzn,set_vzn,set_vzn)]) ;; addhi3 ;; addhq3 adduhq3 @@ -1369,7 +1369,7 @@ (define_insn submode3 dec %0\;dec %0 inc %0\;inc %0 [(set_attr length 1,1,1,1,2,2) - (set_attr cc set_czn,set_czn,set_zn,set_zn,set_zn,set_zn)]) + (set_attr cc set_czn,set_czn,set_vzn,set_vzn,set_vzn,set_vzn)]) ;; subhi3 ;; subhq3 subuhq3 @@ -3992,7 +3992,7 @@ (define_insn negqi2 neg %0 [(set_attr length 1) - (set_attr cc set_zn)]) + (set_attr cc set_vzn)]) (define_insn *negqihi2 [(set (match_operand:HI 0 register_operand=r) Index: testsuite/gcc.target/avr/torture/pr61055.c === --- testsuite/gcc.target/avr/torture/pr61055.c (revision 0) +++ testsuite/gcc.target/avr/torture/pr61055.c (revision 0) @@ -0,0 +1,88 @@ +/* { dg-do run } */ +/* { dg-options { -fno-peephole2 } } */ + +#include stdlib.h + +typedef __UINT16_TYPE__ uint16_t; +typedef __INT16_TYPE__ int16_t; +typedef __UINT8_TYPE__ uint8_t; + +uint8_t __attribute__((noinline,noclone)) +fun_inc (uint8_t c0) +{ + register uint8_t c asm (r15) = c0; + + /* Force target value into R15 (lower register) */ + asm ( : +l (c)); + + c++; + if (c = 0x80) +c = 0; + + asm ( : +l (c)); + + return c; +} + +uint8_t __attribute__((noinline,noclone)) +fun_dec (uint8_t c0) +{ + register uint8_t c asm (r15) = c0; + + /* Force target value into R15 (lower register) */ + asm ( : +l (c)); + + c--; + if (c 0x80) +c = 0; + + asm ( : +l (c)); + + 
return c; +} + + +uint8_t __attribute__((noinline,noclone)) +fun_neg (uint8_t c0) +{ + register uint8_t c asm (r15) = c0; + + c = -c; + if (c = 0x80) +c = 0; + + return c; +} + +uint16_t __attribute__((noinline,noclone)) +fun_adiw (uint16_t c0) +{ + register
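The flag behaviour behind this PR can be modelled in a few lines of C. This is only a sketch under assumptions: the struct and function names are made up for illustration, and the flag rules are taken from the AVR instruction set description (INC sets V when 0x7f becomes 0x80, DEC sets V when 0x80 becomes 0x7f, S is N xor V), not from the patch itself. It shows that after such an overflow a branch on N and a branch on S disagree, which is why treating INC/DEC as plain set_zn is not the whole story.

```c
#include <assert.h>
#include <stdint.h>

/* The AVR status flags that matter here.  C is left out because
   INC and DEC do not change it.  */
struct avr_flags {
    unsigned z;  /* result is zero */
    unsigned n;  /* bit 7 of the result */
    unsigned v;  /* two's complement overflow */
    unsigned s;  /* N xor V, the flag that BRLT/BRGE test */
};

/* Model of INC Rd: V is set only when 0x7f wraps to 0x80.  */
static struct avr_flags avr_inc(uint8_t *reg)
{
    struct avr_flags f;
    *reg = (uint8_t)(*reg + 1);
    f.z = (*reg == 0);
    f.n = (*reg >> 7) & 1u;
    f.v = (*reg == 0x80);
    f.s = f.n ^ f.v;
    return f;
}

/* Model of DEC Rd: V is set only when 0x80 wraps to 0x7f.  */
static struct avr_flags avr_dec(uint8_t *reg)
{
    struct avr_flags f;
    *reg = (uint8_t)(*reg - 1);
    f.z = (*reg == 0);
    f.n = (*reg >> 7) & 1u;
    f.v = (*reg == 0x7f);
    f.s = f.n ^ f.v;
    return f;
}

/* Hypothetical self-check: in both overflow cases N and S differ,
   so a signed branch cannot be derived from Z and N alone.  */
static void check_overflow_cases(void)
{
    uint8_t r = 0x7f;
    struct avr_flags f = avr_inc(&r);
    assert(r == 0x80 && f.n == 1 && f.v == 1 && f.s == 0);

    r = 0x80;
    f = avr_dec(&r);
    assert(r == 0x7f && f.n == 0 && f.v == 1 && f.s == 1);
}
```

Away from the two wrap-around values V stays clear and S equals N, so the difference only shows up at the boundary values that the new test exercises.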
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
Ramana Radhakrishnan ramana@googlemail.com writes:

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 354cdb9..8ef9a0f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2014-05-08  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
+
+	* wide-int.cc (UTItype): Define.
+	(UDWtype): Define for appropriate W_TYPE_SIZE.
+
 2014-05-08  Alan Modra  amo...@gmail.com
 
 	PR target/60737
diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc
index 69a15bc..3552e03 100644
--- a/gcc/wide-int.cc
+++ b/gcc/wide-int.cc
@@ -34,6 +34,12 @@ typedef unsigned HOST_WIDE_INT UWtype;
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
+typedef unsigned int UTItype __attribute__ ((mode (TI)));
+#if W_TYPE_SIZE == 32
+# define UDWtype UDItype
+#elif W_TYPE_SIZE == 64
+# define UDWtype UTItype
+#endif
 #include "longlong.h"
 #endif

OK, thanks. Technically the W_TYPE_SIZE == 32 case should be dead now that HWI is always 64 bits, but there's still a debate about whether wide-int should continue using HOST_WIDE_INT/int64_t or something else, so I think it would be best to keep it.

Richard
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 3 May 2014 08:11, John Marino wrote:

On 5/2/2014 22:15, Joseph S. Myers wrote:

On Fri, 2 May 2014, John Marino wrote:

1) I don't know which type definitions are missing (iow, the important ones from sys/type.h that are required to build gcc)

The default presumption should be:

* stddef.h from GCC provides what it needs to provide; nothing extra is needed and such a #include should not be needed at all.

* Special measures to avoid duplicate typedefs (where some other header also defines one of the typedefs defined in stddef.h) aren't in fact needed, because GCC allows duplicate typedefs in system headers (even outside C11 mode - in C11 mode it's a standard feature).

So try removing that #include. If that causes problems, investigate based on the actual problems seen.

Hi Joseph,

Removing the include worked after also removing the #ifdef __DragonFly with regards to the rune_t type definition. I built gcc with a full bootstrap on both DragonFly platforms successfully. stddef.h is much simpler now:

--- gcc/ginclude/stddef.h.orig
+++ gcc/ginclude/stddef.h
@@ -133,6 +133,7 @@
 #ifndef _BSD_PTRDIFF_T_
 #ifndef ___int_ptrdiff_t_h
 #ifndef _GCC_PTRDIFF_T
+#ifndef _PTRDIFF_T_DECLARED /* DragonFly */
 #define _PTRDIFF_T
 #define _T_PTRDIFF_
 #define _T_PTRDIFF
@@ -141,10 +142,12 @@
 #define _BSD_PTRDIFF_T_
 #define ___int_ptrdiff_t_h
 #define _GCC_PTRDIFF_T
+#define _PTRDIFF_T_DECLARED
 #ifndef __PTRDIFF_TYPE__
 #define __PTRDIFF_TYPE__ long int
 #endif
 typedef __PTRDIFF_TYPE__ ptrdiff_t;
+#endif /* _PTRDIFF_T_DECLARED */
 #endif /* _GCC_PTRDIFF_T */
 #endif /* ___int_ptrdiff_t_h */
 #endif /* _BSD_PTRDIFF_T_ */
@@ -198,6 +201,7 @@
 #define _GCC_SIZE_T
 #define _SIZET_
 #if (defined (__FreeBSD__) && (__FreeBSD__ >= 5)) \
+  || defined(__DragonFly__) \
   || defined(__FreeBSD_kernel__)
 /* __size_t is a typedef on FreeBSD 5, must not trash it. */
 #elif defined (__VMS__)

revised patchset : http://leaf.dragonflybsd.org/~marino/gcc-df-target/patches/patch-dragonfly-target
revised changelog : http://leaf.dragonflybsd.org/~marino/gcc-df-target/changelog_entries/gcc_ChangeLog_entry.txt
revised commit msg: http://leaf.dragonflybsd.org/~marino/gcc-df-target/proposed_commit-msg.txt

Good catch! Does the rest of the patch set look good to you? I think all the non-obvious patches have been reviewed collectively by various people now and may be ready to be approved now.

Ian's approved the libiberty.h change, Joseph's approved the stddef.h change, I've approved the libstdc++ parts. IIUC it still needs explicit approval for the rest, e.g. trivial adjustments to configuration stuff in libitm and libcilkrts. Are there specific maintainers for those libs?

The rest look obvious to me, it doesn't touch other targets at all except for one bit that replaces a check for __FreeBSD__ >= 7 with a grep for the dl_iterate_phdr function in a system header, which only affects FreeBSD and looks OK to me.

Anyone willing to give it an overall approval?
Re: [RFA/dwarf v2] Add DW_AT_GNAT_use_descriptive_type flag for Ada units.
Ping? I understand there was a freeze period for a while, but the patch has been out for a little over 3 months, now, and is fairly trivial. Can someone review it for me? Suggest also including it in 4.9.

Thank you!

On Tue, Feb 19, 2013 at 10:50:46PM -0500, Jason Merrill wrote:

On 02/19/2013 10:42 PM, Joel Brobecker wrote:

This is useful when a DIE does not have a descriptive type attribute. In that case, the debugger needs to determine whether the unit was compiled with a compiler that normally provides that information, or not.

Ah. OK, then. But I'd prefer to call it DW_AT_GNAT_use_descriptive_type, to follow the convention of keeping the vendor tag at the beginning of the name.

Almost a year ago, you privately approved a small patch of mine, with the small request above. I'm sorry I let it drag so long! Here is the updated patch.

include/ChangeLog:

* dwarf2.def: Rename DW_AT_use_GNAT_descriptive_type into DW_AT_GNAT_use_descriptive_type.

gcc/ChangeLog:

* dwarf2out.c (gen_compile_unit_die): Add DW_AT_GNAT_use_descriptive_type attribute for Ada units.

Tested on x86_64-linux.

I should also adjust the Wiki page accordingly, but the login process keeps timing out. I know I have the right login and passwd since I successfully reset them using the passwd recovery procedure, just in case the error was due to bad credentials. I'll try again later. If approved, I will also take care of coordinating the dwarf2.def change with binutils-gdb.git.

Is this patch still OK to commit?

Thank you,
--
Joel

From 7aae3721addf6905113d9f0287a5cbb5301a462b Mon Sep 17 00:00:00 2001
From: Joel Brobecker brobec...@adacore.com
Date: Thu, 3 Jan 2013 09:25:12 -0500
Subject: [PATCH] [dwarf] Add DW_AT_GNAT_use_descriptive_type flag for Ada units.

This patch first renames the DW_AT_use_GNAT_descriptive_type DWARF attribute into DW_AT_GNAT_use_descriptive_type to better follow the usual convention of keeping the vendor tag at the beginning of the name. It then modifies dwarf2out to generate this attribute for Ada units.

include/ChangeLog:

* dwarf2.def: Rename DW_AT_use_GNAT_descriptive_type into DW_AT_GNAT_use_descriptive_type.

gcc/ChangeLog:

* dwarf2out.c (gen_compile_unit_die): Add DW_AT_GNAT_use_descriptive_type attribute for Ada units.

---
 gcc/dwarf2out.c    |    4 ++++
 include/dwarf2.def |    2 +-
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index d1ca4ba..057605c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19318,6 +19318,10 @@ gen_compile_unit_die (const char *filename)
       /* The default DW_ID_case_sensitive doesn't need to be specified.  */
       break;
     }
+
+  if (language == DW_LANG_Ada95)
+    add_AT_flag (die, DW_AT_GNAT_use_descriptive_type, 1);
+
   return die;
 }

diff --git a/include/dwarf2.def b/include/dwarf2.def
index 71a37b3..4dd636e 100644
--- a/include/dwarf2.def
+++ b/include/dwarf2.def
@@ -398,7 +398,7 @@ DW_AT (DW_AT_VMS_rtnbeg_pd_address, 0x2201)
 /* GNAT extensions.  */
 /* GNAT descriptive type.
    See http://gcc.gnu.org/wiki/DW_AT_GNAT_descriptive_type .  */
-DW_AT (DW_AT_use_GNAT_descriptive_type, 0x2301)
+DW_AT (DW_AT_GNAT_use_descriptive_type, 0x2301)
 DW_AT (DW_AT_GNAT_descriptive_type, 0x2302)
 /* UPC extension.  */
 DW_AT (DW_AT_upc_threads_scaled, 0x3210)
--
1.7.0.4

--
Joel
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 05/08/14 07:14, Jonathan Wakely wrote:

Ian's approved the libiberty.h change, Joseph's approved the stddef.h change, I've approved the libstdc++ parts. IIUC it still needs explicit approval for the rest, e.g. trivial adjustments to configuration stuff in libitm and libcilkrts. Are there specific maintainers for those libs?

The rest look obvious to me, it doesn't touch other targets at all except for one bit that replaces a check for __FreeBSD__ >= 7 with a grep for the dl_iterate_phdr function in a system header, which only affects FreeBSD and looks OK to me.

Anyone willing to give it an overall approval?

I'll take a look at the rest. I mostly wanted someone else to deal with stddef.h :-)

jeff
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 5/8/2014 15:32, Jeff Law wrote: On 05/08/14 07:14, Jonathan Wakely wrote: Anyone willing to give it an overall approval? I'll take a look at the rest. I mostly wanted someone else to deal with stddef.h :-) Thanks Jeff! I am very appreciative of that. John
Re: [RS6000] Fix PR61098, Poor code setting count register
On Wed, May 7, 2014 at 9:48 PM, Alan Modra amo...@gmail.com wrote:

On powerpc64, to set a large loop count we have code like the following after split1:

(insn 67 14 68 4 (set (reg:DI 160)
        (const_int 99942400 [0x5f50000])) /home/amodra/unaligned_load.c:14 -1
     (nil))
(insn 68 67 42 4 (set (reg:DI 160)
        (ior:DI (reg:DI 160)
            (const_int 57600 [0xe100]))) /home/amodra/unaligned_load.c:14 -1
     (expr_list:REG_EQUAL (const_int 100000000 [0x5f5e100])
        (nil)))

and then test for loop exit with:

(jump_insn 65 31 45 5 (parallel [
            (set (pc)
                (if_then_else (ne (reg:DI 160)
                        (const_int 1 [0x1]))
                    (label_ref:DI 42)
                    (pc)))
            (set (reg:DI 160)
                (plus:DI (reg:DI 160)
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:CC))
            (clobber (scratch:DI))
        ]) /home/amodra/unaligned_load.c:15 800 {*ctrdi_internal1}
     (int_list:REG_BR_PROB 9899 (nil))
 -> 42)

The jump_insn of course is meant for use with bdnz, which implies a strong preference for reg 160 to live in the count register. Trouble is, the count register doesn't do arithmetic. So, use a new pseudo for intermediate results.

On looking at this, I noticed the !TARGET_POWERPC64 code in rs6000_emit_set_long_const was broken, apparently expecting c1 and c2 to be the high and low 32 bits of the constant. That's no longer true, so I've fixed that as well.

Bootstrapped and regression tested powerpc64-linux. OK for mainline and branches?

PR target/61098
* config/rs6000/rs6000.c (rs6000_emit_set_const): Remove unneeded params and return value. Simplify. Update comment.
(rs6000_emit_set_long_const): Remove unneeded param and return value. Correct !TARGET_POWERPC64 handling of constants > 2G. If we can, use a new pseudo for intermediate calculations.

Alan,

The history is 32 bit HWI.

The ChangeLog does not mention the changes to rs6000.md nor rs6000-protos.h.

Please do not remove all of the comments from the two functions. The comments should provide some documentation about the different purposes of the two functions other than setting DEST to a CONST.

Why did you remove the test for NULL dest?

- if (dest == NULL)
-   dest = gen_reg_rtx (mode);

That could occur, at least it used to occur.

I think that the way you rearranged the invocations of copy_rtx() in rs6000_emit_set_long_const() is okay, but it would be good for someone else to double check.

Thanks, David
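The arithmetic behind the two insns quoted in this thread can be checked with a short C sketch. The helper names below are hypothetical and nothing here is taken from rs6000.c; it only assumes the 16-bit split that the dump shows (99942400 loaded first, 57600 ORed in) and mirrors the proposed idea of building the value in a separate temporary so that the variable standing in for the count register is written once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a constant the way the two insns in the
   dump do, into a part loaded first and a part ORed in afterwards.  */
static void split_const(uint32_t c, uint32_t *high, uint32_t *low)
{
    *high = c & 0xffff0000u;
    *low  = c & 0x0000ffffu;
}

/* Build the loop count through a temporary, so that the variable
   standing in for the count register sees a single plain move.  */
static uint32_t build_loop_count(uint32_t c)
{
    uint32_t high, low;
    uint32_t tmp;    /* stands in for the new pseudo */
    uint32_t count;  /* stands in for the register bdnz wants */

    split_const(c, &high, &low);
    tmp = high;      /* first insn: load the upper part */
    tmp |= low;      /* second insn: OR in the lower part */
    count = tmp;     /* one move, no arithmetic on the count itself */
    return count;
}

/* Hypothetical self-check using the constants from the dump.  */
static void check_dump_constants(void)
{
    uint32_t high, low;
    split_const(100000000u, &high, &low);
    assert(high == 99942400u);  /* 0x5f50000 */
    assert(low == 57600u);      /* 0xe100 */
    assert(build_loop_count(100000000u) == 100000000u);
}
```

Whether the register allocator then actually keeps the final value in the count register is a separate question that this sketch does not model.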
Re: [C++ Patch] PR 13981 (resend)
OK. Jason
RE: [Patch, PR 60158] Generate .fixup sections for .data.rel.ro.local entries.
-----Original Message-----
From: Alan Modra [mailto:amo...@gmail.com]
Sent: Saturday, April 26, 2014 11:52 AM
To: Dharmakan Rohit-B30502
Cc: gcc-patches@gcc.gnu.org; dje@gmail.com; Wienskoski Edmar-RA8797
Subject: Re: [Patch, PR 60158] Generate .fixup sections for .data.rel.ro.local entries.

On Fri, Apr 25, 2014 at 02:57:38PM +0000, rohitarul...@freescale.com wrote:

Source file: gcc-4.8.2/gcc/varasm.c
@@ -7120,7 +7120,7 @@
if (CONSTANT_POOL_ADDRESS_P (symbol))
{
desc = SYMBOL_REF_CONSTANT (symbol);
output_constant_pool_1 (desc, 1); - (A)
offset += GET_MODE_SIZE (desc->mode);

I think the reason 1 is passed here for align is that with -fsection-anchors, in output_object_block we've already laid out everything in the block, assigning offsets from the start of the block. Aligning shouldn't be necessary, because we've already done that. OTOH, it shouldn't hurt to align again.

Thanks. I have tested for both the cases on e500v2, e500mc, e5500, ppc64 (GCC v4.8.2 branch) with no regressions.

Patch1 [gcc.fix_pr60158_fixup_table-fsf]: Pass actual alignment value to output_constant_pool_2.

Patch2 [gcc.fix_pr60158_fixup_table-fsf-2]: Use the alignment data available in the first argument (constant_descriptor_rtx) of output_constant_pool_1. (Note: this generates .align directive twice).

Is it ok to commit? Any comments?

Regards, Rohit
Re: RFA: Fix calculation of size of builtin setjmp buffer
Hi Mike,

How about GET_MODE_SIZE (STACK_SAVEAREA_MODE (SAVE_NONLOCAL)) / GET_MODE_SIZE (Pmode) + 2 + /* slop for mips, see builtin_setjmp_setup */ 1 - 1. This retains the slop for mips, and fixes ports like ia64 and s390 (see STACK_SAVEAREA_MODE on those ports, it is larger than one might expect)?

OK - revised patch attached. I have added a comment before the computation to explain each of the numbers, and adjusted the comments in the other files to match the new size of the jump buffer.

What do you think of this version?

Cheers
Nick
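To make the proposed expression easier to follow, here is a small C sketch that evaluates it for made-up sizes. Everything in it is an assumption for illustration: the function name is hypothetical, the byte sizes are not taken from any real port, the "+ 2" is read as two extra words besides the stack save area, and the trailing "- 1" is read as turning a word count into a highest index. Only the shape of the formula comes from the message above.

```c
#include <assert.h>

/* Evaluate  save_area_bytes / pointer_bytes + 2 + 1 - 1  with the
   terms named.  The interpretation of each term is an assumption.  */
static unsigned setjmp_buf_last_index(unsigned save_area_bytes,
                                      unsigned pointer_bytes)
{
    unsigned words;

    assert(pointer_bytes != 0);

    words = save_area_bytes / pointer_bytes  /* stack save area */
          + 2                                /* two further words */
          + 1;                               /* slop for mips */
    return words - 1;                        /* count -> last index */
}

/* Hypothetical self-check with invented sizes.  */
static void check_example_sizes(void)
{
    /* Save area the same size as a pointer.  */
    assert(setjmp_buf_last_index(8, 8) == 3);
    /* Save area twice the size of a pointer, the kind of case the
       message says is easy to underestimate.  */
    assert(setjmp_buf_last_index(16, 8) == 4);
}
```

The point of the sketch is only that a save area wider than one pointer grows the buffer, which a fixed-size constant would not account for.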
Re: emit __float128 typeinfo
Ping http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01651.html On Fri, 25 Apr 2014, Marc Glisse wrote: On Fri, 25 Apr 2014, Marc Glisse wrote: the previous patch had to be reverted as it broke the strange handling of vectors in the ARM target. This new patch should be much more conservative I hope. Instead of adding this typeinfo to libsupc++, I am letting the FE know that it isn't available in libsupc++. There are 2 versions, both regtested fine. Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (emit_support_tinfos): Move the array... (fundamentals): ... and make it global. (typeinfo_in_lib_p): Use it. 2014-04-25 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * rtti.c (typeinfo_in_lib_p) [REAL_TYPE]: Check against a hardcoded list of available types. It seems better with a TYPE_CANONICAL in there. It passed bootstrap and the testsuite is running. -- Marc Glisse
RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option
-Original Message- From: Joseph Myers [mailto:jos...@codesourcery.com] Sent: Wednesday, May 07, 2014 9:01 PM To: Herman, Andrei Cc: gcc-patches@gcc.gnu.org; herman_and...@mentor.com Subject: Re: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option On Wed, 7 May 2014, Herman, Andrei wrote: When this flag is set, a DW_TAG_lexical_block DIE will be emitted for every function body, loop body, switch body, case statement, if-then and if-else statement, even if the body is a single statement. Likewise, a lexical block will be emitted for the first label of a labeled statement. This block ends at the end of the current lexical scope, or when a break, continue, goto or return statement is encountered at the same lexical scope level. Consequently, any case in a switch statement that does not flow through to the next case, will have its own dwarf lexical block. The documentation appears to suggest it's purely about debug info and has no effect on language semantics. However, the implementation appears to force C99 scoping rules. I don't think it's appropriate for a debug info option You are right. The C99 scoping rules are forced with this option. to have that effect; that is, gcc.dg/c90-scope-1.c should still pass even with the option enabled (more generally, the whole C testsuite should be verified to work with the option enabled). I suspect the changes adding scopes for labels would also affect language semantics; it's valid in C to have a declaration (not having variably modified type) after one case in a switch statement that gets used in another case even when control does not flow through. The changes in gcc/c/c-decl.c are meant to deal with this problem. Declarations that would fall into the scope of a newly created label scope are moved into the enclosing normal (non label) scope, where they actually belong. 
If you can't avoid affecting language semantics then you need to be very clear in the documentation that the option makes some invalid programs valid and vice versa and changes the semantics of some valid programs (even if you then assert the affected cases are uncommon in real C code). Here is the changed documentation: @item -fforce-dwarf-lexical-blocks Produce debug information (a DW_TAG_lexical_block) for every function body, loop body, switch body, case statement, if-then and if-else statement, even if the body is a single statement. Likewise, a lexical block will be emitted for the first label of a statement. This block ends at the end of the current lexical scope, or when a break, continue, goto or return statement is encountered at the same lexical scope level. This option is useful for coverage tools that utilize the dwarf debug information. This option only applies to C/C++ code and is available when using DWARF Version 4 or higher. Note that when this option is used, it will enforce the scoping rules of the C99 standard, which may make some programs that are invalid, to become valid and vice versa. -- Joseph S. Myers jos...@codesourcery.com Regards, Andrei Herman Mentor Graphics Israel branch
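The case Joseph describes, a declaration after one case label that is used under another case label, looks like this in C99 and later. The function name is made up for illustration, and the sketch makes no claim about how the proposed option treats it; it only shows why giving each case its own scope would change which programs are valid.

```c
#include <assert.h>

/* 'value' is declared after the first case label but belongs to the
   scope of the whole switch body, so the second case may use it even
   though control never passes through the declaration.  */
static int pick(int sel)
{
    switch (sel) {
    case 0:;            /* empty statement so the label has one */
        int value;      /* no initializer, not variably modified */
        value = 10;
        return value;
    case 1:
        value = 20;     /* same object as above, still in scope */
        return value;
    default:
        return -1;
    }
}

/* Hypothetical self-check.  */
static void check_pick(void)
{
    assert(pick(0) == 10);
    assert(pick(1) == 20);
    assert(pick(7) == -1);
}
```

If the first case were given a scope of its own that ended at `return value;`, the assignment under `case 1:` would refer to an undeclared identifier, which is the kind of semantic change the review asks to avoid or to document.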
Fwd: [PATCH, alpha]: Fix PR61092, wide-int merge broke alpha bootstrap
Hello! Wide-int merge triggered following ICE: In file included from ../../gcc-svn/trunk/gcc/wide-int.cc:37:0: ../../gcc-svn/trunk/gcc/wide-int.cc: In function ‘unsigned int wi::mul_internal(long int*, const long int*, unsigned int, const long int*, unsigned int, unsigned int, signop, bool*, bool)’: ../../gcc-svn/trunk/gcc/../include/longlong.h:145:10: sorry, unimplemented: unexpected AST of kind mult_highpart_expr (ph) = __builtin_alpha_umulh (__m0, __m1);\ ^ ../../gcc-svn/trunk/gcc/wide-int.cc:1269:4: note: in expansion of macro ‘umul_ppmm’ umul_ppmm (val[1], val[0], op1.ulow (), op2.ulow ()); ^ ../../gcc-svn/trunk/gcc/../include/longlong.h:145:10: internal compiler error: in potential_constant_expression_1, at cp/semantics.c:10575 (ph) = __builtin_alpha_umulh (__m0, __m1);\ ^ ../../gcc-svn/trunk/gcc/wide-int.cc:1269:4: note: in expansion of macro ‘umul_ppmm’ umul_ppmm (val[1], val[0], op1.ulow (), op2.ulow ()); ^ As instructed by Jakub, target builtins should be folded during gimplification. 2014-05-08 Uros Bizjak ubiz...@gmail.com PR target/61092 * config/alpha/alpha.c: Include gimple-iterator.h. (alpha_gimple_fold_builtin): New function. Move ALPHA_BUILTIN_UMULH folding from ... (alpha_fold_builtin): ... here. (TARGET_GIMPLE_FOLD_BUILTIN): New define. Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu. If there are no objections, I will commit the patch to mainline and 4.9. Uros. Index: config/alpha/alpha.c === --- config/alpha/alpha.c(revision 210120) +++ config/alpha/alpha.c(working copy) @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3. 
If not see #include gimple-expr.h #include is-a.h #include gimple.h +#include gimple-iterator.h #include gimplify.h #include gimple-ssa.h #include stringpool.h @@ -7042,9 +7043,6 @@ alpha_fold_builtin (tree fndecl, int n_args, tree case ALPHA_BUILTIN_MSKQH: return alpha_fold_builtin_mskxx (op, opint, op_const, 0xff, true); -case ALPHA_BUILTIN_UMULH: - return fold_build2 (MULT_HIGHPART_EXPR, alpha_dimode_u, op[0], op[1]); - case ALPHA_BUILTIN_ZAP: opint[1] ^= 0xff; /* FALLTHRU */ @@ -7094,6 +7092,49 @@ alpha_fold_builtin (tree fndecl, int n_args, tree return NULL; } } + +bool +alpha_gimple_fold_builtin (gimple_stmt_iterator *gsi) +{ + bool changed = false; + gimple stmt = gsi_stmt (*gsi); + tree call = gimple_call_fn (stmt); + gimple new_stmt = NULL; + + if (call) +{ + tree fndecl = gimple_call_fndecl (stmt); + + if (fndecl) + { + tree arg0, arg1; + + switch (DECL_FUNCTION_CODE (fndecl)) + { + case ALPHA_BUILTIN_UMULH: + arg0 = gimple_call_arg (stmt, 0); + arg1 = gimple_call_arg (stmt, 1); + + new_stmt + = gimple_build_assign_with_ops (MULT_HIGHPART_EXPR, + gimple_call_lhs (stmt), + arg0, + arg1); + break; + default: + break; + } + } +} + + if (new_stmt) +{ + gsi_replace (gsi, new_stmt, true); + changed = true; +} + + return changed; +} /* This page contains routines that are used to determine what the function prologue and epilogue code will do and write them out. */ @@ -9790,6 +9831,8 @@ alpha_canonicalize_comparison (int *code, rtx *op0 #define TARGET_EXPAND_BUILTIN alpha_expand_builtin #undef TARGET_FOLD_BUILTIN #define TARGET_FOLD_BUILTIN alpha_fold_builtin +#undef TARGET_GIMPLE_FOLD_BUILTIN +#define TARGET_GIMPLE_FOLD_BUILTIN alpha_gimple_fold_builtin #undef TARGET_FUNCTION_OK_FOR_SIBCALL #define TARGET_FUNCTION_OK_FOR_SIBCALL alpha_function_ok_for_sibcall
Re: Fix some tests for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL
On Thu, May 8, 2014 at 3:10 AM, Joseph S. Myers jos...@codesourcery.com wrote: Having fixed TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL to apply only to 128-bit vectors, some --with-arch=bdver3 --with-cpu=bdver3 scan-assembler failures relating to that tuning remain, because of different choices of instructions for 128-bit vectors from the choices expected by the tests. This patch fixes affected tests to allow the different instruction choices seen in this case. Tested for x86_64-linux-gnu (--with-arch=bdver3 --with-cpu=bdver3). OK to commit? OK. Thanks, Uros.
Re: [Patch, PR 60158] Generate .fixup sections for .data.rel.ro.local entries.
Rohit,

The subject line and thread may confuse people that this is a PowerPC-specific issue. You need approval from a reviewer with authority over varasm.c.

Thanks, David

On Thu, May 8, 2014 at 9:54 AM, rohitarul...@freescale.com rohitarul...@freescale.com wrote:

-----Original Message-----
From: Alan Modra [mailto:amo...@gmail.com]
Sent: Saturday, April 26, 2014 11:52 AM
To: Dharmakan Rohit-B30502
Cc: gcc-patches@gcc.gnu.org; dje@gmail.com; Wienskoski Edmar-RA8797
Subject: Re: [Patch, PR 60158] Generate .fixup sections for .data.rel.ro.local entries.

On Fri, Apr 25, 2014 at 02:57:38PM +0000, rohitarul...@freescale.com wrote:

Source file: gcc-4.8.2/gcc/varasm.c
@@ -7120,7 +7120,7 @@
if (CONSTANT_POOL_ADDRESS_P (symbol))
{
desc = SYMBOL_REF_CONSTANT (symbol);
output_constant_pool_1 (desc, 1); - (A)
offset += GET_MODE_SIZE (desc->mode);

I think the reason 1 is passed here for align is that with -fsection-anchors, in output_object_block we've already laid out everything in the block, assigning offsets from the start of the block. Aligning shouldn't be necessary, because we've already done that. OTOH, it shouldn't hurt to align again.

Thanks. I have tested for both the cases on e500v2, e500mc, e5500, ppc64 (GCC v4.8.2 branch) with no regressions.

Patch1 [gcc.fix_pr60158_fixup_table-fsf]: Pass actual alignment value to output_constant_pool_2.

Patch2 [gcc.fix_pr60158_fixup_table-fsf-2]: Use the alignment data available in the first argument (constant_descriptor_rtx) of output_constant_pool_1. (Note: this generates .align directive twice).

Is it ok to commit? Any comments?

Regards, Rohit
Re: [PATCH, PR58066] preferred_stack_boundary update for tls expanded call
On Thu, May 8, 2014 at 12:59 AM, Wei Mi w...@google.com wrote: The calls added in the templates of tls_local_dynamic_base_32 and tls_global_dynamic_32 in pr58066-3.patch are used to prevent sched2 from moving sp setting across implicit tls calls, but those calls make the combine of UNSPEC_TLS_LD_BASE and UNSPEC_DTPOFF difficult, so that the optimization in tls_local_dynamic_32_once to convert local_dynamic to global_dynamic mode for single tls reference cannot take effect. In the updated patch, I remove those calls from insn templates and add reg:SI SP_REG explicitly in the templates of UNSPEC_TLS_GD and UNSPEC_TLS_LD_BASE. It solves the sched2 and combine problems above, and now the optimization in tls_local_dynamic_32_once works. bootstrapped ok on x86_64-linux-gnu. regression is going on. Is it OK if regression passes? Please update ChangeLog with all changes, see below: ChangeLog: gcc/ 2014-05-07 Wei Mi w...@google.com * config/i386/i386.c (ix86_compute_frame_layout): preferred_stack_boundary updated for tls expanded call. (...): Update preferred_stack_boundary for call, expanded from tls descriptor. * config/i386/i386.md: Set ix86_tls_descriptor_calls_expanded_in_cfun. * config/i386/i386.md (*tls_global_dynamic_32_gnu): Depend on SP register. (*tls_local_dynamic_base_32_gnu): Ditto. ... (tls_global_dynamic_32): Set ix86_tls_descriptor_calls_expanded_in_cfun. Update RTX to depend on SP register. (tls_local_dynamic_base_32): Ditto. ... The patch is OK for mainline with updated and complete ChangeLog entry. Thanks, Uros.
Re: [C PATCH] Warn for _Alignas + main (PR c/61077)
Sorry, the subject should say _Atomic instead of _Alignas. Marek
Re: RFA: Fix calculation of size of builtin setjmp buffer
On May 8, 2014, at 7:24 AM, Nicholas Clifton ni...@redhat.com wrote: What do you think of this version ? Now we just need a __builtin_setjmp style of maintainer to review…
[PING] [PATCH, wwwdocs, AArch64] Document issues with singleton vector types
Ping~ Originally posted here: http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00019.html Thanks, Yufeng On 05/01/14 17:57, Yufeng Zhang wrote: Hi, This patch documents issues with singleton vector types in the 4.9 AArch64 backend. On AArch64, the singleton vector types int64x1_t, uint64x1_t and float64x1_t exported by arm_neon.h are defined to be the same as their base types. This results in incorrect application of parameter passing rules to arguments of types int64x1_t and uint64x1_t, with respect to the AAPCS64 ABI specification. In addition, names of C++ functions with parameters of these types (including float64x1_t) are not mangled correctly. The current typedef declarations also unintentionally allow implicit casting between singleton vector types and their base types. These issues will be resolved in a near future release. See PR60825 for more information. OK for the wwwdocs repos? Thanks, Yufeng
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Thu, 8 May 2014, Ramana Radhakrishnan wrote: DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com * wide-int.cc (UTItype): Define. (UDWtype): Define for appropriate W_TYPE_SIZE. This breaks builds for 32-bit hosts, where TImode isn't supported. You can only use TImode on the host if it's 64-bit. wide-int.cc:37:56: error: unable to emulate 'TI' -- Joseph S. Myers jos...@codesourcery.com
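The breakage reported here comes from using a 128-bit type unconditionally. A minimal sketch of a guarded widening multiply is shown below. It is not the wide-int.cc code: the function name is made up, and it keys off the `__SIZEOF_INT128__` predefined macro rather than W_TYPE_SIZE or a TImode typedef, which is an assumption chosen to keep the example self-contained. Hosts without a 128-bit integer fall back to four 32x32 partial products.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the full 128-bit product of two 64-bit values and return
   it as a (high, low) pair of 64-bit words.  */
static void umul_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
#ifdef __SIZEOF_INT128__
    /* Fast path: the compiler provides a 128-bit integer type.  */
    unsigned __int128 p = (unsigned __int128) a * b;
    *hi = (uint64_t) (p >> 64);
    *lo = (uint64_t) p;
#else
    /* Portable path: schoolbook multiplication on 32-bit halves.  */
    uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;

    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;

    /* Sum of the three contributions to bits 32..63; cannot overflow
       64 bits because each term is below 2^32.  */
    uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);

    *lo = (mid << 32) | (ll & 0xffffffffu);
    *hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
#endif
}

/* Hypothetical self-check with values whose products are easy to
   work out by hand.  */
static void check_umul_wide(void)
{
    uint64_t hi, lo;

    umul_wide(3, 5, &hi, &lo);
    assert(hi == 0 && lo == 15);

    /* 2^32 * 2^32 = 2^64 */
    umul_wide((uint64_t) 1 << 32, (uint64_t) 1 << 32, &hi, &lo);
    assert(hi == 1 && lo == 0);

    /* (2^64 - 1)^2 = 2^128 - 2^65 + 1 */
    umul_wide(UINT64_MAX, UINT64_MAX, &hi, &lo);
    assert(hi == UINT64_MAX - 1 && lo == 1);
}
```

Both paths produce the same result, so the guard only decides which one the host compiler can build.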
Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
On 25 April 2014 16:31, Ramana Radhakrishnan ramana@googlemail.com wrote: On Fri, Apr 25, 2014 at 4:29 PM, Charles Baylis charles.bay...@linaro.org wrote: OK to backport to 4.8 and 4.7? Ok by me but give 24 working hours for an RM to object. Committed to 4.7 as r210227. Committed to 4.8 as r210226. 2014-05-08 Charles Baylis charles.bay...@linaro.org Backport from mainline 2014-04-07 Charles Baylis charles.bay...@linaro.org PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END): Remove. (LABEL_ALIGN_AFTER_BARRIER): Align barriers which occur after ADDR_DIFF_VEC.
[patch] libstdc++/57394 - copy operations for basic_streambuf
This patch isn't very useful on its own, but fixes the PR and should be useful when adding move operations to the derived streambufs.

The mem-initializer for basic_streambuf::_M_out_end was initialized from __sb._M_out_cur, which I assume was not intentional. Those initializers were only added to keep -Weffc++ happy, see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12854#c7

It might make sense just to value-initialize all the members, since we're not expecting that copy constructor to ever get called in C++03 mode, but I don't plan to change that.

The real commit also removes some trailing whitespace not shown in this patch.

Tested x86_64-linux, committed to trunk.

commit 8550d85bea18d33f97b546886a0d61a1dc87aea9
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu May 8 14:42:49 2014 +0100

    PR libstdc++/57394
    * include/bits/ios_base.h (ios_base(const ios_base&)): Define as
    deleted for C++11.
    (operator=(const ios_base&)): Likewise.
    * include/std/streambuf: Remove trailing whitespace.
    (basic_streambuf(const basic_streambuf&)): Fix initializer for
    _M_out_end. Define as defaulted for C++11.
    (operator=(const basic_streambuf&)): Define as defaulted for C++11.
    (swap(basic_streambuf&)): Define for C++11.
    * testsuite/27_io/basic_streambuf/cons/57394.cc: New.

diff --git a/libstdc++-v3/include/bits/ios_base.h b/libstdc++-v3/include/bits/ios_base.h
index ae856de..59c5066 100644
--- a/libstdc++-v3/include/bits/ios_base.h
+++ b/libstdc++-v3/include/bits/ios_base.h
@@ -780,6 +780,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   protected:
     ios_base() throw ();
 
+#if __cplusplus < 201103L
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 50. Copy constructor and assignment operator of ios_base
   private:
@@ -787,6 +788,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
     ios_base&
     operator=(const ios_base&);
+#else
+  public:
+    ios_base(const ios_base&) = delete;
+
+    ios_base&
+    operator=(const ios_base&) = delete;
+#endif
   };
 
   // [27.4.5.1] fmtflags manipulators
diff --git a/libstdc++-v3/include/std/streambuf b/libstdc++-v3/include/std/streambuf
index 865f26b..0cb609d 100644
--- a/libstdc++-v3/include/std/streambuf
+++ b/libstdc++-v3/include/std/streambuf
@@ -796,18 +796,38 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       void
       __safe_pbump(streamsize __n) { _M_out_cur += __n; }
 
+#if __cplusplus < 201103L
     private:
       // _GLIBCXX_RESOLVE_LIB_DEFECTS
       // Side effect of DR 50.
       basic_streambuf(const basic_streambuf& __sb)
       : _M_in_beg(__sb._M_in_beg), _M_in_cur(__sb._M_in_cur),
       _M_in_end(__sb._M_in_end), _M_out_beg(__sb._M_out_beg),
-      _M_out_cur(__sb._M_out_cur), _M_out_end(__sb._M_out_cur),
+      _M_out_cur(__sb._M_out_cur), _M_out_end(__sb._M_out_end),
       _M_buf_locale(__sb._M_buf_locale)
       { }
 
       basic_streambuf&
-      operator=(const basic_streambuf&) { return *this; };
+      operator=(const basic_streambuf&) { return *this; }
+#else
+    protected:
+      basic_streambuf(const basic_streambuf&) = default;
+
+      basic_streambuf&
+      operator=(const basic_streambuf&) = default;
+
+      void
+      swap(basic_streambuf& __sb)
+      {
+        std::swap(_M_in_beg, __sb._M_in_beg);
+        std::swap(_M_in_cur, __sb._M_in_cur);
+        std::swap(_M_in_end, __sb._M_in_end);
+        std::swap(_M_out_beg, __sb._M_out_beg);
+        std::swap(_M_out_cur, __sb._M_out_cur);
+        std::swap(_M_out_end, __sb._M_out_end);
+        std::swap(_M_buf_locale, __sb._M_buf_locale);
+      }
+#endif
     };
 
   // Explicit specialization declarations, defined in src/streambuf.cc.
diff --git a/libstdc++-v3/testsuite/27_io/basic_streambuf/cons/57394.cc b/libstdc++-v3/testsuite/27_io/basic_streambuf/cons/57394.cc
new file mode 100644
index 0000000..f58c545
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_streambuf/cons/57394.cc
@@ -0,0 +1,113 @@
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+// { dg-require-namedlocale "de_DE" }
+
+// 27.6.3 template class basic_streambuf
+
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+struct streambuf : std::streambuf
+{
+  streambuf()
+  {
+    setp(pbuf, std::end(pbuf));
+    setg(gbuf, gbuf, gbuf);
+  }
+
+  streambuf(const std::locale& loc) : streambuf()
+  {
+    imbue(loc);
+  }
+
+  //
Re: [PATCH] Add support for MIPS r3 and r5
On Thu, 8 May 2014, Andrew Bennett wrote: Hi, This patch adds support for MIPS r3 and r5 to GCC. I have updated the msgid strings in the .po files for the error message I changed. Can I assume the actual msgstr entries will be updated later on? Never modify .po files in GCC; they should only ever be imported verbatim from the Translation Project. -- Joseph S. Myers jos...@codesourcery.com
RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option
On Thu, 8 May 2014, Herman, Andrei wrote: The changes in gcc/c/c-decl.c are meant to deal with this problem. Declarations that would fall into the scope of a newly created label scope are moved into the enclosing normal (non label) scope, where they actually belong. Shouldn't you be able to do something like that for the other cases as well, to avoid forcing C99 scoping rules? In any case, I think you need to run the complete gcc testsuite with this option enabled and compare with the results for a default testsuite run. -- Joseph S. Myers jos...@codesourcery.com
Re: [C PATCH] Warn for _Alignas + main (PR c/61077)
On Thu, 8 May 2014, Marek Polacek wrote: Joseph pointed out that we don't warn when the return type of main or the parameter type of main have the _Atomic qualifier. This patch adds such warning. Regtested/bootstrapped on x86_64-linux, ok for trunk? OK. -- Joseph S. Myers jos...@codesourcery.com
[PATCH, AArch64] Use MOVN to generate 64-bit negative immediates where sensible
Hi,

It currently takes 4 instructions to generate certain immediates on AArch64 (unless we put them in the constant pool). For example ...

  long long
  beefcafebabe ()
  {
    return 0xFFFFBEEFCAFEBABEll;
  }

leads to ...

  mov x0, 0xbabe
  movk x0, 0xcafe, lsl 16
  movk x0, 0xbeef, lsl 32
  orr x0, x0, -281474976710656

The above case is tackled in this patch by employing MOVN to generate the top 32 bits in a single instruction ...

  mov x0, -71536975282177
  movk x0, 0xcafe, lsl 16
  movk x0, 0xbabe, lsl 0

(Note that where at least two half-words are 0xffff, existing code that does the immediate in two instructions is still used.)

Tested on standard gcc regressions and the attached test case.

OK for commit?

Cheers,
Ian

2014-05-08 Ian Bolton ian.bol...@arm.com

gcc/
    * config/aarch64/aarch64.c (aarch64_expand_mov_immediate):
    Use MOVN when top-most half-word (and only that half-word)
    is 0xffff.

gcc/testsuite/
    * gcc.target/aarch64/movn_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 43a83566..a8e504e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1177,6 +1177,18 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
         }
     }
 
+  /* Look for case where upper 16 bits are set, so we can use MOVN.  */
+  if ((val & 0xffff000000000000ll) == 0xffff000000000000ll)
+    {
+      emit_insn (gen_rtx_SET (VOIDmode, dest,
+                              GEN_INT (~ (~val & (0xffffll << 32)))));
+      emit_insn (gen_insv_immdi (dest, GEN_INT (16),
+                                 GEN_INT ((val >> 16) & 0xffff)));
+      emit_insn (gen_insv_immdi (dest, GEN_INT (0),
+                                 GEN_INT (val & 0xffff)));
+      return;
+    }
+
  simple_sequence:
   first = true;
   mask = 0xffff;
diff --git a/gcc/testsuite/gcc.target/aarch64/movn_1.c b/gcc/testsuite/gcc.target/aarch64/movn_1.c
new file mode 100644
index 0000000..cc11ade
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/movn_1.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline --save-temps" } */
+
+extern void abort (void);
+
+long long
+foo ()
+{
+  /* { dg-final { scan-assembler "mov\tx\[0-9\]+, -71536975282177" } } */
+  return 0xffffbeefcafebabell;
+}
+
+long long
+merge4 (int a, int b, int c, int d)
+{
+  return ((long long) a << 48 | (long long) b << 32
+          | (long long) c << 16 | (long long) d);
+}
+
+int main ()
+{
+  if (foo () != merge4 (0xffff, 0xbeef, 0xcafe, 0xbabe))
+    abort ();
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
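The arithmetic behind the MOVN sequence can be checked on any host. The following is only a sketch of the bit manipulation, not GCC's aarch64_expand_mov_immediate: the struct and function names are hypothetical, and it assumes a two's-complement host so that the unsigned-to-signed conversion yields the value the assembler would print.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; none of these exist in GCC.
struct MovnSequence {
  int64_t movn_value;  // register contents after the first (MOVN) instruction
  uint16_t movk_hw1;   // half-word inserted at bit 16 by the first MOVK
  uint16_t movk_hw0;   // half-word inserted at bit 0 by the second MOVK
};

// True when the top-most half-word of val is 0xffff.
inline bool top_halfword_is_ones(uint64_t val) {
  return (val & 0xffff000000000000ull) == 0xffff000000000000ull;
}

// Split val into the three immediates used by the MOVN + MOVK + MOVK sequence.
inline MovnSequence split_for_movn(uint64_t val) {
  assert(top_halfword_is_ones(val));
  // Keep bits 32..47 of val and force every other bit to one.
  uint64_t first = ~(~val & (0xffffull << 32));
  return { static_cast<int64_t>(first),
           static_cast<uint16_t>((val >> 16) & 0xffff),
           static_cast<uint16_t>(val & 0xffff) };
}

// Replay the sequence on a simulated 64-bit register.
inline uint64_t apply_sequence(const MovnSequence &s) {
  uint64_t reg = static_cast<uint64_t>(s.movn_value);
  reg = (reg & ~(0xffffull << 16)) | (static_cast<uint64_t>(s.movk_hw1) << 16);
  reg = (reg & ~0xffffull) | s.movk_hw0;
  return reg;
}
```

For 0xffffbeefcafebabe the first immediate works out to 0xffffbeefffffffff, which is the -71536975282177 that the test case scans for, and the two MOVK half-words are 0xcafe and 0xbabe.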
[PATCH, AArch64] Fix macro in vdup_lane_2 test case
This patch fixes a defective macro definition, based on the correct definition in similar testcases. The test currently passes through luck rather than correctness.

OK for commit?

Cheers,
Ian

2014-05-08 Ian Bolton ian.bol...@arm.com

gcc/testsuite
    * gcc.target/aarch64/vdup_lane_2.c (force_simd): Emit an actual
    instruction to move into the allocated register.

diff --git a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c b/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c
index 7c04e75..2072c79 100644
--- a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c
@@ -4,10 +4,11 @@
 
 #include <arm_neon.h>
 
-#define force_simd(V1)   asm volatile ("" \
-           : "=w"(V1) \
-           : "w"(V1) \
-           : /* No clobbers */)
+/* Used to force a variable to a SIMD register.  */
+#define force_simd(V1)   asm volatile ("orr %0.16b, %1.16b, %1.16b" \
+           : "=w"(V1) \
+           : "w"(V1) \
+           : /* No clobbers */);
 
 extern void abort (void);
RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option
-Original Message- From: Joseph Myers [mailto:jos...@codesourcery.com] Sent: Thursday, May 08, 2014 8:27 PM To: Herman, Andrei Cc: gcc-patches@gcc.gnu.org Subject: RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option On Thu, 8 May 2014, Herman, Andrei wrote: The changes in gcc/c/c-decl.c are meant to deal with this problem. Declarations that would fall into the scope of a newly created label scope are moved into the enclosing normal (non label) scope, where they actually belong. Shouldn't you be able to do something like that for the other cases as well, to avoid forcing C99 scoping rules? I will think about it if you think it's critical. In any case, I think you need to run the complete gcc testsuite with this option enabled and compare with the results for a default testsuite run. I will definitely run the complete testsuites without and with the option enabled. -- Joseph S. Myers jos...@codesourcery.com Thanks and regards, Andrei Herman Mentor Graphics Corporation Israel branch
Re: [C PATCH] Don't reject valid code with _Alignas (PR c/61053)
On Wed, May 07, 2014 at 11:31:38AM -0700, H.J. Lu wrote: OK, though I'm not sure if the lp64 conditions are right in the testcase It should be !ia32 instead of lp64. Ok, I changed lp64 to ! { ia32 } and committed the patch now. Marek
[patch] libstdc++/13860 enforce requirements on traits
Add static assertions to enforce requirements on the traits used with basic_filebuf, improving the diagnostics for invalid traits.

Tested x86_64-linux, committed to trunk.

commit 78df7e78e805c873883e63d9df4e6befa32ddac5
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu May 8 18:20:37 2014 +0100

    PR libstdc++/13860
    * include/std/fstream (basic_filebuf): Enforce requirements on traits.

diff --git a/libstdc++-v3/include/std/fstream b/libstdc++-v3/include/std/fstream
index 17ccac6..51db21b 100644
--- a/libstdc++-v3/include/std/fstream
+++ b/libstdc++-v3/include/std/fstream
@@ -71,6 +71,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _CharT, typename _Traits>
     class basic_filebuf : public basic_streambuf<_CharT, _Traits>
     {
+#if __cplusplus >= 201103L
+      template<typename _Tp>
+        using __chk_state = __and_<is_copy_assignable<_Tp>,
+                                   is_copy_constructible<_Tp>,
+                                   is_default_constructible<_Tp>>;
+
+      static_assert(__chk_state<typename _Traits::state_type>::value,
+                    "state_type must be CopyAssignable, CopyConstructible"
+                    " and DefaultConstructible");
+
+      static_assert(is_same<typename _Traits::pos_type,
+                            fpos<typename _Traits::state_type>>::value,
+                    "pos_type must be fpos<state_type>");
+#endif
     public:
       // Types:
       typedef _CharT char_type;
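The same two requirements can be expressed with only the public C++11 type traits, which makes it easy to see which traits classes would trip the new assertions. This is a sketch under stated assumptions: traits_requirements, no_copy_state and bad_traits are hypothetical names, and the checker stands in for libstdc++'s internal __chk_state rather than reproducing it.

```cpp
#include <cassert>
#include <ios>          // std::fpos
#include <string>       // std::char_traits
#include <type_traits>

// Hypothetical checker mirroring the two static_asserts in the patch.
template<typename Traits>
struct traits_requirements {
  using state_type = typename Traits::state_type;

  // state_type must be CopyAssignable, CopyConstructible
  // and DefaultConstructible.
  static constexpr bool state_ok() {
    return std::is_copy_assignable<state_type>::value
        && std::is_copy_constructible<state_type>::value
        && std::is_default_constructible<state_type>::value;
  }

  // pos_type must be fpos<state_type>.
  static constexpr bool pos_ok() {
    return std::is_same<typename Traits::pos_type,
                        std::fpos<state_type>>::value;
  }
};

// A deliberately invalid state type: default constructible but not copyable.
struct no_copy_state {
  no_copy_state() = default;
  no_copy_state(const no_copy_state&) = delete;
  no_copy_state& operator=(const no_copy_state&) = delete;
};

// Invalid traits: state_type is replaced, pos_type is still the inherited
// std::streampos, so both requirements are violated.
struct bad_traits : std::char_traits<char> {
  using state_type = no_copy_state;
};

static_assert(traits_requirements<std::char_traits<char>>::state_ok(),
              "std::char_traits<char> should satisfy the state_type rules");
```

With checks like these placed inside the class, a mistake such as bad_traits is reported as a single readable message at the point of instantiation.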
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
Joseph S. Myers jos...@codesourcery.com writes:
On Thu, 8 May 2014, Ramana Radhakrishnan wrote:

DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com

    * wide-int.cc (UTItype): Define.
    (UDWtype): Define for appropriate W_TYPE_SIZE.

This breaks builds for 32-bit hosts, where TImode isn't supported. You can only use TImode on the host if it's 64-bit.

wide-int.cc:37:56: error: unable to emulate 'TI'

The longlong.h interface seems to be designed to be as difficult to use as possible :-( So maybe we really do need to limit it to hosts that are known to work and benefit from it.

How about the following? I tested that it produces identical wide-int.o .text for x86_64.

I think additions to or removals from the list should be treated as pre-approved.

Thanks,
Richard

gcc/
    * wide-int.cc: Only include longlong.h for certain targets.

Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc 2014-05-08 19:13:15.782158808 +0100
+++ gcc/wide-int.cc 2014-05-08 19:28:52.880742385 +0100
@@ -27,19 +27,20 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "dumpfile.h"
 
-#if GCC_VERSION >= 3000
+#if (GCC_VERSION >= 3000 \
+     && (defined __aarch64 \
+         || defined __alpha \
+         || defined __ia64 \
+         || defined __powerpc64__ \
+         || defined __sparcv9 \
+         || defined __x86_64__))
 #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
-typedef unsigned HOST_HALF_WIDE_INT UHWtype;
-typedef unsigned HOST_WIDE_INT UWtype;
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-#if W_TYPE_SIZE == 32
-# define UDWtype UDItype
-#elif W_TYPE_SIZE == 64
-# define UDWtype UTItype
-#endif
+typedef unsigned HOST_HALF_WIDE_INT UHWtype;
+typedef unsigned HOST_WIDE_INT UWtype;
+typedef unsigned int UDWtype __attribute__ ((mode (TI)));
 #include "longlong.h"
 #endif
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
Ping.

On Fri, May 02, 2014 at 11:28:54AM +0200, Marek Polacek wrote:
On Thu, May 01, 2014 at 11:20:25PM +0000, Joseph S. Myers wrote:
On Wed, 23 Apr 2014, Marek Polacek wrote:

diff --git gcc/testsuite/c-c++-common/attributes-1.c gcc/testsuite/c-c++-common/attributes-1.c
index af4dd12..8458e47 100644
--- gcc/testsuite/c-c++-common/attributes-1.c
+++ gcc/testsuite/c-c++-common/attributes-1.c
@@ -9,7 +9,7 @@ typedef char vec __attribute__((vector_size(bar))); /* { dg-warning "ignored" }
 void f1(char*) __attribute__((nonnull(bar))); /* { dg-error "invalid operand" } */
 void f2(char*) __attribute__((nonnull(1,bar))); /* { dg-error "invalid operand" } */
 
-void g() __attribute__((aligned(bar))); /* { dg-error "invalid value|not an integer" } */
+void g() __attribute__((aligned(bar)));

I don't think it's appropriate to remove any test assertion that this invalid code gets diagnosed. If the only diagnostic is now one swallowed by the dg-prune-output in this test, either that dg-prune-output needs to be removed (and corresponding more detailed error expectations added), or a separate test needs adding for this erroneous use of this attribute (that separate test not using dg-prune-output).

Yeah, that was a weird thing to do. I yanked that particular test to a new testcase. Otherwise no changes.

Tested again x86_64-linux, ok now?

2014-05-02 Marek Polacek pola...@redhat.com

    PR c/50459
c-family/
    * c-common.c (check_user_alignment): Return -1 if alignment is error
    node.
    (handle_aligned_attribute): Don't call default_conversion on
    FUNCTION_DECLs.
    (handle_vector_size_attribute): Likewise.
    (handle_tm_wrap_attribute): Handle case when wrap_decl is error node.
    (handle_sentinel_attribute): Call default_conversion and allow even
    integral types as an argument.
c/
    * c-parser.c (c_parser_attributes): Parse the arguments as an
    expression-list if the attribute takes identifier.
testsuite/
    * c-c++-common/attributes-1.c: Move test line to a new test.
    * c-c++-common/attributes-2.c: New test.
    * c-c++-common/pr50459.c: New test.
    * c-c++-common/pr59280.c: Add "undeclared" to dg-error.
    * gcc.dg/nonnull-2.c: Likewise.
    * gcc.dg/pr55570.c: Modify dg-error.
    * gcc.dg/tm/wrap-2.c: Likewise.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 0ad955d..3ebd960 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -7438,6 +7438,8 @@ check_user_alignment (const_tree align, bool allow_zero)
 {
   int i;
 
+  if (error_operand_p (align))
+    return -1;
   if (TREE_CODE (align) != INTEGER_CST
       || !INTEGRAL_TYPE_P (TREE_TYPE (align)))
     {
@@ -7559,7 +7561,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   if (args)
     {
       align_expr = TREE_VALUE (args);
-      if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE)
+      if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE
+          && TREE_CODE (align_expr) != FUNCTION_DECL)
         align_expr = default_conversion (align_expr);
     }
   else
@@ -8424,9 +8427,11 @@ handle_tm_wrap_attribute (tree *node, tree name, tree args,
   else
     {
       tree wrap_decl = TREE_VALUE (args);
-      if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE
-          && TREE_CODE (wrap_decl) != VAR_DECL
-          && TREE_CODE (wrap_decl) != FUNCTION_DECL)
+      if (error_operand_p (wrap_decl))
+        ;
+      else if (TREE_CODE (wrap_decl) != IDENTIFIER_NODE
+               && TREE_CODE (wrap_decl) != VAR_DECL
+               && TREE_CODE (wrap_decl) != FUNCTION_DECL)
         error ("%qE argument not an identifier", name);
       else
         {
@@ -8553,7 +8558,8 @@ handle_vector_size_attribute (tree *node, tree name, tree args,
   *no_add_attrs = true;
 
   size = TREE_VALUE (args);
-  if (size && TREE_CODE (size) != IDENTIFIER_NODE)
+  if (size && TREE_CODE (size) != IDENTIFIER_NODE
+      && TREE_CODE (size) != FUNCTION_DECL)
     size = default_conversion (size);
 
   if (!tree_fits_uhwi_p (size))
@@ -8964,8 +8970,12 @@ handle_sentinel_attribute (tree *node, tree name, tree args,
   if (args)
     {
       tree position = TREE_VALUE (args);
+      if (position && TREE_CODE (position) != IDENTIFIER_NODE
+          && TREE_CODE (position) != FUNCTION_DECL)
+        position = default_conversion (position);
 
-      if (TREE_CODE (position) != INTEGER_CST)
+      if (TREE_CODE (position) != INTEGER_CST
+          || !INTEGRAL_TYPE_P (TREE_TYPE (position)))
         {
           warning (OPT_Wattributes,
                    "requested position is not an integer constant");
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 7947355..48f8d2f 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -3955,11 +3955,16 @@ c_parser_attributes (c_parser *parser)
              In objective-c the identifier may be a
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
everyone who has a private port will hate you forever. note that i have 2 of them.

On 05/08/2014 02:31 PM, Richard Sandiford wrote:
Joseph S. Myers jos...@codesourcery.com writes:
On Thu, 8 May 2014, Ramana Radhakrishnan wrote:

DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com

    * wide-int.cc (UTItype): Define.
    (UDWtype): Define for appropriate W_TYPE_SIZE.

This breaks builds for 32-bit hosts, where TImode isn't supported. You can only use TImode on the host if it's 64-bit.

wide-int.cc:37:56: error: unable to emulate 'TI'

The longlong.h interface seems to be designed to be as difficult to use as possible :-( So maybe we really do need to limit it to hosts that are known to work and benefit from it.

How about the following? I tested that it produces identical wide-int.o .text for x86_64.

I think additions to or removals from the list should be treated as pre-approved.

Thanks,
Richard

gcc/
    * wide-int.cc: Only include longlong.h for certain targets.

Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc 2014-05-08 19:13:15.782158808 +0100
+++ gcc/wide-int.cc 2014-05-08 19:28:52.880742385 +0100
@@ -27,19 +27,20 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "dumpfile.h"
 
-#if GCC_VERSION >= 3000
+#if (GCC_VERSION >= 3000 \
+     && (defined __aarch64 \
+         || defined __alpha \
+         || defined __ia64 \
+         || defined __powerpc64__ \
+         || defined __sparcv9 \
+         || defined __x86_64__))
 #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
-typedef unsigned HOST_HALF_WIDE_INT UHWtype;
-typedef unsigned HOST_WIDE_INT UWtype;
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-#if W_TYPE_SIZE == 32
-# define UDWtype UDItype
-#elif W_TYPE_SIZE == 64
-# define UDWtype UTItype
-#endif
+typedef unsigned HOST_HALF_WIDE_INT UHWtype;
+typedef unsigned HOST_WIDE_INT UWtype;
+typedef unsigned int UDWtype __attribute__ ((mode (TI)));
 #include "longlong.h"
 #endif
[jit] Introduce params_c_finalize
Attempting to repeatedly compile in-process led to assertion failures in a release build on the 2nd compile within a process:

156 void
157 set_default_param_value (compiler_param num, int value)
158 {
159   gcc_assert (!params_finished);
160
161   compiler_params[(int) num].default_value = value;
162 }

#0 fancy_abort (file=file@entry=0x7fffecf394a0 "../../gcc/params.c", line=line@entry=159, function=function@entry=0x7fffecf3d550 <set_default_param_value(compiler_param, int)::__FUNCTION__> "set_default_param_value") at ../../gcc/diagnostic.c:1182
#1 0x7fffecc9cc07 in set_default_param_value (num=num@entry=GGC_MIN_EXPAND, value=<optimized out>) at ../../gcc/params.c:159
#2 0x7fffec654685 in init_ggc_heuristics () at ../../gcc/ggc-common.c:899
#3 0x7fffec52b265 in general_init (argv0=<optimized out>) at ../../gcc/toplev.c:1179
#4 toplev::main (this=this@entry=0x7fff9cdf, argc=argc@entry=5, argv=argv@entry=0x7fff9ce0) at ../../gcc/toplev.c:1944

due to these guards:

895 void
896 init_ggc_heuristics (void)
897 {
898 #if !defined ENABLE_GC_CHECKING && !defined ENABLE_GC_ALWAYS_COLLECT
899   set_default_param_value (GGC_MIN_EXPAND, ggc_min_expand_heuristic ());
900   set_default_param_value (GGC_MIN_HEAPSIZE, ggc_min_heapsize_heuristic ());
901 #endif
902 }

which prevent it being seen in a devel build.

The issue here is that init_ggc_heuristics is reinitializing the default param and thus assuming that param initialization is still in progress. My previous attempt at handling multiple initialization of params, which simply made it idempotent, leaves params_finished as true.

Instead, introduce params_c_finalize to purge state within params.c so that we fully reinitialize things each time.

Committed to branch dmalcolm/jit

gcc/
    * params.c (global_init_params): Require that params_finished be
    false, rather than being idempotent, in favor of purging all state
    between toplev invocations, since in a release build
    init_ggc_heuristics calls set_param_value_internal, and the
    latter assumes that params_finished is true.
    (params_c_finalize): New.
    * params.h (params_c_finalize): New.
    * toplev.c (toplev::finalize): Call params_c_finalize.
---
 gcc/ChangeLog.jit | 11 +++
 gcc/params.c      | 16 +---
 gcc/params.h      |  4
 gcc/toplev.c      |  1 +
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 5145cf9..f8235f1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,14 @@
+2014-05-08 David Malcolm dmalc...@redhat.com
+
+ * params.c (global_init_params): Require that params_finished be
+ false, rather than being idempotent, in favor of purging all state
+ between toplev invocations, since in a release build
+ init_ggc_heuristics calls set_param_value_internal, and the
+ latter assumes that params_finished is true.
+ (params_c_finalize): New.
+ * params.h (params_c_finalize): New.
+ * toplev.c (toplev::finalize): Call params_c_finalize.
+
 2014-03-24 Tom Tromey tro...@redhat.com
 
  * toplev.c (general_init): Initialize input_location.
diff --git a/gcc/params.c b/gcc/params.c
index 22c7a27..3d523ce 100644
--- a/gcc/params.c
+++ b/gcc/params.c
@@ -69,9 +69,8 @@ add_params (const param_info params[], size_t n)
 void
 global_init_params (void)
 {
-  /* Make param initialization be idempotent.  */
-  if (params_finished)
-    return;
+  gcc_assert (!params_finished);
+
   add_params (lang_independent_params, LAST_PARAM);
   targetm_common.option_default_params ();
 }
@@ -85,6 +84,17 @@ finish_params (void)
   params_finished = true;
 }
 
+/* Reset all state in params.c.  */
+
+void
+params_c_finalize (void)
+{
+  XDELETEVEC (compiler_params);
+  compiler_params = NULL;
+  num_compiler_params = 0;
+  params_finished = false;
+}
+
 /* Set the value of the parameter given by NUM to VALUE in PARAMS and
    PARAMS_SET. If EXPLICIT_P, this is being set by the user;
    otherwise it is being set implicitly by the compiler.  */
diff --git a/gcc/params.h b/gcc/params.h
index 6580224..52420b4 100644
--- a/gcc/params.h
+++ b/gcc/params.h
@@ -113,6 +113,10 @@ extern void global_init_params (void);
    set.  */
 extern void finish_params (void);
 
+/* Reset all state in params.c  */
+
+extern void params_c_finalize (void);
+
 /* Return the default value of parameter NUM.  */
 extern int default_param_value (compiler_param num);
 
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 54a884e..5123dbe 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -2015,6 +2015,7 @@ toplev::finalize (void)
   gcse_c_finalize ();
   ipa_c_finalize ();
   ipa_reference_c_finalize ();
+  params_c_finalize ();
   predict_c_finalize ();
   symtab_c_finalize ();
   varpool_c_finalize ();
--
1.8.5.3
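The difference between "make init idempotent" and "purge state on finalize" can be seen in a toy model. The sketch below is not GCC code: the toy_params namespace and everything in it are hypothetical stand-ins for the file-level state in params.c, with a std::vector in place of the XNEWVEC-managed array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the globals in params.c.
namespace toy_params {

struct param { int default_value; };

static std::vector<param> table;  // plays the role of compiler_params
static bool finished = false;     // plays the role of params_finished

// Like global_init_params after the patch: insist on a clean slate.
void global_init(std::size_t n) {
  assert(!finished);
  assert(table.empty());
  table.assign(n, param{0});
}

// Like set_default_param_value: only legal while initialization is open.
void set_default(std::size_t num, int value) {
  assert(!finished);
  table[num].default_value = value;
}

// Like finish_params.
void finish() { finished = true; }

// Like params_c_finalize: drop everything so the next run starts fresh.
void finalize() {
  table.clear();
  finished = false;
}

// One simulated in-process compile; returns the default it observed.
int compile_once(int heuristic) {
  global_init(2);
  set_default(0, heuristic);  // the step that asserted on the 2nd compile
  finish();
  int seen = table[0].default_value;
  finalize();
  return seen;
}

}  // namespace toy_params
```

If finalize() left finished set to true, the second call to compile_once would hit the assertion in global_init or set_default, which is the shape of the failure in the backtrace.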
[PATCH, i386]: Fix PR59952, -march=core-avx2 should not enable RTM
Hello!

Apparently, not all Haswell processors have TSX. The attached patch removes PTA_RTM from the default Haswell flags. PTA_HLE still makes sense for Haswell processors, since the prefix is ignored on non-TSX processors.

2014-05-08 Uros Bizjak ubiz...@gmail.com

    PR target/59952
    * config/i386/i386.c (PTA_HASWELL): Remove PTA_RTM.

Bootstrapped and regression tested on x86_64-pc-linux-gnu. The patch is committed on mainline and will be backported to all relevant release branches.

Uros.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 210231)
+++ config/i386/i386.c (working copy)
@@ -3130,7 +3130,7 @@ ix86_option_override_internal (bool main_args_p,
   (PTA_SANDYBRIDGE | PTA_FSGSBASE | PTA_RDRND | PTA_F16C)
 #define PTA_HASWELL \
   (PTA_IVYBRIDGE | PTA_AVX2 | PTA_BMI | PTA_BMI2 | PTA_LZCNT \
-   | PTA_FMA | PTA_MOVBE | PTA_RTM | PTA_HLE)
+   | PTA_FMA | PTA_MOVBE | PTA_HLE)
 #define PTA_BROADWELL \
   (PTA_HASWELL | PTA_ADX | PTA_PRFCHW | PTA_RDSEED)
 #define PTA_BONNELL \
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
Kenneth Zadeck zad...@naturalbridge.com writes:
everyone who has a private port will hate you forever. note that i have 2 of them.

Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t.

Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it.

Thanks,
Richard

On 05/08/2014 02:31 PM, Richard Sandiford wrote:
Joseph S. Myers jos...@codesourcery.com writes:
On Thu, 8 May 2014, Ramana Radhakrishnan wrote:

DATE Ramana Radhakrishnan ramana.radhakrish...@arm.com

    * wide-int.cc (UTItype): Define.
    (UDWtype): Define for appropriate W_TYPE_SIZE.

This breaks builds for 32-bit hosts, where TImode isn't supported. You can only use TImode on the host if it's 64-bit.

wide-int.cc:37:56: error: unable to emulate 'TI'

The longlong.h interface seems to be designed to be as difficult to use as possible :-( So maybe we really do need to limit it to hosts that are known to work and benefit from it.

How about the following? I tested that it produces identical wide-int.o .text for x86_64.

I think additions to or removals from the list should be treated as pre-approved.

Thanks,
Richard

gcc/
    * wide-int.cc: Only include longlong.h for certain targets.

Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc 2014-05-08 19:13:15.782158808 +0100
+++ gcc/wide-int.cc 2014-05-08 19:28:52.880742385 +0100
@@ -27,19 +27,20 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "dumpfile.h"
 
-#if GCC_VERSION >= 3000
+#if (GCC_VERSION >= 3000 \
+     && (defined __aarch64 \
+         || defined __alpha \
+         || defined __ia64 \
+         || defined __powerpc64__ \
+         || defined __sparcv9 \
+         || defined __x86_64__))
 #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
-typedef unsigned HOST_HALF_WIDE_INT UHWtype;
-typedef unsigned HOST_WIDE_INT UWtype;
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
-#if W_TYPE_SIZE == 32
-# define UDWtype UDItype
-#elif W_TYPE_SIZE == 64
-# define UDWtype UTItype
-#endif
+typedef unsigned HOST_HALF_WIDE_INT UHWtype;
+typedef unsigned HOST_WIDE_INT UWtype;
+typedef unsigned int UDWtype __attribute__ ((mode (TI)));
 #include "longlong.h"
 #endif
Re: [PATCH] Add support for MIPS r3 and r5
Andrew Bennett andrew.benn...@imgtec.com writes:

diff --git a/gcc/config/mips/mips-cpus.def b/gcc/config/mips/mips-cpus.def
index 07fbf9c..f2e23c6 100644
--- a/gcc/config/mips/mips-cpus.def
+++ b/gcc/config/mips/mips-cpus.def
@@ -44,9 +44,13 @@ MIPS_CPU ("mips4", PROCESSOR_R8000, 4, 0)
    isn't tuned to a specific processor.  */
 MIPS_CPU ("mips32", PROCESSOR_4KC, 32, PTF_AVOID_BRANCHLIKELY)
 MIPS_CPU ("mips32r2", PROCESSOR_74KF2_1, 33, PTF_AVOID_BRANCHLIKELY)
+MIPS_CPU ("mips32r3", PROCESSOR_M4K, 34, PTF_AVOID_BRANCHLIKELY)
+MIPS_CPU ("mips32r5", PROCESSOR_74KF2_1, 36, PTF_AVOID_BRANCHLIKELY)

Looks odd for mips32r2 and mips32r5 to have the same processor tuning but mips32r3 to be different. I assume 74KF2_1 is just a reasonable default, given the lack of tuning for a real r5 CPU? That's fine if so, but probably deserves a comment.

 MIPS_CPU ("mips64", PROCESSOR_5KC, 64, PTF_AVOID_BRANCHLIKELY)
 /* ??? For now just tune the generic MIPS64r2 for 5KC as well.  */
 MIPS_CPU ("mips64r2", PROCESSOR_5KC, 65, PTF_AVOID_BRANCHLIKELY)
+MIPS_CPU ("mips64r3", PROCESSOR_5KC, 66, PTF_AVOID_BRANCHLIKELY)
+MIPS_CPU ("mips64r5", PROCESSOR_5KC, 68, PTF_AVOID_BRANCHLIKELY)

Now MIPS64r2 and above.

@@ -724,7 +752,7 @@ struct mips_cpu_info {
 /* Infer a -msynci setting from a -mips argument, on the assumption that
    -msynci is desired where possible.  */
 #define MIPS_ISA_SYNCI_SPEC \
-  "%{msynci|mno-synci:;:%{mips32r2|mips64r2:-msynci;:-mno-synci}}"
+  "%{msynci|mno-synci:;:%{mips32r2|mips32r3|mips32r5|mips64r2|mips64r3|mips64r5:-msynci;:-mno-synci}}"

Please split the line to stay within 80 chars.

@@ -141,7 +151,8 @@ along with GCC; see the file COPYING3. If not see
   "%{EL:-m elf32lmip} \
    %{EB:-m elf32bmip} \
    %(endian_spec) \
-   %{G*} %{mips1} %{mips2} %{mips3} %{mips4} %{mips32} %{mips32r2} %{mips64} \
+   %{G*} %{mips1} %{mips2} %{mips3} %{mips4} %{mips32} %{mips32r2} \
+   %{mips32r3} %{mips32r5} %{mips64} \
    %(netbsd_link_spec)"
 
 #define NETBSD_ENTRY_POINT "__start"

Not sure the omission of mips64r2 was deliberate here, or in vxworks.h.

As Joseph said, the .po stuff should be left alone. The .pot file is regenerated near to a release so that the translators can update the .po files.

Looks good otherwise, thanks.

Richard
Re: [PATCH, Pointer Bounds Checker 1/x] Pointer bounds type and mode
On 05/08/14 02:17, Ilya Enkovich wrote:

Right. Richi explicitly wanted the entire set approved before staging in any of the bits.

I thought it would be useful to have approved code in the trunk to reveal some possible problems at earlier stages. It also requires significant effort to keep everything consistent with the trunk (especially when big refactoring happens), and having some parts committed would be helpful. Will keep it in a branch for now but let me know if you change your mind :)

I understand -- my preference would be to go ahead with the stuff that's already been approved, mostly for the reasons noted above. But with Richi wanting to see it go in as a whole after complete review, I think it's best to wait. While we could argue back and forth with Richi, it's not a good use of time.

It's in the queue of things to look at, but it's a deep queue at the moment.

Thanks for keeping an eye on it! Hope this year we can start sooner and have enough time to make it without hurrying.

Agreed. BTW, are you or any colleagues coming to the Cauldron this year in Cambridge? It's often helpful to get together and hash through issues in person. I think most of the core GCC developers will be there.

jeff
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Kenneth Zadeck zad...@naturalbridge.com writes: everyone who has a private port will hate you forever. note that i have 2 of them. Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t. Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it. Can you add a configure time check if typedef unsigned int UTItype __attribute__ ((mode (TI))); is supported? -- H.J.
[patch] fix some comments in libstdc++ files
Update an old URL and an old pathname.

Tested x86_64-linux, committed to 4.8, 4.9 and trunk.

commit 4692e5802722954c4dda17e8b7f4ed4b78bcc272
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu May 8 19:38:42 2014 +0100

    * include/std/iostream: Fix URL in comment.
    * src/c++98/ios_init.cc: Fix path in comment.

diff --git a/libstdc++-v3/include/std/iostream b/libstdc++-v3/include/std/iostream
index 85d2b95..5c10869 100644
--- a/libstdc++-v3/include/std/iostream
+++ b/libstdc++-v3/include/std/iostream
@@ -48,13 +48,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    *
    * The &lt;iostream&gt; header declares the eight <em>standard stream
    * objects</em>. For other declarations, see
-   * http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt11ch24.html
+   * http://gcc.gnu.org/onlinedocs/libstdc++/manual/io.html
    * and the @link iosfwd I/O forward declarations @endlink
    *
    * They are required by default to cooperate with the global C
    * library's @c FILE streams, and to be available during program
-   * startup and termination. For more information, see the HOWTO
-   * linked to above.
+   * startup and termination. For more information, see the section of the
+   * manual linked to above.
    */
   //@{
   extern istream cin; /// Linked to standard input
diff --git a/libstdc++-v3/src/c++98/ios_init.cc b/libstdc++-v3/src/c++98/ios_init.cc
index d8d2a0d..b5c14f2 100644
--- a/libstdc++-v3/src/c++98/ios_init.cc
+++ b/libstdc++-v3/src/c++98/ios_init.cc
@@ -37,7 +37,7 @@ namespace __gnu_internal _GLIBCXX_VISIBILITY(hidden)
 {
   using namespace __gnu_cxx;
 
-  // Extern declarations for global objects in src/globals.cc.
+  // Extern declarations for global objects in src/c++98/globals.cc.
   extern stdio_sync_filebuf<char> buf_cout_sync;
   extern stdio_sync_filebuf<char> buf_cin_sync;
   extern stdio_sync_filebuf<char> buf_cerr_sync;
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Thu, May 08, 2014 at 12:34:28PM -0700, H.J. Lu wrote: On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Kenneth Zadeck zad...@naturalbridge.com writes: everyone who has a private port will hate you forever. note that i have 2 of them. Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t. Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it. Can you add a configure time check if typedef unsigned int UTItype __attribute__ ((mode (TI))); is supported? Why? Isn't that #ifdef __SIZEOF_INT128__ ? Jakub
Re: [PATCH, Pointer Bounds Checker 1/x] Pointer bounds type and mode
On Thu, May 8, 2014 at 12:28 PM, Jeff Law l...@redhat.com wrote:
> On 05/08/14 02:17, Ilya Enkovich wrote:
>>> Right. Richi explicitly wanted the entire set approved before staging in any of the bits.
>>
>> I thought it would be useful to have approved codes in the trunk to reveal some possible problems on earlier stages. It also requires significant effort to keep everything in consistency with the trunk (especially when big refactoring happens) and having some parts committed would be helpful. Will keep it in a branch for now but let me know if you change your mind :)
>
> I understand -- my preference would be to go ahead with the stuff that's already been approved, mostly for the reasons noted above. But with Richi wanting to see it go in as a whole after complete review I think it's best to wait. While we could argue back and forth with Richi, it's not a good use of time.

Shouldn't there be a git or svn branch for MPX, including the run-time library, so that people can take a look at the complete MPX change and try MPX today as NOP? The only extra requirement is MPX-enabled binutils.

-- H.J.
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Thu, May 8, 2014 at 12:42 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Thu, May 08, 2014 at 12:34:28PM -0700, H.J. Lu wrote:
>> On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>>> Kenneth Zadeck zad...@naturalbridge.com writes:
>>>> everyone who has a private port will hate you forever. note that i have 2 of them.
>>>
>>> Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t.
>>>
>>> Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it.
>>
>> Can you add a configure time check if
>>
>> typedef unsigned int UTItype __attribute__ ((mode (TI)));
>>
>> is supported?
>
> Why? Isn't that #ifdef __SIZEOF_INT128__ ?

Yes, we can use that. Will it work?

-- H.J.
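For readers following the __SIZEOF_INT128__ suggestion, here is a minimal sketch of the idea in isolation. It is not the wide-int.cc code and does not use longlong.h; Product128 and widening_mul are hypothetical names chosen for illustration. The assumption is only that the compiler predefines __SIZEOF_INT128__ when a 128-bit integer type is available, as GCC does on hosts that support it.

```cpp
#include <cstdint>

// Hypothetical result type: the full 128-bit product as two 64-bit words.
struct Product128 {
    std::uint64_t high;
    std::uint64_t low;
};

// Hypothetical helper (not GCC's umul_ppmm): 64 x 64 -> 128-bit multiply.
inline Product128 widening_mul(std::uint64_t a, std::uint64_t b) {
#if defined(__SIZEOF_INT128__)
    // Fast path: the compiler provides a 128-bit integer type.
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return {static_cast<std::uint64_t>(p >> 64), static_cast<std::uint64_t>(p)};
#else
    // Portable fallback: schoolbook multiplication on 32-bit halves.
    const std::uint64_t mask = 0xffffffffu;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;
    const std::uint64_t ll = a_lo * b_lo;
    const std::uint64_t lh = a_lo * b_hi;
    const std::uint64_t hl = a_hi * b_lo;
    const std::uint64_t hh = a_hi * b_hi;
    // Middle column, including the carry out of the low column.
    const std::uint64_t mid = (ll >> 32) + (lh & mask) + (hl & mask);
    const std::uint64_t low = (mid << 32) | (ll & mask);
    const std::uint64_t high = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return {high, low};
#endif
}
```

Both branches are meant to return the same result, so the preprocessor test only selects the faster implementation; a host without the 128-bit type still compiles, which is the property the configure-time check was meant to provide.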
[patch] libstdc++/61117 FAQ should use free software
Committed to 4.8, 4.9 and trunk.
Re: [patch] libstdc++/61117 FAQ should use free software
On 08/05/14 20:52 +0100, Jonathan Wakely wrote: Committed to 4.8, 4.9 and trunk. And 4.7 since that branch is still open too.
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
H.J. Lu hjl.to...@gmail.com writes:
> On Thu, May 8, 2014 at 12:42 PM, Jakub Jelinek ja...@redhat.com wrote:
>> On Thu, May 08, 2014 at 12:34:28PM -0700, H.J. Lu wrote:
>>> On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>>>> Kenneth Zadeck zad...@naturalbridge.com writes:
>>>>> everyone who has a private port will hate you forever. note that i have 2 of them.
>>>>
>>>> Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t.
>>>>
>>>> Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it.
>>>
>>> Can you add a configure time check if
>>>
>>> typedef unsigned int UTItype __attribute__ ((mode (TI)));
>>>
>>> is supported?
>>
>> Why? Isn't that #ifdef __SIZEOF_INT128__ ?

Read this just after getting the configure test to work...

> Yes, we can use that. Will it work?

Seems to. How does this look?

Thanks,
Richard

gcc/
    * wide-int.cc: Only include longlong.h if W_TYPE_SIZE==32 or
    __SIZEOF_INT128__ is defined.

Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc 2014-05-08 20:48:25.341583885 +0100
+++ gcc/wide-int.cc 2014-05-08 21:09:29.324386217 +0100
@@ -27,18 +27,17 @@ along with GCC; see the file COPYING3.
 #include "tree.h"
 #include "dumpfile.h"
-#if GCC_VERSION >= 3000
 #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
+#if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))
 typedef unsigned HOST_HALF_WIDE_INT UHWtype;
 typedef unsigned HOST_WIDE_INT UWtype;
 typedef unsigned int UQItype __attribute__ ((mode (QI)));
 typedef unsigned int USItype __attribute__ ((mode (SI)));
 typedef unsigned int UDItype __attribute__ ((mode (DI)));
-typedef unsigned int UTItype __attribute__ ((mode (TI)));
 #if W_TYPE_SIZE == 32
-# define UDWtype UDItype
-#elif W_TYPE_SIZE == 64
-# define UDWtype UTItype
+typedef unsigned int UDWtype __attribute__ ((mode (DI)));
+#else
+typedef unsigned int UDWtype __attribute__ ((mode (TI)));
 #endif
 #include "longlong.h"
 #endif
[google gcc-4_8] fix lipo ICE
Hi,

This patch fixes a lipo ICE triggered by an out-of-bounds access.

This is a Google-specific patch, tested with bootstrap and with the program that exposed the issue.

Thanks,

-Rong

2014-05-08 Rong Xu x...@google.com

    * tree-inline.c (add_local_variables): Check if the debug_expr
    is a decl_node before calling is_global_var.

Index: tree-inline.c
===
--- tree-inline.c (revision 209291)
+++ tree-inline.c (working copy)
@@ -3842,7 +3842,7 @@ add_local_variables (struct function *callee, stru
    of varpool node does not check the reference from debug expressions.
    Set it to 0 for all global vars. */
-if (L_IPO_COMP_MODE && tem && is_global_var (tem))
+if (L_IPO_COMP_MODE && tem && DECL_P (tem) && is_global_var (tem))
   tem = NULL;
 id->remapping_type_depth++;
Re: [google gcc-4_8] fix lipo ICE
ok.

David

On Thu, May 8, 2014 at 1:33 PM, Rong Xu x...@google.com wrote:
> Hi,
>
> This patch fixes a lipo ICE triggered by an out-of-bounds access.
>
> This is a Google-specific patch, tested with bootstrap and with the program that exposed the issue.
>
> Thanks,
>
> -Rong
[Fortran-caf] Merge of the trunk into the branch
Hi all, I have now merged the trunk into the branch. The main change is the new co_min/max/sum feature but also a patch which has reduced the differences between the trunk and the branch. - And of course, all the other trunk changes such as the wide-int patch. Committed as Rev. 210244. Tobias
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
On Thu, May 8, 2014 at 1:12 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
> H.J. Lu hjl.to...@gmail.com writes:
>> On Thu, May 8, 2014 at 12:42 PM, Jakub Jelinek ja...@redhat.com wrote:
>>> On Thu, May 08, 2014 at 12:34:28PM -0700, H.J. Lu wrote:
>>>> On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>>>>> Kenneth Zadeck zad...@naturalbridge.com writes:
>>>>>> everyone who has a private port will hate you forever. note that i have 2 of them.
>>>>>
>>>>> Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t.
>>>>>
>>>>> Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it.
>>>>
>>>> Can you add a configure time check if
>>>>
>>>> typedef unsigned int UTItype __attribute__ ((mode (TI)));
>>>>
>>>> is supported?
>>>
>>> Why? Isn't that #ifdef __SIZEOF_INT128__ ?
>
> Read this just after getting the configure test to work...
>
>> Yes, we can use that. Will it work?
>
> Seems to. How does this look?
>
> Thanks,
> Richard
>
> gcc/
>     * wide-int.cc: Only include longlong.h if W_TYPE_SIZE==32 or
>     __SIZEOF_INT128__ is defined.
>
> Index: gcc/wide-int.cc
> ===
> --- gcc/wide-int.cc 2014-05-08 20:48:25.341583885 +0100
> +++ gcc/wide-int.cc 2014-05-08 21:09:29.324386217 +0100
> @@ -27,18 +27,17 @@ along with GCC; see the file COPYING3.
>  #include "tree.h"
>  #include "dumpfile.h"
> -#if GCC_VERSION >= 3000
>  #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT

Isn't HOST_BITS_PER_WIDE_INT always 64 now?

> +#if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))

W_TYPE_SIZE == 32 is always false and on 32-bit hosts, __SIZEOF_INT128__ won't be defined.

>  typedef unsigned HOST_HALF_WIDE_INT UHWtype;
>  typedef unsigned HOST_WIDE_INT UWtype;
>  typedef unsigned int UQItype __attribute__ ((mode (QI)));
>  typedef unsigned int USItype __attribute__ ((mode (SI)));
>  typedef unsigned int UDItype __attribute__ ((mode (DI)));
> -typedef unsigned int UTItype __attribute__ ((mode (TI)));
>  #if W_TYPE_SIZE == 32
> -# define UDWtype UDItype
> -#elif W_TYPE_SIZE == 64
> -# define UDWtype UTItype
> +typedef unsigned int UDWtype __attribute__ ((mode (DI)));

Can't we use #ifndef __SIZEOF_INT128__ here?

> +#else
> +typedef unsigned int UDWtype __attribute__ ((mode (TI)));
>  #endif
>  #include "longlong.h"
>  #endif

-- H.J.
Re: [PATCH] Fix PR c++/60994 gcc does not recognize hidden/shadowed enumeration as valid nested-name-specifier
Ping.

previous post: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01938.html

Again bootstrapped/regtested/diffed against xg++ (GCC) 4.10.0 20140508 (experimental) [master revision ed50168:49aa3a5:e79f58c7b12f37014efb7425399c93814cddb4c4]

On 29.04.2014 12:58, Momchil Velikov wrote:
> Hello,
>
> gcc version 4.10.0 20140428 (experimental) (GCC)
>
> Compiling (with c++ -c -std=c++11 b.cc) the following program
>
> enum struct A
> {
>   n = 3
> };
>
> int
> foo()
> {
>   int A;
>   return A::n;
> }
>
> results in the error:
>
> b.cc: In function 'int foo()':
> b.cc:10:10: error: 'A' is not a class, namespace, or enumeration
>    return A::n;
>           ^
>
> According to the C++11 Standard, [basic.lookup.qual] #1
>
> If a :: scope resolution operator in a nested-name-specifier is not preceded by a decltype-specifier, lookup of the name preceding that :: considers only namespaces, types, and templates whose specializations are types.
>
> GCC ought not to resolve A to the local variable, but to the enumeration type. This is very similar to the example in the standard
>
> struct A { static int n; };
> int foo() { int A; return A::n; }
>
> which is compiled correctly by GCC, though.
>
> Please, review this proposed fix. Bootstrapped/regtested for C/C++ on x86_64-unknown-linux-gnu.

~chill

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 3d400bb..cd86f95 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2014-05-08 Momchil Velikov momchil.veli...@gmail.com
+
+    PR c++/60994
+    * parser.c (cp_parser_class_name): Allow enumeral type as a
+    nested-name-specifier
+
 2014-05-08 Paolo Carlini paolo.carl...@oracle.com

     PR c++/13981
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 5542dcd..e7ff57f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -19220,7 +19220,8 @@ cp_parser_class_name (cp_parser *parser,
     }
   else if (TREE_CODE (decl) != TYPE_DECL
            || TREE_TYPE (decl) == error_mark_node
-           || !MAYBE_CLASS_TYPE_P (TREE_TYPE (decl))
+           || !(MAYBE_CLASS_TYPE_P (TREE_TYPE (decl))
+                || TREE_CODE (TREE_TYPE (decl)) == ENUMERAL_TYPE)
            /* In Objective-C 2.0, a classname followed by '.' starts a
               dot-syntax expression, and it's not a type-name. */
            || (c_dialect_objc ()
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index aa92e3b..60cbe3d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-05-08 Momchil Velikov momchil.veli...@gmail.com
+
+    PR c++/60994
+    * g++.dg/cpp0x/scoped_enum3.C: New testcase.
+
 2014-05-08 Joseph Myers jos...@codesourcery.com

     * gcc.target/i386/avx256-unaligned-load-2.c,
diff --git a/gcc/testsuite/g++.dg/cpp0x/scoped_enum3.C b/gcc/testsuite/g++.dg/cpp0x/scoped_enum3.C
new file mode 100644
index 000..ba527cb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/scoped_enum3.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++11 } }
+enum struct A
+{
+  n = 3
+};
+
+int
+foo()
+{
+  int A;
+  return A::n; // { dg-error "cannot convert 'A' to 'int' in return" }
+}
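The lookup rule quoted in the report can be tried in isolation with the sketch below. The names S, E, from_class and from_enum are hypothetical and chosen so that both cases fit in one file; the static_cast is there because, as the new testcase expects, a scoped enumerator does not convert implicitly to int. This assumes a compiler that implements the rule for enumerations, which is what the patch is meant to achieve for GCC.

```cpp
// Case from the standard: a class name hidden by a local variable.
struct S {
    static int n;
};
int S::n = 7;

int from_class() {
    int S = 0;    // hides the class name for ordinary lookup
    (void)S;      // silence the unused-variable warning
    return S::n;  // lookup before :: ignores the variable, finds struct S
}

// Case from the PR: a scoped enumeration hidden the same way.
enum struct E { n = 3 };

int from_enum() {
    int E = 0;    // hides the enumeration name for ordinary lookup
    (void)E;
    // Lookup before :: should consider only namespaces and types, so E
    // names the enumeration here; the cast makes the conversion explicit.
    return static_cast<int>(E::n);
}
```

With the compiler described in the report, from_class would be accepted and from_enum rejected with "'E' is not a class, namespace, or enumeration".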
RE: [PATCH GCC]Add 'force-dwarf-lexical-blocks' command line option
On Thu, 8 May 2014, Herman, Andrei wrote: Declarations that would fall into the scope of a newly created label scope are moved into the enclosing normal (non label) scope, where they actually belong. Shouldn't you be able to do something like that for the other cases as well, to avoid forcing C99 scoping rules? I will think about it if you think it's critical. I think it's logically the right design of the option. -- Joseph S. Myers jos...@codesourcery.com
Re: [wide-int] Add fast path for hosts with HWI widening multiplication
H.J. Lu hjl.to...@gmail.com writes:
> On Thu, May 8, 2014 at 1:12 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>> H.J. Lu hjl.to...@gmail.com writes:
>>> On Thu, May 8, 2014 at 12:42 PM, Jakub Jelinek ja...@redhat.com wrote:
>>>> On Thu, May 08, 2014 at 12:34:28PM -0700, H.J. Lu wrote:
>>>>> On Thu, May 8, 2014 at 12:18 PM, Richard Sandiford rdsandif...@googlemail.com wrote:
>>>>>> Kenneth Zadeck zad...@naturalbridge.com writes:
>>>>>>> everyone who has a private port will hate you forever. note that i have 2 of them.
>>>>>>
>>>>>> Got any other ideas though? I suppose if we're prepared to break compatibility with whatever the upstream of longlong.h is, we could make more use of intN_t and uintN_t.
>>>>>>
>>>>>> Having a whitelist of hosts seems like the best fix though. I'm not sure the default umul_ppmm is going to be any better than not defining it.
>>>>>
>>>>> Can you add a configure time check if
>>>>>
>>>>> typedef unsigned int UTItype __attribute__ ((mode (TI)));
>>>>>
>>>>> is supported?
>>>>
>>>> Why? Isn't that #ifdef __SIZEOF_INT128__ ?
>>
>> Read this just after getting the configure test to work...
>>
>>> Yes, we can use that. Will it work?
>>
>> Seems to. How does this look?
>>
>> Thanks,
>> Richard
>>
>> gcc/
>>     * wide-int.cc: Only include longlong.h if W_TYPE_SIZE==32 or
>>     __SIZEOF_INT128__ is defined.
>>
>> Index: gcc/wide-int.cc
>> ===
>> --- gcc/wide-int.cc 2014-05-08 20:48:25.341583885 +0100
>> +++ gcc/wide-int.cc 2014-05-08 21:09:29.324386217 +0100
>> @@ -27,18 +27,17 @@ along with GCC; see the file COPYING3.
>>  #include "tree.h"
>>  #include "dumpfile.h"
>> -#if GCC_VERSION >= 3000
>>  #define W_TYPE_SIZE HOST_BITS_PER_WIDE_INT
>
> Isn't HOST_BITS_PER_WIDE_INT always 64 now?

Right, but like I said in reply to Ramana's email, there's no guarantee that we'll continue to use HOST_WIDE_INT for wide-int.h. I'd rather keep this code parameterised on W_TYPE_SIZE rather than hard-code W_TYPE_SIZE==64.

>> +#if GCC_VERSION >= 3000 && (W_TYPE_SIZE == 32 || defined (__SIZEOF_INT128__))
>
> W_TYPE_SIZE == 32 is always false and on 32-bit hosts, __SIZEOF_INT128__ won't be defined.

Right, but we won't try to use TImode if in future we do revert to using 32-bit types for 32-bit hosts.

Thanks,
Richard
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Fri, 2 May 2014, Marek Polacek wrote:
> Yeah, that was a weird thing to do. I yanked that particular test to a new testcase. Otherwise no changes.
>
> Tested again x86_64-linux, ok now?
>
> 2014-05-02 Marek Polacek pola...@redhat.com
>
>     PR c/50459
> c-family/
>     * c-common.c (check_user_alignment): Return -1 if alignment is error node.
>     (handle_aligned_attribute): Don't call default_conversion on FUNCTION_DECLs.
>     (handle_vector_size_attribute): Likewise.
>     (handle_tm_wrap_attribute): Handle case when wrap_decl is error node.
>     (handle_sentinel_attribute): Call default_conversion and allow even integral types as an argument.
> c/
>     * c-parser.c (c_parser_attributes): Parse the arguments as an expression-list if the attribute takes identifier.
> testsuite/
>     * c-c++-common/attributes-1.c: Move test line to a new test.
>     * c-c++-common/attributes-2.c: New test.
>     * c-c++-common/pr50459.c: New test.
>     * c-c++-common/pr59280.c: Add undeclared to dg-error.
>     * gcc.dg/nonnull-2.c: Likewise.
>     * gcc.dg/pr55570.c: Modify dg-error.
>     * gcc.dg/tm/wrap-2.c: Likewise.

OK.

-- Joseph S. Myers jos...@codesourcery.com
[patch] fix impliedness of -Wunused-parameter depending on -Wextra option ordering
This fixes a regression introduced with 4.8, where the option ordering of -Wextra and -Wunused-parameter emits a warning, which is not emitted with 4.7. No regressions with the trunk, the 4.9 and 4.8 branches. Ok to check in for these?

Matthias

gcc/

2014-05-08 Manuel López-Ibáñez m...@gcc.gnu.org
           Matthias Klose d...@ubuntu.com

    PR driver/61106
    * optc-gen.awk: Fix option handling for -Wunused-parameter.

gcc/testsuite/

2014-05-08 Matthias Klose d...@ubuntu.com

    PR driver/61106
    * gcc.dg/unused-8a.c: New.
    * gcc.dg/unused-8b.c: Likewise.

Index: gcc/optc-gen.awk
===
--- gcc/optc-gen.awk (revision 210245)
+++ gcc/optc-gen.awk (working copy)
@@ -406,11 +406,13 @@
 if (opt_var_name != ) {
 condition = !opts_set-x_ opt_var_name
 if (thisenableif[j] != ) {
-condition = condition ( thisenableif[j] )
+value = ( thisenableif[j] )
+} else {
+value = value
 }
 print if ( condition )
 print handle_generated_option (opts, opts_set,
-print opt_enum(thisenable[j]) , NULL, value,
+print opt_enum(thisenable[j]) , NULL, value ,
 print lang_mask, kind, loc, handlers, dc);
 } else {
 print #error thisenable[j] does not have a Var() flag
Index: gcc/testsuite/gcc.dg/unused-8a.c
===
--- gcc/testsuite/gcc.dg/unused-8a.c (revision 0)
+++ gcc/testsuite/gcc.dg/unused-8a.c (working copy)
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra -Wno-unused" } */
+
+void foo(int x) { }
Index: gcc/testsuite/gcc.dg/unused-8b.c
===
--- gcc/testsuite/gcc.dg/unused-8b.c (revision 0)
+++ gcc/testsuite/gcc.dg/unused-8b.c (working copy)
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wno-unused -Wextra" } */
+
+void foo(int x) { }
Re: [patch] fix impliedness of -Wunused-parameter depending on -Wextra option ordering
On Thu, 8 May 2014, Matthias Klose wrote:
> This fixes a regression introduced with 4.8, where the option ordering of -Wextra and -Wunused-parameter emits a warning, which is not emitted with 4.7. No regressions with the trunk, the 4.9 and 4.8 branches. Ok to check in for these?

OK.

-- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Fix PR c++/60994 gcc does not recognize hidden/shadowed enumeration as valid nested-name-specifier
Nit: normally, ChangeLog entries are not submitted as diffs, part of the patch proper, but separately (also because the ChangeLog files keep changing quite fast). Paolo.
iq2000-elf: wide-int fallout (was: we are starting the wide int merge)
[...]

Just found this for iq2000:

g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include -I/home/jbglaw/repos/gcc/gcc/../libcpp/include -I/opt/cfarm/gmp-latest/include -I/opt/cfarm/mpfr-latest/include -I/opt/cfarm/mpc-latest/include -I/home/jbglaw/repos/gcc/gcc/../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o wide-int.o -MT wide-int.o -MMD -MP -MF ./.deps/wide-int.TPo /home/jbglaw/repos/gcc/gcc/wide-int.cc
/home/jbglaw/repos/gcc/gcc/wide-int.cc:37:56: error: unable to emulate 'TI'
 typedef unsigned int UTItype __attribute__ ((mode (TI)));
                                                        ^
make[1]: *** [wide-int.o] Error 1

See build 222669 (http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=222669) for more details.

Regards, JBG

--
Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481
Signature of: 17:44 @uschebit An evangelist is really just a salesperson
the second  : for unsellable products, right? (#korsett, 20120821)
Re: iq2000-elf: wide-int fallout (was: we are starting the wide int merge)
On Fri, 2014-05-09 at 00:48 +0200, Jan-Benedict Glaw wrote:
> [...]
>
> Just found this for iq2000:
>
> g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include -I/home/jbglaw/repos/gcc/gcc/../libcpp/include -I/opt/cfarm/gmp-latest/include -I/opt/cfarm/mpfr-latest/include -I/opt/cfarm/mpc-latest/include -I/home/jbglaw/repos/gcc/gcc/../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o wide-int.o -MT wide-int.o -MMD -MP -MF ./.deps/wide-int.TPo /home/jbglaw/repos/gcc/gcc/wide-int.cc
> /home/jbglaw/repos/gcc/gcc/wide-int.cc:37:56: error: unable to emulate 'TI'
>  typedef unsigned int UTItype __attribute__ ((mode (TI)));
>                                                         ^
> make[1]: *** [wide-int.o] Error 1

I also just ran into that. Seems to be a host issue. This one seems to fix it:
http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00527.html

Another wide-int merge fallout I ran into:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61120

Cheers,
Oleg
UTItype fallout (was: wide-int fallout)
On Fri, 2014-05-09 00:48:39 +0200, Jan-Benedict Glaw jbg...@lug-owl.de wrote:
> Just found this for iq2000:
>
> g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. -I/home/jbglaw/repos/gcc/gcc/../include -I/home/jbglaw/repos/gcc/gcc/../libcpp/include -I/opt/cfarm/gmp-latest/include -I/opt/cfarm/mpfr-latest/include -I/opt/cfarm/mpc-latest/include -I/home/jbglaw/repos/gcc/gcc/../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber -I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o wide-int.o -MT wide-int.o -MMD -MP -MF ./.deps/wide-int.TPo /home/jbglaw/repos/gcc/gcc/wide-int.cc
> /home/jbglaw/repos/gcc/gcc/wide-int.cc:37:56: error: unable to emulate 'TI'
>  typedef unsigned int UTItype __attribute__ ((mode (TI)));
>                                                         ^
> make[1]: *** [wide-int.o] Error 1
>
> See build 222669 (http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=222669) for more details.

That isn't actually fallout from the wide-int merge, but from code added later. Other targets affected are moxie-elf, aarch64_be-elf, alpha-linux, cr16-elf, ppc-linux, hppa-linux, arc-elf, you name it.

Regards, JBG

--
Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481
Signature of: Everything should be made as simple as possible.
the second  : But not simpler. (Einstein)
Re: [patch] change specific int128 - generic intN
The libstdc++v3 headers have __int128 hard-coded all over the place. Any suggestions on parameterizing those for the __intN types that are actually supported by the target?
Re: [patch] change specific int128 - generic intN
On Thu, 8 May 2014, DJ Delorie wrote:
> The libstdc++v3 headers have __int128 hard-coded all over the place. Any suggestions on parameterizing those for the __intN types that are actually supported by the target?

(adding libstdc++@ in Cc:)

The first idea that comes to mind (so possibly not such a good one) is to provide predefined macros:

#define __EXTENDED_INTEGER_TYPE_1__ __int24
#define __EXTENDED_INTEGER_TYPE_2__ __int128
#undef __EXTENDED_INTEGER_TYPE_3__

Assuming that the formula sizeof(type)*char_bit==precision works for all types, it should be sufficient for the library (abs, type_traits and numeric_limits).

--
Marc Glisse
Re: [patch] change specific int128 - generic intN
> Assuming that the formula sizeof(type)*char_bit==precision works for all

It doesn't. The MSP430 has __int20 for example.

Would it be acceptable for the compiler to always define a set of macros for each of the intN types? I would have thought that would be discouraged, but it would be an easier way to handle it.
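To make the objection concrete without an MSP430 toolchain, the sketch below compares storage size and precision for standard types only. It is an illustration, not a proposal for libstdc++: storage_bits and precision_bits are hypothetical helpers, and bool is used purely as a stand-in for a type such as __int20 whose precision is smaller than its storage.

```cpp
#include <climits>
#include <limits>
#include <type_traits>

// Hypothetical helper: number of bits of storage the type occupies.
template <typename T>
constexpr int storage_bits() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Hypothetical helper: value bits plus the sign bit, if any.
template <typename T>
constexpr int precision_bits() {
    return std::numeric_limits<T>::digits + (std::is_signed<T>::value ? 1 : 0);
}

// unsigned char has no padding bits, so the formula holds for it.
static_assert(precision_bits<unsigned char>() == storage_bits<unsigned char>(),
              "formula holds for unsigned char");

// bool occupies at least one byte but has a single value bit, so the
// formula sizeof(type)*CHAR_BIT == precision does not hold for it.
static_assert(precision_bits<bool>() < storage_bits<bool>(),
              "formula fails for bool");
```

A type like __int20 would fall in the second category, which suggests the precision has to be communicated to the library separately rather than derived from sizeof.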
Re: [RS6000] Fix PR61098, Poor code setting count register
On Thu, May 08, 2014 at 09:48:35AM -0400, David Edelsohn wrote:
> The history is 32 bit HWI.

Right.

> The ChangeLog does not mention the changes to rs6000.md nor rs6000-protos.h.

Oops, added.

    * config/rs6000/rs6000.md (movsi_internal1_single+1): Update call to rs6000_emit_set_const in splitter.
    (movdi_internal64+2, +3): Likewise.
    * config/rs6000/rs6000-protos.h (rs6000_emit_set_const): Update prototype.

> Please do not remove all of the comments from the two functions. The comments should provide some documentation about the different purposes of the two functions other than setting DEST to a CONST.

I believe my updated comment covers the complete purpose of the function nowadays. The comments I removed are outdated, and should have been removed a long time ago. rs6000_emit_set_const does not even look at N, it always returns a non-zero result, and the return is only tested for non-zero. I removed MODE too, because that is always the same as GET_MODE (dest).

> Why did you remove the test for NULL dest?
>
> - if (dest == NULL)
> -   dest = gen_reg_rtx (mode);
>
> That could occur, at least it used to occur.

I'm sure we can't get a NULL dest nowadays. All (three) uses of rs6000_emit_set_const occur in splitters. They all must have passed a gpc_reg_operand constraint on operands[0] before calling rs6000_emit_set_const, so if NULL were possible we'd segfault in gpc_reg_operand.

> I think that the way you rearranged the invocations of copy_rtx() in rs6000_emit_set_long_const() is okay, but it would be good for someone else to double check.

Yeah, that function is a bit messy. I took the approach of always using a bare dest once, in the last instruction emitted, with every other use getting hit with copy_rtx. The previous approach was similar, but used the bare dest on the first instruction emitted. Obviously you don't need copy_rtx anywhere with the new code when can_create_pseudo_p is true, but I felt it wasn't worth optimising that for the added source complication.

--
Alan Modra
Australia Development Lab, IBM
[PR tree-optimization/61009] Do not use a block as a joiner if it is too big for threading
This was yet another problem with threading through a loop backedge and finding equivalences that should have been invalidated.

In this instance, we were trying to thread through a large block. When we hit the statement threshold, thread_through_normal_block returned, and thus statements later in the block, which would have invalidated equivalences that were no longer valid after following a backedge, were never examined. So those equivalences were still in the tables.

We then proceeded to try and use the block as a joiner and thread through one or more of its successors. That's, of course, stupid. If we determined that the block was too big for normal threading, we certainly don't want to thread through it as a joiner either since that duplicates the block as well. So it's both a codesize and correctness issue.

This patch changes thread_through_normal_block to signal to its caller that the block was not fully processed due to code growth considerations. When that happens, we avoid trying to thread through the block's successors. That fixes both the code growth problem as well as the correctness issue.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed on the trunk. Will backport this and the other jump threading fix to the 4.9 branch shortly.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8e8b76e..0b27fc8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2014-05-08 Jeff Law l...@redhat.com
+
+    PR tree-optimization/61009
+    * tree-ssa-threadedge.c (thread_through_normal_block): Return a
+    tri-state rather than a boolean. When a block is too big to
+    thread through, inform caller via negative return value.
+    (thread_across_edge): If a block was too big for normal threading,
+    then it's too big for a joiner too, so remove temporary equivalences
+    and return immediately.
+ 2014-05-08 Manuel López-Ibáñez m...@gcc.gnu.org Matthias Klose d...@ubuntu.com diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 2dcf9dc..959763f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-05-08 Jeff Law l...@redhat.com + + PR tree-optimization/61009 + * g++.dg/tree-ssa/pr61009.C: New test. + 2014-05-08 Matthias Klose d...@ubuntu.com PR driver/61106 diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr61009.C b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C new file mode 100644 index 000..4e7bb1a --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C @@ -0,0 +1,53 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fno-tree-vrp -std=c++11 -fno-strict-aliasing -fdump-tree-dom1 } */ + +#include stdio.h +struct Field { + virtual int Compare(void*, void*); +}; +extern int NKF, NR; +extern int idxs[]; +extern Field* the_field; +extern int *incs; +extern char** fptrs; +inline int doCmp(int this_row_offset, int field_idx) { + void *p = fptrs[field_idx] + this_row_offset * incs[field_idx]; + return the_field-Compare(p,0); +} +bool Test(void) { + + int row_offset = 0; + + for (; row_offset NR; ++row_offset) { + + bool is_different = false; + for (int j = 0; j NKF ; ++j) { + int field_idx = idxs[j]; + int cmp = doCmp(row_offset, field_idx); + fprintf (stderr, cmp=%d\n,cmp); + + if (cmp == 0) { + continue; + } + if (cmp 0) { + is_different = true; + break; + } else { + fprintf (stderr, Incorrect\n); + return false; + } + } + if (!is_different) { + + return false; + } + } + + return true; +} + +// The block ending with cmp == 0 should not be threaded. ie, +// there should be a single == 0 comparison in the dump file. 
+ +// { dg-final { scan-tree-dump-times == 0 1 dom1 } } +// { dg-final { cleanup-tree-dump dom1 } } diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index 7621348..8e628d5 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -966,9 +966,14 @@ thread_around_empty_blocks (edge taken_edge, SIMPLIFY is a pass-specific function used to simplify statements. Our caller is responsible for restoring the state of the expression - and const_and_copies stacks. */ + and const_and_copies stacks. -static bool + Positive return value is success. Zero return value is failure, but + the block can still be duplicated as a joiner in a jump thread path, + negative indicates the block should not be duplicated and thus is not + suitable for a joiner in a jump threading path. */ + +static int thread_through_normal_block (edge e, gimple dummy_cond, bool handle_dominating_asserts, @@ -990,7 +995,7 @@ thread_through_normal_block (edge e, /* PHIs create temporary equivalences. */ if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p, src_map, dst_map)) -return
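The tri-state convention described in the patch can be sketched outside GCC as follows. Everything here is a hypothetical stand-in: try_normal_thread, decide, Action and the integer parameters are illustration names, not the tree-ssa-threadedge.c interfaces, and the real functions operate on edges and statements rather than counts.

```cpp
// Hypothetical stand-in for the result of trying to thread through a block:
//   > 0  a jump thread was found
//   == 0 no thread was found, but the block may still serve as a joiner
//   < 0  the block exceeded the statement limit and must not be duplicated
int try_normal_thread(int stmt_count, int max_stmts, bool target_known) {
    if (stmt_count > max_stmts)
        return -1;  // gave up early; later statements were never examined
    return target_known ? 1 : 0;
}

enum class Action { ThreadNormally, TryJoiner, GiveUp };

// Hypothetical caller mirroring the described control flow.
Action decide(int stmt_count, int max_stmts, bool target_known) {
    const int result = try_normal_thread(stmt_count, max_stmts, target_known);
    if (result > 0)
        return Action::ThreadNormally;
    if (result < 0)
        return Action::GiveUp;  // too big for normal threading, so too big for a joiner
    return Action::TryJoiner;
}
```

The point of the third state is that a plain boolean "failed" could not distinguish a block that was fully examined from one that was abandoned part-way, and only the former is safe to reuse as a joiner.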
libbacktrace patch committed: Fixes for large binaries
While testing on a large Google binary, I noticed that libbacktrace is allocating an inordinate amount of memory. The binary winds up with 377,944 entries in the unit_addrs vector. Each entry is 24 bytes, so this is 9,070,656 bytes, which is not too terrible. Unfortunately, for some reason I thought that when a libbacktrace vector is larger than a page the code should only allocate one additional page at a time. This vector requires 2215 4096-byte pages. Growing the vector one page at a time allocates a total of something like (2215 * 2214) / 2 pages, which turns out to be nearly 1.5G. Allocating 1.5G to represent a vector of size 9M is not desirable. It's true that when the vector grows, the old memory can be reused. But there is nothing in libbacktrace that is going to reuse that much memory. And even worse, there was a bug in the vector_grow routine that caused it to fail to correctly report the size of the old vector, so the memory had no chance of being reused anyhow. This patch fixes vector growth to double the number of pages requested each time. It fixes vector growth to record the correct size of the old vector being freed. The patch also adds some code to simply munmap large blocks of allocated memory. It's unlikely in practice that libbacktrace will ever be able to reuse a large block, so it's probably better to hand the memory back rather than hold onto it for no purpose. Bootstrapped and tested on x86_64-unknown-linux-gnu. Committed to 4.9 branch and mainline. Ian 2014-05-08 Ian Lance Taylor i...@google.com * mmap.c (backtrace_free): If freeing a large aligned block of memory, call munmap rather than holding onto it. (backtrace_vector_grow): When growing a vector, double the number of pages requested. When releasing the old version of a grown vector, pass the correct size to backtrace_free. 
Index: ChangeLog
===
--- ChangeLog (revision 210248)
+++ ChangeLog (working copy)
@@ -1,3 +1,11 @@
+2014-05-08  Ian Lance Taylor  i...@google.com
+
+       * mmap.c (backtrace_free): If freeing a large aligned block of
+       memory, call munmap rather than holding onto it.
+       (backtrace_vector_grow): When growing a vector, double the number
+       of pages requested.  When releasing the old version of a grown
+       vector, pass the correct size to backtrace_free.
+
 2014-03-07  Ian Lance Taylor  i...@google.com
 
        * sort.c (backtrace_qsort): Use middle element as pivot.
Index: mmap.c
===
--- mmap.c (revision 210248)
+++ mmap.c (working copy)
@@ -164,6 +164,26 @@ backtrace_free (struct backtrace_state *
 {
   int locked;
 
+  /* If we are freeing a large aligned block, just release it back to
+     the system.  This case arises when growing a vector for a large
+     binary with lots of debug info.  Calling munmap here may cause us
+     to call mmap again if there is also a large shared library; we
+     just live with that.  */
+  if (size >= 16 * 4096)
+    {
+      size_t pagesize;
+
+      pagesize = getpagesize ();
+      if (((uintptr_t) addr & (pagesize - 1)) == 0
+          && (size & (pagesize - 1)) == 0)
+        {
+          /* If munmap fails for some reason, just add the block to
+             the freelist.  */
+          if (munmap (addr, size) == 0)
+            return;
+        }
+    }
+
   /* If we can acquire the lock, add the new space to the free list.
      If we can't acquire the lock, just leak the memory.
      __sync_lock_test_and_set returns the old state of the lock, so we
@@ -209,14 +229,18 @@ backtrace_vector_grow (struct backtrace_
           alc = pagesize;
         }
       else
-        alc = (alc + pagesize - 1) & ~ (pagesize - 1);
+        {
+          alc *= 2;
+          alc = (alc + pagesize - 1) & ~ (pagesize - 1);
+        }
       base = backtrace_alloc (state, alc, error_callback, data);
       if (base == NULL)
         return NULL;
       if (vec->base != NULL)
         {
           memcpy (base, vec->base, vec->size);
-          backtrace_free (state, vec->base, vec->alc, error_callback, data);
+          backtrace_free (state, vec->base, vec->size + vec->alc,
+                          error_callback, data);
         }
       vec->base = base;
       vec->alc = alc - vec->size;
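The two bit tricks the patch relies on can be sketched in isolation. The helpers below are hypothetical names, not libbacktrace functions, and they assume the page size is a power of two (otherwise masking with `pagesize - 1` does not test alignment). The 16 * 4096 threshold is the one that appears in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Round `n` up to the next multiple of `pagesize`.
   Assumes `pagesize` is a power of two. */
size_t round_up_to_page(size_t n, size_t pagesize)
{
    return (n + pagesize - 1) & ~(pagesize - 1);
}

/* Return 1 if a block is a candidate for being handed straight back to
   the system: at least 16 * 4096 bytes, starting on a page boundary,
   and a whole number of pages long.  Assumes `pagesize` is a power of
   two. */
int is_large_aligned_block(const void *addr, size_t size, size_t pagesize)
{
    if (size < 16 * 4096)
        return 0;
    return ((uintptr_t) addr & (pagesize - 1)) == 0
           && (size & (pagesize - 1)) == 0;
}
```

A block that fails either alignment test cannot be passed to munmap as a whole, so in the patch it falls through to the existing free-list path.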
Re: [PATCH] Windows libibery: Don't quote args unnecessarily
On Tue, May 6, 2014 at 11:56 PM, Ray Donnelly mingw.andr...@gmail.com wrote:

We only quote arguments that contain spaces, \t or " characters to prevent wasting 2 characters per argument of the CreateProcess() 32,768 limit.

This is OK. Thanks.

Ian

 libiberty/pex-win32.c | 46 +-
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/libiberty/pex-win32.c b/libiberty/pex-win32.c
index eae72c5..8b9d4f0 100644
--- a/libiberty/pex-win32.c
+++ b/libiberty/pex-win32.c
@@ -340,17 +340,25 @@ argv_to_cmdline (char *const *argv)
   char *p;
   size_t cmdline_len;
   int i, j, k;
+  int needs_quotes;
 
   cmdline_len = 0;
   for (i = 0; argv[i]; i++)
     {
-      /* We quote every last argument.  This simplifies the problem;
-         we need only escape embedded double-quotes and immediately
+      /* We only quote arguments that contain spaces, \t or " characters to
+         prevent wasting 2 chars per argument of the CreateProcess 32k char
+         limit.  We need only escape embedded double-quotes and immediately
          preceeding backslash characters.  A sequence of backslach characters
          that is not follwed by a double quote character will not be
          escaped.  */
+      needs_quotes = 0;
       for (j = 0; argv[i][j]; j++)
         {
+          if (argv[i][j] == ' ' || argv[i][j] == '\t' || argv[i][j] == '"')
+            {
+              needs_quotes = 1;
+            }
+
           if (argv[i][j] == '"')
             {
               /* Escape preceeding backslashes.  */
@@ -362,16 +370,33 @@ argv_to_cmdline (char *const *argv)
         }
       /* Trailing backslashes also need to be escaped because they will be
          followed by the terminating quote.  */
-      for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
-        cmdline_len++;
+      if (needs_quotes)
+        {
+          for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
+            cmdline_len++;
+        }
       cmdline_len += j;
-      cmdline_len += 3;  /* for leading and trailing quotes and space */
+      /* for leading and trailing quotes and space */
+      cmdline_len += needs_quotes * 2 + 1;
     }
   cmdline = XNEWVEC (char, cmdline_len);
   p = cmdline;
   for (i = 0; argv[i]; i++)
     {
-      *p++ = '"';
+      needs_quotes = 0;
+      for (j = 0; argv[i][j]; j++)
+        {
+          if (argv[i][j] == ' ' || argv[i][j] == '\t' || argv[i][j] == '"')
+            {
+              needs_quotes = 1;
+              break;
+            }
+        }
+
+      if (needs_quotes)
+        {
+          *p++ = '"';
+        }
       for (j = 0; argv[i][j]; j++)
         {
           if (argv[i][j] == '"')
@@ -382,9 +407,12 @@ argv_to_cmdline (char *const *argv)
             }
           *p++ = argv[i][j];
         }
-      for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
-        *p++ = '\\';
-      *p++ = '"';
+      if (needs_quotes)
+        {
+          for (k = j - 1; k >= 0 && argv[i][k] == '\\'; k--)
+            *p++ = '\\';
+          *p++ = '"';
+        }
       *p++ = ' ';
     }
   p[-1] = '\0';
-- 
1.9.2
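As a rough illustration of the length accounting in the patch, the per-argument rule can be pulled out into two small helpers. This is a sketch, not the libiberty code: `arg_needs_quotes` and `arg_cmdline_len` are hypothetical names, and the second function only mirrors the first (length-counting) loop of `argv_to_cmdline` as changed by the patch.

```c
#include <stddef.h>

/* Return 1 if the argument contains a space, a tab or a double quote,
   which is the condition the patch uses to decide on quoting. */
int arg_needs_quotes(const char *arg)
{
    for (size_t j = 0; arg[j]; j++)
        if (arg[j] == ' ' || arg[j] == '\t' || arg[j] == '"')
            return 1;
    return 0;
}

/* Number of bytes one argument contributes to the command line,
   including the separating space (or final NUL). */
size_t arg_cmdline_len(const char *arg)
{
    size_t len = 0;
    size_t j;
    int quoted = arg_needs_quotes(arg);

    for (j = 0; arg[j]; j++) {
        if (arg[j] == '"') {
            /* Backslashes immediately before the quote are doubled. */
            for (size_t k = j; k > 0 && arg[k - 1] == '\\'; k--)
                len++;
            /* One backslash to escape the quote itself. */
            len++;
        }
    }
    /* Trailing backslashes only need doubling when a closing quote
       follows them. */
    if (quoted)
        for (size_t k = j; k > 0 && arg[k - 1] == '\\'; k--)
            len++;
    len += j;                       /* the characters themselves */
    len += (size_t) quoted * 2 + 1; /* optional quotes plus space */
    return len;
}
```

For an argument with no special characters the cost is its length plus one, which is where the saving of two characters per argument comes from.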
Re: Contributing new gcc targets: i386-*-dragonfly and x86-64-*-dragonfly
On 05/03/14 01:11, John Marino wrote:

revised patchset : http://leaf.dragonflybsd.org/~marino/gcc-df-target/patches/patch-dragonfly-target
revised changelog : http://leaf.dragonflybsd.org/~marino/gcc-df-target/changelog_entries/gcc_ChangeLog_entry.txt
revised commit msg: http://leaf.dragonflybsd.org/~marino/gcc-df-target/proposed_commit-msg.txt

Good catch! Does the rest of the patch set look good to you? I think all the non-obvious patches have been reviewed collectively by various people now and may be ready to be approved now.

In config.gcc:

+no | gnat | single)
+  # Let these non-posix thread selections fall through if requested

Support for gnat as a thread model was removed in 2011. So I think you need to remove that case.

configure.ac:

+  *-*-dragonfly* | *-*-freebsd*)
+    if grep dl_iterate_phdr $target_header_dir/sys/link_elf.h > /dev/null 2>&1; then
+      gcc_cv_target_dl_iterate_phdr=yes
+    else
+      gcc_cv_target_dl_iterate_phdr=no
+    fi
+    ;;

Presumably you intended to change freebsd* here. Just want a confirmation. I haven't worked on the *bsd platforms in about 20 years, so I have no idea if this is right for them in general.

I see you have a dragonfly-stdint.h. Is there a particular reason why you can't use the freebsd-stdint.h? I didn't check every type, but a quick glance makes me think they ought to be equivalent. Similarly for dragonfly.opt.

It looks like there's a fair amount of duplication in config/dragonfly.h and config/i386/dragonfly but I don't see an easy way to fix that. So, I'll let that go.

I'm going to trust the unwind code works and isn't duplicating something from somewhere else that ought to instead be shared.

So it basically looks good. Can you fix the config.gcc nit and determine if we can (and should) share files with freebsd? Repost after those fixes and we should be ready to go.

And one final thing, do you have a copyright assignment on file with the FSF?

jeff
Re: [ping] [PATCH] config-list.mk: `show' target to show all considered targets
On 05/08/14 02:10, Jan-Benedict Glaw wrote: On Mon, 2014-05-05 14:05:40 +0200, Jan-Benedict Glaw jbg...@lug-owl.de wrote: I'd like to install this patch, which would help me to run the build robot (http://toolchain.lug-owl.de/buildbot/): 2014-05-05 Jan-Benedict Glaw jbg...@lug-owl.de contrib/ * config-list.mk (show): New target. Ping? http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00203.html OK. Jeff
[gomp4] Merge trunk r210100 (2014-05-06) into gomp-4_0-branch
Hi!

In r210257, I have committed a merge from trunk r210100 (2014-05-06) into gomp-4_0-branch.

The LTO regression that appeared with the last merge, http://news.gmane.org/find-root.php?message_id=%3C87wqf483pl.fsf%40schwinge.name%3E, remains to be resolved:

 PASS: gcc.dg/lto/save-temps c_lto_save-temps_0.o assemble, -O -flto -save-temps
-PASS: gcc.dg/lto/save-temps c_lto_save-temps_0.o-c_lto_save-temps_0.o link, -O -flto -save-temps
+FAIL: gcc.dg/lto/save-temps c_lto_save-temps_0.o-c_lto_save-temps_0.o link, -O -flto -save-temps
+UNRESOLVED: gcc.dg/lto/save-temps c_lto_save-temps_0.o-c_lto_save-temps_0.o execute -O -flto -save-temps

Executing on host: [...]/build/gcc/xgcc -B[...]/build/gcc/ -fno-diagnostics-show-caret -fdiagnostics-color=never -O -flto -save-temps -c -o c_lto_save-temps_0.o [...]/source/gcc/testsuite/gcc.dg/lto/save-temps_0.c (timeout = 300)
spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ -fno-diagnostics-show-caret -fdiagnostics-color=never -O -flto -save-temps -c -o c_lto_save-temps_0.o [...]/source/gcc/testsuite/gcc.dg/lto/save-temps_0.c
PASS: gcc.dg/lto/save-temps c_lto_save-temps_0.o assemble, -O -flto -save-temps
Executing on host: [...]/build/gcc/xgcc -B[...]/build/gcc/ c_lto_save-temps_0.o -fno-diagnostics-show-caret -fdiagnostics-color=never -O -flto -save-temps -o gcc-dg-lto-save-temps-01.exe (timeout = 300)
spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ c_lto_save-temps_0.o -fno-diagnostics-show-caret -fdiagnostics-color=never -O -flto -save-temps -o gcc-dg-lto-save-temps-01.exe
[...]/build/gcc/xgcc @/tmp/ccjomvFW
[...]/build/gcc/xgcc @/tmp/ccAM0t6j
output is:
[...]/build/gcc/xgcc @/tmp/ccjomvFW
[...]/build/gcc/xgcc @/tmp/ccAM0t6j
FAIL: gcc.dg/lto/save-temps c_lto_save-temps_0.o-c_lto_save-temps_0.o link, -O -flto -save-temps
UNRESOLVED: gcc.dg/lto/save-temps c_lto_save-temps_0.o-c_lto_save-temps_0.o execute -O -flto -save-temps

Regards,
Thomas
Merge from GCC 4.9 branch to gccgo branch
I merged GCC 4.9 branch revision 210256 to the gccgo branch. Ian