[RFC] Making fold-const sane WRT symbol visibilities
Hello,
fold-const contains quite a few confused statements that deal with WEAK visibility and aliases:

static int
simple_operand_p (const_tree exp)
{
  /* Strip any conversions that don't change the machine mode.  */
  STRIP_NOPS (exp);

  return (CONSTANT_CLASS_P (exp)
          || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
              && ! TREE_ADDRESSABLE (exp)
              && ! TREE_THIS_VOLATILE (exp)
              && ! DECL_NONLOCAL (exp)
              /* Don't regard global variables as simple.  They may be
                 allocated in ways unknown to the compiler (shared memory,
                 #pragma weak, etc).  */
              && ! TREE_PUBLIC (exp)
              && ! DECL_EXTERNAL (exp)
              /* Weakrefs are not safe to be read, since they can be NULL.
                 They are !TREE_PUBLIC && !DECL_EXTERNAL but still
                 have DECL_WEAK flag set.  */
              && (! VAR_OR_FUNCTION_DECL_P (exp) || ! DECL_WEAK (exp))
              /* Loading a static variable is unduly expensive, but global
                 registers aren't expensive.  */
              && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
}

Here I think WEAK is useless, since we already check PUBLIC.

      /* If this is an equality comparison of the address of two non-weak,
         unaliased symbols neither of which are extern (since we do not
         have access to attributes for externs), then we know the result.  */
      if (TREE_CODE (arg0) == ADDR_EXPR
          && VAR_OR_FUNCTION_DECL_P (TREE_OPERAND (arg0, 0))
          && ! DECL_WEAK (TREE_OPERAND (arg0, 0))
          && ! lookup_attribute ("alias",
                                 DECL_ATTRIBUTES (TREE_OPERAND (arg0, 0)))
          && ! DECL_EXTERNAL (TREE_OPERAND (arg0, 0))
          && TREE_CODE (arg1) == ADDR_EXPR
          && VAR_OR_FUNCTION_DECL_P (TREE_OPERAND (arg1, 0))
          && ! DECL_WEAK (TREE_OPERAND (arg1, 0))
          && ! lookup_attribute ("alias",
                                 DECL_ATTRIBUTES (TREE_OPERAND (arg1, 0)))
          && ! DECL_EXTERNAL (TREE_OPERAND (arg1, 0)))

Here we no longer consistently use the alias attribute to represent aliases; the code should ask symtab. Moreover, it handles just a fraction of the cases where we can prove nonequality.

There are some others as well. This patch attempts to deal with nonzero, which is one of the basic predicates.
I added symtab_node::nonzero, which I hope is implemented safely, and I removed the fold-const code (I will clean up its implementation a bit; at the very end I added the flag_ checks that make it somewhat ugly).

The problem is that I do not think I can safely fold before symtab is constructed, as shown by the testcase:

extern int a;
t()
{
  return &a!=0;
}
extern int a __attribute__ ((weak));

Here we incorrectly fold into true before we see the weak attribute.

The problem is that the patch fails testcases that assume we do such folding at parsing time.

./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-1.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-2.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-3.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-4.c (test for excess errors)

Here we accept the source as a compile-time constant while I think it is not:

int sc = (&sc > 0);

Does it seem reasonable to turn those testcases into ones testing for an error and live with delaying the folding opportunities to early opts? They are now usually caught by the ccp1 pass.

Honza

        * cgraph.h (symtab_node): Add method nonzero.
        (decl_in_symtab_p): Break out from ...
        (symtab_get_node): ... here.
        * symtab.c (symtab_node::nonzero): New method.
        * fold-const.c: Include cgraph.h
        (tree_single_nonzero_warnv_p): Use symtab for symbols.
        * testsuite/g++.dg/tree-ssa/nonzero-2.C: New testcase.
        * testsuite/g++.dg/tree-ssa/nonzero-1.C: New testcase.
        * testsuite/gcc.dg/tree-ssa/nonzero-1.c: New testcase.

Index: cgraph.h
===================================================================
--- cgraph.h    (revision 211915)
+++ cgraph.h    (working copy)
@@ -214,6 +214,9 @@ public:
 
   void set_init_priority (priority_type priority);
   priority_type get_init_priority ();
+
+  /* Return true if symbol is known to be nonzero.  */
+  bool nonzero ();
 };
 
 enum availability
@@ -1068,6 +1077,17 @@ void varpool_remove_initializer (varpool
 /* In cgraph.c  */
 extern void change_decl_assembler_name (tree, tree);
 
+/* Return true if DECL should have entry in symbol table if used.
+   Those are functions and static & external variables.  */
+
+static bool
+decl_in_symtab_p (const_tree decl)
+{
+  return (TREE_CODE (decl) == FUNCTION_DECL
+          || (TREE_CODE (decl) == VAR_DECL
+              && (TREE_STATIC (decl) || DECL_EXTERNAL (decl))));
+}
+
 /* Return symbol table node associated with DECL, if any,
    and NULL otherwise.  */
 
@@ -1075,12 +1095,7 @@ static inline symtab_node *
 symtab_get_node (const_tree decl)
 {
 #ifdef
Re: Fortran OpenMP UDR fixes, nested handling fixes etc.
Jakub Jelinek wrote:

So, either we need something like the following patch (incremental), or another possibility for the problem is to not do the value.function.name related change in module.c in the UDR patch, and instead fix up the UDR combiner/initializer expressions when they are loaded from the module (change name to NULL only in the UDR combiner/initializer expressions, where they shouldn't be resolved yet). Or make sure value.function.name is set to non-NULL when resolving all intrinsic function calls, rather than just for a subset of them.

With this patch it seems to pass bootstrap/regtest.

I think your patch looks sufficiently sleek that I would go for it.

Tobias

2014-06-21  Jakub Jelinek  ja...@redhat.com

        * resolve.c (resolve_function): If value.function.isym is non-NULL,
        consider it already resolved.
        * module.c (fix_mio_expr): Likewise.
        * trans-openmp.c (gfc_trans_omp_array_reduction_or_udr): Don't
        initialize value.function.isym.

--- gcc/fortran/resolve.c.jj    2014-06-20 23:31:49.0 +0200
+++ gcc/fortran/resolve.c       2014-06-21 20:07:39.708099045 +0200
@@ -2887,7 +2887,8 @@ resolve_function (gfc_expr *expr)
 
   /* See if function is already resolved.  */
 
-  if (expr->value.function.name != NULL)
+  if (expr->value.function.name != NULL
+      || expr->value.function.isym != NULL)
     {
       if (expr->ts.type == BT_UNKNOWN)
         expr->ts = sym->ts;
--- gcc/fortran/module.c.jj     2014-06-20 23:31:49.0 +0200
+++ gcc/fortran/module.c        2014-06-23 08:53:50.488662314 +0200
@@ -3173,7 +3173,8 @@ fix_mio_expr (gfc_expr *e)
               && !e->symtree->n.sym->attr.dummy)
             e->symtree = ns_st;
         }
-      else if (e->expr_type == EXPR_FUNCTION && e->value.function.name)
+      else if (e->expr_type == EXPR_FUNCTION
+               && (e->value.function.name || e->value.function.isym))
         {
           gfc_symbol *sym;
 
--- gcc/fortran/trans-openmp.c.jj       2014-06-20 23:31:49.0 +0200
+++ gcc/fortran/trans-openmp.c  2014-06-23 11:53:02.932495166 +0200
@@ -1417,7 +1417,6 @@ gfc_trans_omp_array_reduction_or_udr (tr
       e4->expr_type = EXPR_FUNCTION;
       e4->where = where;
       e4->symtree = symtree4;
-      e4->value.function.isym = gfc_find_function (iname);
       e4->value.function.actual = gfc_get_actual_arglist ();
       e4->value.function.actual->expr = e3;
       e4->value.function.actual->next = gfc_get_actual_arglist ();
Re: [PATCH, PR61554] ICE during CCP
On 2014/6/23 04:45 PM, Richard Biener wrote:
On Mon, Jun 23, 2014 at 7:32 AM, Chung-Lin Tang clt...@codesourcery.com wrote:

Hi Richard,

In this change:
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01278.html

where substitute_and_fold() was changed to use a dom walker, the calls to purge dead EH edges during the walk can alter the dom-tree, with chaotic results; the testcase in PR 61554 has some blocks traversed twice during the walk, causing the segfault during CCP.

The patch records the to-be-purged-for-dead-EH blocks in a manner similar to stmts_to_remove, and processes them after the walk. (Another possible method would be using a bitmap to record the BBs + calling gimple_purge_all_dead_eh_edges...)

Oops.

Bootstrapped and tested on x86_64-linux, is this okay for trunk?

Can you please use a bitmap and use gimple_purge_all_dead_eh_edges like tree-ssa-pre.c does? Also please add the reduced testcase from the PR to the g++.dg/torture

Ok with those changes.

Thanks,
Richard.

Thanks for the review. Attached is what I committed. Testcase made by Markus also added.

Thanks,
Chung-Lin

2014-06-24  Chung-Lin Tang  clt...@codesourcery.com

        PR tree-optimization/61554
        * tree-ssa-propagate.c: Include "bitmap.h".
        (substitute_and_fold_dom_walker): Add 'bitmap need_eh_cleanup' member,
        properly update constructor/destructor.
        (substitute_and_fold_dom_walker::before_dom_children): Remove call to
        gimple_purge_dead_eh_edges, add bb->index to need_eh_cleanup instead.
        (substitute_and_fold): Call gimple_purge_all_dead_eh_edges on
        need_eh_cleanup.
Index: tree-ssa-propagate.c
===================================================================
--- tree-ssa-propagate.c        (revision 211927)
+++ tree-ssa-propagate.c        (working copy)
@@ -29,6 +29,7 @@
 #include "function.h"
 #include "gimple-pretty-print.h"
 #include "dumpfile.h"
+#include "bitmap.h"
 #include "sbitmap.h"
 #include "tree-ssa-alias.h"
 #include "internal-fn.h"
@@ -1031,8 +1032,13 @@ class substitute_and_fold_dom_walker : public dom_
       fold_fn (fold_fn_), do_dce (do_dce_), something_changed (false)
     {
       stmts_to_remove.create (0);
+      need_eh_cleanup = BITMAP_ALLOC (NULL);
     }
-    ~substitute_and_fold_dom_walker () { stmts_to_remove.release (); }
+    ~substitute_and_fold_dom_walker ()
+    {
+      stmts_to_remove.release ();
+      BITMAP_FREE (need_eh_cleanup);
+    }
 
     virtual void before_dom_children (basic_block);
     virtual void after_dom_children (basic_block) {}
@@ -1042,6 +1048,7 @@ class substitute_and_fold_dom_walker : public dom_
     bool do_dce;
     bool something_changed;
     vec<gimple> stmts_to_remove;
+    bitmap need_eh_cleanup;
 };
 
 void
@@ -1144,7 +1151,7 @@ substitute_and_fold_dom_walker::before_dom_childre
           /* If we cleaned up EH information from the statement,
              remove EH edges.  */
           if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
-            gimple_purge_dead_eh_edges (bb);
+            bitmap_set_bit (need_eh_cleanup, bb->index);
 
           if (is_gimple_assign (stmt)
               && (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
@@ -1235,6 +1242,9 @@ substitute_and_fold (ssa_prop_get_value_fn get_val
         }
     }
 
+  if (!bitmap_empty_p (walker.need_eh_cleanup))
+    gimple_purge_all_dead_eh_edges (walker.need_eh_cleanup);
+
   statistics_counter_event (cfun, "Constants propagated",
                             prop_stats.num_const_prop);
   statistics_counter_event (cfun, "Copies propagated",
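
The fix in this thread follows a general pattern: do not modify a structure while it is being walked; record what needs work and process it once the walk is over. A minimal sketch of that pattern in plain C, under stated assumptions: a toy CFG of at most 64 blocks, a uint64_t standing in for GCC's bitmap, and the hypothetical names struct block, walk_blocks and purge_marked, none of which are GCC APIs.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 64  /* toy limit so that one uint64_t can act as the bitmap */

/* Hypothetical stand-in for a basic block; this is not GCC's basic_block.  */
struct block {
  int index;             /* position in the toy CFG, 0 .. N_BLOCKS - 1 */
  int has_dead_eh_edge;  /* set when a statement lost its EH information */
  int purged;            /* set once the deferred cleanup has run */
};

/* Walk phase: only record which blocks need cleanup.  The structure
   being traversed is never modified here.  */
static uint64_t walk_blocks(const struct block *blocks, int n)
{
  uint64_t need_cleanup = 0;
  assert(n >= 0 && n <= N_BLOCKS);
  for (int i = 0; i < n; i++) {
    assert(blocks[i].index >= 0 && blocks[i].index < N_BLOCKS);
    if (blocks[i].has_dead_eh_edge)
      need_cleanup |= (uint64_t)1 << blocks[i].index;
  }
  return need_cleanup;
}

/* Cleanup phase: runs after the walk and returns how many blocks
   were processed.  */
static int purge_marked(struct block *blocks, int n, uint64_t need_cleanup)
{
  int purged = 0;
  for (int i = 0; i < n; i++)
    if (need_cleanup & ((uint64_t)1 << blocks[i].index)) {
      blocks[i].purged = 1;
      purged++;
    }
  return purged;
}
```

A caller would run walk_blocks first and hand its result to purge_marked afterwards, which mirrors setting bits in need_eh_cleanup during the dom walk and purging once the walk has finished.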
Re: [PATCH] Fix up -march=native handling under KVM (PR target/61570)
On Mon, Jun 23, 2014 at 6:29 PM, H.J. Lu hjl.to...@gmail.com wrote:

--- gcc/config/i386/driver-i386.c.jj    2014-05-14 14:45:54.0 +0200
+++ gcc/config/i386/driver-i386.c       2014-06-20 18:59:57.805006358 +0200
@@ -745,6 +745,11 @@ const char *host_detect_local_cpu (int a
             /* Assume Core 2.  */
             cpu = "core2";
         }
+      else if (has_longmode)
+        /* Perhaps some emulator?  Assume x86-64, otherwise gcc
+           -march=native would be unusable for 64-bit compilations,
+           as all the CPUs below are 32-bit only.  */
+        cpu = "x86-64";
       else if (has_sse3)
         /* It is Core Duo.  */
         cpu = "pentium-m";

Jakub

host_detect_local_cpu guesses the cpu based on the real processors. It doesn't work with emulators due to some conflicts. This isn't the only place which has the same issue. I prefer something like this.

I'm fine with your patch too. Let's wait what Uros (or other i?86 maintainers) pick up.

This looks OK to me.

Thanks,
Uros.

This is what I checked in.

This version was NOT approved. Please revert it ASAP and proceed with the approved version.

Uros.
Re: [RFC] Making fold-const sane WRT symbol visibilities
On Tue, 24 Jun 2014, Jan Hubicka wrote:

Hello,
fold-const contains quite a few confused statements that deal with WEAK visibility and aliases:

static int
simple_operand_p (const_tree exp)
{
  /* Strip any conversions that don't change the machine mode.  */
  STRIP_NOPS (exp);

  return (CONSTANT_CLASS_P (exp)
          || TREE_CODE (exp) == SSA_NAME
          || (DECL_P (exp)
              && ! TREE_ADDRESSABLE (exp)
              && ! TREE_THIS_VOLATILE (exp)
              && ! DECL_NONLOCAL (exp)
              /* Don't regard global variables as simple.  They may be
                 allocated in ways unknown to the compiler (shared memory,
                 #pragma weak, etc).  */
              && ! TREE_PUBLIC (exp)
              && ! DECL_EXTERNAL (exp)
              /* Weakrefs are not safe to be read, since they can be NULL.
                 They are !TREE_PUBLIC && !DECL_EXTERNAL but still
                 have DECL_WEAK flag set.  */
              && (! VAR_OR_FUNCTION_DECL_P (exp) || ! DECL_WEAK (exp))
              /* Loading a static variable is unduly expensive, but global
                 registers aren't expensive.  */
              && (! TREE_STATIC (exp) || DECL_REGISTER (exp))));
}

Here I think WEAK is useless, since we already check PUBLIC.

      /* If this is an equality comparison of the address of two non-weak,
         unaliased symbols neither of which are extern (since we do not
         have access to attributes for externs), then we know the result.  */
      if (TREE_CODE (arg0) == ADDR_EXPR
          && VAR_OR_FUNCTION_DECL_P (TREE_OPERAND (arg0, 0))
          && ! DECL_WEAK (TREE_OPERAND (arg0, 0))
          && ! lookup_attribute ("alias",
                                 DECL_ATTRIBUTES (TREE_OPERAND (arg0, 0)))
          && ! DECL_EXTERNAL (TREE_OPERAND (arg0, 0))
          && TREE_CODE (arg1) == ADDR_EXPR
          && VAR_OR_FUNCTION_DECL_P (TREE_OPERAND (arg1, 0))
          && ! DECL_WEAK (TREE_OPERAND (arg1, 0))
          && ! lookup_attribute ("alias",
                                 DECL_ATTRIBUTES (TREE_OPERAND (arg1, 0)))
          && ! DECL_EXTERNAL (TREE_OPERAND (arg1, 0)))

Here we no longer consistently use the alias attribute to represent aliases; the code should ask symtab. Moreover, it handles just a fraction of the cases where we can prove nonequality.

There are some others as well. This patch attempts to deal with nonzero, which is one of the basic predicates.

I added symtab_node::nonzero, which I hope is implemented safely, and I removed the fold-const code (I will clean up its implementation a bit; at the very end I added the flag_ checks that make it somewhat ugly).

The problem is that I do not think I can safely fold before symtab is constructed, as shown by the testcase:

extern int a;
t()
{
  return &a!=0;
}
extern int a __attribute__ ((weak));

Here we incorrectly fold into true before we see the weak attribute.

The problem is that the patch fails testcases that assume we do such folding at parsing time.

./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-1.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-2.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-3.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-4.c (test for excess errors)

Here we accept the source as a compile-time constant while I think it is not:

int sc = (&sc > 0);

Does it seem reasonable to turn those testcases into ones testing for an error and live with delaying the folding opportunities to early opts? They are now usually caught by the ccp1 pass.

IMHO all symbol visibility related foldings are very premature if done in the frontends (well, most of fold-const.c is ...). Of course everything depends on whether there exists a frontend that requires these foldings for correctness ...

Personally I find nonzero ambiguous, as it doesn't clearly state that it is about the symbol's address rather than its value.

Richard.

Honza

        * cgraph.h (symtab_node): Add method nonzero.
        (decl_in_symtab_p): Break out from ...
        (symtab_get_node): ... here.
        * symtab.c (symtab_node::nonzero): New method.
        * fold-const.c: Include cgraph.h
        (tree_single_nonzero_warnv_p): Use symtab for symbols.
        * testsuite/g++.dg/tree-ssa/nonzero-2.C: New testcase.
        * testsuite/g++.dg/tree-ssa/nonzero-1.C: New testcase.
        * testsuite/gcc.dg/tree-ssa/nonzero-1.c: New testcase.

Index: cgraph.h
===================================================================
--- cgraph.h    (revision 211915)
+++ cgraph.h    (working copy)
@@ -214,6 +214,9 @@ public:
 
   void set_init_priority (priority_type priority);
   priority_type get_init_priority ();
+
+  /* Return true if symbol is known to be nonzero.  */
+  bool nonzero ();
 };
 
 enum availability
@@ -1068,6 +1077,17 @@ void varpool_remove_initializer (varpool
 /* In cgraph.c  */
 extern void change_decl_assembler_name (tree, tree);
 
+/* Return
Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
On Mon, 23 Jun 2014, Cong Hou wrote:

It has been 8 months since this patch was posted. I have addressed all comments on this patch.

The SAD pattern is very useful for multimedia code such as ffmpeg. This patch will greatly improve the performance of such algorithms. Could you please have a look again and check if it is OK for the trunk? If it is necessary I can re-post this patch in a new thread.

I will try to get to this one this week but can't easily find the latest patch, so can you re-post it in a new thread?

Thanks,
Richard.

Thank you!

Cong

On Tue, Dec 17, 2013 at 10:04 AM, Cong Hou co...@google.com wrote:

Ping?

thanks,
Cong

On Mon, Dec 2, 2013 at 5:06 PM, Cong Hou co...@google.com wrote:

Hi Richard

Could you please take a look at this patch and see if it is ready for the trunk? The patch is pasted as a text file here again.

Thank you very much!

Cong

On Mon, Nov 11, 2013 at 11:25 AM, Cong Hou co...@google.com wrote:

Hi James

Sorry for the late reply.

On Fri, Nov 8, 2013 at 2:55 AM, James Greenhalgh james.greenha...@arm.com wrote:
On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou co...@google.com wrote:

Thank you for your detailed explanation.

Once GCC detects a reduction operation, it will automatically accumulate all elements in the vector after the loop. In the loop the reduction variable is always a vector whose elements are reductions of corresponding values from other vectors. Therefore in your case the only instruction you need to generate is:

    VABAL ops[3], ops[1], ops[2]

It is OK if you accumulate the elements into one in the vector inside of the loop (if one instruction can do this), but you have to make sure the other elements in the vector remain zero so that the final result is correct.

If you are confused about the documentation, check the one for udot_prod (just above usad in md.texi), as it has very similar behavior to usad. Actually I copied the text from there and made some changes.
As those two instruction patterns are both for vectorization, their behavior should not be difficult to explain.

If you have more questions or think that the documentation is still improper please let me know.

Hi Cong,

Thanks for your reply.

I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and DOT_PROD_EXPR and I see that the same ambiguity exists for DOT_PROD_EXPR. Can you please add a note in your tree.def that SAD_EXPR, like DOT_PROD_EXPR, can be expanded as either:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = PLUS_EXPR (tmp2, arg3)

or:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = WIDEN_SUM_EXPR (tmp2, arg3)

Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning a value of the same (widened) type as arg3.

I have added it, although we currently don't have WIDEN_MINUS_EXPR (I mentioned it in tree.def).

Also, while looking for the history of DOT_PROD_EXPR I spotted this patch:

[autovect] [patch] detect mult-hi and sad patterns
http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html

I wonder what the reason was for that patch to be dropped?

It has been 8 years.. I have no idea why this patch was not accepted in the end. There is not even a reply in that thread. But I believe the SAD pattern is very important to recognize. ARM also provides instructions for it.

Thank you for your comment again!

thanks,
Cong

Thanks,
James

--
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
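
For readers unfamiliar with the operation under discussion, here is a scalar reference for sum of absolute differences, written as a minimal sketch in plain C. The function name sad_u8 and the choice of 8-bit inputs with a 32-bit accumulator are assumptions made for illustration; the comments map each step onto the widen-minus, abs and plus sequence quoted above, and the code is not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar sum of absolute differences over two byte arrays, added to an
   incoming accumulator ACC (the role arg3 plays in the thread).  */
static uint32_t sad_u8(const uint8_t *a, const uint8_t *b, size_t n,
                       uint32_t acc)
{
  for (size_t i = 0; i < n; i++) {
    /* Widen both operands before subtracting so the difference cannot
       wrap (the WIDEN_MINUS_EXPR step).  */
    int diff = (int)a[i] - (int)b[i];
    /* Absolute value (ABS_EXPR), then accumulate (PLUS_EXPR).  */
    acc += (uint32_t)abs(diff);
  }
  return acc;
}
```

A vectorized version would keep one partial sum per vector lane inside the loop and add the lanes together after the loop, which is the reduction behavior described earlier in the thread.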
Re: [PATCH] Fix forwprop pattern (T)(P + A) - (T)P -> (T)A
On Mon, Jun 23, 2014 at 8:02 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hi,

On Mon, 23 Jun 2014 19:12:47, Richard Biener wrote:
On Mon, Jun 23, 2014 at 4:28 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hi,

On Mon, 23 Jun 2014 10:40:53, Richard Biener wrote:
On Sun, Jun 22, 2014 at 9:14 AM, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

Hi,

I noticed that several testcases in the GMP-4.3.2 test suite are failing now which did not happen with GCC 4.9.0. I debugged the first one, mpz/convert, and found the file mpn/generic/get_str.c was miscompiled.

mpn/get_str.c.132t.dse2:
  pretmp_183 = (sizetype) chars_per_limb_80;
  pretmp_184 = -pretmp_183;
  _23 = chars_per_limb_80 + 4294967295;
  _68 = (sizetype) _23;
  _28 = _68 + pretmp_184;

mpn/get_str.c.133t.forwprop4:
  _28 = 4294967295;

That is wrong, because chars_per_limb is unsigned, and it is not zero. So the right result should be -1. This makes the loop termination in that function fail.

The reason for this is in this check-in:

r210807 | ebotcazou | 2014-05-22 16:32:56 +0200 (Thu, 22 May 2014) | 3 lines

        * tree-ssa-forwprop.c (associate_plusminus): Extend (T)(P + A) - (T)P
        -> (T)A transformation to integer types.

Because it implicitly assumes that integer overflow is not allowed with all types, including unsigned int.

Hmm? But the transform is correct if overflow wraps. And it's correct if overflow is undefined as well, as (T)A is always well-defined (implementation defined) if it is a truncation.

we have no problem when the cast from (P + A) to T is a truncation, except if the add operation P + A is saturating.

Ah, we indeed look at an inner operation.

So we match the above and try to transform it to (T)P + (T)A - (T)P. That's wrong if the conversion is extending I think.

Yes, in a way. But OTOH, Eric's test case opt37.adb fails if we simply punt here.
Fortunately, with opt37.adb, P and A are signed 32-bit integers, and T is size_t (64 bit), and because the overflow of P + A causes undefined behaviour, we can assume that P + A _did_ not overflow, and therefore the transformation (T)(P + A) == (T)P + (T)A is correct (but needs a strict overflow warning), and we still can use the pattern (T)P + (T)A - (T)P -> (T)A in this case.

Ok, though that is then the only transform that uses strict-overflow semantics.

But we cannot use this transformation, as the attached test case demonstrates, when P + A is done in unsigned integers, because the result of the addition is different if it is done in unsigned int with allowed overflow, or in long without overflow.

Richard.

The attached patch fixes these regressions, and because the reasoning depends on the TYPE_OVERFLOW_UNDEFINED attribute, a strict overflow warning has to be emitted here, at least for widening conversions.

Boot-strapped and regression-tested on x86_64-linux-gnu with all languages, including Ada. OK for trunk?

+ if (!TYPE_SATURATING (TREE_TYPE (a))

this is already tested at the very beginning of the function.

We have done TYPE_SATURATING (TREE_TYPE (rhs1)) that refers to T, but I am concerned about the inner addition operation here, and if it is done in a saturating way.

Ok, a valid concern.

+ && !FLOAT_TYPE_P (TREE_TYPE (a))
+ && !FIXED_POINT_TYPE_P (TREE_TYPE (a))

likewise.

Well, maybe this cannot happen, because if we have P + A, computed in float, and T an integer type, then probably CONVERT_EXPR_CODE_P (def_code) will not match, because def_code is FIX_TRUNC_EXPR in that case?

Yes. It cannot be a FLOAT_TYPE_P here, similar for fixed-point which would use FIXED_CONVERT_EXPR. Saturating types don't seem to have a special conversion tree code.

then this should be an assertion, with a comment why this is supposed to be impossible.

Hum, well. If anything, such a check belongs in the gimple verifier in tree-cfg.c, not spread (and duplicated) in random code.
OTOH it does not hurt to check that, because A's type may be quite different than rhs1's type.

+ || (!POINTER_TYPE_P (TREE_TYPE (p))
+ && INTEGRAL_TYPE_P (TREE_TYPE (a))
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (a)))

INTEGRAL_TYPE_P are always !POINTER_TYPE_P.

We come here either because P + A is a POINTER_PLUS_EXPR or because P + A is a PLUS_EXPR.

Ah, p vs. a - misread that code.

In the first case, P's type is POINTER_TYPE_P and A's type is INTEGRAL_TYPE_P, so this should not check TYPE_OVERFLOW_UNDEFINED, but instead POINTER_TYPE_OVERFLOW_UNDEFINED.

Also with undefined pointer wraparound, we can exploit that in the same way as with signed integers. But I am concerned whether (T)A is always the same thing as (T)(void*)A. I'd say yes, if TYPE_UNSIGNED (TREE_TYPE (p)) == TYPE_UNSIGNED (TREE_TYPE (a)), or if A is a constant and it is positive.

I don't understand this last bit. If p is a pointer then a is of unsigned sizetype. sizetype's size doesn't have to match pointer size.
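
The miscompile discussed in this thread can be reproduced with ordinary fixed-width arithmetic. The following is a minimal sketch in plain C, assuming T is a 64-bit unsigned type and P and A are 32-bit unsigned values; the function names widened_difference and folded are made up for illustration and do not come from GCC.

```c
#include <stdint.h>

/* (T)(P + A) - (T)P, with the inner addition done in 32-bit unsigned
   arithmetic, which wraps modulo 2^32.  */
static uint64_t widened_difference(uint32_t p, uint32_t a)
{
  uint32_t sum = p + a;                 /* may wrap around */
  return (uint64_t)sum - (uint64_t)p;   /* subtraction done in 64 bits */
}

/* What the transformation rewrites the expression above to: (T)A.  */
static uint64_t folded(uint32_t a)
{
  return (uint64_t)a;
}
```

With p = 5 and a = 4294967295 the 32-bit sum wraps to 4, so widened_difference yields 4 - 5, that is -1 in 64-bit arithmetic, while folded yields 4294967295. That matches the pair of values in the get_str.c dump quoted at the top of the thread. When the inner addition does not wrap, for example p = 5 and a = 7, both functions agree.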
[committed] Fix OpenMP lastprivate and linear clause handling and handling of collapse > 1 simd loops
Hi!

This patch fixes various issues with handling lastprivate and linear clauses and simd collapse > 1 loops.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk and 4.9 branch.

2014-06-24  Jakub Jelinek  ja...@redhat.com

        * gimplify.c (gimplify_omp_for): For #pragma omp for simd iterator
        not mentioned in clauses use private clause if the iterator is
        declared in #pragma omp for simd, and when adding lastprivate
        instead, add it to the outer #pragma omp for too.  Diagnose
        if the variable is private in outer context.  For simd collapse > 1
        loops, replace all iterators with temporaries.
        * omp-low.c (lower_rec_input_clauses): Handle LINEAR clause the
        same even in collapse > 1 loops.
gcc/c/
        * c-parser.c (c_parser_omp_for_loop): For
        #pragma omp parallel for simd move lastprivate clause from parallel
        to for rather than simd.
gcc/cp/
        * parser.c (cp_parser_omp_for_loop): For
        #pragma omp parallel for simd move lastprivate clause from parallel
        to for rather than simd.
libgomp/
        * testsuite/libgomp.c/for-2.c: Define SC to static for
        #pragma omp for simd testing.
        * testsuite/libgomp.c/for-2.h (SC): Define if not defined.
        (N(f5), N(f6), N(f7), N(f8), N(f10), N(f12), N(f14)): Use
        SC macro.
        * testsuite/libgomp.c/simd-14.c: New test.
        * testsuite/libgomp.c/simd-15.c: New test.
        * testsuite/libgomp.c/simd-16.c: New test.
        * testsuite/libgomp.c/simd-17.c: New test.
        * testsuite/libgomp.c++/for-10.C: Define SC to static for
        #pragma omp for simd testing.
        * testsuite/libgomp.c++/simd10.C: New test.
        * testsuite/libgomp.c++/simd11.C: New test.
        * testsuite/libgomp.c++/simd12.C: New test.
        * testsuite/libgomp.c++/simd13.C: New test.
--- gcc/gimplify.c.jj   2014-06-20 23:31:49.0 +0200
+++ gcc/gimplify.c      2014-06-23 16:55:19.153679764 +0200
@@ -6810,6 +6810,31 @@ gimplify_omp_for (tree *expr_p, gimple_s
               bool lastprivate
                 = (!has_decl_expr
                    || !bitmap_bit_p (has_decl_expr, DECL_UID (decl)));
+              if (lastprivate
+                  && gimplify_omp_ctxp->outer_context
+                  && gimplify_omp_ctxp->outer_context->region_type
+                     == ORT_WORKSHARE
+                  && gimplify_omp_ctxp->outer_context->combined_loop
+                  && !gimplify_omp_ctxp->outer_context->distribute)
+                {
+                  struct gimplify_omp_ctx *outer
+                    = gimplify_omp_ctxp->outer_context;
+                  n = splay_tree_lookup (outer->variables,
+                                         (splay_tree_key) decl);
+                  if (n != NULL
+                      && (n->value & GOVD_DATA_SHARE_CLASS) == GOVD_LOCAL)
+                    lastprivate = false;
+                  else if (omp_check_private (outer, decl, false))
+                    error ("lastprivate variable %qE is private in outer "
+                           "context", DECL_NAME (decl));
+                  else
+                    {
+                      omp_add_variable (outer, decl,
+                                        GOVD_LASTPRIVATE | GOVD_SEEN);
+                      if (outer->outer_context)
+                        omp_notice_variable (outer->outer_context, decl, true);
+                    }
+                }
               c = build_omp_clause (input_location,
                                     lastprivate ? OMP_CLAUSE_LASTPRIVATE
                                                 : OMP_CLAUSE_PRIVATE);
@@ -6829,10 +6854,13 @@ gimplify_omp_for (tree *expr_p, gimple_s
 
       /* If DECL is not a gimple register, create a temporary variable to act
          as an iteration counter.  This is valid, since DECL cannot be
-         modified in the body of the loop.  */
+         modified in the body of the loop.  Similarly for any iteration vars
+         in simd with collapse > 1 where the iterator vars must be
+         lastprivate.  */
       if (orig_for_stmt != for_stmt)
         var = decl;
-      else if (!is_gimple_reg (decl))
+      else if (!is_gimple_reg (decl)
+               || (simd && TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)) > 1))
         {
           var = create_tmp_var (TREE_TYPE (decl), get_name (decl));
           TREE_OPERAND (t, 0) = var;
--- gcc/omp-low.c.jj    2014-06-20 23:31:49.0 +0200
+++ gcc/omp-low.c       2014-06-23 15:30:08.937060484 +0200
@@ -3421,24 +3421,20 @@ lower_rec_input_clauses (tree clauses, g
                                                 OMP_CLAUSE__LOOPTEMP_);
                       gcc_assert (c);
                       tree l = OMP_CLAUSE_DECL (c);
-                      if (fd->collapse == 1)
-                        {
-                          tree n1 = fd->loop.n1;
-                          tree step = fd->loop.step;
-                          tree itype = TREE_TYPE (l);
-                          if
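
As a reminder of the user-visible behavior that the lastprivate handling has to preserve, here is a minimal sketch in plain C. It is not one of the new libgomp tests; the function name last_value is hypothetical, and the example only illustrates the rule that a lastprivate variable holds the value from the sequentially last iteration once the loop is done.

```c
/* Returns the value LAST holds after the loop.  With lastprivate the
   original variable receives the value assigned in the sequentially
   last iteration, so for n > 0 the result is 2 * (n - 1), however the
   iterations are split across threads and SIMD lanes.  */
static int last_value(int n)
{
  int last = -1;
#pragma omp parallel for simd lastprivate(last)
  for (int i = 0; i < n; i++)
    last = 2 * i;
  return last;
}
```

The result is the same whether or not the code is built with -fopenmp, since a compiler that ignores the pragma simply runs the loop sequentially.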
[Patch AArch64_be] Fix some vec_concat big-endian confusions
Hi,

vec_concat ( { a, b }, { c, d }) should give a new vector { a, b, c, d }.

On big-endian aarch64 targets, we have to think carefully about what this means as we map GCC's view of endianness onto ours. GCC (for reasons I have yet to understand) likes to describe lane extracts from a vector as endianness-dependent bit-field extracts. This causes major headaches, and means we have to pretend throughout the backend that lane zero is at the high bits of a vector register.

When we have a machine instruction which zeroes the high bits of a vector register, and we want to describe it in RTL, the natural little-endian view is vec_concat ( operand, zeroes ). The reality described above implies that the correct description on big-endian systems is vec_concat ( zeroes, operand ).

This also affects arm_neon.h intrinsics. When we say vcombine (a, b) we mean that a should occupy the low 64 bits and b the high 64 bits. We therefore need to take care to swap the operands to vec_concat when we are targeting big-endian.

This patch is messy, but it gives a notable improvement in the PASS rates for an internal testsuite for Neon intrinsics.

Tested on aarch64-none-elf and aarch64_be-none-elf with no issues, but no improvements either.

OK for trunk?

Thanks,
James

---
gcc/

2014-06-20  James Greenhalgh  james.greenha...@arm.com

        * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): New.
        (move_lo_quad_internal_be_<mode>): Likewise.
        (move_lo_quad_<mode>): Convert to define_expand.
        (aarch64_simd_move_hi_quad_<mode>): Gate on BYTES_BIG_ENDIAN.
        (aarch64_simd_move_hi_quad_be_<mode>): New.
        (move_hi_quad_<mode>): Use appropriate insn for BYTES_BIG_ENDIAN.
        (aarch64_combinez<mode>): Gate on BYTES_BIG_ENDIAN.
        (aarch64_combinez_be<mode>): New.
        (aarch64_combine<mode>): Convert to define_expand.
        (aarch64_combine_internal<mode>): New.
        (aarch64_simd_combine<mode>): Remove bogus RTL description.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 6b81d811b70bd157207f7753027309442ec9e8b5..00e2206b200fd32c6df5987d7317687488e8dadd 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -942,14 +942,38 @@ (define_insn "<su><maxmin><mode>3"
   [(set_attr "type" "neon_minmax<q>")]
 )
 
-;; Move into low-half clearing high half to 0.
+;; vec_concat gives a new vector with the low elements from operand 1, and
+;; the high elements from operand 2.  That is to say, given op1 = { a, b }
+;; op2 = { c, d }, vec_concat (op1, op2) = { a, b, c, d }.
+;; What that means, is that the RTL descriptions of the below patterns
+;; need to change depending on endianness.
+
+;; Move to the low architectural bits of the register.
+;; On little-endian this is { operand, zeroes }
+;; On big-endian this is { zeroes, operand }
 
-(define_insn "move_lo_quad_<mode>"
+(define_insn "move_lo_quad_internal_<mode>"
   [(set (match_operand:VQ 0 "register_operand" "=w,w,w")
         (vec_concat:VQ
           (match_operand:<VHALF> 1 "register_operand" "w,r,r")
           (vec_duplicate:<VHALF> (const_int 0))))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
+  "@
+   dup\\t%d0, %1.d[0]
+   fmov\\t%d0, %1
+   dup\\t%d0, %1"
+  [(set_attr "type" "neon_dup<q>,f_mcr,neon_dup<q>")
+   (set_attr "simd" "yes,*,yes")
+   (set_attr "fp" "*,yes,*")
+   (set_attr "length" "4")]
+)
+
+(define_insn "move_lo_quad_internal_be_<mode>"
+  [(set (match_operand:VQ 0 "register_operand" "=w,w,w")
+        (vec_concat:VQ
+          (vec_duplicate:<VHALF> (const_int 0))
+          (match_operand:<VHALF> 1 "register_operand" "w,r,r")))]
+  "TARGET_SIMD && BYTES_BIG_ENDIAN"
   "@
    dup\\t%d0, %1.d[0]
    fmov\\t%d0, %1
@@ -960,7 +984,23 @@ (define_insn "move_lo_quad_<mode>"
    (set_attr "length" "4")]
 )
 
-;; Move into high-half.
+(define_expand "move_lo_quad_<mode>"
+  [(match_operand:VQ 0 "register_operand")
+   (match_operand:VQ 1 "register_operand")]
+  "TARGET_SIMD"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_move_lo_quad_internal_be_<mode> (operands[0], operands[1]));
+  else
+    emit_insn (gen_move_lo_quad_internal_<mode> (operands[0], operands[1]));
+  DONE;
+}
+)
+
+;; Move operand1 to the high architectural bits of the register, keeping
+;; the low architectural bits of operand2.
+;; For little-endian this is { operand2, operand1 }
+;; For big-endian this is { operand1, operand2 }
 
 (define_insn "aarch64_simd_move_hi_quad_<mode>"
   [(set (match_operand:VQ 0 "register_operand" "+w,w")
@@ -969,12 +1009,25 @@ (define_insn "aarch64_simd_move_hi_quad_
             (match_dup 0)
             (match_operand:VQ 2 "vect_par_cnst_lo_half" ""))
           (match_operand:<VHALF> 1 "register_operand" "w,r")))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
   "@
    ins\\t%0.d[1], %1.d[0]
    ins\\t%0.d[1], %1"
-  [(set_attr "type" "neon_ins")
-   (set_attr "length" "4")]
+  [(set_attr "type" "neon_ins")]
+)
+
+(define_insn "aarch64_simd_move_hi_quad_be_<mode>"
+  [(set (match_operand:VQ 0 "register_operand" "+w,w")
+        (vec_concat:VQ
+          (match_operand:<VHALF> 1 "register_operand" "w,r")
+
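
The operand swap described in this message can be modelled with a toy two-lane vector. This is a minimal sketch in plain C under loud assumptions: each half is a single 64-bit lane, lane[] uses GCC's RTL element numbering, and, following the message, RTL lane zero is taken to sit in the high architectural bits on big-endian. All names here (struct v2, arch_low, arch_high, combine, move_lo) are hypothetical and are not the backend's code.

```c
#include <stdint.h>

/* Toy vector of two 64-bit lanes, indexed by RTL element number.  */
struct v2 { uint64_t lane[2]; };

/* vec_concat of two one-lane halves: { op1, op2 } in RTL numbering.  */
static struct v2 vec_concat(uint64_t op1, uint64_t op2)
{
  struct v2 r = { { op1, op2 } };
  return r;
}

/* Architectural low 64 bits of the register.  Assumption from the
   message: on big-endian, RTL lane 0 is at the high bits.  */
static uint64_t arch_low(struct v2 v, int big_endian)
{
  return v.lane[big_endian ? 1 : 0];
}

/* Architectural high 64 bits of the register.  */
static uint64_t arch_high(struct v2 v, int big_endian)
{
  return v.lane[big_endian ? 0 : 1];
}

/* vcombine (a, b): a must land in the low 64 bits and b in the high
   64 bits, so the vec_concat operands are swapped on big-endian.  */
static struct v2 combine(uint64_t a, uint64_t b, int big_endian)
{
  return big_endian ? vec_concat(b, a) : vec_concat(a, b);
}

/* Move X to the low architectural bits and zero the high bits:
   { x, 0 } on little-endian, { 0, x } on big-endian.  */
static struct v2 move_lo(uint64_t x, int big_endian)
{
  return big_endian ? vec_concat(0, x) : vec_concat(x, 0);
}
```

In this model the architectural result is the same for both endiannesses; only the RTL operand order differs, which is the reason the patch introduces separate little-endian and big-endian patterns.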
[GSoC] remove unnecessary temporaries
Hi,

This patch attempts to generate temporaries only when required. I have changed the generation of operand names. All children of an expr-node are assigned at the expr-node itself.

Names are generated as follows at the expr-node:
  o<level><child-pos> = rhs;
where <level> is the level of the expr-node in the decision tree. This is done by dt_operand::gen_opname.

Names are referred to as follows at the child-node:
  o<parent-level><pos>
This is done by dt_operand::get_name.

To do this, I changed the type of the indexes array in dt_simplify to an array of dt_operand *, so that each element points to the decision-tree node instead of storing its level.

* genmatch.c (operand::gen_gimple_transform): Remove 2nd argument.
(predicate::gen_gimple_transform): Likewise.
(expr::gen_gimple_transform): Likewise.
(c_expr::gen_gimple_transform): Likewise.
(capture::gen_gimple_transform): Likewise.
(dt_simplify::indexes): Change type to array of dt_operand *.
(dt_simplify::dt_simplify): Change type of 3rd argument to dt_operand **.
(dt_simplify::gen_gimple): Remove 2nd argument in call to .gen_gimple_transform().
(dt_operand::get_name): New member function.
(dt_operand::gen_opname): New member function.
(dt_operand::match_dop): New member.
(dt_operand::temps): Remove.
(dt_operand::temp_count): Likewise.
(dt_operand::m_level): Likewise.
(dt_operand::dt_operand): Change type of 2nd argument to dt_operand *.
(dt_operand::gen_gimple): Call get_name for getting operand name.
(dt_operand::gen_gimple_expr_fn): Replace call to sprintf by get_name (opname).
(dt_operand::gen_gimple_expr_expr): Likewise.
(dt_operand::gen_generic_expr_expr): Likewise.
(dt_operand::gen_generic_expr_fn): Likewise.
(decision_tree::insert_operand): Change type of 3rd argument to dt_operand **.
(dt_node::append_simplify): Likewise.
Thanks and Regards, Prathamesh Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 211920) +++ gcc/genmatch.c (working copy) @@ -200,18 +200,20 @@ add_builtin (enum built_in_function code /* The predicate expression tree structure. */ +struct dt_operand; + struct operand { enum op_type { OP_PREDICATE, OP_EXPR, OP_CAPTURE, OP_C_EXPR }; operand (enum op_type type_) : type (type_) {} enum op_type type; - virtual void gen_gimple_transform (FILE *f, const char *, const char *) = 0; + virtual void gen_gimple_transform (FILE *f, const char *) = 0; }; struct predicate : public operand { predicate (const char *ident_) : operand (OP_PREDICATE), ident (ident_) {} const char *ident; - virtual void gen_gimple_transform (FILE *, const char *, const char *) { gcc_unreachable (); } + virtual void gen_gimple_transform (FILE *, const char *) { gcc_unreachable (); } }; struct e_operation { @@ -228,7 +230,7 @@ struct expr : public operand void append_op (operand *op) { ops.safe_push (op); } e_operation *operation; vecoperand * ops; - virtual void gen_gimple_transform (FILE *f, const char *, const char *); + virtual void gen_gimple_transform (FILE *f, const char *); }; struct c_expr : public operand @@ -240,7 +242,7 @@ struct c_expr : public operand veccpp_token code; unsigned nr_stmts; char *fname; - virtual void gen_gimple_transform (FILE *f, const char *, const char *); + virtual void gen_gimple_transform (FILE *f, const char *); }; struct capture : public operand @@ -249,7 +251,7 @@ struct capture : public operand : operand (OP_CAPTURE), where (where_), what (what_) {} const char *where; operand *what; - virtual void gen_gimple_transform (FILE *f, const char *, const char *); + virtual void gen_gimple_transform (FILE *f, const char *); }; @@ -305,6 +307,86 @@ struct simplify { source_location result_location; }; +struct dt_node +{ + enum dt_type { DT_NODE, DT_OPERAND, DT_TRUE, DT_MATCH, DT_SIMPLIFY }; + + enum dt_type type; + unsigned level; + vecdt_node * kids; + + dt_node 
(enum dt_type type_): type (type_), level (0), kids (vNULL) {} + + dt_node *append_node (dt_node *); + dt_node *append_op (operand *, dt_node *parent = 0, unsigned pos = 0); + dt_node *append_true_op (dt_node *parent = 0, unsigned pos = 0); + dt_node *append_match_op (dt_operand *, dt_node *parent = 0, unsigned pos = 0); + dt_node *append_simplify (simplify *, unsigned, dt_operand **); + + virtual void gen_gimple (FILE *) {} +}; + +struct dt_operand: public dt_node +{ + operand *op; + dt_operand *match_dop; + dt_operand *parent; + unsigned pos; + + dt_operand (enum dt_type type, operand *op_, dt_operand *match_dop_, dt_operand *parent_ = 0, unsigned pos_ = 0) + : dt_node (type), op (op_), match_dop (match_dop_), parent (parent_), pos (pos_) {} + + virtual void gen_gimple (FILE *); + unsigned gen_gimple_predicate (FILE *, const char *); + unsigned gen_gimple_match_op (FILE *, const char *); + + unsigned gen_gimple_expr (FILE *, const char
Re: [Committed] New testcase for conditional move with conditional compares
On 23/06/14 22:12, Andrew Pinski wrote:

Hi,
When looking at the current conditional compare patch, I find that we don't have a testcase to test that we don't ICE for the case where we have conditional compares and conditional moves where the moves are of floating point types. This patch adds that testcase to the C torture compile test to make sure we don't ICE (which I think we do currently).

FWIW, this doesn't ICE for me with aarch64-none-elf trunk.

Kyrill

Thanks,
Andrew Pinski

2014-06-23 Andrew Pinski apin...@cavium.com

* gcc.c-torture/compile/20140723-1.c: New testcase.
[PING] [PATCH, ARM] Improve code-gen for multiple shifted accumulations in array indexing
Ping~

Original posted here:
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01492.html

Thanks,
Yufeng

On 06/18/14 17:35, Yufeng Zhang wrote:

This time with patch... Apologies.

Yufeng

On 06/18/14 17:31, Yufeng Zhang wrote:

Hi,

This patch improves the code-gen of -marm in the case of two-dimensional array access.

Given the following code:

  typedef struct { int x,y,a,b; } X;

  int f7a(X p[][4], int x, int y)
  {
    return p[x][y].a;
  }

The code-gen on -O2 -marm -mcpu=cortex-a15 is currently

  mov r2, r2, asl #4
  add r1, r2, r1, asl #6
  add r0, r0, r1
  ldr r0, [r0, #8]
  bx lr

With the patch, we'll get:

  add r1, r0, r1, lsl #6
  add r2, r1, r2, lsl #4
  ldr r0, [r2, #8]
  bx lr

The -mthumb code-gen had been OK.

The patch has passed bootstrapping on cortex-a15 and the arm-none-eabi regtest, with no code-gen difference in spec2k (unfortunately).

OK for the trunk?

Thanks,
Yufeng

gcc/
* config/arm/arm.c (arm_reassoc_shifts_in_address): New declaration and new function.
(arm_legitimize_address): Call the new functions.
(thumb_legitimize_address): Prefix the declaration with static.

gcc/testsuite/
* gcc.target/arm/shifted-add-1.c: New test.
* gcc.target/arm/shifted-add-2.c: Ditto.
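As a rough illustration of the address arithmetic behind the two instruction sequences above (a sketch only, not part of the patch): with a 4-byte int, sizeof (X) is 16 and one row of four X is 64 bytes, so the address of p[x][y].a is p + (x << 6) + (y << 4) + 8. The helper name f7a_offset is made up for this example.

```c
#include <stddef.h>

typedef struct { int x, y, a, b; } X;

/* Hypothetical helper: byte offset of p[x][y].a from the start of p,
   written as the two shifted adds plus the constant load offset.
   Assumes sizeof (int) == 4, so sizeof (X) == 16 and a row of four X
   occupies 64 bytes.  */
static size_t
f7a_offset (size_t x, size_t y)
{
  return (x << 6) + (y << 4) + offsetof (X, a);
}
```

The shift amounts #6 and #4 and the load offset #8 in the assembly correspond to the three terms of the returned sum.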
Re: [PATCH AArch64 2/2] PR/60825 Make {int,uint}64x1_t in arm_neon.h a proper vector type
On Thu, Jun 19, 2014 at 01:30:32PM +0100, Alan Lawrence wrote:

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x b/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
index c71011a5157a207fe68fe814ed80658fd5e0f90f..b879fdacaa6544790e4d3ff98ca0055073d6d1d1 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
+++ b/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
@@ -9,7 +9,7 @@ main (int argc, char **argv)
   int64_t arr2[] = {1};
   int64x1_t in2 = vld1_s64 (arr2);
   int64x1_t actual = vext_s64 (in1, in2, 0);
-  if (actual != in1)
+  if (actual[0] != in1[0])
     abort ();
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x b/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
index 8d5072bf761d96ea5a95342423ae9861d05d024a..bd51e27c2156bfcaca6b26798c449369b2894c08 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
+++ b/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
@@ -9,7 +9,7 @@ main (int argc, char **argv)
   uint64_t arr2[] = {1};
   uint64x1_t in2 = vld1_u64 (arr2);
   uint64x1_t actual = vext_u64 (in1, in2, 0);
-  if (actual != in1)
+  if (actual[0] != in1[0])
     abort ();
 
   return 0;

Hi Alan,

Note that these files are also included by tests in the ARM backend, where uint64x1_t is still a typedef to a scalar type, leading to:

PASS->FAIL: gcc.target/arm/simd/vexts64_1.c (test for excess errors)

../aarch64/simd/ext_u64.x:12:23: error: subscripted value is neither array nor pointer nor vector

Thanks,
James
[PATCH PR61576]
Hi All,

Here is a fix for PR 61576. An additional test was added that the block containing the reduction statement is a predecessor of the block containing the phi, so that the correct condition is chosen.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

gcc/ChangeLog
2014-06-24 Yuri Rumyantsev ysrum...@gmail.com

PR tree-optimization/61576
* tree-if-conv.c (is_cond_scalar_reduction): Add check that basic block containing reduction statement is predecessor of phi basic block.

gcc/testsuite/ChangeLog
* gcc.dg/torture/pr61576.c: New test.
[PATCH] Fix PR61572
The following avoids increasing hard register lifetime by not sinking loads from hard registers.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I wonder if it would make sense to not treat hardregs as memory but instead rewrite them into SSA form but with SSA_NAME_OCCURS_IN_ABNORMAL_PHI set (which really tells that all SSA names of the underlying decl must be able to coalesce to the same register).

Richard.

2014-06-24 Richard Biener rguent...@suse.de

PR tree-optimization/61572
* tree-ssa-sink.c (statement_sink_location): Do not sink loads from hard registers.

* gcc.target/i386/pr61572.c: New testcase.

Index: gcc/tree-ssa-sink.c
===
--- gcc/tree-ssa-sink.c (revision 211928)
+++ gcc/tree-ssa-sink.c (working copy)
@@ -374,6 +374,12 @@ statement_sink_location (gimple stmt, ba
      nearest to commondom.  */
   if (gimple_vuse (stmt))
     {
+      /* Do not sink loads from hard registers.  */
+      if (gimple_assign_single_p (stmt)
+	  && TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
+	  && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt)))
+	return false;
+
       imm_use_iterator imm_iter;
       use_operand_p use_p;
       basic_block found = NULL;
Index: gcc/testsuite/gcc.target/i386/pr61572.c
===
--- gcc/testsuite/gcc.target/i386/pr61572.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr61572.c (working copy)
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct autofs_sb_info
+{
+  int exp_timeout;
+};
+void *f;
+int g;
+static int fn1 (struct autofs_sb_info *p1)
+{
+  int a, b;
+  a = (
+       {
+       register __typeof__(0) c
+#if defined __x86_64__
+       asm("rdx")
+#endif
+       ;
+       b = c;
+       int d;
+       __typeof__(0) e;
+       e = p1->exp_timeout / 1000;
+       switch (0)
+       default:
+       asm("" : "=a"(d) : "0"(e), ""(0));
+       d;
+       });
+  if (a)
+    return 1;
+  if (b)
+    p1->exp_timeout = 0;
+  return 0;
+}
+
+int fn2 ()
+{
+  struct autofs_sb_info *h = f;
+  switch (g)
+    {
+      case 0 ?:
+	0 : return fn1 (h);
+      default:
+	return 0;
+    }
+}
Re: [GSoC] remove unnecessary temporaries
On Tue, Jun 24, 2014 at 10:53 AM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote:

Hi,

This patch attempts to generate temporaries only when required. I have changed the generation of operand names. All children of an expr-node are assigned at the expr-node itself.

Names are generated as follows at the expr-node:
  o<level><child-pos> = rhs;
where <level> is the level of the expr-node in the decision tree. This is done by dt_operand::gen_opname.

Names are referred to as follows at the child-node:
  o<parent-level><pos>
This is done by dt_operand::get_name.

To do this, I changed the type of the indexes array in dt_simplify to an array of dt_operand *, so that each element points to the decision-tree node instead of storing its level.

Thanks, I have applied the patch but changed

+  dt_node *temp = new dt_operand (dt_node::DT_OPERAND, c->what, 0);
+  elm = decision_tree::find_node (p->kids, temp);
+  free (temp);

to

  dt_operand temp (dt_node::DT_OPERAND, c->what, 0);
  elm = decision_tree::find_node (p->kids, &temp);

Richard.

* genmatch.c (operand::gen_gimple_transform): Remove 2nd argument.
(predicate::gen_gimple_transform): Likewise.
(expr::gen_gimple_transform): Likewise.
(c_expr::gen_gimple_transform): Likewise.
(capture::gen_gimple_transform): Likewise.
(dt_simplify::indexes): Change type to array of dt_operand *.
(dt_simplify::dt_simplify): Change type of 3rd argument to dt_operand **.
(dt_simplify::gen_gimple): Remove 2nd argument in call to .gen_gimple_transform().
(dt_operand::get_name): New member function.
(dt_operand::gen_opname): New member function.
(dt_operand::match_dop): New member.
(dt_operand::temps): Remove.
(dt_operand::temp_count): Likewise.
(dt_operand::m_level): Likewise.
(dt_operand::dt_operand): Change type of 2nd argument to dt_operand *.
(dt_operand::gen_gimple): Call get_name for getting operand name.
(dt_operand::gen_gimple_expr_fn): Replace call to sprintf by get_name (opname).
(dt_operand::gen_gimple_expr_expr): Likewise.
(dt_operand::gen_generic_expr_expr): Likewise.
(dt_operand::gen_generic_expr_fn): Likewise.
(decision_tree::insert_operand): Change type of 3rd argument to dt_operand **.
(dt_node::append_simplify): Likewise.

Thanks and Regards,
Prathamesh
Re: [RFC] optimize x - y cmp 0 with undefined overflow
On Fri, Jun 20, 2014 at 11:33 AM, Eric Botcazou ebotca...@adacore.com wrote:

[I'm at last back to this...]

With [1, -x + INF] as the resulting range? But it can be bogus if x is itself equal to +INF (unlike the input range [x + 1, +INF] which is always correct)

Hmm, indeed.

so this doesn't look valid to me.

I don't see how we can get away without a +INF(OVF) here, but I can compute it in extract_range_from_binary_expr_1 if you prefer and try only [op0,op0] and [op1,op1].

Yeah, I'd prefer that.

To recap, the range of y is [x + 1, +INF] and we're trying to evaluate the range of y - x; in particular we want to prove that y - x > 0. We compute the range of y - x as [1, -x + INF] by combining [x + 1, +INF] with [x, x] and we want to massage it because compare_values will rightly choke.

If overflow is undefined, we can simply change it to [1, +INF (OVF)] and be done with that. But if overflow is defined, we need to prove that -x + INF cannot wrap around (which is true if the type is unsigned) and the simplest way to do that in the general case is to recursively invoke the machinery of extract_range_from_binary_expr_1 on range_of(-x) + INF and analyze the result. This looks much more complicated implementation-wise (and would very likely buy us nothing in practice) than my scheme, where we just approximate range_of(-x) by [-INF,+INF] and don't need to implement the recursion at all.

Hmm, indeed. Put the above into a comment so it's clear why we don't do it that way.

However I can change extract_range_from_binary_expr so that only one range among [-INF,x], [x,+INF] and [x,x] is tried instead of the 3 ranges in a row. I initially didn't want to do that because this breaks the separation between extract_range_from_binary_expr_1 and extract_range_from_binary_expr but, given their names, this is very likely acceptable. What do you think?

Yeah, that sounds good to me.

Thanks,
Richard.

--
Eric Botcazou
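To make the recap concrete, here is a small sketch (my own illustration, not VRP code) of why the symbolic upper bound -x + INF needs care. It evaluates the bound in a wider type, taking INF to be INT_MAX and assuming long long is wider than int. The function names are hypothetical.

```c
#include <limits.h>

/* Upper bound of y - x when y is in [x + 1, INT_MAX], computed in a
   wider type so that the bound itself cannot wrap.  */
static long long
diff_upper_bound (int x)
{
  return (long long) INT_MAX - (long long) x;
}

/* Nonzero if the exact value of y - x is representable in int, i.e.
   the int subtraction would not overflow.  */
static int
diff_fits_in_int (int x, int y)
{
  long long d = (long long) y - (long long) x;
  return d >= INT_MIN && d <= INT_MAX;
}
```

For negative x the bound exceeds INT_MAX, so in int arithmetic the range cannot simply be written as [1, -x + INF]; when overflow is undefined, clamping it to [1, +INF (OVF)] as described above is the cheap way out.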
Re: Fix vectorizer conditions on updating alignment
On Mon, Jun 16, 2014 at 10:24:26AM +0100, Jan Hubicka wrote:

On Fri, Jun 13, 2014 at 12:14 AM, Jan Hubicka hubi...@ucw.cz wrote:

Hi,
while updating vect_can_force_dr_alignment_p for section API I noticed the predicate is a bit confused about when it can update the alignment.

We need to check that decl_binds_to_current_def_p and, in case we compile a partition, also that the symbol is not homed in another partition. Previous code was wrong, e.g. for COMDATs, weaks or -fpic.

Also when having an alias, the only way to promote the alignment is to bump up the alignment of the target.

On the other hand the comment about DECL_IN_CONSTANT_POOL seems confused - we have no sharing across partitions. I assume it was an old hack and removed it.

I don't think that code was confused. It's because of the way we emit the constant pool and use its hash (we duplicate the decl). Do a svn blame and see for the associated PR which you now broke again I guess. It wasn't about LTO I think.

It is middle-end/50494. I have reinstated the check and also added back TREE_ASM_WRITTEN that is needed in the case of -fno-toplevel-reorder.

I will try to look into the constant pool output machinery but indeed it is a nasty problem.

Thanks,
Honza

Hi Honza,

This patch has caused a number of vector tests to start failing for -fPIC variants of the testsuite. (I'm running aarch64-none-elf, aarch64-none-linux-gnu, arm-none-eabi, arm-none-linux-gnueabihf)

The full set of tests which have started failing for me are attached, a sample failure looks like:

FAIL: gcc.dg/vect/vect-109.c scan-tree-dump-times vect "Vectorizing an unaligned access" 3

Looking at test results I see similar problems on the i32coreavx target:

https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg02097.html

Thanks,
James

Index: ChangeLog
===
--- ChangeLog (revision 211689)
+++ ChangeLog (working copy)
@@ -1,5 +1,10 @@
 2014-06-15  Jan Hubicka  hubi...@ucw.cz
 
+	* tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Check again
+	DECL_IN_CONSTANT_POOL and TREE_ASM_WRITTEN.
+
+2014-06-15  Jan Hubicka  hubi...@ucw.cz
+
 	* c-family/c-common.c (handle_tls_model_attribute): Use set_decl_tls_model.
 	* cgraph.h (struct varpool_node): Add tls_model.
 	* tree.c (decl_tls_model, set_decl_tls_model): New functions.
Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c (revision 211688)
+++ tree-vect-data-refs.c (working copy)
@@ -5317,7 +5317,13 @@ vect_can_force_dr_alignment_p (const_tre
   if (TREE_CODE (decl) != VAR_DECL)
     return false;
 
-  gcc_assert (!TREE_ASM_WRITTEN (decl));
+  /* With -fno-toplevel-reorder we may have already output the constant.  */
+  if (TREE_ASM_WRITTEN (decl))
+    return false;
+
+  /* Constant pool entries may be shared and not properly merged by LTO.  */
+  if (DECL_IN_CONSTANT_POOL (decl))
+    return false;
 
   if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
     {

Test Run By pdtltest on Mon Jun 23 20:00:58 2014
Target is aarch64-none-elf
Host is x86_64-unknown-linux-gnu

=== gcc Tests for aarch64-elf-foundation/-mcmodel=tiny/-fPIC ===

FAIL: gcc.dg/vect/vect-109.c scan-tree-dump-times vect "Vectorizing an unaligned access" 3
FAIL: gcc.dg/vect/vect-13.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-17.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-18.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-19.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-2-big-array.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-2.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-20.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-21.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-22.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-27.c scan-tree-dump-times vect "Alignment of access forced using peeling" 0
FAIL: gcc.dg/vect/vect-29.c scan-tree-dump-times vect "Alignment of access forced using peeling" 0
FAIL: gcc.dg/vect/vect-3.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-4.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-5.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-7.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-72.c scan-tree-dump-times vect "Alignment of access forced using peeling" 0
FAIL: gcc.dg/vect/vect-73-big-array.c scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL:
RE: [PATCH][MIPS] Enable load-load/store-store bonding
Hi Richard,

Thanks for the review. Please find attached the updated patch after your review comments.

Changelog:

gcc/
* config/mips/mips.md (JOIN_MODE): New mode iterator.
(join2_load_Store<JOIN_MODE:mode>): New pattern.
(join2_loadhi): Likewise.
(define_peephole2): Add peephole2 patterns to join 2 HI/SI/SF/DF-mode load-load and store-stores.
* config/mips/mips.opt (mload-store-pairs): New option.
(TARGET_LOAD_STORE_PAIRS): New macro.
* config/mips/mips.h (ENABLE_P5600_LD_ST_PAIRS): Likewise.
* config/mips/mips-protos.h (mips_load_store_bonding_p): New prototype.
* config/mips/mips.c (mips_load_store_bonding_p): New function.

The change is tested with dejagnu with additional options -mload-store-pairs and -mtune=p5600. The perf measurement is yet to finish.

We had an offline discussion based on your comment. There is an additional view on the same. Only ISAs mips32r2, mips32r3 and mips32r5 support P5600. Remaining ISAs do not support P5600. For mips32r2 (24K) and mips32r3 (micromips), load-store pairing is implemented separately, and hence, as you suggested, the P5600 Ld-ST bonding optimization should not be enabled for them. So, is it fine if I emit an error for any ISAs other than mips32r2, mips32r3 and mips32r5 when P5600 is enabled, or should the compilation continue by emitting a warning and disabling P5600?

No, the point is that we have two separate concepts: ISA and optimisation target. -mipsN and -march=N control the ISA (which instructions are available) and -mtune=M controls optimisation decisions within the constraints of that N, such as scheduling and the cost of things like multiplication and division.

E.g. you could have -mips2 -mtune=p5600 -mfix-24k: generate MIPS II-compatible code, optimise it for p5600, but make sure that 24k workarounds are used. The code would run correctly on any MIPS II-compatible processor without known errata and also on the 24k.

Ok, disabled the peephole pattern for fix-24k and micromips - to allow specific patterns to be matched.
+
+mld-st-pairing
+Target Report Var(TARGET_ENABLE_LD_ST_PAIRING)
+Enable load/store pairing

Other options are just TARGET_ + the capitalised form of the option name, so I'd prefer TARGET_LD_ST_PAIRING instead. Although ld might be misleading since it's an abbreviation for load rather than the LD instruction. Maybe -mload-store-pairs, since plurals are more common than -ing? Not sure that's a great suggestion though.

Renamed the option and corresponding macro as suggested.

Performance testing for this patch is not yet done. If the patch proves beneficial in most of the testcases (which we believe it will on P5600) we will enable this optimization by default for P5600 - in which case this option can be removed.

OK. Sending the patch for comments before performance testing is fine, but I think it'd be better to commit the patch only after the testing is done, since otherwise the patch might need to be tweaked.

I don't see any problem with keeping the option in case people want to experiment with it. I just think the patch should only go in once it can be enabled by default for p5600. I.e. the option would exist to turn off the pairing. Not having the option is fine too of course.

Yes, after perf analysis, I will share the results, and then, depending upon the impact, the decision can be made whether to make the option the default or not, and then the patch will be submitted.

We should allow pairing even without -mtune=p5600.

The load-store pairing is currently an attribute of P5600, so I have not enabled the pairing without -mtune=p5600. If need be, I can enable it without -mtune=p5600.

(define_mode_iterator JOIN_MODE [
  SI
  (DI "TARGET_64BIT")
  (SF "TARGET_HARD_FLOAT")
  (DF "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT")])

Done this change.
and then extend:

@@ -883,6 +884,8 @@
 (define_mode_attr loadx [(SF "lwxc1") (DF "ldxc1") (V2SF "ldxc1")])
 (define_mode_attr storex [(SF "swxc1") (DF "sdxc1") (V2SF "sdxc1")])
 
+(define_mode_attr insn_type [(SI "") (SF "fp") (DF "fp")])
+
 ;; The unextended ranges of the MIPS16 addiu and daddiu instructions
 ;; are different. Some forms of unextended addiu have an 8-bit immediate
 ;; field but the equivalent daddiu has only a 5-bit field.

this accordingly.

In order to allow d/f for both register classes, the pattern join2_load_store<mode> was altered a bit, which eliminated this mode iterator.

Outer (parallel ...)s are redundant in a define_insn.

Removed.

It would be better to add the mips_load_store_insns for each operand rather than multiplying one of them by 2. Or see the next bit for an alternative.

Using the alternative method as you suggested, so this change is not needed.

Please instead add HI to the define_mode_iterator so that we can use the same peephole and define_insn.

Added HI in the mode iterator to eliminate join2_storehi
[patch] Do not generate useless integral conversions
Hi,

https://gcc.gnu.org/ml/gcc-patches/2012-03/msg00491.html changed the old signed_type_for/unsigned_type_for functions and made them always return an integer type, whereas they would previously leave integral types unchanged. I don't see any justification for the latter and this has the annoying effect of generating useless integral conversions in convert.c, for example between boolean types and integer types of the same precision.

The attached patch restores the old behavior for them. Bootstrapped/regtested on x86_64-suse-linux, OK for the mainline?

2014-06-24 Eric Botcazou ebotca...@adacore.com

* tree.c (signed_or_unsigned_type_for): Treat integral types equally.

--
Eric Botcazou

Index: tree.c
===
--- tree.c (revision 211927)
+++ tree.c (working copy)
@@ -10684,14 +10684,14 @@ int_cst_value (const_tree x)
   return val;
 }
 
-/* If TYPE is an integral or pointer type, return an integer type with
+/* If TYPE is an integral or pointer type, return an integral type with
    the same precision which is unsigned iff UNSIGNEDP is true, or itself
-   if TYPE is already an integer type of signedness UNSIGNEDP.  */
+   if TYPE is already an integral type of signedness UNSIGNEDP.  */
 
 tree
 signed_or_unsigned_type_for (int unsignedp, tree type)
 {
-  if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type) == unsignedp)
+  if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type) == unsignedp)
     return type;
 
   if (TREE_CODE (type) == VECTOR_TYPE)
@@ -10713,9 +10713,9 @@ signed_or_unsigned_type_for (int unsigne
   return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp);
 }
 
-/* If TYPE is an integral or pointer type, return an integer type with
+/* If TYPE is an integral or pointer type, return an integral type with
    the same precision which is unsigned, or itself if TYPE is already an
-   unsigned integer type.  */
+   unsigned integral type.  */
 
 tree
 unsigned_type_for (tree type)
@@ -10723,9 +10723,9 @@ unsigned_type_for (tree type)
   return signed_or_unsigned_type_for (1, type);
 }
 
-/* If TYPE is an integral or pointer type, return an integer type with
+/* If TYPE is an integral or pointer type, return an integral type with
    the same precision which is signed, or itself if TYPE is already a
-   signed integer type.  */
+   signed integral type.  */
 
 tree
 signed_type_for (tree type)
Re: [PATCH] Detect a pack-unpack pattern in GCC vectorizer and optimize it.
On Sat, May 3, 2014 at 2:39 AM, Cong Hou co...@google.com wrote:

On Mon, Apr 28, 2014 at 4:04 AM, Richard Biener rguent...@suse.de wrote:

On Thu, 24 Apr 2014, Cong Hou wrote:

Given the following loop:

  int a[N];
  short b[N*2];

  for (int i = 0; i < N; ++i)
    a[i] = b[i*2];

After being vectorized, the access to b[i*2] will be compiled into several packing statements, while the type promotion from short to int will be compiled into several unpacking statements. With this patch, each pair of pack/unpack statements will be replaced by less expensive statements (with shift or bit-and operations).

On x86_64, the loop above will be compiled into the following assembly (with -O2 -ftree-vectorize):

  movdqu 0x10(%rcx),%xmm3
  movdqu -0x20(%rcx),%xmm0
  movdqa %xmm0,%xmm2
  punpcklwd %xmm3,%xmm0
  punpckhwd %xmm3,%xmm2
  movdqa %xmm0,%xmm3
  punpcklwd %xmm2,%xmm0
  punpckhwd %xmm2,%xmm3
  movdqa %xmm1,%xmm2
  punpcklwd %xmm3,%xmm0
  pcmpgtw %xmm0,%xmm2
  movdqa %xmm0,%xmm3
  punpckhwd %xmm2,%xmm0
  punpcklwd %xmm2,%xmm3
  movups %xmm0,-0x10(%rdx)
  movups %xmm3,-0x20(%rdx)

With this patch, the generated assembly is shown below:

  movdqu 0x10(%rcx),%xmm0
  movdqu -0x20(%rcx),%xmm1
  pslld $0x10,%xmm0
  psrad $0x10,%xmm0
  pslld $0x10,%xmm1
  movups %xmm0,-0x10(%rdx)
  psrad $0x10,%xmm1
  movups %xmm1,-0x20(%rdx)

Bootstrapped and tested on x86-64. OK for trunk?

This is an odd place to implement such a transform. Also whether it is faster or not depends on the exact ISA you target - for example ppc has constraints on the maximum number of shifts carried out in parallel and the above has 4 in very short succession. Esp. for the sign-extend path.

Thank you for the information about ppc. If this is an issue, I think we can do it in a target dependent way.

So this looks more like an opportunity for a post-vectorizer transform on RTL or for the vectorizer special-casing widening loads with a vectorizer pattern.

I am not sure if the RTL transform is more difficult to implement. I prefer the widening loads method, which can be detected in a pattern recognizer. The target related issue will be resolved by only expanding the widening load on those targets where this pattern is beneficial. But this requires new tree operations to be defined. What is your suggestion?

I apologize for the delayed reply.

Likewise ;)

I suggest implementing this optimization in vector lowering in tree-vect-generic.c. This sees for your example

  vect__5.7_32 = MEM[symbol: b, index: ivtmp.15_13, offset: 0B];
  vect__5.8_34 = MEM[symbol: b, index: ivtmp.15_13, offset: 16B];
  vect_perm_even_35 = VEC_PERM_EXPR <vect__5.7_32, vect__5.8_34, { 0, 2, 4, 6, 8, 10, 12, 14 }>;
  vect__6.9_37 = [vec_unpack_lo_expr] vect_perm_even_35;
  vect__6.9_38 = [vec_unpack_hi_expr] vect_perm_even_35;

where you can apply the pattern matching and transform (after checking with the target, of course).

Richard.

thanks,
Cong

Richard.
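For readers following the pslld/psrad sequence, a scalar sketch of what one 32-bit lane goes through may help. This is my own illustration under stated assumptions, not code from the patch: each lane holds b[2*i] in its low 16 bits and b[2*i+1] in its high 16 bits (as a little-endian vector load would place them), and shifting left then arithmetically right by 16 sign-extends the even element. The names make_lane and widen_even are hypothetical.

```c
#include <stdint.h>

/* Build one 32-bit lane the way a little-endian load of two adjacent
   shorts would: even element in the low half, odd in the high half.  */
static uint32_t
make_lane (int16_t even, int16_t odd)
{
  return (uint32_t) (uint16_t) even | ((uint32_t) (uint16_t) odd << 16);
}

/* Scalar analogue of pslld $16 followed by psrad $16: discard the odd
   element and sign-extend the even one.  The unsigned-to-signed
   conversion and the right shift of a negative value are
   implementation-defined in ISO C; GCC defines them as modulo
   conversion and arithmetic shift, which this sketch assumes.  */
static int32_t
widen_even (uint32_t lane)
{
  return (int32_t) (lane << 16) >> 16;
}
```

Applied lane by lane, widen_even (make_lane (b[2*i], b[2*i+1])) yields the value stored to a[i] in the loop above.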
Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
On Tue, Dec 3, 2013 at 2:06 AM, Cong Hou co...@google.com wrote:

Hi Richard

Could you please take a look at this patch and see if it is ready for the trunk? The patch is pasted as a text file here again.

(found it)

The patch is ok for trunk. (please consider re-testing before you commit)

Thanks,
Richard.

Thank you very much!

Cong

On Mon, Nov 11, 2013 at 11:25 AM, Cong Hou co...@google.com wrote:

Hi James

Sorry for the late reply.

On Fri, Nov 8, 2013 at 2:55 AM, James Greenhalgh james.greenha...@arm.com wrote:

On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou co...@google.com wrote:

Thank you for your detailed explanation.

Once GCC detects a reduction operation, it will automatically accumulate all elements in the vector after the loop. In the loop the reduction variable is always a vector whose elements are reductions of corresponding values from other vectors. Therefore in your case the only instruction you need to generate is:

  VABAL ops[3], ops[1], ops[2]

It is OK if you accumulate the elements into one in the vector inside of the loop (if one instruction can do this), but you have to make sure other elements in the vector remain zero so that the final result is correct.

If you are confused about the documentation, check the one for udot_prod (just above usad in md.texi), as it has very similar behavior to usad. Actually I copied the text from there and made some changes. As those two instruction patterns are both for vectorization, their behavior should not be difficult to explain.

If you have more questions or think that the documentation is still improper please let me know.

Hi Cong,

Thanks for your reply.

I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and DOT_PROD_EXPR and I see that the same ambiguity exists for DOT_PROD_EXPR.
Can you please add a note in your tree.def that SAD_EXPR, like DOT_PROD_EXPR, can be expanded as either:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = PLUS_EXPR (tmp2, arg3)

or:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = WIDEN_SUM_EXPR (tmp2, arg3)

Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning a value of the same (widened) type as arg3.

I have added it, although we currently don't have WIDEN_MINUS_EXPR (I mentioned it in tree.def).

Also, while looking for the history of DOT_PROD_EXPR I spotted this patch:

[autovect] [patch] detect mult-hi and sad patterns
http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html

I wonder what the reason was for that patch to be dropped?

It has been 8 years.. I have no idea why this patch was not accepted in the end. There is not even a reply in that thread. But I believe the SAD pattern is very important to be recognized. ARM also provides instructions for it.

Thank you for your comment again!

thanks,
Cong

Thanks,
James
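For reference, the scalar meaning of the first expansion above can be sketched as follows. This is an illustrative reading only, not GCC code: the subtraction is done in a widened signed type, its absolute value is taken, and the result is added to the accumulator. The function name sad_u8 and the choice of 8-bit inputs with a 32-bit accumulator are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a sum of absolute differences over N elements,
   starting from accumulator ACC.  */
static uint32_t
sad_u8 (const uint8_t *a, const uint8_t *b, size_t n, uint32_t acc)
{
  for (size_t i = 0; i < n; ++i)
    {
      /* Widening, signed subtraction: cannot overflow for 8-bit inputs.  */
      int32_t diff = (int32_t) a[i] - (int32_t) b[i];
      /* Absolute value, then accumulate.  */
      acc += (uint32_t) (diff < 0 ? -diff : diff);
    }
  return acc;
}
```

A vectorized loop keeps one such accumulator per vector element and sums the elements after the loop, which is the reduction behaviour described earlier in the thread.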
Re: [PATCH] Fix GDB PR15559 (inferior calls using thiscall calling convention)
On Fri, 9 May 2014 17:33:41 +0100 Julian Brown jul...@codesourcery.com wrote:
On Wed, 7 May 2014 09:41:27 -0600 Tom Tromey tro...@redhat.com wrote:

Tom> The usual approach is some appropriate text somewhere on the
Tom> GCC wiki (though I suppose a note in the mail archives would
Tom> do in a pinch) along with a URL in a comment in the
Tom> appropriate file (dwarf2.h or dwarf2.def).
Tom> Could you please do that?

Julian> How's this, as a first attempt?
Julian> http://gcc.gnu.org/wiki/GNUDwarfExtensions

Sorry I didn't reply to this sooner. That page looks great. Thanks for doing this.

Thanks! Now, does anyone want to review the patch itself? :-)

Ping?

Julian
Re: RTABI half-precision conversion functions (ping)
On Thu, 29 May 2014 11:16:52 +0100 Julian Brown jul...@codesourcery.com wrote:
On Thu, 19 Jul 2012 14:47:54 +0100 Julian Brown jul...@codesourcery.com wrote:
On Thu, 19 Jul 2012 13:54:57 +0100 Paul Brook p...@codesourcery.com wrote:

But, that means EABI-conformant callers are also perfectly entitled to sign-extend half-float values before calling our helper functions (although GCC itself won't do that). Using unsigned int and taking care to only examine the low-order bits of the value in the helper function itself serves to fix the latent bug, is compatible with existing code, allows us to be conformant with the eabi, and allows use of aliases to make the __gnu and __aeabi functions the same.

As long as LTO never sees this mismatch we should be fine :-) AFAIK we don't currently have any way of expressing the actual ABI.

Let's not worry about that for now :-).

The patch no longer applied as-is, so I've updated it (attached, re-tested). Note that there are no longer any target-independent changes (though I'm not certain that the symbol versions are still correct). OK to apply?

I think this deserves a comment in the source. Otherwise it's liable to get fixed in the future :-) Something along the lines of

While the EABI describes the arguments to the half-float helper routines as 'short', it does not require that they be extended to full register width. The normal ABI requires that the caller sign/zero extend short values to 32 bit. We use unsigned int arguments to prevent gcc making assumptions about the high half of the register.

Here's a version with an explanatory comment. I also fixed a couple of minor formatting nits I noticed (they don't upset the diff too much, I don't think).

It looks like this one got forgotten about. Ping? Context:
https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00902.html
https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00912.html

This is an EABI-conformance fix. Ping?

Julian
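The point about taking an unsigned int and looking only at the low-order bits can be sketched as follows. This is not the libgcc implementation: the name half_bits_to_float is hypothetical, and the sketch assumes that float is IEEE 754 binary32 and that unsigned int has at least 32 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the libgcc code: convert IEEE 754 binary16
   bits held in the low half of ARG to a float.  */
float
half_bits_to_float (unsigned int arg)
{
  /* Ignore whatever the caller left above bit 15 (sign- or
     zero-extension, or garbage).  */
  uint32_t h = arg & 0xffffu;
  uint32_t sign = (h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1fu;
  uint32_t mant = h & 0x3ffu;
  uint32_t bits;
  float f;

  if (exp == 0x1f)
    /* Infinity or NaN.  */
    bits = sign | 0x7f800000u | (mant << 13);
  else if (exp != 0)
    /* Normal number: rebias the exponent from 15 to 127.  */
    bits = sign | ((exp + 112) << 23) | (mant << 13);
  else if (mant == 0)
    /* Signed zero.  */
    bits = sign;
  else
    {
      /* Subnormal half: normalize the mantissa.  */
      uint32_t shift = 0;
      while ((mant & 0x400u) == 0)
        {
          mant <<= 1;
          shift++;
        }
      mant &= 0x3ffu;
      bits = sign | ((113 - shift) << 23) | (mant << 13);
    }

  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Because of the mask on entry, a caller that sign-extended the 16-bit value (for example passing 0xffffc000 for the half value 0xc000) gets the same result as one that zero-extended it.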
Re: Handle MULTILIB_REUSE in auto-generated SYSROOT_SUFFIX_SPEC macro
On Thu, 5 Jun 2014 20:23:27 +0100 Julian Brown jul...@codesourcery.com wrote: Hi, The print-sysroot-suffix.sh script that can be used (via the t-sysroot-suffix makefile fragment) to auto-generate the SYSROOT_SUFFIX_SPEC macro for non-trivial multilib setups does not take into account the MULTILIB_REUSE target fragment variable. I'm not sure of a way to demonstrate how this causes problems with a vanilla tree, but consider the attached patch (arm-sysroot-mlib-arrangement-1.diff) intended to create a compiler with three multilibs: Ping? (Note that no in-tree targets use both print-sysroot-suffix.sh and MULTILIB_REUSE, AFAICT, so this patch is mostly useful to 3rd-party integrators.) Julian
Re: [PATCH, ARM] Don't use NEON for autovectorization in big-endian mode
On Mon, 16 Jun 2014 12:42:36 +0100 Julian Brown jul...@codesourcery.com wrote: Hi, As discussed several times previously, support for NEON in ARM big-endian mode is quite broken because of differing assumptions about lane ordering made by the ARM EABI and the set of NEON intrinsics on the one hand, and the vectorizer on the other. Fixing this properly would involve quite a large overhaul of the NEON backend implementation, and such an overhaul does not appear to be forthcoming. Unfortunately this leaves big-endian mode with a problem: even if the user is not explicitly using NEON intrinsics, compiling with NEON and the vectorizer enabled (i.e. -O3) can quite easily lead to incorrect code being generated. This is the patch we've been using internally for a while to work around the problem. When applied: Ping? Julian
Re: [GSoC][match-and-simplify] factor expr check in gimple-match-head.c
oops, didn't set plain text mode. sorry for re-post. On Tue, Jun 24, 2014 at 4:59 PM, Prathamesh Kulkarni bilbotheelffri...@gmail.com wrote: This patch factors out checking expression-code in gimple-match-head.c * gimple-match-head.c (check_gimple_assign): New function. (check_gimple_assign_convert): Likewise. (check_gimple_call_builtin): Likewise. * genmatch.c (dt_operand::gen_gimple_expr_expr): Add argument const char *. Generate call to gimple_assign_check or check_gimple_assign_convert. (dt_operand::gen_gimple_expr_fn): Add argument const char *. Generate call to check_gimple_call_builtin. (decision_tree::gen_gimple): Generate definition of def_stmt. Thanks and Regards, Prathamesh. Index: gcc/genmatch.c === --- gcc/genmatch.c (revision 211932) +++ gcc/genmatch.c (working copy) @@ -341,8 +341,8 @@ struct dt_operand: public dt_node unsigned gen_gimple_match_op (FILE *, const char *); unsigned gen_gimple_expr (FILE *, const char *); - void gen_gimple_expr_expr (FILE *, expr *); - void gen_gimple_expr_fn (FILE *, expr *); + void gen_gimple_expr_expr (FILE *, expr *, const char *); + void gen_gimple_expr_fn (FILE *, expr *, const char *); unsigned gen_generic_expr (FILE *, const char *); void gen_generic_expr_expr (FILE *, expr *, const char *); @@ -898,12 +898,12 @@ dt_operand::gen_gimple_match_op (FILE *f } void -dt_operand::gen_gimple_expr_fn (FILE *f, expr *e) +dt_operand::gen_gimple_expr_fn (FILE *f, expr *e, const char *opname) { unsigned n_ops = e-ops.length (); fn_id *op = static_cast fn_id * (e-operation-op); - fprintf (f, if (gimple_call_builtin_p (def_stmt, %s))\n, op-id); + fprintf (f, if (check_gimple_call_builtin (%s, def_stmt, %s))\n, opname, op-id); fprintf (f, {\n); for (unsigned i = 0; i n_ops; ++i) @@ -918,18 +918,16 @@ dt_operand::gen_gimple_expr_fn (FILE *f, } void -dt_operand::gen_gimple_expr_expr (FILE *f, expr *e) +dt_operand::gen_gimple_expr_expr (FILE *f, expr *e, const char *opname) { unsigned n_ops = e-ops.length (); operator_id *op_id = 
static_cast operator_id * (e-operation-op); if (op_id-code == NOP_EXPR || op_id-code == CONVERT_EXPR) -fprintf (f, if (is_gimple_assign (def_stmt)\n - CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt; +fprintf (f, if (check_gimple_assign_convert (%s, def_stmt))\n, opname); else -fprintf (f, if (is_gimple_assign (def_stmt) gimple_assign_rhs_code (def_stmt) == %s)\n, op_id-id); - +fprintf (f, if (check_gimple_assign (%s, def_stmt, %s))\n, opname, op_id-id); fprintf (f, {\n); for (unsigned i = 0; i n_ops; ++i) @@ -948,13 +946,9 @@ dt_operand::gen_gimple_expr (FILE *f, co { expr *e = static_castexpr * (op); - fprintf (f, if (TREE_CODE (%s) == SSA_NAME)\n, opname); - fprintf (f, {\n); - - fprintf (f, gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n, opname); - (e-operation-op-kind == id_base::CODE) ? gen_gimple_expr_expr (f, e) : gen_gimple_expr_fn (f, e); - - return e-ops.length () + 2; + (e-operation-op-kind == id_base::CODE) ? gen_gimple_expr_expr (f, e, opname) + : gen_gimple_expr_fn (f, e, opname); + return e-ops.length () + 1; } @@ -1140,6 +1134,7 @@ decision_tree::gen_gimple (FILE *f) write_fn_prototype (f, 3); fprintf (f, {\n); + fprintf (f, gimple def_stmt;\n); for (unsigned i = 0; i root-kids.length (); i++) { Index: gcc/gimple-match-head.c === --- gcc/gimple-match-head.c (revision 211932) +++ gcc/gimple-match-head.c (working copy) @@ -713,3 +713,33 @@ do_valueize (tree (*valueize)(tree), tre return valueize (op); return op; } + +static bool +check_gimple_assign (tree op, gimple def_stmt, enum tree_code code) +{ + if (TREE_CODE (op) != SSA_NAME) +return false; + + def_stmt = SSA_NAME_DEF_STMT (op); + return is_gimple_assign (def_stmt) gimple_assign_rhs_code (def_stmt) == code; +} + +static bool +check_gimple_assign_convert (tree op, gimple def_stmt) +{ + if (TREE_CODE (op) != SSA_NAME) +return false; + + def_stmt = SSA_NAME_DEF_STMT (op); + return is_gimple_assign (def_stmt) CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)); +} + +static bool 
+check_gimple_call_builtin (tree op, gimple def_stmt, enum built_in_function fn) +{ + if (TREE_CODE (op) != SSA_NAME) +return false; + + def_stmt = SSA_NAME_DEF_STMT (op); + return gimple_call_builtin_p (def_stmt, fn); +}
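One detail worth keeping in mind with helpers of this shape: a parameter passed by value is a copy, so an assignment to it inside the helper is not visible to the caller. If the generated code needs the defining statement afterwards, the helper has to hand it back explicitly. The sketch below shows one way to do that with a pointer out-parameter. The types toy_stmt and toy_op and the function toy_check_assign are hypothetical stand-ins, not GCC's tree or gimple types and not the interface used by the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for an operand and its defining statement
   (hypothetical, for illustration only).  */
struct toy_stmt
{
  bool is_assign;
  int rhs_code;
};

struct toy_op
{
  bool is_ssa_name;
  struct toy_stmt *def;
};

/* Return true if OP is an SSA name defined by an assignment whose
   right-hand side has code CODE.  Whenever OP is an SSA name, store
   its defining statement in *DEF_STMT so the caller can use it.  */
bool
toy_check_assign (const struct toy_op *op, struct toy_stmt **def_stmt,
                  int code)
{
  if (!op->is_ssa_name)
    return false;

  *def_stmt = op->def;
  return (*def_stmt)->is_assign && (*def_stmt)->rhs_code == code;
}
```

In C++ the same effect could be had with a reference parameter; the pointer form is used here only to keep the sketch in plain C.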
Re: [PATCH]Enable elimination of IV use with unsigned type candidate
On Mon, Jun 23, 2014 at 11:49 AM, Bin Cheng bin.ch...@arm.com wrote: Hi, For below simplified case: #define LEN (32000) __attribute__((aligned(16))) float a[LEN],b[LEN]; int foo (int M) { for (int i = 0; i M; i++) a[i+M] = a[i] + b[i]; } Compiling it with command like: $ aarch64-elf-gcc -O3 -S foo.c -o foo.S -std=c99 The assembly code of vectorized loop is in below form: mov x1, 0 mov w2, 0 .L4: ldr q0, [x1, x3] add w2, w2, 1 ldr q1, [x1, x4] cmp w2, w5 faddv0.4s, v0.4s, v1.4s str q0, [x6, x1] add x1, x1, 16 bcc .L4 Induction variable w2 is unnecessary and can be eliminated with x1. This is safe because X1 will never overflow during all iterations of loop. The optimal assembly should be like: mov x1, 0 .L4: ldr q0, [x1, x2] ldr q1, [x1, x4] faddv0.4s, v0.4s, v1.4s str q0, [x5, x1] add x1, x1, 16 cmp x1, x3 bcc .L4 This case can be handled if we do more complex overflow check on unsigned type in function iv_elimination_compare_lt. Also there is another blocker for the transformation, function number_of_iterations_lt calls fold_build2 to build folded form of may_be_zero, while iv_elimination_compare_lt only handles simple form tree expressions. It's possible to have iv_elimination_compare_lt to do some undo transformation on may_be_zero, but I found it's difficult for cases involving signed/unsigned conversion like case loop-41.c. Since I think there is no obvious benefit to fold may_be_zero here (somehow because the two operands are already in folded forms), this patch just calls build2_loc instead. But it may fold to true/false, no? This optimization is picked up by patch B, but patch A is necessary since the check in iv_elimination_compare_lt of two aff_trees isn't enough when two different types (one in signed type, the other in unsigned) involved. I have to use tree comparison here instead. Considering below simple case: Analyzing # of iterations of loop 5 exit condition [1, + , 1](no_overflow) = i_1 bounds on difference of bases: -3 ... 
1 result: zero if i_1 + 1 1 # of iterations (unsigned int) i_1, bounded by 2 number of iterations (unsigned int) i_1; zero if i_1 + 1 1 use 0 compare in statement if (S.7_9 i_1) at position type integer(kind=4) base 1 step 1 is a biv related candidates candidate 0 (important) var_before ivtmp.28 var_after ivtmp.28 incremented at end type unsigned int base 0 step 1 When GCC trying to eliminate use 0 with cand 0, the miscellaneous trees in iv_elimination_compare_lt are like below with i_1 of signed type: B: i_1 + 1 A: 0 niter-niter: (unsigned int)i_1 Apparently, (B-A-1) is i_1, which doesn't equal to (unsigned int)i_1. Without this patch, it is considered equal to each other. just looking at this part. Do you have a testcase that exhibits a difference when just applying patch A? So I can have a look here? From the code in iv_elimination_compare_lt I can't figure why we'd end up with i_1 instead of (unsigned int)i_1 as we convert to a common type. I suppose the issue may be that at tree_to_aff_combination time we strip all nops with STRIP_NOPS but when operating on -rest via convert/scale or add we do not strip them again. But then 'nit' should be i_1, not (unsigned int)i_1. So the analysis above really doesn't look correct. Just to make sure we don't paper over an issue in tree-affine.c. Thus - testcase? On x86 we don't run into this place in iv_elimination_compare_lt (on an unpatched tree). CCing Zdenek for the meat of patch B. Thanks, Richard. Note that the induction variable IS necessary on 32 bit systems since otherwise there is type overflow. These two patch fix the mentioned problem. They pass bootstrap and regression test on x86_64/x86/aarch64/arm, so any comments? Thanks, bin PATCH A) 2014-06-23 Bin Cheng bin.ch...@arm.com * tree-ssa-loop-ivopts.c (iv_elimination_compare_lt): Check number of iteration using tree comparison. PATCH B) 2014-06-23 Bin Cheng bin.ch...@arm.com * tree-ssa-loop-niter.c (number_of_iterations_lt): Build unfolded form of may_be_zero. 
* tree-ssa-loop-ivopts.c (iv_nowrap_period) (nowrap_cand_for_loop_niter_p): New functions. (period_greater_niter_exit): New function refactored from may_eliminate_iv. (iv_elimination_compare_lt): New parameter. Relax overflow check. Handle special forms may_be_zero expression. (may_eliminate_iv): Call period_greater_niter_exit. Pass new argument for iv_elimination_compare_lt. gcc/testsuite/ChangeLog 2014-06-23 Bin Cheng bin.ch...@arm.com * gcc.dg/tree-ssa/loop-40.c: New
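At source level, the elimination discussed in this thread corresponds roughly to the rewrite sketched below: the first function keeps a counter only to control the loop exit, while the second tests the byte offset directly. The function names are hypothetical, and the second form is only valid under the assumption that n * sizeof (unsigned int) does not wrap in size_t, which is the kind of no-overflow condition the patch has to prove.

```c
#include <stddef.h>

/* Before: counter I controls the exit, OFF is a second induction
   variable that advances in step with it.  */
unsigned int
sum_with_counter (const unsigned int *p, unsigned int n)
{
  unsigned int sum = 0;
  size_t off = 0;

  for (unsigned int i = 0; i < n; i++)
    {
      sum += *(const unsigned int *) ((const char *) p + off);
      off += sizeof *p;
    }
  return sum;
}

/* After: the exit test is expressed in terms of OFF, so the counter
   is gone.  Assumes N * sizeof *P does not overflow size_t.  */
unsigned int
sum_with_offset (const unsigned int *p, unsigned int n)
{
  unsigned int sum = 0;
  size_t bound = (size_t) n * sizeof *p;

  for (size_t off = 0; off < bound; off += sizeof *p)
    sum += *(const unsigned int *) ((const char *) p + off);
  return sum;
}
```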
[Obvious][AArch64 testsuite] Add --save-temps to singleton_intrinsics_1.c test.
Scan-assembler test was running with dg-do assemble and not generating any assembler to scan.

Tested on aarch64-none-elf and aarch64_be-none-elf; 40 * UNRESOLVED->PASS tests in singleton_intrinsics_1.c.

Patch (below) committed as r211934.

--Alan

--
Index: gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c	(revision 211933)
+++ gcc/testsuite/gcc.target/aarch64/singleton_intrinsics_1.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O2 -dp" } */
+/* { dg-options "-O2 -dp --save-temps" } */
 
 /* Test the [u]int64x1_t intrinsics.  */
 
@@ -400,3 +400,6 @@
 {
   return vsri_n_u64 (a, b, 9);
 }
+
+/* { dg-final { cleanup-saved-temps } } */
+
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
Braden, Did you have a specific test case that causes this breakage? I have a feeling that if we're missing base-link nodes in one place, we'll miss them in others too. Andrew On Tue, Jun 17, 2014 at 4:54 AM, Braden Obrzut ad...@maniacsvault.net wrote: cp_maybe_constrained_type_specifier asserted that the decl passed in would be of type OVERLOAD, however a clean build of the compiler was broken since it could also be a BASELINK. I'm not entirely sure when this is the case, except that it seems to happen with class member templates as it also caused a test case in my next patch to fail. The solution is to check for a BASELINK and extract the functions from it. The possibility of decl being a BASELINK is asserted near the call in cp_parser_template_id (cp_maybe_partial_concept_id just calls the function in question at this time). 2014-06-17 Braden Obrzut ad...@maniacsvault.net * gcc/cp/parser.c (cp_maybe_constrained_type_specifier): Fix assertion failure if baselink was passed in as decl.
[PATCH 0/2] Zext/sext elimination using value range
Hi,

This patch series (2) implements zext/sext extension elimination using value ranges stored in SSA. Implementation is what was suggested in the thread https://gcc.gnu.org/ml/gcc/2014-05/msg00213.html. I have broken this into:

Patch 1 - Changes to store zero and sign extended promotions (SRP_SIGNED_AND_UNSIGNED) in SUBREG with SUBREG_PROMOTED_VAR_P.
Patch 2 - Enables elimination of zext/sext extensions by checking the value range.

Test cases that motivated this and the asm difference with the patch are:

1.
short foo(unsigned char c)
{
  c = c & (unsigned char)0x0F;
  if( c > 7 )
    return((short)(c - 5));
  else
    return(( short )c);
}

 	and	r0, r0, #15
 	cmp	r0, #7
 	subhi	r0, r0, #5
-	uxth	r0, r0
-	sxth	r0, r0
 	bx	lr

2.
unsigned short crc2(unsigned short crc, unsigned char data)
{
  unsigned char i, x16, carry;
  for (i = 0; i < 8; i++)
    {
      x16 = (data ^ crc) & 1;
      data >>= 1;
      if (x16 == 1)
        {
          crc ^= 0x4002;
          carry = 1;
        }
      else
        carry = 0;
      crc >>= 1;
      if (carry)
        crc |= 0x8000;
      else
        crc &= 0x7fff;
    }
  return crc;
}

-	mov	r3, #8
+	mov	r2, #8
 .L3:
-	eor	r2, r1, r0
-	sub	r3, r3, #1
-	tst	r2, #1
+	eor	r3, r1, r0
 	mov	r1, r1, lsr #1
+	tst	r3, #1
 	eorne	r0, r0, #16384
 	moveq	r0, r0, lsr #1
 	eorne	r0, r0, #2
 	movne	r0, r0, lsr #1
 	orrne	r0, r0, #32768
-	ands	r3, r3, #255
+	subs	r2, r2, #1
 	bne	.L3
 	bx	lr

Tested both patches on x86_64-unknown-linux-gnu and arm-none-linux-gnueabi with no new regressions. Is this OK?

Thanks,
Kugan
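Why the uxth/sxth pair in test case 1 of the message above is redundant can be checked with a small C sketch. After the mask the value lies in [0, 15], and every value the function can return is non-negative and well below 2^15, so zero-extending and sign-extending its low 16 bits give the same register contents. The names foo_case1 and extensions_agree are hypothetical and exist only for this illustration; the sketch restates the test case with fixed-width types rather than modelling what GCC does internally.

```c
#include <stdint.h>

/* Test case 1 from the message, restated with fixed-width types.  */
int16_t
foo_case1 (uint8_t c)
{
  c = c & 0x0F;
  if (c > 7)
    return (int16_t) (c - 5);
  else
    return (int16_t) c;
}

/* Return 1 if zero-extending and sign-extending the low 16 bits of V
   produce the same 32-bit value, 0 otherwise.  */
int
extensions_agree (int32_t v)
{
  uint32_t zext = (uint32_t) v & 0xffffu;
  uint32_t sext = (zext & 0x8000u) ? (zext | 0xffff0000u) : zext;
  return zext == sext;
}
```

Looping over all 256 inputs of foo_case1 and calling extensions_agree on each result is enough to confirm the claim for this function. A negative result such as -1 is an example where the two extensions differ.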
[PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
Changes the SUBREG flags to be able to set promoted for sign (SRP_SIGNED), unsigned (SRP_UNSIGNED), sign and unsigned (SRP_SIGNED_AND_UNSIGNED) in SUBREG_PROMOTED_VAR_P.

Thanks,
Kugan

gcc/

2014-06-24 Kugan Vivekanandarajah kug...@linaro.org

* gcc/calls.c (precompute_arguments): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
(expand_call) : Likewise.
* gcc/expr.c (convert_move) : Use new SUBREG_CHECK_PROMOTED_SIGN instead of SUBREG_PROMOTED_UNSIGNED_P.
(convert_modes) : Likewise.
(store_expr) : Likewise.
(expand_expr_real_1) : Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
* gcc/function.c (assign_parm_setup_reg) : Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
* gcc/ifcvt.c (noce_emit_cmove) : Updated to use SUBREG_PROMOTED_UNSIGNED_P and SUBREG_PROMOTED_SIGNED_P.
* gcc/internal-fn.c (ubsan_expand_si_overflow_mul_check) : Use SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
* gcc/optabs.c (widen_operand): Use new SUBREG_CHECK_PROMOTED_SIGN instead of SUBREG_PROMOTED_UNSIGNED_P.
* gcc/rtl.h (SUBREG_PROMOTED_UNSIGNED_SET) : Remove.
(SUBREG_PROMOTED_SET) : New define.
(SUBREG_PROMOTED_GET) : Likewise.
(SUBREG_PROMOTED_SIGNED_P) : Likewise.
(SUBREG_CHECK_PROMOTED_SIGN) : Likewise.
(SUBREG_PROMOTED_UNSIGNED_P) : Updated.
* gcc/rtlanal.c (simplify_unary_operation_1) : Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
* gcc/simplify-rtx.c (simplify_unary_operation_1) : Use new SUBREG_PROMOTED_SIGNED_P instead of !SUBREG_PROMOTED_UNSIGNED_P.
(simplify_subreg) : Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET.
diff --git a/gcc/calls.c b/gcc/calls.c index 78fe7d8..c1fe3b8 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,8 +1484,7 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_UNSIGNED_SET (args[i].initial_value, - args[i].unsignedp); + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } @@ -3365,7 +3364,8 @@ expand_call (tree exp, rtx target, int ignore) target = gen_rtx_SUBREG (TYPE_MODE (type), target, offset); SUBREG_PROMOTED_VAR_P (target) = 1; - SUBREG_PROMOTED_UNSIGNED_SET (target, unsignedp); + SUBREG_PROMOTED_SET (target, unsignedp); + } /* If size of args is variable or this was a constructor call for a stack diff --git a/gcc/expr.c b/gcc/expr.c index 512c024..a8db9f5 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -329,7 +329,7 @@ convert_move (rtx to, rtx from, int unsignedp) if (GET_CODE (from) == SUBREG SUBREG_PROMOTED_VAR_P (from) (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from))) = GET_MODE_PRECISION (to_mode)) - SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp) + (SUBREG_CHECK_PROMOTED_SIGN (from, unsignedp))) from = gen_lowpart (to_mode, from), from_mode = to_mode; gcc_assert (GET_CODE (to) != SUBREG || !SUBREG_PROMOTED_VAR_P (to)); @@ -703,7 +703,7 @@ convert_modes (enum machine_mode mode, enum machine_mode oldmode, rtx x, int uns if (GET_CODE (x) == SUBREG SUBREG_PROMOTED_VAR_P (x) GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))) = GET_MODE_SIZE (mode) - SUBREG_PROMOTED_UNSIGNED_P (x) == unsignedp) + (SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp))) x = gen_lowpart (mode, SUBREG_REG (x)); if (GET_MODE (x) != VOIDmode) @@ -5202,8 +5202,7 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal) GET_MODE_PRECISION (GET_MODE (target)) == TYPE_PRECISION (TREE_TYPE (exp))) { - if (TYPE_UNSIGNED (TREE_TYPE (exp)) - != SUBREG_PROMOTED_UNSIGNED_P (target)) + if (!(SUBREG_CHECK_PROMOTED_SIGN 
(target, TYPE_UNSIGNED (TREE_TYPE (exp) { /* Some types, e.g. Fortran's logical*4, won't have a signed version, so use the mode instead. */ @@ -9513,7 +9512,8 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, temp = gen_lowpart_SUBREG (mode, decl_rtl); SUBREG_PROMOTED_VAR_P (temp) = 1; - SUBREG_PROMOTED_UNSIGNED_SET (temp, unsignedp); + SUBREG_PROMOTED_SET (temp, unsignedp); + return temp; } diff --git a/gcc/function.c b/gcc/function.c index 441289e..9509622 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -3093,7 +3093,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, /* The argument is already
[PATCH 2/2] Enable elimination of zext/sext
Sets proper flags on the SUBREG based on value range info and enables elimination of zext/sext when possible. Thanks, Kugan gcc/ 2014-06-24 Kugan Vivekanandarajah kug...@linaro.org * gcc/calls.c (precompute_arguments: Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type) : New function. (expand_expr_real_1) : Check is_promoted_for_type and set the promoted mode. * gcc/expr.h (is_promoted_for_type) : New function definition. * gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. diff --git a/gcc/calls.c b/gcc/calls.c index c1fe3b8..4ef9df8 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); } } } diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index e8cd87f..0540b4d 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) GET_MODE (target), temp, unsignedp); } - convert_move (SUBREG_REG (target), temp, unsignedp); + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) +(GET_CODE (temp) == SUBREG) +(GET_MODE (target) == GET_MODE (temp)) +(GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); + else + convert_move (SUBREG_REG (target), temp, unsignedp); } else if (nontemporal emit_storent_insn (target, temp)) ; diff --git a/gcc/expr.c b/gcc/expr.c index a8db9f5..b2c8146 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9209,6 +9209,59 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode 
tmode, } #undef REDUCE_BIT_FIELD +/* Return TRUE if value in SSA is already zero/sign extended for lhs type + (type here is the combination of LHS_MODE and LHS_UNS) using value range + information stored. Return FALSE otherwise. */ +bool +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max, limit; + unsigned int prec; + tree lhs_type; + bool rhs_uns; + + if (flag_wrapv + || (flag_strict_overflow == false) + || (ssa == NULL_TREE) + || (TREE_CODE (ssa) != SSA_NAME) + || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)) + || POINTER_TYPE_P (TREE_TYPE (ssa))) +return false; + + /* Return FALSE if value_range is not recorded for SSA. */ + if (get_range_info (ssa, min, max) != VR_RANGE) +return false; + + lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); + rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); + prec = min.get_precision (); + + /* Signed maximum value. */ + limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED); + + /* Signedness of LHS and RHS differs but values in range. */ + if ((rhs_uns != lhs_uns) + ((!lhs_uns !wi::neg_p (min, TYPE_SIGN (lhs_type))) + || (lhs_uns (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1 +lhs_uns = !lhs_uns; + + /* Signedness of LHS and RHS should match. */ + if (rhs_uns != lhs_uns) +return false; + + type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec, TYPE_SIGN (TREE_TYPE (ssa))); + type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec, TYPE_SIGN (TREE_TYPE (ssa))); + + /* Check if values lies in-between the type range. */ + if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa))) + || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) != 1)) + (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa))) + || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) != 1))) +return true; + + return false; +} /* Return TRUE if expression STMT is suitable for replacement. 
Never consider memory loads as replaceable, because those don't ever lead @@ -9512,7 +9565,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, temp = gen_lowpart_SUBREG (mode, decl_rtl); SUBREG_PROMOTED_VAR_P (temp) = 1; - SUBREG_PROMOTED_SET (temp, unsignedp); + if (is_promoted_for_type (ssa_name, mode, !unsignedp)) + SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (temp,
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
I saw this during bootstrap. I've verified that the patch works (I was working on similar). Ed
Re: [PATCH 1/2] Enable setting sign and unsigned promoted mode (SPR_SIGNED_AND_UNSIGNED)
On Tue, Jun 24, 2014 at 09:51:51PM +1000, Kugan wrote: Changes the the SUBREG flags to be able to set promoted for sign (SRP_SIGNED), unsigned (SRP_UNSIGNED), sign and unsigned (SPR_SIGNED_AND_UNSIGNED) in SUBREG_PROMOTED_VAR_P. 2014-06-24 Kugan Vivekanandarajah kug...@linaro.org * gcc/calls.c (precompute_arguments): Use new SUBREG_PROMOTED_SET instead of SUBREG_PROMOTED_UNSIGNED_SET (expand_call) : Likewise. gcc/ prefix doesn't belong to gcc/ChangeLog entries (everywhere). * gcc/expr.c (convert_move) : Use new SUBREG_CHECK_PROMOTED_SIGN No space before : (everywhere). @@ -3365,7 +3364,8 @@ expand_call (tree exp, rtx target, int ignore) target = gen_rtx_SUBREG (TYPE_MODE (type), target, offset); SUBREG_PROMOTED_VAR_P (target) = 1; - SUBREG_PROMOTED_UNSIGNED_SET (target, unsignedp); + SUBREG_PROMOTED_SET (target, unsignedp); + } Please avoid adding useless blank lines. --- a/gcc/expr.c +++ b/gcc/expr.c @@ -329,7 +329,7 @@ convert_move (rtx to, rtx from, int unsignedp) if (GET_CODE (from) == SUBREG SUBREG_PROMOTED_VAR_P (from) (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (from))) = GET_MODE_PRECISION (to_mode)) - SUBREG_PROMOTED_UNSIGNED_P (from) == unsignedp) + (SUBREG_CHECK_PROMOTED_SIGN (from, unsignedp))) Please remove the extra ()s, the macro should have ()s around the definition to make this unnecessary (many times). @@ -5202,8 +5202,7 @@ store_expr (tree exp, rtx target, int call_param_p, bool nontemporal) GET_MODE_PRECISION (GET_MODE (target)) == TYPE_PRECISION (TREE_TYPE (exp))) { - if (TYPE_UNSIGNED (TREE_TYPE (exp)) - != SUBREG_PROMOTED_UNSIGNED_P (target)) + if (!(SUBREG_CHECK_PROMOTED_SIGN (target, TYPE_UNSIGNED (TREE_TYPE (exp) Too long line. 
-#define SUBREG_PROMOTED_UNSIGNED_SET(RTX, VAL) \ -do { \ - rtx const _rtx = RTL_FLAG_CHECK1 (SUBREG_PROMOTED_UNSIGNED_SET, \ - (RTX), SUBREG); \ - if ((VAL) 0) \ -_rtx-volatil = 1; \ - else { \ -_rtx-volatil = 0; \ -_rtx-unchanging = (VAL); \ - } \ -} while (0) - /* Valid for subregs which are SUBREG_PROMOTED_VAR_P(). In that case this gives the necessary extensions: - 0 - signed - 1 - normal unsigned + 0 - signed (SPR_SIGNED) + 1 - normal unsigned (SPR_UNSIGNED) + 2 - value is both sign and unsign extended for mode + (SPR_SIGNED_AND_UNSIGNED). -1 - pointer unsigned, which most often can be handled like unsigned extension, except for generating instructions where we need to - emit special code (ptr_extend insns) on some architectures. */ + emit special code (ptr_extend insns) on some architectures + (SPR_POINTER). */ + +const unsigned int SRP_POINTER = -1; +const unsigned int SRP_SIGNED = 0; +const unsigned int SRP_UNSIGNED = 1; +const unsigned int SRP_SIGNED_AND_UNSIGNED = 2; But most importantly, I thought Richard Henderson suggested to use SRP_POINTER 0, SRP_SIGNED 1, SRP_UNSIGNED 2, SRP_SIGNED_AND_UNSIGNED 3, that way when checking e.g. SUBREG_PROMOTED_SIGNED_P or SUBREG_PROMOTED_UNSIGNED_P you can check just the single bit. Where something tested for SUBREG_PROMOTED_UNSIGNED_P () == -1 just use SUBREG_PROMOTED_GET. Jakub
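The encoding suggested at the end of the review above (SRP_POINTER 0, SRP_SIGNED 1, SRP_UNSIGNED 2, SRP_SIGNED_AND_UNSIGNED 3) can be read as two independent bits, which is what makes each predicate a single-bit test. The sketch below illustrates that reading in plain C. The enum name srp_kind and the functions srp_signed_p and srp_unsigned_p are hypothetical; they are not the rtl.h macros, which operate on flag bits of an rtx.

```c
/* Values as suggested in the review: bit 0 set means "sign-extended",
   bit 1 set means "zero-extended".  */
enum srp_kind
{
  SRP_POINTER = 0,
  SRP_SIGNED = 1,
  SRP_UNSIGNED = 2,
  SRP_SIGNED_AND_UNSIGNED = 3
};

/* Hypothetical predicate: is the value known to be sign-extended?  */
int
srp_signed_p (enum srp_kind kind)
{
  return (kind & SRP_SIGNED) != 0;
}

/* Hypothetical predicate: is the value known to be zero-extended?  */
int
srp_unsigned_p (enum srp_kind kind)
{
  return (kind & SRP_UNSIGNED) != 0;
}
```

With this layout SRP_SIGNED_AND_UNSIGNED satisfies both predicates without a special case, and SRP_POINTER satisfies neither, so code that cares about the pointer case has to compare the full value instead.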
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote: 2014-06-24 Kugan Vivekanandarajah kug...@linaro.org * gcc/calls.c (precompute_arguments: Check is_promoted_for_type and set the promoted mode. (is_promoted_for_type) : New function. (expand_expr_real_1) : Check is_promoted_for_type and set the promoted mode. * gcc/expr.h (is_promoted_for_type) : New function definition. * gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before :, watch for too long lines, remove useless ()s around conditions. +bool +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) +{ + wide_int type_min, type_max; + wide_int min, max, limit; + unsigned int prec; + tree lhs_type; + bool rhs_uns; + + if (flag_wrapv Why? + || (flag_strict_overflow == false) Why? Also, that would be !flag_strict_overflow instead of (flag_strict_overflow == false) + || (ssa == NULL_TREE) + || (TREE_CODE (ssa) != SSA_NAME) + || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)) + || POINTER_TYPE_P (TREE_TYPE (ssa))) All pointer types are !INTEGRAL_TYPE_P, so the last condition doesn't make any sense. Jakub
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
Committed as r211935. I updated the patch to add a more appropriate comment and changelog entry. Andrew On Tue, Jun 24, 2014 at 8:07 AM, Ed Smith-Rowland 3dw...@verizon.net wrote: I saw this during bootstrap. I've verified that the patch works (I was working on similar). Ed Index: parser.c === --- parser.c (revision 211591) +++ parser.c (working copy) @@ -15175,9 +15175,15 @@ cp_parser_allows_constrained_type_specif static tree cp_maybe_constrained_type_specifier (cp_parser *parser, tree decl, tree args) { - gcc_assert (TREE_CODE (decl) == OVERLOAD); gcc_assert (args ? TREE_CODE (args) == TREE_VEC : true); + // If we get a reference to a member function, allow the referenced + // functions to participate in this resolution: the baselink may refer + // to a static member concept. + if (BASELINK_P (decl)) +decl = BASELINK_FUNCTIONS (decl); + gcc_assert (TREE_CODE (decl) == OVERLOAD); + // Don't do any heavy lifting if we know we're not in a context // where it could succeed. if (!cp_parser_allows_constrained_type_specifier (parser))
Re: [PATCH 3/3] add hash_map class
On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote:
> From: Trevor Saunders <tsaund...@mozilla.com>
>
> Hi,
>
> This patch adds a hash_map class so we can consolidate the boiler plate around using hash_table as a map, it also allows us to get rid of pointer_map which I do in this patch by converting its users to hash_map.

Hello Trev,

I like your changes! One small question about pointer_set, which does not support deleting items. Do you plan to migrate and simplify hash_map to be a replacement for pointer_set?

Thanks,
Martin

> bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?
>
> Trev
>
> gcc/
>
> * alloc-pool.c (alloc_pool_hash): Use hash_map instead of hash_table.
> * dominance.c (iterate_fix_dominators): Use hash_map instead of pointer_map.
> * hash-map.h: New file.
> * ipa-comdats.c: Use hash_map instead of pointer_map.
> * lto-section-out.c: Adjust.
> * lto-streamer.h: Replace pointer_map with hash_map.
> * symtab.c (verify_symtab): Likewise.
> * tree-ssa-strlen.c (decl_to_stridxlist_htab): Likewise.
> * tree-ssa-uncprop.c (val_ssa_equiv): Likewise.
> * tree-streamer.h: Likewise.
> * tree-streamer.c: Adjust.
> * pointer-set.h: Remove pointer_map.
>
> lto/
>
> * lto.c (canonical_type_hash_cache): Use hash_map instead of pointer_map.
>
> diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c
> index 49209ee..0d31835 100644
> --- a/gcc/alloc-pool.c
> +++ b/gcc/alloc-pool.c
> @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3. If not see
>  #include "system.h"
>  #include "alloc-pool.h"
>  #include "hash-table.h"
> +#include "hash-map.h"
>
>  #define align_eight(x) (((x+7) >> 3) << 3)
>
> @@ -69,7 +70,6 @@ static ALLOC_POOL_ID_TYPE last_id;
>     size for that pool. */
>  struct alloc_pool_descriptor
>  {
> -  const char *name;
>    /* Number of pools allocated. */
>    unsigned long created;
>    /* Gross allocated storage. */
> @@ -82,48 +82,17 @@ struct alloc_pool_descriptor
>    int elt_size;
>  };
>
> -/* Hashtable helpers. */
> -struct alloc_pool_hasher : typed_noop_remove <alloc_pool_descriptor>
> -{
> -  typedef alloc_pool_descriptor value_type;
> -  typedef char compare_type;
> -  static inline hashval_t hash (const alloc_pool_descriptor *);
> -  static inline bool equal (const value_type *, const compare_type *);
> -};
> -
> -inline hashval_t
> -alloc_pool_hasher::hash (const value_type *d)
> -{
> -  return htab_hash_pointer (d->name);
> -}
> -
> -inline bool
> -alloc_pool_hasher::equal (const value_type *d,
> -                          const compare_type *p2)
> -{
> -  return d->name == p2;
> -}
> -
>  /* Hashtable mapping alloc_pool names to descriptors. */
> -static hash_table<alloc_pool_hasher> *alloc_pool_hash;
> +static hash_map<const char *, alloc_pool_descriptor> *alloc_pool_hash;
>
>  /* For given name, return descriptor, create new if needed. */
>  static struct alloc_pool_descriptor *
>  allocate_pool_descriptor (const char *name)
>  {
> -  struct alloc_pool_descriptor **slot;
> -
>    if (!alloc_pool_hash)
> -    alloc_pool_hash = new hash_table<alloc_pool_hasher> (10);
> -
> -  slot = alloc_pool_hash->find_slot_with_hash (name,
> -                                               htab_hash_pointer (name),
> -                                               INSERT);
> -  if (*slot)
> -    return *slot;
> -  *slot = XCNEW (struct alloc_pool_descriptor);
> -  (*slot)->name = name;
> -  return *slot;
> +    alloc_pool_hash = new hash_map<const char *, alloc_pool_descriptor> (10);
> +
> +  return &alloc_pool_hash->get_or_insert (name);
>  }
>
>  /* Create a pool of things of size SIZE, with NUM in each block we
> @@ -375,23 +344,22 @@ struct output_info
>    unsigned long total_allocated;
>  };
>
> -/* Called via hash_table.traverse. Output alloc_pool descriptor pointed out by
> +/* Called via hash_map.traverse. Output alloc_pool descriptor pointed out by
>     SLOT and update statistics. */
> -int
> -print_alloc_pool_statistics (alloc_pool_descriptor **slot,
> +bool
> +print_alloc_pool_statistics (const char *const &name,
> +                             const alloc_pool_descriptor &d,
>                               struct output_info *i)
>  {
> -  struct alloc_pool_descriptor *d = *slot;
> -
> -  if (d->allocated)
> +  if (d.allocated)
>      {
>        fprintf (stderr,
>                 "%-22s %6d %10lu %10lu(%10lu) %10lu(%10lu) %10lu(%10lu)\n",
> -               d->name, d->elt_size, d->created, d->allocated,
> -               d->allocated / d->elt_size, d->peak, d->peak / d->elt_size,
> -               d->current, d->current / d->elt_size);
> -      i->total_allocated += d->allocated;
> -      i->total_created += d->created;
> +               name, d.elt_size, d.created, d.allocated,
> +               d.allocated / d.elt_size, d.peak, d.peak / d.elt_size,
> +               d.current, d.current / d.elt_size);
> +      i->total_allocated += d.allocated;
> +      i->total_created += d.created;
>      }
>    return 1;
>  }
> diff --git a/gcc/dominance.c b/gcc/dominance.c
> index 7adec4f..be0a439 100644
> --- a/gcc/dominance.c
> +++ b/gcc/dominance.c
> @@ -43,6 +43,7 @@
>  #include "diagnostic-core.h"
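The allocate_pool_descriptor change in the patch replaces a manual "find the slot, allocate if empty" sequence with a single get-or-insert call. A minimal sketch of that idea follows. It uses std::unordered_map as a stand-in and is not GCC's hash_map; pool_descriptor, pool_stats and descriptor_for are hypothetical names, and unlike the patch (which keys on the const char * pointer itself) this sketch keys on the string contents.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for alloc_pool_descriptor; members start at zero.
struct pool_descriptor
{
  unsigned long created = 0;
  unsigned long allocated = 0;
};

// Stand-in for the global map (std::unordered_map, not GCC's hash_map).
static std::unordered_map<std::string, pool_descriptor> pool_stats;

// Return the descriptor for NAME, creating an empty one if needed.
// operator[] plays the role that get_or_insert plays in the patch.
static pool_descriptor *
descriptor_for (const std::string &name)
{
  return &pool_stats[name];
}
```

Calling descriptor_for twice with the same name yields the same descriptor, so callers can update counters in place without a separate lookup-or-create step.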
[PATCH] Fix forwprop pattern (T)(P + A) - (T)P -> (T)A, Part 1
Hi Richard,

Here is part 1 of this patch, which does not rely on undefined behaviour.

I am now starting to think that TYPE_SATURATING just cannot happen here either, because only some targets have saturating types (ARM, for instance), and even on those targets saturation is only allowed with fixed-point types. Thus _Sat is only allowed with _Frac or _Accum. I have not yet seen anything else, for instance a vector, with saturation.

OK for trunk after bootstrap and regression test?

Thanks
Bernd.

gcc/ChangeLog:
2014-06-24  Bernd Edlinger  <bernd.edlin...@hotmail.de>

* tree-ssa-forwprop.c (associate_plusminus): Do not use the transformation (T)(P + A) - (T)P -> (T)A with widening integer conversions.

testsuite/ChangeLog:
2014-06-24  Bernd Edlinger  <bernd.edlin...@hotmail.de>

* gcc.c-torture/execute/20140622-1.c: New test.

patch-forwprop.diff
Description: Binary data
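One way to see why the transformation is unsafe with widening conversions is to pick operands whose narrow sum wraps around. The sketch below is an illustration with unsigned 32-bit operands widened to 64 bits; it is not the test case from the patch, and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the pattern: (T)(P + A) - (T)P, with T wider than the
// type of P and A.  The 32-bit sum wraps before it is widened.
static uint64_t
widened_difference (uint32_t p, uint32_t a)
{
  uint32_t sum = p + a;                 // wraps modulo 2^32
  return (uint64_t) sum - (uint64_t) p; // subtraction done in 64 bits
}

// Right-hand side of the pattern: (T)A.
static uint64_t
widened_operand (uint32_t a)
{
  return (uint64_t) a;
}
```

With p = 1 and a = 2 both sides are 2. With p = 0xFFFFFFFF and a = 2 the narrow sum wraps to 1, so the left-hand side is 1 - 0xFFFFFFFF computed in 64 bits, which differs from (T)A = 2.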
Re: [PATCH 3/3] add hash_map class
On Tue, Jun 24, 2014 at 02:29:53PM +0200, Martin Liška wrote:
> On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote:
> > From: Trevor Saunders <tsaund...@mozilla.com>
> >
> > Hi,
> >
> > This patch adds a hash_map class so we can consolidate the boiler plate around using hash_table as a map, it also allows us to get rid of pointer_map which I do in this patch by converting its users to hash_map.
>
> Hello Trev,
>
> I like your changes! One small question about pointer_set, which is unable of deletion of items. Do you plan to migrate and simplify hash_map to be a replacement for pointer_set?

I'm not sure I follow the question. I imagine that hash_map will largely stay as it is, other than perhaps some const correctness stuff, and supporting element removal at some point. Supporting element removal should be trivial since I'm just wrapping hash_table, which already supports it, but I didn't want to add it until there was code testing it. As you can see in the patch, I removed pointer_map, so it's already a replacement for that functionality. As for pointer_set, since it's a set and not a map, hash_table would seem closer to me.

Trev

> Thanks,
> Martin
>
> > bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?
> >
> > Trev
> >
> > gcc/
> >
> > * alloc-pool.c (alloc_pool_hash): Use hash_map instead of hash_table.
> > * dominance.c (iterate_fix_dominators): Use hash_map instead of pointer_map.
> > * hash-map.h: New file.
> > * ipa-comdats.c: Use hash_map instead of pointer_map.
> > * lto-section-out.c: Adjust.
> > * lto-streamer.h: Replace pointer_map with hash_map.
> > * symtab.c (verify_symtab): Likewise.
> > * tree-ssa-strlen.c (decl_to_stridxlist_htab): Likewise.
> > * tree-ssa-uncprop.c (val_ssa_equiv): Likewise.
> > * tree-streamer.h: Likewise.
> > * tree-streamer.c: Adjust.
> > * pointer-set.h: Remove pointer_map.
> >
> > lto/
> >
> > * lto.c (canonical_type_hash_cache): Use hash_map instead of pointer_map.
[GOOGLE] Fix -femit-function-names in LIPO profile-use mode
Emit the proper module name in LIPO profile-use mode.

Passes regression tests, ok for google branches?

Thanks,
Teresa

2014-06-24  Teresa Johnson  <tejohn...@google.com>

* coverage.c (emit_function_name): Emit module name in LIPO mode.

Index: coverage.c
===================================================================
--- coverage.c (revision 211893)
+++ coverage.c (working copy)
@@ -1882,7 +1882,9 @@ void
 emit_function_name (void)
 {
   fprintf (stderr, "Module %s FuncId %u Name %s\n",
-           main_input_file_name,
+           (L_IPO_COMP_MODE
+            ? get_module_name (FUNC_DECL_MODULE_ID (cfun))
+            : main_input_file_name),
            FUNC_DECL_FUNC_ID (cfun),
            IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (current_function_decl)));
 }

--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH 3/3] add hash_map class
On 06/24/2014 02:40 PM, Trevor Saunders wrote:
> On Tue, Jun 24, 2014 at 02:29:53PM +0200, Martin Liška wrote:
> > On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote:
> > > From: Trevor Saunders <tsaund...@mozilla.com>
> > >
> > > Hi,
> > >
> > > This patch adds a hash_map class so we can consolidate the boiler plate around using hash_table as a map, it also allows us to get rid of pointer_map which I do in this patch by converting its users to hash_map.
> >
> > Hello Trev,
> >
> > I like your changes! One small question about pointer_set, which is unable of deletion of items. Do you plan to migrate and simplify hash_map to be a replacement for pointer_set?
>
> I'm not sure I follow the question. I imagine that hash_map will largely stay as it is, other than perhaps some const correctness stuff, and supporting element removal at some point. Supporting element removal should be trivial since I'm just wrapping hash_table which already supports it, but I didn't want to add it until there was code testing it. As you see in the patch I removed pointer_map so its already a replacement for that functionality. As for pointer_set since its a set not a map hash_table would seem closer to me.

Understood. Yes, I was asking whether we plan to add element removal for (pointer_)set as well. I consider such functionality useful, but it does not look related to your patch. If I understand correctly, you are not planning to use hash_* as the wrapping data structure for a set.

Martin

> Trev
>
> > Thanks,
> > Martin
> >
> > > bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?
> > >
> > > Trev
> > >
> > > gcc/
> > >
> > > * alloc-pool.c (alloc_pool_hash): Use hash_map instead of hash_table.
> > > * dominance.c (iterate_fix_dominators): Use hash_map instead of pointer_map.
> > > * hash-map.h: New file.
> > > * ipa-comdats.c: Use hash_map instead of pointer_map.
> > > * lto-section-out.c: Adjust.
> > > * lto-streamer.h: Replace pointer_map with hash_map.
> > > * symtab.c (verify_symtab): Likewise.
> > > * tree-ssa-strlen.c (decl_to_stridxlist_htab): Likewise.
> > > * tree-ssa-uncprop.c (val_ssa_equiv): Likewise.
Re: [PATCH AArch64 2/2] PR/60825 Make {int,uint}64x1_t in arm_neon.h a proper vector type
Bother. Apologies. Simplest fix is to unshare the test body - patch attached; ok for trunk?

gcc/testsuite/ChangeLog:

* gcc.target/arm/simd/vexts64_1.c: Remove #include, inline test body.
* gcc.target/arm/simd/vextu64_1.c: Likewise.
* gcc.target/aarch64/simd/ext_s64_1.c: Likewise.
* gcc.target/aarch64/simd/ext_u64_1.c: Likewise.
* gcc.target/aarch64/simd/ext_s64.x: Remove.
* gcc.target/aarch64/simd/ext_u64.x: Remove.

--Alan

James Greenhalgh wrote:
> On Thu, Jun 19, 2014 at 01:30:32PM +0100, Alan Lawrence wrote:
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x b/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
> > index c71011a5157a207fe68fe814ed80658fd5e0f90f..b879fdacaa6544790e4d3ff98ca0055073d6d1d1 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
> > @@ -9,7 +9,7 @@ main (int argc, char **argv)
> >    int64_t arr2[] = {1};
> >    int64x1_t in2 = vld1_s64 (arr2);
> >    int64x1_t actual = vext_s64 (in1, in2, 0);
> > -  if (actual != in1)
> > +  if (actual[0] != in1[0])
> >      abort ();
> >
> >    return 0;
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x b/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
> > index 8d5072bf761d96ea5a95342423ae9861d05d024a..bd51e27c2156bfcaca6b26798c449369b2894c08 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
> > @@ -9,7 +9,7 @@ main (int argc, char **argv)
> >    uint64_t arr2[] = {1};
> >    uint64x1_t in2 = vld1_u64 (arr2);
> >    uint64x1_t actual = vext_u64 (in1, in2, 0);
> > -  if (actual != in1)
> > +  if (actual[0] != in1[0])
> >      abort ();
> >
> >    return 0;
>
> Hi Alan,
>
> Note that these files are also included by tests in the ARM backend, where uint64x1_t is still a typedef to a scalar type, leading to:
>
> PASS->FAIL: gcc.target/arm/simd/vexts64_1.c (test for excess errors)
>
> ../aarch64/simd/ext_u64.x:12:23: error: subscripted value is neither array nor pointer nor vector
>
> Thanks,
> James

Index: gcc/testsuite/gcc.target/arm/simd/vexts64_1.c
===================================================================
--- gcc/testsuite/gcc.target/arm/simd/vexts64_1.c (revision 211933)
+++ gcc/testsuite/gcc.target/arm/simd/vexts64_1.c (working copy)
@@ -6,7 +6,22 @@
 /* { dg-add-options arm_neon } */

 #include "arm_neon.h"
-#include "../../aarch64/simd/ext_s64.x"
+extern void abort (void);
+
+int
+main (int argc, char **argv)
+{
+  int64_t arr1[] = {0};
+  int64x1_t in1 = vld1_s64 (arr1);
+  int64_t arr2[] = {1};
+  int64x1_t in2 = vld1_s64 (arr2);
+  int64x1_t actual = vext_s64 (in1, in2, 0);
+  if (actual != in1)
+    abort ();
+
+  return 0;
+}
+

 /* Don't scan assembler for vext - it can be optimized into a move from r0. */
 /* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/arm/simd/vextu64_1.c
===================================================================
--- gcc/testsuite/gcc.target/arm/simd/vextu64_1.c (revision 211933)
+++ gcc/testsuite/gcc.target/arm/simd/vextu64_1.c (working copy)
@@ -6,7 +6,22 @@
 /* { dg-add-options arm_neon } */

 #include "arm_neon.h"
-#include "../../aarch64/simd/ext_u64.x"
+extern void abort (void);
+
+int
+main (int argc, char **argv)
+{
+  uint64_t arr1[] = {0};
+  uint64x1_t in1 = vld1_u64 (arr1);
+  uint64_t arr2[] = {1};
+  uint64x1_t in2 = vld1_u64 (arr2);
+  uint64x1_t actual = vext_u64 (in1, in2, 0);
+  if (actual != in1)
+    abort ();
+
+  return 0;
+}
+

 /* Don't scan assembler for vext - it can be optimized into a move from r0. */
 /* { dg-final { cleanup-saved-temps } } */
Index: gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x
===================================================================
--- gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x (revision 211933)
+++ gcc/testsuite/gcc.target/aarch64/simd/ext_s64.x (working copy)
@@ -1,17 +0,0 @@
-extern void abort (void);
-
-int
-main (int argc, char **argv)
-{
-  int i, off;
-  int64_t arr1[] = {0};
-  int64x1_t in1 = vld1_s64 (arr1);
-  int64_t arr2[] = {1};
-  int64x1_t in2 = vld1_s64 (arr2);
-  int64x1_t actual = vext_s64 (in1, in2, 0);
-  if (actual[0] != in1[0])
-    abort ();
-
-  return 0;
-}
-
Index: gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x
===================================================================
--- gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x (revision 211933)
+++ gcc/testsuite/gcc.target/aarch64/simd/ext_u64.x (working copy)
@@ -1,17 +0,0 @@
-extern void abort (void);
-
-int
-main (int argc, char **argv)
-{
-  int i, off;
-  uint64_t arr1[] = {0};
-  uint64x1_t in1 = vld1_u64 (arr1);
-  uint64_t arr2[] = {1};
-  uint64x1_t in2 = vld1_u64 (arr2);
-  uint64x1_t actual = vext_u64 (in1, in2, 0);
-  if (actual[0] != in1[0])
-    abort ();
-
-  return 0;
-}
-
Index: gcc/testsuite/gcc.target/aarch64/simd/ext_u64_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/simd/ext_u64_1.c (revision 211933)
+++
Re: [PATCH 3/3] add hash_map class
On Tue, Jun 24, 2014 at 03:17:46PM +0200, Martin Liška wrote:
> On 06/24/2014 02:40 PM, Trevor Saunders wrote:
> > On Tue, Jun 24, 2014 at 02:29:53PM +0200, Martin Liška wrote:
> > > On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote:
> > > > From: Trevor Saunders <tsaund...@mozilla.com>
> > > >
> > > > Hi,
> > > >
> > > > This patch adds a hash_map class so we can consolidate the boiler plate around using hash_table as a map, it also allows us to get rid of pointer_map which I do in this patch by converting its users to hash_map.
> > >
> > > Hello Trev,
> > >
> > > I like your changes! One small question about pointer_set, which is unable of deletion of items. Do you plan to migrate and simplify hash_map to be a replacement for pointer_set?
> >
> > I'm not sure I follow the question. I imagine that hash_map will largely stay as it is, other than perhaps some const correctness stuff, and supporting element removal at some point. Supporting element removal should be trivial since I'm just wrapping hash_table which already supports it, but I didn't want to add it until there was code testing it. As you see in the patch I removed pointer_map so its already a replacement for that functionality. As for pointer_set since its a set not a map hash_table would seem closer to me.
>
> Understand, yeah, I was asking if we plan to add element removal also for (pointer_)set? I consider such functionality useful, but it looks not related to your patch. If I understand correctly, you are not planning to use hash_* as wrapping data structure for set.

Correct. If anything, I'd try to get rid of pointer_set; it's not clear to me how much it buys us over hash_table, or whether we can't just make hash_table do that.

Trev

> Martin
>
> > Trev
> >
> > > Thanks,
> > > Martin
> > >
> > > > bootstrapped + regtested without regression on x86_64-unknown-linux-gnu, ok?
> > > >
> > > > Trev
> > > >
> > > > gcc/
> > > >
> > > > * alloc-pool.c (alloc_pool_hash): Use hash_map instead of hash_table.
> > > > * dominance.c (iterate_fix_dominators): Use hash_map instead of pointer_map.
> > > > * hash-map.h: New file.
> > > > * ipa-comdats.c: Use hash_map instead of pointer_map.
Re: [GSoC][match-and-simplify] factor expr check in gimple-match-head.c
On Tue, Jun 24, 2014 at 1:29 PM, Prathamesh Kulkarni <bilbotheelffri...@gmail.com> wrote:
> This patch factors out checking expression-code in gimple-match-head.c
>
> * gimple-match-head.c (check_gimple_assign): New function.
> (check_gimple_assign_convert): Likewise.
> (check_gimple_call_builtin): Likewise.
>
> * genmatch.c (dt_operand::gen_gimple_expr_expr): Add argument const char *. Generate call to gimple_assign_check or check_gimple_assign_convert.
> (dt_operand::gen_gimple_expr_fn): Add argument const char *. Generate call to check_gimple_call_builtin.
> (decision_tree::gen_gimple): Generate definition of def_stmt.

Hmm, I think we should rather try to optimize the repeated

  if (TREE_CODE (op) == SSA_NAME)
    {
      gimple def_stmt = SSA_NAME_DEF_STMT (op);
      if (is_gimple_assign (def_stmt)

so that we can eventually directly generate something like

  if (TREE_CODE (op) == SSA_NAME)
    {
      gimple def_stmt = SSA_NAME_DEF_STMT (op);
      if (is_gimple_assign (def_stmt))
        {
          switch (gimple_assign_rhs_code (def_stmt))
            {
            case PLUS_EXPR:
            ...

but let's leave that for later. The generated code doesn't have to be pretty, it just has to work (and if anything, we want to optimize it for speed).

Btw, there are still no early outs in the generated code, so we always backtrack and try other alternatives. And if we have

(match_and_simplify
  (MINUS_EXPR (PLUS_EXPR @0 @1) @2)
  @1)
(match_and_simplify
  (MINUS_EXPR @0 @1)
  if (integer_zerop (@1))
  @0)

we should try to match the more specific pattern first (we do), but if that ultimately fails we should execute the simplifications that can still match (that is, we should backtrack only to 'true' dt nodes).

Richard.

> Thanks and Regards,
> Prathamesh.
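The ordering rule Richard describes (try the more specific pattern first, then fall back to a more general one that can still match) can be sketched with a toy matcher. Everything here is hypothetical: the expr struct and the K_* kinds are stand-ins for trees and tree codes, and the specific pattern additionally requires the subtrahend to be the first addend, which is my assumption so that the rewrite is arithmetically valid.

```cpp
#include <cassert>

// Toy expression tree; the node kinds are stand-ins for tree codes.
enum kind { K_LEAF, K_CONST, K_PLUS, K_MINUS };

struct expr
{
  kind code;
  int value;        // constant value for K_CONST, an id for K_LEAF
  const expr *op0;
  const expr *op1;
};

static bool
is_zero (const expr *e)
{
  return e->code == K_CONST && e->value == 0;
}

// Try the more specific pattern first; if it does not apply, fall back to
// the more general one instead of giving up.
static const expr *
simplify_minus (const expr *e)
{
  if (e->code != K_MINUS)
    return e;

  // Specific: (MINUS (PLUS @0 @1) @0) -> @1
  if (e->op0->code == K_PLUS && e->op0->op0 == e->op1)
    return e->op0->op1;

  // General: (MINUS @0 @1) with @1 equal to zero -> @0
  if (is_zero (e->op1))
    return e->op0;

  return e;
}
```

For (a + b) - 0 the outer MINUS and inner PLUS both match the specific pattern's shape, but its operand check fails; the general pattern then still fires and returns a + b, which is the fallback behaviour the message asks for.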
[PATCH][match-and-simplify] GENERIC boilerplate
This adds infrastructure to create generic-match.c (empty for now).

Richard.

2014-06-24  Richard Biener  <rguent...@suse.de>

* Makefile.in (OBJS): Add generic-match.o.
(.PRECIOUS): Likewise.
(MOSTLYCLEANFILES): Add generic-match.c.
(generic-match.c): New rule.
(s-match): Amend.
* generic-match-head.c: New file.
* genmatch.c (write_gimple): Rename to ...
(write_header): ... this and support generic-match.c creation.
(main): Recognize -gimple and -generic switches and adjust what we output.

Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in (revision 211887)
+++ gcc/Makefile.in (working copy)
@@ -1237,6 +1237,7 @@ OBJS = \
  gimple-fold.o \
  gimple-low.o \
  gimple-match.o \
+ generic-match.o \
  gimple-pretty-print.o \
  gimple-ssa-isolate-paths.o \
  gimple-ssa-strength-reduction.o \
@@ -1509,7 +1510,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-con
  insn-output.c insn-recog.c insn-emit.c insn-extract.c insn-peep.c \
  insn-attr.h insn-attr-common.h insn-attrtab.c insn-dfatab.c \
  insn-latencytab.c insn-opinit.c insn-opinit.h insn-preds.c insn-constants.h \
- tm-preds.h tm-constrs.h checksum-options gimple-match.c \
+ tm-preds.h tm-constrs.h checksum-options gimple-match.c generic-match.c \
  tree-check.h min-insn-modes.c insn-modes.c insn-modes.h \
  genrtl.h gt-*.h gtype-*.h gtype-desc.c gtyp-input.list \
  xgcc$(exeext) cpp$(exeext) \
@@ -2023,7 +2024,7 @@ $(common_out_object_file): $(common_out_
 .PRECIOUS: insn-config.h insn-flags.h insn-codes.h insn-constants.h \
   insn-emit.c insn-recog.c insn-extract.c insn-output.c insn-peep.c \
   insn-attr.h insn-attr-common.h insn-attrtab.c insn-dfatab.c \
-  insn-latencytab.c insn-preds.c gimple-match.c
+  insn-latencytab.c insn-preds.c gimple-match.c generic-match.c

 # Dependencies for the md file. The first time through, we just assume
 # the md file itself and the generated dependency file (in order to get
@@ -2233,12 +2234,17 @@ s-tm-texi: build/genhooks$(build_exeext)
  fi

 gimple-match.c: s-match gimple-match-head.c ; @true
+generic-match.c: s-match generic-match-head.c ; @true

 s-match: build/genmatch$(build_exeext) $(srcdir)/match.pd
- $(RUN_GEN) build/genmatch$(build_exeext) $(srcdir)/match.pd \
+ $(RUN_GEN) build/genmatch$(build_exeext) -gimple $(srcdir)/match.pd \
      > tmp-gimple-match.c
+ $(RUN_GEN) build/genmatch$(build_exeext) -generic $(srcdir)/match.pd \
+     > tmp-generic-match.c
  $(SHELL) $(srcdir)/../move-if-change tmp-gimple-match.c \
      gimple-match.c
+ $(SHELL) $(srcdir)/../move-if-change tmp-generic-match.c \
+     generic-match.c
  $(STAMP) s-match

 GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
Index: gcc/generic-match-head.c
===================================================================
--- gcc/generic-match-head.c (revision 0)
+++ gcc/generic-match-head.c (working copy)
@@ -0,0 +1,74 @@
+/* Preamble and helpers for the autogenerated generic-match.c file.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "flags.h"
+#include "function.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-ssa.h"
+#include "tree-ssanames.h"
+#include "gimple-fold.h"
+#include "gimple-iterator.h"
+#include "expr.h"
+#include "tree-dfa.h"
+#include "builtins.h"
+
+#define INTEGER_CST_P(node) (TREE_CODE(node) == INTEGER_CST)
+#define integral_op_p(node) INTEGRAL_TYPE_P(TREE_TYPE(node))
+#define REAL_CST_P(node) (TREE_CODE(node) == REAL_CST)
+
+
+/* Helper to transparently allow tree codes and builtin function codes
+   exist in one storage entity. */
+class code_helper
+{
+public:
+  code_helper () {}
+  code_helper (tree_code code) : rep ((int) code) {}
+  code_helper (built_in_function fn) : rep (-(int) fn) {}
+  operator tree_code () const { return (tree_code) rep; }
+  operator built_in_function () const { return (built_in_function)
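The code_helper class in the new file stores a tree code as a positive integer and a builtin function code as a negative one, so a single int can carry either. A self-contained sketch of that encoding follows. The toy_* enums and the class are hypothetical stand-ins, not GCC's types, and the is_tree_code and is_builtin predicates are my additions for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's tree_code and built_in_function enums.
enum toy_tree_code { TOY_ERROR_MARK = 0, TOY_PLUS_EXPR = 1, TOY_MINUS_EXPR = 2 };
enum toy_builtin { TOY_BUILT_IN_NONE = 0, TOY_BUILT_IN_SQRT = 1, TOY_BUILT_IN_POW = 2 };

// One int holds either kind of code: tree codes are stored as-is,
// builtin codes are stored negated.
class toy_code_helper
{
public:
  toy_code_helper () : rep (0) {}
  toy_code_helper (toy_tree_code code) : rep ((int) code) {}
  toy_code_helper (toy_builtin fn) : rep (-(int) fn) {}

  // Illustrative predicates (not part of the patch).
  bool is_tree_code () const { return rep > 0; }
  bool is_builtin () const { return rep < 0; }

  operator toy_tree_code () const { return (toy_tree_code) rep; }
  operator toy_builtin () const { return (toy_builtin) -rep; }

private:
  int rep;
};
```

Note that the value 0 is ambiguous under this scheme (it is both the first tree code and the first builtin code), so the sketch only distinguishes the two kinds for nonzero codes.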
[C++ Patch] PR 33972
Hi,

in this old rejects-valid we reject:

  struct s
  {
    typedef void f(void);
    f operator();
  };

at the beginning of grokdeclarator:

  if (((dname && IDENTIFIER_OPNAME_P (dname)) || flags == TYPENAME_FLAG)
      && innermost_code != cdk_function
      && ! (ctype && !declspecs->any_specifiers_p))
    {

It seems to me that we can simply remove the check, because a bit later we have:

  /* Only functions may be declared using an operator-function-id. */
  if (unqualified_id
      && IDENTIFIER_OPNAME_P (unqualified_id)
      && TREE_CODE (type) != FUNCTION_TYPE
      && TREE_CODE (type) != METHOD_TYPE)
    {

which uses type and is more precise.

Tested x86_64-linux.

Thanks!
Paolo.

///

/cp
2014-06-24  Paolo Carlini  <paolo.carl...@oracle.com>

PR c++/33972
* decl.c (grokdeclarator): Do not early check for operator-function-id as non-function.

/testsuite
2014-06-24  Paolo Carlini  <paolo.carl...@oracle.com>

PR c++/33972
* g++.dg/other/operator3.C: New.
* g++.dg/template/operator8.C: Adjust.
* g++.dg/template/operator9.C: Likewise.

Index: cp/decl.c
===================================================================
--- cp/decl.c (revision 211931)
+++ cp/decl.c (working copy)
@@ -9007,7 +9007,7 @@ grokdeclarator (const cp_declarator *declarator,
       return error_mark_node;
     }

-  if (((dname && IDENTIFIER_OPNAME_P (dname)) || flags == TYPENAME_FLAG)
+  if (flags == TYPENAME_FLAG
       && innermost_code != cdk_function
       && ! (ctype && !declspecs->any_specifiers_p))
     {
Index: testsuite/g++.dg/other/operator3.C
===================================================================
--- testsuite/g++.dg/other/operator3.C (revision 0)
+++ testsuite/g++.dg/other/operator3.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/33972
+
+struct s
+{
+  typedef void f(void);
+  f operator();
+};
Index: testsuite/g++.dg/template/operator8.C
===================================================================
--- testsuite/g++.dg/template/operator8.C (revision 211931)
+++ testsuite/g++.dg/template/operator8.C (working copy)
@@ -2,5 +2,5 @@

 struct A
 {
-    template<operator+> void foo() {} // { dg-error "identifier|non-function|template arguments" }
+    template<operator+> void foo() {} // { dg-error "identifier|parameter|template arguments" }
 };
Index: testsuite/g++.dg/template/operator9.C
===================================================================
--- testsuite/g++.dg/template/operator9.C (revision 211931)
+++ testsuite/g++.dg/template/operator9.C (working copy)
@@ -1,6 +1,6 @@
 //PR c++/27670

-template<operator+> void foo(); // { dg-error "before|non-function|template" }
+template<operator+> void foo(); // { dg-error "before|parameter|template" }

 void bar()
 {
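The construct at issue is a member function declared through a typedef of function type, with no function declarator of its own. A small sketch of what such declarations mean follows, assuming a compiler that accepts the operator form (which is what this patch aims for in GCC); the struct and member names are made up for the example.

```cpp
#include <cassert>

struct counter
{
  typedef int unary (int);   // a function type, not a pointer to function

  unary twice;               // declares: int twice (int);
  unary operator();          // declares: int operator() (int);
};

// A typedef of function type can declare a function but not define one,
// so the definitions spell out the full declarator.
int counter::twice (int x) { return 2 * x; }
int counter::operator() (int x) { return x + 1; }
```

Both members behave like ordinarily declared member functions: for a counter object c, c.twice (4) calls the named member and c (4) calls the function-call operator.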
Re: [C++ Patch] PR 33972
OK. Jason
[PATCH][match-and-simplify] Get rid of gimple_match_and_simplify wrappers
This re-orders operands so we can default-initialize them to zero now that we only have a single autogenerated function. Richard. 2014-06-24 Richard Biener rguent...@suse.de * genmatch.c (write_fn_prototype): Remove and inline... (decision_tree::gen_gimple): ...here. * gimple-match-head.c (gimple_match_and_simplify): Adjust operand order to put operands last. Make the last two default initialized to zero. (gimple_resimplify1): Adjust. (gimple_resimplify2): Likewise. (gimple_resimplify3): Likewise. (gimple_match_and_simplify): Likewise. Index: genmatch.c === --- genmatch.c (revision 211939) +++ genmatch.c (working copy) @@ -849,16 +849,6 @@ decision_tree::print (FILE *f) return decision_tree::print_node (root, f); } -void -write_fn_prototype (FILE *f, unsigned n) -{ - fprintf (f, static bool\n - gimple_match_and_simplify (code_helper code, tree type); - for (unsigned i = 0; i n; ++i) -fprintf (f, , tree op%d, i); - fprintf (f, , code_helper *res_code, tree *res_ops, gimple_seq *seq, tree (*valueize)(tree))\n); -} - char * dt_operand::get_name (char *name) { @@ -1132,13 +1122,13 @@ dt_simplify::gen_gimple (FILE *f) void decision_tree::gen_gimple (FILE *f) { - write_fn_prototype (f, 1); - fprintf (f, { return gimple_match_and_simplify (code, type, op0, NULL_TREE, NULL_TREE, res_code, res_ops, seq, valueize); }\n\n); - - write_fn_prototype (f, 2); - fprintf (f, { return gimple_match_and_simplify (code, type, op0, op1, NULL_TREE, res_code, res_ops, seq, valueize); }\n\n); - - write_fn_prototype (f, 3); + fprintf (f, static bool\n + gimple_match_and_simplify (code_helper *res_code, tree *res_ops,\n + gimple_seq *seq, tree (*valueize)(tree),\n + code_helper code, tree type); + for (unsigned i = 0; i 3; ++i) +fprintf (f, , tree op%d, i); + fprintf (f, )\n); fprintf (f, {\n); for (unsigned i = 0; i root-kids.length (); i++) Index: gimple-match-head.c === --- gimple-match-head.c (revision 211891) +++ gimple-match-head.c (working copy) @@ -60,18 +60,13 @@ private: int rep; }; 
-/* Forward declarations of the private auto-generated matchers. They - expect valueized operands in canonical order and they do not +/* Forward declarations of the private auto-generated matcher. + It expects valueized operands in canonical order and does not perform simplification of all-constant operands. */ -static bool gimple_match_and_simplify (code_helper, tree, tree, - code_helper *, tree *, - gimple_seq *, tree (*)(tree)); -static bool gimple_match_and_simplify (code_helper, tree, tree, tree, - code_helper *, tree *, - gimple_seq *, tree (*)(tree)); -static bool gimple_match_and_simplify (code_helper, tree, tree, tree, tree, - code_helper *, tree *, - gimple_seq *, tree (*)(tree)); +static bool gimple_match_and_simplify (code_helper *, tree *, + gimple_seq *, tree (*)(tree), + code_helper, tree, + tree, tree = NULL_TREE, tree = NULL_TREE); /* Return whether T is a constant that we'll dispatch to fold to @@ -130,8 +125,8 @@ gimple_resimplify1 (gimple_seq *seq, code_helper res_code2; tree res_ops2[3] = {}; - if (gimple_match_and_simplify (*res_code, type, res_ops[0], -res_code2, res_ops2, seq, valueize)) + if (gimple_match_and_simplify (res_code2, res_ops2, seq, valueize, +*res_code, type, res_ops[0])) { *res_code = res_code2; res_ops[0] = res_ops2[0]; @@ -199,8 +194,8 @@ gimple_resimplify2 (gimple_seq *seq, code_helper res_code2; tree res_ops2[3] = {}; - if (gimple_match_and_simplify (*res_code, type, res_ops[0], res_ops[1], -res_code2, res_ops2, seq, valueize)) + if (gimple_match_and_simplify (res_code2, res_ops2, seq, valueize, +*res_code, type, res_ops[0], res_ops[1])) { *res_code = res_code2; res_ops[0] = res_ops2[0]; @@ -269,9 +264,9 @@ gimple_resimplify3 (gimple_seq *seq, code_helper res_code2; tree res_ops2[3] = {}; - if (gimple_match_and_simplify (*res_code, type, -res_ops[0], res_ops[1], res_ops[2], -res_code2, res_ops2, seq, valueize)) + if (gimple_match_and_simplify (res_code2, res_ops2, seq, valueize, +*res_code, type, +
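The reordering in this patch can be illustrated with a small stand-alone sketch. This is not the GCC code: `node`, `tree`, `NULL_TREE` and `match_and_simplify` below are simplified stand-ins invented for illustration, and the result is reduced to a single counter. It only shows why moving the operands to the end of the parameter list lets one function with defaulted trailing parameters replace three overloads.

```cpp
#include <cassert>

// Stand-ins for GCC's tree type and NULL_TREE (hypothetical, illustration only).
struct node { int value; };
typedef node *tree;
static const tree NULL_TREE = 0;

// Result parameter first, operands last: the trailing operands can then
// default to NULL_TREE, so one function serves 1, 2 or 3 operands.
static bool
match_and_simplify (int *res_count, tree op0,
                    tree op1 = NULL_TREE, tree op2 = NULL_TREE)
{
  // Record how many operands the caller actually supplied.
  *res_count = 1 + (op1 != NULL_TREE) + (op2 != NULL_TREE);
  return op0 != NULL_TREE;
}

// Call the single entry point with N real operands and report how many
// operands it saw.
static int
operands_seen (int n)
{
  static node a = { 1 }, b = { 2 }, c = { 3 };
  int count = 0;
  bool ok;
  if (n == 1)
    ok = match_and_simplify (&count, &a);
  else if (n == 2)
    ok = match_and_simplify (&count, &a, &b);
  else
    ok = match_and_simplify (&count, &a, &b, &c);
  assert (ok);
  return count;
}
```

With the operands first, as before the patch, the defaulted parameters would sit in the middle of the list, which C++ does not allow; that is presumably why the result parameters had to move to the front.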
Re: [patch] Do not generate useless integral conversions
On Tue, Jun 24, 2014 at 12:54 PM, Eric Botcazou ebotca...@adacore.com wrote:
> Hi,
>
> https://gcc.gnu.org/ml/gcc-patches/2012-03/msg00491.html changed the old signed_type_for/unsigned_type_for functions and made them always return an integer type, whereas they would previously leave integral types unchanged. I don't see any justification for the latter and this has the annoying effect of generating useless integral conversions in convert.c, for example between boolean types and integer types of the same precision.
>
> The attached patch restores the old behavior for them. Bootstrapped/regtested on x86_64-suse-linux, OK for the mainline?

I think that was on purpose to avoid arithmetic in enum types. As those conversions are useless and thus stripped later, is it really important to retain enum and boolean kind here?

Richard.

> 2014-06-24 Eric Botcazou ebotca...@adacore.com
>
> * tree.c (signed_or_unsigned_type_for): Treat integral types equally.
>
> --
> Eric Botcazou
[c++-concepts] small pretty printing fix
This helps improve debug output. When pretty printing template args including a placeholder, show "placeholder" instead of a dump_expr error.

2014-06-24 Andrew Sutton andrew.n.sut...@gmail.com

* gcc/cp/error.c (dump_expr): Pretty print placeholder to improve debug output.

Committed as r211942.

Andrew

Index: gcc/cp/error.c
===
--- gcc/cp/error.c (revision 211415)
+++ gcc/cp/error.c (working copy)
@@ -2656,6 +2656,10 @@ dump_expr (cxx_pretty_printer *pp, tree
     case CONSTEXPR_EXPR:
       pp_cxx_constexpr_expr (cxx_pp, t);
+    case PLACEHOLDER_EXPR:
+      pp_cxx_ws_string (cxx_pp, "placeholder");
+      break;
+
       /* This list is incomplete, but should suffice for now.
          It is very important that `sorry' does not call
          `report_error_function'. That could cause an infinite loop. */
Re: [PATCH] Fix up -march=native handling under KVM (PR target/61570)
On Mon, Jun 23, 2014 at 11:51 PM, Uros Bizjak ubiz...@gmail.com wrote: On Mon, Jun 23, 2014 at 6:29 PM, H.J. Lu hjl.to...@gmail.com wrote: --- gcc/config/i386/driver-i386.c.jj 2014-05-14 14:45:54.0 +0200 +++ gcc/config/i386/driver-i386.c 2014-06-20 18:59:57.805006358 +0200 @@ -745,6 +745,11 @@ const char *host_detect_local_cpu (int a /* Assume Core 2. */ cpu = "core2"; } + else if (has_longmode) + /* Perhaps some emulator? Assume x86-64, otherwise gcc + -march=native would be unusable for 64-bit compilations, + as all the CPUs below are 32-bit only. */ + cpu = "x86-64"; else if (has_sse3) /* It is Core Duo. */ cpu = "pentium-m"; Jakub host_detect_local_cpu guesses the cpu based on the real processors. It doesn't work with emulators due to some conflicts. This isn't the only place which has the same issue. I prefer something like this. I'm fine with your patch too. Let's wait what Uros (or other i?86 maintainers) pick up. This looks OK to me. Thanks, Uros. This is what I checked in. This version was NOT approved. Please revert it ASAP and proceed with approved version. Uros. I reverted my change. Sorry for my misunderstanding. -- H.J.
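The effect of the ordering in the hunk quoted above can be sketched in isolation. `guess_cpu`, its parameters, and the final "unknown" fallback are hypothetical; only the three branches and their CPU names come from the quoted diff, and the real `host_detect_local_cpu` has many more cases. The sketch just shows that a long-mode check placed before the SSE3 check keeps a 64-bit capable guest from being classified as a 32-bit-only CPU.

```cpp
#include <cassert>
#include <string>

// Hypothetical reduction of the branch shown in the quoted hunk.
// looks_like_core2 stands for whatever condition guards the "core2" case.
static std::string
guess_cpu (bool looks_like_core2, bool has_longmode, bool has_sse3)
{
  if (looks_like_core2)
    return "core2";
  else if (has_longmode)
    // Perhaps an emulator: pick a generic 64-bit target so that
    // 64-bit compilations remain possible.
    return "x86-64";
  else if (has_sse3)
    return "pentium-m";
  // Placeholder only; the real function continues with older CPUs.
  return "unknown";
}
```

Without the `has_longmode` branch, a guest that reports SSE3 and long mode but fails the earlier checks would fall through to "pentium-m", which is the failure mode the comment in the diff describes.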
Re: Move DECL_VINDEX and DECL_SAVED_TREE into function_decl
On 06/23/2014 04:25 PM, Jan Hubicka wrote:
> * class.c (check_methods, create_vtable_ptr, determine_key_method, add_vcall_offset_vtbl_entries_1): Guard VINDEX checks by FUNCTION_DECL check.

These changes are unnecessary: TYPE_METHODS is a list of functions. The rest of the patch is OK.

Jason
[c++-concepts] member concepts
Actually allow member concepts when using shorthand notation.

2014-06-24 Andrew Sutton andrew.n.sut...@gmail.com

* gcc/cp/parser.c (cp_maybe_constrained_type_specifier): Defer handling the BASELINK check until concept-resolution in order to allow member concepts.
(cp_parser_nonclass_name): Also check for concept-names when the lookup finds a BASELINK.
* gcc/cp/constraint.cc (resolve_constraint_check): If the call target is a base-link, resolve against its overload set.
(build_concept_check): Update comments and variable names to reflect actual processing.
* gcc/testsuite/g++.dg/concepts/mem-concept.C: New test.
* gcc/testsuite/g++.dg/concepts/mem-concept-err.C: New test.

Committed as r211946.

Andrew Sutton

Index: mem-concept.C
===
--- mem-concept.C (revision 0)
+++ mem-concept.C (revision 0)
@@ -0,0 +1,28 @@
+// { dg-options "-std=c++1y" }
+
+struct Base {
+  template<typename T>
+    static concept bool D() { return __is_same_as(T, int); }
+
+  template<typename T, typename U>
+    static concept bool E() { return __is_same_as(T, U); }
+};
+
+void f1(Base::D) { }
+void f2(Base::E<double> x) { }
+
+template<typename T>
+  struct S : Base {
+    void f1(Base::D) { }
+    void f2(Base::E<T> x) { }
+  };
+
+int main() {
+  f1(0);
+
+  f2(0.0);
+
+  S<int> s;
+  s.f1(0);
+  s.f2(0);
+}
Index: mem-concept-err.C
===
--- mem-concept-err.C (revision 0)
+++ mem-concept-err.C (revision 0)
@@ -0,0 +1,40 @@
+// { dg-options "-std=c++1y" }
+
+
+// The following error is emitted without context. I'm not
+// certain why that would be the case. It comes as a result
+// of failing the declaration of S::f0().
+//
+//cc1plus: error: expected ';' at end of member declaration
+
+
+struct Base {
+  template<typename T, typename U>
+    bool C() const { return false; } // Not a concept!
+
+  template<typename T>
+    static concept bool D() { return __is_same_as(T, int); }
+
+  template<typename T, typename U>
+    static concept bool E() { return __is_same_as(T, U); }
+};
+
+void f1(Base::D) { }
+void f2(Base::E<double> x) { }
+
+template<typename T>
+  struct S : Base {
+    void f0(Base::C<float> x) { } // { dg-error "expected|type" }
+    void f1(Base::D) { }
+    void f2(Base::E<T> x) { }
+  };
+
+int main() {
+  f1('a'); // { dg-error "matching" }
+  f2(0); // { dg-error "matching" }
+
+  S<int> s;
+  s.f1('a'); // { dg-error "matching" }
+  s.f2('a'); // { dg-error "matching" }
+}
+
Index: pt.c
===
--- pt.c (revision 211415)
+++ pt.c (working copy)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
+#include "print-tree.h"
 #include "stringpool.h"
 #include "varasm.h"
 #include "attribs.h"
Index: parser.c
===
--- parser.c (revision 211935)
+++ parser.c (working copy)
@@ -15168,6 +15168,7 @@ cp_parser_allows_constrained_type_specif
     || parser->in_result_type_constraint_p);
 }
+
 // Check if DECL and ARGS can form a constrained-type-specifier. If ARGS
 // is non-null, we try to form a concept check of the form DECL<?, ARGS>
 // where ? is a placeholder for any kind of template argument. If ARGS
@@ -15177,13 +15178,6 @@ cp_maybe_constrained_type_specifier (cp_
 {
   gcc_assert (args ? TREE_CODE (args) == TREE_VEC : true);
-  // If we get a reference to a member function, allow the referenced
-  // functions to participate in this resolution: the baselink may refer
-  // to a static member concept.
-  if (BASELINK_P (decl))
-    decl = BASELINK_FUNCTIONS (decl);
-  gcc_assert (TREE_CODE (decl) == OVERLOAD);
-
   // Don't do any heavy lifting if we know we're not in a context
   // where it could succeed.
   if (!cp_parser_allows_constrained_type_specifier (parser))
@@ -15191,7 +15185,8 @@ cp_maybe_constrained_type_specifier (cp_
   // Try to build a call expression that evaluates the concept. This
   // can fail if the overload set refers only to non-templates.
-  tree call = build_concept_check (decl, build_nt(PLACEHOLDER_EXPR), args);
+  tree placeholder = build_nt(PLACEHOLDER_EXPR);
+  tree call = build_concept_check (decl, placeholder, args);
   if (call == error_mark_node)
     return NULL_TREE;
@@ -15291,7 +15286,8 @@ cp_parser_nonclass_name (cp_parser* pars
   //
   // TODO: The name could also refer to a variable template or an
   // introduction (if followed by '{').
-  if (flag_concepts && TREE_CODE (type_decl) == OVERLOAD)
+  if (flag_concepts
+      && (TREE_CODE (type_decl) == OVERLOAD || BASELINK_P (type_decl)))
     {
       // Determine whether the overload refers to a concept.
       if (tree decl = cp_maybe_concept_name (parser, type_decl))
Index: constraint.cc
===
---
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
I am still having problems doing a full build. I get stuck on something that I think can't be a concepts problem in gcc/config/i386/i386.c: make[3]: Entering directory `/home/ed/obj_concepts/gcc' /home/ed/obj_concepts/./prev-gcc/xg++ -B/home/ed/obj_concepts/./prev-gcc/ -B/home/ed/bin_concepts/x86_64-unknown-linux-gnu/bin/ -nostdinc++ -B/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -B/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu -I/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/include -I/home/ed/gcc_concepts/libstdc++-v3/libsupc++ -L/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home/ed/obj_concepts/prev-x86_64-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -c -g -O2 -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc_concepts/gcc -I../../gcc_concepts/gcc/. 
-I../../gcc_concepts/gcc/../include -I../../gcc_concepts/gcc/../libcpp/include -I/home/ed/obj_concepts/./gmp -I/home/ed/gcc_concepts/gmp -I/home/ed/obj_concepts/./mpfr/src -I/home/ed/gcc_concepts/mpfr/src -I/home/ed/gcc_concepts/mpc/src -I../../gcc_concepts/gcc/../libdecnumber -I../../gcc_concepts/gcc/../libdecnumber/bid -I../libdecnumber -I../../gcc_concepts/gcc/../libbacktrace -DCLOOG_INT_GMP -I/home/ed/obj_concepts/./cloog/include -I/home/ed/gcc_concepts/cloog/include -I../gcc_concepts/cloog/include -I/home/ed/obj_concepts/./isl/include -I/home/ed/gcc_concepts/isl/include -o i386.o -MT i386.o -MMD -MP -MF ./.deps/i386.TPo ../../gcc_concepts/gcc/config/i386/i386.c ../../gcc_concepts/gcc/config/i386/i386.c:113:56: error: uninitialized const member ‘stringop_algs::stringop_strategy::max’ {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false; ^ ../../gcc_concepts/gcc/config/i386/i386.c:113:56: error: missing initializer for member ‘stringop_algs::stringop_strategy::max’ [-Werror=missing-field-initializers] ../../gcc_concepts/gcc/config/i386/i386.c:113:56: error: uninitialized const member ‘stringop_algs::stringop_strategy::alg’ ../../gcc_concepts/gcc/config/i386/i386.c:113:56: error: missing initializer for member ‘stringop_algs::stringop_strategy::alg’ [-Werror=missing-field-initializers] ../../gcc_concepts/gcc/config/i386/i386.c:113:56: error: missing initializer for member ‘stringop_algs::stringop_strategy::noalign’ [-Werror=missing-field-initializers] Am I the only one seeing this? Do you turn off the warning when you compile? Ed
[PATCH][match-and-simplify] GENERIC code-gen
This massages things so GENERIC code-gen works (well, is emitted and compiles).

The GENERIC interface matches that of fold_{unary,binary,ternary}, and additionally supports calls. It's supposed to be called at the start of those functions (and if it returns NULL_TREE, the remainder of those functions is executed).

At the moment it just creates dead code. I'll see if it works to wire it in tomorrow.

Committed.

Richard.

2014-06-24 Richard Biener rguent...@suse.de

* genmatch.c (operand::gen_gimple_transform): Rename to...
(operand::gen_transform): ... this and add a flag argument for the kind of IL to generate.
(predicate::gen_gimple_transform): Likewise.
(expr::gen_gimple_transform): Likewise.
(c_expr::gen_gimple_transform): Likewise.
(capture::gen_gimple_transform): Likewise.
(dt_node::gen_generic): Add.
(dt_operand::gen_generic_expr): Add flag whether to valueize.
(dt_operand::gen_generic_expr_expr): Likewise.
(dt_operand::gen_generic_expr_fn): Likewise.
(dt_simplify::gen_generic): Add.
(decision_tree::gen_generic): Likewise.
(main): Uncomment generic codegen.

Index: genmatch.c === --- genmatch.c (revision 211941) +++ genmatch.c (working copy) @@ -206,14 +206,14 @@ struct operand { enum op_type { OP_PREDICATE, OP_EXPR, OP_CAPTURE, OP_C_EXPR }; operand (enum op_type type_) : type (type_) {} enum op_type type; - virtual void gen_gimple_transform (FILE *f, const char *) = 0; + virtual void gen_transform (FILE *f, const char *, bool) = 0; }; struct predicate : public operand { predicate (const char *ident_) : operand (OP_PREDICATE), ident (ident_) {} const char *ident; - virtual void gen_gimple_transform (FILE *, const char *) { gcc_unreachable (); } + virtual void gen_transform (FILE *, const char *, bool) { gcc_unreachable (); } }; struct e_operation { @@ -230,7 +230,7 @@ struct expr : public operand void append_op (operand *op) { ops.safe_push (op); } e_operation *operation; vecoperand * ops; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; struct c_expr : public operand @@ -242,7 +242,7 @@ struct c_expr : public operand veccpp_token code; unsigned nr_stmts; char *fname; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; struct capture : public operand @@ -251,7 +251,7 @@ struct capture : public operand : operand (OP_CAPTURE), where (where_), what (what_) {} const char *where; operand *what; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; @@ -324,6 +324,7 @@ struct dt_node dt_node *append_simplify (simplify *, unsigned, dt_operand **); virtual void gen_gimple (FILE *) {} + virtual void gen_generic (FILE *) {} }; struct dt_operand: public dt_node @@ -337,6 +338,7 @@ struct dt_operand: public dt_node : dt_node (type), op (op_), match_dop (match_dop_), parent (parent_), pos (pos_) {} virtual void gen_gimple (FILE *); + virtual void gen_generic (FILE *); unsigned gen_gimple_predicate (FILE *, const char 
*); unsigned gen_gimple_match_op (FILE *, const char *); @@ -344,9 +346,9 @@ struct dt_operand: public dt_node void gen_gimple_expr_expr (FILE *, expr *); void gen_gimple_expr_fn (FILE *, expr *); - unsigned gen_generic_expr (FILE *, const char *); - void gen_generic_expr_expr (FILE *, expr *, const char *); - void gen_generic_expr_fn (FILE *, expr *, const char *); + unsigned gen_generic_expr (FILE *, const char *, bool); + void gen_generic_expr_expr (FILE *, expr *, const char *, bool); + void gen_generic_expr_fn (FILE *, expr *, const char *, bool); char *get_name (char *); void gen_opname (char *, unsigned); @@ -369,6 +371,7 @@ struct dt_simplify: public dt_node } virtual void gen_gimple (FILE *f); + virtual void gen_generic (FILE *f); }; struct decision_tree @@ -377,6 +380,7 @@ struct decision_tree void insert (struct simplify *, unsigned); void gen_gimple (FILE *f = stderr); + void gen_generic (FILE *f = stderr); void print (FILE *f = stderr); decision_tree () { root = new dt_node (dt_node::DT_NODE); } @@ -535,7 +539,7 @@ commutate (operand *op) /* Code gen off the AST. */ void -expr::gen_gimple_transform (FILE *f, const char *dest) +expr::gen_transform (FILE *f, const char *dest, bool gimple) { fprintf (f, {\n); fprintf (f, tree ops[%u], res;\n, ops.length ()); @@ -543,31 +547,46 @@ expr::gen_gimple_transform (FILE *f, con { char dest[32]; snprintf (dest, 32, ops[%u], i); - ops[i]-gen_gimple_transform (f, dest); + ops[i]-gen_transform (f, dest, gimple); +} + if (gimple) +{ + /* ??? Have
Re: [Committed] New testcase for conditional move with conditional compares
On Jun 24, 2014, at 2:08 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
> On 23/06/14 22:12, Andrew Pinski wrote:
>> Hi,
>> When looking at the current conditional compare patch, I find that we don't have a testcase to test that we don't ICE for the case where we have conditional compares and conditional moves where the moves are of floating point types.
>>
>> This patch adds that testcase to the C torture compile test to make sure we don't ICE (which I think we do currently).
>
> FWIW, this doesn't ICE for me with aarch64-none-elf trunk.

I meant with conditional compare patches applied.

Thanks,
Andrew

> Kyrill
>
>> Thanks,
>> Andrew Pinski
>>
>> 2014-06-23 Andrew Pinski apin...@cavium.com
>>
>> * gcc.c-torture/compile/20140723-1.c: New testcase.
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
I'm not sure the warning is correct in any case...

In i386.h:

struct stringop_algs
{
  const enum stringop_alg unknown_size;
  const struct stringop_strategy {
    const int max;
    const enum stringop_alg alg;
    int noalign;
  } size [MAX_STRINGOP_ALGS];
};

In i386.c:

static stringop_algs ix86_size_memcpy[2] = {
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}},
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}}};
static stringop_algs ix86_size_memset[2] = {
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}},
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}}};
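To make the question about the diagnostic concrete, here is a reduced, stand-alone version of the declarations quoted above. The two-value enum and the value 4 for `MAX_STRINGOP_ALGS` are assumptions made so the snippet compiles on its own, and `strategy_max` is a hypothetical helper. Under the usual aggregate-initialization rules the initializer is well-formed even though the members are const: array elements without an explicit initializer are initialized from an empty list, so their members end up zero.

```cpp
#include <cassert>

// Reduced stand-ins for the i386.h definitions (assumptions: the real enum
// has more enumerators and MAX_STRINGOP_ALGS is defined elsewhere).
enum stringop_alg { no_stringop, rep_prefix_1_byte };
const int MAX_STRINGOP_ALGS = 4;

struct stringop_algs
{
  const enum stringop_alg unknown_size;
  const struct stringop_strategy {
    const int max;
    const enum stringop_alg alg;
    int noalign;
  } size [MAX_STRINGOP_ALGS];
};

// Same shape as the i386.c tables: only size[0] is spelled out per entry.
static stringop_algs ix86_size_memcpy[2] = {
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}},
  {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}}};

// Read back the max field of one strategy slot.
static int
strategy_max (int table, int slot)
{
  assert (slot >= 0 && slot < MAX_STRINGOP_ALGS);
  return ix86_size_memcpy[table].size[slot].max;
}
```

A "missing initializer" warning for the unlisted slots is a matter of taste under -Wextra, but a hard "uninitialized const member" error for this form would be surprising, which fits the suspicion that the failure is not in i386.c itself.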
[patch] Fix some pedantic warnings
This fixes some warnings when building the library with -Wpedantic (some only show up when also building with -Wsystem-headers). Tested x86_64-linux, committed to trunk. commit bf385e3ea2d8d10bdaa8e41c7638d32eb26b9bfd Author: Jonathan Wakely jwak...@redhat.com Date: Tue Jun 24 14:16:28 2014 +0100 * include/bits/functexcept.h (__throw_out_of_range_fmt): Change attribute to __gnu_printf__ archetype to prevent warnings for %zu. * include/bits/locale_facets_nonio.tcc (time_get::do_get_weekday): Remove unused typedef. (time_get::do_get_monthname): Likewise. * include/bits/stl_tree.h: Add system_header pragma. * include/ext/stdio_sync_filebuf.h (stdio_sync_filebuf::file): Remove redundant const-qualifier. * include/std/complex (complex::__rep): Use _GLIBCXX_CONSTEXPR macro instead of _GLIBCXX_USE_CONSTEXPR. diff --git a/libstdc++-v3/include/bits/functexcept.h b/libstdc++-v3/include/bits/functexcept.h index b8359f9..48be255 100644 --- a/libstdc++-v3/include/bits/functexcept.h +++ b/libstdc++-v3/include/bits/functexcept.h @@ -76,7 +76,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION void __throw_out_of_range_fmt(const char*, ...) 
   __attribute__((__noreturn__))
-  __attribute__((__format__(__printf__, 1, 2)));
+  __attribute__((__format__(__gnu_printf__, 1, 2)));
   void
   __throw_runtime_error(const char*) __attribute__((__noreturn__));
diff --git a/libstdc++-v3/include/bits/locale_facets_nonio.tcc b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
index 41d944d..c9f8dac 100644
--- a/libstdc++-v3/include/bits/locale_facets_nonio.tcc
+++ b/libstdc++-v3/include/bits/locale_facets_nonio.tcc
@@ -1064,7 +1064,6 @@ _GLIBCXX_END_NAMESPACE_LDBL
     do_get_weekday(iter_type __beg, iter_type __end, ios_base& __io,
                    ios_base::iostate& __err, tm* __tm) const
     {
-      typedef char_traits<_CharT> __traits_type;
       const locale& __loc = __io._M_getloc();
       const __timepunct<_CharT>& __tp = use_facet<__timepunct<_CharT> >(__loc);
       const ctype<_CharT>& __ctype = use_facet<ctype<_CharT> >(__loc);
@@ -1092,7 +1091,6 @@ _GLIBCXX_END_NAMESPACE_LDBL
     do_get_monthname(iter_type __beg, iter_type __end, ios_base& __io,
                      ios_base::iostate& __err, tm* __tm) const
     {
-      typedef char_traits<_CharT> __traits_type;
       const locale& __loc = __io._M_getloc();
       const __timepunct<_CharT>& __tp = use_facet<__timepunct<_CharT> >(__loc);
       const ctype<_CharT>& __ctype = use_facet<ctype<_CharT> >(__loc);
diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index ce43ab8..cc9bf94 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -58,6 +58,8 @@
 #ifndef _STL_TREE_H
 #define _STL_TREE_H 1
+#pragma GCC system_header
+
 #include <bits/stl_algobase.h>
 #include <bits/allocator.h>
 #include <bits/stl_function.h>
diff --git a/libstdc++-v3/include/ext/stdio_sync_filebuf.h b/libstdc++-v3/include/ext/stdio_sync_filebuf.h
index 5ca16eb..73283a7 100644
--- a/libstdc++-v3/include/ext/stdio_sync_filebuf.h
+++ b/libstdc++-v3/include/ext/stdio_sync_filebuf.h
@@ -84,7 +84,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    * Note that there is no way for the library to track what you do
    * with the file, so be careful.
    */
-  std::__c_file* const
+  std::__c_file*
   file() { return this->_M_file; }
 protected:
diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 5849cd5..1ae6c45 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -217,7 +217,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Up>
     complex<_Tp>& operator/=(const complex<_Up>&);
-  _GLIBCXX_USE_CONSTEXPR complex __rep() const
+  _GLIBCXX_CONSTEXPR complex __rep() const
   { return *this; }
 private:
@@ -1180,7 +1180,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return *this;
 }
-  _GLIBCXX_USE_CONSTEXPR _ComplexT __rep() const { return _M_value; }
+  _GLIBCXX_CONSTEXPR _ComplexT __rep() const { return _M_value; }
 private:
   _ComplexT _M_value;
@@ -1330,7 +1330,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return *this;
 }
-  _GLIBCXX_USE_CONSTEXPR _ComplexT __rep() const { return _M_value; }
+  _GLIBCXX_CONSTEXPR _ComplexT __rep() const { return _M_value; }
 private:
   _ComplexT _M_value;
@@ -1482,7 +1482,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return *this;
 }
-  _GLIBCXX_USE_CONSTEXPR _ComplexT __rep() const { return _M_value; }
+  _GLIBCXX_CONSTEXPR _ComplexT __rep() const { return _M_value; }
 private:
   _ComplexT _M_value;
Re: [PATCH][match-and-simplify] GENERIC code-gen
On Tue, Jun 24, 2014 at 9:00 PM, Richard Biener rguent...@suse.de wrote:
> This massages things so GENERIC code-gen works (well, is emitted and compiles).
>
> The GENERIC interface matches that of fold_{unary,binary,ternary}, and additionally supports calls. It's supposed to be called at the start of those functions (and if it returns NULL_TREE, the remainder of those functions is executed).
>
> At the moment it just creates dead code. I'll see if it works to wire it in tomorrow.
>
> Committed.

Small point: I renamed dt_operand::gen_gimple_predicate/gen_gimple_match_op to gen_predicate/gen_match_op.

* genmatch.c (dt_operand::gen_gimple_predicate): Rename to ...
(dt_operand::gen_predicate): ... this.
(dt_operand::gen_gimple_match_op): Rename to ...
(dt_operand::gen_match_op): ... this.

Thanks and Regards,
Prathamesh.

> Richard.
>
> 2014-06-24 Richard Biener rguent...@suse.de
>
> * genmatch.c (operand::gen_gimple_transform): Rename to...
> (operand::gen_transform): ... this and add a flag argument for the kind of IL to generate.
> (predicate::gen_gimple_transform): Likewise.
> (expr::gen_gimple_transform): Likewise.
> (c_expr::gen_gimple_transform): Likewise.
> (capture::gen_gimple_transform): Likewise.
> (dt_node::gen_generic): Add.
> (dt_operand::gen_generic_expr): Add flag whether to valueize.
> (dt_operand::gen_generic_expr_expr): Likewise.
> (dt_operand::gen_generic_expr_fn): Likewise.
> (dt_simplify::gen_generic): Add.
> (decision_tree::gen_generic): Likewise.
> (main): Uncomment generic codegen.

Index: genmatch.c === --- genmatch.c (revision 211941) +++ genmatch.c (working copy) @@ -206,14 +206,14 @@ struct operand { enum op_type { OP_PREDICATE, OP_EXPR, OP_CAPTURE, OP_C_EXPR }; operand (enum op_type type_) : type (type_) {} enum op_type type; - virtual void gen_gimple_transform (FILE *f, const char *) = 0; + virtual void gen_transform (FILE *f, const char *, bool) = 0; }; struct predicate : public operand { predicate (const char *ident_) : operand (OP_PREDICATE), ident (ident_) {} const char *ident; - virtual void gen_gimple_transform (FILE *, const char *) { gcc_unreachable (); } + virtual void gen_transform (FILE *, const char *, bool) { gcc_unreachable (); } }; struct e_operation { @@ -230,7 +230,7 @@ struct expr : public operand void append_op (operand *op) { ops.safe_push (op); } e_operation *operation; vecoperand * ops; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; struct c_expr : public operand @@ -242,7 +242,7 @@ struct c_expr : public operand veccpp_token code; unsigned nr_stmts; char *fname; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; struct capture : public operand @@ -251,7 +251,7 @@ struct capture : public operand : operand (OP_CAPTURE), where (where_), what (what_) {} const char *where; operand *what; - virtual void gen_gimple_transform (FILE *f, const char *); + virtual void gen_transform (FILE *f, const char *, bool); }; @@ -324,6 +324,7 @@ struct dt_node dt_node *append_simplify (simplify *, unsigned, dt_operand **); virtual void gen_gimple (FILE *) {} + virtual void gen_generic (FILE *) {} }; struct dt_operand: public dt_node @@ -337,6 +338,7 @@ struct dt_operand: public dt_node : dt_node (type), op (op_), match_dop (match_dop_), parent (parent_), pos (pos_) {} virtual void gen_gimple (FILE *); + virtual void gen_generic (FILE *); unsigned gen_gimple_predicate (FILE *, const char 
*); unsigned gen_gimple_match_op (FILE *, const char *); @@ -344,9 +346,9 @@ struct dt_operand: public dt_node void gen_gimple_expr_expr (FILE *, expr *); void gen_gimple_expr_fn (FILE *, expr *); - unsigned gen_generic_expr (FILE *, const char *); - void gen_generic_expr_expr (FILE *, expr *, const char *); - void gen_generic_expr_fn (FILE *, expr *, const char *); + unsigned gen_generic_expr (FILE *, const char *, bool); + void gen_generic_expr_expr (FILE *, expr *, const char *, bool); + void gen_generic_expr_fn (FILE *, expr *, const char *, bool); char *get_name (char *); void gen_opname (char *, unsigned); @@ -369,6 +371,7 @@ struct dt_simplify: public dt_node } virtual void gen_gimple (FILE *f); + virtual void gen_generic (FILE *f); }; struct decision_tree @@ -377,6 +380,7 @@ struct decision_tree void insert (struct simplify *, unsigned); void gen_gimple (FILE *f = stderr); + void gen_generic (FILE *f = stderr); void print (FILE *f = stderr); decision_tree () { root = new dt_node (dt_node::DT_NODE); } @@ -535,7 +539,7 @@ commutate (operand *op) /* Code gen
Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
Weird. Any chance you're doing a bootstrap build? There was an earlier bootstrapping issue with this branch. We had turned on -std=c++1y by default, and it was causing some conversion errors with lvalue references to bitfields in libasan.

This doesn't *look* like a regression caused by concepts -- I don't think I'm touching the initializer code at all.

Andrew Sutton

On Tue, Jun 24, 2014 at 11:42 AM, Ed Smith-Rowland 3dw...@verizon.net wrote:
> I'm not sure the warning is correct in any case...
>
> In i386.h:
>
> struct stringop_algs
> {
>   const enum stringop_alg unknown_size;
>   const struct stringop_strategy {
>     const int max;
>     const enum stringop_alg alg;
>     int noalign;
>   } size [MAX_STRINGOP_ALGS];
> };
>
> In i386.c:
>
> static stringop_algs ix86_size_memcpy[2] = {
>   {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}},
>   {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}}};
> static stringop_algs ix86_size_memset[2] = {
>   {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}},
>   {rep_prefix_1_byte, {{-1, rep_prefix_1_byte, false}}}};
Re: [fortran, patch] IEEE intrinsic modules (ping)
On Tue, Jun 24, 2014 at 10:11:32AM +0200, FX wrote:
> Here's a patch fixing the diff issue with configure.host and the doc (which apparently is only triggered by some versions of texinfo). Apart from that, functionally identical, so I'll paste here the "history" of the patch:
>
> ---
>
> Since last time, I incorporated Uros' comments on the libgfortran/config/fpu-387.h part, and add some documentation to the manual (list of supported targets, and required compilation flags for full IEEE support).
>
> OK to commit?

Not yet. On i386-*-freebsd

In file included from ../../../gcc4x/libgfortran/runtime/fpu.c:29:0:
./fpu-target.h: In function 'set_fpu_trap_exceptions':
./fpu-target.h:31:3: error: unknown type name 'fp_except'
   fp_except cw = fpgetmask();
...
gmake[3]: *** [fpu.lo] Error 1
gmake[3]: Leaving directory `/usr/home/kargl/gcc/obj4x/i386-unknown-freebsd11.0/libgfortran'
gmake[2]: *** [all] Error 2
gmake[2]: Leaving directory `/usr/home/kargl/gcc/obj4x/i386-unknown-freebsd11.0/libgfortran'
gmake[1]: *** [all-target-libgfortran] Error 2
gmake[1]: Leaving directory `/usr/home/kargl/gcc/obj4x'

Looking at the libgfortran/config.log shows that there is an error in the config test for fp_except_t.

configure:26048: checking for fp_except
configure:26048: /home/kargl/gcc/obj4x/./gcc/xgcc -B/home/kargl/gcc/obj4x/./gcc/ -B/home/kargl/work/i386-unknown-freebsd11.0/bin/ -B/home/kargl/work/i386-unknown-freebsd11.0/lib/ -isystem /home/kargl/work/i386-unknown-freebsd11.0/include -isystem /home/kargl/work/i386-unknown-freebsd11.0/sys-include -c -std=gnu11 -g -O2 conftest.c >&5
conftest.c: In function 'main':
conftest.c:261:13: error: 'fp_except' undeclared (first use in this function)
 if (sizeof (fp_except))
             ^
conftest.c:261:13: note: each undeclared identifier is reported only once for each function it appears in
configure:26048: $? = 1
configure: failed program was:
...
configure:26061: /home/kargl/gcc/obj4x/./gcc/xgcc -B/home/kargl/gcc/obj4x/./gcc/ -B/home/kargl/work/i386-unknown-freebsd11.0/bin/ -B/home/kargl/work/i386-unknown-freebsd11.0/lib/ -isystem /home/kargl/work/i386-unknown-freebsd11.0/include -isystem /home/kargl/work/i386-unknown-freebsd11.0/sys-include -c -std=gnu11 -g -O2 conftest.c >&5
conftest.c: In function 'main':
conftest.c:261:26: error: expected expression before ')' token
 if (sizeof ((fp_except_t)))
                          ^
configure:26061: $? = 1
configure: failed program was:
| /* confdefs.h */
[GSoC][match-and-simplify] get rid of multiple def_stmt
This patch avoids multiple definitions of def_stmt in different blocks by moving the definition to the beginning of gimple_match_and_simplify.

* genmatch.c (decision_tree::gen_gimple): Call fprintf to generate definition of def_stmt.
(dt_operand::gen_gimple_expr): Adjust call to fprintf to generate assignment of def_stmt.

Thanks and Regards,
Prathamesh

Index: gcc/genmatch.c
===
--- gcc/genmatch.c (revision 211950)
+++ gcc/genmatch.c (working copy)
@@ -960,7 +960,7 @@ dt_operand::gen_gimple_expr (FILE *f, co
   fprintf (f, "if (TREE_CODE (%s) == SSA_NAME)\n", opname);
   fprintf (f, "{\n");
-  fprintf (f, "gimple def_stmt = SSA_NAME_DEF_STMT (%s);\n", opname);
+  fprintf (f, "def_stmt = SSA_NAME_DEF_STMT (%s);\n", opname);
   (e->operation->op->kind == id_base::CODE) ? gen_gimple_expr_expr (f, e) : gen_gimple_expr_fn (f, e);
   return e->ops.length () + 2;
@@ -1261,6 +1261,7 @@ decision_tree::gen_gimple (FILE *f)
     fprintf (f, ", tree op%d", i);
   fprintf (f, ")\n");
   fprintf (f, "{\n");
+  fprintf (f, "gimple def_stmt;\n");
   for (unsigned i = 0; i < root->kids.length (); i++)
     {
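The shape of the generated code after this change can be sketched with a toy emitter. `emit_matcher` and `count_occurrences` are hypothetical helpers that build a string instead of writing to a FILE as genmatch does, and the emitted body is reduced to the lines that matter here: one `gimple def_stmt;` declaration at the top of the function, and a plain assignment inside each operand block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy emitter (hypothetical): declares def_stmt once, then only assigns
// to it in the block generated for each operand name.
static std::string
emit_matcher (const std::vector<std::string> &opnames)
{
  std::string out = "{\n";
  out += "  gimple def_stmt;\n";
  for (std::size_t i = 0; i < opnames.size (); ++i)
    {
      out += "  if (TREE_CODE (" + opnames[i] + ") == SSA_NAME)\n";
      out += "    {\n";
      out += "      def_stmt = SSA_NAME_DEF_STMT (" + opnames[i] + ");\n";
      out += "    }\n";
    }
  out += "}\n";
  return out;
}

// Count non-overlapping occurrences of NEEDLE in TEXT.
static std::size_t
count_occurrences (const std::string &text, const std::string &needle)
{
  std::size_t n = 0;
  for (std::size_t pos = text.find (needle); pos != std::string::npos;
       pos = text.find (needle, pos + needle.size ()))
    ++n;
  return n;
}
```

Emitting the declaration once keeps the generated function from re-declaring the same local in every nested block, at the cost of a slightly wider scope for `def_stmt`.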
Re: [PATCH 5/5] add libcc1
Trevor hrm, I know basically nothing about the upcoming changes, but I would Trevor have expected linking c++03 code against c++11 code would be fine Trevor especially when the interface doesn't involve any stl. https://gcc.gnu.org/wiki/Cxx11AbiCompatibility This warns against mixing with C++98, which seems to be how GCC is built. While I agree that in this specific case it is probably safe, since gcc in general isn't a heavy user of libstdc++, I think it's reasonable to simply follow gcc. This is safer in case gcc changes; and the benefit from C++11 in libcc1 is modest, especially when you consider the extra template magic we'd need in order to actually use variadic templates for the RPC stuff. Trevor Well, we build everything or at least everything I've seen with Trevor -fno-exceptions, so if something does throw we'll just crash right? Trevor istm we certainly write code calling the throwing new with that Trevor expectation. Gcc's coding conventions say that code ought to be exception-safe in case exceptions are used in the future. Search for Exceptions here: https://gcc.gnu.org/wiki/CppConventions I think retaining the std::nothrow is safer in view of this, and doesn't cause any problems. Tom
Re: [fortran, patch] IEEE intrinsic modules (ping)
On Tue, Jun 24, 2014 at 09:49:36AM -0700, Steve Kargl wrote:

Not yet. On i386-*-freebsd

In file included from ../../../gcc4x/libgfortran/runtime/fpu.c:29:0:
./fpu-target.h: In function 'set_fpu_trap_exceptions':
./fpu-target.h:31:3: error: unknown type name 'fp_except'
   fp_except cw = fpgetmask();

The (autogenerated?) fpu-target.h is totally bogus on FreeBSD. The file includes things like

void
get_fpu_state (void *s)
{
  fpu_state_t *state = s;

  /* Check we can actually store the FPU state in the allocated size. */
  assert (sizeof(fpu_state_t) <= GFC_FPE_STATE_BUFFER_SIZE);

  s->mask = fpgetmask ();
  s->sticky = fpgetsticky ();
  s->round = fpgetround ();
}

The s-> in the last 3 lines should be state->. There are several places where fp_except and fp_rnd are used unconditionally.

On FreeBSD (and perhaps other *BSD), there is no fpsetsticky(). The function is fpresetsticky().

-- 
Steve
Re: [PATCH 5/5] add libcc1
On Tue, Jun 24, 2014 at 11:12:38AM -0600, Tom Tromey wrote: Trevor hrm, I know basically nothing about the upcoming changes, but I would Trevor have expected linking c++03 code against c++11 code would be fine Trevor especially when the interface doesn't involve any stl. https://gcc.gnu.org/wiki/Cxx11AbiCompatibility This warns against mixing with C++98, which seems to be how GCC is built. While I agree that in this specific case it is probably safe, since gcc in general isn't a heavy user of libstdc++, I think it's reasonable to simply follow gcc. This is safer in case gcc changes; and the benefit yeah, I'm not disagreeing at this point. from C++11 in libcc1 is modest, especially when you consider the extra template magic we'd need in order to actually use variadic templates for the RPC stuff. It would get a little more than that, actually deleting the copy ctors and stuff you don't want people to call is one thing that comes to mind. Trevor Well, we build everything or at least everything I've seen with Trevor -fno-exceptions, so if something does throw we'll just crash right? Trevor istm we certainly write code calling the throwing new with that Trevor expectation. Gcc's coding conventions say that code ought to be exception-safe in case exceptions are used in the future. Search for Exceptions here: https://gcc.gnu.org/wiki/CppConventions I think that's a good idea in general, mostly because it results in simpler code, I think the idea gcc will be exception safe in a reasonable amount of time is a bit of a pipe dream. I think retaining the std::nothrow is safer in view of this, and doesn't cause any problems. I think in general trying to handle allocator failure is a waste of time, that is all the places we call the throwing new today we'd want to keep doing that in a world with exceptions and just let the exception kill us. So assuming you actually want to handle allocator failure in this particular case for some reason that seems reasonable. Trev Tom
Re: Move DECL_VINDEX and DECL_SAVED_TREE into function_decl
On 06/23/2014 04:25 PM, Jan Hubicka wrote:

	* class.c (check_methods, create_vtable_ptr, determine_key_method, add_vcall_offset_vtbl_entries_1): Guard VINDEX checks by FUNCTION_DECL check.

These changes are unnecessary: TYPE_METHODS is a list of functions.

I just double-checked, and there are template decls in the list, too. Those always have VINDEX NULL, but with my change they naturally ICE when you query it. I am re-testing the patch with the checks added back and intend to commit.

Honza

The rest of the patch is OK.

Jason
Re: [PATCH] IPA REF: refactoring
Hello, this patch changes IPA REF API to c++ style. Changes were suggested and consulted with Honza. Patch has been pre approved, will be committed if no comments. Bootstrapped on x86_64-pc-linux-gnu, no regressions. Thanks, Martin ChangeLog: 2014-06-22 Martin Liska mli...@suse.cz * Makefile.in: Removed header file (ipa-ref-inline.h). * cgraph.c (cgraph_turn_edge_to_speculative): New IPA REF function called. (cgraph_speculative_call_info): Likewise. (cgraph_for_node_thunks_and_aliases): Likewise. (cgraph_for_node_and_aliases): Likewise. (verify_cgraph_node): Likewise. * cgraph.h: Batch of IPA REF functions become member functions of symtab_node: add_reference, maybe_add_reference, clone_references, clone_referring, clone_reference, find_reference, remove_stmt_references, remove_all_references, remove_all_referring, dump_references, dump_referring, has_alias_p, iterate_reference, iterate_referring. * cgraphbuild.c (record_reference): New IPA REF function used. (record_type_list): Likewise. (record_eh_tables): Likewise. (mark_address): Likewise. (mark_load): Likewise. (mark_store): Likewise. (pass_build_cgraph_edges): Likewise. (rebuild_cgraph_edge): Likewise. (cgraph_rebuild_references): Likewise. (pass_remove_cgraph_callee_edges): Likewise. * cgraphclones.c (cgraph_clone_node): Likewise. (cgraph_create_virtual_clone): Likewise. (cgraph_materialize_clone): Likewise. (cgraph_materialize_all_clones): Likewise. * cgraphunit.c (cgraph_reset_node): Likewise. (cgraph_reset_node): Likewise. (analyze_function): Likewise. (assemble_thunks_and_aliases): Likewise. (expand_function): Likewise. * ipa-comdats.c (propagate_comdat_group): Likewise. (enqueue_references): Likewise. * ipa-cp.c (ipcp_discover_new_direct_edges): Likewise. (create_specialized_node): Likewise. * ipa-devirt.c (referenced_from_vtable_p): Likewise. * ipa-inline-transform.c (can_remove_node_now_p_1): Likewise. * ipa-inline.c (reset_edge_caches): Likewise. (update_caller_keys): Likewise. (execute): Likewise. 
* ipa-prop.c (remove_described_reference): Likewise. (propagate_controlled_uses): Likewise. (ipa_edge_duplication_hook): Likewise. (ipa_modify_call_arguments): Likewise. * ipa-pure-const.c (propagate_pure_const): Likewise. * ipa-ref-inline.h: Header file removed, functions moved to symtab_node class. * ipa-ref.c (remove_reference): New class member function. (cannot_lead_to_return): New class member function. (referring_ref_list): Likewise. (referred_ref_list): Likewise. Rest of functions moved to symtab_node class. * ipa-ref.h: New member functions remove_reference, cannot_lead_to_return, referring_ref_list, referred_ref_list added to ipa_ref class. ipa_ref_list class has new member functions: first_reference, first_referring, clear, nreferences. * ipa-reference.c (analyze_function): New IPA REF function used. (write_node_summary_p): Likewise. (ipa_reference_write_optimization_summary): Likewise. * ipa-split.c (split_function): Likewise. * ipa-utils.c (ipa_reverse_postorder): Likewise. * ipa-visibility.c (cgraph_non_local_node_p_1): Likewise. (function_and_variable_visibility): Likewise. * ipa.c (has_addr_references_p): Likewise. (process_references): Argument type changed. (symtab_remove_unreachable_nodes): New IPA REF function used. (process_references): Likewise. (set_writeonly_bit): Likewise. * lto-cgraph.c: Implementation of new symtab_node member functions that uses new IPA REF functions. * lto-streamer-in.c (fixup_call_stmt_edges_1): New IPA REF function used. * lto-streamer-out.c (output_symbol_p): Likewise. * lto-streamer.h (referenced_from_this_partition_p): Argument type changed. * lto/lto-partition.c (add_references_to_partition): New IPA REF function used. (add_symbol_to_partition_1): Likewise. (lto_balanced_map): Likewise. * lto/lto-symtab.c (lto_cgraph_replace_node): Likewise. * symtab.c: Implementation of new IPA REF API. * trans-mem.c (ipa_tm_create_version_alias): New IPA REF function used. (ipa_tm_create_version): Likewise. 
(ipa_tm_execute): Likewise. * tree-emutls.c (gen_emutls_addr): Likewise. * tree-inline.c (copy_bb): Likewise. (delete_unreachable_blocks_update_callgraph): Likewise. * varpool.c (varpool_remove_unreferenced_decls): Likewise. (varpool_for_node_and_aliases): Likewise.

Patch is OK. Thanks a lot for working on it. Note that I added the single_use pass that walks refs, so you need to update it too before committing.
Re: [PATCH 3/3] add hash_map class
On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote: From: Trevor Saunders tsaund...@mozilla.com

Hi, This patch adds a hash_map class so we can consolidate the boilerplate around using hash_table as a map; it also allows us to get rid of pointer_map, which I do in this patch by converting its users to hash_map.

Hello Trev, I like your changes! One small question about pointer_set, which does not support deletion of items: do you plan to migrate and simplify hash_map to be a replacement for pointer_set?

Note that pointer-map use in LTO is quite performance critical. It would be good to double check that the new use of hash does not produce slower code.

Honza
Re: [RFC] Making fold-const sane WRT symbol visibilities
The problem is that the patch fails testcases that assume we do such folding at parsing time.

./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-1.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-2.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-3.c (test for excess errors)
./testsuite/gcc/gcc.sum:FAIL: gcc.dg/pr36901-4.c (test for excess errors)

Here we accept the source as compile time constant while I think it is not: int sc = (sc 0); Does it seem reasonable to turn those testcases into one testing for error and live with delaying the folding opportunities to early opts? They are now caught by ccp1 pass usually.

IMHO all symbol visibility related foldings are very premature if done in the frontends (well, most of fold-const.c is ...). Of course everything depends on whether there exists a frontend that requires these foldings for correctness ...

Yeah, in my testing it seems that the C frontend does care in the testcase above: we used to accept code that tests nonzeroness of an address as a constant, while now we don't. Clang (IMO correctly) rejects it. If we are OK with changing the behaviour, I will just commit the patch and remove the testcase above (or turn it into an error check).

Personally I find nonzero ambiguous as it doesn't clearly state it is about the symbol's address rather than its value.

I can change it to nonzero_address. It did not occur to me that one might think of the value stored in the symbol.

Honza

Richard.

Honza

	* cgraph.h (symtab_node): Add method nonzero.
	(decl_in_symtab_p): Break out from ...
	(symtab_get_node): ... here.
	* symtab.c (symtab_node::nonzero): New method.
	* fold-const.c: Include cgraph.h
	(tree_single_nonzero_warnv_p): Use symtab for symbols.
	* testsuite/g++.dg/tree-ssa/nonzero-2.C: New testcase.
	* testsuite/g++.dg/tree-ssa/nonzero-1.C: New testcase.
	* testsuite/gcc.dg/tree-ssa/nonzero-1.c: New testcase.
Index: cgraph.h === --- cgraph.h(revision 211915) +++ cgraph.h(working copy) @@ -214,6 +214,9 @@ public: void set_init_priority (priority_type priority); priority_type get_init_priority (); + + /* Return true if symbol is known to be nonzero. */ + bool nonzero (); }; enum availability @@ -1068,6 +1077,17 @@ void varpool_remove_initializer (varpool /* In cgraph.c */ extern void change_decl_assembler_name (tree, tree); +/* Return true if DECL should have entry in symbol table if used. + Those are functions and static external veriables*/ + +static bool +decl_in_symtab_p (const_tree decl) +{ + return (TREE_CODE (decl) == FUNCTION_DECL + || (TREE_CODE (decl) == VAR_DECL + (TREE_STATIC (decl) || DECL_EXTERNAL (decl; +} + /* Return symbol table node associated with DECL, if any, and NULL otherwise. */ @@ -1075,12 +1095,7 @@ static inline symtab_node * symtab_get_node (const_tree decl) { #ifdef ENABLE_CHECKING - /* Check that we are called for sane type of object - functions - and static or external variables. */ - gcc_checking_assert (TREE_CODE (decl) == FUNCTION_DECL - || (TREE_CODE (decl) == VAR_DECL - (TREE_STATIC (decl) || DECL_EXTERNAL (decl) - || in_lto_p))); + gcc_checking_assert (decl_in_symtab_p (decl)); /* Check that the mapping is sane - perhaps this check can go away, but at the moment frontends tends to corrupt the mapping by calling memcpy/memset on the tree nodes. */ Index: symtab.c === --- symtab.c(revision 211915) +++ symtab.c(working copy) @@ -1574,4 +1574,67 @@ symtab_get_symbol_partitioning_class (sy return SYMBOL_PARTITION; } + +/* Return true when symbol is known to be non-zero. */ + +bool +symtab_node::nonzero () +{ + /* Weakrefs may be NULL when their target is not defined. */ + if (this-alias this-weakref) +{ + if (this-analyzed) + { + symtab_node *target = symtab_alias_ultimate_target (this); + + if (target-alias target-weakref) + return false; + /* We can not recurse to target::nonzero. 
It is possible that the +target is used only via the alias. +We may walk references and look for strong use, but we do not know +if this strong use will survive to final binary, so be +conservative here. +??? Maybe we could do the lookup during late optimization that +could be useful to eliminate the NULL pointer checks in LTO +programs. */ + if (target-definition !DECL_EXTERNAL (target-decl)) + return true; + if (target-resolution != LDPR_UNKNOWN +
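As a side note on why the address test cannot be folded early, the following C sketch shows a weak declaration whose address is only known at link time. It assumes an ELF target where an undefined weak symbol resolves to a null address; optional_hook is a hypothetical name used only for illustration and is deliberately left undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional hook: declared weak and never defined here, so on
   ELF targets (an assumption of this sketch) its address resolves to NULL.  */
extern void optional_hook (void) __attribute__ ((weak));

/* Must be evaluated at link/run time; folding this to 1 would be wrong.  */
int
hook_is_present (void)
{
  return optional_hook != NULL;
}

/* Call the hook only when the address test says it exists.  */
int
call_hook_if_present (void)
{
  if (!hook_is_present ())
    return 0;
  optional_hook ();
  return 1;
}
```

If the weak attribute arrived only in a later redeclaration, a front end that had already folded the comparison to true would have produced wrong code, which is the hazard the message describes.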
Re: [fortran, patch] IEEE intrinsic modules (ping)
Steve Kargl wrote: On FreeBSD (and perhaps other *BSD), there is no fpsetsticky(). The function is fpresetsticky(). Solaris has fpsetsticky() (requires ieeefp.h) and BSD has fpresetsticky() – thus, like at other places in that file, one needs to conditionally enable one or the other. Tobias
[PATCH] Convert XCOFF ASM_DECLARE_FUNCTION_NAME to function
In preparation to fix the alias issues on AIX, this patch changes ASM_DECLARE_FUNCTION_NAME from a macro to a function. Bootstrap on powerpc-ibm-aix7.1.0.0 in progress. * config/rs6000/xcoff.h (ASM_DECLARE_FUNCTION_NAME): Remove definition and call... * config/rs6000/rs6000.c (rs6000_xcoff_declare_function_name): New function. * config/rs6000/rs6000-protos.h (rs6000_xcoff_declare_function_name): Declare. Thanks, David * config/rs6000/xcoff.h (ASM_DECLARE_FUNCTION_NAME): Remove definition and call... * config/rs6000/rs6000.c (rs6000_xcoff_declare_function_name): New function. * config/rs6000/rs6000-protos.h (rs6000_xcoff_declare_function_name): Declare. Index: rs6000-protos.h === --- rs6000-protos.h (revision 211938) +++ rs6000-protos.h (working copy) @@ -164,6 +164,7 @@ extern rtx rs6000_va_arg (tree, tree); extern int function_ok_for_sibcall (tree); extern int rs6000_reg_parm_stack_space (tree, bool); +extern void rs6000_xcoff_declare_function_name (FILE *, const char *, tree); extern void rs6000_elf_declare_function_name (FILE *, const char *, tree); extern bool rs6000_elf_in_small_data_p (const_tree); #ifdef ARGS_SIZE_RTX Index: rs6000.c === --- rs6000.c(revision 211938) +++ rs6000.c(working copy) @@ -29452,6 +29452,71 @@ asm_out_file); } +/* This macro produces the initial definition of a function name. + On the RS/6000, we need to place an extra '.' in the function name and + output the function descriptor. + Dollar signs are converted to underscores. + + The csect for the function will have already been created when + text_section was selected. We do have to go back to that csect, however. + + The third and fourth parameters to the .function pseudo-op (16 and 044) + are placeholders which no longer have any use. 
*/ + +void +rs6000_xcoff_declare_function_name (FILE *file, const char *name, tree decl) +{ + char *buffer = (char *) alloca (strlen (name) + 1); + char *p; + int dollar_inside = 0; + strcpy (buffer, name); + p = strchr (buffer, '$'); + while (p) { +*p = '_'; +dollar_inside++; +p = strchr (p + 1, '$'); + } + if (TREE_PUBLIC (decl)) +{ + if (!RS6000_WEAK || !DECL_WEAK (decl)) + { + if (dollar_inside) { + fprintf(file, \t.rename .%s,\.%s\\n, buffer, name); + fprintf(file, \t.rename %s,\%s\\n, buffer, name); + } + fputs (\t.globl ., file); + RS6000_OUTPUT_BASENAME (file, buffer); + putc ('\n', file); + } +} + else +{ + if (dollar_inside) { + fprintf(file, \t.rename .%s,\.%s\\n, buffer, name); + fprintf(file, \t.rename %s,\%s\\n, buffer, name); + } + fputs (\t.lglobl ., file); + RS6000_OUTPUT_BASENAME (file, buffer); + putc ('\n', file); +} + fputs (\t.csect , file); + RS6000_OUTPUT_BASENAME (file, buffer); + fputs (TARGET_32BIT ? [DS]\n : [DS],3\n, file); + RS6000_OUTPUT_BASENAME (file, buffer); + fputs (:\n, file); + fputs (TARGET_32BIT ? \t.long . : \t.llong ., file); + RS6000_OUTPUT_BASENAME (file, buffer); + fputs (, TOC[tc0], 0\n, file); + in_section = NULL; + switch_to_section (function_section (decl)); + putc ('.', file); + RS6000_OUTPUT_BASENAME (file, buffer); + fputs (:\n, file); + if (write_symbols != NO_DEBUG !DECL_IGNORED_P (decl)) +xcoffout_declare_function (file, decl, buffer); + return; +} + #ifdef HAVE_AS_TLS static void rs6000_xcoff_encode_section_info (tree decl, rtx rtl, int first) Index: xcoff.h === --- xcoff.h (revision 211938) +++ xcoff.h (working copy) @@ -134,68 +134,12 @@ #undef TARGET_ASM_FILE_START_FILE_DIRECTIVE #define TARGET_ASM_FILE_START_FILE_DIRECTIVE false -/* This macro produces the initial definition of a function name. - On the RS/6000, we need to place an extra '.' in the function name and - output the function descriptor. - Dollar signs are converted to underscores. 
+/* This macro produces the initial definition of a function name. */ - The csect for the function will have already been created when - text_section was selected. We do have to go back to that csect, however. +#undef ASM_DECLARE_FUNCTION_NAME +#define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL)\ + rs6000_xcoff_declare_function_name ((FILE), (NAME), (DECL)) - The third and fourth parameters to the .function pseudo-op (16 and 044) - are placeholders which no longer have any use. */ - -#define ASM_DECLARE_FUNCTION_NAME(FILE,NAME,DECL) \ -{ char *buffer = (char *) alloca (strlen (NAME) + 1); \ - char *p; \ - int dollar_inside = 0;
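The dollar-sign handling at the top of the new function can be sketched in isolation as below. This is an illustrative C version only: sanitize_symbol_name is a hypothetical helper, and it uses malloc where the patch uses alloca so the result can outlive the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy NAME into a fresh buffer, replacing every '$' with '_'.
   Stores the number of replacements in *N_DOLLARS.  Caller frees.  */
char *
sanitize_symbol_name (const char *name, int *n_dollars)
{
  assert (name != NULL && n_dollars != NULL);
  char *buffer = malloc (strlen (name) + 1);
  if (buffer == NULL)
    return NULL;
  strcpy (buffer, name);

  *n_dollars = 0;
  for (char *p = strchr (buffer, '$'); p != NULL; p = strchr (p + 1, '$'))
    {
      *p = '_';
      (*n_dollars)++;
    }
  return buffer;
}
```

In the patch, a nonzero count is what triggers the .rename directives, which map the sanitized name back to the original one.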
Re: [PATCH] Convert XCOFF ASM_DECLARE_FUNCTION_NAME to function
In preparation to fix the alias issues on AIX, this patch changes ASM_DECLARE_FUNCTION_NAME from a macro to a function.

Thanks, David! We will also need to introduce ASM_DECLARE_OBJECT_NAME to handle the aliases of variables. I can look into that probably later this week (it is last week of my teaching and I need to do the finals)

Honza
Re: [PATCH 3/3] add hash_map class
On Tue, Jun 24, 2014 at 08:23:49PM +0200, Jan Hubicka wrote:

On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote: From: Trevor Saunders tsaund...@mozilla.com

Hi, This patch adds a hash_map class so we can consolidate the boilerplate around using hash_table as a map; it also allows us to get rid of pointer_map, which I do in this patch by converting its users to hash_map.

Hello Trev, I like your changes! One small question about pointer_set, which does not support deletion of items: do you plan to migrate and simplify hash_map to be a replacement for pointer_set?

Note that pointer-map use in LTO is quite performance critical. It would be good to double check that the new use of hash does not produce slower code.

I believe the compiled code should be very similar, but I'll do some measuring to check.

Trev

Honza
Re: [fortran, patch] IEEE intrinsic modules (ping)
On Tue, Jun 24, 2014 at 08:34:23PM +0200, Tobias Burnus wrote:

Steve Kargl wrote: On FreeBSD (and perhaps other *BSD), there is no fpsetsticky(). The function is fpresetsticky().

Solaris has fpsetsticky() (requires ieeefp.h) and BSD has fpresetsticky() – thus, like at other places in that file, one needs to conditionally enable one or the other.

I suppose I don't understand the logic in libgfortran/configure.host. It is picking the wrong config/fpu*.h file. gmake | tee sgk.log shows (long lines wrapped)

cp ../../../gcc4x/libgfortran/config/fpu-sysv.h fpu-target.h
grep '^#define GFC_FPE_' ../../../gcc4x/libgfortran/../gcc/fortran/\
  libgfortran.h fpu-target.inc || true
grep '^#define GFC_FPE_' ../../../gcc4x/libgfortran/libgfortran.h \
  fpu-target.inc || true
gmake all-am

FreeBSD (and the other *BSD) have both feenableexcept() and fpsetmask(), but neither check is correct. It seems the check for feenableexcept assumes glibc and the check for fpsetmask assumes a SysV system.

-- 
Steve
Re: Delay RTL initialization until it is really needed
On 06/20/14 01:51, Jan Hubicka wrote:

Hi,
IRA initialization shows high in profiles even when building lto objects. This patch simply delays RTL backend initialization until we really decide to output a function. In some cases this avoids the initialization completely (like in the case of LTO but also user target attributes) and there is some hope for better cache locality.

Basic idea is to have two flags saying whether lang and target dependent bits need initialization and to check them when starting function codegen.

Bootstrapped/regtested x86_64-linux, testing also at AIX. Ok if it passes?

Honza

	* toplev.c (backend_init_target): Move init_emit_regs and init_regs to...
	(backend_init) ... here; skip ira_init_once and backend_init_target.
	(target_reinit) ... and here; clear this_target_rtl->lang_dependent_initialized.
	(lang_dependent_init_target): Clear this_target_rtl->lang_dependent_initialized;
	break out rtl initialization to ...
	(initialize_rtl): ... here; call also backend_init_target and ira_init_once.
	* toplev.h (initialize_rtl): New function.
	* function.c: Include toplev.h
	(init_function_start): Call initialize_rtl.
	* rtl.h (target_rtl): Add target_specific_initialized, lang_dependent_initialized.

Index: toplev.c
===================================================================
--- toplev.c	(revision 211837)
+++ toplev.c	(working copy)
@@ -1686,6 +1682,31 @@ lang_dependent_init_target (void)
      front end is initialized. It also depends on the HAVE_xxx macros
      generated from the target machine description. */
   init_optabs ();
+  this_target_rtl->lang_dependent_initialized = false;
+}
+
+/* Perform initializations that are lang-dependent or target-dependent.
+   but matters only for late optimizations and RTL generation. */
+
+void
+initialize_rtl (void)
+{
+  static int initialized_once;
+
+  /* Initialization done just once per compilation, but delayed
+     till code generation. */
+  if (!initialized_once)
+    ira_init_once ();
+  initialized_once = true;
+
+  /* Target specific RTL backend initialization. */
+  if (!this_target_rtl->target_specific_initialized)
+    backend_init_target ();
+  this_target_rtl->target_specific_initialized = true;
+
+  if (this_target_rtl->lang_dependent_initialized)
+    return;
+  this_target_rtl->lang_dependent_initialized = true;
 
   /* The following initialization functions need to generate rtl, so
      provide a dummy function context for them. */
@@ -1784,8 +1805,15 @@ target_reinit (void)
       regno_reg_rtx = NULL;
     }
 
-  /* Reinitialize RTL backend. */
-  backend_init_target ();
+  this_target_rtl->lang_dependent_initialized = false;

Do you want to reset target_specific_initialized here as well?

Jeff
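The flag-based delayed initialization discussed in this message can be sketched as a small stand-alone C model. Everything here is hypothetical: the counters stand in for ira_init_once, backend_init_target and the lang-dependent setup, and target_reinit_sketch resets both flags, which is an assumption of this sketch (the variant raised in the review question), not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the expensive initializers; they only
   count how often each kind of work would have been done.  */
int once_calls, target_calls, lang_calls;

static bool initialized_once;
static bool target_specific_initialized;
static bool lang_dependent_initialized;

/* Called at the start of code generation for each function.  */
void
initialize_rtl_sketch (void)
{
  if (!initialized_once)
    once_calls++;		/* one-time setup */
  initialized_once = true;

  if (!target_specific_initialized)
    target_calls++;		/* per-target backend setup */
  target_specific_initialized = true;

  if (lang_dependent_initialized)
    return;
  lang_dependent_initialized = true;
  lang_calls++;			/* setup that needs to generate RTL */
}

/* Called when target options change: only mark work as pending.  */
void
target_reinit_sketch (void)
{
  target_specific_initialized = false;
  lang_dependent_initialized = false;
}
```

The point of the design is that a reinit costs nothing by itself; the work is repeated only if another function is actually compiled afterwards.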
Re: Re: [c++-concepts] Fix assertion failure with cp_maybe_constrained_type_specifier
On 06/24/14, Andrew Sutton wrote: Weird. Any chance you're doing a bootstrap build? There was an earlier bootstrapping issue with this branch. We had turned on -std=c++1y by default, and it was causing some conversion errors with lvalue references to bitfields in libasan. This doesn't *look* like a regression caused by concepts -- I don't think I'm touching the initializer code at all. Andrew Sutton Andrew, I did a full 3-stage bootstrap which is the default these days. I'll try --disable-bootstrap and see what happens. In other news: I think the lvalue to bitfield issue is resolved in 4.9 and trunk. Note to self: Add a testcase for that if not done already. Ed
Re: [PATCH 3/3] add hash_map class
On June 24, 2014 9:16:34 PM CEST, Trevor Saunders tsaund...@mozilla.com wrote:

On Tue, Jun 24, 2014 at 08:23:49PM +0200, Jan Hubicka wrote:

On 06/20/2014 12:52 PM, tsaund...@mozilla.com wrote: From: Trevor Saunders tsaund...@mozilla.com

Hi, This patch adds a hash_map class so we can consolidate the boilerplate around using hash_table as a map; it also allows us to get rid of pointer_map, which I do in this patch by converting its users to hash_map.

Hello Trev, I like your changes! One small question about pointer_set, which does not support deletion of items: do you plan to migrate and simplify hash_map to be a replacement for pointer_set?

Note that pointer-map use in LTO is quite performance critical. It would be good to double check that the new use of hash does not produce slower code.

I believe the compiled code should be very similar, but I'll do some measuring to check.

More important is memory use.

Richard.

Trev

Honza
Re: Another AIX Bootstrap failure
Hi Honza,

On 23 Jun 2014, at 18:36, Jan Hubicka wrote:

The tests gcc.dg/globalalias-2.c and gcc.dg/localalias-2.c fail on darwin with

/opt/gcc/work/gcc/testsuite/gcc.dg/globalalias-2.c:20:2: warning: alias definitions not supported in Mach-O; ignored

I think they should be protected by /* { dg-require-alias } */

I see, the annoying property of dg.exp where we compile the additional sources separately, too. Will fix that. Is it really the case that Mach-O has no way of creating an alias, even by putting an alternative symbol into the source as we intend to do for AIX?

The status is this: There is a symbol flag in Mach-O (N_INDR) that would permit marking one symbol as an alias of another. At present, this is ignored by ld64 (and not emitted by cctools-as) [although, ironically, it used to be acted on in the ancient ld_classic of ~2004 vintage].

Fortunately, it seems that the darwin/OSX community within llvm is interested in having support for this functionality, which is Good News, since without ld64 support it would be very difficult to maintain for future systems. The statement is "it's on the TODO, but not a high priority".

For any darwin 13 (or maybe even 14) it would mean that we would have to roll our own implementation. Actually, I am quite close to having both GAS ports and a more generally buildable ld64 version, so rolling our own is really a possibility. However, as I'm sure you know, Darwin stuff on GCC/GAS is on a volunteer basis, so it's difficult to say in which decade the support might appear ;)

cheers
Iain
[PATCH v2] gcc: fix segfault from calling free on non-malloc'd area
We see the following on a 32bit gcc installed on 64 bit host: Reading symbols from ./i586-pokymllib32-linux-gcc...done. (gdb) run Starting program: x86-pokymllib32-linux/lib32-gcc/4.9.0-r0/image/usr/bin/i586-pokymllib32-linux-gcc Program received signal SIGSEGV, Segmentation fault. 0xf7e957e0 in free () from /lib/i386-linux-gnu/libc.so.6 (gdb) bt #0 0xf7e957e0 in free () from /lib/i386-linux-gnu/libc.so.6 #1 0x0804b73c in set_multilib_dir () at gcc-4.9.0/gcc/gcc.c:7827 #2 main (argc=1, argv=0xd504) at gcc-4.9.0/gcc/gcc.c:6688 (gdb) The problem arises because we conditionally assign the pointer we eventually free, and the conditional may assign the pointer to the non-malloc'd internal string . which fails when we free it here: if (multilib_dir == NULL multilib_os_dir != NULL strcmp (multilib_os_dir, .) == 0) { free (CONST_CAST (char *, multilib_os_dir)); ... As suggested by Jakub, ensure the . case is also malloc'd via xstrdup() and hence the pointer for the . case can be freed. Cc: Jakub Jelinek ja...@redhat.com Cc: Jeff Law l...@redhat.com Cc: Matthias Klose d...@ubuntu.com CC: Tobias Burnus bur...@net-b.de Signed-off-by: Paul Gortmaker paul.gortma...@windriver.com --- [v2: don't change the causality of the free() ; instead just make the . pointer be malloc'd as well. Note that I was unable to reproduce the broken-ness of my original (broken) patch with a direct build of trunk, with ./configure --prefix=/usr/local but I also did re-test this new patch still fixed the error that we saw in yocto with gcc-4.9.0 with the invalid free segfault.] 
 gcc/gcc.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 9ac18e60d801..168acf7eb0c9 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -7790,10 +7790,15 @@ set_multilib_dir (void)
 	    q2++;
 	  if (*q2 == ':')
 	    ml_end = q2;
-	  new_multilib_os_dir = XNEWVEC (char, ml_end - q);
-	  memcpy (new_multilib_os_dir, q + 1, ml_end - q - 1);
-	  new_multilib_os_dir[ml_end - q - 1] = '\0';
-	  multilib_os_dir = *new_multilib_os_dir ? new_multilib_os_dir : ".";
+	  if (ml_end - q == 1)
+	    multilib_os_dir = xstrdup (".");
+	  else
+	    {
+	      new_multilib_os_dir = XNEWVEC (char, ml_end - q);
+	      memcpy (new_multilib_os_dir, q + 1, ml_end - q - 1);
+	      new_multilib_os_dir[ml_end - q - 1] = '\0';
+	      multilib_os_dir = new_multilib_os_dir;
+	    }
 
 	  if (q2 < end && *q2 == ':')
 	    {
--
1.9.2
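The rule this patch relies on (a pointer that is later passed to free() must come from the allocator on every path) can be sketched in isolation. This is only an illustration, not code from gcc.c: dup_range_or_dot() is a hypothetical helper, and it uses plain malloc() where GCC would use xstrdup() and XNEWVEC.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from gcc.c.  Copy the LEN bytes at
   BEGIN into fresh heap memory.  An empty range yields a heap copy of
   "." instead of a pointer to a string literal, so the result can
   always be handed to free().  Returns NULL if allocation fails.  */
static char *
dup_range_or_dot (const char *begin, size_t len)
{
  char *copy;

  assert (begin != NULL);
  if (len == 0)
    {
      copy = (char *) malloc (2);
      if (copy != NULL)
        memcpy (copy, ".", 2);  /* copies '.' and the terminating NUL */
      return copy;
    }

  copy = (char *) malloc (len + 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, begin, len);
  copy[len] = '\0';
  return copy;
}
```

A caller can then free the result unconditionally, which is the property the original code lacked when it fell back to the literal ".".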
Re: [patch, mips] delete bit-rotten ADJUST_REG_ALLOC_ORDER definition
Sandra Loosemore san...@codesourcery.com writes:

On 05/14/2014 12:49 PM, Richard Sandiford wrote:

Jeff Law l...@redhat.com writes:

On 05/13/14 14:11, Sandra Loosemore wrote:

2014-05-13 Catherine Moore c...@codesourcery.com
           Sandra Loosemore san...@codesourcery.com

gcc/
* config/mips/mips.c (mips_order_regs_for_local_alloc): Delete.
* config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete.
* config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete.

OK for the trunk.

Would it be OK to hold off until after the switch to LRA? That patch has been written and the MIPS parts approved, but we're waiting for some legal things to be sorted out and for a fixed version of the LRA EXTRA_MEMORY_CONSTRAINT patch. I just think it'd be better to tune this sort of thing once that's done, rather than tune it against reload.

Richard, is it OK to commit this patch now that LRA is in, or do you want to experiment some more with tuning first? I think we're all in agreement that this is broken old code that should be removed regardless of whether we do other things to tune REG_ALLOC_ORDER.

Sure, go ahead. I retried with trunk and removing the definition had very little effect on non-MIPS16 and an overall positive effect on MIPS16. Which is a bit ironic, given that the hook was supposed to help MIPS16 and at face value would hurt non-MIPS16 more. It looks like moving $24 first really isn't a win for MIPS16 with IRA.

To reinforce that, I tried the old patch I posted in place of yours, but moving $24 ahead of the other registers was better for non-MIPS16 code and worse for MIPS16. The MIPS16 results were as close to trunk as the non-MIPS16 ones were with your patch; just two differences. So the MIPS16 benefit of your patch really is coming from having $24 after the MIPS16 registers.

The reason for the improvement in non-MIPS16 results with my patch seemed to be that putting the return and argument registers first leads to less freedom of movement and thus more unfilled delay slots. In some cases I think these were more due to reorg.c's infamous approach to register liveness rather than real interference. E.g. branches to a return statement were not having their delay slots filled with an assignment to $2 even though the return statement set $2 itself. Change the assigned register from $2 to $24 and the delay slot could be filled.

So it might be interesting to experiment with putting the later call-clobbered registers first, but that's a separate change.

Richard
Re: [PATCH] Change default for --param allow-...-data-races to off
On Mon, Jun 23, 2014 at 03:35:01PM +0200, Bernd Edlinger wrote:

Hi Martin,

Well actually, I am not sure if we ever wanted to have a race condition here. Have you seen any impact of --param allow-store-data-races on any benchmark?

It's trivial to write one. The only pass that checks the param is tree loop invariant motion and it does that when it applies store-motion. Register pressure is increased by a factor of two. So I'd agree that we might want to disable this again for -Ofast. As nothing tests for the PACKED variants nor for the LOAD variant I'd rather remove those. Claiming we don't create races for those when you disable it via the param is simply not true.

Thanks,
Richard.

OK, please go ahead with your patch.

Perhaps unsurprisingly, the patch is very similar. Bootstrapped and tested on x86_64-linux. OK for trunk?

Thanks,
Martin

2014-06-24 Martin Jambor mjam...@suse.cz

* params.def (PARAM_ALLOW_LOAD_DATA_RACES)
(PARAM_ALLOW_PACKED_LOAD_DATA_RACES)
(PARAM_ALLOW_PACKED_STORE_DATA_RACES): Removed.
(PARAM_ALLOW_STORE_DATA_RACES): Set default to zero.
* opts.c (default_options_optimization): Set PARAM_ALLOW_STORE_DATA_RACES to one at -Ofast.
* doc/invoke.texi (allow-load-data-races)
(allow-packed-load-data-races, allow-packed-store-data-races): Removed.
(allow-store-data-races): Document the new default.

testsuite/
* g++.dg/simulate-thread/bitfields-2.C: Remove allow-load-data-races parameter.
* g++.dg/simulate-thread/bitfields.C: Likewise.
* gcc.dg/simulate-thread/strict-align-global.c: Remove allow-packed-store-data-races parameter.
* gcc.dg/simulate-thread/subfields.c: Likewise.
* gcc.dg/tree-ssa/20050314-1.c: Set parameter allow-store-data-races to one.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0d4bd00..027b6fb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10176,25 +10176,10 @@ The maximum number of conditional stores paires that can be sunk.
 Set to 0 if either vectorization (@option{-ftree-vectorize}) or if-conversion
 (@option{-ftree-loop-if-convert}) is disabled. The default is 2.
 
-@item allow-load-data-races
-Allow optimizers to introduce new data races on loads.
-Set to 1 to allow, otherwise to 0. This option is enabled by default
-unless implicitly set by the @option{-fmemory-model=} option.
-
 @item allow-store-data-races
 Allow optimizers to introduce new data races on stores.
 Set to 1 to allow, otherwise to 0. This option is enabled by default
-unless implicitly set by the @option{-fmemory-model=} option.
-
-@item allow-packed-load-data-races
-Allow optimizers to introduce new data races on packed data loads.
-Set to 1 to allow, otherwise to 0. This option is enabled by default
-unless implicitly set by the @option{-fmemory-model=} option.
-
-@item allow-packed-store-data-races
-Allow optimizers to introduce new data races on packed data stores.
-Set to 1 to allow, otherwise to 0. This option is enabled by default
-unless implicitly set by the @option{-fmemory-model=} option.
+at optimization level @option{-Ofast}.
 
 @item case-values-threshold
 The smallest number of different values for which it is best to use a
diff --git a/gcc/opts.c b/gcc/opts.c
index 3ab06c6..19203dc 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -620,6 +620,13 @@ default_options_optimization (struct gcc_options *opts,
      opt2 ? default_param_value (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP) : 1000,
      opts->x_param_values, opts_set->x_param_values);
 
+  /* At -Ofast, allow store motion to introduce potential race conditions. */
+  maybe_set_param_value
+    (PARAM_ALLOW_STORE_DATA_RACES,
+     opts->x_optimize_fast ? 1
+     : default_param_value (PARAM_ALLOW_STORE_DATA_RACES),
+     opts->x_param_values, opts_set->x_param_values);
+
   if (opts->x_optimize_size)
     /* We want to crossjump as much as possible. */
     maybe_set_param_value (PARAM_MIN_CROSSJUMP_INSNS, 1,
diff --git a/gcc/params.def b/gcc/params.def
index 28ef79a..aa1e88d 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1002,25 +1002,10 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD,
 	  0, 0, 0)
 
 /* Data race flags for C++0x memory model compliance. */
-DEFPARAM (PARAM_ALLOW_LOAD_DATA_RACES,
-	  "allow-load-data-races",
-	  "Allow new data races on loads to be introduced",
-	  1, 0, 1)
-
 DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
 	  "allow-store-data-races",
 	  "Allow new data races on stores to be introduced",
-	  1, 0, 1)
-
-DEFPARAM (PARAM_ALLOW_PACKED_LOAD_DATA_RACES,
-	  "allow-packed-load-data-races",
-	  "Allow new data races on packed data loads to be introduced",
-	  1, 0, 1)
-
-DEFPARAM (PARAM_ALLOW_PACKED_STORE_DATA_RACES,
-	  "allow-packed-store-data-races",
-	  "Allow new data races on packed data stores to be introduced",
-
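For readers unfamiliar with why store motion can introduce a data race, here is a rough sketch in C of the kind of transformation under discussion. The function names are made up for illustration and the code is hand-written, not compiler output; the third variant shows one way to keep the hoisted value without adding a store, assuming a "was anything stored" flag is acceptable.

```c
#include <assert.h>
#include <stddef.h>

/* Source form: *flag is written only if some element is nonzero.  */
static void
mark_source (int *flag, const int *a, size_t n)
{
  size_t i;

  assert (flag != NULL);
  for (i = 0; i < n; i++)
    if (a[i])
      *flag = 1;
}

/* Roughly what store motion may produce when store data races are
   allowed: the value lives in a temporary and is written back
   unconditionally.  When no element is nonzero this is a store the
   source never performed, which can race with another thread that
   writes *flag in the meantime.  */
static void
mark_racy (int *flag, const int *a, size_t n)
{
  size_t i;
  int tmp;

  assert (flag != NULL);
  tmp = *flag;
  for (i = 0; i < n; i++)
    if (a[i])
      tmp = 1;
  *flag = tmp;
}

/* Race-free variant: remember whether the loop stored anything and
   write back only in that case.  */
static void
mark_guarded (int *flag, const int *a, size_t n)
{
  size_t i;
  int tmp;
  int stored = 0;

  assert (flag != NULL);
  tmp = *flag;
  for (i = 0; i < n; i++)
    if (a[i])
      {
        tmp = 1;
        stored = 1;
      }
  if (stored)
    *flag = tmp;
}
```

In a single thread all three variants compute the same result; the difference is only visible to a concurrent writer of *flag, which is why the parameter matters for memory-model compliance rather than for sequential correctness.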
Re: [fortran, patch] IEEE intrinsic modules (ping)
On Tue, Jun 24, 2014 at 09:43:06PM +0200, FX wrote:

Thanks for testing on a platform I don't have access to! I'll try to answer the three main points below:

I suppose I don't understand the logic in libgfortran/configure.host. It is picking the wrong config/fpu*.h file.

1. This is a preexisting bug, then. Currently, we have 4 versions of the FPU-specific code:

- fpu-glibc.h works on any platform that has C99 fenv.h + the feenableexcept(), fedisableexcept() and fegetexcept() extensions
- fpu-387.h aims at x86/x86_64 systems, and should have priority over fpu-glibc.h (because it allows for control of denormals, which the above does not have)
- fpu-aix.h requires C99 fenv.h + many AIX extensions (fp_trap(), fp_enable(), fp_disable(), fp_is_enabled(), fp_invalid_op())
- fpu-sysv.h requires many SysV function calls: fpgetmask(), fpgetround(), fpgetsticky(), etc.

The logic in configure.host clearly does not accommodate targets that have two styles of calls. I think it should be moved around so that the order of priority is aix sysv glibc 387. This would work on FreeBSD and probably the other *BSD systems.

FreeBSD (and the other *BSD) have both feenableexcept() and fpsetmask(), but neither check is correct. It seems the check for feenableexcept assumes glibc and the check for fpsetmask assumes a SysV system.

2. How does the check fail?

To get past the build failure, I changed configure.host to use have_fp_except instead of have_fpsetmask. With FreeBSD the defined type is fp_except_t instead of the SysV fp_except.

if test "x${have_fp_except}" = "xyes"; then
  fpu_host='fpu-sysv'
  ieee_support='yes'
fi

I haven't checked to see if have_fp_except is actually set/unset by configure. With this change I pick up fpu-387.h and the build completes as expected. I'm now in the regression testing stage.

What does the config.log say? It looks like a pretty generic check in configure.ac:

AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes AC_DEFINE([HAVE_FEENABLEEXCEPT],[1],[libm includes feenableexcept])])

config.h eventually ends up with

/* libm includes feenableexcept */
#define HAVE_FEENABLEEXCEPT 1

/* Define to 1 if you have the fenv.h header file. */
#define HAVE_FENV_H 1

/* Define if you have fpsetmask. */
#define HAVE_FPSETMASK 1

/* Define to 1 if the system has the type `fp_except'. */
/* #undef HAVE_FP_EXCEPT */

/* Define to 1 if the system has the type `fp_except_t'. */
#define HAVE_FP_EXCEPT_T 1

/* Define to 1 if the system has the type `fp_rnd'. */
/* #undef HAVE_FP_RND */

/* Define to 1 if the system has the type `fp_rnd_t'. */
#define HAVE_FP_RND_T 1

checking only if libc or libm contain any call to a feenableexcept() function. Is it a macro on FreeBSD?

It is a function. The problem seems to be that fpu-sysv.h assumes the types fp_except and fp_rnd whereas FreeBSD has fp_except_t and fp_rnd_t.

3. Does the attached updated patch (libgfortran only, without regenerated files) fix the problem?

I'll test it when my regtesting is completed. But, a scan of the configure.host re-arrangement suggests that it should work.

--
Steve
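The priority question in point 1 can be modelled as a chain of overrides, which is how a sequence of shell tests in configure.host behaves: the last matching test wins. The sketch below is only an illustration in C. It assumes the proposed list "aix sysv glibc 387" runs from lowest to highest priority (the thread says 387 should win over glibc), and the struct, the function name and the fallback header name are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical summary of what configure detected on the target.  */
struct fpu_probe
{
  int have_fp_trap;          /* AIX-style fp_trap() and friends */
  int have_fpsetmask;        /* SysV-style fpsetmask() */
  int have_feenableexcept;   /* glibc-style feenableexcept() */
  int is_x86;                /* x86 or x86_64 target */
};

/* Pick the FPU support header.  Later tests override earlier ones, so
   the order of the tests is the priority order, lowest first.  The
   fallback name is an assumption made for this sketch.  */
static const char *
pick_fpu_header (const struct fpu_probe *p)
{
  const char *header = "fpu-generic.h";

  assert (p != NULL);
  if (p->have_fp_trap)
    header = "fpu-aix.h";
  if (p->have_fpsetmask)
    header = "fpu-sysv.h";
  if (p->have_feenableexcept)
    header = "fpu-glibc.h";
  if (p->is_x86)
    header = "fpu-387.h";
  return header;
}
```

Under this ordering a FreeBSD/x86 system, which offers both fpsetmask() and feenableexcept(), ends up with fpu-387.h, and a non-x86 FreeBSD system with fpu-glibc.h, so the fp_except versus fp_except_t mismatch in fpu-sysv.h is never reached there.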
Re: [fortran, patch] IEEE intrinsic modules (ping)
3. Does the attached updated patch (libgfortran only, without regenerated files) fix the problem? I'll test it when my regtesting is completed. But, a scan of the configure.host re-arrangement suggests that it should work. OK. If you have some spare cycles, could you then also check it by modifying configure.host so that it uses the updated config/fpu-sysv.h in my patch? I would like to make sure I don’t break anything, but I don’t have access to a Solaris system (and my earlier calls for someone to test it for me were unanswered, so I don’t have much hope there). Thanks again, FX
Re: [PATCH 3/5] IPA ICF pass
On 06/13/14 04:44, mliska wrote:

Hello, this is core of IPA ICF patchset. It adds new pass and registers all needed stuff related to a newly introduced interprocedural optimization.

Algorithm description:

In LGEN, we visit all read-only variables and functions. For each symbol, a hash value based on e.g. number of arguments, number of BB, GIMPLE CODES is computed (similar hash is computed for read-only variables). This kind of information is streamed for LTO.

In WPA, we build congruence classes for all symbols having a same hash value. For functions, these classes are subdivided in WPA by argument type comparison. Each reference (a call or a variable reference) to another semantic item candidate is marked and stored for further congruence class reduction (similar algorithm as Value Numbering: www.cs.ucr.edu/~gupta/teaching/553-07/Papers/value.pdf).

For every congruence class of functions with more than one semantic function, we load function body. Having this information, we can process complete semantic function equality and subdivide such congruence class. Read-only variable class members are also deeply compared.

After that, we process Value numbering algorithm to do a final subdivision. Finally, all items belonging to a congruence class with more than one item are merged.

Martin

Changelog:

2014-06-13 Martin Liska mli...@suse.cz
           Jan Hubicka hubi...@ucw.cz

* Makefile.in: New pass object file added.
* common.opt: New -fipa-icf flag introduced.
* doc/invoke.texi: Documentation enhanced for the pass.
* lto-section-in.c: New LTO section for a summary created by IPA-ICF.
* lto-streamer.h: New section name introduced.
* opts.c: Optimization is added to -O2.
* passes.def: New pass added.
* timevar.def: New time var for IPA-ICF.
* tree-pass.h: Pass construction function.
* ipa-icf.h: New pass header file added.
* ipa-icf.c: New pass source file added.
You'll note many of my comments are "do you need to". You may in fact be handling that stuff correctly; they're just things I'd like you to verify are properly handled. If they're properly handled, just say so :-)

At a high level, I think this needs to be broken down a bit more. We've got two high level concepts in ipa-icf. One is all the equivalence testing, the other is using that information for the icf optimization. Splitting out the equivalence testing seems like a good thing to do as there are other contexts where it would be useful.

Overall I think you're on the right path and we just need to iterate a bit on this part of the patchset.

@@ -7862,6 +7863,14 @@ it may significantly increase code size
 (see @option{--param ipcp-unit-growth=@var{value}}).
 This flag is enabled by default at @option{-O3}.
 
+@item -fipa-icf
+@opindex fipa-icf
+Perform Identical Code Folding for functions and read-only variables.
+Behavior is similar to Gold Linker ICF optimization. Symbols proved
+as semantically equivalent are redirected to corresponding symbol. The pass
+sensitively decides for usage of alias, thunk or local redirection.
+This flag is enabled by default at @option{-O2}.

So you've added this at -O2, what is the general compile-time impact? Would it make more sense to instead have it be part of -O3, particularly since ICF is rarely going to improve performance (sans icache issues)?

+
+/* Interprocedural Identical Code Folding for functions and
+   read-only variables.
+
+   The goal of this transformation is to discover functions and read-only
+   variables which do have exactly the same semantics.
+
+   In case of functions,
+   we could either create a virtual clone or do a simple function wrapper
+   that will call equivalent function. If the function is just locally visible,
+   all function calls can be redirected. For read-only variables, we create
+   aliases if possible.
+
+   Optimization pass arranges as follows:
+   1) All functions and read-only variables are visited and internal
+      data structure, either sem_function or sem_variables is created.
+   2) For every symbol from the previoues step, VAR_DECL and FUNCTION_DECL are
+      saved and matched to corresponding sem_items.

s/previoues/previous/

+   3) These declaration are ignored for equality check and are solved
+      by Value Numbering algorithm published by Alpert, Zadeck in 1992.
+   4) We compute hash value for each symbol.
+   5) Congruence classes are created based on hash value. If hash value are
+      equal, equals function is called and symbols are deeply compared.
+      We must prove that all SSA names, declarations and other items
+      correspond.
+   6) Value Numbering is executed for these classes. At the end of the process
+      all symbol members in remaining classes can be mrged.

s/mrged/merged/

+   7) Merge operation creates alias in case of read-only variables. For
+      callgraph
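The hash-then-compare idea in steps 4 and 5 of the quoted comment can be sketched on a toy model. The code below is not from ipa-icf.c: struct item, cheap_hash() and assign_classes() are hypothetical names, a function body is stood in for by a string, and the hash is deliberately weak so that collisions occur and the deep comparison has work to do. The later value-numbering refinement is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a semantic item; BODY models the function
   body or variable initializer that would be compared deeply.  */
struct item
{
  const char *body;
  unsigned hash;
  int class_id;
};

/* Deliberately weak hash (the length only), so unequal bodies can
   collide, as real hashes occasionally do.  */
static unsigned
cheap_hash (const char *body)
{
  return (unsigned) strlen (body);
}

/* Assign a congruence class to each of the N items: two items share a
   class only if their hashes match and the deep comparison (strcmp
   here) also succeeds.  Returns the number of classes created.  */
static int
assign_classes (struct item *items, size_t n)
{
  size_t i, j;
  int next_class = 0;

  assert (items != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      items[i].hash = cheap_hash (items[i].body);
      items[i].class_id = -1;
    }

  for (i = 0; i < n; i++)
    {
      if (items[i].class_id != -1)
        continue;                     /* already placed in a class */
      items[i].class_id = next_class;
      for (j = i + 1; j < n; j++)
        if (items[j].class_id == -1
            && items[j].hash == items[i].hash           /* cheap filter */
            && strcmp (items[j].body, items[i].body) == 0) /* deep check */
          items[j].class_id = next_class;
      next_class++;
    }
  return next_class;
}
```

Classes that end up with more than one member are the merge candidates; the hash only decides which pairs are worth the expensive comparison.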
Re: [fortran, patch] IEEE intrinsic modules (ping)
On Tue, Jun 24, 2014 at 10:26:27PM +0200, FX wrote:

3. Does the attached updated patch (libgfortran only, without regenerated files) fix the problem?

I'll test it when my regtesting is completed. But, a scan of the configure.host re-arrangement suggests that it should work.

OK. If you have some spare cycles, could you then also check it by modifying configure.host so that it uses the updated config/fpu-sysv.h in my patch? I would like to make sure I don't break anything, but I don't have access to a Solaris system (and my earlier calls for someone to test it for me were unanswered, so I don't have much hope there).

Yes, I'll check the configure.host and fpu-sysv.h changes.

--
Steve
bootstrap failure for cygwin, mingw targets due recent changes to hash_table
Hi Trevor,

your recent commits have broken bootstrap for cygwin/mingw i386 targets with:

../../gcc/gcc/config/i386/winnt.c: In function 'unsigned int i386_pe_section_type_flags(tree, const char*, int)':
../../gcc/gcc/config/i386/winnt.c:503:61: error: no matching function for call to 'hash_table<pointer_hash<unsigned int> >::find_slot(const unsigned int*, insert_option)'
   slot = htab->find_slot ((const unsigned int *)name, INSERT);
                                                             ^
../../gcc/gcc/config/i386/winnt.c:503:61: note: candidate is:
In file included from ../../gcc/gcc/config/i386/winnt.c:35:0:
../../gcc/gcc/hash-table.h:1030:15: note: hash_table<Descriptor, Allocator, true>::value_type* hash_table<Descriptor, Allocator, true>::find_slot(const value_type&, insert_option) [with Descriptor = pointer_hash<unsigned int>; Allocator = xcallocator; hash_table<Descriptor, Allocator, true>::value_type = unsigned int*] <near match>
   value_type *find_slot (const value_type &value, insert_option insert)
               ^
../../gcc/gcc/hash-table.h:1030:15: note: no known conversion for argument 1 from 'const unsigned int*' to 'unsigned int* const&'
../../gcc/gcc/config/i386/t-cygming:26: recipe for target 'winnt.o' failed
make[1]: *** [winnt.o] Error 1
make[1]: Leaving directory '/home/ktietz/source/gcc-head/buildw64/gcc'
Makefile:4006: recipe for target 'all-gcc' failed
make: *** [all-gcc] Error 2

Please fix that or revert.

Thanks,
Kai
[PATCH] Fix PR c++/61537
* parser.c (cp_parser_elaborated_type_specifier): Only consider template parameter lists outside of function parameter scope.
* g++.dg/cpp1y/pr61537.C: New testcase.
---
 gcc/cp/parser.c                      | 28 +++++++++++++++++++---------
 gcc/testsuite/g++.dg/cpp1y/pr61537.C | 24 ++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/pr61537.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 41200a0..736d012 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -15081,6 +15081,15 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 	return cp_parser_make_typename_type (parser, parser->scope,
 					     identifier,
 					     token->location);
+
+      /* Template parameter lists apply only if we are not within a
+	 function parameter list.  */
+      bool template_parm_lists_apply
+	  = parser->num_template_parameter_lists;
+      for (cp_binding_level *s = current_binding_level; s; s = s->level_chain)
+	if (s->kind == sk_function_parms)
+	  template_parm_lists_apply = false;
+
       /* Look up a qualified name in the usual way.  */
       if (parser->scope)
 	{
@@ -15123,7 +15132,7 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 
 	  decl = (cp_parser_maybe_treat_template_as_class
 		  (decl, /*tag_name_p=*/is_friend
-			 && parser->num_template_parameter_lists));
+			 && template_parm_lists_apply));
 
 	  if (TREE_CODE (decl) != TYPE_DECL)
 	    {
@@ -15136,9 +15145,9 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 
 	  if (TREE_CODE (TREE_TYPE (decl)) != TYPENAME_TYPE)
 	    {
-	      bool allow_template = (parser->num_template_parameter_lists
-				      || DECL_SELF_REFERENCE_P (decl));
-	      type = check_elaborated_type_specifier (tag_type, decl,
+	      bool allow_template = (template_parm_lists_apply
+				     || DECL_SELF_REFERENCE_P (decl));
+	      type = check_elaborated_type_specifier (tag_type, decl,
 						      allow_template);
 
 	      if (type == error_mark_node)
@@ -15224,15 +15233,16 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,
 	    ts = ts_global;
 
 	  template_p =
-	    (parser->num_template_parameter_lists
+	    (template_parm_lists_apply
 	     && (cp_parser_next_token_starts_class_definition_p (parser)
 		 || cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)));
 	  /* An unqualified name was used to reference this type, so
 	     there were no qualifying templates.  */
-	  if (!cp_parser_check_template_parameters (parser,
-						    /*num_templates=*/0,
-						    token->location,
-						    /*declarator=*/NULL))
+	  if (template_parm_lists_apply
+	      && !cp_parser_check_template_parameters (parser,
+						       /*num_templates=*/0,
+						       token->location,
+						       /*declarator=*/NULL))
 	    return error_mark_node;
 	  type = xref_tag (tag_type, identifier, ts, template_p);
 	}
diff --git a/gcc/testsuite/g++.dg/cpp1y/pr61537.C b/gcc/testsuite/g++.dg/cpp1y/pr61537.C
new file mode 100644
index 0000000..55761cd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/pr61537.C
@@ -0,0 +1,24 @@
+// PR c++/61537
+// { dg-do compile { target c++1y } }
+// { dg-options "" }
+
+struct A {};
+
+template <typename T>
+struct B
+{
+  template <typename U>
+  void f(U, struct A);
+};
+
+template <typename T>
+template <typename U>
+void B<T>::f(U, struct A)
+{
+}
+
+int main()
+{
+  B<char> b;
+  b.f(42, A());
+}
--
2.0.0
Re: [MIPS r5900] libgcc floating point fixes
Please don't ignore this; support for MIPS r5900 has already been ignored for ~10 years. Just apply the patch if you don't know what I am talking about. The fix for mips16 is very important. The patch has impact only on MIPS r5900.

Sent: Sunday, 15 June 2014 at 22:08
From: Jürgen Urban juergenur...@gmx.de
To: gcc-patches@gcc.gnu.org
Cc: juergenur...@gmx.de
Subject: [MIPS r5900] libgcc floating point fixes

Hello,

I found a problem in GCC on MIPS r5900: When printf() is used with type float, the converter function __extendsfdf2() is called. Parameters to printf() are always passed as double and not float. The function __extendsfdf2() calls itself to convert 32 bit float to 64 bit float. With Linux the __extendsfdf2() leads to a segfault when the stack limit is reached.

Here is the wrong code (mipsel-linux-gnu):

00402af0 <__extendsfdf2>:
  402af0: 3c1c0002  lui   gp,0x2
  402af4: 279c9b30  addiu gp,gp,-25808
  402af8: 0399e021  addu  gp,gp,t9
  402afc: 8f9980a0  lw    t9,-32608(gp)  # Load pointer to __extendsfdf2 into t9
  402b00: 27bdffe0  addiu sp,sp,-32
  402b04: afbf001c  sw    ra,28(sp)
  402b08: 0320f809  jalr  t9             # Calls __extendsfdf2 (itself)
  402b0c: afbc0010  sw    gp,16(sp)
  402b10: 8fbf001c  lw    ra,28(sp)
  402b14: 03e00008  jr    ra
  402b18: 27bd0020  addiu sp,sp,32
  402b1c: 00000000  nop

The problem happens with the r5900 hard float configurations, e.g.:

configure --target=mipsel-linux-gnu --with-float=hard --with-fpu=single --with-arch=r5900

I created the attached patch which fixes this problem for r5900 and another problem explained later.
With the fix, the following code is generated, which should be correct (mipsr5900el-ps2-elf):

00105440 <__extendsfdf2>:
  105440: 27bdffc8  addiu sp,sp,-56
  105444: 27a40028  addiu a0,sp,40
  105448: 27a50018  addiu a1,sp,24
  10544c: afbf0034  sw    ra,52(sp)
  105450: 0c0417b5  jal   105ed4 <__unpack_f>
  105454: e7ac0028  swc1  $f12,40(sp)
  105458: 8fa20024  lw    v0,36(sp)
  10545c: 8fa40018  lw    a0,24(sp)
  105460: 8fa5001c  lw    a1,28(sp)
  105464: 8fa60020  lw    a2,32(sp)
  105468: 00021882  srl   v1,v0,0x2
  10546c: 00021780  sll   v0,v0,0x1e
  105470: afa20010  sw    v0,16(sp)
  105474: 0c041789  jal   105e24 <__make_dp>
  105478: afa30014  sw    v1,20(sp)
  10547c: 8fbf0034  lw    ra,52(sp)
  105480: 03e00008  jr    ra
  105484: 27bd0038  addiu sp,sp,56

The default targets mipsr5900el and mips64r5900el are not affected by the problem, because soft float is the default. It also seems that the same problem occurs with the following configuration:

configure --target=mipsel-linux-gnu --with-float=hard --with-fpu=single

I expected that this combination should work and that such a problem would already have been detected. Can somebody confirm that the problem also occurs with default mipsel?

The second part of the patch fixes the following configuration:

configure --target=mipsel-linux-gnu --with-arch=r5900

It disables the mips16 stuff in libgcc. This can't be compiled on r5900. This was already disabled for targets mipsr5900*el. I detected the problem because the buildroot project uses this style, which leads to fewer problems with existing software (because mipsel or mips64el is hardcoded in most configure scripts, which don't expect mipsr5900el or mips64r5900el).

Can someone please add the patch to the official GCC repository?

I am not sure whether I fixed all self-calling implementations. Does somebody know a way of finding self-calling implementations? I tried to find an FPU testsuite, but the testsuites are not designed to test non-standard FPUs or use double instead of float. So there can be more problems with the FPU on r5900 which I don't see at the moment.

Best regards
Jürgen Urban
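To make the failure mode concrete: a software __extendsfdf2 has to build the double from the bits of the float, because writing the conversion as a cast in C would make the compiler emit a call to __extendsfdf2 again on a target without hardware double support. The sketch below shows the bit-level widening for IEEE 754 formats. It is an illustration only, not the libgcc implementation (which goes through __unpack_f and __make_dp as the disassembly shows); extend_float_bits() is a hypothetical name, and the r5900 FPU's deviations from IEEE 754 are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: widen the bit pattern of an IEEE 754 binary32
   value to the bit pattern of the equal binary64 value, using integer
   operations only (no float-to-double cast anywhere).  */
static uint64_t
extend_float_bits (uint32_t f)
{
  uint64_t sign = (uint64_t) (f >> 31) << 63;
  uint32_t exp = (f >> 23) & 0xffu;    /* biased binary32 exponent */
  uint32_t frac = f & 0x7fffffu;       /* 23-bit fraction */

  if (exp == 0xffu)
    /* Infinity or NaN: all-ones exponent, fraction moved up.  */
    return sign | ((uint64_t) 0x7ff << 52) | ((uint64_t) frac << 29);

  if (exp == 0)
    {
      uint32_t shift = 0;

      if (frac == 0)
        return sign;                   /* signed zero */

      /* Subnormal: normalize so that the implicit leading 1 sits in
         bit 23, counting how far the fraction was shifted.  */
      while ((frac & 0x800000u) == 0)
        {
          frac <<= 1;
          shift++;
        }
      frac &= 0x7fffffu;               /* drop the now-implicit 1 */
      /* Unbiased exponent is -126 - shift; rebias with 1023.  */
      return sign | ((uint64_t) (897u - shift) << 52)
             | ((uint64_t) frac << 29);
    }

  /* Normal number: rebias the exponent (1023 - 127 = 896) and move
     the 23-bit fraction into the top of the 52-bit field.  */
  return sign | ((uint64_t) (exp + 896u) << 52) | ((uint64_t) frac << 29);
}
```

The widening is exact, since every binary32 value is representable in binary64, so no rounding step is needed.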
Re: [patch] Do not generate useless integral conversions
I think that was on purpose to avoid arithmetics in enum types.

As those conversions are useless and thus stripped later is it really important to retain enum and boolean kind here?

The problem is that convert.c is called by front-ends and the patch also removed the callback into them that made it possible to have some control. So, yes, it's pretty annoying to see totally bogus conversion nodes being introduced into your ASTs behind your back...

--
Eric Botcazou
[C++ Patch/RFC] PR 49132
Hi,

this remained unresolved for a long time, but, if I understand Jason's Comment 1 correctly, it should be rather easy: just do not complain about uninitialized const members in aggregates, recursively too (per struct B in the testcases). Does the below make sense, then?!? It passes testing, anyway.

I'm also taking the occasion to guard the warnings with complain & tf_warning.

Thanks,
Paolo.

Index: cp/typeck2.c
===
--- cp/typeck2.c	(revision 211955)
+++ cp/typeck2.c	(working copy)
@@ -1342,28 +1342,15 @@ process_init_constructor_record (tree type, tree i
 	  next = massage_init_elt (TREE_TYPE (field), next, complain);
 
 	  /* Warn when some struct elements are implicitly initialized.  */
-	  warning (OPT_Wmissing_field_initializers,
-		   "missing initializer for member %qD", field);
+	  if (complain & tf_warning)
+	    warning (OPT_Wmissing_field_initializers,
+		     "missing initializer for member %qD", field);
 	}
       else
 	{
-	  if (TREE_READONLY (field))
+	  if (TREE_CODE (TREE_TYPE (field)) == REFERENCE_TYPE)
 	    {
 	      if (complain & tf_error)
-		error ("uninitialized const member %qD", field);
-	      else
-		return PICFLAG_ERRONEOUS;
-	    }
-	  else if (CLASSTYPE_READONLY_FIELDS_NEED_INIT (TREE_TYPE (field)))
-	    {
-	      if (complain & tf_error)
-		error ("member %qD with uninitialized const fields", field);
-	      else
-		return PICFLAG_ERRONEOUS;
-	    }
-	  else if (TREE_CODE (TREE_TYPE (field)) == REFERENCE_TYPE)
-	    {
-	      if (complain & tf_error)
 		error ("member %qD is uninitialized reference", field);
 	      else
 		return PICFLAG_ERRONEOUS;
@@ -1371,8 +1358,9 @@ process_init_constructor_record (tree type, tree i
 
 	  /* Warn when some struct elements are implicitly initialized
 	     to zero.  */
-	  warning (OPT_Wmissing_field_initializers,
-		   "missing initializer for member %qD", field);
+	  if (complain & tf_warning)
+	    warning (OPT_Wmissing_field_initializers,
+		     "missing initializer for member %qD", field);
 
 	  if (!zero_init_p (TREE_TYPE (field)))
 	    next = build_zero_init (TREE_TYPE (field), /*nelts=*/NULL_TREE,
Index: testsuite/g++.dg/cpp0x/aggr1.C
===
--- testsuite/g++.dg/cpp0x/aggr1.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/aggr1.C	(working copy)
@@ -0,0 +1,16 @@
+// PR c++/49132
+// { dg-do compile { target c++11 } }
+
+struct A {
+  const int m;
+};
+
+A a1 = {};
+A a2{};
+
+struct B {
+  A a;
+};
+
+B b1 = {};
+B b2{};
Index: testsuite/g++.dg/init/aggr11.C
===
--- testsuite/g++.dg/init/aggr11.C	(revision 0)
+++ testsuite/g++.dg/init/aggr11.C	(working copy)
@@ -0,0 +1,13 @@
+// PR c++/49132
+
+struct A {
+  const int m;
+};
+
+A a1 = {};
+
+struct B {
+  A a;
+};
+
+B b1 = {};
Re: [C++ Patch/RFC] PR 49132
... in order to correctly handle the case of references in a member in C++98 mode (*), I think we have to add an explicit check of CLASSTYPE_REF_FIELDS_NEED_INIT, per the below.

Thanks,
Paolo.

(*) For C++11 my previous patch is fine, the TREE_CODE (TREE_TYPE (field)) == REFERENCE_TYPE check also catches references in members.

Index: cp/typeck2.c
===
--- cp/typeck2.c	(revision 211955)
+++ cp/typeck2.c	(working copy)
@@ -1342,37 +1342,32 @@ process_init_constructor_record (tree type, tree i
 	  next = massage_init_elt (TREE_TYPE (field), next, complain);
 
 	  /* Warn when some struct elements are implicitly initialized.  */
-	  warning (OPT_Wmissing_field_initializers,
-		   "missing initializer for member %qD", field);
+	  if (complain & tf_warning)
+	    warning (OPT_Wmissing_field_initializers,
+		     "missing initializer for member %qD", field);
 	}
       else
 	{
-	  if (TREE_READONLY (field))
+	  if (TREE_CODE (TREE_TYPE (field)) == REFERENCE_TYPE)
 	    {
 	      if (complain & tf_error)
-		error ("uninitialized const member %qD", field);
+		error ("member %qD is uninitialized reference", field);
 	      else
 		return PICFLAG_ERRONEOUS;
 	    }
-	  else if (CLASSTYPE_READONLY_FIELDS_NEED_INIT (TREE_TYPE (field)))
+	  else if (CLASSTYPE_REF_FIELDS_NEED_INIT (TREE_TYPE (field)))
 	    {
 	      if (complain & tf_error)
-		error ("member %qD with uninitialized const fields", field);
+		error ("member %qD with uninitialized reference fields", field);
 	      else
 		return PICFLAG_ERRONEOUS;
 	    }
-	  else if (TREE_CODE (TREE_TYPE (field)) == REFERENCE_TYPE)
-	    {
-	      if (complain & tf_error)
-		error ("member %qD is uninitialized reference", field);
-	      else
-		return PICFLAG_ERRONEOUS;
-	    }
 
 	  /* Warn when some struct elements are implicitly initialized
 	     to zero.  */
-	  warning (OPT_Wmissing_field_initializers,
-		   "missing initializer for member %qD", field);
+	  if (complain & tf_warning)
+	    warning (OPT_Wmissing_field_initializers,
+		     "missing initializer for member %qD", field);
 
 	  if (!zero_init_p (TREE_TYPE (field)))
 	    next = build_zero_init (TREE_TYPE (field), /*nelts=*/NULL_TREE,
Index: testsuite/g++.dg/cpp0x/aggr1.C
===
--- testsuite/g++.dg/cpp0x/aggr1.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/aggr1.C	(working copy)
@@ -0,0 +1,16 @@
+// PR c++/49132
+// { dg-do compile { target c++11 } }
+
+struct A {
+  const int m;
+};
+
+A a1 = {};
+A a2{};
+
+struct B {
+  A a;
+};
+
+B b1 = {};
+B b2{};
Index: testsuite/g++.dg/cpp0x/aggr2.C
===
--- testsuite/g++.dg/cpp0x/aggr2.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/aggr2.C	(working copy)
@@ -0,0 +1,16 @@
+// PR c++/49132
+// { dg-do compile { target c++11 } }
+
+struct A {
+  int& m;
+};
+
+A a1 = {}; // { dg-error "uninitialized reference" }
+A a2{}; // { dg-error "uninitialized reference" }
+
+struct B {
+  A a;
+};
+
+B b1 = {}; // { dg-error "uninitialized reference" }
+B b2{}; // { dg-error "uninitialized reference" }
Index: testsuite/g++.dg/init/aggr11.C
===
--- testsuite/g++.dg/init/aggr11.C	(revision 0)
+++ testsuite/g++.dg/init/aggr11.C	(working copy)
@@ -0,0 +1,13 @@
+// PR c++/49132
+
+struct A {
+  const int m;
+};
+
+A a1 = {};
+
+struct B {
+  A a;
+};
+
+B b1 = {};
Index: testsuite/g++.dg/init/aggr12.C
===
--- testsuite/g++.dg/init/aggr12.C	(revision 0)
+++ testsuite/g++.dg/init/aggr12.C	(working copy)
@@ -0,0 +1,13 @@
+// PR c++/49132
+
+struct A {
+  int& m;
+};
+
+A a1 = {}; // { dg-error "uninitialized reference" }
+
+struct B {
+  A a;
+};
+
+B b1 = {}; // { dg-error "uninitialized reference" }
[GOOGLE] Do not change edge probabilities when propagating edge counts
Hi,

This patch removes unnecessary edge probability calculations in afdo_propagate_circuit() that would eventually be overridden by afdo_calculate_branch_prob(). This would pave the way for my next patch, which compares the estimated branch probabilities or the branch annotations against the real profile data.

Thanks,
Yi

gcc/

2014-06-24 Yi Yang ahyan...@google.com

* auto-profile.c (afdo_propagate_circuit): Do not change edge probabilities when propagating edge counts

diff --git gcc/auto-profile.c gcc/auto-profile.c
index 51e318d..74d3d1d 100644
--- gcc/auto-profile.c
+++ gcc/auto-profile.c
@@ -1328,16 +1328,9 @@ afdo_propagate_circuit (void)
 	    continue;
 	  total++;
 	  only_one = ep;
-	  if (e->probability == 0 && (e->flags & EDGE_ANNOTATED) == 0)
-	    {
-	      ep->probability = 0;
-	      ep->count = 0;
-	      ep->flags |= EDGE_ANNOTATED;
-	    }
 	}
       if (total == 1 && (only_one->flags & EDGE_ANNOTATED) == 0)
 	{
-	  only_one->probability = e->probability;
 	  only_one->count = e->count;
 	  only_one->flags |= EDGE_ANNOTATED;
 	}
--
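What remains after the patch is a pure count propagation. The sketch below restates that step outside GCC, under stated assumptions: struct edge_info, the EDGE_ANNOTATED value and propagate_count() are hypothetical stand-ins, and the caller is assumed to have already filtered the candidate edges the way the loop in afdo_propagate_circuit() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the value is chosen for this sketch only.  */
#define EDGE_ANNOTATED 1u

/* Hypothetical stand-in for the fields of a CFG edge used here.  */
struct edge_info
{
  long count;
  int probability;
  unsigned flags;
};

/* If there is exactly one candidate edge and it is not annotated yet,
   copy the count from SRC and mark the edge annotated.  The
   probability field is deliberately left alone, since a later pass is
   expected to recompute it.  Returns 1 if a count was propagated.  */
static int
propagate_count (struct edge_info *cands, size_t n,
                 const struct edge_info *src)
{
  assert (src != NULL);
  if (n != 1 || (cands[0].flags & EDGE_ANNOTATED) != 0)
    return 0;
  cands[0].count = src->count;
  cands[0].flags |= EDGE_ANNOTATED;
  return 1;
}
```

Leaving the probability untouched is the point of the change: any value written here would be overwritten later anyway, and not writing it keeps the earlier estimate available for comparison.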
Re: [GOOGLE] Do not change edge probabilities when propagating edge counts
OK for google-4_8 and google-4_9

Thanks,
Dehao

On Tue, Jun 24, 2014 at 3:09 PM, Yi Yang <ahyan...@google.com> wrote:
> Hi,
>
> This patch removes unnecessary edge probability calculations in
> afdo_propagate_circuit() that would eventually be overridden by
> afdo_calculate_branch_prob(). This would pave the way for my next
> patch, which compares the estimated branch probabilities or the branch
> annotations against the real profile data.
>
> Thanks,
> Yi
>
> gcc/
> 2014-06-24  Yi Yang  <ahyan...@google.com>
>
>         * auto-profile.c (afdo_propagate_circuit): Do not change edge
>         probabilities when propagating edge counts.
>
> diff --git gcc/auto-profile.c gcc/auto-profile.c
> index 51e318d..74d3d1d 100644
> --- gcc/auto-profile.c
> +++ gcc/auto-profile.c
> @@ -1328,16 +1328,9 @@ afdo_propagate_circuit (void)
>                continue;
>              total++;
>              only_one = ep;
> -            if (e->probability == 0 && (e->flags & EDGE_ANNOTATED) == 0)
> -              {
> -                ep->probability = 0;
> -                ep->count = 0;
> -                ep->flags |= EDGE_ANNOTATED;
> -              }
>            }
>          if (total == 1 && (only_one->flags & EDGE_ANNOTATED) == 0)
>            {
> -            only_one->probability = e->probability;
>              only_one->count = e->count;
>              only_one->flags |= EDGE_ANNOTATED;
>            }
> --
Re: [PATCH,MIPS] MIPS64r6 support
Matthew Fortune <matthew.fort...@imgtec.com> writes:
>> I suppose we'll need a way of specifying an isa_rev range, say
>> isa_rev=2-5.  That should be a fairly localised change though.
>
> There appear to be about 9 tests that are not fixed by educating mips.exp
> about flags which are not supported on R6. Steve has initially dealt with
> these via forbid_cpu=mips.*r6 but I guess it would be cleaner to try and
> support an isa_rev range. I'll see we can club together enough tcl skills
> to write it :-)

Thanks.  I saw the patch Steve posted later, but I'm running out of time
to look at it today.  Hopefully tomorrow.

>>>       (if_then_else (match_test "TARGET_MICROMIPS")
>>>         (match_test "umips_12bit_offset_address_p (op, mode)")
>>> -       (match_test "mips_address_insns (op, mode, false)"))
>>> +       (if_then_else (match_test "ISA_HAS_PREFETCH_9BIT")
>>> +         (match_test "mipsr6_9bit_offset_address_p (op, mode)")
>>> +         (match_test "mips_address_insns (op, mode, false)")))
>>
>> Please use (cond ...) instead.
>
> It seems I cannot use cond in a predicate expression, so I've had to leave
> it as is.

Code outside config/mips can be changed too though.  Try the attached.

>>> diff --git a/gcc/config/mips/linux.h b/gcc/config/mips/linux.h
>>> index e539422..751623f 100644
>>> --- a/gcc/config/mips/linux.h
>>> +++ b/gcc/config/mips/linux.h
>>> @@ -18,8 +18,9 @@ along with GCC; see the file COPYING3.  If not see
>>>  <http://www.gnu.org/licenses/>.  */
>>>
>>>  #define GLIBC_DYNAMIC_LINKER \
>>> -  "%{mnan=2008:/lib/ld-linux-mipsn8.so.1;:/lib/ld.so.1}"
>>> +  "%{mnan=2008|mips32r6|mips64r6:/lib/ld-linux-mipsn8.so.1;:/lib/ld.so.1}"
>>>
>>>  #undef UCLIBC_DYNAMIC_LINKER
>>>  #define UCLIBC_DYNAMIC_LINKER \
>>> -  "%{mnan=2008:/lib/ld-uClibc-mipsn8.so.0;:/lib/ld-uClibc.so.0}"
>>> +  "%{mnan=2008|mips32r6|mips64r6:/lib/ld-uClibc-mipsn8.so.0;" \
>>> +  ":/lib/ld-uClibc.so.0}"
>>
>> Rather than update all the specs like this, I think we should force
>> -mnan=2008 onto the command line for r6 using DRIVER_SELF_SPECS.
>> See e.g. MIPS_ISA_SYNCI_SPEC.
>
> I agree this could be simpler and your comment has made me realise the
> implementation in the patch is wrong for configurations like
> mipsisa32r6-unknown-linux-gnu. The issue for both the current patch and
> your suggestion is that they rely on MIPS_ISA_LEVEL_SPEC having been
> applied but this only happens in the vendor triplets. The --with-arch*
> options used with mips-unknown-linux-gnu would be fine as they place an
> arch option on the command line. If I add MIPS_ISA_LEVEL_SPEC to the
> DRIVER_SELF_SPECS generic definition in mips.h then I believe that would
> fix the problem. Any new spec I add for R6/nan setting would also need
> adding to the generic DRIVER_SELF_SPECS in mips.h and any vendor
> definitions of DRIVER_SELF_SPECS.

Yeah, sounds like the right way to go.

Richard


gcc/
        * genattrtab.c (check_attr_value): Move COND length check to...
        * read-rtl.c (read_rtx_code): ...here.
        * gensupport.c (expand_conds): New function.
        (process_rtx): Use it on predicate and constraint conditions.

Index: gcc/genattrtab.c
===
--- gcc/genattrtab.c    2014-06-24 23:14:49.140002614 +0100
+++ gcc/genattrtab.c    2014-06-24 23:14:49.361004899 +0100
@@ -997,13 +997,6 @@ check_attr_value (rtx exp, struct attr_d
       break;

     case COND:
-      if (XVECLEN (exp, 0) % 2 != 0)
-        {
-          error_with_line (attr->lineno,
-                           "first operand of COND must have even length");
-          break;
-        }
-
       for (i = 0; i < XVECLEN (exp, 0); i += 2)
         {
           XVECEXP (exp, 0, i) = check_attr_test (XVECEXP (exp, 0, i),
Index: gcc/read-rtl.c
===
--- gcc/read-rtl.c      2014-06-24 23:14:49.141002624 +0100
+++ gcc/read-rtl.c      2014-06-24 23:14:49.363004920 +0100
@@ -1350,6 +1350,9 @@ read_rtx_code (const char *code_name)
         gcc_unreachable ();
       }

+  if (code == COND && XVECLEN (return_rtx, 0) % 2 != 0)
+    fatal_with_file_and_line ("first operand of COND must have even length");
+
   if (CONST_WIDE_INT_P (return_rtx))
     {
       read_name (&name);
Index: gcc/gensupport.c
===
--- gcc/gensupport.c    2014-06-24 23:14:49.140002614 +0100
+++ gcc/gensupport.c    2014-06-24 23:21:35.389211427 +0100
@@ -143,6 +143,44 @@ gen_rtx_CONST_INT (enum machine_mode ARG
   XWINT (rt, 0) = arg;
   return rt;
 }
+
+/* Expand CONDs in *LOC to IF_THEN_ELSEs.  */
+
+static void
+expand_conds (rtx *loc)
+{
+  rtx x = *loc;
+
+  if (GET_CODE (x) == COND)
+    {
+      *loc = XEXP (x, 1);
+      for (int i = XVECLEN (x, 0) - 2; i >= 0; i -= 2)
+        {
+          rtx cond = rtx_alloc (IF_THEN_ELSE);
+          XEXP (cond, 0) = XVECEXP (x, 0, i);
+          XEXP (cond, 1) = XVECEXP (x, 0, i + 1);
+          XEXP (cond, 2) = *loc;
+          *loc = cond;
+        }
+      x = *loc;
+
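
For readers following the expand_conds hunk: the loop walks the (test, value) pairs of the COND from last to first, so that the first test ends up as the outermost IF_THEN_ELSE and the COND's default value sits in the innermost else arm. Below is a minimal standalone sketch of that folding. It assumes a toy expression node rather than GCC's rtx; struct expr, make_leaf and expand_cond are hypothetical names used only for illustration and are not part of gensupport.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node (an assumption for this sketch, not GCC's rtx).  */
enum kind { LEAF, IF_THEN_ELSE };

struct expr
{
  enum kind kind;
  int id;               /* Meaningful for LEAF nodes only.  */
  struct expr *op[3];   /* test, then, else for IF_THEN_ELSE nodes.  */
};

/* Allocate a leaf carrying ID.  */
static struct expr *
make_leaf (int id)
{
  struct expr *e = calloc (1, sizeof *e);
  assert (e != NULL);
  e->kind = LEAF;
  e->id = id;
  return e;
}

/* PAIRS holds NPAIRS consecutive (test, value) entries.  Build nested
   if-then-else nodes from the last pair to the first, starting from
   DFLT, so the first test becomes the outermost node.  */
static struct expr *
expand_cond (struct expr **pairs, int npairs, struct expr *dflt)
{
  struct expr *result = dflt;
  for (int i = 2 * npairs - 2; i >= 0; i -= 2)
    {
      struct expr *ite = calloc (1, sizeof *ite);
      assert (ite != NULL);
      ite->kind = IF_THEN_ELSE;
      ite->op[0] = pairs[i];       /* test */
      ite->op[1] = pairs[i + 1];   /* value if the test holds */
      ite->op[2] = result;         /* everything built so far */
      result = ite;
    }
  return result;
}
```

With two pairs (t1, v1), (t2, v2) and default d, the result has the shape if t1 then v1 else (if t2 then v2 else d); with no pairs at all the default is returned unchanged.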
Re: [PATCH] Fix PR c++/61537
Hi,

On 06/24/2014 01:40 AM, Adam Butcher wrote:
> +++ b/gcc/testsuite/g++.dg/cpp1y/pr61537.C
> @@ -0,0 +1,24 @@
> +// PR c++/61537
> +// { dg-do compile { target c++1y } }

I don't think this is a C++1y specific issue...

> +// { dg-options "" }

Also, likely minor detail, could you please explain why you need this?

Thanks for working on the bug!

Paolo.
Re: [PATCH] Fix PR c++/61537
On 2014-06-24 23:22, Paolo Carlini wrote:
> On 06/24/2014 01:40 AM, Adam Butcher wrote:
>> +// { dg-do compile { target c++1y } }
>
> I don't think this is a C++1y specific issue...

You're right.  I'm so used to creating pr testcases in that g++.dg/cpp1y dir, I automatically added the test there without engaging my brain!

>> +// { dg-options "" }
>
> Also, likely minor detail, could you please explain why you need this?

Don't think I do; it was a copy from the most recent test in the cpp1y dir (another bad habit).  Same issue as before; need to engage brain.  :)

I've fixed these issues up in my tree and moved the test to g++.dg.

Cheers,
Adam

diff --git a/gcc/testsuite/g++.dg/cpp1y/pr61537.C b/gcc/testsuite/g++.dg/pr61537.C
similarity index 79%
rename from gcc/testsuite/g++.dg/cpp1y/pr61537.C
rename to gcc/testsuite/g++.dg/pr61537.C
index 55761cd..12aaf58 100644
--- a/gcc/testsuite/g++.dg/cpp1y/pr61537.C
+++ b/gcc/testsuite/g++.dg/pr61537.C
@@ -1,6 +1,5 @@
 // PR c++/61537
-// { dg-do compile { target c++1y } }
-// { dg-options "" }
+// { dg-do compile }

 struct A {};
Re: [patch 1/4] change specific int128 -> generic intN
Part 1 of 4, split from the full patch.

The purpose of this set of changes is to remove assumptions in GCC about type sizes. Previous to this patch, GCC assumed that all types were powers-of-two in size, and used naive math accordingly.

Old:
        POINTER_SIZE / BITS_PER_UNIT
        TYPE_SIZE
        GET_MODE_BITSIZE

New:
        POINTER_SIZE_UNITS  (ceil, not floor)
        TYPE_PRECISION
        GET_MODE_PRECISION

gcc/
        * cppbuiltin.c (define_builtin_macros_for_type_sizes): Round
        pointer size up to a power of two.
        * defaults.h (DWARF2_ADDR_SIZE): Round up.
        (POINTER_SIZE_UNITS): New, rounded up value.
        * dwarf2asm.c (size_of_encoded_value): Use it.
        (dw2_output_indirect_constant_1): Likewise.
        * expmed.c (init_expmed_one_conv): We now know the sizes of
        partial int modes.
        * loop-iv.c (iv_number_of_iterations): Use precision, not size.
        * optabs.c (expand_float): Use precision, not size.
        (expand_fix): Likewise.
        * simplify-rtx (simplify_unary_operation_1): Likewise.
        * tree-dfa.c (get_ref_base_and_extent): Likewise.
        * varasm.c (assemble_addr_to_section): Round up pointer sizes.
        (default_assemble_integer): Likewise.
        (dump_tm_clone_pairs): Likewise.
        * tree-core.c: Adjust comment.

gcc/lto/
        * lto-object.c (lto_obj_begin_section): Do not assume pointers are
        powers-of-two in size.

gcc/c-family/
        * c-cppbuiltin.c (cpp_atomic_builtins): Round pointer sizes up.
        (type_suffix): Use type precision, not specific types.

Index: gcc/dwarf2asm.c
===
--- gcc/dwarf2asm.c     (revision 211858)
+++ gcc/dwarf2asm.c     (working copy)
@@ -387,13 +387,13 @@ size_of_encoded_value (int encoding)
   if (encoding == DW_EH_PE_omit)
     return 0;

   switch (encoding & 0x07)
     {
     case DW_EH_PE_absptr:
-      return POINTER_SIZE / BITS_PER_UNIT;
+      return POINTER_SIZE_UNITS;
     case DW_EH_PE_udata2:
       return 2;
     case DW_EH_PE_udata4:
       return 4;
     case DW_EH_PE_udata8:
       return 8;
@@ -917,13 +917,13 @@ dw2_output_indirect_constant_1 (splay_tr
       if (USE_LINKONCE_INDIRECT)
         DECL_VISIBILITY (decl) = VISIBILITY_HIDDEN;
     }

   sym_ref = gen_rtx_SYMBOL_REF (Pmode, sym);
   assemble_variable (decl, 1, 1, 1);
-  assemble_integer (sym_ref, POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1);
+  assemble_integer (sym_ref, POINTER_SIZE_UNITS, POINTER_SIZE, 1);

   return 0;
 }

 /* Emit the constants queued through dw2_force_const_mem.  */

Index: gcc/cppbuiltin.c
===
--- gcc/cppbuiltin.c    (revision 211858)
+++ gcc/cppbuiltin.c    (working copy)
@@ -172,13 +172,13 @@ define_builtin_macros_for_type_sizes (cp
                           ? "__ORDER_BIG_ENDIAN__"
                           : "__ORDER_LITTLE_ENDIAN__"));

   /* ptr_type_node can't be used here since ptr_mode is only set when
      toplev calls backend_init which is not done with -E switch.  */
   cpp_define_formatted (pfile, "__SIZEOF_POINTER__=%d",
-                        POINTER_SIZE / BITS_PER_UNIT);
+                        1 << ceil_log2 ((POINTER_SIZE + BITS_PER_UNIT - 1) / BITS_PER_UNIT));
 }


 /* Define macros builtins common to all language performing CPP
    preprocessing.  */
 void
Index: gcc/c-family/c-cppbuiltin.c
===
--- gcc/c-family/c-cppbuiltin.c (revision 211858)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -675,13 +675,13 @@ cpp_atomic_builtins (cpp_reader *pfile)
      to a boolean truth value, let the library work around that.  */
   builtin_define_with_int_value ("__GCC_ATOMIC_TEST_AND_SET_TRUEVAL",
                                  targetm.atomic_test_and_set_trueval);

   /* ptr_type_node can't be used here since ptr_mode is only set when
      toplev calls backend_init which is not done with -E  or pch.  */
-  psize = POINTER_SIZE / BITS_PER_UNIT;
+  psize = POINTER_SIZE_UNITS;
   if (psize >= SWAP_LIMIT)
     psize = 0;
   builtin_define_with_int_value ("__GCC_ATOMIC_POINTER_LOCK_FREE",
                         (have_swap[psize]? 2 : 1));
 }

@@ -1227,18 +1269,21 @@ builtin_define_with_hex_fp_value (const
 static const char *
 type_suffix (tree type)
 {
   static const char *const suffixes[] = { "", "U", "L", "UL", "LL", "ULL" };
   int unsigned_suffix;
   int is_long;
+  int tp = TYPE_PRECISION (type);

   if (type == long_long_integer_type_node
-      || type == long_long_unsigned_type_node)
+      || type == long_long_unsigned_type_node
+      || tp > TYPE_PRECISION (long_integer_type_node))
     is_long = 2;
   else if (type == long_integer_type_node
-           || type == long_unsigned_type_node)
+           || type == long_unsigned_type_node
+           || tp > TYPE_PRECISION (integer_type_node))
     is_long = 1;
   else if (type == integer_type_node
            ||
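
To make the floor/ceil distinction in this part concrete, here is a small standalone sketch of the three computations for a pointer width that is not a multiple of the unit size. It is only an illustration under stated assumptions: the function names are hypothetical, ceil_log2_u is a local stand-in for GCC's ceil_log2, and the 20-bit/8-bit figures are example inputs.

```c
#include <assert.h>

/* Local stand-in for ceil_log2 (an assumption for this sketch):
   smallest n such that (1u << n) >= x, for x >= 1.  */
static unsigned
ceil_log2_u (unsigned x)
{
  unsigned n = 0;
  assert (x >= 1);
  while ((1u << n) < x)
    n++;
  return n;
}

/* Old style: truncating division drops the partial unit.  */
static unsigned
pointer_units_floor (unsigned pointer_bits, unsigned bits_per_unit)
{
  return pointer_bits / bits_per_unit;
}

/* Rounded-up unit count, in the spirit of POINTER_SIZE_UNITS.  */
static unsigned
pointer_units_ceil (unsigned pointer_bits, unsigned bits_per_unit)
{
  return (pointer_bits + bits_per_unit - 1) / bits_per_unit;
}

/* Unit count rounded up again to a power of two, mirroring the
   expression used for __SIZEOF_POINTER__ in the cppbuiltin.c hunk.  */
static unsigned
sizeof_pointer_pow2 (unsigned pointer_bits, unsigned bits_per_unit)
{
  return 1u << ceil_log2_u (pointer_units_ceil (pointer_bits, bits_per_unit));
}
```

For a 20-bit pointer with 8-bit units, the truncating form gives 2, the rounded-up form gives 3, and the power-of-two form gives 4; for 32-bit and 64-bit pointers all three forms agree.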
Re: [patch 3/4] change specific int128 -> generic intN
Part 3 of 4, split from the full patch.

Additional optimization opportunity, since the MSP430 does a lot of conversions between HImode and PSImode.

gcc/
        * expr.c (convert_move): If the target has an explicit converter,
        use it.

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 211858)
+++ gcc/expr.c  (working copy)
@@ -405,12 +405,32 @@ convert_move (rtx to, rtx from, int unsi
                               from)
                   : gen_rtx_FLOAT_EXTEND (to_mode, from));
       return;
     }

   /* Handle pointer conversion.  */                     /* SPEE 900220.  */
+  /* If the target has a converter from FROM_MODE to TO_MODE, use it.  */
+  {
+    convert_optab ctab;
+
+    if (GET_MODE_PRECISION (from_mode) > GET_MODE_PRECISION (to_mode))
+      ctab = trunc_optab;
+    else if (unsignedp)
+      ctab = zext_optab;
+    else
+      ctab = sext_optab;
+
+    if (convert_optab_handler (ctab, to_mode, from_mode)
+        != CODE_FOR_nothing)
+      {
+        emit_unop_insn (convert_optab_handler (ctab, to_mode, from_mode),
+                        to, from, UNKNOWN);
+        return;
+      }
+  }
+
   /* Targets are expected to provide conversion insns between PxImode and
      xImode for all MODE_PARTIAL_INT modes they use, but no others.  */
   if (GET_MODE_CLASS (to_mode) == MODE_PARTIAL_INT)
     {
       enum machine_mode full_mode
         = smallest_mode_for_size (GET_MODE_BITSIZE (to_mode), MODE_INT);
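
The selection step in the new convert_move block can be read in isolation: a narrowing conversion asks for a truncation, and a widening one asks for zero or sign extension depending on unsignedp. The sketch below restates just that decision as a standalone function. The enum and function names are hypothetical and only mirror trunc_optab, zext_optab and sext_optab; the lookup of an actual target handler is left out.

```c
#include <assert.h>

/* Hypothetical labels mirroring trunc_optab, zext_optab and sext_optab.  */
enum conv_kind { CONV_TRUNC, CONV_ZEXT, CONV_SEXT };

/* Pick the conversion kind from the source and destination precisions
   (in bits) and the signedness flag, in the same order as the patch:
   narrowing first, then unsigned widening, then signed widening.  */
static enum conv_kind
pick_conversion (unsigned from_prec, unsigned to_prec, int unsignedp)
{
  assert (from_prec > 0 && to_prec > 0);
  if (from_prec > to_prec)
    return CONV_TRUNC;
  return unsignedp ? CONV_ZEXT : CONV_SEXT;
}
```

Taking 16 and 20 as example precisions for a HImode to PSImode conversion, pick_conversion (16, 20, 0) selects sign extension and pick_conversion (16, 20, 1) selects zero extension, while a 32-bit source going to 20 bits selects truncation.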
Re: [patch 4/4] change specific int128 -> generic intN
Part 4 of 4, split from the full patch.

This is the MSP430-specific use of the new intN framework to enable true 20-bit pointers. Since I'm one of the MSP430 maintainers, this patch is being posted for reference, not for approval.

gcc/config/msp430
        * config/msp430/msp430-protos.h
        (msp430_hard_regno_nregs_has_padding): New.
        (msp430_hard_regno_nregs_with_padding): New.
        * config/msp430/msp430.c (msp430_scalar_mode_supported_p): New.
        (msp430_hard_regno_nregs_has_padding): New.
        (msp430_hard_regno_nregs_with_padding): New.
        (msp430_unwind_word_mode): Use PSImode instead of SImode.
        (msp430_addr_space_legitimate_address_p): New.
        (msp430_asm_integer): New.
        (msp430_init_dwarf_reg_sizes_extra): New.
        (msp430_print_operand): Use X suffix for PSImode even in small
        model.
        * config/msp430/msp430.h (POINTER_SIZE): Use 20 bits, not 32.
        (PTR_SIZE): ...but 4 bytes for EH.
        (SIZE_TYPE): Use __int20.
        (PTRDIFF_TYPE): Likewise.
        (INCOMING_FRAME_SP_OFFSET): Adjust.
        * config/msp430/msp430.md (movqi_topbyte): New.
        (movpsi): Use fixed suffixes.
        (movsipsi2): Enable for 430X, not large model.
        (extendhipsi2): Likewise.
        (zero_extendhisi2): Likewise.
        (zero_extendhisipsi2): Likewise.
        (extend_and_shift1_hipsi2): Likewise.
        (extendpsisi2): Likewise.
        (*bitbranch<mode>4_z): Fix suffix logic.
Index: gcc/config/msp430/msp430-protos.h
===
--- gcc/config/msp430/msp430-protos.h   (revision 211858)
+++ gcc/config/msp430/msp430-protos.h   (working copy)
@@ -27,12 +27,15 @@ void        msp430_expand_epilogue (int);
 void   msp430_expand_helper (rtx *operands, const char *, bool);
 void   msp430_expand_prologue (void);
 const char * msp430x_extendhisi (rtx *);
 void   msp430_fixup_compare_operands (enum machine_mode, rtx *);
 int    msp430_hard_regno_mode_ok (int, enum machine_mode);
 int    msp430_hard_regno_nregs (int, enum machine_mode);
+int    msp430_hard_regno_nregs_has_padding (int, enum machine_mode);
+int    msp430_hard_regno_nregs_with_padding (int, enum machine_mode);
+bool   msp430_hwmult_enabled (void);
 rtx    msp430_incoming_return_addr_rtx (void);
 void   msp430_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
 int    msp430_initial_elimination_offset (int, int);
 bool   msp430_is_interrupt_func (void);
 const char * msp430x_logical_shift_right (rtx);
 const char * msp430_mcu_name (void);
Index: gcc/config/msp430/msp430.md
===
--- gcc/config/msp430/msp430.md (revision 211858)
+++ gcc/config/msp430/msp430.md (working copy)
@@ -176,12 +176,19 @@
   ""
   "@
   MOV.B\t%1, %0
   MOV%X1.B\t%1, %0"
 )

+(define_insn "movqi_topbyte"
+  [(set (match_operand:QI 0 "msp_nonimmediate_operand" "=r")
+       (subreg:QI (match_operand:PSI 1 "msp_general_operand" "r") 2))]
+  "msp430x"
+  "PUSHM.A\t#1,%1 { POPM.W\t#1,%0 { POPM.W\t#1,%0"
+)
+
 (define_insn "movqi"
   [(set (match_operand:QI 0 "msp_nonimmediate_operand" "=rYs,rm")
        (match_operand:QI 1 "msp_general_operand" "riYs,rmi"))]
   ""
   "@
   MOV.B\t%1, %0
@@ -220,27 +227,27 @@
 ;; Some MOVX.A cases can be done with MOVA, this is only a few of them.
 (define_insn "movpsi"
   [(set (match_operand:PSI 0 "msp_nonimmediate_operand" "=r,Ya,rm")
        (match_operand:PSI 1 "msp_general_operand" "riYa,r,rmi"))]
   ""
   "@
-  MOV%Q0\t%1, %0
-  MOV%Q0\t%1, %0
-  MOV%X0.%Q0\t%1, %0")
+  MOVA\t%1, %0
+  MOVA\t%1, %0
+  MOVX.A\t%1, %0")

 ; This pattern is identical to the truncsipsi2 pattern except
 ; that it uses a SUBREG instead of a TRUNC.  It is needed in
 ; order to prevent reload from converting (set:SI (SUBREG:PSI (SI)))
 ; into (SET:PSI (PSI)).
 ;
 ; Note: using POPM.A #1 is two bytes smaller than using POPX.A

 (define_insn "movsipsi2"
   [(set (match_operand:PSI 0 "register_operand" "=r")
        (subreg:PSI (match_operand:SI 1 "register_operand" "r") 0))]
-  "TARGET_LARGE"
+  "msp430x"
   "PUSH.W\t%H1 { PUSH.W\t%L1 { POPM.A #1, %0 ; Move reg-pair %L1:%H1 into pointer %0"
 )

 ;;
 ;; Math
@@ -564,49 +571,49 @@
 {
   return msp430x_extendhisi (operands);
 }
 )

 (define_insn "extendhipsi2"
   [(set (match_operand:PSI 0 "nonimmediate_operand" "=r")
        (subreg:PSI (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "0")) 0))]
-  "TARGET_LARGE"
+  "msp430x"
   "RLAM #4, %0 { RRAM #4, %0"
 )

 ;; Look for cases where integer/pointer conversions are suboptimal due
 ;; to missing patterns, despite us not having opcodes for these
 ;; patterns.  Doing these manually allows for alternate optimization
 ;; paths.
 (define_insn "zero_extendhisi2"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
        (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "0")))]
-  "TARGET_LARGE"
+  "msp430x"
   "MOV.W\t#0,%H0"
 )

 (define_insn "zero_extendhisipsi2"