[PATCH, ARM] Trunk build fail
Hi, ARM trunk build fail from @209484, since it requires the argument of GET_MODE_SIZE to be enum machine_mode. gcc/gcc/config/arm/arm.c:21433:13: error: invalid conversion from 'int' to 'machine_mode' [-fpermissive] ... Build OK with the patch. OK for trunk? Thanks! -Zhenqiang 2014-04-22 Zhenqiang Chen zhenqiang.c...@linaro.org * config/arm/arm.c (arm_print_operand, thumb_exit): Make sure GET_MODE_SIZE argument is enum machine_mode. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 773c353..822060d 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -21427,7 +21427,7 @@ arm_print_operand (FILE *stream, rtx x, int code) register. */ case 'p': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 8 || !REG_P (x)) @@ -21451,7 +21451,7 @@ arm_print_operand (FILE *stream, rtx x, int code) case 'P': case 'q': { - int mode = GET_MODE (x); + enum machine_mode mode = GET_MODE (x); int is_quad = (code == 'q'); int regno; @@ -21487,7 +21487,7 @@ arm_print_operand (FILE *stream, rtx x, int code) case 'e': case 'f': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if ((GET_MODE_SIZE (mode) != 16 @@ -21620,7 +21620,7 @@ arm_print_operand (FILE *stream, rtx x, int code) /* Translate an S register number into a D register number and element index. */ case 'y': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 4 || !REG_P (x)) @@ -21654,7 +21654,7 @@ arm_print_operand (FILE *stream, rtx x, int code) number into a D register number and element index. */ case 'z': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 2 || !REG_P (x)) @@ -25894,7 +25894,7 @@ thumb_exit (FILE *f, int reg_containing_return_addr) int pops_needed; unsigned available; unsigned required; - int mode; + enum machine_mode mode; int size; int restore_a4 = FALSE;
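The failure mode behind this patch can be reproduced in miniature. The sketch below uses invented toy names (machine_mode_toy, mode_size_toy, operand_size_toy), not GCC's real definitions: C++ converts an enum to int implicitly, but never int back to an enum, so holding a mode in an `int` breaks as soon as a callee such as GET_MODE_SIZE takes the enum type strictly.

```cpp
#include <cassert>

// Toy stand-ins for machine_mode / GET_MODE_SIZE; all names invented.
enum machine_mode_toy { QImode_toy, HImode_toy, SImode_toy, DImode_toy };

static unsigned
mode_size_toy (machine_mode_toy m)   // stands in for GET_MODE_SIZE
{
  switch (m)
    {
    case QImode_toy: return 1;
    case HImode_toy: return 2;
    case SImode_toy: return 4;
    case DImode_toy: return 8;
    }
  return 0;
}

static unsigned
operand_size_toy (machine_mode_toy m)
{
  // int mode = m;            // enum -> int: accepted
  // mode_size_toy (mode);    // int -> enum: rejected by C++ (the build error)
  machine_mode_toy mode = m;  // the patch's fix: keep the enum type
  return mode_size_toy (mode);
}
```

This is why the patch only has to change the declared type of the local `mode` variables; no call sites need to change.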
Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse
On Mon, 21 Apr 2014, Richard Henderson wrote: On 04/21/2014 01:19 PM, Bill Schmidt wrote:

+  if (GET_CODE (trueop0) == VEC_SELECT
+      && GET_MODE (XEXP (trueop0, 0)) == mode)
+    {
+      rtx op0_subop1 = XEXP (trueop0, 1);
+      gcc_assert (GET_CODE (op0_subop1) == PARALLEL);
+      gcc_assert (XVECLEN (trueop1, 0) == GET_MODE_NUNITS (mode));
+
+      /* Apply the outer ordering vector to the inner one.  (The inner
+         ordering vector is expressly permitted to be of a different
+         length than the outer one.)  If the result is { 0, 1, ..., n-1 }
+         then the two VEC_SELECTs cancel.  */
+      for (int i = 0; i < XVECLEN (trueop1, 0); ++i)
+        {
+          rtx x = XVECEXP (trueop1, 0, i);
+          gcc_assert (CONST_INT_P (x));
+          rtx y = XVECEXP (op0_subop1, 0, INTVAL (x));
+          gcc_assert (CONST_INT_P (y));

In two places you're asserting that you've got a constant permutation. Surely there should be a non-assertion check and graceful exit for either select to be a variable permutation.

Note that in the case where trueop0 is a CONST_VECTOR, we already check each element of trueop1: gcc_assert (CONST_INT_P (x)); In the case where the result is a scalar, we also have: gcc_assert (CONST_INT_P (XVECEXP (trueop1, 0, 0))); so we will have other issues if something ever creates a variable vec_select. Not that a graceful exit will hurt of course. -- Marc Glisse
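The cancellation test in the patch can be sketched with plain vectors standing in for the rtx PARALLELs (selects_cancel is a made-up helper name, not GCC code): compose the outer selection vector with the inner one and check whether the result is the identity { 0, 1, ..., n-1 }.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the VEC_SELECT-of-VEC_SELECT cancellation check; the two
// selection vectors may have different lengths, as the comment in the
// patch notes. Returns true when the composition is the identity.
static bool
selects_cancel (const std::vector<int> &outer, const std::vector<int> &inner)
{
  for (std::size_t i = 0; i < outer.size (); ++i)
    if (inner[outer[i]] != static_cast<int> (i))
      return false;
  return true;
}
```

For example, a pairwise byte-swap select fed by the same pairwise swap cancels, while feeding it a half-swap does not.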
Re: RFA: tweak integer type used for memcpy folding
Richard Sandiford rdsandif...@googlemail.com writes: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. 
I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. Tested against trunk with x86_64-linux-gnu {,-m32}. OK to install? There was a typo in the declaration of the mode-mode function, should have been as follows. Thanks, Richard gcc/ * machmode.h (bitwise_mode_for_mode): Declare. * stor-layout.h (bitwise_type_for_mode): Likewise. * stor-layout.c (bitwise_mode_for_mode): New function. (bitwise_type_for_mode): Likewise. * builtins.c (fold_builtin_memory_op): Use it instead of int_mode_for_mode and build_nonstandard_integer_type. gcc/testsuite/ * gcc.dg/memcpy-5.c: New test. Index: gcc/machmode.h === --- gcc/machmode.h 2014-04-21 10:35:17.611603989 +0100 +++ gcc/machmode.h 2014-04-21 13:58:59.403884452 +0100 @@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f extern enum machine_mode int_mode_for_mode (enum machine_mode); +extern enum machine_mode bitwise_mode_for_mode (enum machine_mode); + /* Return a mode that is suitable for representing a vector, or BLKmode on failure. 
*/ Index: gcc/stor-layout.h === --- gcc/stor-layout.h 2014-04-21 10:35:17.611603989 +0100 +++ gcc/stor-layout.h 2014-04-21 13:58:59.405878960 +0100 @@ -98,6 +98,8 @@ extern tree make_unsigned_type (int); mode_for_size, but is passed a tree. */ extern enum machine_mode mode_for_size_tree (const_tree, enum mode_class, int); +extern tree bitwise_type_for_mode (enum machine_mode); + /* Given a VAR_DECL, PARM_DECL or RESULT_DECL, clears the results of a previous call to layout_decl and
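The idea behind the new bitwise_mode_for_mode can be modelled in a toy form (the enum, helper name, and the 128-bit cap below are all invented for illustration): rather than flattening every mode to one wide scalar integer (which may not exist, as with OImode here), keep vector modes as integer vector modes and complex modes as integer complex modes, and only apply the MAX_FIXED_MODE_SIZE-style cap to the scalar fallback.

```cpp
#include <cassert>

// Toy mode classes; CLASS_NONE plays the role of BLKmode (failure).
enum mode_class_toy { CLASS_INT, CLASS_FLOAT, CLASS_COMPLEX_FLOAT,
                      CLASS_COMPLEX_INT, CLASS_VECTOR_FLOAT,
                      CLASS_VECTOR_INT, CLASS_NONE };

static const unsigned max_fixed_bits_toy = 128;  // assumed cap, like x86_64

// Map a mode class to the class suitable for bitwise copying.
static mode_class_toy
bitwise_class_toy (mode_class_toy cls, unsigned bits)
{
  switch (cls)
    {
    case CLASS_FLOAT:
      // Scalar float: fall back to a scalar integer only if one exists.
      return bits <= max_fixed_bits_toy ? CLASS_INT : CLASS_NONE;
    case CLASS_COMPLEX_FLOAT:
      return CLASS_COMPLEX_INT;   // same layout, integer components
    case CLASS_VECTOR_FLOAT:
      return CLASS_VECTOR_INT;    // same layout, integer elements
    default:
      return cls;                 // already bitwise-friendly
    }
}
```

Under this model the complex long double case that broke libitm becomes a 256-bit complex integer instead of a nonexistent 256-bit scalar integer.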
RE: [PATCH] Remove keep_aligning from get_inner_reference
Hi, On Wed, 27 Nov 2013 12:47:19, Richard Biener wrote: On Wed, Nov 27, 2013 at 10:43 AM, Eric Botcazou ebotca...@adacore.com wrote: I think you are right, this flag is no longer necessary, and removing this code path would simplify everything. Therefore I'd like to propose to remove the keep_aligning parameter of get_inner_reference as a split-out patch. Boot-strapped (with languages=all,ada,go) and regression-tested on x86_64-linux-gnu. I don't understand how you can commit a patch that changes something only on strict-alignment platforms and test it only on x86-64. This change *must* be tested with Ada on a strict-alignment platform, that's the only combination for which it is exercised. If you cannot do that, then please back it out. More generally speaking, it's not acceptable to make cleanup changes like that in the RTL expander without extreme care, which of course starts with proper testing. The patch should not have been approved either for that reason. I'm fine with reverting it for now (you were in CC of the patch submission but silent on it, I asked for the patch to start simplifying the way mems are expanded - ultimately to avoid the recursion and mem-attribute compute by the recursion). We can come back during stage1. Well, it's stage1 again. I still have that already-approved patch, updated to current trunk. I've successfully boot-strapped it on armv7-linux-gnueabihf with all languages enabled, including Ada. The test suite runs cleanly without any drop-outs. Is it OK to commit now, or are there objections? Thanks Bernd. get_object_alignment should be able to properly handle this case if you call it on the full reference in the normal_inner_ref: case. All the weird duplicate code on the VIEW_CONVERT_EXPR case should IMHO go. Richard. -- Eric Botcazou 2014-04-16 Bernd Edlinger bernd.edlin...@hotmail.de Remove parameter keep_aligning from get_inner_reference. * tree.h (get_inner_reference): Adjust header. 
* expr.c (get_inner_reference): Remove parameter keep_aligning.
(get_bit_range, expand_assignment, expand_expr_addr_expr_1,
expand_expr_real_1): Adjust.
* asan.c (instrument_derefs): Adjust.
* builtins.c (get_object_alignment_2): Adjust. Remove handling of
VIEW_CONVERT_EXPR.
* cfgexpand.c (expand_debug_expr): Adjust.
* dbxout.c (dbxout_expand_expr): Adjust.
* dwarf2out.c (loc_list_for_address_of_addr_expr_of_indirect_ref,
loc_list_from_tree, fortran_common): Adjust.
* fold-const.c (optimize_bit_field_compare, decode_field_reference,
fold_unary_loc, fold_comparison, split_address_to_core_and_offset):
Adjust.
* gimple-ssa-strength-reduction.c (slsr_process_ref): Adjust.
* simplify-rtx.c (delegitimize_mem_from_attrs): Adjust.
* tree-affine.c (tree_to_aff_combination, get_inner_reference_aff):
Adjust.
* tree-data-ref.c (split_constant_offset_1, dr_analyze_innermost):
Adjust.
* tree-vect-data-refs.c (vect_check_gather, vect_analyze_data_refs):
Adjust.
* tree-scalar-evolution.c (interpret_rhs_expr): Adjust.
* tree-ssa-loop-ivopts.c (split_address_cost): Adjust.
* tsan.c (instrument_expr): Adjust.
* config/mips/mips.c (r10k_safe_mem_expr_p): Adjust.

ada: 2014-04-16 Bernd Edlinger bernd.edlin...@hotmail.de

Remove parameter keep_aligning from get_inner_reference.
* gcc-interface/decl.c (elaborate_expression_1): Adjust.
* gcc-interface/trans.c (Attribute_to_gnu): Adjust.
* gcc-interface/utils2.c (build_unary_op): Adjust.

patch-inner-reference.diff Description: Binary data
Re: [PATCH, ARM] Trunk build fail
On Tue, Apr 22, 2014 at 7:26 AM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, ARM trunk build fail from @209484, since it requires the argument of GET_MODE_SIZE to be enum machine_mode. gcc/gcc/config/arm/arm.c:21433:13: error: invalid conversion from 'int' to 'machine_mode' [-fpermissive] ... Build OK with the patch. OK for trunk? Ok - please apply. Thanks, Ramana Thanks! -Zhenqiang 2014-04-22 Zhenqiang Chen zhenqiang.c...@linaro.org * config/arm/arm.c (arm_print_operand, thumb_exit): Make sure GET_MODE_SIZE argument is enum machine_mode. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 773c353..822060d 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -21427,7 +21427,7 @@ arm_print_operand (FILE *stream, rtx x, int code) register. */ case 'p': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 8 || !REG_P (x)) @@ -21451,7 +21451,7 @@ arm_print_operand (FILE *stream, rtx x, int code) case 'P': case 'q': { - int mode = GET_MODE (x); + enum machine_mode mode = GET_MODE (x); int is_quad = (code == 'q'); int regno; @@ -21487,7 +21487,7 @@ arm_print_operand (FILE *stream, rtx x, int code) case 'e': case 'f': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if ((GET_MODE_SIZE (mode) != 16 @@ -21620,7 +21620,7 @@ arm_print_operand (FILE *stream, rtx x, int code) /* Translate an S register number into a D register number and element index. */ case 'y': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 4 || !REG_P (x)) @@ -21654,7 +21654,7 @@ arm_print_operand (FILE *stream, rtx x, int code) number into a D register number and element index. 
*/ case 'z': { -int mode = GET_MODE (x); +enum machine_mode mode = GET_MODE (x); int regno; if (GET_MODE_SIZE (mode) != 2 || !REG_P (x)) @@ -25894,7 +25894,7 @@ thumb_exit (FILE *f, int reg_containing_return_addr) int pops_needed; unsigned available; unsigned required; - int mode; + enum machine_mode mode; int size; int restore_a4 = FALSE;
Re: Inliner heuristics TLC 1/n - let small function inlining ignore cold portion of program
On Thu, Apr 17, 2014 at 7:52 PM, Jan Hubicka hubi...@ucw.cz wrote: Hi, I think for 4.10 we should revisit inliner behaviour to be more LTO and LTO+FDO ready. This is the first of small patches I made to sanitize the behaviour of the current bounds. The main problem LTO brings is that we get way too many inline candidates. In the per-file model one gets only a small percentage of calls inlinable, since most of them go to other units, so our current heuristics behave quite well, inlining usually all calls they consider beneficial. With LTO almost all calls are inlinable, and if we inline everything we consider profitable we get insane code size growth, so practically always we hit our 30% unit growth threshold. This is not always a good idea. Reducing inline-insns-auto/inline-insns-single to avoid the inliner hitting the growth limit would cause a regression on benchmarks that need inlining of large functions. LLVM seems to get around the problem by doing code-expanding inlining at compile time (in the equivalent of our early inliner). This makes functions big, so the LTO inliner doesn't inline much, but it also misses useful cross-module inlines and replaces them by less useful inter-module ones. Another approach would be to have an inline-insns-crossmodule that is significantly smaller than inline-insns-auto. We already have a cross-module hint that probably ought to be made smarter to not fire on COMDAT functions. I do not want to do it, since the numbers I collected in http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html suggest that inline-insns-auto is already quite a bad limit. I would be happy to hear about alternative solutions to this. We may want to switch the whole-program inliner to a temperature-style bound, like open64 does. Well, this patch actually goes in a bit different direction: making the unit growth threshold more sane.
While looking into inliner behaviour at Firefox to write my blog entry I noticed that with profile feedback only a very small portion of the program is trained (15%) and only around 7% of the code contains something that we consider hot. The inliner however still hits the inline-unit-growth limit with: Unit growth for small function inlining: 7232256->9220597 (27%) Inlined 183353 calls, eliminated 54652 functions We do not grow the code in the cold portions of the program, but because of the dead padding we grow everything we consider hot 4 times, instead of 1.3 times as we would usually do if it was unpadded. This patch fixes the problem by considering only non-cold functions for the frequency calculation. We now get: Unit growth for small function inlining: 2083217->2537163 (21%) Inlined 134611 calls, eliminated 53586 functions So while the relative growth is still close to 30%, the absolute growth is only 22% of the previous one. We inline fewer calls but in the dynamic stats there is a very minor (sub 0.01%) difference. You know that I still think limiting both destination growth and candidate initial size is bogus and constrains us too much. In the LTO world even more so. At most we should take 'inline' as a hint that the body will be optimized more - thus reduce the target growth caused by inlining. Richard. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza * ipa-inline.c (inline_small_functions): Account only non-cold functions. * doc/invoke.texi (inline-unit-growth): Update documentation. Index: ipa-inline.c === --- ipa-inline.c(revision 209461) +++ ipa-inline.c(working copy) @@ -1585,7 +1590,10 @@ inline_small_functions (void) struct inline_summary *info = inline_summary (node); struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux; - if (!DECL_EXTERNAL (node->decl)) + /* Do not account external functions, they will be optimized out + if not inlined. Also only count the non-cold portion of program.
*/ + if (!DECL_EXTERNAL (node->decl) + && node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED) initial_size += info->size; info->growth = estimate_growth (node); if (dfs && dfs->next_cycle) Index: doc/invoke.texi === --- doc/invoke.texi (revision 209461) +++ doc/invoke.texi (working copy) @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u @item inline-unit-growth Specifies maximal overall growth of the compilation unit caused by inlining. The default value is 30 which limits unit growth to 1.3 times the original -size. +size. Cold functions (either marked cold via an attribute or by profile +feedback) are not accounted into the unit size. @item ipcp-unit-growth Specifies maximal overall growth of the compilation unit caused by
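The unit-growth figures quoted above can be sanity-checked with a one-line helper (growth_percent is a made-up name, mirroring the inliner's dump arithmetic): relative growth is (after - before) / before.

```cpp
#include <cassert>

// Integer-percentage unit growth, as in the "Unit growth for small
// function inlining" dump lines; truncates toward zero like the dump.
static int
growth_percent (long long before, long long after)
{
  return static_cast<int> ((after - before) * 100 / before);
}
```

With the numbers from the mail: 7232256 to 9220597 gives 27%, and 2083217 to 2537163 gives 21%, matching the two dump lines; the absolute growth drops from 1988341 to 453946 units, about 22% of the previous one.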
Re: [PATCH 00/89] Compile-time gimple-checking
First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) But... David Malcolm dmalc...@redhat.com writes: In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h). I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a <gimple_statement_bind> (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags); break; where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at least): "gs as a gimple switch", as opposed to "as a gimple switch... gs", which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of having the checked cast as a *method* is that it implicitly documents the requirement that the input must be non-NULL. ...FWIW I really don't like these cast members. The counterarguments are: - as_a <...> (...) and dyn_cast <...> (...) follow the C++ syntax for other casts. - the type you get is obvious, rather than being a contraction of the type name. - having them as methods means that the base class needs to be aware of all subclasses. I realise that's probably inherently true of gimple due to the enum, but it seems like bad design. You could potentially have different subclasses for the same enum, selected by a secondary field.
Maybe I've just been reading C code too long, but "as a gimple switch... gs" doesn't seem any less natural than "is constant... address". Another way of reducing the verbosity of as_a would be to shorten the type names. E.g. gimple_statement contracts to gimple_stmt in some places, so gimple_statement_bind could become gimple_stmt_bind or just gimple_bind. gimple_bind is probably better since it matches the names of the accessors. If the thing after as_a_ matches the type name, then X->as_a_foo () takes the same number of characters as as_a <foo> (X). Thanks, Richard
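The two styles under discussion can be sketched side by side. All class and function names below are invented stand-ins (stmt_base_toy and friends), not GCC's gimple classes; the point is only the shape of a checked downcast that is free in a release build and asserted in a checked build.

```cpp
#include <cassert>

enum stmt_code_toy { CODE_BIND, CODE_SWITCH };

struct stmt_switch_toy;

struct stmt_base_toy
{
  stmt_code_toy code;
  // Method style from the patch series: gs->as_a_stmt_switch ().
  stmt_switch_toy *as_a_stmt_switch ();
};

struct stmt_switch_toy : stmt_base_toy
{
  int num_labels;
};

stmt_switch_toy *
stmt_base_toy::as_a_stmt_switch ()
{
  assert (code == CODE_SWITCH);   // runtime check in checked builds only
  return static_cast<stmt_switch_toy *> (this);
}

// Free-function style, loosely mirroring is-a.h's as_a <T> (...); the
// expected code is passed explicitly here to keep the toy simple.
template <typename T>
static T *
as_a_toy (stmt_base_toy *s, stmt_code_toy expected)
{
  assert (s && s->code == expected);  // non-NULL and right dynamic kind
  return static_cast<T *> (s);
}
```

Both compile down to a plain pointer adjustment when assertions are disabled; the debate in the thread is purely about which spelling reads better and where the knowledge of subclasses should live.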
Re: [PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c
On 17/04/14 18:06, Daniel Marjamäki wrote: Hello! I am not against it.. However I think there is no danger. I see no potential use of uninitialized variable. The use of n_unroll is guarded by n_unroll_found. Hmmm... you're right. I guess the warning is just noise then. I'll leave it up to maintainers to take the patch or not then. Thanks, Kyrill Best regards, Daniel Marjamäki
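The pattern at issue reduces to a few lines (the function below is invented for illustration; the variable names follow the discussion): n_unroll is only ever read when n_unroll_found is true, so there is no real uninitialised use, yet a flow-insensitive -Wmaybe-uninitialized analysis may still warn, which is why initialising it is harmless noise reduction.

```cpp
#include <cassert>

// Guarded-use pattern from tree-ssa-loop-ivcanon.c, reduced to a toy.
static int
unroll_factor_or_zero (bool can_unroll)
{
  int n_unroll;                 // deliberately not initialised
  bool n_unroll_found = false;
  if (can_unroll)
    {
      n_unroll = 4;             // assumed example factor
      n_unroll_found = true;
    }
  if (n_unroll_found)
    return n_unroll;            // guarded: never reached uninitialised
  return 0;
}
```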
Re: [PATCH] Remove keep_aligning from get_inner_reference
I still have that already-approved patch, updated to current trunk. I've successfully boot-strapped it on armv7-linux-gnueabihf with all languages enabled, including Ada. The test suite runs cleanly without any drop-outs. Thanks for the testing. Is it OK to commit now, or are there objections? I think that the patch is either incomplete or wrong, in the sense that it will break TYPE_ALIGN_OK support, unless this support is totally obsolete, in which case it ought to be totally removed instead of just partially. The Ada testsuite in the compiler isn't exhaustive enough to give any guarantee so I will need to conduct more testing. Can you sit on the patch a few weeks? -- Eric Botcazou
Re: [PATCH] Fix omp declare simd cloning (PR tree-optimization/60823)
On Fri, 18 Apr 2014, Jakub Jelinek wrote: Hi! This patch fixes the adjustments performed by ipa_simd_modify_function_body on omp declare simd clones. Previously we've been trying to replace typically SSA_NAMEs with underlying PARM_DECLs of the to be replaced arguments with loads/stores from/to array refs (that will be hopefully vectorized) right around the referencing stmt, but: 1) this can't really work well if there is any life range overlap in SSA_NAMEs with the same underlying PARM_DECL 2) PHIs weren't handled at all (neither PHI arguments, nor lhs of the PHIs) 3) for addressable PARM_DECLs the code pretty much assumed the same thing can be done too This patch instead adjusts all SSA_NAMEs with SSA_NAME_VAR equal to the to be replaced PARM_DECLs to a new underlying VAR_DECL, only changes the (D) SSA_NAME to a load done at the start of the entry block, and for addressable PARM_DECLs adjusts them such that they don't have to be regimplified (as we replace say address of a PARM_DECL which is a gimple_min_invariant with array ref with variable index which is not gimple_min_invariant, we need to force the addresses into SSA_NAMEs). The tree-dfa.c fix is what I've discovered while writing the patch, if htab_find_slot_with_hash (..., NO_INSERT) fails to find something in the hash table (most likely not actually needed by the patch, discovered that just because the patch was buggy initially), it returns NULL rather than address of some slot which will contain NULL. Probably doesn't matter in practice as we are clearing a default-def only if it is one (and thus it is recorded). Bootstrapped/regtested on x86_64-linux and i686-linux. Richard, does this look reasonable? Yeah. Thanks, Richard. 
2014-04-18 Jakub Jelinek ja...@redhat.com PR tree-optimization/60823 * omp-low.c (ipa_simd_modify_function_body): Go through all SSA_NAMEs and for those referring to vector arguments which are going to be replaced adjust SSA_NAME_VAR and, if it is a default definition, change it into a non-default definition assigned at the beginning of function from new_decl. (ipa_simd_modify_stmt_ops): Rewritten. * tree-dfa.c (set_ssa_default_def): When removing default def, check for NULL loc instead of NULL *loc. * c-c++-common/gomp/pr60823-1.c: New test. * c-c++-common/gomp/pr60823-2.c: New test. * c-c++-common/gomp/pr60823-3.c: New test. --- gcc/omp-low.c.jj 2014-04-17 14:48:59.076025713 +0200 +++ gcc/omp-low.c 2014-04-18 12:00:16.666701773 +0200 @@ -11281,45 +11281,53 @@ static tree ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data) { struct walk_stmt_info *wi = (struct walk_stmt_info *) data; - if (!SSA_VAR_P (*tp)) + struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info; + tree *orig_tp = tp; + if (TREE_CODE (*tp) == ADDR_EXPR) +tp = &TREE_OPERAND (*tp, 0); + struct ipa_parm_adjustment *cand = NULL; + if (TREE_CODE (*tp) == PARM_DECL) +cand = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true); + else { - /* Make sure we treat subtrees as a RHS. This makes sure that - when examining the `*foo' in *foo=x, the `foo' get treated as - a use properly.
*/ - wi->is_lhs = false; - wi->val_only = true; if (TYPE_P (*tp)) *walk_subtrees = 0; - return NULL_TREE; -} - struct modify_stmt_info *info = (struct modify_stmt_info *) wi->info; - struct ipa_parm_adjustment *cand - = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true); - if (!cand) -return NULL_TREE; - - tree t = *tp; - tree repl = make_ssa_name (TREE_TYPE (t), NULL); - - gimple stmt; - gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt); - if (wi->is_lhs) -{ - stmt = gimple_build_assign (unshare_expr (cand->new_decl), repl); - gsi_insert_after (&gsi, stmt, GSI_SAME_STMT); - SSA_NAME_DEF_STMT (repl) = info->stmt; } + + tree repl = NULL_TREE; + if (cand) +repl = unshare_expr (cand->new_decl); else { - /* You'd think we could skip the extra SSA variable when - wi->val_only=true, but we may have `*var' which will get - replaced into `*var_array[iter]' and will likely be something - not gimple. */ - stmt = gimple_build_assign (repl, unshare_expr (cand->new_decl)); - gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); + if (tp != orig_tp) + { + *walk_subtrees = 0; + bool modified = info->modified; + info->modified = false; + walk_tree (tp, ipa_simd_modify_stmt_ops, wi, wi->pset); + if (!info->modified) + { + info->modified = modified; + return NULL_TREE; + } + info->modified = modified; + repl = *tp; + } + else + return NULL_TREE; } - if (!useless_type_conversion_p (TREE_TYPE (*tp),
Re: Inliner heuristics TLC 3/n
On Fri, Apr 18, 2014 at 9:45 PM, Jan Hubicka hubi...@ucw.cz wrote: Hi, this patch makes the FDO inliner more aggressive in inlining function calls that are considered hot. This is based on the observation that INLINE_INSNS_AUTO is the most common reason for inlining not happening (20.5% for Firefox, where 63.2% of calls are not inlinable because the body is not available) and 66% for GCC. With this patch the INLINE_HINT_known_hot hint is added to edges that were determined to be hot by the profile and where, moreover, there is at least a 50% chance that the caller will invoke the call during its execution. With this hint we now ignore both limits - this is because the greedy algorithm driven by the speed/size_cost metric should work pretty well here, but we may want to revisit it (i.e. add INLINE_INSNS_FDO or so). I am on the aggressive side so we collect some data on when the profile is a win or loss. Just remove those artificial limits and replace them with a factor on the estimated size/time benefit (cold-vs-hot and inline-declared-vs-not). At least don't introduce yet another set of size params. Richard. Bootstrapped/regtested x86_64-linux, committed. Honza * ipa-inline.h (INLINE_HINT_known_hot): New hint. * ipa-inline-analysis.c (dump_inline_hints): Dump it. (do_estimate_edge_time): Compute it. * ipa-inline.c (want_inline_small_function_p): Bypass INLINE_INSNS_AUTO/SINGLE limits for calls that are known to be hot. Index: ipa-inline.h === --- ipa-inline.h(revision 209489) +++ ipa-inline.h(working copy) @@ -68,7 +68,9 @@ enum inline_hints_vals { INLINE_HINT_cross_module = 64, /* If array indexes of loads/stores become known there may be room for further optimization. */ - INLINE_HINT_array_index = 128 + INLINE_HINT_array_index = 128, + /* We know that the callee is hot by profile.
*/ + INLINE_HINT_known_hot = 256 }; typedef int inline_hints; Index: ipa-inline-analysis.c === --- ipa-inline-analysis.c (revision 209489) +++ ipa-inline-analysis.c (working copy) @@ -671,6 +671,11 @@ dump_inline_hints (FILE *f, inline_hints hints &= ~INLINE_HINT_array_index; fprintf (f, " array_index"); } + if (hints & INLINE_HINT_known_hot) +{ + hints &= ~INLINE_HINT_known_hot; + fprintf (f, " known_hot"); +} gcc_assert (!hints); } @@ -3666,6 +3671,17 @@ do_estimate_edge_time (struct cgraph_edg known_aggs); estimate_node_size_and_time (callee, clause, known_vals, known_binfos, known_aggs, &size, &min_size, &time, &hints, es->param); + + /* When we have profile feedback, we can quite safely identify hot edges and for those we disable size limits. Don't do that when the probability that the caller will call the callee is low however, since it may hurt optimization of the caller's hot path. */ + if (edge->count && cgraph_maybe_hot_edge_p (edge) + && (edge->count * 2 > (edge->caller->global.inlined_to +? edge->caller->global.inlined_to->count : edge->caller->count))) +hints |= INLINE_HINT_known_hot; + known_vals.release (); known_binfos.release (); known_aggs.release (); Index: ipa-inline.c === --- ipa-inline.c(revision 209522) +++ ipa-inline.c(working copy) @@ -578,18 +578,21 @@ want_inline_small_function_p (struct cgr inline candidate. At the moment we allow inline hints to promote a non-inline function to inline and we increase MAX_INLINE_INSNS_SINGLE 16-fold for inline functions.
*/ - else if (!DECL_DECLARED_INLINE_P (callee->decl) + else if ((!DECL_DECLARED_INLINE_P (callee->decl) + && (!e->count || !cgraph_maybe_hot_edge_p (e))) && inline_summary (callee)->min_size - inline_edge_summary (e)->call_stmt_size > MAX (MAX_INLINE_INSNS_SINGLE, MAX_INLINE_INSNS_AUTO)) { e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT; want_inline = false; } - else if (DECL_DECLARED_INLINE_P (callee->decl) + else if ((DECL_DECLARED_INLINE_P (callee->decl) || e->count) && inline_summary (callee)->min_size - inline_edge_summary (e)->call_stmt_size > 16 * MAX_INLINE_INSNS_SINGLE) { - e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT; + e->inline_failed = (DECL_DECLARED_INLINE_P (callee->decl) + ? CIF_MAX_INLINE_INSNS_SINGLE_LIMIT + : CIF_MAX_INLINE_INSNS_AUTO_LIMIT); want_inline = false; } else @@ -606,6 +609,7 @@ want_inline_small_function_p (struct cgr growth >= MAX_INLINE_INSNS_SINGLE && ((!big_speedup && !(hints
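The new hint's condition restates as simple arithmetic (known_hot_p is a made-up helper, not the GCC function): the edge must be hot by profile and must account for more than half of the caller's executions, i.e. edge count * 2 must exceed the caller's count.

```cpp
#include <cassert>

// Sketch of the INLINE_HINT_known_hot condition from the patch:
// a nonzero, profile-hot edge whose count is over 50% of the caller's.
static bool
known_hot_p (long long edge_count, long long caller_count, bool maybe_hot)
{
  return edge_count > 0 && maybe_hot && edge_count * 2 > caller_count;
}
```

So an edge executed 60 times from a caller executed 100 times qualifies, while one executed 40 times does not, which is the "at least 50% chance that the caller will invoke the call" criterion from the description.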
Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse
Marc Glisse marc.gli...@inria.fr writes: On Mon, 21 Apr 2014, Richard Henderson wrote: On 04/21/2014 01:19 PM, Bill Schmidt wrote:

+  if (GET_CODE (trueop0) == VEC_SELECT
+      && GET_MODE (XEXP (trueop0, 0)) == mode)
+    {
+      rtx op0_subop1 = XEXP (trueop0, 1);
+      gcc_assert (GET_CODE (op0_subop1) == PARALLEL);
+      gcc_assert (XVECLEN (trueop1, 0) == GET_MODE_NUNITS (mode));
+
+      /* Apply the outer ordering vector to the inner one.  (The inner
+         ordering vector is expressly permitted to be of a different
+         length than the outer one.)  If the result is { 0, 1, ..., n-1 }
+         then the two VEC_SELECTs cancel.  */
+      for (int i = 0; i < XVECLEN (trueop1, 0); ++i)
+        {
+          rtx x = XVECEXP (trueop1, 0, i);
+          gcc_assert (CONST_INT_P (x));
+          rtx y = XVECEXP (op0_subop1, 0, INTVAL (x));
+          gcc_assert (CONST_INT_P (y));

In two places you're asserting that you've got a constant permutation. Surely there should be a non-assertion check and graceful exit for either select to be a variable permutation. Note that in the case where trueop0 is a CONST_VECTOR, we already check each element of trueop1: gcc_assert (CONST_INT_P (x)); In the case where the result is a scalar, we also have: gcc_assert (CONST_INT_P (XVECEXP (trueop1, 0, 0))); so we will have other issues if something ever creates a variable vec_select. Not that a graceful exit will hurt of course.

I realise this isn't the point, but maybe we should go easy on this kind of gcc_assert. Using INTVAL is itself an assertion that you have a CONST_INT. Adding gcc_asserts on top (and so forcing the assert even in release compilers) kind-of subverts the --enable-checking option. Thanks, Richard
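A graceful-exit variant in the spirit of the review comments can be sketched as follows (invented helper name; -1 stands in for an element that is not a CONST_INT): simply report "no simplification" instead of asserting, so variable permutations fall through untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cancellation test that bails out on non-constant or malformed
// selection elements rather than asserting.
static bool
selects_cancel_checked (const std::vector<int> &outer,
                        const std::vector<int> &inner)
{
  for (std::size_t i = 0; i < outer.size (); ++i)
    {
      int o = outer[i];
      if (o < 0 || static_cast<std::size_t> (o) >= inner.size ())
        return false;                // variable or out-of-range: give up
      if (inner[o] < 0 || inner[o] != static_cast<int> (i))
        return false;                // variable element, or not identity
    }
  return true;
}
```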
RE: [PATCH] Remove keep_aligning from get_inner_reference
Hi Eric, On Tue, 22 Apr 2014 10:09:28, Eric Botcazou wrote: I still have that already-approved patch, updated to current trunk. I've successfully boot-strapped it on armv7-linux-gnueabihf with all languages enabled, including Ada. The test suite runs cleanly without any drop-outs. Thanks for the testing. Is it OK to commit now, or are there objections? I think that the patch is either incomplete or wrong, in the sense that it will break TYPE_ALIGN_OK support, unless this support is totally obsolete, in which case it ought to be totally removed instead of just partially. The Ada testsuite in the compiler isn't exhaustive enough to give any guarantee so I will need to conduct more testing. Can you sit on the patch a few weeks? Sure, and thanks again for your help. I was not able to find any difference on the generated code with or without that patch. Bernd. -- Eric Botcazou
Re: RFA: tweak integer type used for memcpy folding
On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. 
However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... Richard. Tested against trunk with x86_64-linux-gnu {,-m32}. OK to install? Thanks, Richard gcc/ * machmode.h (bitwise_mode_for_size): Declare. * stor-layout.h (bitwise_type_for_mode): Likewise. * stor-layout.c (bitwise_mode_for_size): New function. (bitwise_type_for_mode): Likewise. * builtins.c (fold_builtin_memory_op): Use it instead of int_mode_for_mode and build_nonstandard_integer_type. gcc/testsuite/ * gcc.dg/memcpy-5.c: New test. 
Index: gcc/machmode.h === --- gcc/machmode.h 2014-04-18 11:16:12.706092658 +0100 +++ gcc/machmode.h 2014-04-18 11:16:38.179299261 +0100 @@ -253,6 +253,8 @@ extern enum machine_mode smallest_mode_f extern enum machine_mode int_mode_for_mode (enum machine_mode); +extern enum machine_mode bitwise_mode_for_size (enum machine_mode); + /* Return a mode that is suitable for representing a vector, or BLKmode on failure. */ Index: gcc/stor-layout.h === --- gcc/stor-layout.h 2014-04-18 11:16:12.707092667 +0100
Re: [PATCH 00/89] Compile-time gimple-checking
On Mon, Apr 21, 2014 at 6:56 PM, David Malcolm dmalc...@redhat.com wrote: This is a greatly-expanded version of: http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01262.html As of r205034 (de6bd75e3c9bc1efe8a6387d48eedaa4dafe622d) and r205428 (a90353203da18288cdac1b0b78fe7b22c69fe63f) the various gimple statements form a C++ inheritance hierarchy, but we're not yet making much use of that in the code: everything refers to just gimple (or const_gimple), and type-checking is performed at run-time within the various gimple_foo_* accessors in gimple.h, and almost nowhere else. The following patch series introduces compile-time checking of much of the handling of gimple statements. Various new typedefs are introduced for pointers to statements where the specific code is known, matching the corresponding names from gimple.def. Even though I like these changes in principle I also wear a release manager's hat. Being one of the persons doing frequent backports of trunk fixes to branches this will cause a _lot_ of headache. So ... can we delay this until, say, 4.9.1 is out? Thanks, Richard. For example, it introduces a gimple_bind typedef, which is a (gimple_statement_bind *) which has the invariant that stmt->code == GIMPLE_BIND. The idea is that all of the gimple_foo_* accessors in gimple.h are converted from taking just a gimple to a gimple_foo. I've managed this so far for 15 of the gimple statement subclasses; for example, all of the gimple_bind_* accessors now require a gimple_bind rather than a plain gimple. Similarly, variables throughout the middle-end have their types strengthened from plain gimple to a typedef expressing a pointer to some concrete statement subclass, and similarly for vectors. For example various variables have their types strengthened from gimple to gimple_bind, and from plain vec<gimple> to vec<gimple_bind> (e.g. within gimplify.c for handling the bind stack).
Numerous other such typedefs are introduced: essentially two for each of the gimple code values: a gimple_foo and a const_gimple_foo variant e.g. gimple_switch and const_gimple_switch (some of the rarer codes don't have such typedefs yet). Some of these typedefs are aliases for existing subclasses within the gimple class hierarchy, but others are new with this patch series. As with the existing subclasses, they don't add any extra fields, they merely express invariants on the gimple's code. In each case there are some checked downcasts from gimple down to the more concrete statement-type, so that the runtime-checking in the checked build happens there, at the boundary between types, rather than the current checking, which is every time an accessor is called and almost nowhere else. Once we're in a more concrete type than gimple, the compiler can enforce the type-checking for us at compile-time. An additional benefit is that human readers of the code should (I hope) have an easier time following what's going on: assumptions about the underlying gimple_code of a stmt that previously were hidden are now obvious, expressed directly in the type system. For example, various variables in tree-into-ssa.c change from just vec<gimple> to being vec<gimple_phi>, capturing the phi-ness of the contents as a compile-time check (and then not needing to check them any more); indeed great swathes of phi-manipulation code are changed from acting on vanilla gimple to acting on gimple_phi. Similarly, within tree-inline.h's struct copy_body_data, the field debug_stmts can be concretized from a vec<gimple> to a vec<gimple_debug>. Another notable such concretization is that the call_stmt field of a cgraph_edge becomes a gimple_call, rather than a plain gimple. In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h).
I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a <gimple_statement_bind> (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags); break; where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at least): gs as a gimple switch, as opposed to: as a gimple switch... gs, which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of
Re: [PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c
On Tue, Apr 22, 2014 at 10:07 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: On 17/04/14 18:06, Daniel Marjamäki wrote: Hello! I am not against it.. However I think there is no danger. I see no potential use of uninitialized variable. The use of n_unroll is guarded by n_unroll_found. Hmmm... you're right. I guess the warning is just noise then. I'll leave it up to maintainers to take the patch or not then. If it only appears with the host compiler (not during bootstrap, in which case it would break bootstrap) then we don't want it. Richard. Thanks, Kyrill Best regards, Daniel Marjamäki
Re: [RFC] Add aarch64 support for ada
How about this? I added a check vs MINSIGSTKSZ just in case, and updated the commentary a bit. While 16K is 2*SIGSTKSIZE for i686, it certainly isn't for powerpc64. But since things are working as-is I thought the revision is clearer. Fine with me, thanks. -- Eric Botcazou
Re: RFA: tweak integer type used for memcpy folding
Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. 
However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Thanks, Richard
Re: RFA: tweak integer type used for memcpy folding
On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. 
MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. 
Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no? Thus if we can't get a integer type for either src or dest then we fail. But we should never end up with srctype or desttype being a float mode. No? Richard. Thanks, Richard
Re: RFA: tweak integer type used for memcpy folding
Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. 
The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... 
Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no? Thus if we can't get a integer type for either src or dest then we fail. But we should never end up with srctype or desttype being a float mode. No? Right, they don't have a float _mode_, because the target doesn't support
[4.9] Add no-dist for libffi and boehm-gc (PR other/43620)
Hi! In order to avoid the ftp.gnu.org security check failures in make dist* (that we don't use anyway), I've committed following fix to 4.9 branch. --- boehm-gc/ChangeLog (revision 209553) +++ boehm-gc/ChangeLog (working copy) @@ -1,3 +1,13 @@ +2014-04-22 Jakub Jelinek ja...@redhat.com + + PR other/43620 + * Makefile.am (AUTOMAKE_OPTIONS): Add no-dist. + * include/Makefile.am (AUTOMAKE_OPTIONS): Likewise. + * testsuite/Makefile.am (AUTOMAKE_OPTIONS): Likewise. + * Makefile.in: Regenerated. + * include/Makefile.in: Regenerated. + * testsuite/Makefile.in: Regenerated. + 2013-12-21 Andreas Tobler andre...@gcc.gnu.org * include/private/gcconfig.h: Add FreeBSD powerpc64 defines. --- boehm-gc/Makefile.am(revision 209553) +++ boehm-gc/Makefile.am(working copy) @@ -4,7 +4,7 @@ ## files that should be in the distribution are not mentioned in this ## Makefile.am. -AUTOMAKE_OPTIONS = foreign subdir-objects +AUTOMAKE_OPTIONS = foreign subdir-objects no-dist ACLOCAL_AMFLAGS = -I .. -I ../config SUBDIRS = include testsuite --- boehm-gc/include/Makefile.am(revision 209553) +++ boehm-gc/include/Makefile.am(working copy) @@ -1,4 +1,4 @@ -AUTOMAKE_OPTIONS = foreign +AUTOMAKE_OPTIONS = foreign no-dist noinst_HEADERS = gc.h gc_backptr.h gc_local_alloc.h \ gc_pthread_redirects.h gc_cpp.h --- boehm-gc/testsuite/Makefile.am (revision 209553) +++ boehm-gc/testsuite/Makefile.am (working copy) @@ -1,6 +1,6 @@ ## Process this file with automake to produce Makefile.in. -AUTOMAKE_OPTIONS = foreign dejagnu +AUTOMAKE_OPTIONS = foreign dejagnu no-dist EXPECT = expect --- libffi/ChangeLog(revision 209553) +++ libffi/ChangeLog(working copy) @@ -1,3 +1,12 @@ +2014-04-22 Jakub Jelinek ja...@redhat.com + + PR other/43620 + * configure.ac (AM_INIT_AUTOMAKE): Add no-dist. + * Makefile.in: Regenerated. + * include/Makefile.in: Regenerated. + * man/Makefile.in: Regenerated. + * testsuite/Makefile.in: Regenerated. 
+ 2014-03-12 Yufeng Zhang yufeng.zh...@arm.com * src/aarch64/sysv.S (ffi_closure_SYSV): Use x29 as the --- libffi/configure.ac (revision 209553) +++ libffi/configure.ac (working copy) @@ -12,7 +12,7 @@ target_alias=${target_alias-$host_alias} . ${srcdir}/configure.host -AM_INIT_AUTOMAKE +AM_INIT_AUTOMAKE([no-dist]) # See if makeinfo has been installed and is modern enough # that we can use it. --- boehm-gc/Makefile.in(revision 209553) +++ boehm-gc/Makefile.in(working copy) @@ -36,13 +36,10 @@ build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ subdir = . -DIST_COMMON = $(am__configure_deps) $(srcdir)/../compile \ - $(srcdir)/../config.guess $(srcdir)/../config.sub \ - $(srcdir)/../depcomp $(srcdir)/../install-sh \ - $(srcdir)/../ltmain.sh $(srcdir)/../missing \ - $(srcdir)/../mkinstalldirs $(srcdir)/Makefile.am \ - $(srcdir)/Makefile.in $(srcdir)/threads.mk.in \ - $(top_srcdir)/configure ChangeLog depcomp +DIST_COMMON = ChangeLog $(srcdir)/Makefile.in $(srcdir)/Makefile.am \ + $(top_srcdir)/configure $(am__configure_deps) \ + $(srcdir)/../mkinstalldirs $(srcdir)/threads.mk.in \ + $(srcdir)/../depcomp ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ $(top_srcdir)/../config/depstand.m4 \ @@ -63,14 +60,6 @@ CONFIG_CLEAN_FILES = threads.mk CONFIG_CLEAN_VPATH_FILES = LTLIBRARIES = $(noinst_LTLIBRARIES) am__DEPENDENCIES_1 = -am__libgcjgc_la_SOURCES_DIST = allchblk.c alloc.c blacklst.c \ - checksums.c dbg_mlc.c dyn_load.c finalize.c gc_dlopen.c \ - gcj_mlc.c headers.c malloc.c mallocx.c mark.c mark_rts.c \ - misc.c new_hblk.c obj_map.c os_dep.c pcr_interface.c \ - ptr_chck.c real_malloc.c reclaim.c specific.c stubborn.c \ - typd_mlc.c backgraph.c win32_threads.c pthread_support.c \ - pthread_stop_world.c darwin_stop_world.c \ - powerpc_darwin_mach_dep.s @POWERPC_DARWIN_TRUE@am__objects_1 = powerpc_darwin_mach_dep.lo am_libgcjgc_la_OBJECTS = allchblk.lo alloc.lo blacklst.lo checksums.lo \ dbg_mlc.lo 
dyn_load.lo finalize.lo gc_dlopen.lo gcj_mlc.lo \ @@ -80,14 +69,6 @@ am_libgcjgc_la_OBJECTS = allchblk.lo all backgraph.lo win32_threads.lo pthread_support.lo \ pthread_stop_world.lo darwin_stop_world.lo $(am__objects_1) libgcjgc_la_OBJECTS = $(am_libgcjgc_la_OBJECTS) -am__libgcjgc_convenience_la_SOURCES_DIST = allchblk.c alloc.c \ - blacklst.c checksums.c dbg_mlc.c dyn_load.c finalize.c \ - gc_dlopen.c gcj_mlc.c headers.c malloc.c mallocx.c mark.c \ - mark_rts.c misc.c new_hblk.c obj_map.c os_dep.c \ - pcr_interface.c ptr_chck.c real_malloc.c reclaim.c specific.c \ - stubborn.c typd_mlc.c backgraph.c win32_threads.c \ - pthread_support.c pthread_stop_world.c darwin_stop_world.c \ -
Re: [PATCH] Remove keep_aligning from get_inner_reference
Sure, and thanks again for your help. Thanks! I was not able to find any difference on the generated code with or without that patch. Yes, my gut feeling is that TYPE_ALIGN_OK is really obsolete now. It is set in a single place in the compiler (gcc-interface/decl.c:gnat_to_gnu_entity): /* Tell the middle-end that objects of tagged types are guaranteed to be properly aligned. This is necessary because conversions to the class-wide type are translated into conversions to the root type, which can be less aligned than some of its derived types. */ if (Is_Tagged_Type (gnat_entity) || Is_Class_Wide_Equivalent_Type (gnat_entity)) TYPE_ALIGN_OK (gnu_type) = 1; but we changed the way these conversions are done some time ago. -- Eric Botcazou
[PING] [PATCH, AARCH64] mov<mode>cc for fcsel
Ping? Rebase and test. Bootstrap and no make check regression with qemu. OK for trunk? Thanks! -Zhenqiang On 18 March 2014 16:16, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, For float value, movsfcc/movdfcc is required by emit_conditional_move called in ifcvt pass to expand if-then-else to fcsel insn. Bootstrap and no make check regression with qemu-aarch64. Is it OK for next stage1? Thanks! -Zhenqiang ChangeLog: 2014-03-18 Zhenqiang Chen zhenqiang.c...@linaro.org * config/aarch64/aarch64.md (movmodecc): New for GPF. testsuite/ChangeLog: 2014-03-18 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.target/aarch64/fcsel.c: New test case. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 99a6ac8..0f4b8ebf 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2344,6 +2344,25 @@ } ) +(define_expand movmodecc + [(set (match_operand:GPF 0 register_operand ) +(if_then_else:GPF (match_operand 1 aarch64_comparison_operator ) + (match_operand:GPF 2 register_operand ) + (match_operand:GPF 3 register_operand )))] + + { +rtx ccreg; +enum rtx_code code = GET_CODE (operands[1]); + +if (code == UNEQ || code == LTGT) + FAIL; + +ccreg = aarch64_gen_compare_reg (code, XEXP (operands[1], 0), + XEXP (operands[1], 1)); +operands[1] = gen_rtx_fmt_ee (code, VOIDmode, ccreg, const0_rtx); + } +) + (define_insn *csinc2mode_insn [(set (match_operand:GPI 0 register_operand =r) (plus:GPI (match_operator:GPI 2 aarch64_comparison_operator diff --git a/gcc/testsuite/gcc.target/aarch64/fcsel.c b/gcc/testsuite/gcc.target/aarch64/fcsel.c new file mode 100644 index 000..9c5431a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fcsel.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ + +float f1 (float a, float b, float c, float d) +{ + if (a 0.0) +return c; + else +return 2.0; +} + +double f2 (double a, double b, double c, double d) +{ + if (a b) +return c; + else +return d; +} + +/* { dg-final { scan-assembler-times 
\tfcsel 2 } } */
Re: [PING] [PATCH, AARCH64] mov<mode>cc for fcsel
On Apr 22, 2014, at 2:36 AM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Ping? Rebase and test. Bootstrap and no make check regression with qemu. OK for trunk? This is the exact same patch we (Cavium) came up with for this missed optimization. Thanks, Andrew Thanks! -Zhenqiang On 18 March 2014 16:16, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, For float values, movsfcc/movdfcc is required by emit_conditional_move (called in the ifcvt pass) to expand if-then-else to an fcsel insn. Bootstrap and no make check regression with qemu-aarch64. Is it OK for next stage1? Thanks! -Zhenqiang

ChangeLog: 2014-03-18 Zhenqiang Chen zhenqiang.c...@linaro.org * config/aarch64/aarch64.md (mov<mode>cc): New for GPF. testsuite/ChangeLog: 2014-03-18 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.target/aarch64/fcsel.c: New test case.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..0f4b8ebf 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2344,6 +2344,25 @@
 }
 )
 
+(define_expand "mov<mode>cc"
+  [(set (match_operand:GPF 0 "register_operand" "")
+	(if_then_else:GPF (match_operand 1 "aarch64_comparison_operator" "")
+			  (match_operand:GPF 2 "register_operand" "")
+			  (match_operand:GPF 3 "register_operand" "")))]
+  ""
+  {
+    rtx ccreg;
+    enum rtx_code code = GET_CODE (operands[1]);
+
+    if (code == UNEQ || code == LTGT)
+      FAIL;
+
+    ccreg = aarch64_gen_compare_reg (code, XEXP (operands[1], 0),
+				     XEXP (operands[1], 1));
+    operands[1] = gen_rtx_fmt_ee (code, VOIDmode, ccreg, const0_rtx);
+  }
+)
+
 (define_insn "*csinc2<mode>_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
	(plus:GPI (match_operator:GPI 2 "aarch64_comparison_operator"

diff --git a/gcc/testsuite/gcc.target/aarch64/fcsel.c b/gcc/testsuite/gcc.target/aarch64/fcsel.c
new file mode 100644
index 000..9c5431a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fcsel.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+float f1 (float a, float b, float c, float d)
+{
+  if (a > 0.0)
+    return c;
+  else
+    return 2.0;
+}
+
+double f2 (double a, double b, double c, double d)
+{
+  if (a > b)
+    return c;
+  else
+    return d;
+}
+
+/* { dg-final { scan-assembler-times "\tfcsel" 2 } } */
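The expander added by this patch lets ifcvt's emit_conditional_move turn a floating-point if-then-else into a single fcsel instruction. A minimal C illustration of the kind of source the test case exercises (the function name here is illustrative, not from the patch):

```c
#include <assert.h>

/* A branchless float-select in source form: with the patch's movcc
   expander for float modes present, ifcvt can emit this as one fcsel
   instruction instead of a compare-and-branch sequence.  */
static double fsel (double a, double b, double c, double d)
{
  return a < b ? c : d;
}
```

Whether a branch or an fcsel is actually emitted depends on the target and optimization level; the patch's dg-final directive checks that two fcsel instructions appear at -O2.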
Re: RFA: tweak integer type used for memcpy folding
On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. 
The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... 
the folding should have failed if we can't copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no? Thus if we can't get a integer type for either src or dest then we fail. But we should never end up with srctype or desttype being a
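The transformation under discussion can be pictured in source terms: a copy of a floating-point object is rewritten as a load and store through an integer type of the same size, which is only safe when such an integer mode exists and its alignment does not exceed the original's. A hand-written equivalent (illustrative only, not the GCC folding code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a double through a same-size integer, as the memcpy folder
   effectively does when it substitutes an integer type for a
   floating-point destination/source type.  */
static void copy_as_int (double *dst, const double *src)
{
  uint64_t tmp;                      /* integer mode of equal size */
  memcpy (&tmp, src, sizeof tmp);    /* integer-mode load  */
  memcpy (dst, &tmp, sizeof tmp);    /* integer-mode store */
}
```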
Re: [PATCH, x86] merge movsd/movhpd pair in peephole
On Tue, Apr 22, 2014 at 2:59 AM, Xinliang David Li davi...@google.com wrote: Bin, when will the patch for the generic pass be available for review? Hi, the patch is still being worked on and reviewed. For ARM we only need to handle simple loads/stores, so it may need to be extended to handle generic memory access instructions on x86. Thanks, bin -- Best Regards.
Re: RFA: tweak integer type used for memcpy folding
Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. 
The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... 
the folding should have failed if we can't copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no? Thus if we can't get a integer type for either src or dest then we fail. But we should
Re: [PATCH, AArch64 1/6] aarch64: Add addti3 and subti3 patterns
On 8 January 2014 18:13, Richard Henderson r...@redhat.com wrote: * config/aarch64/aarch64.md (addti3, subti3): New expanders. (add<GPI>3_compare0): Remove leading * from name. (add<GPI>3_carryin): Likewise. (sub<GPI>3_compare0): Likewise. (sub<GPI>3_carryin): Likewise. I think this should go in now we are in stage-1 /Marcus
Re: [PATCH, AArch64 3/6] aarch64: Add multi3 pattern
On 8 January 2014 18:13, Richard Henderson r...@redhat.com wrote: * config/aarch64/aarch64.md (multi3): New expander. (madd<GPI>): Remove leading * from name. I think this should go in now we are in stage-1 /Marcus
Re: [PATCH, AArch64 2/6] aarch64: Add mulditi3 and umulditi3 patterns
On 8 January 2014 18:13, Richard Henderson r...@redhat.com wrote: * config/aarch64/aarch64.md (<su_optab>mulditi3): New expander. --- gcc/config/aarch64/aarch64.md | 17 + 1 file changed, 17 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c4acdfc..0b3943d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2078,6 +2078,23 @@
   [(set_attr "type" "<su>mull")]
 )
 
+(define_expand "<su_optab>mulditi3"
+  [(set (match_operand:TI 0 "register_operand")
+	(mult:TI (ANY_EXTEND:TI (match_operand:DI 1 "register_operand"))
+		 (ANY_EXTEND:TI (match_operand:DI 2 "register_operand"))))]
+  ""
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_<su>muldi3_highpart (high, operands[1], operands[2]));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
+
 (define_insn "<su>muldi3_highpart"
   [(set (match_operand:DI 0 "register_operand" "=r")
	(truncate:DI
-- 
1.8.4.2

I think this should go in now we are in stage-1 /Marcus
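The expander composes the 128-bit product from two existing 64-bit patterns: muldi3 for the low half and a highpart multiply for the high half. The same composition written in C (using the compiler's __int128 extension, where available, to model the TImode result; illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Build a 128-bit unsigned product the way the expander does:
   low half via a plain 64-bit multiply (muldi3), high half via a
   highpart multiply, then glue the two halves together.  */
static unsigned __int128 umulditi (uint64_t a, uint64_t b)
{
  uint64_t low  = a * b;  /* wraps: keeps the low 64 bits, like muldi3 */
  uint64_t high = (uint64_t) (((unsigned __int128) a * b) >> 64);
                          /* high 64 bits, like umuldi3_highpart */
  return ((unsigned __int128) high << 64) | low;
}
```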
Ping: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration.
On Mon, 2014-04-14 at 23:19 +0200, Mark Wielaard wrote: On Fri, 2014-04-11 at 11:03 -0700, Cary Coutant wrote: The DWARF bits are fine with me. Thanks. Who can approve the other bits? You should probably get C and C++ front end approval. I'm not really sure who needs to review patches in c-family/. Since the part in c/ is so tiny, maybe all you need is a C++ front end maintainer. Both Richard Henderson and Jason Merrill are global reviewers, so either of them could approve the whole thing. Thanks, I added them to the CC. When approved should I wait till stage 1 opens before committing? Yes. The PR you're fixing is an enhancement request, not a regression, so it needs to wait. Since stage one just opened up again this seems a good time to re-ask for approval then :) Rebased patch against current trunk attached. Ping. Tom already pushed his patches to GDB that take advantage of the new information if available. Thanks, Mark From 81c76099294a9d617798ed65095d7f8210d6f958 Mon Sep 17 00:00:00 2001 From: Mark Wielaard m...@redhat.com Date: Sun, 23 Mar 2014 12:05:16 +0100 Subject: [PATCH] PR debug/16063. Add DW_AT_type to DW_TAG_enumeration. Add a new lang-hook that provides the underlying base type of an ENUMERAL_TYPE. Including implementations for C and C++. Use this enum_underlying_base_type lang-hook in dwarf2out.c to add a DW_AT_type base type reference to a DW_TAG_enumeration. gcc/ * dwarf2out.c (gen_enumeration_type_die): Add DW_AT_type if enum_underlying_base_type defined and DWARF version 3. * langhooks.h (struct lang_hooks_for_types): Add enum_underlying_base_type. * langhooks-def.h (LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE): New define. (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add new lang hook. gcc/c-family/ * c-common.c (c_enum_underlying_base_type): New function. * c-common.h (c_enum_underlying_base_type): Add declaration. gcc/c/ * c-objc-common.h (LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE): Define. gcc/cp/ * cp-lang.c (cxx_enum_underlying_base_type): New function. 
(LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE): Define. --- gcc/ChangeLog | 10 ++ gcc/c-family/ChangeLog |6 ++ gcc/c-family/c-common.c |8 gcc/c-family/c-common.h |1 + gcc/c/ChangeLog |5 + gcc/c/c-objc-common.h |2 ++ gcc/cp/ChangeLog|6 ++ gcc/cp/cp-lang.c| 18 ++ gcc/dwarf2out.c |6 ++ gcc/langhooks-def.h |4 +++- gcc/langhooks.h |2 ++ 11 files changed, 67 insertions(+), 1 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e34c39f..26a9037 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2014-03-21 Mark Wielaard m...@redhat.com + + PR debug/16063 + * dwarf2out.c (gen_enumeration_type_die): Add DW_AT_type if + enum_underlying_base_type defined and DWARF version 3. + * langhooks.h (struct lang_hooks_for_types): Add + enum_underlying_base_type. + * langhooks-def.h (LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE): New define. + (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add new lang hook. + 2014-04-22 Ian Bolton ian.bol...@arm.com * config/arm/arm-protos.h (tune_params): New struct members. diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog index 206b47b..0b7b7e1 100644 --- a/gcc/c-family/ChangeLog +++ b/gcc/c-family/ChangeLog @@ -1,3 +1,9 @@ +2014-03-21 Mark Wielaard m...@redhat.com + + PR debug/16063 + * c-common.c (c_enum_underlying_base_type): New function. + * c-common.h (c_enum_underlying_base_type): Add declaration. + 2014-04-14 Richard Biener rguent...@suse.de Marc Glisse marc.gli...@inria.fr diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index c0e247b..a7c212f 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -3902,6 +3902,14 @@ c_register_builtin_type (tree type, const char* name) registered_builtin_types = tree_cons (0, type, registered_builtin_types); } + +/* The C version of the enum_underlying_base_type langhook. 
*/ +tree +c_enum_underlying_base_type (const_tree type) +{ + return c_common_type_for_size (TYPE_PRECISION (type), TYPE_UNSIGNED (type)); +} + /* Print an error message for invalid operands to arith operation CODE with TYPE0 for operand 0, and TYPE1 for operand 1. diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 24959d8..d433972 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -832,6 +832,7 @@ extern void c_common_finish (void); extern void c_common_parse_file (void); extern alias_set_type c_common_get_alias_set (tree); extern void c_register_builtin_type (tree, const char*); +extern tree c_enum_underlying_base_type (const_tree); extern bool c_promoting_integer_type_p (const_tree); extern int self_promoting_args_p (const_tree); extern tree strip_pointer_operator (tree); diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog index bacfbe3..cf4252b 100644 --- a/gcc/c/ChangeLog +++ b/gcc/c/ChangeLog @@
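The new langhook simply asks the front end for the integer type whose precision and signedness match the enum; conceptually it is a size/signedness lookup like the following (an illustrative model, not the GCC code — the real hook delegates to c_common_type_for_size):

```c
#include <assert.h>
#include <string.h>

/* Model of picking the underlying integer base type from an enum's
   precision and signedness, the type the new DW_AT_type on a
   DW_TAG_enumeration refers to.  Names are illustrative.  */
static const char *
type_for_size (int bits, int is_unsigned)
{
  switch (bits)
    {
    case 8:  return is_unsigned ? "unsigned char"  : "signed char";
    case 16: return is_unsigned ? "unsigned short" : "short";
    case 32: return is_unsigned ? "unsigned int"   : "int";
    case 64: return is_unsigned ? "unsigned long long" : "long long";
    default: return "no matching type";
    }
}
```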
RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
Hello, Updated doc/invoke.texi by stating that the new option is enabled by default at -O2 and higher. Also, -fmerge-bitfields added to the list of optimization flags enabled by default at -O2 and higher. Regards, Zoran Jovanovic -- Lowering is applied only for bit-field copy sequences that are merged. The data structure representing bit-field copy sequences is renamed and reduced in size. Optimization turned on by default for -O2 and higher. Some comments fixed. Benchmarking performed on WebKit for Android. Code size reduction noticed on several files, best examples are: core/rendering/style/StyleMultiColData (632->520 bytes) core/platform/graphics/FontDescription (1715->1475 bytes) core/rendering/style/FillLayer (5069->4513 bytes) core/rendering/style/StyleRareInheritedData (5618->5346) core/css/CSSSelectorList (4047->3887) core/platform/animation/CSSAnimationData (3844->3440 bytes) core/css/resolver/FontBuilder (13818->13350 bytes) core/platform/graphics/Font (16447->15975 bytes) Example: One of the motivating examples for this work was the copy constructor of a class which contains bit-fields.
C++ code:

class A
{
public:
  A(const A &x);
  unsigned a : 1;
  unsigned b : 2;
  unsigned c : 4;
};

A::A(const A &x)
{
  a = x.a;
  b = x.b;
  c = x.c;
}

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:

  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:

  lw   $3,0($5)
  lbu  $2,0($4)
  andi $3,$3,0x1
  andi $2,$2,0xfe
  or   $2,$2,$3
  sb   $2,0($4)
  lw   $3,0($5)
  andi $2,$2,0xf9
  andi $3,$3,0x6
  or   $2,$2,$3
  sb   $2,0($4)
  lw   $3,0($5)
  andi $2,$2,0x87
  andi $3,$3,0x78
  or   $2,$2,$3
  j    $31
  sb   $2,0($4)

Optimized MIPS32r2 assembly code:

  lw   $3,0($5)
  lbu  $2,0($4)
  andi $3,$3,0x7f
  andi $2,$2,0x80
  or   $2,$3,$2
  j    $31
  sb   $2,0($4)

The algorithm works at the basic block level and consists of the following 3 major steps: 1. Go through the basic block statement list. If there are statement pairs that implement a copy of bit-field content from one memory location to another, record the statement pointers and other necessary data in the corresponding data structure. 2. Identify records that represent adjacent bit-field accesses and mark them as merged. 3. Lower bit-field accesses by using the new field size for those that can be merged. A new command line option -fmerge-bitfields is introduced. Tested - passed gcc regression tests for MIPS32r2. Changelog - gcc/ChangeLog: 2014-04-22 Zoran Jovanovic (zoran.jovano...@imgtec.com) * common.opt (fmerge-bitfields): New option. * doc/invoke.texi: Add reference to -fmerge-bitfields. * doc/invoke.texi: Add -fmerge-bitfields to the list of optimization flags turned on at -O2. * tree-sra.c (lower_bitfields): New function. Entry for (-fmerge-bitfields). (part_of_union_p): New function. (bf_access_candidate_p): New function. (lower_bitfield_read): New function. (lower_bitfield_write): New function.
(bitfield_stmt_bfcopy_pair::hash): New function. (bitfield_stmt_bfcopy_pair::equal): New function. (bitfield_stmt_bfcopy_pair::remove): New function. (create_and_insert_bfcopy): New function. (get_bit_offset): New function. (add_stmt_bfcopy_pair): New function. (cmp_bfcopies): New function. (get_merged_bit_field_size): New function. * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. (field_byte_offset): Move declaration to tree.h and make it extern. * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. (simple_type_size_in_bits): Move from dwarf2out.c. * tree.h (expressions_equal_p): Add declaration. (field_byte_offset): Add declaration. Patch - diff --git a/gcc/common.opt b/gcc/common.opt index da275e5..52c7f58 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2203,6 +2203,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +fmerge-bitfields +Common Report Var(flag_tree_bitfield_merge) Optimization +Merge loads and stores of consecutive bitfields + ftree-ter Common Report Var(flag_tree_ter) Optimization
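What -fmerge-bitfields achieves can be written by hand: the three read-modify-write cycles for a, b and c in the patch's example class collapse into one masked move of the 7 adjacent bits. The sketch below assumes the common layout where the fields pack little-endian into the low bits of the first byte (an assumption, not guaranteed by the C standard):

```c
#include <assert.h>
#include <string.h>

/* Same field layout as the motivating example.  */
struct bf { unsigned a : 1; unsigned b : 2; unsigned c : 4; };

/* Hand-merged copy: one masked move of all 7 bits instead of three
   separate bit-field read-modify-write sequences.  Assumes the
   fields occupy the low 7 bits of the first byte.  */
static void
copy_merged (struct bf *dst, const struct bf *src)
{
  unsigned char s, d;
  memcpy (&s, src, 1);
  memcpy (&d, dst, 1);
  d = (unsigned char) ((d & 0x80u) | (s & 0x7fu)); /* one 7-bit move */
  memcpy (dst, &d, 1);
}
```

This is exactly the shape of the optimized GIMPLE in the example: a single BIT_FIELD_REF load, a mask of the untouched bits, an OR, and one store.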
Re: [AArch64] Relax modes_tieable_p and cannot_change_mode_class
*ping* Thanks, James On Tue, Feb 18, 2014 at 12:40:24PM +, James Greenhalgh wrote: Hi, We aim to improve code generation for the vector structure types such as int64x2x4_t, as used in the vld/st{2,3,4} lane neon intrinsics. It should be possible and cheap to get individual vectors in and out of these structures - these structures are implemented as opaque integer modes straddling multiple vector registers. To do this, we want to weaken the conditions for aarch64_cannot_change_mode_class - to permit cheap subreg operations - and for TARGET_MODES_TIEABLE_P - to allow all combinations of vector structure and vector types to coexist. Regression tested on aarch64-none-elf with no issues. This is a bit too intrusive for Stage 4, but is it OK to queue for Stage 1? Thanks, James --- gcc/ 2014-02-18 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-protos.h (aarch64_modes_tieable_p): New. * config/aarch64/aarch64.c (aarch64_cannot_change_mode_class): Weaken conditions. (aarch64_modes_tieable_p): New. * config/aarch64/aarch64.h (MODES_TIEABLE_P): Use it. 
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 5542f02..04cbc78 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -175,6 +175,8 @@ bool aarch64_is_extend_from_extract (enum machine_mode, rtx, rtx);
 bool aarch64_is_long_call_p (rtx);
 bool aarch64_label_mentioned_p (rtx);
 bool aarch64_legitimate_pic_operand_p (rtx);
+bool aarch64_modes_tieable_p (enum machine_mode mode1,
+			      enum machine_mode mode2);
 bool aarch64_move_imm (HOST_WIDE_INT, enum machine_mode);
 bool aarch64_mov_operand_p (rtx, enum aarch64_symbol_context,
			    enum machine_mode);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ea90311..853d1a9 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8283,7 +8283,8 @@ aarch64_cannot_change_mode_class (enum machine_mode from,
   /* Limited combinations of subregs are safe on FPREGs.  Particularly,
      1. Vector Mode to Scalar mode where 1 unit of the vector is accessed.
      2. Scalar to Scalar for integer modes or same size float modes.
-     3. Vector to Vector modes.  */
+     3. Vector to Vector modes.
+     4. On little-endian only, Vector-Structure to Vector modes.  */
   if (GET_MODE_SIZE (from) > GET_MODE_SIZE (to))
     {
       if (aarch64_vector_mode_supported_p (from)
@@ -8299,11 +8300,41 @@ aarch64_cannot_change_mode_class (enum machine_mode from,
       if (aarch64_vector_mode_supported_p (from)
	   && aarch64_vector_mode_supported_p (to))
	 return false;
+
+      /* Within a vector structure straddling multiple vector registers
+	 we are in a mixed-endian representation.  As such, we can't
+	 easily change modes for BYTES_BIG_ENDIAN.  Otherwise, we can
+	 switch between vectors and vector structures cheaply.  */
+      if (!BYTES_BIG_ENDIAN)
+	if ((aarch64_vector_mode_supported_p (from)
+	     && aarch64_vect_struct_mode_p (to))
+	    || (aarch64_vector_mode_supported_p (to)
+		&& aarch64_vect_struct_mode_p (from)))
+	  return false;
     }
 
   return true;
 }
 
+/* Implement MODES_TIEABLE_P.  */
+
+bool
+aarch64_modes_tieable_p (enum machine_mode mode1, enum machine_mode mode2)
+{
+  if (GET_MODE_CLASS (mode1) == GET_MODE_CLASS (mode2))
+    return true;
+
+  /* We specifically want to allow elements of structure modes to
+     be tieable to the structure.  This more general condition allows
+     other rarer situations too.  */
+  if (TARGET_SIMD
+      && aarch64_vector_mode_p (mode1)
+      && aarch64_vector_mode_p (mode2))
+    return true;
+
+  return false;
+}
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 13c424c..a85de99 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -362,8 +362,7 @@ extern unsigned long aarch64_tune_flags;
 
 #define HARD_REGNO_MODE_OK(REGNO, MODE)	aarch64_hard_regno_mode_ok (REGNO, MODE)
 
-#define MODES_TIEABLE_P(MODE1, MODE2)\
-  (GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2))
+#define MODES_TIEABLE_P(MODE1, MODE2) aarch64_modes_tieable_p (MODE1, MODE2)
 
 #define DWARF2_UNWIND_INFO 1
Re: [AArch64] Relax modes_tieable_p and cannot_change_mode_class
On 18 February 2014 12:40, James Greenhalgh james.greenha...@arm.com wrote: 2014-02-18 James Greenhalgh james.greenha...@arm.com * config/aarch64/aarch64-protos.h (aarch64_modes_tieable_p): New. * config/aarch64/aarch64.c (aarch64_cannot_change_mode_class): Weaken conditions. (aarch64_modes_tieable_p): New. * config/aarch64/aarch64.h (MODES_TIEABLE_P): Use it. OK /Marcus
Re: RFA: tweak integer type used for memcpy folding
On Tue, Apr 22, 2014 at 11:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 11:15 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Biener richard.guent...@gmail.com writes: On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford rdsandif...@googlemail.com wrote: wide-int fails to build libitm because of a bad interaction between: /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are only holders for vectors during data movement. */ #define MAX_BITSIZE_MODE_ANY_INT (128) and the memcpy folding code: /* Make sure we are not copying using a floating-point mode or a type whose size possibly does not match its precision. */ if (FLOAT_MODE_P (TYPE_MODE (desttype)) || TREE_CODE (desttype) == BOOLEAN_TYPE || TREE_CODE (desttype) == ENUMERAL_TYPE) { /* A more suitable int_mode_for_mode would return a vector integer mode for a vector float mode or a integer complex mode for a float complex mode if there isn't a regular integer mode covering the mode of desttype. */ enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype)); if (mode == BLKmode) desttype = NULL_TREE; else desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } if (FLOAT_MODE_P (TYPE_MODE (srctype)) || TREE_CODE (srctype) == BOOLEAN_TYPE || TREE_CODE (srctype) == ENUMERAL_TYPE) { enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype)); if (mode == BLKmode) srctype = NULL_TREE; else srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1); } The failure occurs for complex long double, which we try to copy as a 256-bit integer type (OImode). 
This patch tries to do what the comment suggests by introducing a new form of int_mode_for_mode that replaces vector modes with vector modes and complex modes with complex modes. The fallback case of using a MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above 128 bits on x86_64. The question then is what to do about 128-bit types for i386. MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be used for optimisation. However, gcc.target/i386/pr49168-1.c only passes for -m32 -msse2 because we use int128_t to copy a float128_t. I handled that by allowing MODE_VECTOR_INT to be used instead of MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE, even if the original type wasn't a vector. Hmm. Sounds reasonable unless there are very weird targets that cannot efficiently load/store vectors unaligned but can handle efficient load/store of unaligned scalars. Yeah, in general there's no guarantee that even int_mode_for_mode will return a mode with the same alignment as the original. Callers need to check that (like the memcpy folder does). It might be that other callers to int_mode_for_mode should use the new function too, but I'll look at that separately. I used the attached testcase (with printfs added to gcc) to check that the right modes and types were being chosen. The patch fixes the complex float and complex double cases, since the integer type that we previously picked had a larger alignment than the original complex type. As of complex int modes - are we positively sure that targets even try to do sth optimal for loads/stores of those? Complex modes usually aren't handled directly by .md patterns, either int or float. They're really treated as a pair of values. So IMO it still makes sense to fold this case. 
One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype)) is that vectors are copied as integer vectors if the target supports them directly but are copied as float vectors otherwise, since in the latter case the mode will be BLKmode. E.g. the 1024-bit vectors in the test are copied as vector floats and vector doubles both before and after the patch. That wasn't intended ... the folding should have failed if we can't copy using an integer mode ... Does that mean that the fold give up if TYPE_MODE is BLKmode? I can do that as a separate patch if so. Looking at the code again it should always choose an integer mode/type via setting desttype/srctype to NULL for BLKmode and if (!srctype) srctype = desttype; if (!desttype) desttype = srctype; if (!srctype) return NULL_TREE; no?
Re: [PATCH 22/89] Introduce gimple_goto
On Mon, Apr 21, 2014 at 12:56:53PM -0400, David Malcolm wrote: gcc/ * coretypes.h (gimple_goto): New typedef. (const_gimple_goto): New typedef. * gimple.h (gimple_statement_goto): New subclass of gimple_statement_with_ops, adding the invariant that stmt->code == GIMPLE_GOTO. (gimple_statement_base::as_a_gimple_goto): New. (gimple_statement_base::dyn_cast_gimple_goto): New. (is_a_helper <gimple_statement_goto>::test): New. (gimple_build_goto): Return a gimple_goto rather than a plain gimple. * gimple-pretty-print.c (dump_gimple_goto): Require a gimple_goto rather than a plain gimple. (pp_gimple_stmt_1): Add a checked cast to gimple_goto within GIMPLE_GOTO case of switch statement. * gimple.c (gimple_build_goto): Return a gimple_goto rather than a plain gimple. * tree-cfg.c (verify_gimple_goto): Require a gimple_goto rather than a plain gimple. (verify_gimple_stmt): Add a checked cast to gimple_goto within GIMPLE_GOTO case of switch statement. --- gcc/coretypes.h | 4 gcc/gimple-pretty-print.c | 4 ++-- didn't you miss updating the gdb pretty printers in this one?
Trev gcc/gimple.c | 5 +++-- gcc/gimple.h | 36 +++- gcc/tree-cfg.c| 4 ++-- 5 files changed, 46 insertions(+), 7 deletions(-) diff --git a/gcc/coretypes.h b/gcc/coretypes.h index d5c62b9..1d04d07 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -77,6 +77,10 @@ struct gimple_statement_debug; typedef struct gimple_statement_debug *gimple_debug; typedef const struct gimple_statement_debug *const_gimple_debug; +struct gimple_statement_goto; +typedef struct gimple_statement_goto *gimple_goto; +typedef const struct gimple_statement_goto *const_gimple_goto; + struct gimple_statement_label; typedef struct gimple_statement_label *gimple_label; typedef const struct gimple_statement_label *const_gimple_label; diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index 90baded..6ee6ce9 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -874,7 +874,7 @@ dump_gimple_label (pretty_printer *buffer, gimple_label gs, int spc, int flags) TDF_* in dumpfile.h). */ static void -dump_gimple_goto (pretty_printer *buffer, gimple gs, int spc, int flags) +dump_gimple_goto (pretty_printer *buffer, gimple_goto gs, int spc, int flags) { tree label = gimple_goto_dest (gs); if (flags TDF_RAW) @@ -2115,7 +2115,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple gs, int spc, int flags) break; case GIMPLE_GOTO: - dump_gimple_goto (buffer, gs, spc, flags); + dump_gimple_goto (buffer, gs-as_a_gimple_goto (), spc, flags); break; case GIMPLE_NOP: diff --git a/gcc/gimple.c b/gcc/gimple.c index 222c068..b73fc74 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -495,10 +495,11 @@ gimple_build_label (tree label) /* Build a GIMPLE_GOTO statement to label DEST. 
*/ -gimple +gimple_goto gimple_build_goto (tree dest) { - gimple p = gimple_build_with_ops (GIMPLE_GOTO, ERROR_MARK, 1); + gimple_goto p = gimple_build_with_ops (GIMPLE_GOTO, ERROR_MARK, + 1)->as_a_gimple_goto (); gimple_goto_set_dest (p, dest); return p; } diff --git a/gcc/gimple.h b/gcc/gimple.h index 39ac2dc..a4c7b30 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -222,6 +222,12 @@ public: return as_a <gimple_statement_debug> (this); } + inline gimple_goto + as_a_gimple_goto () + { + return as_a <gimple_statement_goto> (this); + } + inline gimple_label as_a_gimple_label () { @@ -290,6 +296,12 @@ public: return dyn_cast <gimple_statement_debug> (this); } + inline gimple_goto + dyn_cast_gimple_goto () + { + return dyn_cast <gimple_statement_goto> (this); + } + inline gimple_label dyn_cast_gimple_label () { @@ -922,6 +934,20 @@ struct GTY((tag(GSS_WITH_OPS))) }; /* A statement with the invariant that + stmt->code == GIMPLE_GOTO + i.e. a goto statement. + + This type will normally be accessed via the gimple_goto and + const_gimple_goto typedefs (in coretypes.h), which are pointers to + this type. */ + +struct GTY((tag(GSS_WITH_OPS))) + gimple_statement_goto : public gimple_statement_with_ops +{ + /* no additional fields; this uses the layout for GSS_WITH_OPS. */ +}; + +/* A statement with the invariant that stmt->code == GIMPLE_LABEL i.e. a label statement. @@ -1024,6 +1050,14 @@ is_a_helper <gimple_statement_debug>::test (gimple gs) template <> template <> inline bool +is_a_helper <gimple_statement_goto>::test (gimple gs) +{ + return gs->code == GIMPLE_GOTO; +} + +template <> +template <> +inline bool is_a_helper <gimple_statement_label>::test (gimple
Re: [PATCH 26/89] Introduce gimple_eh_filter
On Mon, Apr 21, 2014 at 12:56:57PM -0400, David Malcolm wrote: gcc/ * coretypes.h (gimple_eh_filter): New typedef. (const_gimple_eh_filter): New typedef. * gimple.h (gimple_statement_base::as_a_gimple_eh_filter): New. (gimple_build_eh_filter): Return a gimple_eh_filter rather than a plain gimple. * gimple-pretty-print.c (dump_gimple_eh_filter): Require a gimple_eh_filter rather than a plain gimple. (pp_gimple_stmt_1): Add checked cast to gimple_eh_filter within GIMPLE_EH_FILTER case of switch statement. * gimple.c (gimple_build_eh_filter): Return a gimple_eh_filter rather than a plain gimple. --- gcc/coretypes.h | 4 gcc/gimple-pretty-print.c | 5 +++-- same question about pretty printers. Trev gcc/gimple.c | 5 +++-- gcc/gimple.h | 8 +++- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 1dd36fb..592b9e5 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -117,6 +117,10 @@ struct gimple_statement_catch; typedef struct gimple_statement_catch *gimple_catch; typedef const struct gimple_statement_catch *const_gimple_catch; +struct gimple_statement_eh_filter; +typedef struct gimple_statement_eh_filter *gimple_eh_filter; +typedef const struct gimple_statement_eh_filter *const_gimple_eh_filter; + struct gimple_statement_phi; typedef struct gimple_statement_phi *gimple_phi; typedef const struct gimple_statement_phi *const_gimple_phi; diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index ec16f13..37f28d9 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -994,7 +994,8 @@ dump_gimple_catch (pretty_printer *buffer, gimple_catch gs, int spc, int flags) dumpfile.h). 
*/ static void -dump_gimple_eh_filter (pretty_printer *buffer, gimple gs, int spc, int flags) +dump_gimple_eh_filter (pretty_printer *buffer, gimple_eh_filter gs, int spc, +int flags) { if (flags & TDF_RAW) dump_gimple_fmt (buffer, spc, flags, "%G <%T, %+FAILURE <%S>%->", gs, @@ -2204,7 +2205,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple gs, int spc, int flags) break; case GIMPLE_EH_FILTER: - dump_gimple_eh_filter (buffer, gs, spc, flags); + dump_gimple_eh_filter (buffer, gs->as_a_gimple_eh_filter (), spc, flags); break; case GIMPLE_EH_MUST_NOT_THROW: diff --git a/gcc/gimple.c b/gcc/gimple.c index 4bc844b..42eef46 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -626,10 +626,11 @@ gimple_build_catch (tree types, gimple_seq handler) TYPES are the filter's types. FAILURE is the filter's failure action. */ -gimple +gimple_eh_filter gimple_build_eh_filter (tree types, gimple_seq failure) { - gimple p = gimple_alloc (GIMPLE_EH_FILTER, 0); + gimple_eh_filter p = +gimple_alloc (GIMPLE_EH_FILTER, 0)->as_a_gimple_eh_filter (); gimple_eh_filter_set_types (p, types); if (failure) gimple_eh_filter_set_failure (p, failure); diff --git a/gcc/gimple.h b/gcc/gimple.h index e12e066..38b257c 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -294,6 +294,12 @@ public: return as_a <gimple_statement_catch> (this); } + inline gimple_eh_filter + as_a_gimple_eh_filter () + { + return as_a <gimple_statement_eh_filter> (this); + } + inline gimple_phi as_a_gimple_phi () { @@ -1512,7 +1518,7 @@ gimple_asm gimple_build_asm_vec (const char *, vec<tree, va_gc> *, vec<tree, va_gc> *, vec<tree, va_gc> *, vec<tree, va_gc> *); gimple_catch gimple_build_catch (tree, gimple_seq); -gimple gimple_build_eh_filter (tree, gimple_seq); +gimple_eh_filter gimple_build_eh_filter (tree, gimple_seq); gimple gimple_build_eh_must_not_throw (tree); gimple gimple_build_eh_else (gimple_seq, gimple_seq); gimple_statement_try *gimple_build_try (gimple_seq, gimple_seq, -- 1.8.5.3 signature.asc Description: Digital signature
Re: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9
Ping for maintainer please. Thanks, Daniel. On Tue, Apr 15, 2014 at 7:05 PM, Daniel Gutson daniel.gut...@tallertechnologies.com wrote: On Tue, Apr 15, 2014 at 6:12 PM, Richard Sandiford rdsandif...@googlemail.com wrote: cc:ing Jason, who's the C++ maintainer. FWIW: I created http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60850 Daniel Gutson daniel.gut...@tallertechnologies.com writes: ping for maintainer. Could this be considered for 4.8.3 please? Thanks, Daniel. On Tue, Apr 1, 2014 at 2:46 PM, Daniel Gutson daniel.gut...@tallertechnologies.com wrote: I just realized I posted the patch in the wrong list. -- Forwarded message -- From: Daniel Gutson daniel.gut...@tallertechnologies.com Date: Tue, Apr 1, 2014 at 10:43 AM Subject: [PATCH] pedantic warning behavior when casting void* to ptr-to-func, 4.8 and 4.9 To: gcc Mailing List g...@gcc.gnu.org Hi, I observed two different behaviors in gcc 4.8.2 and 4.9 regarding the same issue, IMO both erroneous. Regarding 4.8.2, #pragma GCC diagnostic ignored "-pedantic" doesn't work in cases such as: void* p = 0; #pragma GCC diagnostic ignored "-pedantic" F* f2 = reinterpret_cast<F*>(p); (see testcase in the patch). The attached patch attempts to fix this issue. Since I no longer have write access, please apply this for me if correct (is the 4.8 branch still alive for adding fixes?). Regarding 4.9, gcc fails to complain at all when -pedantic is passed, even specifying -std=c++03. Please let me know if this is truly a bug, in which case I could also fix it for the latest version as well (if so, please let me know if I should look into trunk or any other branch). Thanks, Daniel. 2014-03-31 Daniel Gutson daniel.gut...@tallertechnologies.com gcc/cp/ * typeck.c (build_reinterpret_cast_1): Pass proper argument to warn() in pedantic. gcc/testsuite/g++.dg/ * diagnostic/pedantic.C: New test case.
--- gcc-4.8.2-orig/gcc/cp/typeck.c 2014-03-31 22:29:42.736367936 -0300 +++ gcc-4.8.2/gcc/cp/typeck.c 2014-03-31 14:26:43.536747050 -0300 @@ -6639,7 +6639,7 @@ where possible, and it is necessary in some cases. DR 195 addresses this issue, but as of 2004/10/26 is still in drafting. */ - warning (0, "ISO C++ forbids casting between pointer-to-function and pointer-to-object"); + warning (OPT_Wpedantic, "ISO C++ forbids casting between pointer-to-function and pointer-to-object"); return fold_if_not_in_template (build_nop (type, expr)); } else if (TREE_CODE (type) == VECTOR_TYPE) --- gcc-4.8.2-orig/gcc/testsuite/g++.dg/diagnostic/pedantic.C 1969-12-31 21:00:00.0 -0300 +++ gcc-4.8.2/gcc/testsuite/g++.dg/diagnostic/pedantic.C 2014-03-31 17:24:42.532607344 -0300 @@ -0,0 +1,12 @@ +// { dg-do compile } +// { dg-options "-pedantic" } +typedef void F(void); + +void foo() +{ + void* p = 0; + F* f1 = reinterpret_cast<F*>(p); // { dg-warning "ISO" } +#pragma GCC diagnostic ignored "-pedantic" + F* f2 = reinterpret_cast<F*>(p); +} + -- Daniel F. Gutson Chief Engineering Officer, SPD San Lorenzo 47, 3rd Floor, Office 5 Córdoba, Argentina Phone: +54 351 4217888 / +54 351 4218211 Skype: dgutson
Re: Remove obsolete Solaris 9 support
Eric Botcazou ebotca...@adacore.com writes: But for the Solaris 9 stuff, it's crystal clear that this cannot occur on Solaris 10 and up (no single-threaded case anymore since libthread.so.1 has been folded into libc.so.1). Ok to remove this part? OK for the Solaris 9 - single-threaded part. Thanks. I've updated the comments on the Solaris 9 multi-threaded sections with my findings so they aren't forgotten. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Remove obsolete Solaris 9 support
Uros Bizjak ubiz...@gmail.com writes: On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Now that 4.9 has branched, it's time to actually remove the obsolete Solaris 9 configuration. Most of this is just legwork and falls under my Solaris maintainership. A couple of questions, though: * Uros: I'm removing all sse_os_support() checks from the testsuite. Solaris 9 was the only consumer, so it seems best to do away with it. This is OK, but please leave sse-os-check.h (and corresponding sse_os_support calls) in the testsuite. Just remove the Solaris 9 specific code from sse-os-check.h and always return 1, perhaps with the comment that all currently supported OSes support SSE instructions. Here's the final patch I've checked in, incorporating all review comments. I've left out the libgo (already checked in by Ian) and classpath parts. Rainer 2014-01-06 Rainer Orth r...@cebitec.uni-bielefeld.de libstdc++-v3: * configure.host: Remove solaris2.9 handling. Change os_include_dir to os/solaris/solaris2.10. * acinclude.m4 (ac_has_gthreads): Remove solaris2.9* handling. * crossconfig.m4: Remove *-solaris2.9 handling, simplify. * configure: Regenerate. * config/abi/post/solaris2.9: Remove. * config/os/solaris/solaris2.9: Rename to ... * config/os/solaris/solaris2.10: ... this. * config/os/solaris/solaris2.10/os_defines.h (CLOCK_MONOTONIC): Remove. * doc/xml/manual/configure.xml (--enable-libstdcxx-threads): Remove Solaris 9 reference. * doc/html/manual/configure.html: Regenerate. * testsuite/27_io/basic_istream/extractors_arithmetic/char/12.cc: Remove *-*-solaris2.9 xfail. * testsuite/27_io/basic_istream/extractors_arithmetic/wchar_t/12.cc: Likewise. * testsuite/ext/enc_filebuf/char/13598.cc: Remove *-*-solaris2.9 xfail. libjava: * configure.ac (THREADLIBS, THREADSPEC): Remove *-*-solaris2.9 handling. * configure: Regenerate. libgfortran: * config/fpu-387.h [__sun__ __svr4__]: Remove SSE execution check. 
libgcc: * config/i386/crtfastmath.c (set_fast_math): Remove SSE execution check. * config/i386/sol2-unwind.h (x86_fallback_frame_state): Remove Solaris 9 single-threaded support. * config/sparc/sol2-unwind.h (sparc64_is_sighandler): Remove Solaris 9 single-threaded support. Add call_user_handler code sequences. (sparc_is_sighandler): Likewise. libcpp: * lex.c: Remove Solaris 9 reference. gcc/testsuite: * gcc.c-torture/compile/pr28865.c: Remove dg-xfail-if. * gcc.dg/c99-stdint-6.c: Remove dg-options for *-*-solaris2.9. * gcc.dg/lto/20090210_0.c: Remove dg-extra-ld-options for *-*-solaris2.9. * gcc.dg/torture/pr47917.c: Remove dg-options for *-*-solaris2.9. * gcc.target/i386/pr22076.c: Remove i?86-*-solaris2.9 handling from dg-options. * gcc.target/i386/pr22152.c: Remove i?86-*-solaris2.9 handling from dg-additional-options. * gcc.target/i386/vect8-ret.c: Remove i?86-*-solaris2.9 handling from dg-options. * gcc.dg/vect/tree-vect.h (check_vect): Remove Solaris 9 SSE2 execution check. * gcc.target/i386/sse-os-support.h [__sun__ __svr4__] (sigill_hdlr): Remove. (sse_os_support) [__sun__ __svr4__]: Remove SSE execution check. * gfortran.dg/erf_3.F90: Remove sparc*-*-solaris2.9* handling. * gfortran.dg/fmt_en.f90: Remove i?86-*-solaris2.9* handling. * gfortran.dg/round_4.f90: Remove *-*-solaris2.9* handling. * lib/target-supports.exp (add_options_for_tls): Remove *-*-solaris2.9* handling. gcc: * config.gcc (enable_obsolete): Remove *-*-solaris2.9*. (*-*-solaris2.[0-9] | *-*-solaris2.[0-9].*): Mark unsupported. (*-*-solaris2*): Simplify. (i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*): Likewise. (i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*): Remove *-*-solaris2.9* handling. * configure.ac (gcc_cv_as_hidden): Remove test for Solaris 9/x86 as bug. (gcc_cv_ld_hidden): Remove *-*-solaris2.9* handling. (ld_tls_support): Remove i?86-*-solaris2.9, sparc*-*-solaris2.9 handling, simplify. (gcc_cv_as_gstabs_flag): Remove workaround for Solaris 9/x86 as bug. 
* configure: Regenerate. * config/i386/sol2-9.h: Remove. * doc/install.texi (Specific, i?86-*-solaris2.9): Remove. (Specific, *-*-solaris2*): Mention Solaris 9 support removal. Remove Solaris 9 references. fixincludes: * inclhack.def (math_exception): Bypass on *-*-solaris2.1[0-9]*. (solaris_int_types): Remove.
Re: Remove obsolete Solaris 9 support
Bruce Korb bk...@gnu.org writes: On 04/16/14 04:16, Rainer Orth wrote: I've already verified that trunk fails to build on sparc-sun-solaris2.9 and i386-pc-solaris2.9. Bootstraps on {i386,sparc}-*-solaris2.{10,11} (and x86_64-unknown-linux-gnu for good measure) are in progress. I'll verify that there are no unexpected fixincludes changes and differences in gcc configure results. fixincludes: * inclhack.def (math_exception): Bypass on *-*-solaris2.1[0-9]*. (solaris_int_types): Remove. (solaris_longjmp_noreturn): Remove. (solaris_mutex_init_2): Remove. (solaris_once_init_2): Remove. (solaris_sys_va_list): Remove. * fixincl.x: Regenerate. * tests/base/iso/setjmp_iso.h: Remove. * tests/base/pthread.h [SOLARIS_MUTEX_INIT_2_CHECK]: Remove. [SOLARIS_ONCE_INIT_2_CHECK]: Remove. * tests/base/sys/int_types.h: Remove. * tests/base/sys/va_list.h: Remove. Removing dinkleberry fixes by the platform maintainer always has my approval. :) Thanks. I've had to update one of the testcases which was modified by two different fixes. High time to integrate fixincludes make check with the testsuite so this isn't so easily overlooked... Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Remove obsolete Solaris 9 support
Rainer Orth r...@cebitec.uni-bielefeld.de writes: Now that 4.9 has branched, it's time to actually remove the obsolete Solaris 9 configuration. Most of this is just legwork and falls under my Solaris maintainership. A couple of questions, though: * David: In target-supports.exp (add_options_for_tls), the comment needs to be updated with Solaris 9 support gone. Is it completely accurate for AIX, even wrt. __tls_get_addr/___tls_get_addr? David, could you please review this comment for correctness on AIX? Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Remove obsolete Solaris 9 support
Andrew Haley a...@redhat.com writes: On 04/16/2014 12:16 PM, Rainer Orth wrote: * I'm removing the sys/loadavg.h check from classpath. Again, I'm uncertain if this is desirable. In the past, classpath changes were merged upstream by one of the libjava maintainers. We should not diverge from GNU Classpath unless there is a strong reason to do so. I never meant to suggest that. With Solaris 9 support gone from from gcc, the only consumer of this code fragment is gone, and this seems a good opportunity to get rid of this obsolete code. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Try to coalesce for unary and binary ops
Hi, On Fri, 18 Apr 2014, Steven Bosscher wrote: IMHO TER should be improved to *do* disturb the order of the incoming instructions, to reduce register pressure. The latter is the goal, yes. But TER isn't really the right place for that (it's constrained by too many invariants, e.g. running after coalescing and caring only for single-use SSA names). So the alternative would be to have something which is designed for nothing else than reducing register pressure on gimple, and making TER not destroy such a schedule. Ciao, Michael.
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/22/2014 04:03 AM, Richard Sandiford wrote: First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) But... David Malcolm dmalc...@redhat.com writes: In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h). I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a <gimple_statement_bind> (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags); break; where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at least): "gs as a gimple switch", as opposed to: "as a gimple switch... gs", which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of having the checked cast as a *method* is that it implicitly documents the requirement that the input must be non-NULL. ...FWIW I really don't like these cast members. The counterarguments are: - as_a <...> (...) and dyn_cast <...> (...) follow the C++ syntax for other casts. - the type you get is obvious, rather than being a contraction of the type name. - having them as methods means that the base class needs to be aware of all subclasses. I realise that's probably inherently true of gimple due to the enum, but it seems like bad design. You could potentially have different subclasses for the same enum, selected by a secondary field.
I'm not particularly fond of this aspect as well... I fear that someday down the road we would regret this decision, and end up changing it all back to is_a and friends. These kinds of sweeping changes we ought to try very hard to make sure we only have to do once. If this is purely for verbosity, I think we can find better ways to reduce it... Is there any other reason? Maybe I've just been reading C code too long, but "as a gimple switch...gs" doesn't seem any less natural than "is constant...address". Another way of reducing the verbosity of as_a would be to shorten the type names. E.g. gimple_statement contracts to gimple_stmt in some places, so gimple_statement_bind could become gimple_stmt_bind or just gimple_bind. gimple_bind is probably better since it matches the names of the accessors. If the thing after as_a_ matches the type name, the X->as_a_foo () takes the same number of characters as as_a <foo> (X). I was running into similar issues with the gimple re-arch work... One thing I was going to bring up at some point was the possibility of some renaming of types. In the context of these gimple statements, I would propose that we drop the gimple_ prefix completely, and end up with maybe something more concise like bind_stmt, switch_stmt, assign_stmt, etc. There will be places in the code where we have used something like gimple switch_stmt = blah(); so those variables would have to be renamed... and probably other dribble effects... but it would make the code look cleaner. And then as_a, is_a and dyn_cast wouldn't look so bad. I see the gimple part of the name as being redundant. If we're really concerned about it, put the whole thing inside a namespace, say 'Gimple', and then all the gimple source files can simply start with using namespace Gimple; and then use 'bind_stmt' throughout. Non-gimple source files could then refer to it directly as Gimple::bind_stmt... This would tie in well with what I'm planning to propose for gimple types and values.
Of course, it would be ideal if we could use 'gimple' as the namespace, but that is currently taken by the gimple statement type... I'd even go so far as to propose that 'gimple' should be renamed 'gimple::stmt'.. but that is much more work :-) Andrew
[COMMITTED][AArch64] Fixup indentation.
Hi, I just committed the attached to fix up some indentation in aarch64.c /Marcuscommit 9484cf884a28c18e310b31fcc283f75ade42e93d Author: Marcus Shawcroft marcus.shawcr...@arm.com Date: Tue Apr 22 13:27:39 2014 +0100 [AArch64] Fix indentation. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index e882ff8..b1bbe95 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-04-22 Marcus Shawcroft marcus.shawcr...@arm.com + + * config/aarch64/aarch64.c (aarch64_initial_elimination_offset): + Fix indentation. + 2014-04-22 Jakub Jelinek ja...@redhat.com PR tree-optimization/60823 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 7b6c2b3..a2b9cc2 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -4147,32 +4147,32 @@ aarch64_initial_elimination_offset (unsigned from, unsigned to) + crtl-outgoing_args_size + cfun-machine-saved_varargs_size); - frame_size = AARCH64_ROUND_UP (frame_size, STACK_BOUNDARY / BITS_PER_UNIT); - offset = frame_size; + frame_size = AARCH64_ROUND_UP (frame_size, STACK_BOUNDARY / BITS_PER_UNIT); + offset = frame_size; - if (to == HARD_FRAME_POINTER_REGNUM) - { - if (from == ARG_POINTER_REGNUM) -return offset - crtl-outgoing_args_size; + if (to == HARD_FRAME_POINTER_REGNUM) +{ + if (from == ARG_POINTER_REGNUM) + return offset - crtl-outgoing_args_size; - if (from == FRAME_POINTER_REGNUM) -return cfun-machine-frame.saved_regs_size + get_frame_size (); - } + if (from == FRAME_POINTER_REGNUM) + return cfun-machine-frame.saved_regs_size + get_frame_size (); +} - if (to == STACK_POINTER_REGNUM) - { - if (from == FRAME_POINTER_REGNUM) - { - HOST_WIDE_INT elim = crtl-outgoing_args_size - + cfun-machine-frame.saved_regs_size - + get_frame_size () - - cfun-machine-frame.fp_lr_offset; - elim = AARCH64_ROUND_UP (elim, STACK_BOUNDARY / BITS_PER_UNIT); - return elim; - } - } + if (to == STACK_POINTER_REGNUM) +{ + if (from == FRAME_POINTER_REGNUM) + { + HOST_WIDE_INT elim = crtl-outgoing_args_size + + 
cfun->machine->frame.saved_regs_size + + get_frame_size () + - cfun->machine->frame.fp_lr_offset; + elim = AARCH64_ROUND_UP (elim, STACK_BOUNDARY / BITS_PER_UNIT); + return elim; + } +} - return offset; + return offset; }
[PATCH 1/7] [AARCH64] Fix bug in aarch64_initial_elimination_offset
We are subtracting an extra cfun->machine->frame.fp_lr_offset, which is wrong, but the AARCH64_ROUND_UP below happens to compensate for that, thus hiding this bug. The offset from FRAME_POINTER_REGNUM to STACK_POINTER_REGNUM is exactly the sum of outgoing argument size, callee-saved reg size and local variable size. Just as commented in the aarch64 target file:

	+----------------------------+  <-- arg_pointer_rtx
	|                            |
	| callee-allocated save area |
	| for register varargs       |
	|                            |
	+----------------------------+  <-- frame_pointer_rtx (A)
	|                            |
	| local variables            |
	|                            |
	+----------------------------+
	| padding0                   | \
	+----------------------------+  |
	|                            |  |
	|                            |  |
	| callee-saved registers     |  | frame.saved_regs_size
	|                            |  |
	+----------------------------+  |
	| LR'                        |  |
	+----------------------------+  |
	| FP'                        | /
     P	+----------------------------+  <-- hard_frame_pointer_rtx
	| dynamic allocation         |
	+----------------------------+
	|                            |
	| outgoing stack arguments   |
	|                            |
	+----------------------------+  <-- stack_pointer_rtx (B)

When alloca is invoked, a frame pointer is always required, so there is no elimination from FRAME_POINTER_REGNUM to STACK_POINTER_REGNUM. OK for stage 1? 2014-04-22 Renlin renlin...@arm.com 2014-04-22 Jiong Wang jiong.w...@arm.com gcc/ * config/aarch64/aarch64.h (aarch64_frame): Delete fp_lr_offset. * config/aarch64/aarch64.c (aarch64_layout_frame): Likewise. * config/aarch64/aarch64.c (aarch64_initial_elimination_offset): Likewise. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index ebd58c0..ee49bbe 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1719,8 +1719,6 @@ aarch64_layout_frame (void) if (reload_completed && cfun->machine->frame.laid_out) return; - cfun->machine->frame.fp_lr_offset = 0; - /* First mark all the registers that really need to be saved...
*/ for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++) cfun->machine->frame.reg_offset[regno] = -1; @@ -1770,14 +1768,12 @@ aarch64_layout_frame (void) { cfun->machine->frame.reg_offset[R29_REGNUM] = offset; offset += UNITS_PER_WORD; - cfun->machine->frame.fp_lr_offset = UNITS_PER_WORD; } if (cfun->machine->frame.reg_offset[R30_REGNUM] != -1) { cfun->machine->frame.reg_offset[R30_REGNUM] = offset; offset += UNITS_PER_WORD; - cfun->machine->frame.fp_lr_offset += UNITS_PER_WORD; } cfun->machine->frame.padding0 = @@ -4183,8 +4179,7 @@ aarch64_initial_elimination_offset (unsigned from, unsigned to) { HOST_WIDE_INT elim = crtl->outgoing_args_size + cfun->machine->frame.saved_regs_size - + get_frame_size () - - cfun->machine->frame.fp_lr_offset; + + get_frame_size (); elim = AARCH64_ROUND_UP (elim, STACK_BOUNDARY / BITS_PER_UNIT); return elim; } diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 1f71ee5..aa38f9f 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -520,7 +520,6 @@ struct GTY (()) aarch64_frame been saved. */ HOST_WIDE_INT padding0; HOST_WIDE_INT hardfp_offset; /* HARD_FRAME_POINTER_REGNUM */ - HOST_WIDE_INT fp_lr_offset; /* Space needed for saving fp and/or lr */ bool laid_out; };
Re: Remove obsolete Solaris 9 support
On Sat, 2014-04-19 at 09:03 +0100, Andrew Haley wrote: On 04/16/2014 12:16 PM, Rainer Orth wrote: * I'm removing the sys/loadavg.h check from classpath. Again, I'm uncertain if this is desirable. In the past, classpath changes were merged upstream by one of the libjava maintainers. We should not diverge from GNU Classpath unless there is a strong reason to do so. I think the configure check is mostly harmless, but wouldn't be opposed removing it. It really seems to have been added explicitly for Solaris 9, which is probably really dead by now. Andrew Hughes, you added it back in 2008. Are you still using/building on any Solaris 9 setups? Cheers, Mark
Re: PATCH: PR target/60868: [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -minline-all-stringops -minline-stringops-dynamically -march=core2
On Thu, Apr 17, 2014 at 10:07:29AM -0700, H.J. Lu wrote: Hi, GET_MODE returns VOIDmode on CONST_INT. It happens with -O0. This patch uses counter_mode on count_exp to get mode. Tested on Linux/x86-64 without regressions. OK for trunk and 4.9 branch? Ok, thanks. 2014-04-17 H.J. Lu hongjiu...@intel.com PR target/60868 * config/i386/i386.c (ix86_expand_set_or_movmem): Call counter_mode on count_exp to get mode. gcc/testsuite/ 2014-04-17 H.J. Lu hongjiu...@intel.com PR target/60868 * gcc.target/i386/pr60868.c: New testcase. Jakub
Re: [Patch, AArch64] Fix shuffle for big-endian.
Alan Lawrence wrote: Sorry to be pedantic again, but 'wierd' should be spelt 'weird'. Otherwise, looks good to me and much neater than before. (Seems you'd rather keep the re-enabling, here and in the testsuite, for another patch?) Hi, Yes, the re-enabling is another patch. Thanks for the typo correction. OK for trunk with that change? Thanks, Tejas. --Alan Tejas Belagod wrote: Richard Henderson wrote: On 02/21/2014 08:30 AM, Tejas Belagod wrote: + /* If two vectors, we end up with a wierd mixed-endian mode on NEON. */ + if (BYTES_BIG_ENDIAN) + { + if (!d->one_vector_p && d->perm[i] >= nunits) + { + /* Extract the offset. */ + elt = d->perm[i] & (nunits - 1); + /* Reverse the top half. */ + elt = nunits - 1 - elt; + /* Offset it by the bottom half. */ + elt += nunits; + } + else + elt = nunits - 1 - d->perm[i]; + } Isn't this just elt = d->perm[i] ^ (nunits - 1); all the time? I.e. invert the index within the word, but leave the word index (nunits) unchanged. Here is a revised patch. OK for stage-1? Thanks Tejas. 2014-04-02 Tejas Belagod tejas.bela...@yahoo.com gcc/ * config/aarch64/aarch64.c (aarch64_evpc_tbl): Reverse order of elements for big-endian.
Re: Remove obsolete Solaris 9 support
On Tue, Apr 22, 2014 at 8:39 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Rainer Orth r...@cebitec.uni-bielefeld.de writes: Now that 4.9 has branched, it's time to actually remove the obsolete Solaris 9 configuration. Most of this is just legwork and falls under my Solaris maintainership. A couple of questions, though: * David: In target-supports.exp (add_options_for_tls), the comment needs to be updated with Solaris 9 support gone. Is it completely accurate for AIX, even wrt. __tls_get_addr/___tls_get_addr? David, could you please review this comment for correctness on AIX? AIX TLS needs -pthread command line option. Thanks, David
Re: version typeinfo for 128bit types
On 14 April 2014 11:28, Jonathan Wakely wrote: On 14 April 2014 10:39, Marc Glisse wrote: PR libstdc++/43622 * config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols. * config/abi/pre/gnu-versioned-namespace.ver: New symbols. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise. I'd like to wait until 4.9.0 has been released before adding new versions to the library, to avoid unnecessary differences between trunk and the 4.9.0 release candidate(s). I was about to say the patch is OK for trunk now, but noticed that the new CXXABI_1.3.9 block is added after the CXXABI_TM one, not after CXXABI_1.3.8 ... was that intentional?
Re: version typeinfo for 128bit types
On Tue, 22 Apr 2014, Jonathan Wakely wrote: On 14 April 2014 11:28, Jonathan Wakely wrote: On 14 April 2014 10:39, Marc Glisse wrote: PR libstdc++/43622 * config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols. * config/abi/pre/gnu-versioned-namespace.ver: New symbols. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise. I'd like to wait until 4.9.0 has been released before adding new versions to the library, to avoid unnecessary differences between trunk and the 4.9.0 release candidate(s). I was about to say the patch is OK for trunk now, but noticed that the new CXXABI_1.3.9 block is added after the CXXABI_TM one, not after CXXABI_1.3.8 ... was that intentional? I don't have any opinion, I'll move it just after 1.3.8 if that seems more logical to you. -- Marc Glisse
Re: Remove obsolete Solaris 9 support
David Edelsohn dje@gmail.com writes: On Tue, Apr 22, 2014 at 8:39 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Rainer Orth r...@cebitec.uni-bielefeld.de writes: Now that 4.9 has branched, it's time to actually remove the obsolete Solaris 9 configuration. Most of this is just legwork and falls under my Solaris maintainership. A couple of questions, though: * David: In target-supports.exp (add_options_for_tls), the comment needs to be updated with Solaris 9 support gone. Is it completely accurate for AIX, even wrt. __tls_get_addr/___tls_get_addr? David, could you please review this comment for correctness on AIX? AIX TLS needs -pthread command line option. Understood, but is the reason given in that comment (__tls_get_addr in libthread) correct? Seems like a Solaris 9 implementation detail to me. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Remove obsolete Solaris 9 support
On Tue, Apr 22, 2014 at 9:58 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: David, could you please review this comment for correctness on AIX? AIX TLS needs -pthread command line option. Understood, but is the reason given in that comment (__tls_get_addr in libthread) correct? Seems like a Solaris 9 implementation detail to me. It's not a Solaris 9 implementation detail. That is why I wrote Same for AIX. in the comment. Thanks, David
Re: [PATCH 1/7] [AARCH64] Fix bug in aarch64_initial_elimination_offset
On 22 April 2014 14:23, Jiong Wang jiong.w...@arm.com wrote: 2014-04-22 Renlin renlin...@arm.com 2014-04-22 Jiong Wang jiong.w...@arm.com gcc/ * config/aarch64/aarch64.h (aarch64_frame): Delete fp_lr_offset. * config/aarch64/aarch64.c (aarch64_layout_frame): Likewise. * config/aarch64/aarch64.c (aarch64_initial_elimination_offset): Likewise. OK /Marcus
Re: [Patch, AArch64] Fix shuffle for big-endian.
On 22 April 2014 14:40, Tejas Belagod tbela...@arm.com wrote: Alan Lawrence wrote: Sorry to be pedantic again, but 'wierd' should be spelt 'weird'. Otherwise, looks good to me and much neater than before. (Seems you'd rather keep the re-enabling, here and in the testsuite, for another patch?) Hi, Yes, the re-enabling is another patch. Thanks for the typo correction. OK for trunk with that change? Yes that is OK. /Marcus
Re: version typeinfo for 128bit types
On 22 April 2014 14:55, Marc Glisse wrote: On Tue, 22 Apr 2014, Jonathan Wakely wrote: On 14 April 2014 11:28, Jonathan Wakely wrote: On 14 April 2014 10:39, Marc Glisse wrote: PR libstdc++/43622 * config/abi/pre/gnu.ver (CXXABI_1.3.9): New version, new symbols. * config/abi/pre/gnu-versioned-namespace.ver: New symbols. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Likewise. I'd like to wait until 4.9.0 has been released before adding new versions to the library, to avoid unnecessary differences between trunk and the 4.9.0 release candidate(s). I was about to say the patch is OK for trunk now, but noticed that the new CXXABI_1.3.9 block is added after the CXXABI_TM one, not after CXXABI_1.3.8 ... was that intentional? I don't have any opinion, I'll move it just after 1.3.8 if that seems more logical to you. Yes, I think so. The library changes are OK with that tweak - thanks!
Re: Remove obsolete Solaris 9 support
David Edelsohn dje@gmail.com writes: On Tue, Apr 22, 2014 at 9:58 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: David, could you please review this comment for correctness on AIX? AIX TLS needs -pthread command line option. Understood, but is the reason given in that comment (__tls_get_addr in libthread) correct? Seems like a Solaris 9 implementation detail to me. It's not a Solaris 9 implementation detail. That is why I wrote Same for AIX. in the comment. Ok, thanks for the confirmation. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH 02/89] Introduce gimple_switch and use it in various places
On Tue, Apr 22, 2014 at 12:45 AM, Trevor Saunders tsaund...@mozilla.com wrote: --- a/gcc/tree-loop-distribution.c +++ b/gcc/tree-loop-distribution.c @@ -687,8 +687,9 @@ generate_loops_for_partition (struct loop *loop, partition_t partition, } else if (gimple_code (stmt) == GIMPLE_SWITCH) { + gimple_switch switch_stmt = stmt->as_a_gimple_switch (); maybe it would make more sense to do else if (gimple_switch switch_stmt = stmt->dyn_cast_gimple_switch ()) ? _please_ use is_a, as_a etc. from is-a.h instead of member functions. Richard. Trev
[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch
Hi, we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to revision 209611 as r209634 (to keep a track of the 4.9.0 release) and up to revision 209633 as r209635. This will be part of our 2014.04 release. Thanks, Yvan
[PATCH, SH] Extend HIQI mode constants
This patch allows constant propagation from HIQI modes, as illustrated by the attached testcase, by converting them into a new SImode pseudo. It also merges the HIQI mode patterns using general_movdst_operand for both. No regression on sh-none-elf. OK for trunk? Thanks, 2014-04-22 Christian Bruel christian.br...@st.com * config/sh/sh.md (mov<mode>): Replace movQIHI. Force immediates to SImode. 2014-04-22 Christian Bruel christian.br...@st.com * gcc.target/sh/hiconst.c: New test. Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 209556) +++ gcc/config/sh/sh.md (working copy) @@ -6978,20 +6978,20 @@ label: [(set_attr type sfunc) (set_attr needs_delay_slot yes)]) -(define_expand movhi - [(set (match_operand:HI 0 general_movdst_operand ) - (match_operand:HI 1 general_movsrc_operand ))] +(define_expand mov<mode> + [(set (match_operand:QIHI 0 general_movdst_operand) + (match_operand:QIHI 1 general_movsrc_operand))] { - prepare_move_operands (operands, HImode); -}) + if (can_create_pseudo_p () && CONST_INT_P (operands[1]) + && REG_P (operands[0]) && REGNO (operands[0]) != R0_REG) +{ +rtx reg = gen_reg_rtx(SImode); +emit_move_insn (reg, operands[1]); +operands[1] = gen_lowpart (<MODE>mode, reg); +} -(define_expand movqi - [(set (match_operand:QI 0 general_operand ) - (match_operand:QI 1 general_operand ))] - -{ - prepare_move_operands (operands, QImode); + prepare_move_operands (operands, <MODE>mode); }) ;; Specifying the displacement addressing load / store patterns separately Index: gcc/testsuite/gcc.target/sh/hiconst.c === --- gcc/testsuite/gcc.target/sh/hiconst.c (revision 0) +++ gcc/testsuite/gcc.target/sh/hiconst.c (working copy) @@ -0,0 +1,22 @@ +/* { dg-do compile { target sh*-*-* } } */ +/* { dg-options "-O1" } */ + +char a; +int b; + +foo(char *pt, int *pti) +{ + a = 0; + b = 0; + *pt = 0; + *pti = 0; +} + +rab(char *pt, int *pti) +{ + pt[2] = 0; + pti[3] = 0; +} + +/* { dg-final { scan-assembler-times "mov\t#0" 2 } } */ +
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that... Thanks, Kyrill On 15/04/14 12:03, Kyrill Tkachov wrote: Hi all (and wide-int maintainers in particular), I tried bootstrapping the wide-int branch on arm-none-linux-gnueabihf and encountered some syntax errors while building wide-int.h and wide-int.cc in expressions that tried to cast to HOST_WIDE_INT. This patch fixes those errors. Also, in c-ada-spec.c I think we intended to use the HOST_WIDE_INT_PRINT format rather than HOST_LONG_FORMAT, since on arm-linux HOST_WIDE_INT is a 'long long'. The attached patch allowed the build to proceed for me, but in stage 2 I encountered an ICE: $TOP/gcc/dwarf2out.c: In function 'long unsigned int _ZL11size_of_dieP10die_struct.isra.209(vec<dw_attr_struct, va_gc>**, long unsigned int)': $TOP/gcc/dwarf2out.c:7820:1: internal compiler error: in set_value_range, at tree-vrp.c:452 size_of_die (dw_die_ref die) ^ 0xa825c1 set_value_range $TOP/gcc/tree-vrp.c:452 0xa8a441 extract_range_basic $TOP/gcc/tree-vrp.c:3679 0xa92c13 vrp_visit_assignment_or_call $TOP/gcc/tree-vrp.c:6725 0xa947eb vrp_visit_stmt $TOP/gcc/tree-vrp.c:7538 0x9d4d47 simulate_stmt $TOP/gcc/tree-ssa-propagate.c:329 0x9d5047 simulate_block $TOP/gcc/tree-ssa-propagate.c:452 0x9d5e23 ssa_propagate(ssa_prop_result (*)(gimple_statement_base*, edge_def**, tree_node**), ssa_prop_result (*)(gimple_statement_base*)) $TOP/gcc/tree-ssa-propagate.c:859 0xa9a1e1 execute_vrp $TOP/gcc/tree-vrp.c:9781 0xa9a4a3 execute $TOP/gcc/tree-vrp.c:9872 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. Any ideas?
The compiler was configured with: --enable-languages=c,c++,fortran --with-cpu=cortex-a15 --with-float=hard --with-mode=thumb Thanks, Kyrill gcc/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * wide-int.h (sign_mask): Fix syntax error. * wide-int.cc (wi::add_large): Likewise. (mul_internal): Likewise. (sub_large): Likewise. c-family/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * c-ada-spec.c (dump_generic_ada_node): Use HOST_WIDE_INT_PRINT instead of HOST_LONG_FORMAT.
Re: fuse-caller-save - hook format
On 17-04-14 18:49, Vladimir Makarov wrote: I see. I guess your proposed solution is ok then. Vladimir, Richard, I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE. There are 2 new hooks:
1. call_fusage_contains_non_callee_clobbers. A hook to indicate whether a target has added the non-callee call clobbers to CALL_INSN_FUNCTION_USAGE, meaning it's safe to do the fuse-caller-save optimization.
2. post_expand_call_insn. A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE.
I've bootstrapped and reg-tested on x86_64, and I've built and reg-tested on MIPS. The series now looks like:
1 -fuse-caller-save - Add command line option
2 -fuse-caller-save - Add new reg-note REG_CALL_DECL
3 Add implicit parameter to find_all_hard_reg_sets
4 Register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
5 Add call_fusage_contains_non_callee_clobbers hook
6 -fuse-caller-save - Collect register usage information
7 -fuse-caller-save - Use collected register usage information
8 -fuse-caller-save - Enable by default at O2 and higher
9 -fuse-caller-save - Add documentation
10 -fuse-caller-save - Add test-case
11 Add post_expand_call_insn hook
12 Add clobber_reg
13 -fuse-caller-save - Enable for MIPS
14 -fuse-caller-save - Enable for ARM
15 -fuse-caller-save - Enable for AArch64
16 -fuse-caller-save - Support in lra
17 -fuse-caller-save - Enable for i386
The submission/approval status is:
1-3, 7-10, 16: approved
4: submitted, pinged Eric Botcazou 16-04-2014
5, 11: new hook, need to submit
6, 14-15: approved earlier, but need to resubmit due to updated hook
12: new utility patch, need to submit
13: need to resubmit due to updated hook
17: need to submit
I'll post the patches that need (re)submitting. Thanks, - Tom
Re: Remove obsolete Solaris 9 support
- Original Message - On Sat, 2014-04-19 at 09:03 +0100, Andrew Haley wrote: On 04/16/2014 12:16 PM, Rainer Orth wrote: * I'm removing the sys/loadavg.h check from classpath. Again, I'm uncertain if this is desirable. In the past, classpath changes were merged upstream by one of the libjava maintainers. We should not diverge from GNU Classpath unless there is a strong reason to do so. I think the configure check is mostly harmless, but wouldn't be opposed removing it. It really seems to have been added explicitly for Solaris 9, which is probably really dead by now. Andrew Hughes, you added it back in 2008. Are you still using/building on any Solaris 9 setups? I vaguely remember adding it. I was building on the university's Solaris 9 machines at the time. They've long since replaced them with GNU/Linux machines and I've been at Red Hat for over five years, so those days are long gone :) I have some Freetype fixes to push to Classpath as well, so I'll fix this too and look at merging to gcj in the not-too-distant future. I think it's long overdue. Ideally, the change should be left out of this patch, so as to avoid conflicts. Cheers, Mark Thanks, -- Andrew :) Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: 248BDC07 (https://keys.indymedia.org/) Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: fuse-caller-save - hook format
Tom de Vries tom_devr...@mentor.com writes: 2. post_expand_call_insn. A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE. Why is this needed though? Like I say, I think targets should update CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander. Splitting the functionality of the call expanders across the define_expand and a new hook just makes things unnecessarily complicated IMO. Thanks, Richard
Re: [PATCH] Simplify a VEC_SELECT fed by its own inverse
Hi, Below is the revised patch addressing Richard's concerns about the assertions. Bootstrapped and tested on powerpc64[,le]-unknown-linux-gnu. Ok for trunk? Thanks, Bill [gcc] 2014-04-22 Bill Schmidt wschm...@linux.vnet.ibm.com * simplify-rtx.c (simplify_binary_operation_1): Optimize case of nested VEC_SELECTs that are inverses of each other. [gcc/testsuite] 2014-04-22 Bill Schmidt wschm...@linux.vnet.ibm.com * gcc.target/powerpc/vsxcopy.c: New test. Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c (revision 209516) +++ gcc/simplify-rtx.c (working copy) @@ -3673,6 +3673,31 @@ simplify_binary_operation_1 (enum rtx_code code, e } } + /* If we have two nested selects that are inverses of each +other, replace them with the source operand. */ + if (GET_CODE (trueop0) == VEC_SELECT + && GET_MODE (XEXP (trueop0, 0)) == mode) + { + rtx op0_subop1 = XEXP (trueop0, 1); + gcc_assert (GET_CODE (op0_subop1) == PARALLEL); + gcc_assert (XVECLEN (trueop1, 0) == GET_MODE_NUNITS (mode)); + + /* Apply the outer ordering vector to the inner one. (The inner +ordering vector is expressly permitted to be of a different +length than the outer one.) If the result is { 0, 1, ..., n-1 } +then the two VEC_SELECTs cancel. */ + for (int i = 0; i < XVECLEN (trueop1, 0); ++i) + { + rtx x = XVECEXP (trueop1, 0, i); + if (!CONST_INT_P (x)) + return 0; + rtx y = XVECEXP (op0_subop1, 0, INTVAL (x)); + if (!CONST_INT_P (y) || i != INTVAL (y)) + return 0; + } + return XEXP (trueop0, 0); + } + return 0; case VEC_CONCAT: { Index: gcc/testsuite/gcc.target/powerpc/vsxcopy.c === --- gcc/testsuite/gcc.target/powerpc/vsxcopy.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/vsxcopy.c (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile { target { powerpc64*-*-* } } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O1" } */ +/* { dg-final { scan-assembler "lxvd2x" } } */ +/* { dg-final { scan-assembler "stxvd2x" } } */ +/* { dg-final { scan-assembler-not "xxpermdi" } } */ + +typedef float vecf __attribute__ ((vector_size (16))); +extern vecf j, k; + +void fun (void) +{ + j = k; +} +
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
Kyrill Tkachov kyrylo.tkac...@arm.com writes: Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that... Sorry for the late reply. I hadn't forgotten, but I wanted to wait until I had chance to look into the ICE before replying, which I haven't had chance to do yet. There are still some problems on x86_64 that we need to track down. I'm also doing one last read-through of the patch and I optimistically hoped that that might find the problem you were hitting (unlikely though). I'm not a maintainer, but the patch: gcc/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * wide-int.h (sign_mask): Fix syntax error. * wide-int.cc (wi::add_large): Likewise. (mul_internal): Likewise. (sub_large): Likewise. c-family/ 2014-04-15 Kyrylo Tkachov kyrylo.tkac...@arm.com * c-ada-spec.c (dump_generic_ada_node): Use HOST_WIDE_INT_PRINT instead of HOST_LONG_FORMAT. looks good to me FWIW. "Use C-style casts" might be more descriptive than "Fix syntax error" though. It's a shame we can't use C++ style casts, but I suppose that's the price to pay for being able to write "unsigned HOST_WIDE_INT". Thanks, Richard
Re: [PATCH 71/89] Concretize gimple_cond_make_{false|true}
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 902b879..62ec9f5 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -9517,10 +9517,11 @@ fold_predicate_in (gimple_stmt_iterator *si) else { gcc_assert (gimple_code (stmt) == GIMPLE_COND); + gimple_cond cond_stmt = stmt->as_a_gimple_cond (); the assert isn't needed now right? Trev if (integer_zerop (val)) - gimple_cond_make_false (stmt); + gimple_cond_make_false (cond_stmt); else if (integer_onep (val)) - gimple_cond_make_true (stmt); + gimple_cond_make_true (cond_stmt); else gcc_unreachable (); } -- 1.8.5.3
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On Apr 15, 2014, at 4:03 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: I tried bootstrapping the wide-int branch on arm-none-linux-gnueabihf and encountered some syntax errors while building wide-int.h and wide-int.cc in expressions that tried to cast to HOST_WIDE_INT. Thanks, nice catch. Also, in c-ada-spec.c I think we intended to use the HOST_WIDE_INT_PRINT format rather than HOST_LONG_FORMAT, since on arm-linux HOST_WIDE_INT is a 'long long'. Yup. The attached patch Thanks. Committed revision 209639. allowed the build to proceed for me, but in stage 2 I encountered an ICE: Any ideas? Nope. Roughly, what it is saying is that the min or max of a call to set_value_range is now wrong. If you print those two out in the call, and check how they are computed, it might be obvious. I'll see if I can reproduce and track it down and fix it.
Re: [PATCH 00/89] Compile-time gimple-checking
On Tue, Apr 22, 2014 at 09:05:43AM -0400, Andrew MacLeod wrote: On 04/22/2014 04:03 AM, Richard Sandiford wrote: First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) But... David Malcolm dmalc...@redhat.com writes: In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h). I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a <gimple_statement_bind> (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags); break; where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at-least): gs as a gimple switch, as opposed to: as a gimple switch... gs, which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of having the checked cast as a *method* is that it implicitly documents the requirement that the input must be non-NULL. ...FWIW I really don't like these cast members. The counterarguments are: - as_a ... (...) and dyn_cast ... (...) follow the C++ syntax for other casts. - the type you get is obvious, rather than being a contraction of the type name. - having them as methods means that the base class needs to be aware of all subclasses. I realise that's probably inherently true of gimple due to the enum, but it seems like bad design. 
You could potentially have different subclasses for the same enum, selected by a secondary field. That seems kind of unlikely to me given that other things depend on the enum having an element for each sub class, but who knows. I'm not particularly fond of this aspect as well... I fear that someday down the road we would regret this decision, and end up changing it all back fwiw I don't really have an opinion either way. to is_a and friends These kinds of sweeping changes we ought to try very hard to make sure we only have to do it once. I'm not convinced having to change it would be *that* bad, I would expect most of the change could be done with sed. If this is purely for verbosity, I think we can find better ways to reduce it... Is there any other reason? Maybe I've just been reading C code too long, but as a gimple switch...gs doesn't seem any less natural than is constant...address. Another way of reducing the verbosity of as_a would be to shorten the type names. E.g. gimple_statement contracts to gimple_stmt in some places, so gimple_statement_bind could become gimple_stmt_bind or just gimple_bind. gimple_bind is probably better since it matches the names of the accessors. as well as the typedef gimple_bind being what's used all over, so it would seem to make sense to rename the class and drop the typedef. If the thing after as_a_ matches the type name, the X->as_a_foo () takes the same number of characters as as_a <foo> (X). I was running into similar issues with the gimple re-arch work... One thing I was going to bring up at some point was the possibility of some renaming of types. In the context of these gimple statements, I would propose that we drop the gimple_ prefix completely, and end up with maybe something more concise like bind_stmt, switch_stmt, assign_stmt, etc. That seems nice, but I'd worry about name conflicts with rtl or tree types or something. 
There will be places in the code where we have used something like gimple switch_stmt = blah(); We already have things like loop loop = get_loop (); so conflicts like this don't seem *that* terrible, and I guess we don't absolutely need to rename stuff. so those variables would have to be renamed... and probably other dribble effects... but it would make the code look cleaner. and then as_a, is_a and dyn_cast wouldn't look so bad. Well, it would be nice if the typename wasn't repeated twice, but I fear there's no way out of that without auto. I see the gimple part of the name as being redundant. If we're really I'd tend to agree except... concerned about it, put the whole thing inside a namespace, say 'Gimple' We'd need to beat gengtype into dealing with that, I'm not sure exactly how hard that is. and then all the gimple source files can simply start with using namespace Gimple; and then use
Re: [PATCH][RFC][wide-int] Fix some build errors on arm in wide-int branch and report ICE
On Apr 22, 2014, at 8:33 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Kyrill Tkachov kyrylo.tkac...@arm.com writes: Ping. http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00769.html Any ideas? I recall chatter on IRC that we want to merge wide-int into trunk soon. Bootstrap failure on arm would prevent that... Sorry for the late reply. I hadn't forgotten, but I wanted to wait until I had chance to look into the ICE before replying, which I haven't had chance to do yet. They are separable issues, so, I checked in the change. It's a shame we can't use C++ style casts, but I suppose that's the price to pay for being able to write "unsigned HOST_WIDE_INT". unsigned_HOST_WIDE_INT isn't horrible, but, yeah, my fingers were expecting a typedef or better. I slightly prefer the int (1) style, but I think we should go the direction of the patch.
Re: fuse-caller-save - hook format
On 22-04-14 17:27, Richard Sandiford wrote: Tom de Vries tom_devr...@mentor.com writes: 2. post_expand_call_insn. A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE. Why is this needed though? Like I say, I think targets should update CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander. Splitting the functionality of the call expanders across the define_expand and a new hook just makes things unnecessarily complicated IMO. Richard, It is not needed, but it is convenient. There are targets where the define_expands for calls use the rtl template. Having to add clobbers to the CALL_INSN_FUNCTION_USAGE for such a target means you cannot use the rtl template any more and instead need to generate all needed RTL insns in C code. This hook means that you can keep using the rtl template, which is less intrusive for those targets. Thanks, - Tom Thanks, Richard
[PING] [PATCH, AArch64] Use GCC builtins to count leading/trailing zeros
Ping~ OK for stage-1? The original patch was posted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00286.html and the glibc patch was approved here: http://sourceware.org/ml/libc-alpha/2014-01/msg00120.html Thanks, Yufeng On 01/07/14 16:34, Yufeng Zhang wrote: Hi, This patch is to sync up include/longlong.h with its glibc peer after the proposed change here: http://sourceware.org/ml/libc-alpha/2014-01/msg00114.html The patch defines a number of macros in stdlib/longlong.h to use GCC builtins __builtin_clz* to implement the __clz* and __ctz* functions on AArch64. OK for the mainline? Thanks, Yufeng include/ * longlong.h (count_leading_zeros, count_trailing_zeros) (COUNT_LEADING_ZEROS_0): Define for aarch64.
Re: fuse-caller-save - hook format
Tom de Vries tom_devr...@mentor.com writes: On 22-04-14 17:27, Richard Sandiford wrote: Tom de Vries tom_devr...@mentor.com writes: 2. post_expand_call_insn. A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE. Why is this needed though? Like I say, I think targets should update CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander. Splitting the functionality of the call expanders across the define_expand and a new hook just makes things unnecessarily complicated IMO. Richard, It is not needed, but it is convenient. There are targets where the define_expands for calls use the rtl template. Having to add clobbers to the CALL_INSN_FUNCTION_USAGE for such a target means you cannot use the rtl template any more and instead need to generate all needed RTL insns in C code. This hook means that you can keep using the rtl template, which is less intrusive for those targets. But if the target is simple enough to use a single call pattern for call cases, wouldn't it be possible to add the clobber directly to the call pattern? Which target do you have in mind? Thanks, Richard
Re: [RFC] Add aarch64 support for ada
Yes, this bootstrapped. OK, I have installed a variant of the patch (it should not change anything). Thanks for working on this. -- Eric Botcazou
[committed] Fix ICE on invalid combined distribute parallel for (PR c/59073)
Hi! I've committed this fix for invalid #pragma omp distribute parallel for. If the #pragma omp for parsing returns NULL, i.e. it is invalid and the stmt hasn't been added, then we shouldn't set OMP_PARALLEL_COMBINED on the parallel and similarly for the distribute. Tested on x86_64-linux, committed to trunk and 4.9 branch. 2014-04-22 Jakub Jelinek ja...@redhat.com PR c/59073 c/ * c-parser.c (c_parser_omp_parallel): If c_parser_omp_for fails, don't set OMP_PARALLEL_COMBINED and return NULL. cp/ * parser.c (cp_parser_omp_parallel): If cp_parser_omp_for fails, don't set OMP_PARALLEL_COMBINED and return NULL. testsuite/ * c-c++-common/gomp/pr59073.c: New test. --- gcc/c/c-parser.c.jj 2014-03-28 19:15:30.0 +0100 +++ gcc/c/c-parser.c2014-04-22 16:36:45.889868866 +0200 @@ -12208,10 +12208,12 @@ c_parser_omp_parallel (location_t loc, c if (!flag_openmp) /* flag_openmp_simd */ return c_parser_omp_for (loc, parser, p_name, mask, cclauses); block = c_begin_omp_parallel (); - c_parser_omp_for (loc, parser, p_name, mask, cclauses); + tree ret = c_parser_omp_for (loc, parser, p_name, mask, cclauses); stmt = c_finish_omp_parallel (loc, cclauses[C_OMP_CLAUSE_SPLIT_PARALLEL], block); + if (ret == NULL_TREE) + return ret; OMP_PARALLEL_COMBINED (stmt) = 1; return stmt; } --- gcc/cp/parser.c.jj 2014-04-14 09:31:51.0 +0200 +++ gcc/cp/parser.c 2014-04-22 16:37:22.990681121 +0200 @@ -29829,10 +29829,12 @@ cp_parser_omp_parallel (cp_parser *parse return cp_parser_omp_for (parser, pragma_tok, p_name, mask, cclauses); block = begin_omp_parallel (); save = cp_parser_begin_omp_structured_block (parser); - cp_parser_omp_for (parser, pragma_tok, p_name, mask, cclauses); + tree ret = cp_parser_omp_for (parser, pragma_tok, p_name, mask, cclauses); cp_parser_end_omp_structured_block (parser, save); stmt = finish_omp_parallel (cclauses[C_OMP_CLAUSE_SPLIT_PARALLEL], block); + if (ret == NULL_TREE) + return ret; OMP_PARALLEL_COMBINED (stmt) = 1; return stmt; } ---
gcc/testsuite/c-c++-common/gomp/pr59073.c.jj2014-04-22 16:42:26.067112196 +0200 +++ gcc/testsuite/c-c++-common/gomp/pr59073.c 2014-04-22 16:41:57.0 +0200 @@ -0,0 +1,12 @@ +/* PR c/59073 */ +/* { dg-do compile } */ +/* { dg-options "-fopenmp" } */ + +void +foo () +{ + int i; +#pragma omp distribute parallel for + for (i = 0; i < 10; i) /* { dg-error "invalid increment expression" } */ +; +} Jakub
Re: [PING] [PATCH, AArch64] Use GCC builtins to count leading/trailing zeros
On 22 April 2014 17:09, Yufeng Zhang yufeng.zh...@arm.com wrote: Ping~ OK for stage-1? The original patch was posted here: http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00286.html and the glibc patch was approved here: http://sourceware.org/ml/libc-alpha/2014-01/msg00120.html The glibc patch is now committed and I've just merged that change to the gcc copy. /Marcus
Add call_fusage_contains_non_callee_clobbers hook
On 22-04-14 17:05, Tom de Vries wrote: I've updated the fuse-caller-save patch series to model non-callee call clobbers in CALL_INSN_FUNCTION_USAGE. Vladimir, This patch adds a hook to indicate whether a target has added the non-callee call clobbers to CALL_INSN_FUNCTION_USAGE, meaning it's safe to do the fuse-caller-save optimization. OK for trunk? Thanks, - Tom 2013-04-29 Radovan Obradovic robrado...@mips.com Tom de Vries t...@codesourcery.com * target.def (call_fusage_contains_non_callee_clobbers): New DEFHOOK. * doc/tm.texi.in (@node Stack and Calling): Add Miscellaneous Register Hooks to @menu. (@node Miscellaneous Register Hooks): New node. (@hook TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS): New hook. * doc/tm.texi: Regenerate. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index b8ca17e..8af8efd 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -3091,6 +3091,7 @@ This describes the stack layout and calling conventions. * Profiling:: * Tail Calls:: * Stack Smashing Protection:: +* Miscellaneous Register Hooks:: @end menu @node Frame Layout @@ -5016,6 +5017,21 @@ normally defined in @file{libgcc2.c}. Whether this target supports splitting the stack when the options described in @var{opts} have been passed. This is called after options have been parsed, so the target may reject splitting the stack in some configurations. The default version of this hook returns false. If @var{report} is true, this function may issue a warning or error; if @var{report} is false, it must simply return a value @end deftypefn +@node Miscellaneous Register Hooks +@subsection Miscellaneous register hooks +@cindex miscellaneous register hooks + +@deftypefn {Target Hook} bool TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS (void) +Return true if all the calls in the current function contain clobbers in +CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call +rather than by the callee, and are not already set or clobbered in the call +pattern. 
Examples of such registers are registers used in PLTs and stubs, +and temporary registers used in the call instruction but not present in the +rtl pattern. Another way to formulate it is the registers not present in the +rtl pattern that are clobbered by the call assuming the callee does not +clobber any register. The default version of this hook returns false. +@end deftypefn + @node Varargs @section Implementing the Varargs Macros @cindex varargs implementation diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index d793d26..8991c3c 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -2720,6 +2720,7 @@ This describes the stack layout and calling conventions. * Profiling:: * Tail Calls:: * Stack Smashing Protection:: +* Miscellaneous Register Hooks:: @end menu @node Frame Layout @@ -3985,6 +3986,12 @@ the function prologue. Normally, the profiling code comes after. @hook TARGET_SUPPORTS_SPLIT_STACK +@node Miscellaneous Register Hooks +@subsection Miscellaneous register hooks +@cindex miscellaneous register hooks + +@hook TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS + @node Varargs @section Implementing the Varargs Macros @cindex varargs implementation diff --git a/gcc/target.def b/gcc/target.def index 3a64cd1..ae0bc9c 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5130,6 +5130,22 @@ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, and the PIC_OFFSET_TABLE_REGNUM., void, (bitmap regs), hook_void_bitmap) +/* Targets should define this target hook to mark that non-callee clobbers are + present in CALL_INSN_FUNCTION_USAGE for all the calls in the current + function. */ +DEFHOOK +(call_fusage_contains_non_callee_clobbers, + Return true if all the calls in the current function contain clobbers in\n\ +CALL_INSN_FUNCTION_USAGE for the registers that are clobbered by the call\n\ +rather than by the callee, and are not already set or clobbered in the call\n\ +pattern. 
Examples of such registers are registers used in PLTs and stubs,\n\ +and temporary registers used in the call instruction but not present in the\n\ +rtl pattern. Another way to formulate it is the registers not present in the\n\ +rtl pattern that are clobbered by the call assuming the callee does not\n\ +clobber any register. The default version of this hook returns false., + bool, (void), + hook_bool_void_false) + /* Fill in additional registers set up by prologue into a regset. */ DEFHOOK (set_up_by_prologue,
Re: [PATCH] Do not run IPA transform phases multiple times
On Fri, Apr 18, 2014 at 03:08:09PM +0200, Martin Jambor wrote: On Fri, Apr 18, 2014 at 01:49:36PM +0200, Jakub Jelinek wrote: It reproduces on x86_64 too, I guess the reason why you aren't seeing this is that you might have too old assembler that doesn't support avx2 instructions (you actually don't need avx2 hw to reproduce, any x86_64 or i686 just with gas that supports avx2 should be enough). I see, with that information I have managed to reproduce the failures and now am convinced the patch (re-posted below) is indeed the correct thing to do. I am going to bootstrap it over the weekend (can't do it earlier to test these testcases). Honza, is it OK for trunk if it passes? The patch has passed bootstrap and testing on x86_64 with AVX support, Honza approved it on IRC and so I have committed it to trunk. Thanks, Martin 2014-04-18 Martin Jambor mjam...@suse.cz * cgraphclones.c (cgraph_function_versioning): Copy ipa_transforms_to_apply instead of asserting it is empty. Index: src/gcc/cgraphclones.c === --- src.orig/gcc/cgraphclones.c +++ src/gcc/cgraphclones.c @@ -974,7 +974,9 @@ cgraph_function_versioning (struct cgrap cgraph_copy_node_for_versioning (old_version_node, new_decl, redirect_callers, bbs_to_copy); - gcc_assert (!old_version_node->ipa_transforms_to_apply.exists ()); + if (old_version_node->ipa_transforms_to_apply.exists ()) +new_version_node->ipa_transforms_to_apply + = old_version_node->ipa_transforms_to_apply.copy (); /* Copy the OLD_VERSION_NODE function tree to the new version. */ tree_function_versioning (old_decl, new_decl, tree_map, false, args_to_skip, skip_return, bbs_to_copy, new_entry_block);
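The reasoning behind the fix, that a freshly created function version must get its own copy of the pending transform list rather than the old node asserting the list away, can be sketched with standard containers. This is an illustration only: std::vector stands in for gcc's vec, and all names below are invented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Invented stand-in for a cgraph node with its list of pending
// IPA transform phases.  std::vector models gcc's vec<>.
struct node
{
  std::vector<std::string> transforms_to_apply;
};

// Mirror of the fix: instead of asserting that the old node has no
// pending transforms, give the new version its own copy of them, so
// the clone is processed by the same phases as the original.
void
copy_pending_transforms (const node &old_version, node &new_version)
{
  if (!old_version.transforms_to_apply.empty ())
    new_version.transforms_to_apply = old_version.transforms_to_apply;
}
```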
Re: [PATCH][AARCH64] Support tail indirect function call
^Ping... ok for stage 1? Regards, Jiong On 02/04/14 12:04, Jiong Wang wrote: ^Ping... Regards, Jiong On 18/03/14 14:13, Jiong Wang wrote: Currently, an indirect function call prevents tail-call optimization on AArch64. This patch adapts the fix for PR arm/19599 to AArch64. Is it ok for next stage 1? Thanks. -- Jiong gcc/ * config/aarch64/predicates.md (aarch64_call_insn_operand): New predicate. * config/aarch64/constraints.md (Ucs, Usf): New constraints. * config/aarch64/aarch64.md (*sibcall_insn, *sibcall_value_insn): Adjust for tailcalling through registers. * config/aarch64/aarch64.h (enum reg_class): New caller save register class. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. * config/aarch64/aarch64.c (aarch64_function_ok_for_sibcall): Allow tailcalling without decls. gcc/testsuite *gcc.target/aarch64/tail-indirect-call.c: New test. -- Jiong
Re: [PATCH 00/89] Compile-time gimple-checking
On Tue, 2014-04-22 at 09:03 +0100, Richard Sandiford wrote: First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) But... David Malcolm dmalc...@redhat.com writes: In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h). I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a <gimple_statement_bind> (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs->as_a_gimple_switch (), spc, flags); break; where the ->as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at least): "gs as a gimple switch", as opposed to: "as a gimple switch... gs", which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of having the checked cast as a *method* is that it implicitly documents the requirement that the input must be non-NULL. ...FWIW I really don't like these cast members. The counterarguments are: - as_a <...> (...) and dyn_cast <...> (...) follow the C++ syntax for other casts. - the type you get is obvious, rather than being a contraction of the type name. - having them as methods means that the base class needs to be aware of all subclasses. I realise that's probably inherently true of gimple due to the enum, but it seems like bad design.
You could potentially have different subclasses for the same enum, selected by a secondary field. Maybe I've just been reading C code too long, but as a gimple switch...gs doesn't seem any less natural than is constant...address. Another way of reducing the verbosity of as_a would be to shorten the type names. E.g. gimple_statement contracts to gimple_stmt in some places, so gimple_statement_bind could become gimple_stmt_bind or just gimple_bind. gimple_bind is probably better since it matches the names of the accessors. If the thing after as_a_ matches the type name, the X-as_a_foo () takes the same number of characters as as_a foo (X). Beauty is indeed in the eye of the beholder :) I prefer my proposal, but I also like yours. It would convert these fragments (from patch 2): static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs-as_a_gimple_switch (), spc, flags); break; to these: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch *gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a gimple_switch (gs), spc, flags); break; Note that this affects the pointerness of the types: in the patches I sent, since is-a.h has: template typename T, typename U inline T * ^^^ Note how it returns a (T*) as_a (U *p) { gcc_checking_assert (is_a T (p)); but uses the specialization of T, not T* here return is_a_helper T::cast (p); ^^^ and here } i.e. in the current proposal, gimple_switch is a typedef of a *pointer* to the GIMPLE_SWITCH subclass: class gimple_statement_switch whereas direct use of the is-a.h interface would eliminate the new typedefs and give us: class gimple_switch and all vars and params that took a subclass would convert from status quo: gimple some_switch_stmt; to: gimple_switch *some_switch_stmt; I like this API. 
One drawback is that it leads to an inconsistency in pointerness between the typedef gimple, a pointer to the baseclass, and these subclass types, so you might have local decls looking like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign *assign_stmt; /* ...whereas these ones do */ gimple_cond *assign_stmt; gimple_phi *phi; This could be resolved by renaming class gimple_statement_base to class gimple and eliminating the gimple typedef, so that we might have locals declared like this: gimple *some_stmt; /* note how this has gained a star */ gimple_assign *assign_stmt; gimple_cond *assign_stmt; gimple_phi *phi; though clearly
Re: fuse-caller-save - hook format
On 22-04-14 18:18, Richard Sandiford wrote: Tom de Vries tom_devr...@mentor.com writes: On 22-04-14 17:27, Richard Sandiford wrote: Tom de Vries tom_devr...@mentor.com writes: 2. post_expand_call_insn. A utility hook to facilitate adding the clobbers to CALL_INSN_FUNCTION_USAGE. Why is this needed though? Like I say, I think targets should update CALL_INSN_FUNCTION_USAGE when emitting calls as part of the call expander. Splitting the functionality of the call expanders across the define_expand and a new hook just makes things unnecessarily complicated IMO. Richard, It is not needed, but it is convenient. There are targets where the define_expands for calls use the rtl template. Having to add clobbers to the CALL_INSN_FUNCTION_USAGE for such a target means you cannot use the rtl template any more and instead need to generate all needed RTL insns in C code. This hook means that you can keep using the rtl template, which is less intrusive for those targets. [ switching order of questions ] Which target do you have in mind? Aarch64. But if the target is simple enough to use a single call pattern for call cases, wouldn't it be possible to add the clobber directly to the call pattern? I think that can be done, but that feels intrusive as well. I thought the reason that we added these clobbers to CALL_INSN_FUNCTION_USAGE was exactly because we did not want to add them to the rtl patterns? But, if the maintainer is fine with that, so am I. Richard Earnshaw, are you ok with adding the IP0_REGNUM/IP1_REGNUM clobbers to all the call patterns in the Aarch64 target? The alternatives are: - rewrite the call expansions not to use the rtl templates, and add the clobbers there to CALL_INSN_FUNCTION_USAGE - get the post_expand_call_insn hook approved and use that to add the clobbers to CALL_INSN_FUNCTION_USAGE. what is your preference? Thanks, - Tom
Re: [PATCH 02/89] Introduce gimple_switch and use it in various places
On Mon, 2014-04-21 at 18:45 -0400, Trevor Saunders wrote: --- a/gcc/tree-loop-distribution.c +++ b/gcc/tree-loop-distribution.c @@ -687,8 +687,9 @@ generate_loops_for_partition (struct loop *loop, partition_t partition, } else if (gimple_code (stmt) == GIMPLE_SWITCH) { + gimple_switch switch_stmt = stmt->as_a_gimple_switch (); maybe it would make more sense to do else if (gimple_switch switch_stmt = stmt->dyn_cast_gimple_switch ()) Thanks. Yes, or indeed something like: else if (gimple_switch switch_stmt = dyn_cast <gimple_switch> (stmt)) (modulo the pointerness issues mentioned in http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01334.html )
Re: [PATCH 71/89] Concretize gimple_cond_make_{false|true}
On Tue, 2014-04-22 at 11:37 -0400, Trevor Saunders wrote: diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 902b879..62ec9f5 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -9517,10 +9517,11 @@ fold_predicate_in (gimple_stmt_iterator *si) else { gcc_assert (gimple_code (stmt) == GIMPLE_COND); + gimple_cond cond_stmt = stmt->as_a_gimple_cond (); the assert isn't needed now right? Correct. I guess my thinking here was that the original code was checking for it, presumably to fail early, rather than hitting the GIMPLE_CHECK macros when calling the accessors later on, but the checked cast to gimple_cond gives us that.
Re: [PATCH 00/89] Compile-time gimple-checking
On Tue, 2014-04-22 at 09:05 -0400, Andrew MacLeod wrote: On 04/22/2014 04:03 AM, Richard Sandiford wrote: First of all, thanks a lot for doing this. Maybe one day we'll have the same in rtl :-) But... David Malcolm dmalc...@redhat.com writes: In doing the checked downcasts I ran into the verbosity of the as_a API (in our is-a.h). I first tried simplifying them with custom functions e.g.: static inline gimple_bind as_a_gimple_bind (gimple gs) { return as_a gimple_statement_bind (gs); } but the approach I've gone with makes these checked casts be *methods* of the gimple_statement_base class, so that e.g. in a switch statement you can write: case GIMPLE_SWITCH: dump_gimple_switch (buffer, gs-as_a_gimple_switch (), spc, flags); break; where the -as_a_gimple_switch is a no-op cast from gimple to the more concrete gimple_switch in a release build, with runtime checking for code == GIMPLE_SWITCH added in a checked build (it uses as_a internally). This is much less verbose than trying to do it with as_a directly, and I think doing it as a method reads better aloud (to my English-speaking mind, at-least): gs as a gimple switch, as opposed to: as a gimple switch... gs, which I find clunky. It makes the base class a little cluttered, but IMHO it hits a sweet-spot of readability and type-safety with relatively little verbosity (only 8 more characters than doing it with a raw C-style cast). Another advantage of having the checked cast as a *method* is that it implicitly documents the requirement that the input must be non-NULL. ...FWIW I really don't like these cast members. The counterarguments are: - as_a ... (...) and dyn_cast ... (...) follow the C++ syntax for other casts. - the type you get is obvious, rather than being a contraction of the type name. - having them as methods means that the base class needs to aware of all subclasses. I realise that's probably inherently true of gimple due to the enum, but it seems like bad design. 
You could potentially have different subclasses for the same enum, selected by a secondary field. I'm not particularly fond of this aspect as well... I fear that someday down the road we would regret this decision, and end up changing it all back to is_a and friends These kind of sweeping changes we ought to try very hard to make sure we only have to do it once. If this is purely for verbosity, I think we can find better ways to reduce it... Is there any other reason? There was also the idea that a method carries with it the implication that the ptr is non-NULL... but the main reason was verbosity. I think that with a change to is-a.h to better support typedefs, we can achieve a relatively terse API; see: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01334.html Maybe I've just been reading C code too long, but as a gimple switch...gs doesn't seem any less natural than is constant...address. Another way of reducing the verbosity of as_a would be to shorten the type names. E.g. gimple_statement contracts to gimple_stmt in some places, so gimple_statement_bind could become gimple_stmt_bind or just gimple_bind. gimple_bind is probably better since it matches the names of the accessors. If the thing after as_a_ matches the type name, the X-as_a_foo () takes the same number of characters as as_a foo (X). I was running into similar issues with the gimple re-arch work... One thing I was going to bring up at some point was the possibility of some renaming of types.In the context of these gimple statements, I would propose that we drop the gimple_ prefix completely, and end up with maybe something more concise like bind_stmt, switch_stmt, assign_stmt, etc. There will be places in the code where we have used something like gimple switch_stmt = blah(); so those variables would have to be renamed... and probably other dribble effects... but it would make the code look cleaner. and then as_a, is_a and dyn_cast wouldn't look so bad. 
I see the gimple part of the name as being redundant. If we're really concerned about it, put the whole thing inside a namespace, say 'Gimple' and then all the gimple source files can simply start with using namespace Gimple; and then use 'bind_stmt' throughout. Non gimple source files could then refer to it directly as Gimple::bind_stmt... This would tie in well with what I'm planning to propose for gimple types and values. That would require teaching gengtype about namespaces (rather than the current hack), which I'd prefer to avoid. Of course, it would be ideal if we could use 'gimple' as the namespace, but that is currently taken by the gimple statement type... I'd even go so far as to propose that 'gimple' should be renamed 'gimple::stmt'.. but that is much more work
Re: [PATCH 00/89] Compile-time gimple-checking
David Malcolm dmalc...@redhat.com writes: Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags); break; which is concise, readable, and avoids the change in pointerness compared to the gimple typedef; the local decls above would look like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign assign_stmt; /* ...and neither do these */ gimple_cond assign_stmt; gimple_phi phi; I think this last proposal is my preferred API, but it requires the change to is-a.h Attached is a proposed change to the is-a.h API that eliminates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do Q* q = dyn_cast <Q*> (p); not: Q* q = dyn_cast <Q> (p); Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast. If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one. Thanks, Richard
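For readers following the API discussion, here is a minimal, self-contained sketch of the free-function casting style in the spirit of is-a.h. Everything below is invented for illustration: the simplified statement types and enum codes are not the real gimple hierarchy, and the sketch uses the dyn_cast <Q> (p) spelling rather than the dyn_cast <Q*> (p) spelling the thread settles on.

```cpp
#include <cassert>
#include <cstddef>

// Invented stand-ins for gimple statement classes; only the shape of
// the cast API mirrors is-a.h.
enum stmt_code { CODE_SWITCH, CODE_COND };

struct stmt_base
{
  stmt_code code;
};

struct stmt_switch : stmt_base
{
  int num_labels;
};

// is_a <T>: predicate, specialised per subclass.
template <typename T> bool is_a (stmt_base *p);

template <> bool is_a <stmt_switch> (stmt_base *p)
{
  return p->code == CODE_SWITCH;
}

// as_a <T>: checked downcast; asserts in a checked build, effectively
// a plain cast otherwise.
template <typename T> T *as_a (stmt_base *p)
{
  assert (is_a <T> (p));
  return static_cast <T *> (p);
}

// dyn_cast <T>: downcast that yields NULL on mismatch, like the C++
// dynamic_cast operator the thread compares it with.
template <typename T> T *dyn_cast (stmt_base *p)
{
  return is_a <T> (p) ? static_cast <T *> (p) : NULL;
}
```

The method style under discussion would instead hang as_a_gimple_switch-like members off stmt_base; this sketch only illustrates the free-function alternative.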
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/22/2014 01:50 PM, David Malcolm wrote: On Tue, 2014-04-22 at 09:05 -0400, Andrew MacLeod wrote: Of course, it would be ideal if we could use 'gimple' as the namespace, but that is currently taken by the gimple statement type... I'd even go so far as to propose that 'gimple' should be renamed 'gimple::stmt'.. but that is much more work :-) I'm not at all keen on that further suggestion: I wanted to make this patch series as minimal as possible whilst giving us the compile-time tracking of gimple codes. Although people's inboxes may find this surprising, I was trying to be conservative with the patch series :) [1] I wasn't suggesting you do it in this patch set... It would clearly be its own patch set, and doesn't even need to be done by you. Merely bringing up the option for future consideration since I'd like to see the generic 'gimple' name re-purposed :-) We're both working on large changes that improve the type-safety of the middle-end: this patch series affects statements, whereas AIUI you have a branch working on expressions and types. How do we best co-ordinate this so that we don't bitrot each other's work, so that the result is reviewable, and the changes are understandable in, say, 5 years time? My plan was to do the statement work as a (large) series of small patches against trunk, trying to minimize the number of lines I touch, mostly function and variable decls with a few lines adding is_a and dyn_cast, whereas your change AIUI by necessity involves more substantial changes to function bodies. I think we can only have zero or one such touch every line change(s) landing at once. Dave I don't think you should worry about bit-rotting other in-progress branches when making these decisions...at least not mine.. I think you should do the right thing, whatever that turns out to be :-) My stuff wont land as a touch every line change through the source base. 
its designed to fully convert a file at a time and transparently coexist with the existing tree interface... Impact on other files is minimal. Plus I expect it will go through at least one more massive round of changes after I finish documenting it and bring it forth for discussions in the coming month(s). So it'll be all bit-rotted then anyway. Andrew
Re: [PATCH 00/89] Compile-time gimple-checking
On 04/22/2014 02:56 PM, Richard Sandiford wrote: David Malcolm dmalc...@redhat.com writes: Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags); break; which is concise, readable, and avoids the change in pointerness compared to the gimple typedef; the local decls above would look like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign assign_stmt; /* ...and neither do these */ gimple_cond assign_stmt; gimple_phi phi; I think this last proposal is my preferred API, but it requires the change to is-a.h Attached is a proposed change to the is-a.h API that eliminates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do Q* q = dyn_cast <Q*> (p); not: Q* q = dyn_cast <Q> (p); Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast. If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one. Can we also consider making dyn_cast handle NULL at the same time? the C++ dynamic_cast does, and I ended up supporting it in my branch as well. so dyn_cast would be more like: inline T dyn_cast (U *p) { if (p && is_a <T> (p)) return is_a_helper <T>::cast (p); else return static_cast <T> (0); } both is_a and as_a would still require a valid pointer or you get a NULL dereference... I've found it very pragmatic... Andrew
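Andrew's comparison with the built-in operator can be checked directly: C++'s dynamic_cast already maps a null source pointer to a null result, which is exactly the tolerance he proposes for dyn_cast. A small sketch with invented types:

```cpp
#include <cassert>
#include <cstddef>

// Minimal polymorphic hierarchy; names are invented for illustration.
struct base
{
  virtual ~base () {}
};

struct derived : base
{
};

// No explicit null check is needed: the standard guarantees that
// dynamic_cast applied to a null pointer yields a null pointer, and
// that a failed downcast to a pointer type also yields null.
derived *
try_downcast (base *p)
{
  return dynamic_cast <derived *> (p);
}
```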
Re: [PING] [PATCH] register CALL_INSN_FUNCTION_USAGE in find_all_hard_reg_sets
Ping of this ( http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00888.html ) patch. That patch isn't for GCC mainline though, but OK on principle if you test it on mainline, avoid the very ugly set-inside-use idiom and do: record_hard_reg_sets (XEXP (op, 0), NULL, pset); instead of reimplementing it manually. -- Eric Botcazou
Re: [PATCH 00/89] Compile-time gimple-checking
On April 22, 2014 8:56:56 PM CEST, Richard Sandiford rdsandif...@googlemail.com wrote: David Malcolm dmalc...@redhat.com writes: Alternatively we could change the is-a.h API to eliminate this discrepancy, and keep the typedefs; giving something like the following: static void dump_gimple_switch (pretty_printer *buffer, gimple_switch gs, int spc, int flags) [...snip...] [...later, within pp_gimple_stmt_1:] case GIMPLE_SWITCH: dump_gimple_switch (buffer, as_a <gimple_switch> (gs), spc, flags); break; which is concise, readable, and avoids the change in pointerness compared to the gimple typedef; the local decls above would look like this: gimple some_stmt; /* note how this doesn't have a star... */ gimple_assign assign_stmt; /* ...and neither do these */ gimple_cond assign_stmt; gimple_phi phi; I think this last proposal is my preferred API, but it requires the change to is-a.h Attached is a proposed change to the is-a.h API that eliminates the discrepancy, allowing the use of typedefs with is-a.h (doesn't yet compile, but hopefully illustrates the idea). Note how it changes the API to match C++'s dynamic_cast operator i.e. you do Q* q = dyn_cast <Q*> (p); not: Q* q = dyn_cast <Q> (p); Thanks for being flexible. :-) I like this version too FWIW, for the reason you said: it really does look like a proper C++ cast. Indeed. I even wasn't aware it is different than a C++ cast... Richard. If we ever decide to get rid of the typedefs (maybe at the same time as using auto) then the choice might be different, but that would be a much more systematic and easily-automated change than this one. Thanks, Richard
[wide-int 1/8] Fix some off-by-one errors and bounds tests
This is the first of 8 patches from reading through the diff with mainline. Some places had an off-by-one error on an index and some used "> 0" instead of ">= 0". I think we should use MAX_BITSIZE_MODE_ANY_MODE rather than MAX_BITSIZE_MODE_ANY_INT when handling floating-point modes. Two hunks contain unrelated formatting fixes too. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/c-family/c-ada-spec.c === --- gcc/c-family/c-ada-spec.c 2014-04-22 20:31:10.632895953 +0100 +++ gcc/c-family/c-ada-spec.c 2014-04-22 20:31:24.880998602 +0100 @@ -2205,8 +2205,9 @@ dump_generic_ada_node (pretty_printer *b val = -val; } sprintf (pp_buffer (buffer)->digit_buffer, - "16#%" HOST_WIDE_INT_PRINT "x", val.elt (val.get_len () - 1)); - for (i = val.get_len () - 2; i > 0; i--) + "16#%" HOST_WIDE_INT_PRINT "x", + val.elt (val.get_len () - 1)); + for (i = val.get_len () - 2; i >= 0; i--) sprintf (pp_buffer (buffer)->digit_buffer, HOST_WIDE_INT_PRINT_PADDED_HEX, val.elt (i)); pp_string (buffer, pp_buffer (buffer)->digit_buffer); Index: gcc/dbxout.c === --- gcc/dbxout.c2014-04-22 20:31:10.632895953 +0100 +++ gcc/dbxout.c2014-04-22 20:31:24.881998608 +0100 @@ -720,7 +720,7 @@ stabstr_O (tree cst) } prec -= res_pres; - for (i = prec - 3; i > 0; i = i - 3) + for (i = prec - 3; i >= 0; i = i - 3) { digit = wi::extract_uhwi (cst, i, 3); stabstr_C ('0' + digit); Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c 2014-04-22 20:31:10.632895953 +0100 +++ gcc/dwarf2out.c 2014-04-22 20:31:24.884998630 +0100 @@ -1847,7 +1847,7 @@ output_loc_operands (dw_loc_descr_ref lo int i; int len = get_full_len (*val2->v.val_wide); if (WORDS_BIG_ENDIAN) - for (i = len; i >= 0; --i) + for (i = len - 1; i >= 0; --i) dw2_asm_output_data (HOST_BITS_PER_WIDE_INT / HOST_BITS_PER_CHAR, val2->v.val_wide->elt (i), NULL); else @@ -2073,7 +2073,7 @@ output_loc_operands (dw_loc_descr_ref lo dw2_asm_output_data (1, len * l, NULL); if (WORDS_BIG_ENDIAN) - for (i = len; i >= 0; --i) + for (i = len - 1; i >= 0; --i)
dw2_asm_output_data (l, val2->v.val_wide->elt (i), NULL); else for (i = 0; i < len; ++i) @@ -5398,11 +5398,11 @@ print_die (dw_die_ref die, FILE *outfile int i = a->dw_attr_val.v.val_wide->get_len (); fprintf (outfile, "constant ("); gcc_assert (i > 0); - if (a->dw_attr_val.v.val_wide->elt (i) == 0) + if (a->dw_attr_val.v.val_wide->elt (i - 1) == 0) fprintf (outfile, "0x"); fprintf (outfile, HOST_WIDE_INT_PRINT_HEX, a->dw_attr_val.v.val_wide->elt (--i)); - while (-- i >= 0) + while (--i >= 0) fprintf (outfile, HOST_WIDE_INT_PRINT_PADDED_HEX, a->dw_attr_val.v.val_wide->elt (i)); fprintf (outfile, ")"); @@ -8723,7 +8723,7 @@ output_die (dw_die_ref die) NULL); if (WORDS_BIG_ENDIAN) - for (i = len; i >= 0; --i) + for (i = len - 1; i >= 0; --i) { dw2_asm_output_data (l, a->dw_attr_val.v.val_wide->elt (i), name); Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c 2014-04-22 20:31:10.632895953 +0100 +++ gcc/simplify-rtx.c 2014-04-22 20:31:24.884998630 +0100 @@ -5395,7 +5395,7 @@ simplify_immed_subreg (enum machine_mode case MODE_DECIMAL_FLOAT: { REAL_VALUE_TYPE r; - long tmp[MAX_BITSIZE_MODE_ANY_INT / 32]; + long tmp[MAX_BITSIZE_MODE_ANY_MODE / 32]; /* real_from_target wants its input in words affected by FLOAT_WORDS_BIG_ENDIAN. However, we ignore this,
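The descending-loop bugs this patch fixes are easy to reproduce in isolation. The sketch below (invented array, not gcc code) contrasts the corrected loop with the "i > 0" variant that silently skips element 0; starting at len instead of len - 1 would likewise read one element past the end, which cannot be demonstrated safely here.

```cpp
#include <cassert>

// Visits indices len-1 .. 0 inclusive: the corrected loop shape.
int
sum_correct (const int *elts, int len)
{
  int sum = 0;
  for (int i = len - 1; i >= 0; --i)
    sum += elts[i];
  return sum;
}

// The "> 0" variant: off by one, element 0 is never read.
int
sum_skips_zero (const int *elts, int len)
{
  int sum = 0;
  for (int i = len - 1; i > 0; --i)
    sum += elts[i];
  return sum;
}
```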
[wide-int 2/8] Fix ubsan internal-fn.c handling
This code was mixing hprec and hprec*2 wide_ints. The simplest fix seemed to be to introduce a function that gives the minimum precision necessary to represent a value, which also means that no temporary wide_ints are needed. Other places might be able to use this too, but I'd like to look at that after the merge. The patch series fixed a regression in c-c++-common/ubsan/overflow-2.c and I assume it's due to this change. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/internal-fn.c === --- gcc/internal-fn.c 2014-04-22 20:31:10.516895118 +0100 +++ gcc/internal-fn.c 2014-04-22 20:31:25.842005530 +0100 @@ -478,7 +478,7 @@ ubsan_expand_si_overflow_mul_check (gimp rtx do_overflow = gen_label_rtx (); rtx hipart_different = gen_label_rtx (); - int hprec = GET_MODE_PRECISION (hmode); + unsigned int hprec = GET_MODE_PRECISION (hmode); rtx hipart0 = expand_shift (RSHIFT_EXPR, mode, op0, hprec, NULL_RTX, 0); hipart0 = gen_lowpart (hmode, hipart0); @@ -513,12 +513,11 @@ ubsan_expand_si_overflow_mul_check (gimp wide_int arg0_min, arg0_max; if (get_range_info (arg0, &arg0_min, &arg0_max) == VR_RANGE) { - if (wi::les_p (arg0_max, wi::max_value (hprec, SIGNED)) - && wi::les_p (wi::min_value (hprec, SIGNED), arg0_min)) + unsigned int mprec0 = wi::min_precision (arg0_min, SIGNED); + unsigned int mprec1 = wi::min_precision (arg0_max, SIGNED); + if (mprec0 <= hprec && mprec1 <= hprec) op0_small_p = true; - else if (wi::les_p (arg0_max, wi::max_value (hprec, UNSIGNED)) && wi::les_p (~wi::max_value (hprec, UNSIGNED), -arg0_min)) + else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1) op0_medium_p = true; if (!wi::neg_p (arg0_min, TYPE_SIGN (TREE_TYPE (arg0)))) op0_sign = 0; @@ -531,12 +530,11 @@ ubsan_expand_si_overflow_mul_check (gimp wide_int arg1_min, arg1_max; if (get_range_info (arg1, &arg1_min, &arg1_max) == VR_RANGE) { - if (wi::les_p (arg1_max, wi::max_value (hprec, SIGNED)) - && wi::les_p (wi::min_value (hprec, SIGNED), arg1_min)) + unsigned int mprec0 = wi::min_precision
(arg1_min, SIGNED); + unsigned int mprec1 = wi::min_precision (arg1_max, SIGNED); + if (mprec0 <= hprec && mprec1 <= hprec) op1_small_p = true; - else if (wi::les_p (arg1_max, wi::max_value (hprec, UNSIGNED)) && wi::les_p (~wi::max_value (hprec, UNSIGNED), -arg1_min)) + else if (mprec0 <= hprec + 1 && mprec1 <= hprec + 1) op1_medium_p = true; if (!wi::neg_p (arg1_min, TYPE_SIGN (TREE_TYPE (arg1)))) op1_sign = 0; Index: gcc/wide-int.h === --- gcc/wide-int.h 2014-04-22 20:31:10.516895118 +0100 +++ gcc/wide-int.h 2014-04-22 20:31:25.842005530 +0100 @@ -562,6 +562,9 @@ #define SHIFT_FUNCTION \ template <typename T> unsigned HOST_WIDE_INT extract_uhwi (const T &, unsigned int, unsigned int); + + template <typename T> + unsigned int min_precision (const T &, signop); } namespace wi @@ -2995,6 +2998,17 @@ wi::extract_uhwi (const T &x, unsigned i return zext_hwi (res, width); } +/* Return the minimum precision needed to store X with sign SGN. */ +template <typename T> +inline unsigned int +wi::min_precision (const T &x, signop sgn) +{ + if (sgn == SIGNED) +return wi::get_precision (x) - clrsb (x); + else +return wi::get_precision (x) - clz (x); +} + template <typename T> void gt_ggc_mx (generic_wide_int <T> *)
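A scalar model of what wi::min_precision computes may help: the smallest two's-complement (SIGNED) or plain binary (UNSIGNED) width that can represent a value, which is what the patch compares against hprec and hprec + 1. This is an illustration with plain long long, not the real wide-int implementation; the arithmetic right shift on negative values is implementation-defined before C++20 but universal in practice.

```cpp
#include <cassert>

enum signop { SIGNED, UNSIGNED };

// Invented scalar stand-in for wi::min_precision.
unsigned int
min_precision (long long val, signop sgn)
{
  if (sgn == UNSIGNED)
    {
      // Number of significant bits; zero needs none in this model,
      // matching the prec - clz formulation.
      unsigned long long u = (unsigned long long) val;
      unsigned int bits = 0;
      while (u)
        {
          u >>= 1;
          bits++;
        }
      return bits;
    }
  // SIGNED: strip redundant copies of the sign bit; 0 and -1 both
  // fit in a single bit.
  unsigned int bits = 1;
  while (val != 0 && val != -1)
    {
      val >>= 1;  // arithmetic shift on negatives in practice
      bits++;
    }
  return bits;
}
```

With this model, the patch's test "both bounds fit in hprec bits" is min_precision (min) <= hprec && min_precision (max) <= hprec, replacing the two wi::les_p range comparisons.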
[wide-int 3/8] Add and use udiv_ceil
Just a minor tweak to avoid several calculations when one would do. Since we have a function for rounded-up division, we might as well use it instead of the (X + Y - 1) / Y idiom. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c 2014-04-22 20:31:25.187000808 +0100 +++ gcc/dwarf2out.c 2014-04-22 20:31:26.374009366 +0100 @@ -14824,7 +14824,7 @@ simple_decl_align_in_bits (const_tree de static inline offset_int round_up_to_align (const offset_int &t, unsigned int align) { - return wi::udiv_trunc (t + align - 1, align) * align; + return wi::udiv_ceil (t, align) * align; } /* Given a pointer to a FIELD_DECL, compute and return the byte offset of the Index: gcc/wide-int.h === --- gcc/wide-int.h 2014-04-22 20:31:25.842005530 +0100 +++ gcc/wide-int.h 2014-04-22 20:31:26.375009373 +0100 @@ -521,6 +521,7 @@ #define SHIFT_FUNCTION \ BINARY_FUNCTION udiv_floor (const T1 &, const T2 &); BINARY_FUNCTION sdiv_floor (const T1 &, const T2 &); BINARY_FUNCTION div_ceil (const T1 &, const T2 &, signop, bool * = 0); + BINARY_FUNCTION udiv_ceil (const T1 &, const T2 &); BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0); BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop, WI_BINARY_RESULT (T1, T2) *); @@ -2566,6 +2567,13 @@ wi::div_ceil (const T1 &x, const T2 &y, return quotient; } +template <typename T1, typename T2> +inline WI_BINARY_RESULT (T1, T2) +wi::udiv_ceil (const T1 &x, const T2 &y) +{ + return div_ceil (x, y, UNSIGNED); +} + /* Return X / Y, rounding towards nearest with ties away from zero. Treat X and Y as having the signedness given by SGN. Indicate in *OVERFLOW if the result overflows. */
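The replacement can be modelled with plain unsigned arithmetic. A quotient-plus-remainder formulation of rounded-up division avoids the addition in the (X + Y - 1) / Y idiom, which can wrap for large X; the names below are invented for illustration and are not the wide-int implementation.

```cpp
#include <cassert>

// Rounded-up division without the potentially wrapping addition.
unsigned long
udiv_ceil (unsigned long x, unsigned long y)
{
  return x / y + (x % y != 0);
}

// The idiom it replaces, kept for comparison; x + y - 1 can overflow.
unsigned long
udiv_ceil_idiom (unsigned long x, unsigned long y)
{
  return (x + y - 1) / y;
}
```

With this, the patched round_up_to_align is simply udiv_ceil (t, align) * align, e.g. 13 rounded up to an alignment of 8 gives 16.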
[wide-int 4/8] Tweak uses of new API
This is an assorted bunch of API tweaks: - use neg_p instead of lts_p (..., 0) - use STATIC_ASSERT for things that are known at compile time - avoid unnecessary wide(st)_int temporaries and arithmetic - remove an unnecessary template parameter - use to_short_addr for an offset_int -> HOST_WIDE_INT offset change Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/ada/gcc-interface/cuintp.c === --- gcc/ada/gcc-interface/cuintp.c 2014-04-22 20:31:10.680896299 +0100 +++ gcc/ada/gcc-interface/cuintp.c 2014-04-22 20:31:24.526996049 +0100 @@ -160,7 +160,7 @@ UI_From_gnu (tree Input) in a signed 64-bit integer. */ if (tree_fits_shwi_p (Input)) return UI_From_Int (tree_to_shwi (Input)); - else if (wi::lts_p (Input, 0) && TYPE_UNSIGNED (gnu_type)) + else if (wi::neg_p (Input) && TYPE_UNSIGNED (gnu_type)) return No_Uint; #endif Index: gcc/expmed.c === --- gcc/expmed.c2014-04-22 20:31:10.680896299 +0100 +++ gcc/expmed.c2014-04-22 20:31:24.527996056 +0100 @@ -4971,7 +4971,7 @@ make_tree (tree type, rtx x) return t; case CONST_DOUBLE: - gcc_assert (HOST_BITS_PER_WIDE_INT * 2 <= MAX_BITSIZE_MODE_ANY_INT); + STATIC_ASSERT (HOST_BITS_PER_WIDE_INT * 2 <= MAX_BITSIZE_MODE_ANY_INT); if (TARGET_SUPPORTS_WIDE_INT == 0 && GET_MODE (x) == VOIDmode) t = wide_int_to_tree (type, wide_int::from_array (&CONST_DOUBLE_LOW (x), 2, Index: gcc/fold-const.c === --- gcc/fold-const.c2014-04-22 20:31:10.680896299 +0100 +++ gcc/fold-const.c2014-04-22 20:31:24.530996079 +0100 @@ -4274,9 +4274,8 @@ build_range_check (location_t loc, tree if (integer_onep (low) && TREE_CODE (high) == INTEGER_CST) { int prec = TYPE_PRECISION (etype); - wide_int osb = wi::set_bit_in_zero (prec - 1, prec) - 1; - if (osb == high) + if (wi::mask (prec - 1, false, prec) == high) { if (TYPE_UNSIGNED (etype)) { @@ -12950,7 +12949,7 @@ fold_binary_loc (location_t loc, && operand_equal_p (tree_strip_nop_conversions (TREE_OPERAND (arg0, 1)), arg1, 0) - && wi::bit_and (TREE_OPERAND (arg0, 0), 1) == 1) + && wi::extract_uhwi (TREE_OPERAND (arg0,
0), 0, 1) == 1)
 {
 return omit_two_operands_loc (loc, type, code == NE_EXPR

Index: gcc/predict.c
===
--- gcc/predict.c 2014-04-22 20:31:10.680896299 +0100
+++ gcc/predict.c 2014-04-22 20:31:24.531996086 +0100
@@ -1309,33 +1309,34 @@ predict_iv_comparison (struct loop *loop
 bool overflow, overall_overflow = false;
 widest_int compare_count, tem;
- widest_int loop_bound = wi::to_widest (loop_bound_var);
- widest_int compare_bound = wi::to_widest (compare_var);
- widest_int base = wi::to_widest (compare_base);
- widest_int compare_step = wi::to_widest (compare_step_var);
-
 /* (loop_bound - base) / compare_step */
- tem = wi::sub (loop_bound, base, SIGNED, &overflow);
+ tem = wi::sub (wi::to_widest (loop_bound_var),
+ wi::to_widest (compare_base), SIGNED, &overflow);
 overall_overflow |= overflow;
- widest_int loop_count = wi::div_trunc (tem, compare_step, SIGNED,
- &overflow);
+ widest_int loop_count = wi::div_trunc (tem,
+ wi::to_widest (compare_step_var),
+ SIGNED, &overflow);
 overall_overflow |= overflow;
- if (!wi::neg_p (compare_step)
+ if (!wi::neg_p (wi::to_widest (compare_step_var))
 ^ (compare_code == LT_EXPR || compare_code == LE_EXPR))
 {
 /* (loop_bound - compare_bound) / compare_step */
- tem = wi::sub (loop_bound, compare_bound, SIGNED, &overflow);
+ tem = wi::sub (wi::to_widest (loop_bound_var),
+ wi::to_widest (compare_var), SIGNED, &overflow);
 overall_overflow |= overflow;
- compare_count = wi::div_trunc (tem, compare_step, SIGNED, &overflow);
+ compare_count = wi::div_trunc (tem, wi::to_widest (compare_step_var),
+ SIGNED, &overflow);
 overall_overflow |= overflow;
 }
 else
 {
 /* (compare_bound - base) / compare_step */
- tem = wi::sub (compare_bound, base, SIGNED, &overflow);
+ tem = wi::sub (wi::to_widest (compare_var),
+ wi::to_widest (compare_base), SIGNED, &overflow);
 overall_overflow |= overflow;
- compare_count = wi::div_trunc (tem,
[wide-int 5/8] Use LOG2_BITS_PER_UNIT
Looks like a few uses of the old idiom: BITS_PER_UNIT == 8 ? 3 : exact_log2 (BITS_PER_UNIT) have crept in. This patch replaces them with LOG2_BITS_PER_UNIT. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard

Index: gcc/expr.c
===
--- gcc/expr.c 2014-04-22 20:58:26.969683484 +0100
+++ gcc/expr.c 2014-04-22 21:00:26.377614881 +0100
@@ -6801,8 +6801,7 @@ get_inner_reference (tree exp, HOST_WIDE
 if (!integer_zerop (off))
 {
 offset_int boff, coff = mem_ref_offset (exp);
- boff = wi::lshift (coff, (BITS_PER_UNIT == 8
- ? 3 : exact_log2 (BITS_PER_UNIT)));
+ boff = wi::lshift (coff, LOG2_BITS_PER_UNIT);
 bit_offset += boff;
 }
 exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
@@ -6828,8 +6827,7 @@
 {
 offset_int tem = wi::sext (wi::to_offset (offset), TYPE_PRECISION (sizetype));
- tem = wi::lshift (tem, (BITS_PER_UNIT == 8
- ? 3 : exact_log2 (BITS_PER_UNIT)));
+ tem = wi::lshift (tem, LOG2_BITS_PER_UNIT);
 tem += bit_offset;
 if (wi::fits_shwi_p (tem))
 {
@@ -6844,16 +6842,12 @@ get_inner_reference (tree exp, HOST_WIDE
 /* Avoid returning a negative bitpos as this may wreak havoc later. */
 if (wi::neg_p (bit_offset))
 {
- offset_int mask
- = wi::mask <offset_int> (BITS_PER_UNIT == 8
- ? 3 : exact_log2 (BITS_PER_UNIT),
- false);
+ offset_int mask = wi::mask <offset_int> (LOG2_BITS_PER_UNIT, false);
 offset_int tem = bit_offset.and_not (mask);
 /* TEM is the bitpos rounded to BITS_PER_UNIT towards -Inf. Subtract it to BIT_OFFSET and add it (scaled) to OFFSET. */
 bit_offset -= tem;
- tem = wi::arshift (tem, (BITS_PER_UNIT == 8
- ?
3 : exact_log2 (BITS_PER_UNIT))); + tem = wi::arshift (tem, LOG2_BITS_PER_UNIT); offset = size_binop (PLUS_EXPR, offset, wide_int_to_tree (sizetype, tem)); } Index: gcc/tree-dfa.c === --- gcc/tree-dfa.c 2014-04-22 20:58:27.020683881 +0100 +++ gcc/tree-dfa.c 2014-04-22 21:00:26.378614888 +0100 @@ -463,10 +463,7 @@ get_ref_base_and_extent (tree exp, HOST_ { offset_int tem = (wi::to_offset (ssize) - wi::to_offset (fsize)); - if (BITS_PER_UNIT == 8) - tem = wi::lshift (tem, 3); - else - tem *= BITS_PER_UNIT; + tem = wi::lshift (tem, LOG2_BITS_PER_UNIT); tem -= woffset; maxsize += tem; } @@ -583,8 +580,7 @@ get_ref_base_and_extent (tree exp, HOST_ else { offset_int off = mem_ref_offset (exp); - off = wi::lshift (off, (BITS_PER_UNIT == 8 - ? 3 : exact_log2 (BITS_PER_UNIT))); + off = wi::lshift (off, LOG2_BITS_PER_UNIT); off += bit_offset; if (wi::fits_shwi_p (off)) { Index: gcc/tree-ssa-alias.c === --- gcc/tree-ssa-alias.c2014-04-22 20:58:26.969683484 +0100 +++ gcc/tree-ssa-alias.c2014-04-22 21:00:26.378614888 +0100 @@ -1041,8 +1041,7 @@ indirect_ref_may_alias_decl_p (tree ref1 /* The offset embedded in MEM_REFs can be negative. Bias them so that the resulting offset adjustment is positive. */ offset_int moff = mem_ref_offset (base1); - moff = wi::lshift (moff, (BITS_PER_UNIT == 8 - ? 3 : exact_log2 (BITS_PER_UNIT))); + moff = wi::lshift (moff, LOG2_BITS_PER_UNIT); if (wi::neg_p (moff)) offset2p += (-moff).to_short_addr (); else @@ -1118,8 +1117,7 @@ indirect_ref_may_alias_decl_p (tree ref1 || TREE_CODE (dbase2) == TARGET_MEM_REF) { offset_int moff = mem_ref_offset (dbase2); - moff = wi::lshift (moff, (BITS_PER_UNIT == 8 - ? 3 : exact_log2 (BITS_PER_UNIT))); + moff = wi::lshift (moff, LOG2_BITS_PER_UNIT); if (wi::neg_p (moff)) doffset1 -= (-moff).to_short_addr (); else @@ -1217,15 +1215,13 @@ indirect_refs_may_alias_p (tree ref1 ATT /* The offset embedded in MEM_REFs can be negative. Bias them so that the resulting offset adjustment is positive. */ moff = mem_ref_offset
[wide-int 6/8] Avoid redundant extensions
register_edge_assert_for_2 operates on wide_ints of precision nprec so a lot of the extensions are redundant. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c 2014-04-22 20:58:26.969683484 +0100
+++ gcc/tree-vrp.c 2014-04-22 21:00:26.670617168 +0100
@@ -5125,16 +5125,13 @@ register_edge_assert_for_2 (tree name, e
 {
 wide_int minv, maxv, valv, cst2v;
 wide_int tem, sgnbit;
- bool valid_p = false, valn = false, cst2n = false;
+ bool valid_p = false, valn, cst2n;
 enum tree_code ccode = comp_code;
 valv = wide_int::from (val, nprec, UNSIGNED);
 cst2v = wide_int::from (cst2, nprec, UNSIGNED);
- if (TYPE_SIGN (TREE_TYPE (val)) == SIGNED)
- {
- valn = wi::neg_p (wi::sext (valv, nprec));
- cst2n = wi::neg_p (wi::sext (cst2v, nprec));
- }
+ valn = wi::neg_p (valv, TYPE_SIGN (TREE_TYPE (val)));
+ cst2n = wi::neg_p (cst2v, TYPE_SIGN (TREE_TYPE (val)));
 /* If CST2 doesn't have most significant bit set, but VAL is negative, we have comparison like if ((x & 0x123) > -4) (always true). Just give up. */
@@ -5153,13 +5150,11 @@
 have folded the comparison into false) and maximum unsigned value is VAL | ~CST2. */
 maxv = valv | ~cst2v;
- maxv = wi::zext (maxv, nprec);
 valid_p = true;
 break;
 case NE_EXPR:
 tem = valv | ~cst2v;
- tem = wi::zext (tem, nprec);
 /* If VAL is 0, handle (X & CST2) != 0 as (X & CST2) > 0U.
*/ if (valv == 0) { @@ -5176,7 +5171,7 @@ register_edge_assert_for_2 (tree name, e sgnbit = wi::zero (nprec); goto lt_expr; } - if (!cst2n wi::neg_p (wi::sext (cst2v, nprec))) + if (!cst2n wi::neg_p (cst2v)) sgnbit = wi::set_bit_in_zero (nprec - 1, nprec); if (sgnbit != 0) { @@ -5245,7 +5240,6 @@ register_edge_assert_for_2 (tree name, e maxv -= 1; } maxv |= ~cst2v; - maxv = wi::zext (maxv, nprec); minv = sgnbit; valid_p = true; break; @@ -5274,7 +5268,6 @@ register_edge_assert_for_2 (tree name, e } maxv -= 1; maxv |= ~cst2v; - maxv = wi::zext (maxv, nprec); minv = sgnbit; valid_p = true; break; @@ -5283,7 +5276,7 @@ register_edge_assert_for_2 (tree name, e break; } if (valid_p - wi::zext (maxv - minv, nprec) != wi::minus_one (nprec)) + (maxv - minv) != -1) { tree tmp, new_val, type; int i;
Re: version typeinfo for 128bit types
Hello, as written in the PR, my patch seems wrong for platforms like powerpc that already had the __float128 typeinfo for long double with a different version. The following patch regtested fine on x86_64, and a hackish cross-build shows that float128.ver is ignored on powerpc (good). 2014-04-23 Marc Glisse marc.gli...@inria.fr PR libstdc++/43622 * config/abi/pre/float128.ver: New file. * config/abi/pre/gnu.ver (CXXABI_1.3.9): Move __float128 typeinfo to the new file. * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update. * configure.ac: Use float128.ver when relevant. * configure: Regenerate. -- Marc GlisseIndex: libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt === --- libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (revision 209658) +++ libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt (working copy) @@ -2514,20 +2514,21 @@ FUNC:atomic_flag_test_and_set_explicit@@ OBJECT:0:CXXABI_1.3 OBJECT:0:CXXABI_1.3.1 OBJECT:0:CXXABI_1.3.2 OBJECT:0:CXXABI_1.3.3 OBJECT:0:CXXABI_1.3.4 OBJECT:0:CXXABI_1.3.5 OBJECT:0:CXXABI_1.3.6 OBJECT:0:CXXABI_1.3.7 OBJECT:0:CXXABI_1.3.8 OBJECT:0:CXXABI_1.3.9 +OBJECT:0:CXXABI_FLOAT128_1.3.9 OBJECT:0:CXXABI_TM_1 OBJECT:0:GLIBCXX_3.4 OBJECT:0:GLIBCXX_3.4.1 OBJECT:0:GLIBCXX_3.4.10 OBJECT:0:GLIBCXX_3.4.11 OBJECT:0:GLIBCXX_3.4.12 OBJECT:0:GLIBCXX_3.4.13 OBJECT:0:GLIBCXX_3.4.14 OBJECT:0:GLIBCXX_3.4.15 OBJECT:0:GLIBCXX_3.4.16 @@ -2618,21 +2619,21 @@ OBJECT:16:_ZTISt16nested_exception@@CXXA OBJECT:16:_ZTISt8ios_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9exception@@GLIBCXX_3.4 OBJECT:16:_ZTISt9time_base@@GLIBCXX_3.4 OBJECT:16:_ZTISt9type_info@@GLIBCXX_3.4 OBJECT:16:_ZTIa@@CXXABI_1.3 OBJECT:16:_ZTIb@@CXXABI_1.3 OBJECT:16:_ZTIc@@CXXABI_1.3 OBJECT:16:_ZTId@@CXXABI_1.3 OBJECT:16:_ZTIe@@CXXABI_1.3 OBJECT:16:_ZTIf@@CXXABI_1.3 -OBJECT:16:_ZTIg@@CXXABI_1.3.9 +OBJECT:16:_ZTIg@@CXXABI_FLOAT128_1.3.9 OBJECT:16:_ZTIh@@CXXABI_1.3 OBJECT:16:_ZTIi@@CXXABI_1.3 OBJECT:16:_ZTIj@@CXXABI_1.3 
OBJECT:16:_ZTIl@@CXXABI_1.3 OBJECT:16:_ZTIm@@CXXABI_1.3 OBJECT:16:_ZTIn@@CXXABI_1.3.5 OBJECT:16:_ZTIo@@CXXABI_1.3.5 OBJECT:16:_ZTIs@@CXXABI_1.3 OBJECT:16:_ZTIt@@CXXABI_1.3 OBJECT:16:_ZTIv@@CXXABI_1.3 @@ -3119,21 +3120,21 @@ OBJECT:2:_ZNSt10ctype_base5printE@@GLIBC OBJECT:2:_ZNSt10ctype_base5punctE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5spaceE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base5upperE@@GLIBCXX_3.4 OBJECT:2:_ZNSt10ctype_base6xdigitE@@GLIBCXX_3.4 OBJECT:2:_ZTSa@@CXXABI_1.3 OBJECT:2:_ZTSb@@CXXABI_1.3 OBJECT:2:_ZTSc@@CXXABI_1.3 OBJECT:2:_ZTSd@@CXXABI_1.3 OBJECT:2:_ZTSe@@CXXABI_1.3 OBJECT:2:_ZTSf@@CXXABI_1.3 -OBJECT:2:_ZTSg@@CXXABI_1.3.9 +OBJECT:2:_ZTSg@@CXXABI_FLOAT128_1.3.9 OBJECT:2:_ZTSh@@CXXABI_1.3 OBJECT:2:_ZTSi@@CXXABI_1.3 OBJECT:2:_ZTSj@@CXXABI_1.3 OBJECT:2:_ZTSl@@CXXABI_1.3 OBJECT:2:_ZTSm@@CXXABI_1.3 OBJECT:2:_ZTSn@@CXXABI_1.3.9 OBJECT:2:_ZTSo@@CXXABI_1.3.9 OBJECT:2:_ZTSs@@CXXABI_1.3 OBJECT:2:_ZTSt@@CXXABI_1.3 OBJECT:2:_ZTSv@@CXXABI_1.3 @@ -3153,41 +3154,41 @@ OBJECT:32:_ZTIPKDe@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDf@@CXXABI_1.3.4 OBJECT:32:_ZTIPKDi@@CXXABI_1.3.3 OBJECT:32:_ZTIPKDn@@CXXABI_1.3.5 OBJECT:32:_ZTIPKDs@@CXXABI_1.3.3 OBJECT:32:_ZTIPKa@@CXXABI_1.3 OBJECT:32:_ZTIPKb@@CXXABI_1.3 OBJECT:32:_ZTIPKc@@CXXABI_1.3 OBJECT:32:_ZTIPKd@@CXXABI_1.3 OBJECT:32:_ZTIPKe@@CXXABI_1.3 OBJECT:32:_ZTIPKf@@CXXABI_1.3 -OBJECT:32:_ZTIPKg@@CXXABI_1.3.9 +OBJECT:32:_ZTIPKg@@CXXABI_FLOAT128_1.3.9 OBJECT:32:_ZTIPKh@@CXXABI_1.3 OBJECT:32:_ZTIPKi@@CXXABI_1.3 OBJECT:32:_ZTIPKj@@CXXABI_1.3 OBJECT:32:_ZTIPKl@@CXXABI_1.3 OBJECT:32:_ZTIPKm@@CXXABI_1.3 OBJECT:32:_ZTIPKn@@CXXABI_1.3.5 OBJECT:32:_ZTIPKo@@CXXABI_1.3.5 OBJECT:32:_ZTIPKs@@CXXABI_1.3 OBJECT:32:_ZTIPKt@@CXXABI_1.3 OBJECT:32:_ZTIPKv@@CXXABI_1.3 OBJECT:32:_ZTIPKw@@CXXABI_1.3 OBJECT:32:_ZTIPKx@@CXXABI_1.3 OBJECT:32:_ZTIPKy@@CXXABI_1.3 OBJECT:32:_ZTIPa@@CXXABI_1.3 OBJECT:32:_ZTIPb@@CXXABI_1.3 OBJECT:32:_ZTIPc@@CXXABI_1.3 OBJECT:32:_ZTIPd@@CXXABI_1.3 OBJECT:32:_ZTIPe@@CXXABI_1.3 OBJECT:32:_ZTIPf@@CXXABI_1.3 
-OBJECT:32:_ZTIPg@@CXXABI_1.3.9 +OBJECT:32:_ZTIPg@@CXXABI_FLOAT128_1.3.9 OBJECT:32:_ZTIPh@@CXXABI_1.3 OBJECT:32:_ZTIPi@@CXXABI_1.3 OBJECT:32:_ZTIPj@@CXXABI_1.3 OBJECT:32:_ZTIPl@@CXXABI_1.3 OBJECT:32:_ZTIPm@@CXXABI_1.3 OBJECT:32:_ZTIPn@@CXXABI_1.3.5 OBJECT:32:_ZTIPo@@CXXABI_1.3.5 OBJECT:32:_ZTIPs@@CXXABI_1.3 OBJECT:32:_ZTIPt@@CXXABI_1.3 OBJECT:32:_ZTIPv@@CXXABI_1.3 @@ -3228,21 +3229,21 @@ OBJECT:39:_ZTSSt13basic_filebufIwSt11cha OBJECT:39:_ZTSSt13basic_fstreamIcSt11char_traitsIcEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_fstreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_istreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:39:_ZTSSt13basic_ostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4 OBJECT:3:_ZTSPa@@CXXABI_1.3 OBJECT:3:_ZTSPb@@CXXABI_1.3 OBJECT:3:_ZTSPc@@CXXABI_1.3 OBJECT:3:_ZTSPd@@CXXABI_1.3
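For reference, the new float128.ver presumably pins just the __float128 typeinfo/typename symbols to the dedicated version node, in ordinary GNU ld version-script syntax. The sketch below is inferred from the symbols retagged in the baseline diff above, not copied from the committed file:

```
# Sketch: symbols exported only when __float128 is supported.
CXXABI_FLOAT128_1.3.9 {
  global:
    _ZTIg;    # typeinfo for __float128
    _ZTIPg;   # typeinfo for __float128*
    _ZTIPKg;  # typeinfo for __float128 const*
    _ZTSg;    # typeinfo name for __float128
};
```

Keeping these in a separate node lets configure include the file only on targets where __float128 exists, which is what the configure.ac change selects.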
[wide-int 7/8] Undo some changes from trunk
This patch undoes a few assorted differences from trunk. For fold-const.c the old code was:

/* If INNER is a right shift of a constant and it plus BITNUM does not overflow, adjust BITNUM and INNER. */
if (TREE_CODE (inner) == RSHIFT_EXPR
    && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
    && tree_fits_uhwi_p (TREE_OPERAND (inner, 1))
    && bitnum < TYPE_PRECISION (type)
    && (tree_to_uhwi (TREE_OPERAND (inner, 1))
        < (unsigned) (TYPE_PRECISION (type) - bitnum)))
  {
    bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
    inner = TREE_OPERAND (inner, 0);
  }

and we lost the bitnum range test. The gimple-fold.c change contained an unrelated stylistic change that makes the code a bit less efficient. For ipa-prop.c we should convert to a HOST_WIDE_INT before multiplying, like trunk does. It doesn't change the result and is more efficient. objc-act.c contains three copies of the same code. The check for 0 was kept in the third but not the first two. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard

Index: gcc/fold-const.c
===
--- gcc/fold-const.c 2014-04-22 21:00:26.921619127 +0100
+++ gcc/fold-const.c 2014-04-22 21:00:27.317622218 +0100
@@ -6581,8 +6581,9 @@ fold_single_bit_test (location_t loc, en
 not overflow, adjust BITNUM and INNER.
*/
 if (TREE_CODE (inner) == RSHIFT_EXPR
 && TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST
- && wi::ltu_p (wi::to_widest (TREE_OPERAND (inner, 1)) + bitnum,
- TYPE_PRECISION (type)))
+ && bitnum < TYPE_PRECISION (type)
+ && wi::ltu_p (TREE_OPERAND (inner, 1),
+ TYPE_PRECISION (type) - bitnum))
 {
 bitnum += tree_to_uhwi (TREE_OPERAND (inner, 1));
 inner = TREE_OPERAND (inner, 0);

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c 2014-04-22 20:58:26.869682704 +0100
+++ gcc/gimple-fold.c 2014-04-22 21:00:27.31866 +0100
@@ -3163,12 +3163,13 @@ fold_const_aggregate_ref_1 (tree t, tree
 && (idx = (*valueize) (TREE_OPERAND (t, 1)))
 && TREE_CODE (idx) == INTEGER_CST)
 {
- tree low_bound = array_ref_low_bound (t);
- tree unit_size = array_ref_element_size (t);
+ tree low_bound, unit_size;
 /* If the resulting bit-offset is constant, track it. */
- if (TREE_CODE (low_bound) == INTEGER_CST
- && tree_fits_uhwi_p (unit_size))
+ if ((low_bound = array_ref_low_bound (t),
+ TREE_CODE (low_bound) == INTEGER_CST)
+ && (unit_size = array_ref_element_size (t),
+ tree_fits_uhwi_p (unit_size)))
 {
 offset_int woffset = wi::sext (wi::to_offset (idx) - wi::to_offset (low_bound),

Index: gcc/ipa-prop.c
===
--- gcc/ipa-prop.c 2014-04-22 20:58:26.869682704 +0100
+++ gcc/ipa-prop.c 2014-04-22 21:00:27.319622234 +0100
@@ -3787,8 +3787,8 @@ ipa_modify_call_arguments (struct cgraph
 if (TYPE_ALIGN (type) > align)
 align = TYPE_ALIGN (type);
 }
- misalign += (offset_int::from (off, SIGNED)
- * BITS_PER_UNIT).to_short_addr ();
+ misalign += (offset_int::from (off, SIGNED).to_short_addr ()
+ * BITS_PER_UNIT);
 misalign = misalign & (align - 1);
 if (misalign != 0)
 align = (misalign & -misalign);

Index: gcc/objc/objc-act.c
===
--- gcc/objc/objc-act.c 2014-04-22 20:58:26.869682704 +0100
+++ gcc/objc/objc-act.c 2014-04-22 21:00:27.320622242 +0100
@@ -4882,7 +4882,9 @@ objc_decl_method_attributes (tree *node,
 which specifies the index of the format string argument. Add 2.
*/
 number = TREE_VALUE (second_argument);
- if (number && TREE_CODE (number) == INTEGER_CST)
+ if (number
+ && TREE_CODE (number) == INTEGER_CST
+ && !wi::eq_p (number, 0))
 TREE_VALUE (second_argument)
 = wide_int_to_tree (TREE_TYPE (number),
 wi::add (number, 2));
@@ -4893,7 +4895,9 @@ objc_decl_method_attributes (tree *node,
 in which case we don't need to add 2. Add 2 if not 0. */
 number = TREE_VALUE (third_argument);
- if (number && TREE_CODE (number) == INTEGER_CST)
+ if (number
+ && TREE_CODE (number) == INTEGER_CST
+ && !wi::eq_p (number, 0))
 TREE_VALUE (third_argument)
[wide-int 8/8] Formatting and typo fixes
Almost obvious, but just in case... The first mem_loc_descriptor hunk just reflows the text so that the line breaks are less awkward. Tested on x86_64-linux-gnu. OK to install? Thanks, Richard Index: gcc/doc/rtl.texi === --- gcc/doc/rtl.texi2014-04-22 21:08:26.002367845 +0100 +++ gcc/doc/rtl.texi2014-04-22 21:13:54.343668582 +0100 @@ -1553,7 +1553,7 @@ neither inherently signed nor inherently signedness is determined by the rtl operation instead. On more modern ports, @code{CONST_DOUBLE} only represents floating -point values. New ports define to @code{TARGET_SUPPORTS_WIDE_INT} to +point values. New ports define @code{TARGET_SUPPORTS_WIDE_INT} to make this designation. @findex CONST_DOUBLE_LOW @@ -1571,7 +1571,7 @@ the precise bit pattern used by the targ @findex CONST_WIDE_INT @item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{}) -This contains an array of @code{HOST_WIDE_INTS} that is large enough +This contains an array of @code{HOST_WIDE_INT}s that is large enough to hold any constant that can be represented on the target. This form of rtl is only used on targets that define @code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c 2014-04-22 21:13:54.297668148 +0100 +++ gcc/dwarf2out.c 2014-04-22 21:13:54.337668526 +0100 @@ -12911,14 +12911,13 @@ mem_loc_descriptor (rtx rtl, enum machin dw_die_ref type_die; /* Note that if TARGET_SUPPORTS_WIDE_INT == 0, a -CONST_DOUBLE rtx could represent either an large integer -or a floating-point constant. If -TARGET_SUPPORTS_WIDE_INT != 0, the value is always a -floating point constant. +CONST_DOUBLE rtx could represent either a large integer +or a floating-point constant. If TARGET_SUPPORTS_WIDE_INT != 0, +the value is always a floating point constant. When it is an integer, a CONST_DOUBLE is used whenever -the constant requires 2 HWIs to be adequately -represented. We output CONST_DOUBLEs as blocks. */ +the constant requires 2 HWIs to be adequately represented. 
+We output CONST_DOUBLEs as blocks. */
 if (mode == VOIDmode
 || (GET_MODE (rtl) == VOIDmode
 && GET_MODE_BITSIZE (mode) != HOST_BITS_PER_DOUBLE_INT))
@@ -15147,9 +15146,9 @@ insert_wide_int (const wide_int &val, un
 }
 /* We'd have to extend this code to support odd sizes. */
- gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT) == 0);
+ gcc_assert (elt_size % (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT) == 0);
- int n = elt_size / (HOST_BITS_PER_WIDE_INT/BITS_PER_UNIT);
+ int n = elt_size / (HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT);
 if (WORDS_BIG_ENDIAN)
 for (i = n - 1; i >= 0; i--)

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c 2014-04-22 21:08:26.002367845 +0100
+++ gcc/emit-rtl.c 2014-04-22 21:13:54.338668535 +0100
@@ -213,8 +213,8 @@ const_wide_int_htab_hash (const void *x)
 const_wide_int_htab_eq (const void *x, const void *y)
 {
 int i;
- const_rtx xr = (const_rtx)x;
- const_rtx yr = (const_rtx)y;
+ const_rtx xr = (const_rtx) x;
+ const_rtx yr = (const_rtx) y;
 if (CONST_WIDE_INT_NUNITS (xr) != CONST_WIDE_INT_NUNITS (yr))
 return 0;

Index: gcc/fold-const.c
===
--- gcc/fold-const.c 2014-04-22 21:13:54.308668252 +0100
+++ gcc/fold-const.c 2014-04-22 21:13:54.340668554 +0100
@@ -1775,7 +1775,7 @@ fold_convert_const_fixed_from_int (tree
 di.low = TREE_INT_CST_ELT (arg1, 0);
 if (TREE_INT_CST_NUNITS (arg1) == 1)
-di.high = (HOST_WIDE_INT)di.low < 0 ? (HOST_WIDE_INT)-1 : 0;
+di.high = (HOST_WIDE_INT) di.low < 0 ? (HOST_WIDE_INT) -1 : 0;
 else
 di.high = TREE_INT_CST_ELT (arg1, 1);

Index: gcc/rtl.c
===
--- gcc/rtl.c 2014-04-22 21:08:26.002367845 +0100
+++ gcc/rtl.c 2014-04-22 21:13:54.341668564 +0100
@@ -232,7 +232,7 @@ cwi_output_hex (FILE *outfile, const_rtx
 {
 int i = CWI_GET_NUM_ELEM (x);
 gcc_assert (i > 0);
- if (CWI_ELT (x, i-1) == 0)
+ if (CWI_ELT (x, i - 1) == 0)
 /* The HOST_WIDE_INT_PRINT_HEX prepends a 0x only if the val is non zero. We want all numbers to have a 0x prefix.
*/
 fprintf (outfile, "0x");

Index: gcc/rtl.h
===
--- gcc/rtl.h 2014-04-22 21:08:26.002367845 +0100
+++ gcc/rtl.h 2014-04-22 21:13:54.341668564 +0100
@@ -348,7 +348,7 @@ struct GTY((chain_next ("RTX_NEXT (&%h)
 union {
 /* The final union field is aligned to 64 bits on LP64 hosts,
- giving a 32-bit gap after the fields above. We