Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store
Thomas Preud'homme thomas.preudho...@arm.com writes:

From: Joseph Myers [mailto:jos...@codesourcery.com]

+ if { [is-effective-target bswap]
+      && ![istarget x86_64-*-*] } {

That x86_64-*-* test is wrong. x86_64-*-* and i?86-*-* should always be
handled the same (if you then want to distinguish 32-bit and 64-bit
multilibs, you check the appropriate effective-target there, depending on
whether the condition is one on the ABI or which register size is being
used, which affects how x32 should be counted).

Indeed, it's a mistake. i?86 should be in there too. Please find attached
an updated patch.

diff --git a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
index 7d557f3..a9c3443 100644
--- a/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
+++ b/gcc/testsuite/gcc.dg/optimize-bswapdi-1.c
@@ -1,6 +1,6 @@
-/* { dg-do compile { target arm*-*-* alpha*-*-* ia64*-*-* x86_64-*-* s390x-*-* powerpc*-*-* rs6000-*-* } } */
+/* { dg-do compile { target *-*-* } } */

Just omit the { target *-*-* } completely, also a few more times.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Fix ipa-devirt ICE
On Thu, 3 Apr 2014, Jan Hubicka wrote: + /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent + /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent + OTR_TOKEN == INT_MAX is used to mark calls that are provably Did you mean provably instead of probably in the first two? -- Marc Glisse
[Ping][Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation
Hello Eric, Would you please review my patch at http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01582.html? Thanks. BR, Terry
Re: RFA: PATCH to add -fno-gnu-unique for c++/60731
On Wed, Apr 2, 2014 at 9:24 PM, Jason Merrill ja...@redhat.com wrote: Use of STB_GNU_UNIQUE to avoid problems with variable symbols shared between two RTLD_LOCAL plugins and a common library dependency causes problems with libraries that depend on dlclose/dlopen to reinitialize state. This patch adds a -fno-gnu-unique flag that such libraries can use. Tested x86_64-pc-linux-gnu. OK for trunk? Ok. Can you add a testcase as well please? Thanks, Richard.
Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store
Thomas Preud'homme thomas.preudho...@arm.com writes: +# Return 1 if the target supports byte swap instructions. + +proc check_effective_target_bswap { } { +global et_bswap_saved + +if [info exists et_bswap_saved] { +verbose check_effective_target_bswap: using cached result 2 +} else { + set et_bswap_saved 0 + if { [istarget aarch64-*-*] + || [istarget alpha*-*-*] + || [istarget arm*-*-*] + || [istarget i?86-*-*] + || [istarget powerpc*-*-*] + || [istarget rs6000-*-*] + || [istarget s390*-*-*] + || [istarget x86_64-*-*] } { Please add m68k-*-*. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 And now for something completely different.
[PATCH][LTO] Reduce WPA memory usage
This reduces WPA memory usage at stream-out time by not allocating
the streamer cache node array and by freeing the global out-decl-states
hash tables (we do that already for the fn-decl-states).

LTO bootstrapped and bootstrapped on x86_64-unknown-linux-gnu, testing
in progress.

Ok?

Not sure if it will make a notable difference. The pointer-map overhead
is at least 4 times that of the vector. If we make pointer-map behave
like hash_table (3/4 full) or htab_t (half full) then that would improve
(we could even make that configurable at pointer-set/map construction
time). Like with

Index: gcc/pointer-set.c
===================================================================
--- gcc/pointer-set.c (revision 209018)
+++ gcc/pointer-set.c (working copy)
@@ -125,7 +125,7 @@ pointer_set_insert (struct pointer_set_t

   /* For simplicity, expand the set even if P is already there.  This can be
      superfluous but can happen at most once.  */
-  if (pset->n_elements > pset->n_slots / 4)
+  if (pset->n_elements * 4 > pset->n_slots * 3)
     {
       size_t old_n_slots = pset->n_slots;
       const void **old_slots = pset->slots;
Index: gcc/pointer-set.h
===================================================================
--- gcc/pointer-set.h (revision 209018)
+++ gcc/pointer-set.h (working copy)
@@ -109,7 +109,7 @@ pointer_map<T>::insert (const void *p, b
   /* For simplicity, expand the map even if P is already there.  This can be
      superfluous but can happen at most once.  */
   /* ???  Fugly that we have to inline that here.  */
-  if (n_elements > n_slots / 4)
+  if (n_elements * 4 > n_slots * 3)
     {
       size_t old_n_slots = n_slots;
       const void **old_keys = slots;

It might be worth checking how much memory we save from the above.

Thanks,
Richard.

2014-04-03  Richard Biener  rguent...@suse.de

    * tree-streamer.h (struct streamer_tree_cache_d): Add next_idx
    member.
    (streamer_tree_cache_create): Adjust.
    * tree-streamer.c (streamer_tree_cache_add_to_node_array):
    Adjust to allow optional nodes array.
    (streamer_tree_cache_insert_1): Use next_idx to assign idx.
    (streamer_tree_cache_append): Likewise.
(streamer_tree_cache_create): Create nodes array optionally as specified by parameter. * lto-streamer-out.c (create_output_block): Avoid maintaining the node array in the writer cache. (DFS_write_tree): Remove assertion. (produce_asm_for_decls): Free the out decl state hash table early. * lto-streamer-in.c (lto_data_in_create): Adjust for streamer_tree_cache_create prototype change. Index: gcc/tree-streamer.c === *** gcc/tree-streamer.c (revision 209018) --- gcc/tree-streamer.c (working copy) *** static void *** 101,120 streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache, unsigned ix, tree t, hashval_t hash) { ! /* Make sure we're either replacing an old element or ! appending consecutively. */ ! gcc_assert (ix = cache-nodes.length ()); ! ! if (ix == cache-nodes.length ()) { ! cache-nodes.safe_push (t); ! if (cache-hashes.exists ()) ! cache-hashes.safe_push (hash); } ! else { ! cache-nodes[ix] = t; ! if (cache-hashes.exists ()) cache-hashes[ix] = hash; } } --- 101,119 streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache, unsigned ix, tree t, hashval_t hash) { ! /* We're either replacing an old element or appending consecutively. */ ! if (cache-nodes.exists ()) { ! if (cache-nodes.length () == ix) ! cache-nodes.safe_push (t); ! else ! cache-nodes[ix] = t; } ! if (cache-hashes.exists ()) { ! if (cache-hashes.length () == ix) ! cache-hashes.safe_push (hash); ! else cache-hashes[ix] = hash; } } *** streamer_tree_cache_insert_1 (struct str *** 146,152 { /* Determine the next slot to use in the cache. */ if (insert_at_next_slot_p) ! ix = cache-nodes.length (); else ix = *ix_p; *slot = ix; --- 145,151 { /* Determine the next slot to use in the cache. */ if (insert_at_next_slot_p) ! ix = cache-next_idx++; else ix = *ix_p; *slot = ix; *** void *** 211,217 streamer_tree_cache_append (struct streamer_tree_cache_d *cache, tree t, hashval_t hash) { ! 
unsigned ix = cache-nodes.length (); if (!cache-node_map) streamer_tree_cache_add_to_node_array (cache, ix, t, hash); else --- 210,216 streamer_tree_cache_append (struct streamer_tree_cache_d *cache, tree t, hashval_t hash) { !
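To make the load-factor remark in the message above concrete, here is a minimal sketch in C. It is not the pointer-set code itself: the function names, the starting size of 16 slots and the doubling policy are assumptions for illustration, and the comparison operators are assumed to be `>` because the archive copy of the diff has lost them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Current policy as read from the diff: expand once the table is more
   than 1/4 full.  */
bool needs_expand_quarter(size_t n_elements, size_t n_slots)
{
    return n_elements > n_slots / 4;
}

/* Suggested policy: expand once the table is more than 3/4 full.
   Cross-multiplying keeps the test in integer arithmetic.  */
bool needs_expand_three_quarters(size_t n_elements, size_t n_slots)
{
    return n_elements * 4 > n_slots * 3;
}

/* Hypothetical helper: number of slots after inserting COUNT distinct
   elements one at a time, doubling whenever the policy asks for it.
   The check runs before each insertion, as in the patched code.  */
size_t slots_after_inserts(size_t count, size_t initial_slots,
                           bool (*needs_expand)(size_t, size_t))
{
    size_t n_slots = initial_slots;
    for (size_t n = 0; n < count; ++n) {
        if (needs_expand(n, n_slots))
            n_slots *= 2;
    }
    return n_slots;
}
```

Under these assumptions, 700 insertions into a 16-slot table end at 4096 slots with the 1/4 policy and 1024 slots with the 3/4 policy. Whether that translates into a notable WPA saving is exactly what the message asks to be measured.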
[PATCH] Fix PR60740
The following fixes the graphite ICE that results from
stmt_simple_for_scop_p not walking all GIMPLE_COND operands but
only SSA name ones.

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2014-04-03  Richard Biener  rguent...@suse.de

    PR tree-optimization/60740
    * graphite-scop-detection.c (stmt_simple_for_scop_p): Iterate
    over all GIMPLE_COND operands.

    * gcc.dg/graphite/pr60740.c: New testcase.

Index: gcc/graphite-scop-detection.c
===================================================================
*** gcc/graphite-scop-detection.c (revision 209018)
--- gcc/graphite-scop-detection.c (working copy)
*** stmt_simple_for_scop_p (basic_block scop
*** 346,358 ****
      case GIMPLE_COND:
        {
-         tree op;
-         ssa_op_iter op_iter;
-         enum tree_code code = gimple_cond_code (stmt);
-
          /* We can handle all binary comparisons.  Inequalities are
             also supported as they can be represented with union of
             polyhedra.  */
          if (!(code == LT_EXPR
                || code == GT_EXPR
                || code == LE_EXPR
--- 346,355 ----
      case GIMPLE_COND:
        {
          /* We can handle all binary comparisons.  Inequalities are
             also supported as they can be represented with union of
             polyhedra.  */
+         enum tree_code code = gimple_cond_code (stmt);
          if (!(code == LT_EXPR
                || code == GT_EXPR
                || code == LE_EXPR
*** stmt_simple_for_scop_p (basic_block scop
*** 361,371 ****
                || code == NE_EXPR))
            return false;

!         FOR_EACH_SSA_TREE_OPERAND (op, stmt, op_iter, SSA_OP_ALL_USES)
!           if (!graphite_can_represent_expr (scop_entry, loop, op)
!               /* We can not handle REAL_TYPE. Failed for pr39260.  */
!               || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
!             return false;

          return true;
        }
--- 358,371 ----
                || code == NE_EXPR))
            return false;

!         for (unsigned i = 0; i < 2; ++i)
!           {
!             tree op = gimple_op (stmt, i);
!             if (!graphite_can_represent_expr (scop_entry, loop, op)
!                 /* We can not handle REAL_TYPE. Failed for pr39260.  */
!                 || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
!               return false;
!           }

          return true;
        }
Index: gcc/testsuite/gcc.dg/graphite/pr60740.c
===================================================================
*** gcc/testsuite/gcc.dg/graphite/pr60740.c (revision 0)
--- gcc/testsuite/gcc.dg/graphite/pr60740.c (working copy)
***************
*** 0 ****
--- 1,16 ----
+ /* { dg-options "-O2 -floop-interchange" } */
+
+ int **db6 = 0;
+
+ void
+ k26(void)
+ {
+   static int geb = 0;
+   int *a22 = &geb;
+   int **l30 = &a22;
+   int *c4b;
+   int ndf;
+   for (ndf = 0; ndf <= 1; ++ndf)
+     *c4b = (db6 == l30) && (*a22)--;
+ }
+
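The difference between the two loops in the patch above can be illustrated with a toy model in C. None of this is GCC code: the `operand` and `condition` types and both function names are hypothetical, and the assumption that the offending operand is something other than an SSA name (for example an address) is only inferred from the description in the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the operand kinds a condition can have.  */
enum operand_kind { OPERAND_SSA_NAME, OPERAND_CONSTANT, OPERAND_ADDRESS };

struct operand {
    enum operand_kind kind;
    bool representable;   /* would the polyhedral model accept it?  */
};

/* A binary comparison has exactly two operands.  */
struct condition {
    struct operand ops[2];
};

/* Models the old behaviour: only SSA-name operands are inspected, so a
   non-representable operand of another kind slips through.  */
bool simple_checking_ssa_only(const struct condition *c)
{
    for (size_t i = 0; i < 2; ++i)
        if (c->ops[i].kind == OPERAND_SSA_NAME && !c->ops[i].representable)
            return false;
    return true;
}

/* Models the patched behaviour: both operands are inspected
   regardless of their kind.  */
bool simple_checking_all(const struct condition *c)
{
    for (size_t i = 0; i < 2; ++i)
        if (!c->ops[i].representable)
            return false;
    return true;
}
```

For a condition whose second operand is a non-representable address, the first function accepts the statement and the second rejects it, which is the kind of early rejection the patch appears to restore.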
[PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
Hi This bug causes the compiler to create a Thumb-2 TBB instruction with a jump table containing an out of range value in a .byte field: whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100 This occurs because the jump table is followed with a .align 1 due to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for the space taken by this align directive. This patch addresses the issue by removing ASM_OUTPUT_CASE_END from arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is instead inserted by aligning the label following the barrier which follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER appropriately. Bootstrapped/checked on arm-unknown-linux-gnueabihf. OK for trunk, and backporting to 4.8? 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after ADDR_DIFF_VEC. 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * g++.dg/torture/pr60609.C: New test. From 9b0c1ada23e2b210b02ebaee2f599bb5205a91d6 Mon Sep 17 00:00:00 2001 From: Charles Baylis charles.bay...@linaro.org Date: Thu, 3 Apr 2014 10:57:33 +0100 Subject: [PATCH] fix for PR target/60609 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after ADDR_DIFF_VEC. 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * g++.dg/torture/pr60609.C: New test. 
--- gcc/config/arm/arm.h | 11 +- gcc/testsuite/g++.dg/torture/pr60609.C | 252 + 2 files changed, 255 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/pr60609.C diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 7ca47a7..a4bbd12 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2194,14 +2194,9 @@ extern int making_const_table; #undef ASM_OUTPUT_BEFORE_CASE_LABEL #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE) /* Empty. */ -/* Make sure subsequent insns are aligned after a TBB. */ -#define ASM_OUTPUT_CASE_END(FILE, NUM, JUMPTABLE) \ - do \ -{ \ - if (GET_MODE (PATTERN (JUMPTABLE)) == QImode) \ - ASM_OUTPUT_ALIGN (FILE, 1); \ -} \ - while (0) +#define LABEL_ALIGN_AFTER_BARRIER(LABEL)\ + (GET_CODE (PATTERN (prev_active_insn (LABEL))) == ADDR_DIFF_VEC \ + ? 1 : 0) #define ARM_DECLARE_FUNCTION_NAME(STREAM, NAME, DECL) \ do \ diff --git a/gcc/testsuite/g++.dg/torture/pr60609.C b/gcc/testsuite/g++.dg/torture/pr60609.C new file mode 100644 index 000..9ddec0b --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr60609.C @@ -0,0 +1,252 @@ +/* { dg-do assemble } */ + +class exception +{ +}; +class bad_alloc:exception +{ +}; +class logic_error:exception +{ +}; +class domain_error:logic_error +{ +}; +class invalid_argument:logic_error +{ +}; +class length_error:logic_error +{ +}; +class overflow_error:exception +{ +}; +typedef int mpz_t[]; +template class class __gmp_expr; +template class __gmp_expr mpz_t +{ +~__gmp_expr (); +}; + +class PIP_Solution_Node; +class internal_exception +{ +~internal_exception (); +}; +class not_an_integer:internal_exception +{ +}; +class not_a_variable:internal_exception +{ +}; +class not_an_optimization_mode:internal_exception +{ +}; +class not_a_bounded_integer_type_width:internal_exception +{ +}; +class not_a_bounded_integer_type_representation:internal_exception +{ +}; +class not_a_bounded_integer_type_overflow:internal_exception +{ +}; +class 
not_a_complexity_class:internal_exception +{ +}; +class not_a_control_parameter_name:internal_exception +{ +}; +class not_a_control_parameter_value:internal_exception +{ +}; +class not_a_pip_problem_control_parameter_name:internal_exception +{ +}; +class not_a_pip_problem_control_parameter_value:internal_exception +{ +}; +class not_a_relation:internal_exception +{ +}; +class ppl_handle_mismatch:internal_exception +{ +}; +class timeout_exception +{ +~timeout_exception (); +}; +class deterministic_timeout_exception:timeout_exception +{ +}; +void __assert_fail (const char *, const char *, int, int *) +__attribute__ ((__noreturn__)); +void PL_get_pointer (void *); +int Prolog_is_address (); +inline int +Prolog_get_address (void **p1) +{ +Prolog_is_address ()? static_cast +void (0) : __assert_fail (Prolog_is_address, ./swi_cfli.hh, 0, 0); +PL_get_pointer (p1); +return 0; +} + +class non_linear:internal_exception +{ +}; +class not_unsigned_integer:internal_exception +{ +}; +class not_universe_or_empty:internal_exception +{ +}; +class not_a_nil_terminated_list:internal_exception +{ +}; +class PPL_integer_out_of_range +{ +__gmp_expr mpz_t n; +}; +void handle_exception (); +template typename T T * term_to_handle
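The assembler error described in the PR60609 message can be reproduced arithmetically. A TBB table entry is a single unsigned byte holding half of the forward branch distance, so the largest encodable value is 255. The sketch below, in C, uses hypothetical helper names and illustrative byte counts (a 5-byte table followed by 506 bytes of code) that are not taken from the bug report; it only shows how one uncounted padding byte can push an entry from 255 to 256.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round OFFSET up to a multiple of (1 << LOG2_ALIGN), like ".align 1"
   does for LOG2_ALIGN == 1.  */
uint32_t align_up(uint32_t offset, unsigned log2_align)
{
    uint32_t mask = (1u << log2_align) - 1u;
    return (offset + mask) & ~mask;
}

/* A TBB entry stores byte_offset / 2 in one unsigned byte.  */
bool tbb_entry_fits(uint32_t byte_offset)
{
    return byte_offset / 2 <= 255u;
}

/* Distance from the start of the jump table to a target that sits
   CODE_BYTES after the end of the table.  COUNT_ALIGNMENT selects
   whether the padding emitted after the table is taken into account.  */
uint32_t target_offset(uint32_t table_bytes, uint32_t code_bytes,
                       bool count_alignment)
{
    uint32_t table_end = count_alignment ? align_up(table_bytes, 1)
                                         : table_bytes;
    return table_end + code_bytes;
}
```

With these numbers the estimate that ignores the padding is 511 bytes, which still fits, while the real distance is 512 bytes, which needs the value 256 and does not fit. Expressing the padding as label alignment, as the patch does, lets the length computation see it.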
[PATCH][LTO] Fix(?) parallel WPA memory unsharing
The following fixes(?) parallel WPA memory unsharing caused by streamer_write_chain writing to TREE_CHAIN (for no good reason). The patch removes this historical code. LTO bootstrap and testing running on x86_64-unknown-linux-gnu. Richard. 2014-04-03 Richard Biener rguent...@suse.de * tree-streamer-out.c (streamer_write_chain): Do not temporarily set TREE_CHAIN to NULL_TREE. Index: gcc/tree-streamer-out.c === --- gcc/tree-streamer-out.c (revision 209054) +++ gcc/tree-streamer-out.c (working copy) @@ -523,13 +523,6 @@ streamer_write_chain (struct output_bloc { while (t) { - tree saved_chain; - - /* Clear TREE_CHAIN to avoid blindly recursing into the rest -of the list. */ - saved_chain = TREE_CHAIN (t); - TREE_CHAIN (t) = NULL_TREE; - /* We avoid outputting external vars or functions by reference to the global decls section as we do not want to have them enter decl merging. This is, of course, only for the call @@ -541,7 +534,6 @@ streamer_write_chain (struct output_bloc else stream_write_tree (ob, t, ref_p); - TREE_CHAIN (t) = saved_chain; t = TREE_CHAIN (t); }
Re: [Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation
I find the GCC function simplify_subreg fails to simplify rtx

  (subreg:SI (and:DI (reg/v:DI 115 [ a ]) (const_int 4294967295 [0xffffffff])) 4)

to zero during the fwprop1 pass, considering the fact that the high 32-bit
part of (a & 0xffffffff) is zero. This leads to some unnecessary
multiplications for the high 32-bit part of the result of the AND
operation. The attached patch tries to improve simplify_rtx to handle
such a case. Other targets like x86 do not seem to have this issue because
they generate different RTX to handle 64-bit multiplication on a 32-bit
machine.

See http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00073.html for another try,
which led to the simplification in combine.c:combine_simplify_rtx line 5448.

Your variant is both more general, because it isn't restricted to the
lowpart, and less general, because it is artificially restricted to AND.

Some remarks:
 - this needs to be restricted to non-paradoxical subregs,
 - you need to test HWI_COMPUTABLE_MODE_P (innermode),
 - you need to test !side_effects_p (op).

I think we need to find a common ground between Jakub's patch and yours and
put a single transformation in simplify_subreg.

--
Eric Botcazou
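The arithmetic fact behind the requested simplification can be checked directly in C. This is only a sketch of the value-level reasoning, not of simplify_subreg: the function names are hypothetical, and it assumes a little-endian word order so that byte offset 4 selects the high 32 bits of the 64-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* The 32-bit word of X starting at BYTE_OFFSET (0 or 4), assuming
   little-endian word order.  */
uint32_t word_at(uint64_t x, unsigned byte_offset)
{
    return (uint32_t)(x >> (byte_offset * 8));
}

/* The operation from the message: a & 0xffffffff.  */
uint64_t mask_low32(uint64_t a)
{
    return a & UINT64_C(0xffffffff);
}

/* For a constant MASK, the selected word of (a & MASK) is zero for
   every A exactly when the corresponding word of MASK is zero.  */
bool word_known_zero(uint64_t mask, unsigned byte_offset)
{
    return word_at(mask, byte_offset) == 0;
}
```

For any value of `a`, `word_at(mask_low32(a), 4)` is 0, while the word at offset 0 keeps the low half of `a`. The conditions listed in the review (non-paradoxical subregs, HWI_COMPUTABLE_MODE_P, no side effects) are outside what this value-level sketch models.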
Re: [4.8, PATCH 0/26] Backport Power8 and LE support
On Wed, Mar 19, 2014 at 3:23 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, Support for Power8 features and the new powerpc64le-linux-gnu target, including the ELFv2 ABI, has been developed up till now on the ibm/gcc-4_8-branch. It was appropriate to use this separate branch while the support was unstable, but this branch will not represent a particularly good support mechanism for distributions going forward. Most distros are set up to pull from the major release branches, and having a separate branch for one target is quite inconvenient. Also, the ibm/gcc-4_8-branch's original purpose is to serve as the code base for IBM's Advance Toolchain 7.0. Over time the two purposes that the branch currently serves will diverge and make things even more complicated. The code is now tested and stable enough that we are ready to backport this support to the FSF 4.8 branch. This patch series constitutes that backport. Almost all of the changes are specific to PowerPC portions of the code, and for those patches I am only CCing David. However, some of the patches require changes to common code, and for these I will CC Richard and Jakub. Three of these are slightly unrelated but necessary patches, one to enable decimal float ABS builtins, and two others to fix PR54537 and PR56843. In addition there are patches that update configuration files throughout for the new target, and some small changes in common call support (call.c, expr.h, function.c) to support how the new ABI handles calls. I realize it is unusual to backport such a large amount of code, but we have been asked by distribution partners to do this, and we feel it makes good sense for long-term support. 
I have tested the patch series by applying it to a clean FSF 4.8 branch and comparing the test results against those from the IBM 4.8 branch on three systems: * Power8, little endian (--mcpu=power8) * Power8, big endian (--mcpu=power8) * Power7, big endian (--mcpu=power7) I also checked a recursive diff against the two source directories to ensure that no patches were missed. Thanks, Bill [ 1/26] diff-p8 [ 2/26] diff-p8-htm [ 3/26] diff-le-config [ 4/26] diff-le-libtool [ 5/26] diff-le-tests [ 6/26] diff-le-dfp [ 7/26] diff-le-vector [ 8/26] diff-abi-compat [ 9/26] diff-abi-calls [10/26] diff-abi-elfv2 [11/26] diff-abi-gotest [12/26] diff-le-align [13/26] diff-abi-libffi [14/26] diff-dfp-abs [15/26] diff-pr54537 [16/26] diff-pr56843 [17/26] diff-direct-move [18/26] diff-le-config-2 [19/26] diff-quad-memory [20/26] diff-lra [21/26] diff-le-vector-api [22/26] diff-mcall [23/26] diff-pr60137-pr60203 [24/26] diff-reload [25/26] diff-v1ti [26/26] diff-trunk-missing With the positive feedback from Darwin and RTEMS, the additional backports for AIX and the bug fix for SPE, I am going to approve this patch series. There is a remaining issue with e600, but IBM LTC cannot reproduce it. If IBM can get more information, it can be addressed in a later patch to trunk and 4.8 branch. Thanks, David
Re: [4.8, PATCH 2/26] Backport Power8 and LE support: HTM support
On Wed, Mar 19, 2014 at 3:25 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-p8-htm) backports hardware transactional memory support. Copying Jakub and Richard for the libitm support. Thanks, Bill [gcc] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-12-03 Peter Bergner berg...@vnet.ibm.com * config/rs6000/htmintrin.h (_TEXASR_INSTRUCTION_FETCH_CONFLICT): Fix typo in macro name. (_TEXASRU_INSTRUCTION_FETCH_CONFLICT): Likewise. Backport from mainline r205233. 2013-11-21 Peter Bergner berg...@vnet.ibm.com * doc/extend.texi: Document htm builtins. Backport from mainline 2013-07-17 Iain Sandoe i...@codesourcery.com * config/rs6000/darwin.h (REGISTER_NAMES): Add HTM registers. Backport from mainline 2013-07-16 Peter Bergner berg...@vnet.ibm.com * config/rs6000/rs6000.c (rs6000_option_override_internal): Do not enable extra ISA flags with TARGET_HTM. 2013-07-16 Jakub Jelinek ja...@redhat.com Peter Bergner berg...@vnet.ibm.com * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTERS): Mention HTM registers in the comment. (DWARF_FRAME_REGISTERS): Subtract also the 3 HTM registers. (DWARF_REG_TO_UNWIND_COLUMN): Use DWARF_FRAME_REGISTERS rather than FIRST_PSEUDO_REGISTERS. * config.gcc (powerpc*-*-*): Install htmintrin.h and htmxlintrin.h. * config/rs6000/t-rs6000 (MD_INCLUDES): Add htm.md. * config/rs6000/rs6000.opt: Add -mhtm option. * config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_HTM. (ISA_2_7_MASKS_SERVER): Add OPTION_MASK_HTM. * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define __HTM__ if the HTM instructions are available. * config/rs6000/predicates.md (u3bit_cint_operand, u10bit_cint_operand, htm_spr_reg_operand): New define_predicates. * config/rs6000/rs6000.md (define_attr type): Add htm. (TFHAR_REGNO, TFIAR_REGNO, TEXASR_REGNO): New define_constants. Include htm.md. 
* config/rs6000/rs6000-builtin.def (BU_HTM_0, BU_HTM_1, BU_HTM_2, BU_HTM_3, BU_HTM_SPR0, BU_HTM_SPR1): Add support macros for defining HTM builtin functions. * config/rs6000/rs6000.c (RS6000_BUILTIN_H): New macro. (rs6000_reg_names, alt_reg_names): Add HTM SPR register names. (rs6000_init_hard_regno_mode_ok): Add support for HTM instructions. (rs6000_builtin_mask_calculate): Likewise. (rs6000_option_override_internal): Likewise. (bdesc_htm): Add new HTM builtin support. (htm_spr_num): New function. (htm_spr_regno): Likewise. (rs6000_htm_spr_icode): Likewise. (htm_expand_builtin): Likewise. (htm_init_builtins): Likewise. (rs6000_expand_builtin): Add support for HTM builtin functions. (rs6000_init_builtins): Likewise. (rs6000_invalid_builtin, rs6000_opt_mask): Add support for -mhtm option. * config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mhtm. (TARGET_HTM, MASK_HTM): Define macros. (FIRST_PSEUDO_REGISTER): Adjust for new HTM SPR registers. (FIXED_REGISTERS): Likewise. (CALL_USED_REGISTERS): Likewise. (CALL_REALLY_USED_REGISTERS): Likewise. (REG_ALLOC_ORDER): Likewise. (enum reg_class): Likewise. (REG_CLASS_NAMES): Likewise. (REG_CLASS_CONTENTS): Likewise. (REGISTER_NAMES): Likewise. (ADDITIONAL_REGISTER_NAMES): Likewise. (RS6000_BTC_SPR, RS6000_BTC_VOID, RS6000_BTC_32BIT, RS6000_BTC_64BIT, RS6000_BTC_MISC_MASK, RS6000_BTM_HTM): New macros. (RS6000_BTM_COMMON): Add RS6000_BTM_HTM. * config/rs6000/htm.md: New file. * config/rs6000/htmintrin.h: New file. * config/rs6000/htmxlintrin.h: New file. [libitm] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline * acinclude.m4 (LIBITM_CHECK_AS_HTM): New. * configure: Rebuild. * configure.tgt (target_cpu): Add -mhtm to XCFLAGS. * config/powerpc/target.h: Include sys/auxv.h and htmintrin.h. (USE_HTM_FASTPATH): Define. (_TBEGIN_STARTED, _TBEGIN_INDETERMINATE, _TBEGIN_PERSISTENT, _HTM_RETRIES) New macros. 
(htm_abort, htm_abort_should_retry, htm_available, htm_begin, htm_init, htm_begin_success, htm_commit, htm_transaction_active): New functions. [gcc/testsuite] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline * lib/target-supports.exp (check_effective_target_powerpc_htm_ok): New function to test if HTM is available. * gcc.target/powerpc/htm-xl-intrin-1.c: New test. *
Re: [4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments
On Wed, Mar 19, 2014 at 11:25 AM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-tests) backports adjustments to a few tests for powerpc64le and the ELFv2 ABI. Thanks, Bill 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-11-27 Bill Schmidt wschm...@linux.vnet.ibm.com * gfortran.dg/nan_7.f90: Disable for little endian PowerPC. Backport from mainline r205106: 2013-11-20 Ulrich Weigand ulrich.weig...@de.ibm.com * gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe. Backport from mainline r205046: 2013-11-19 Ulrich Weigand ulrich.weig...@de.ibm.com * gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to construct parameter slot value in endian-independent way. (fcevv, fciievv, fcvevv): Use it. Okay. Thanks, David
Re: [4.8, PATCH 6/26] Backport Power8 and LE support: TDmode for LE
On Wed, Mar 19, 2014 at 3:29 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-dfp) backports fixes for TDmode on a little endian target. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r205123: 2013-11-20 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not allow subregs of TDmode in FPRs of smaller size in little-endian. (rs6000_split_multireg_move): When splitting an access to TDmode in FPRs, do not use simplify_gen_subreg. Backport from mainline r204927: 2013-11-17 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/rs6000.c (rs6000_emit_move): Use low word of sdmode_stack_slot also in little-endian mode. Okay. Thanks, David
Re: [4.8, PATCH 7/26] Backport Power8 and LE support: Vector LE
On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-vector) backports the changes to support vector infrastructure on powerpc64le. Copying Richard and Jakub for the libcpp bits. Thanks, Bill [gcc] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r205333 2013-11-24 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct for little endian. Backport from mainline r205241 2013-11-21 Bill Schmidt wschm...@vnet.ibm.com * config/rs6000/vector.md (vec_pack_trunc_v2df): Revert previous little endian change. (vec_pack_sfix_trunc_v2df): Likewise. (vec_pack_ufix_trunc_v2df): Likewise. * config/rs6000/rs6000.c (rs6000_expand_interleave): Correct double checking of endianness. Backport from mainline r205146 2013-11-20 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/vsx.md (vsx_set_mode): Adjust for little endian. (vsx_extract_mode): Likewise. (*vsx_extract_mode_one_le): New LE variant on *vsx_extract_mode_zero. (vsx_extract_v4sf): Adjust for little endian. Backport from mainline r205080 2013-11-19 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Adjust V16QI vector splat case for little endian. Backport from mainline r205045: 2013-11-19 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/vector.md (movmode): Do not call rs6000_emit_le_vsx_move to move into or out of GPRs. * config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Assert source and destination are not GPR hard regs. Backport from mainline r204920 2011-11-17 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_frame_related): Add split_reg parameter and use it in REG_FRAME_RELATED_EXPR note. (emit_frame_save): Call rs6000_frame_related with extra NULL_RTX parameter. (rs6000_emit_prologue): Likewise, but for little endian VSX stores, pass the source register of the store instead. 
Backport from mainline r204862 2013-11-15 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.md (UNSPEC_VPERM_X, UNSPEC_VPERM_UNS_X): Remove. (altivec_vperm_mode): Revert earlier little endian change. (*altivec_vperm_mode_internal): Remove. (altivec_vperm_mode_uns): Revert earlier little endian change. (*altivec_vperm_mode_uns_internal): Remove. * config/rs6000/vector.md (vec_realign_load_mode): Revise commentary. Backport from mainline r204441 2013-11-05 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove restriction against use of VSX instructions when generating code for little endian mode. Backport from mainline r204440 2013-11-05 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.md (mulv4si3): Ensure we generate vmulouh for both big and little endian. (mulv8hi3): Swap input operands for merge high and merge low instructions for little endian. Backport from mainline r204439 2013-11-05 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.md (vec_widen_umult_even_v16qi): Change define_insn to define_expand that uses even patterns for big endian and odd patterns for little endian. (vec_widen_smult_even_v16qi): Likewise. (vec_widen_umult_even_v8hi): Likewise. (vec_widen_smult_even_v8hi): Likewise. (vec_widen_umult_odd_v16qi): Likewise. (vec_widen_smult_odd_v16qi): Likewise. (vec_widen_umult_odd_v8hi): Likewise. (vec_widen_smult_odd_v8hi): Likewise. (altivec_vmuleub): New define_insn. (altivec_vmuloub): Likewise. (altivec_vmulesb): Likewise. (altivec_vmulosb): Likewise. (altivec_vmuleuh): Likewise. (altivec_vmulouh): Likewise. (altivec_vmulesh): Likewise. (altivec_vmulosh): Likewise. Backport from mainline r204395 2013-11-05 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/vector.md (vec_pack_sfix_trunc_v2df): Adjust for little endian. (vec_pack_ufix_trunc_v2df): Likewise. 
Backport from mainline r204363 2013-11-04 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/altivec.md (vec_widen_umult_hi_v16qi): Swap arguments to merge instruction for little endian. (vec_widen_umult_lo_v16qi): Likewise.
Re: [4.8, PATCH 8/26] Backport Power8 and LE support: PR57949
On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-abi-compat) backports the ABI compatibility fix for PR57949. Thanks, Bill [gcc] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r201750. 2013-11-15 Ulrich Weigand ulrich.weig...@de.ibm.com Note: Default setting of -mcompat-align-parm inverted! 2013-08-14 Bill Schmidt wschm...@linux.vnet.ibm.com PR target/57949 * doc/invoke.texi: Add documentation of mcompat-align-parm option. * config/rs6000/rs6000.opt: Add mcompat-align-parm option. * config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX and Linux, correct BLKmode alignment when 128-bit alignment is required and compatibility flag is not set. (rs6000_gimplify_va_arg): For AIX and Linux, honor specified alignment for zero-size arguments when compatibility flag is not set. [gcc/testsuite] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r201750. 2013-11-15 Ulrich Weigand ulrich.weig...@de.ibm.com Note: Default setting of -mcompat-align-parm inverted! 2013-08-14 Bill Schmidt wschm...@linux.vnet.ibm.com PR target/57949 * gcc.target/powerpc/pr57949-1.c: New. * gcc.target/powerpc/pr57949-2.c: New. Okay. Thanks, David
Re: [4.8, PATCH 10/26] Backport Power8 and LE support: ELFv2 ABI
On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-abi-elfv2) backports the fundamental changes for the ELFv2 ABI for powerpc64le. Copying Richard and Jakub for the libgcc, libitm, and libstdc++ bits. Thanks, Bill [gcc] 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r204842: 2013-11-15 Ulrich Weigand ulrich.weig...@de.ibm.com * doc/invoke.texi (-mabi=elfv1, -mabi=elfv2): Document. Backport from mainline r204809: 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/sysv4le.h (LINUX64_DEFAULT_ABI_ELFv2): Define. Backport from mainline r204808: 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com Alan Modra amo...@gmail.com * config/rs6000/rs6000.h (RS6000_SAVE_AREA): Handle ABI_ELFv2. (RS6000_SAVE_TOC): Remove. (RS6000_TOC_SAVE_SLOT): New macro. * config/rs6000/rs6000.c (rs6000_parm_offset): New function. (rs6000_parm_start): Use it. (rs6000_function_arg_advance_1): Likewise. (rs6000_emit_prologue): Use RS6000_TOC_SAVE_SLOT. (rs6000_emit_epilogue): Likewise. (rs6000_call_aix): Likewise. (rs6000_output_function_prologue): Do not save/restore r11 around calling _mcount for ABI_ELFv2. 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com Alan Modra amo...@gmail.com * config/rs6000/rs6000-protos.h (rs6000_reg_parm_stack_space): Add prototype. * config/rs6000/rs6000.h (RS6000_REG_SAVE): Remove. (REG_PARM_STACK_SPACE): Call rs6000_reg_parm_stack_space. * config/rs6000/rs6000.c (rs6000_parm_needs_stack): New function. (rs6000_function_parms_need_stack): Likewise. (rs6000_reg_parm_stack_space): Likewise. (rs6000_function_arg): Do not replace BLKmode by Pmode when returning a register argument. 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com Michael Gschwind m...@us.ibm.com * config/rs6000/rs6000.h (FP_ARG_MAX_RETURN): New macro. (ALTIVEC_ARG_MAX_RETURN): Likewise. (FUNCTION_VALUE_REGNO_P): Use them. * config/rs6000/rs6000.c (TARGET_RETURN_IN_MSB): Define. 
(rs6000_return_in_msb): New function. (rs6000_return_in_memory): Handle ELFv2 homogeneous aggregates. Handle aggregates of up to 16 bytes for ELFv2. (rs6000_function_value): Handle ELFv2 homogeneous aggregates. 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com Michael Gschwind m...@us.ibm.com * config/rs6000/rs6000.h (AGGR_ARG_NUM_REG): Define. * config/rs6000/rs6000.c (rs6000_aggregate_candidate): New function. (rs6000_discover_homogeneous_aggregate): Likewise. (rs6000_function_arg_boundary): Handle homogeneous aggregates. (rs6000_function_arg_advance_1): Likewise. (rs6000_function_arg): Likewise. (rs6000_arg_partial_bytes): Likewise. (rs6000_psave_function_arg): Handle BLKmode arguments. 2013-11-14 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/rs6000.c (machine_function): New member r2_setup_needed. (rs6000_emit_prologue): Set r2_setup_needed if necessary. (rs6000_output_mi_thunk): Set r2_setup_needed. (rs6000_output_function_prologue): Output global entry point prologue and local entry point marker if needed for ABI_ELFv2. Output -mprofile-kernel code here. (output_function_profiler): Do not output -mprofile-kernel code here; moved to rs6000_output_function_prologue. (rs6000_file_start): Output .abiversion 2 for ABI_ELFv2. (rs6000_emit_move): Do not handle dot symbols for ABI_ELFv2. (rs6000_output_function_entry): Likewise. (rs6000_assemble_integer): Likewise. (rs6000_elf_encode_section_info): Likewise. 
(rs6000_elf_declare_function_name): Do not create dot symbols or .opd section for ABI_ELFv2. (rs6000_trampoline_size): Update for ABI_ELFv2 trampolines. (rs6000_trampoline_init): Likewise. (rs6000_elf_file_end): Call file_end_indicate_exec_stack
Re: [4.8, PATCH 11/26] Backport Power8 and LE support: gotest
On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-abi-gotest) backports enablement of the Go testsuite for powerpc64le. Thanks, Bill 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r205000. 2013-11-19 Ulrich Weigand ulrich.weig...@de.ibm.com gotest: Recognize PPC ELF v2 function pointers in text section. Okay. Thanks, David
Re: [4.8, PATCH 12/26] Backport Power8 and LE support: Defaults
On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-align) sets some miscellaneous defaults for little endian support. Thanks, Bill 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Apply mainline r205060. 2013-11-20 Alan Modra amo...@gmail.com * config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty. * config/rs6000/rs6000.c (rs6000_option_override_internal): Default to strict alignment on older processors when little-endian. * config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8 for ELFv2. Okay. Thanks, David
Re: [4.8, PATCH 14/26] Backport Power8 and LE support: DFP absolute value
On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-dfp-abs) backports some unrelated but necessary work to enable the DFP absolute value builtins. Copying Jakub who was involved with the original patch. Thanks, Bill 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-08-19 Peter Bergner berg...@vnet.ibm.com Jakub Jelinek ja...@redhat.com * builtins.def (BUILT_IN_FABSD32): New DFP ABS builtin. (BUILT_IN_FABSD64): Likewise. (BUILT_IN_FABSD128): Likewise. * builtins.c (expand_builtin): Add support for new DFP ABS builtins. (fold_builtin_1): Likewise. * config/rs6000/dfp.md (*abstd2_fpr): Handle non-overlapping destination and source operands. (*nabstd2_fpr): Likewise. 2014-03-29 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-08-19 Peter Bergner berg...@vnet.ibm.com * gcc.target/powerpc/dfp-dd-2.c: New test. * gcc.target/powerpc/dfp-td-2.c: Likewise. * gcc.target/powerpc/dfp-td-3.c: Likewise. Okay. Thanks, David
Re: [4.8, PATCH 17/26] Backport Power8 and LE support: Direct moves
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-direct-move) backports support for the Power8 direct move instructions for little endian. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-10-23 Pat Haugen pthau...@us.ibm.com * gcc.target/powerpc/direct-move.h: Fix header for executable tests. Back port from mainline 2014-01-16 Michael Meissner meiss...@linux.vnet.ibm.com PR target/59844 * config/rs6000/rs6000.md (reload_vsx_from_gprsf): Add little endian support, remove tests for WORDS_BIG_ENDIAN. (p8_mfvsrd_3_mode): Likewise. (reload_gpr_from_vsxmode): Likewise. (reload_gpr_from_vsxsf): Likewise. (p8_mfvsrd_4_disf): Likewise. Okay. Thanks, David
Re: [4.8, PATCH 16/26] Backport Power8 and LE support: PR56843
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-pr56843) backports the fix for PR56843. Thanks, Bill [gcc] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-04-05 Bill Schmidt wschm...@linux.vnet.ibm.com PR target/56843 * config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove. (rs6000_emit_swdiv_low_precision): Remove. (rs6000_emit_swdiv): Rewrite to handle between one and four iterations of Newton-Raphson generally; modify required number of iterations for some cases. * config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove. [gcc/testsuite] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline 2013-04-05 Bill Schmidt wschm...@linux.vnet.ibm.com PR target/56843 * gcc.target/powerpc/recip-1.c: Modify expected output. * gcc.target/powerpc/recip-3.c: Likewise. * gcc.target/powerpc/recip-4.c: Likewise. * gcc.target/powerpc/recip-5.c: Add expected output for iterations. Okay. Thanks, David
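For readers not familiar with the technique named in the ChangeLog above, here is a minimal sketch of Newton-Raphson reciprocal refinement in plain C++. It is only an illustration of the general method, not the RTL that rs6000_emit_swdiv generates; the function names and the idea of passing the initial estimate as a parameter are assumptions made for this example.

```cpp
#include <cmath>

// Hypothetical helper: refine an estimate x of 1/d with the
// Newton-Raphson step x' = x * (2 - d * x).  Each step roughly
// squares the relative error, so a few iterations suffice.
double refine_reciprocal(double d, double x, int iterations)
{
  for (int i = 0; i < iterations; ++i)
    x = x * (2.0 - d * x);
  return x;
}

// Hypothetical helper: approximate n / d as n * (1/d), where the
// reciprocal comes from a caller-supplied estimate that is refined
// the requested number of times.
double approx_divide(double n, double d, double estimate, int iterations)
{
  return n * refine_reciprocal(d, estimate, iterations);
}
```

With d = 3 and a starting estimate of 0.3, the relative error goes from 0.1 to 0.01 after one step and keeps squaring after that, which is why the number of iterations can be chosen from the precision of the initial estimate.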
Re: [4.8, PATCH 18/26] Backport Power8 and LE support: Configure bits 2
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-config-2) backports more configure changes, particularly for multilib/multiarch targeting powerpc64le. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Apply mainline r202190, powerpc64le multilibs and multiarch dir 2013-09-03 Alan Modra amo...@gmail.com * config.gcc (powerpc*-*-linux*): Add support for little-endian multilibs to big-endian target and vice versa. * config/rs6000/t-linux64: Use := assignment on all vars. (MULTILIB_EXTRA_OPTS): Remove fPIC. (MULTILIB_OSDIRNAMES): Specify using mapping from multilib_options. * config/rs6000/t-linux64le: New file. * config/rs6000/t-linux64bele: New file. * config/rs6000/t-linux64lebe: New file. Okay. Thanks, David
Re: [4.8, PATCH 19/26] Backport Power8 and LE support: Quad memory atomic
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-quad-memory) backports support for quad-memory atomic operations. Thanks, Bill [gcc/testsuite] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Back port from mainline 2014-01-23 Michael Meissner meiss...@linux.vnet.ibm.com PR target/59909 * gcc.target/powerpc/quad-atomic.c: New file to test power8 quad word atomic functions at runtime. [gcc] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Back port from mainline 2014-01-23 Michael Meissner meiss...@linux.vnet.ibm.com PR target/59909 * doc/invoke.texi (RS/6000 and PowerPC Options): Document -mquad-memory-atomic. Update -mquad-memory documentation to say it is only used for non-atomic loads/stores. * config/rs6000/predicates.md (quad_int_reg_operand): Allow either -mquad-memory or -mquad-memory-atomic switches. * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add -mquad-memory-atomic to ISA 2.07 support. * config/rs6000/rs6000.opt (-mquad-memory-atomic): Add new switch to separate support of normal quad word memory operations (ldq, stq) from the atomic quad word memory operations. * config/rs6000/rs6000.c (rs6000_option_override_internal): Add support to separate non-atomic quad word operations from atomic quad word operations. Disable non-atomic quad word operations in little endian mode so that we don't have to swap words after the load and before the store. (quad_load_store_p): Add comment about atomic quad word support. (rs6000_opt_masks): Add -mquad-memory-atomic to the list of options printed with -mdebug=reg. * config/rs6000/rs6000.h (TARGET_SYNC_TI): Use -mquad-memory-atomic as the test for whether we have quad word atomic instructions. (TARGET_SYNC_HI_QI): If either -mquad-memory-atomic, -mquad-memory, or -mp8-vector are used, allow byte/half-word atomic operations. 
* config/rs6000/sync.md (load_lockedti): Insure that the address is a proper indexed or indirect address for the lqarx instruction. On little endian systems, swap the hi/lo registers after the lqarx instruction. (load_lockedpti): Use indexed_or_indirect_operand predicate to insure the address is valid for the lqarx instruction. (store_conditionalti): Insure that the address is a proper indexed or indirect address for the stqcrx. instruction. On little endian systems, swap the hi/lo registers before doing the stqcrx. instruction. (store_conditionalpti): Use indexed_or_indirect_operand predicate to insure the address is valid for the stqcrx. instruction. * gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ based on what type of quad memory support is available. Okay. Thanks, David
Re: [4.8, PATCH 22/26] Backport Power8 and LE support: -mcall-* endianness
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-mcall) fixes big-endian assumptions for -mcall-aixdesc and various others. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r207658 2014-02-06 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/sysv4.h (ENDIAN_SELECT): Do not attempt to enforce big-endian mode for -mcall-aixdesc, -mcall-freebsd, -mcall-netbsd, -mcall-openbsd, or -mcall-linux. (CC1_ENDIAN_BIG_SPEC): Remove. (CC1_ENDIAN_LITTLE_SPEC): Remove. (CC1_ENDIAN_DEFAULT_SPEC): Remove. (CC1_SPEC): Remove (always empty) %cc1_endian_... spec. (SUBTARGET_EXTRA_SPECS): Remove %cc1_endian_big, %cc1_endian_little, and %cc1_endian_default. * config/rs6000/sysv4le.h (CC1_ENDIAN_DEFAULT_SPEC): Remove. Okay. Thanks, David
Re: [4.8, PATCH 21/26] Backport Power8 and LE support: Vector APIs
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-le-vector-api) backports enablement of LE support for the Altivec APIs, including support for -maltivec=be. Thanks, Bill [gcc] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r206443 2014-01-08 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove two duplicate entries. Backport from mainline r206494 2014-01-09 Bill Schmidt wschm...@linux.vnet.ibm.com * doc/invoke.texi: Add -maltivec={be,le} options, and document default element-order behavior for -maltivec. * config/rs6000/rs6000.opt: Add -maltivec={be,le} options. * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure that -maltivec={le,be} implies -maltivec; disallow -maltivec=le when targeting big endian, at least for now. * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG. Backport from mainline r206541 2014-01-10 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000-builtin.def: Fix pasto for VPKSDUS. Backport from mainline r206590 2014-01-13 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Implement -maltivec=be for vec_insert and vec_extract. Backport from mainline r206641 2014-01-15 Bill Schmidt wschm...@vnet.linux.ibm.com * config/rs6000/altivec.md (mulv8hi3): Explicitly generate vmulesh and vmulosh rather than call gen_vec_widen_smult_*. (vec_widen_umult_even_v16qi): Test VECTOR_ELT_ORDER_BIG rather than BYTES_BIG_ENDIAN to determine use of even or odd instruction. (vec_widen_smult_even_v16qi): Likewise. (vec_widen_umult_even_v8hi): Likewise. (vec_widen_smult_even_v8hi): Likewise. (vec_widen_umult_odd_v16qi): Likewise. (vec_widen_smult_odd_v16qi): Likewise. (vec_widen_umult_odd_v8hi): Likewise. (vec_widen_smult_odd_v8hi): Likewise. (vec_widen_umult_hi_v16qi): Explicitly generate vmuleub and vmuloub rather than call gen_vec_widen_umult_*. 
(vec_widen_umult_lo_v16qi): Likewise. (vec_widen_smult_hi_v16qi): Explicitly generate vmulesb and vmulosb rather than call gen_vec_widen_smult_*. (vec_widen_smult_lo_v16qi): Likewise. (vec_widen_umult_hi_v8hi): Explicitly generate vmuleuh and vmulouh rather than call gen_vec_widen_umult_*. (vec_widen_umult_lo_v8hi): Likewise. (vec_widen_smult_hi_v8hi): Explicitly gnerate vmulesh and vmulosh rather than call gen_vec_widen_smult_*. (vec_widen_smult_lo_v8hi): Likewise. Backport from mainline r207062 2014-01-24 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove correction for little endian... * config/rs6000/vsx.md (vsx_xxpermdi2_mode_1): ...and move it to here. Backport from mainline r207262 2014-01-29 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Use CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*. * config/rs6000/vsx.md (vsx_mergel_mode): Adjust for -maltivec=be with LE targets. (vsx_mergeh_mode): Likewise. * config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New unspecs. (mulv8hi3): Use gen_altivec_vmrg[hl]w_direct. (altivec_vmrghb): Replace with define_expand and new *altivec_vmrghb_internal insn; adjust for -maltivec=be with LE targets. (altivec_vmrghb_direct): New define_insn. (altivec_vmrghh): Replace with define_expand and new *altivec_vmrghh_internal insn; adjust for -maltivec=be with LE targets. (altivec_vmrghh_direct): New define_insn. (altivec_vmrghw): Replace with define_expand and new *altivec_vmrghw_internal insn; adjust for -maltivec=be with LE targets. (altivec_vmrghw_direct): New define_insn. (*altivec_vmrghsf): Adjust for endianness. (altivec_vmrglb): Replace with define_expand and new *altivec_vmrglb_internal insn; adjust for -maltivec=be with LE targets. (altivec_vmrglb_direct): New define_insn. (altivec_vmrglh): Replace with define_expand and new *altivec_vmrglh_internal insn; adjust for -maltivec=be with LE targets. 
(altivec_vmrglh_direct): New define_insn. (altivec_vmrglw): Replace with define_expand and new *altivec_vmrglw_internal insn; adjust for -maltivec=be with LE targets.
Re: [4.8, PATCH 23/26] Backport Power8 and LE support: PR60137, PR60203
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-pr60137-pr60203) backports fixes for two little-endian vector mode problems. Thanks, Bill [gcc] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r207699. 2014-02-11 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60137 * config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter for VSX/Altivec vectors that land in GPR registers. Backport from mainline r207808. 2014-02-15 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60203 * config/rs6000/rs6000.md (rreg): Add TFmode, TDmode constraints. (movmode_internal, TFmode/TDmode): Split TFmode/TDmode moves into 64-bit and 32-bit moves. On 64-bit moves, add support for using direct move instructions on ISA 2.07. Also adjust instruction length for 64-bit. (movmode_64bit, TFmode/TDmode): Likewise. (movmode_32bit, TFmode/TDmode): Likewise. Backport from mainline r207868. 2014-02-18 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60203 * config/rs6000/rs6000.md (movmode_64bit, TF/TDmode moves): Split 64-bit moves into 2 patterns. Do not allow the use of direct move for TDmode in little endian, since the decimal value has little endian bytes within a word, but the 64-bit pieces are ordered in a big endian fashion, and normal subreg's of TDmode are not allowed. (movmode_64bit_dm): Likewise. (movtd_64bit_nodm): Likewise. [gcc/testsuite] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Backport from mainline r207699. 2014-02-11 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60137 * gcc.target/powerpc/pr60137.c: New file. Backport from mainline r207808. 2014-02-15 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60203 * gcc.target/powerpc/pr60203.c: New testsuite. Okay. Thanks, David
Re: [4.8, PATCH 24/26] Backport Power8 and LE support: Reload issues
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-reload) backports fixes for a couple of problems in PowerPC reload handling. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Apply mainline r207798 2014-02-26 Alan Modra amo...@gmail.com PR target/58675 PR target/57935 * config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use find_replacement on parts of insn rtl that might be reloaded. Backport from mainline r208287 2014-03-03 Bill Schmidt wschm...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_preferred_reload_class): Disallow reload of PLUS rtx's outside of GENERAL_REGS or BASE_REGS; relax constraint on constants to permit them being loaded into GENERAL_REGS or BASE_REGS. Okay. Thanks, David
Re: [4.8, PATCH 25/26] Backport Power8 and LE support: V1TI support
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-v1ti) backports the V1TI support. Thanks, Bill [gcc] 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Back port from trunk 2014-03-12 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/vector.md (VEC_L): Add V1TI mode to vector types. (VEC_M): Likewise. (VEC_N): Likewise. (VEC_R): Likewise. (VEC_base): Likewise. (movMODE, VEC_M modes): If we are loading TImode into VSX registers, we need to swap double words in little endian mode. * config/rs6000/rs6000-modes.def (V1TImode): Add new vector mode to be a container mode for 128-bit integer operations added in ISA 2.07. Unlike TImode and PTImode, the preferred register set is the Altivec/VMX registers for the 128-bit operations. * config/rs6000/rs6000-protos.h (rs6000_move_128bit_ok_p): Add declarations. (rs6000_split_128bit_ok_p): Likewise. * config/rs6000/rs6000-builtin.def (BU_P8V_AV_3): Add new support macros for creating ISA 2.07 normal and overloaded builtin functions with 3 arguments. (BU_P8V_OVERLOAD_3): Likewise. (VPERM_1T): Add support for V1TImode in 128-bit vector operations for use as overloaded functions. (VPERM_1TI_UNS): Likewise. (VSEL_1TI): Likewise. (VSEL_1TI_UNS): Likewise. (ST_INTERNAL_1ti): Likewise. (LD_INTERNAL_1ti): Likewise. (XXSEL_1TI): Likewise. (XXSEL_1TI_UNS): Likewise. (VPERM_1TI): Likewise. (VPERM_1TI_UNS): Likewise. (XXPERMDI_1TI): Likewise. (SET_1TI): Likewise. (LXVD2X_V1TI): Likewise. (STXVD2X_V1TI): Likewise. (VEC_INIT_V1TI): Likewise. (VEC_SET_V1TI): Likewise. (VEC_EXT_V1TI): Likewise. (EQV_V1TI): Likewise. (NAND_V1TI): Likewise. (ORC_V1TI): Likewise. (VADDCUQ): Add support for 128-bit integer arithmetic instructions added in ISA 2.07. Add both normal 'altivec' builtins, and the overloaded builtin. (VADDUQM): Likewise. (VSUBCUQ): Likewise. (VADDEUQM): Likewise. (VADDECUQ): Likewise. (VSUBEUQM): Likewise. (VSUBECUQ): Likewise. 
* config/rs6000/rs6000-c.c (__int128_type): New static to hold __int128_t and __uint128_t types. (__uint128_type): Likewise. (altivec_categorize_keyword): Add support for vector __int128_t, vector __uint128_t, vector __int128, and vector unsigned __int128 as a container type for TImode operations that need to be done in VSX/Altivec registers. (rs6000_macro_to_expand): Likewise. (altivec_overloaded_builtins): Add ISA 2.07 overloaded functions to support 128-bit integer instructions vaddcuq, vadduqm, vaddecuq, vaddeuqm, vsubcuq, vsubuqm, vsubecuq, vsubeuqm. (altivec_resolve_overloaded_builtin): Add support for V1TImode. * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support for V1TImode, and set up preferences to use VSX/Altivec registers. Setup VSX reload handlers. (rs6000_debug_reg_global): Likewise. (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_preferred_simd_mode): Likewise. (vspltis_constant): Do not allow V1TImode as easy altivec constants. (easy_altivec_constant): Likewise. (output_vec_const_move): Likewise. (rs6000_expand_vector_set): Convert V1TImode set and extract to simple move. (rs6000_expand_vector_extract): Likewise. (reg_offset_addressing_ok_p): Setup V1TImode to use VSX reg+reg addressing. (rs6000_const_vec): Add support for V1TImode. (rs6000_emit_le_vsx_load): Swap double words when loading or storing TImode/V1TImode. (rs6000_emit_le_vsx_store): Likewise. (rs6000_emit_le_vsx_move): Likewise. (rs6000_emit_move): Add support for V1TImode. (altivec_expand_ld_builtin): Likewise. (altivec_expand_st_builtin): Likewise. (altivec_expand_vec_init_builtin): Likewise. (altivec_expand_builtin): Likewise. (rs6000_init_builtins): Add support for V1TImode type. Add support for ISA 2.07 128-bit integer builtins. Define type names for the VSX/Altivec vector types. (altivec_init_builtins): Add support for overloaded vector functions with V1TImode type. (rs6000_preferred_reload_class): Prefer Altivec registers for V1TImode. 
(rs6000_move_128bit_ok_p): Move 128-bit move/split validation to external function. (rs6000_split_128bit_ok_p): Likewise. (rs6000_handle_altivec_attribute): Create
Re: [4.8, PATCH 26/26] Backport Power8 and LE support: Missing support
On Wed, Mar 19, 2014 at 3:35 PM, Bill Schmidt wschm...@linux.vnet.ibm.com wrote: Hi, This patch (diff-trunk-missing) backports some LE pieces that were found not to have been backported from trunk to the IBM 4.8 branch until relatively recently. Thanks, Bill 2014-03-19 Bill Schmidt wschm...@linux.vnet.ibm.com Back port from trunk 2013-04-25 Alan Modra amo...@gmail.com PR target/57052 * config/rs6000/rs6000.md (rotlsi3_internal7): Rename to rotlsi3_internal7le and condition on !BYTES_BIG_ENDIAN. (rotlsi3_internal8be): New BYTES_BIG_ENDIAN insn. Repeat for many other rotate/shift and mask patterns using subregs. Name lshiftrt insns. (ashrdisi3_noppc64): Rename to ashrdisi3_noppc64be and condition on WORDS_BIG_ENDIAN. 2013-06-07 Alan Modra amo...@gmail.com * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't override user -mfp-in-toc. (offsettable_ok_by_alignment): Consider just the current access rather than the whole object, unless BLKmode. Handle CONSTANT_POOL_ADDRESS_P constants that lack a decl too. (use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants for -mcmodel=medium. * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't override user -mfp-in-toc or -msum-in-toc. Default to -mno-fp-in-toc for -mcmodel=medium. 2013-06-18 Alan Modra amo...@gmail.com * config/rs6000/rs6000.h (enum data_align): New. (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment. (DATA_ABI_ALIGNMENT): Define. (CONSTANT_ALIGNMENT): Correct comment. * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare. * config/rs6000/rs6000.c (rs6000_data_alignment): New function. 2013-07-11 Ulrich Weigand ulrich.weig...@de.ibm.com * config/rs6000/rs6000.md (*tls_gd_lowTLSmode:tls_abi_suffix): Require GOT register as additional operand in UNSPEC. (*tls_ld_lowTLSmode:tls_abi_suffix): Likewise. (*tls_got_dtprel_lowTLSmode:tls_abi_suffix): Likewise. (*tls_got_tprel_lowTLSmode:tls_abi_suffix): Likewise. (*tls_gdTLSmode:tls_abi_suffix): Update splitter. 
(*tls_ldTLSmode:tls_abi_suffix): Likewise. (tls_got_dtprel_TLSmode:tls_abi_suffix): Likewise. (tls_got_tprel_TLSmode:tls_abi_suffix): Likewise. 2014-01-23 Pat Haugen pthau...@us.ibm.com * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't force flag_ira_loop_pressure if set via command line. 2014-02-06 Alan Modra amo...@gmail.com PR target/60032 * config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only change SDmode to DDmode when lra_in_progress. Okay. Thanks, David
[PATCH] Fix PR c++/21113
Hi,

This patch fixes c++/21113 which reports that the C++ frontend does not forbid jumps into the scope of identifiers with variably-modified types. The patch simply augments decl_jump_unsafe() to disallow jumping into blocks that initialize variably-modified decls.

I bootstrapped and regtested this change on x86_64-unknown-linux-gnu.

2014-04-03  Patrick Palka  patr...@parcs.ath.cx

	PR c++/21113
	* decl.c (decl_jump_unsafe): Consider variably-modified decls.
---
 gcc/cp/decl.c                    |  5 ++---
 gcc/testsuite/g++.dg/ext/vla14.C | 23 +++
 2 files changed, 25 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla14.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5bd33c5..6571af5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2785,9 +2785,8 @@ decl_jump_unsafe (tree decl)
       || type == error_mark_node)
     return 0;
 
-  type = strip_array_types (type);
-
-  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl))
+  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl)
+      || variably_modified_type_p (type, NULL_TREE))
     return 2;
 
   if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
diff --git a/gcc/testsuite/g++.dg/ext/vla14.C b/gcc/testsuite/g++.dg/ext/vla14.C
new file mode 100644
index 000..278cb63
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla14.C
@@ -0,0 +1,23 @@
+// PR c++/21113
+// { dg-options "" }
+
+void
+f (int n)
+{
+  goto label; // { dg-error "from here" }
+  int a[n]; // { dg-error "crosses initialization" }
+label: // { dg-error "jump to label" }
+  ;
+}
+
+void
+g (int n)
+{
+  switch (1)
+  {
+  case 1:
+    int (*a)[n]; // { dg-error "crosses initialization" }
+  default: // { dg-error "jump to case label" }
+    ;
+  }
+}
-- 
1.9.1
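As an illustration of what remains valid once such jumps are diagnosed, the following sketch keeps the variable-length array in its own block so that the goto never enters its scope. This is not part of the patch; the function name is made up for the example, and it assumes a compiler that accepts VLAs in C++ as an extension (as g++ does by default).

```cpp
// Hypothetical example: returns 0 + 1 + ... + (n - 1), or -1 when n
// is not positive.
int sum_below(int n)
{
  int result = -1;

  if (n <= 0)
    goto done;          // OK: jumps over the whole block below, so it
                        // never enters the scope of the VLA `a`.

  {
    int a[n];           // VLA (GNU extension in C++), confined to this block.
    for (int i = 0; i < n; ++i)
      a[i] = i;
    result = 0;
    for (int i = 0; i < n; ++i)
      result += a[i];
  }

done:
  return result;
}
```

Had `int a[n];` been declared directly in the function body between the goto and the label, the jump would land inside the array's scope without its size ever having been evaluated, which is the situation the new diagnostic is meant to reject.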
[PATCH] Fix PR c++/44613
Hi,

This patch fixes a wrong code issue in the code generated for VLAs in the C++ frontend. This exact issue was fixed in the C frontend with r85849, and this patch is essentially a port of r85849 for the C++ frontend.

The issue is that this C++ code:

  {
  foo:
    int x[n];
    f ();
  }

gets gimplified into this:

  {
    int x[n];
    void *saved_stack;

    saved_stack = __builtin_stack_save ();
    try
      {
      foo: // <-- jump to foo will bypass initialization of saved_stack
        x = alloca (...);
        f ();
      }
    finally
      {
        __builtin_stack_restore (saved_stack);
      }
  }

In order to ensure that labels such as foo that occur before the initialization of a VLA are emitted in the right place by the gimplifier, the C++ frontend is changed to handle the above C++ code as if it looked like this:

  {
  foo:
    {
      int x[n];
      f ();
    }
  }

thereby forcing the label foo to be placed before the initialization of saved_stack during gimplification. This is the same approach that the C frontend uses (see r85849).

I bootstrapped and regtested this patch on x86_64-unknown-linux-gnu.

2014-04-03  Patrick Palka  patr...@parcs.ath.cx

	PR c++/44613
	* semantics.c (add_stmt): Set STATEMENT_LIST_HAS_LABEL.
	* decl.c (cp_finish_decl): Create a new BIND_EXPR before
	instantiating a variable-sized type.
---
 gcc/cp/decl.c                    | 19 ++-
 gcc/cp/semantics.c               |  3 +++
 gcc/testsuite/g++.dg/ext/vla15.C | 20
 3 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla15.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index f3a081b..5bd33c5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6441,7 +6441,24 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 	 after the call to check_initializer so that the DECL_EXPR for a
 	 reference temp is added before the DECL_EXPR for the reference itself.  */
       if (DECL_FUNCTION_SCOPE_P (decl))
-	add_decl_expr (decl);
+	{
+	  /* If we're building a variable sized type, and we might be
+	     reachable other than via the top of the current binding
+	     level, then create a new BIND_EXPR so that we deallocate
+	     the object at the right time.  */
+	  if (VAR_P (decl)
+	      && DECL_SIZE (decl)
+	      && !TREE_CONSTANT (DECL_SIZE (decl))
+	      && STATEMENT_LIST_HAS_LABEL (cur_stmt_list))
+	    {
+	      tree bind;
+	      bind = build3 (BIND_EXPR, void_type_node, NULL, NULL, NULL);
+	      TREE_SIDE_EFFECTS (bind) = 1;
+	      add_stmt (bind);
+	      BIND_EXPR_BODY (bind) = push_stmt_list ();
+	    }
+	  add_decl_expr (decl);
+	}
 
       /* Let the middle end know about variables and functions -- but not
 	 static data members in uninstantiated class templates.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index fb1e404..b00294e 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -386,6 +386,9 @@ add_stmt (tree t)
       STMT_IS_FULL_EXPR_P (t) = stmts_are_full_exprs_p ();
     }
 
+  if (code == LABEL_EXPR || code == CASE_LABEL_EXPR)
+    STATEMENT_LIST_HAS_LABEL (cur_stmt_list) = 1;
+
   /* Add T to the statement-tree.  Non-side-effect statements need to be
      recorded during statement expressions.  */
   gcc_checking_assert (!stmt_list_stack->is_empty ());
diff --git a/gcc/testsuite/g++.dg/ext/vla15.C b/gcc/testsuite/g++.dg/ext/vla15.C
new file mode 100644
index 000..feeb49f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla15.C
@@ -0,0 +1,20 @@
+// PR c++/44613
+// { dg-do run }
+// { dg-options "" }
+
+void *volatile p;
+
+int
+main (void)
+{
+  int n = 0;
+  lab:;
+  int x[n % 1000 + 1];
+  x[0] = 1;
+  x[n % 1000] = 2;
+  p = x;
+  n++;
+  if (n < 100)
+    goto lab;
+  return 0;
+}
-- 
1.9.1
[PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8
I'd like to ping the following backport patch for the fix for PR54537. This did bootstrap and regtest with no regressions on powerpc64-linux. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html Peter
Re: [PING^8][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope
Still pinging.

On 03/28/2014 11:58 AM, Dimitris Papavasiliou wrote:
Ping!

On 03/23/2014 03:20 AM, Dimitris Papavasiliou wrote:
Ping!

On 03/13/2014 11:54 AM, Dimitris Papavasiliou wrote:
Ping!

On 03/06/2014 07:44 PM, Dimitris Papavasiliou wrote:
Ping!

On 02/27/2014 11:44 AM, Dimitris Papavasiliou wrote:
Ping!

On 02/20/2014 12:11 PM, Dimitris Papavasiliou wrote:
Hello all,

Pinging this patch review request again. See previous messages quoted below for details.

Regards,
Dimitris

On 02/13/2014 04:22 PM, Dimitris Papavasiliou wrote:
Hello,

Pinging this patch review request. Can someone involved in the Objective-C language frontend have a quick look at the description of the proposed features and tell me if it'd be ok to have them in the trunk so I can go ahead and create proper patches?

Thanks,
Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote:
Hello,

This is a patch regarding a couple of Objective-C related dialect options and warning switches. I have already submitted it a while ago but gave up after pinging a couple of times. I am now informed that I should have kept pinging until I got someone's attention, so I'm resending it.

The patch is now against an old revision and, as I stated originally, it's probably not in a state that can be adopted as is. I'm sending it as is so that the implemented features can be assessed in terms of their usefulness, and if they're welcome I'd be happy to make any necessary changes to bring it up-to-date, split it into smaller patches, add test-cases and anything else that is deemed necessary.

Here's the relevant text from my initial message:

Two of these switches are related to a feature request I submitted a while ago, Bug 56044 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce the entire argument here since it is available in the feature request.
The relevant functionality in the patch comes in the form of two switches:

-Wshadow-ivars, which controls the "local declaration of ‘somevar’ hides instance variable" warning, which curiously is enabled by default instead of being controlled at least by -Wshadow. The patch changes it so that this warning can be enabled and disabled specifically through -Wshadow-ivars, as well as together with all other shadowing-related warnings through -Wshadow. The reason for the extra switch is that, while searching the Internet for a solution to this problem, I found that other people are inconvenienced by this particular warning as well, so it might be useful to be able to turn it off while keeping all the other shadowing-related warnings enabled.

-flocal-ivars, which when true, as it is by default, treats instance variables as having local scope. If false (-fno-local-ivars), instance variables must always be referred to as self->ivarname, and references to ivarname resolve to the local or global scope as usual.

I've also taken the opportunity to add another switch, unrelated to the above but related to instance variables:

-fivar-visibility, which can be set to private, protected (the default), public or package. This sets the default instance variable visibility, which normally is implicitly protected. My use-case for it is basically to be able to set it to public and thus effectively disable this visibility mechanism altogether, which I find no use for and therefore have to circumvent. I'm not sure if anyone else feels the same way about this but I figured it was worth a try.

I'm attaching a preliminary patch against the current revision in case anyone wants to have a look. The changes are very small and any blatant mistakes should be immediately obvious.
I have to admit to having virtually no knowledge of the internals of GCC but I have tried to keep in line with formatting guidelines and general style as well as looking up the particulars of the way options are handled in the available documentation to avoid blind copy-pasting. I have also tried to test the functionality both in my own (relatively large, or at least not too small) project and with small test programs and everything works as expected. Finally, I tried running the tests too but these fail to complete both in the patched and unpatched version, possibly due to the way I've configured GCC. Dimitris
Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3
Thanks for the tip. What should I do now? Should I fix the ChangeLog entry and add a new one or do nothing? Dominique On 2 Apr 2014, at 12:47, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: domi...@lps.ens.fr (Dominique Dhumieres) writes: r...@cebitec.uni-bielefeld.de (Rainer Orth) wrote: Sure, patch preapproved. Committed as r208983: 2014-04-01 Dominique d'Humieres domi...@lps.ens.fr Rainer Orth r...@cebitec.uni-bielefeld.de PR libgcj/55637 * testsuite/libjava.lang/sourcelocation.xfail: New file. Btw, the customary format for such a ChangeLog entry is 2014-04-01 Dominique d'Humieres domi...@lps.ens.fr Backport from mainline 2014-02-20 Rainer Orth r...@cebitec.uni-bielefeld.de PR libgcj/55637 * testsuite/libjava.lang/sourcelocation.xfail: New file. This way, you can easily see when the original went in. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3
Hi Dominique, Thanks for the tip. What should I do now? Should I fix the ChangeLog entry and add a new one or do nothing? If you want, you could fix the ChangeLog entry in place, but don't add a new one for that change. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8
On 03/04/14 10:25 -0500, Peter Bergner wrote: I'd like to ping the following backport patch for the fix for PR54537. This did bootstrap and regtest with no regressions on powerpc64-linux. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html I don't know how risky the front-end change is, but if it gets approved then the library part is obviously fine. That said, my kneejerk reaction is if it's only really needed to allow inclusion of tr1/cmath then my solution would be to not use that TR1 header!
Re: [gomp4] Add tables generation
On 04/02/2014 10:36 AM, Thomas Schwinge wrote: I see regressions in the libgomp testsuite for configurations where offloading is not enabled: spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe /tmp/ccGnT0ei.o: In function `main': for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__' collect2: error: ld returned 1 exit status I suppose that's because [...] Workaround committed in r209015: libgcc/ * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to NULL. The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Bernd Index: gcc/omp-low.c === --- gcc/omp-low.c (revision 429741) +++ gcc/omp-low.c (working copy) @@ -221,6 +221,28 @@ static tree scan_omp_1_op (tree *, int * *handled_ops_p = false; \ break; +static GTY(()) tree offload_symbol_decl; + +/* Get the __OPENMP_TARGET__ symbol. */ +static tree +get_offload_symbol_decl (void) +{ + if (!offload_symbol_decl) +{ + tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier (__OPENMP_TARGET__), + ptr_type_node); + TREE_PUBLIC (decl) = 1; + DECL_EXTERNAL (decl) = 1; + DECL_WEAK (decl) = 1; + DECL_ATTRIBUTES (decl) + = tree_cons (get_identifier (weak), + NULL_TREE, DECL_ATTRIBUTES (decl)); + offload_symbol_decl = decl; +} + return offload_symbol_decl; +} + /* Convenience function for calling scan_omp_1_op on tree operands. 
*/ static inline tree @@ -5148,11 +5170,7 @@ expand_oacc_offload (struct omp_region * } gimple g; - tree openmp_target -= build_decl (UNKNOWN_LOCATION, VAR_DECL, - get_identifier (__OPENMP_TARGET__), ptr_type_node); - TREE_PUBLIC (openmp_target) = 1; - DECL_EXTERNAL (openmp_target) = 1; + tree openmp_target = get_offload_symbol_decl (); tree fnaddr = build_fold_addr_expr (child_fn); g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device, fnaddr, build_fold_addr_expr (openmp_target), @@ -8686,11 +8704,7 @@ expand_omp_target (struct omp_region *re } gimple g; - tree openmp_target -= build_decl (UNKNOWN_LOCATION, VAR_DECL, - get_identifier (__OPENMP_TARGET__), ptr_type_node); - TREE_PUBLIC (openmp_target) = 1; - DECL_EXTERNAL (openmp_target) = 1; + tree openmp_target = get_offload_symbol_decl (); if (kind == GF_OMP_TARGET_KIND_REGION) { tree fnaddr = build_fold_addr_expr (child_fn);
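As a rough illustration of what the weak reference in the patch above buys, here is a minimal C sketch. It assumes an ELF target and a GCC-compatible compiler; the symbol hypothetical_offload_table and both helper functions are made up for this example and merely stand in for __OPENMP_TARGET__. Because the declaration is weak, the link succeeds even when no object file defines the symbol, and its address then compares equal to NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for __OPENMP_TARGET__.  The weak attribute lets
   the link succeed even when no object file defines the symbol.  */
extern void *hypothetical_offload_table __attribute__ ((weak));

/* Return the address of the table, or NULL when nothing defined it.  */
void *
get_offload_table (void)
{
  return &hypothetical_offload_table;
}

/* Nonzero when an offload table was actually linked in.  */
int
offload_table_present (void)
{
  return get_offload_table () != NULL;
}
```

In a program that never defines hypothetical_offload_table, offload_table_present () should return 0 instead of the link failing with an undefined reference, which is the situation the libgomp tests hit when offloading is disabled.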
Re: [gomp4] Add tables generation
2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, since we decided to pass it to GOMP_offload_register? -- Ilya
[PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
Under certain circumstances the sanitizer builtins are not initialized properly and ubsan_instrument_return must make sure they are initialized. Otherwise builtin_decl_explicit returns NULL and we'll ICE in build_call_expr_loc_array. I'm not sure which other ubsan routines need similar fix. No testcase attached since it's not trivial to reproduce this. Bootstrapped/ran ubsan testsuite on x86_64-linux, ok for trunk? 2014-04-03 Marek Polacek pola...@redhat.com PR sanitizer/60745 * c-ubsan.c: Include asan.h. (ubsan_instrument_return): Call initialize_sanitizer_builtins. diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c index dc4d981..9d2403c 100644 --- gcc/c-family/c-ubsan.c +++ gcc/c-family/c-ubsan.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see #include ubsan.h #include c-family/c-common.h #include c-family/c-ubsan.h +#include asan.h /* Instrument division by zero and INT_MIN / -1. If not instrumenting, return NULL_TREE. */ @@ -185,6 +186,8 @@ ubsan_instrument_vla (location_t loc, tree size) tree ubsan_instrument_return (location_t loc) { + initialize_sanitizer_builtins (); + tree data = ubsan_create_data (__ubsan_missing_return_data, loc, NULL, NULL_TREE); tree t = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_MISSING_RETURN); Marek
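The failure mode described above (a lookup returning NULL because an initializer never ran) can be sketched in plain C as follows. This is only an illustration of the idempotent "initialize before lookup" pattern, not GCC code: the table, the enum and all function names are hypothetical, and only the string __ubsan_handle_missing_return is the name of the real runtime handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical builtin codes, standing in for GCC's built_in_function.  */
enum { HANDLE_MISSING_RETURN, N_BUILTINS };

static const char *builtin_table[N_BUILTINS];  /* all NULL until set up */
static bool builtins_initialized;

/* Safe to call any number of times; only the first call does work.  */
void
initialize_builtins (void)
{
  if (builtins_initialized)
    return;
  builtin_table[HANDLE_MISSING_RETURN] = "__ubsan_handle_missing_return";
  builtins_initialized = true;
}

/* Plain lookup: returns NULL if nobody initialized the table yet.  */
const char *
builtin_lookup (int code)
{
  return builtin_table[code];
}

/* The instrumentation entry point makes sure the table exists first.  */
const char *
instrument_return (void)
{
  initialize_builtins ();
  return builtin_lookup (HANDLE_MISSING_RETURN);
}
```

Because the initializer is guarded by a flag, calling it from every entry point that might run first is cheap, which is presumably why the same call could be added to other ubsan routines if they turn out to need it.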
Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
On Thu, Apr 3, 2014 at 2:27 PM, Charles Baylis charles.bay...@linaro.org wrote: Hi This bug causes the compiler to create a Thumb-2 TBB instruction with a jump table containing an out of range value in a .byte field: whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100 This occurs because the jump table is followed with a .align 1 due to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for the space taken by this align directive. My first reaction is to wonder why this is not a bug in the shorten phase. This patch addresses the issue by removing ASM_OUTPUT_CASE_END from arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is instead inserted by aligning the label following the barrier which follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER appropriately. At first glance this feels like a blunt hammer; what's the code size bloat from putting out such an alignment after each barrier that the compiler emits, rather than tracking this in ASM_OUTPUT_CASE_END? I'll try and have a look at this again tomorrow morning. regards Ramana Bootstrapped/checked on arm-unknown-linux-gnueabihf. OK for trunk, and backporting to 4.8? 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END): Remove. (LABEL_ALIGN_AFTER_BARRIER): Align barriers which occur after ADDR_DIFF_VEC. 2014-04-02 Charles Baylis charles.bay...@linaro.org PR target/60609 * g++.dg/torture/pr60609.C: New test.
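To make the arithmetic behind the assembler error concrete: a TBB table entry is a single unsigned byte holding half of the forward branch distance, so the largest reachable distance is 510 bytes. The sketch below uses hypothetical helper names and illustrative numbers that are not taken from the PR; it only shows how a small amount of padding that the length computation did not count can push an entry from 255 to 256.

```c
#include <stdbool.h>

/* Value stored in a TBB table entry: half the forward branch distance.  */
unsigned
tbb_entry_value (unsigned distance_bytes)
{
  return distance_bytes / 2;
}

/* An entry must fit in one unsigned byte.  */
bool
tbb_entry_fits (unsigned distance_bytes)
{
  return tbb_entry_value (distance_bytes) <= 255;
}

/* True when the estimate looked fine but the real layout, including
   padding the estimate missed, no longer fits in a byte.  */
bool
padding_breaks_tbb (unsigned estimated_distance, unsigned uncounted_padding)
{
  return tbb_entry_fits (estimated_distance)
         && !tbb_entry_fits (estimated_distance + uncounted_padding);
}
```

With an estimated distance of 511 bytes and one byte of uncounted padding, the entry value goes from 255 to 256, which matches the shape of the "value of 256 too large for field of 1 bytes" diagnostic.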
Re: [gomp4] Add tables generation
On 04/03/2014 06:53 PM, Ilya Verbin wrote: 2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, since we decided to pass it to GOMP_offload_register? I thought it was used to look up the right function? With shared libraries you'd get multiple __OPENMP_TARGET__ tables. Bernd
Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe
On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: In backporting the power8 changes to the 4.8 branch, one of the testers of these patches noticed that libgcc cannot be built on a linux SPE target. The reason was the _Decimal64 type did not have a proper move insn in the SPE environment. This patch fixes that issue. In looking at the patch, I discovered two other thinkos that are fixed in this patch. The first problem is the movdf/movdd insns for 32-bit without hardware floating point, checked whether we had hardware single precision support, when it should have been checking that we had hardware double precision support. The second problem was that some of the types believed they could use the floating point registers in a SPE or software emulation environment. So I added additional code to turn off the use of the FPRs in this case. I have done bootstraps and make check on 64-bit PowerPC linux systems with no regression. In addition, I tested the code generated using cross compilers to the Linux SPE system. Is this patch acceptable to be checked in the trunk (and to the 4.8 branch when the other patches are approved)? Mike, Can you work with Edmar and Rohit to create a testcase for the GCC testsuite as well? Thanks, David
Re: [gomp4] Add tables generation
2014-04-03 21:06 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: On 04/03/2014 06:53 PM, Ilya Verbin wrote: 2014-04-03 20:13 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, since we decided to pass it to GOMP_offload_register? I thought it was used to look up the right function? With shared libraries you'd get multiple __OPENMP_TARGET__ tables. Bernd Yes, initially the idea was to use it to look up the right function. But now each DSO will call GOMP_offload_register, and pass a unique pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then gomp_register_images_for_device registers all these host tables in the plugin. And when libgomp calls device_get_table_func, the plugin returns the joint table for all DSOs. -- Ilya
Re: [gomp4] Add tables generation
On 04/03/2014 07:25 PM, Ilya Verbin wrote: Yes, initially the idea was to use it to look up the right function. But now each DSO will call GOMP_offload_register, and pass a unique pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then gomp_register_images_for_device registers all these host tables in the plugin. And when libgomp calls device_get_table_func, the plugin returns the joint table for all DSOs. Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol to restrict lookups to the subset of symbols that could actually be found. BTW, I still expect that the lookup by ordering will turn out to be fundamentally unreliable and we'll need to use the unique id patch I posted a while ago. In that case using __OPENMP_TARGET__ as a first order key for the lookups eliminates any problem with duplicate names across multiple libraries. Bernd
Re: [gomp4] Add tables generation
2014-04-03 21:28 GMT+04:00 Bernd Schmidt ber...@codesourcery.com: On 04/03/2014 07:25 PM, Ilya Verbin wrote: Yes, initially the idea was to use it to look up the right function. But now each DSO will call GOMP_offload_register, and pass a unique pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then gomp_register_images_for_device registers all these host tables in the plugin. And when libgomp calls device_get_table_func, the plugin returns the joint table for all DSOs. Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol to restrict lookups to the subset of symbols that could actually be found. BTW, I still expect that the lookup by ordering will turn out to be fundamentally unreliable and we'll need to use the unique id patch I posted a while ago. In that case using __OPENMP_TARGET__ as a first order key for the lookups eliminates any problem with duplicate names across multiple libraries. Bernd In the current implementation each gomp_device_descr contains one dev_splay_tree, and all addresses are inserted into this splay tree. There is no need to restrict the lookup, because the addresses from multiple DSOs can't overlap. -- Ilya
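The argument that a single joint table suffices rests on host address ranges from different DSOs being disjoint, so an address alone identifies its entry. A minimal C sketch of that idea follows. It deliberately swaps the splay tree for a linear scan over a fixed-size array, and every name in it (host_entry, register_host_table, lookup_host_addr) is hypothetical rather than part of libgomp.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 64

/* One host-side object: its address range and the DSO it came from.  */
struct host_entry
{
  uintptr_t start;
  uintptr_t end;      /* one past the last byte */
  int dso_id;
};

static struct host_entry registry[MAX_ENTRIES];
static size_t n_entries;

/* Called once per DSO; all tables end up in one joint registry.
   Returns 0 on success, -1 if the registry is full.  */
int
register_host_table (const struct host_entry *table, size_t n)
{
  if (n > MAX_ENTRIES - n_entries)
    return -1;
  for (size_t i = 0; i < n; i++)
    registry[n_entries++] = table[i];
  return 0;
}

/* Lookup needs only the address, because ranges from different DSOs
   cannot overlap in one address space.  */
const struct host_entry *
lookup_host_addr (uintptr_t addr)
{
  for (size_t i = 0; i < n_entries; i++)
    if (addr >= registry[i].start && addr < registry[i].end)
      return &registry[i];
  return NULL;
}
```

This covers only address-keyed lookup. Bernd's concern about lookup by ordering or by name is a separate question that this sketch does not address.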
Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe
On Thu, Apr 03, 2014 at 01:24:25PM -0400, David Edelsohn wrote: On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: In backporting the power8 changes to the 4.8 branch, one of the testers of these patches noticed that libgcc cannot be built on a linux SPE target. The reason was the _Decimal64 type did not have a proper move insn in the SPE environment. This patch fixes that issue. In looking at the patch, I discovered two other thinkos that are fixed in this patch. The first problem is the movdf/movdd insns for 32-bit without hardware floating point, checked whether we had hardware single precision support, when it should have been checking that we had hardware double precision support. The second problem was that some of the types believed they could use the floating point registers in a SPE or software emulation environment. So I added additional code to turn off the use of the FPRs in this case. I have done bootstraps and make check on 64-bit PowerPC linux systems with no regression. In addition, I tested the code generated using cross compilers to the Linux SPE system. Is this patch acceptable to be checked in the trunk (and to the 4.8 branch when the other patches are approved)? Mike, Can you work with Edmar and Rohit to create a testcase for the GCC testsuite as well? Sure, but I won't be able to run it under the test suite. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[4.8, PATCH 28/26] Backport Power8 and LE support: Fix for SPE (PR60735)
Hi, This patch (diff-pr60735) adds to the 4.8 PowerPC backport patch series with a backported fix for PR60735, an unrecognized insn problem for SPE. Thanks, Bill [gcc] 2014-04-03 Bill Schmidt wschm...@linux.vnet.ibm.com Back port mainline subversion id 209025. 2014-04-02 Michael Meissner meiss...@linux.vnet.ibm.com PR target/60735 * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If we have software floating point or no floating point registers, do not allow any type in the FPRs. Eliminate a test for SPE SIMD types in GPRs that occurs after we tested for GPRs that would never be true. * config/rs6000/rs6000.md (movmode_softfloat32, FMOVE64): Rewrite tests to use TARGET_DOUBLE_FLOAT and TARGET_E500_DOUBLE, since the FMOVE64 type is DFmode/DDmode. If TARGET_E500_DOUBLE, specifically allow DDmode, since that does not use the SPE SIMD instructions. Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.c === --- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.c +++ gcc-4_8-test2/gcc/config/rs6000/rs6000.c @@ -1733,6 +1733,9 @@ rs6000_hard_regno_mode_ok (int regno, en modes and DImode. */ if (FP_REGNO_P (regno)) { + if (TARGET_SOFT_FLOAT || !TARGET_FPRS) + return 0; + if (SCALAR_FLOAT_MODE_P (mode) (mode != TDmode || (regno % 2) == 0) FP_REGNO_P (last_regno)) @@ -1761,10 +1764,6 @@ rs6000_hard_regno_mode_ok (int regno, en return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) || mode == V1TImode); - /* ...but GPRs can hold SIMD data on the SPE in one register. */ - if (SPE_SIMD_REGNO_P (regno) TARGET_SPE SPE_VECTOR_MODE (mode)) -return 1; - /* We cannot put non-VSX TImode or PTImode anywhere except general register and it must be able to fit within the register set. */ Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.md === --- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.md +++ gcc-4_8-test2/gcc/config/rs6000/rs6000.md @@ -9428,8 +9428,9 @@ [(set (match_operand:FMOVE64 0 nonimmediate_operand =Y,r,r,r,r,r) (match_operand:FMOVE64 1 input_operand r,Y,r,G,H,F))] ! 
TARGET_POWERPC64 -((TARGET_FPRS TARGET_SINGLE_FLOAT) - || TARGET_SOFT_FLOAT || TARGET_E500_SINGLE) +((TARGET_FPRS TARGET_DOUBLE_FLOAT) + || TARGET_SOFT_FLOAT + || (MODEmode == DDmode TARGET_E500_DOUBLE)) (gpc_reg_operand (operands[0], MODEmode) || gpc_reg_operand (operands[1], MODEmode)) #
[4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd
Hi, This patch (diff-vecdoc) is the last addition to the 4.8 PowerPC backport patch series. It simply adds some missing documentation that should have been part of one of the previous patches. I'm currently doing one more quick round of testing with the three late-addition patches, and will then be ready to commit the series. Thanks, Bill [gcc] 2014-04-03 Bill Schmidt wschm...@linux.vnet.ibm.com Back port from main line: 2014-04-01 Michael Meissner meiss...@linux.vnet.ibm.com * doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions): Document vec_vgbbd. Index: gcc-4_8-test3/gcc/doc/extend.texi === --- gcc-4_8-test3.orig/gcc/doc/extend.texi +++ gcc-4_8-test3/gcc/doc/extend.texi @@ -14132,6 +14132,9 @@ vector unsigned short vec_vclzh (vector vector int vec_vclzw (vector int); vector unsigned int vec_vclzw (vector int); +vector signed char vec_vgbbd (vector signed char); +vector unsigned char vec_vgbbd (vector unsigned char); + vector long long vec_vmaxsd (vector long long, vector long long); vector unsigned long long vec_vmaxud (vector unsigned long long,
Re: [GOOGLE] Updates SSA after VPT transofrmations in AFDO pass
looks fine. David On Thu, Apr 3, 2014 at 10:56 AM, Dehao Chen de...@google.com wrote: This patch updates SSA after VPT transformation. This is needed because compute_inline_parameters will ICE without updated SSA. Testing on-going. OK for google-4_8? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 209059) +++ gcc/auto-profile.c (working copy) @@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt free_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); rebuild_cgraph_edges (); return true; }
[GOOGLE] Updates SSA after VPT transofrmations in AFDO pass
This patch updates SSA after VPT transformation. This is needed because compute_inline_parameters will ICE without updated SSA. Testing on-going. OK for google-4_8? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 209059) +++ gcc/auto-profile.c (working copy) @@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt free_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); rebuild_cgraph_edges (); return true; }
[Fortran-CAF, committed] Add array sending support for coarrays
This patch handles assigning to coarray array (sections) from local arrays for array RHS and for scalar RHS. I have lightly tested it with libcaf_single. On the library side, I added a minimal implementation for libcaf_single, which handles only rank==1 arrays, but which otherwise seems to work. With that patch, the most common cases for sending should be handled. Missing features for sending to remote issues: character strings are not handled, and type conversion (i.e. assigning a real to an integer or similar), allocatable/pointer components of coarrays, and array vector sections are still not handled. - And, of course, reading from remote coarrays (get, pull) is not supported. Built on x86-64-gnu-linux - and committed to the branch as Rev. 209060 Tobias PS: Minimal test case to be run with gfortran -fdump-tree-original -fcoarray=single -lcaf_single: integer :: foo(5)[*] integer :: bar(5) bar = [1,2,3,4,5] foo(:)[1] = bar print *, foo foo(:)[1] = 45 print *, foo end gcc/fortran/ChangeLog.fortran-caf |9 + gcc/fortran/trans-decl.c | 15 +++- gcc/fortran/trans-intrinsic.c | 34 +-- gcc/fortran/trans.h |2 + libgfortran/ChangeLog.fortran-caf | 13 +++ libgfortran/caf/libcaf.h | 34 +++ libgfortran/caf/single.c | 67 ++ 7 files changed, 163 insertions(+), 11 deletions(-) Index: libgfortran/ChangeLog.fortran-caf === --- libgfortran/ChangeLog.fortran-caf (Revision 208931) +++ libgfortran/ChangeLog.fortran-caf (Arbeitskopie) @@ -1,3 +1,16 @@ +2014-04-03 Tobias Burnus bur...@net-b.de + + * caf/libcaf.h (descriptor_dimension, gfc_descriptor_t): New + structs. + (GFC_MAX_DIMENSIONS, GFC_DTYPE_RANK_MASK, GFC_DTYPE_TYPE_SHIFT, + GFC_DTYPE_TYPE_MASK, GFC_DTYPE_SIZE_SHIFT, GFC_DESCRIPTOR_RANK, + GFC_DESCRIPTOR_TYPE, GFC_DESCRIPTOR_SIZE): New defines. + (_gfortran_caf_send_desc, _gfortran_caf_send_desc_scalar): New + prototypes. + * caf/single.c (_gfortran_caf_send_desc, + _gfortran_caf_send_desc_scalar): New functions, supporting + rank == 1 only. 
+ 2014-03-14 Tobias Burnus bur...@net-b.de * caf/libcaf.h (caf_token_t): New typedef. Index: libgfortran/caf/libcaf.h === --- libgfortran/caf/libcaf.h (Revision 208931) +++ libgfortran/caf/libcaf.h (Arbeitskopie) @@ -58,6 +58,38 @@ caf_register_t; typedef void* caf_token_t; + +/* GNU Fortran's array descriptor. Keep in sync with libgfortran.h. */ + +typedef struct descriptor_dimension +{ + ptrdiff_t _stride; + ptrdiff_t lower_bound; + ptrdiff_t _ubound; +} +descriptor_dimension; + +typedef struct gfc_descriptor_t { + void *base_addr; + size_t offset; + ptrdiff_t dtype; + descriptor_dimension dim[]; +} gfc_descriptor_t; + + +#define GFC_MAX_DIMENSIONS 7 + +#define GFC_DTYPE_RANK_MASK 0x07 +#define GFC_DTYPE_TYPE_SHIFT 3 +#define GFC_DTYPE_TYPE_MASK 0x38 +#define GFC_DTYPE_SIZE_SHIFT 6 +#define GFC_DESCRIPTOR_RANK(desc) ((desc)-dtype GFC_DTYPE_RANK_MASK) +#define GFC_DESCRIPTOR_TYPE(desc) (((desc)-dtype GFC_DTYPE_TYPE_MASK) \ +GFC_DTYPE_TYPE_SHIFT) +#define GFC_DESCRIPTOR_SIZE(desc) ((desc)-dtype GFC_DTYPE_SIZE_SHIFT) + + + /* Linked list of static coarrays registered. */ typedef struct caf_static_t { caf_token_t token; @@ -77,6 +109,8 @@ void *_gfortran_caf_register (size_t, caf_register void _gfortran_caf_deregister (caf_token_t *, int *, char *, int); void _gfortran_send (caf_token_t, size_t, int, void *, size_t, bool); +void _gfortran_send_desc (caf_token_t, size_t, int, gfc_descriptor_t*, gfc_descriptor_t*, bool); +void _gfortran_send_desc_scalar (caf_token_t, size_t, int, gfc_descriptor_t*, void*, bool); void _gfortran_caf_sync_all (int *, char *, int); void _gfortran_caf_sync_images (int, int[], int *, char *, int); Index: libgfortran/caf/single.c === --- libgfortran/caf/single.c (Revision 208931) +++ libgfortran/caf/single.c (Arbeitskopie) @@ -149,6 +149,7 @@ _gfortran_caf_deregister (caf_token_t *token, int *stat = 0; } +/* Send scalar (or contiguous) data from buffer to a remote image. 
*/ void _gfortran_caf_send (caf_token_t token, size_t offset, @@ -161,7 +162,73 @@ _gfortran_caf_send (caf_token_t token, size_t offs } +/* Send array data from src to dest on a remote image. */ + void +_gfortran_caf_send_desc (caf_token_t token, size_t offset, + int image_id __attribute__ ((unused)), + gfc_descriptor_t *dest, gfc_descriptor_t *src, + bool asyn __attribute__ ((unused))) +{ + fprintf (stderr, COARRAY ERROR: Array communication + [_gfortran_caf_send_desc] not yet implemented for rank /= 0); + exit (EXIT_FAILURE); + size_t i, j; + size_t size = GFC_DESCRIPTOR_SIZE (dest); + int rank = GFC_DESCRIPTOR_RANK (dest);
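The dtype layout implied by the defines in the libcaf.h hunk above (rank in the low three bits, type in the next three, element size in the remaining bits) can be exercised with a short C sketch. The mask and shift values are copied from the patch; the macro names without the GFC_ prefix and the make_dtype helper are hypothetical and exist only to build a value to decode.

```c
#include <stddef.h>

/* Values as in the libcaf.h hunk above.  */
#define DTYPE_RANK_MASK  0x07
#define DTYPE_TYPE_SHIFT 3
#define DTYPE_TYPE_MASK  0x38
#define DTYPE_SIZE_SHIFT 6

/* Hypothetical helper: pack rank, type code and element size.  */
ptrdiff_t
make_dtype (int rank, int type, size_t elem_size)
{
  return ((ptrdiff_t) elem_size << DTYPE_SIZE_SHIFT)
         | ((ptrdiff_t) type << DTYPE_TYPE_SHIFT)
         | (ptrdiff_t) rank;
}

/* Low three bits: array rank.  */
int
dtype_rank (ptrdiff_t dtype)
{
  return (int) (dtype & DTYPE_RANK_MASK);
}

/* Bits 3..5: type code.  */
int
dtype_type (ptrdiff_t dtype)
{
  return (int) ((dtype & DTYPE_TYPE_MASK) >> DTYPE_TYPE_SHIFT);
}

/* Remaining high bits: element size in bytes.  */
size_t
dtype_size (ptrdiff_t dtype)
{
  return (size_t) (dtype >> DTYPE_SIZE_SHIFT);
}
```

For a rank-1 array of 4-byte elements with type code 1, the packed value is (4 << 6) | (1 << 3) | 1 = 265, and the three accessors recover 1, 1 and 4 respectively.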
[Fortran-caf] Merge from the trunk to the branch
Committed to the fortran-caf branch as Rev. 209062 Tobias
[PR target/60657] [P1 regression] Fix operand predicates for a few ARM insns
As noted in the PR, there are a few insns in the ARM backend which use const_int_operand as a predicate, but which have constraints like I or M. With the predicate accepting all constants, it's possible for a pass such as combine to create an insn where the constant operand matches the loose predicate, but will not match the tighter constraint. With no other alternatives to choose from, lra/reload won't be able to fix up the insn. The right way (IMHO) is to tighten the predicate in these cases. This patch introduces const_int_I_operand and const_int_M_operand. Bootstrapped on arm7l-unknown-linux-gnu (without java which fails for unrelated reasons) and regression tested. One system didn't have GDB installed, so the atomic and guality tests were noisy and due to time constraints, I haven't re-run them. OK for the trunk? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8d0c021..6c170d3 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,15 @@ +2014-04-03 Jeff Law l...@redhat.com + +PR target/60657 + * arm/predicates.md (const_int_I_operand): New predicate. + (const_int_M_operand): Similarly. + * arm/arm.md (insv_zero): Use const_int_M_operand instead of + const_int_operand. + (insv_t2, extv_reg, extzv_t2): Likewise. + (load_multiple_with_writeback): Similarly for const_int_I_operand. + (pop_multiple_with_writeback_and_return): Likewise. 
+ (vfp_pop_multiple_with_writeback): Likewise + 2014-04-03 Richard Biener rguent...@suse.de * tree-streamer.h (struct streamer_tree_cache_d): Add next_idx diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 4df24a2..4b81ee2 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -2784,8 +2784,8 @@ (define_insn insv_zero [(set (zero_extract:SI (match_operand:SI 0 s_register_operand +r) - (match_operand:SI 1 const_int_operand M) - (match_operand:SI 2 const_int_operand M)) + (match_operand:SI 1 const_int_M_operand M) + (match_operand:SI 2 const_int_M_operand M)) (const_int 0))] arm_arch_thumb2 bfc%?\t%0, %2, %1 @@ -2797,8 +2797,8 @@ (define_insn insv_t2 [(set (zero_extract:SI (match_operand:SI 0 s_register_operand +r) - (match_operand:SI 1 const_int_operand M) - (match_operand:SI 2 const_int_operand M)) + (match_operand:SI 1 const_int_M_operand M) + (match_operand:SI 2 const_int_M_operand M)) (match_operand:SI 3 s_register_operand r))] arm_arch_thumb2 bfi%?\t%0, %3, %2, %1 @@ -4480,8 +4480,8 @@ (define_insn *extv_reg [(set (match_operand:SI 0 s_register_operand =r) (sign_extract:SI (match_operand:SI 1 s_register_operand r) - (match_operand:SI 2 const_int_operand M) - (match_operand:SI 3 const_int_operand M)))] + (match_operand:SI 2 const_int_M_operand M) + (match_operand:SI 3 const_int_M_operand M)))] arm_arch_thumb2 sbfx%?\t%0, %1, %3, %2 [(set_attr length 4) @@ -4493,8 +4493,8 @@ (define_insn extzv_t2 [(set (match_operand:SI 0 s_register_operand =r) (zero_extract:SI (match_operand:SI 1 s_register_operand r) - (match_operand:SI 2 const_int_operand M) - (match_operand:SI 3 const_int_operand M)))] + (match_operand:SI 2 const_int_M_operand M) + (match_operand:SI 3 const_int_M_operand M)))] arm_arch_thumb2 ubfx%?\t%0, %1, %3, %2 [(set_attr length 4) @@ -12073,7 +12073,7 @@ [(match_parallel 0 load_multiple_operation [(set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + 
(match_operand:SI 2 const_int_I_operand I))) (set (match_operand:SI 3 s_register_operand =rk) (mem:SI (match_dup 1))) ])] @@ -12102,7 +12102,7 @@ [(return) (set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + (match_operand:SI 2 const_int_I_operand I))) (set (match_operand:SI 3 s_register_operand =rk) (mem:SI (match_dup 1))) ])] @@ -12155,7 +12155,7 @@ [(match_parallel 0 pop_multiple_fp [(set (match_operand:SI 1 s_register_operand +rk) (plus:SI (match_dup 1) - (match_operand:SI 2 const_int_operand I))) + (match_operand:SI 2 const_int_I_operand I))) (set (match_operand:DF 3 vfp_hard_register_operand ) (mem:DF (match_dup 1)))])] TARGET_32BIT TARGET_HARD_FLOAT TARGET_VFP diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index ce5c9a8..6273e88 100644
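The mismatch between a loose predicate and a tight constraint can be modelled in a few lines of C. This is a toy model, not the machine description: it assumes the ARM M constraint accepts integers in the range 0 to 32, as the GCC machine-constraint documentation describes it, and all function names are hypothetical.

```c
#include <stdbool.h>

typedef bool (*predicate_fn) (long);

/* Models const_int_operand: any integer constant is accepted.  */
bool
loose_predicate (long value)
{
  (void) value;
  return true;
}

/* Models the M constraint, assumed here to mean 0 <= value <= 32.  */
bool
constraint_M (long value)
{
  return value >= 0 && value <= 32;
}

/* Models a tightened predicate: accept exactly what M accepts.  */
bool
tight_predicate (long value)
{
  return constraint_M (value);
}

/* True when a pass could form an insn with this operand (the predicate
   says yes) that the only alternative can never satisfy.  */
bool
can_create_unfixable_insn (predicate_fn pred, long value)
{
  return pred (value) && !constraint_M (value);
}
```

With the loose predicate, a value such as 33 is accepted and then cannot be fixed up, since there is no other alternative. With the tightened predicate the insn is never formed in the first place.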
[RFA jit] clear timevar_enable in timevar_print
The timevar module doesn't properly re-initialize timevar_print between invocations of the compiler. In particular, if the compiler is put into verbose mode, and subsequently put back into quiet mode, then timevar_enable is never set to false -- leading to unwanted timevar display. This patch fixes the problem by clearing timevar_enable in timevar_print. --- gcc/ChangeLog.jit | 4 gcc/timevar.c | 2 ++ 2 files changed, 6 insertions(+) diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit index 5145cf9..6ef9794 100644 --- a/gcc/ChangeLog.jit +++ b/gcc/ChangeLog.jit @@ -1,5 +1,9 @@ 2014-03-24 Tom Tromey tro...@redhat.com + * timevar.c (timevar_print): Clear timevar_enable. + +2014-03-24 Tom Tromey tro...@redhat.com + * toplev.c (general_init): Initialize input_location. * input.c (input_location): Initialize to UNKNOWN_LOCATION. diff --git a/gcc/timevar.c b/gcc/timevar.c index 2ceee51..5e4c4c49 100644 --- a/gcc/timevar.c +++ b/gcc/timevar.c @@ -491,6 +491,8 @@ timevar_print (FILE *fp) if (!timevar_enable) return; + // Clean up for a possible next run. + timevar_enable = false; /* Update timing information in case we're calling this from GDB. */ -- 1.9.0
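The state leak described above is easy to reproduce in miniature. The sketch below is a hypothetical model with made-up names (timing_enabled, timing_init, timing_print), not the real timevar module: a verbose run sets the flag, and the print routine clears it again so that a later quiet run in the same process prints nothing.

```c
#include <stdbool.h>
#include <stdio.h>

static bool timing_enabled;

/* Called only when a run asks for timing output.  */
void
timing_init (void)
{
  timing_enabled = true;
}

/* Print the report if enabled.  Returns 1 if something was printed,
   0 otherwise.  Clearing the flag resets state for a possible next run.  */
int
timing_print (FILE *fp)
{
  if (!timing_enabled)
    return 0;
  timing_enabled = false;
  fputs ("timing report\n", fp);
  return 1;
}
```

Without the reset, a second call to timing_print after a single timing_init would print again, which corresponds to the unwanted display seen when the JIT runs the compiler repeatedly in one process.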
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
+/* If E does not lead to a thunk, simply redirect it to N. Otherwise create + one or more equivalent thunks for N and redirect E to the first in the + chain. */ + +void +redirect_edge_duplicating_thunks (struct cgraph_edge *e, struct cgraph_node *n, + bitmap args_to_skip) +{ + cgraph_node *orig_to = cgraph_function_or_thunk_node (e->callee); + if (orig_to->thunk.thunk_p) +n = duplicate_thunk_for_node (orig_to, n, args_to_skip); Is there anything that would prevent us from creating a new thunk for each call? No, given how late we have discovered it, it probably only happens very rarely. Moreover, since you have plans to always inline only directly called thunks for the next release, which should be the ultimate solution, I did not think it was necessary or even appropriate at this stage. A lot of code iterates over thunks/aliases and expects this to be a cheap operation. We thus need to be sure we won't create very many thunks or aliases of a given function internally. In order to trigger quadratic behaviour here, we only need a single function call used very often in a big project, like mozilla, to create uncontrolled numbers of thunks. I would suggest just walking the existing thunks before creating a new one, looking for one matching our needs. The same code is used when making local aliases. This change is pre-approved. Also I think you need to avoid this logic when the THIS parameter is being optimized out (i.e. it is part of skip_args). You are of course right. However, skipping the creation of a new thunk when we are also removing parameter this leads to verification errors again, so I had to also teach the verifier that this case is actually OK. That is fine with me. Moreover, although it seems that currently all non-this_adjusting thunks are expanded before IPA-CP runs, I made sure the skipping logic checked that flag. Yes, we only keep the simple thunks in non-lowered form, but I do not see how it makes a difference for you. 
Incidentally, the two original testcases remove the this parameter, so I added a new one, which also shows how current trunk miscompiles stuff. Unfortunately, at the moment it relies on speculative edges and so when IPA-CP correctly redirects calls to a thunk, inlining gives up and removes the edge, so the IPA-CP transformation is not run-time checked. However, the cgraph verifier does see the edge before that happens and is OK with it. You can probably play with anonymous namespaces and final flags to get it devirtualized unconditionally. I have also taken the liberty of removing an extra call to cgraph_function_or_thunk_node (clone_of_p calls it too) and a clearly obsolete comment from verify_edge_corresponds_to_fndecl. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2014-03-31 Martin Jambor mjam...@suse.cz * cgraph.h (cgraph_clone_node): New parameter added to declaration. Adjust all callers. * cgraph.c (clone_of_p): Also return true if thunks match. (verify_edge_corresponds_to_fndecl): Removed extraneous call to cgraph_function_or_thunk_node and an obsolete comment. * cgraphclones.c (build_function_type_skip_args): Moved upwards in the file. (build_function_decl_skip_args): Likewise. (set_new_clone_decl_and_node_flags): New function. (duplicate_thunk_for_node): Likewise. (redirect_edge_duplicating_thunks): Likewise. (cgraph_clone_node): New parameter args_to_skip, pass it to redirect_edge_duplicating_thunks which is called instead of cgraph_redirect_edge_callee. (cgraph_create_virtual_clone): Pass args_to_skip to cgraph_clone_node, moved setting of a lot of flags to set_new_clone_decl_and_node_flags. testsuite/ * g++.dg/ipa/pr60640-1.C: New test. * g++.dg/ipa/pr60640-2.C: Likewise. * g++.dg/ipa/pr60640-3.C: Likewise. OK, with the change above. Honza
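To make the pre-approved suggestion in this thread concrete (walk the existing thunks of the clone and reuse a matching one before creating a new thunk), here is a minimal standalone sketch. It is not GCC code: Node, Graph, make_thunk and find_or_create_thunk are hypothetical names invented for illustration, and the matching criterion (same this_adjusting flag and same fixed offset) is an assumption about what "matching our needs" could mean.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-in for a call-graph node. This is NOT
// GCC's cgraph_node; all field names are invented for illustration.
struct Node {
  bool is_thunk = false;
  bool this_adjusting = false;
  long fixed_offset = 0;       // adjustment the thunk applies to `this`
  Node *target = nullptr;      // function a thunk forwards to
  std::vector<Node *> thunks;  // thunks that forward to this node
};

struct Graph {
  std::vector<std::unique_ptr<Node>> nodes;  // owns every node

  Node *make_function() {
    nodes.push_back(std::unique_ptr<Node>(new Node()));
    return nodes.back().get();
  }

  // Create a this-adjusting thunk forwarding to `target`.
  Node *make_thunk(Node *target, long offset) {
    Node *t = make_function();
    t->is_thunk = true;
    t->this_adjusting = true;
    t->fixed_offset = offset;
    t->target = target;
    target->thunks.push_back(t);
    return t;
  }

  // Return a thunk on `target` equivalent to `orig`. Existing thunks are
  // walked first, so repeated redirections of calls through the same
  // original thunk do not keep adding new thunks to `target`.
  Node *find_or_create_thunk(const Node &orig, Node *target) {
    assert(orig.is_thunk);
    for (Node *t : target->thunks)
      if (t->this_adjusting == orig.this_adjusting
          && t->fixed_offset == orig.fixed_offset)
        return t;  // reuse the matching thunk
    Node *fresh = make_thunk(target, orig.fixed_offset);
    fresh->this_adjusting = orig.this_adjusting;
    return fresh;
  }
};
```

With this shape, the number of thunks on a clone is bounded by the number of distinct thunk kinds rather than by the number of call sites, which is the property the review asks for. The case where the this parameter is removed from the clone is deliberately left out of the sketch.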
Re: [4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd
On Thu, 2014-04-03 at 13:01 -0500, Bill Schmidt wrote: I'm currently doing one more quick round of testing with the three late-addition patches, and will then be ready to commit the series. Final tests have all passed (BE Linux, LE Linux, BE AIX). Thanks, Bill
Re: [PATCH] PR debug/57519 - Emit DW_TAG_imported_declaration under the right class for 'using' statements in a class
ChangeLog: 2014-03-25 Siva Chandra Reddy sivachan...@google.com Fix PR debug/57519 /cp PR debug/57519 * class.c (handle_using_decl): Pass the correct scope to cp_emit_debug_info_for_using. testsuite/ PR debug/57519 * g++.dg/debug/dwarf2/imported-decl-2.C: New testcase. This looks right to me, but you'll need approval from a C++ front end maintainer. Thanks! -cary
RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store
From: Andreas Schwab [mailto:sch...@suse.de] Please add m68k-*-*. From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Rainer Orth Just omit the { target *-*-* } completely, also a few more times. Please find attached an updated patch. (attachment: gcc32rm-84.3.2.part1.diff)
RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Rainer Orth Just omit the { target *-*-* } completely, also a few more times. Please find attached an updated patch. Best regards, Thomas (attachment: gcc32rm-84.3.2.part2.diff)
Re: [gomp4] Add tables generation
Hi! On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt ber...@codesourcery.com wrote: On 04/02/2014 10:36 AM, Thomas Schwinge wrote: I see regressions in the libgomp testsuite for configurations where offloading is not enabled: spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe /tmp/ccGnT0ei.o: In function `main': for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__' collect2: error: ld returned 1 exit status I suppose that's because [...] Workaround committed in r209015: libgcc/ * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to NULL. The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Yes, it does, thanks! Please revert my patch when committing yours. Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please move the entries for your recent commits from the ChangeLog file(s) to the respective ChangeLog.gomp one(s). Regards, Thomas
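For readers unfamiliar with the technique discussed above, here is a minimal sketch of a weak reference, assuming GCC or Clang on an ELF target such as x86_64 Linux. The symbol hypothetical_offload_table and the function offload_table_present are invented for illustration; this is not the actual patch, and the real __OPENMP_TARGET__ declaration may well look different.

```cpp
#include <cassert>

// Hypothetical table symbol, declared weak. If no object file in the link
// defines it, an ELF linker resolves the reference to address zero instead
// of failing with "undefined reference".
extern "C" const void *const hypothetical_offload_table[]
    __attribute__((weak));

// Report whether some object in the link actually provided the table.
// Because the declaration is weak, the compiler must not assume that the
// address is non-null.
bool offload_table_present() {
  return hypothetical_offload_table != nullptr;
}
```

Code that consumes such a table would check offload_table_present() first and skip registration when it returns false, so configurations without offloading still link without needing a dummy definition.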