Re: [x86, 2/n] Replace builtins with vector extensions
On Tue, Nov 4, 2014 at 9:31 PM, Marc Glisse marc.gli...@inria.fr wrote: Ping? Uh, yes, LGTM. (I was under impression that I already OK'd this relatively non-controversial patch. The effect of having too much open tasks in parallel, I guess.) Thanks, Uros. https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01808.html On Sat, 18 Oct 2014, Marc Glisse wrote: Hello, this time, +-* for 128 bit integer vectors. I am using an unsigned type so the compiler knows that we expect wrapping. I don't know why Intel's description of mullo insists that the multiplication is signed, that only matters for the high part... Next parts (waiting for approval for this one) should be: - same thing with 256 and 512 bit integer vectors - | ^ (integer only) Maybe (or it can wait until the next release): - == abs min max (integer only) 2014-10-20 Marc Glisse marc.gli...@inria.fr * config/i386/emmintrin.h (__v2du, __v4su, __v8hu, __v16qu): New typedefs. (_mm_add_epi8, _mm_add_epi16, _mm_add_epi32, _mm_add_epi64, _mm_sub_epi8, _mm_sub_epi16, _mm_sub_epi32, _mm_sub_epi64, _mm_mullo_epi16): Use vector extensions instead of builtins. * config/i386/smmintrin.h (_mm_mullo_epi32): Likewise. -- Marc Glisse
RE: [PATCH][6/n] Merge from match-and-simplify, make forwprop fold all stmts
The patch leads to big regression for float operators on target without hard fpu support due to register shuffle. Please refer https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63743 for more detail. Thanks! -Zhenqiang -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Jeff Law Sent: Saturday, October 25, 2014 1:48 AM To: Richard Biener; gcc-patches@gcc.gnu.org Cc: Jakub Jelinek Subject: Re: [PATCH][6/n] Merge from match-and-simplify, make forwprop fold all stmts On 10/24/14 07:16, Richard Biener wrote: This patch makes GIMPLE forwprop fold all statements, following single-use SSA edges only (as suggested by Jeff and certainly how this will regress the least until we replace manual simplification code that does not restrict itself this way). forwprop is run up to 4 times at the moment (once only for -Og, not at all for -O0), which still seems reasonable. IMHO the forwprop pass immediately after inlining is somewhat superfluous, it was added there just for its ADDR_EXPR propagation. We should eventually split this pass into two. Note that just folding what we propagated into (like the SSA propagators do during substitute-and-fold phase) will miss cases where we propagate into a stmt feeding the one we could simplify. Unless we always fold all single-use (and their use) stmts we have to fold everything from time to time. Changing how / when we fold stuff is certainly sth to look after with fold_stmt now being able to follow SSA edges. Bootstrapped on x86_64-unknown-linux-gnu, testing still in progress. From earlier testing I remember I need to adjust a few testcases that don't expect the early folding - notably two strlenopt cases (previously XFAILed but then PASSed again). I also expect to massage the single-use heuristic as I get to merging the patterns I added for the various forwprop manual pattern matchings to trunk (a lot of them do not restrict themselves this way). Does this otherwise look ok? 
Thanks, Richard. 2014-10-24 Richard Biener rguent...@suse.de * tree-ssa-forwprop.c: Include tree-cfgcleanup.h and tree-into-ssa.h. (lattice): New global. (fwprop_ssa_val): New function. (fold_all_stmts): Likewise. (pass_forwprop::execute): Finally fold all stmts. Seems reasonable. After all, we can iterate on the single-use heuristic. jeff
Re: [AArch64, Patch] Restructure arm_neon.h vector types's implementation(Take 2).
On 05/11/14 08:28, Tejas Belagod wrote: On 03/11/14 16:49, Marcus Shawcroft wrote: On 1 October 2014 09:26, Tejas Belagod tejas.bela...@arm.com wrote: Hi, Returning to this old thread, https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02285.html here is a patch after a few rounds of review comments, specifically: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02248.html https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02285.html https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00566.html https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00324.html Sorry for the delay in respinning this patch. Tested on aarch64-none-elf. OK for trunk? Thanks, Tejas. Changelog: 2014-10-01 Tejas Belagod tejas.bela...@arm.com * config/aarch64/aarch64-builtins.c (aarch64_build_scalar_type): Remove. (aarch64_scalar_builtin_types, aarch64_simd_type, aarch64_simd_type, aarch64_mangle_builtin_scalar_type, aarch64_mangle_builtin_vector_type, aarch64_mangle_builtin_type, aarch64_simd_builtin_std_type, aarch64_lookup_simd_builtin_type, aarch64_simd_builtin_type, aarch64_init_simd_builtin_types, aarch64_init_simd_builtin_scalar_types): New. (aarch64_init_simd_builtins): Refactor. (aarch64_init_crc32_builtins): Fixup with qualifier. * config/aarch64/aarch64-protos.h (aarch64_mangle_builtin_type): Export. * config/aarch64/aarch64-simd-builtin-types.def: New. * config/aarch64/aarch64.c (aarch64_simd_mangle_map): Remove. (aarch64_mangle_type): Refactor. * config/aarch64/arm_neon.h: Declare vector types based on internal types. * config/aarch64/t-aarch64: Update dependency. OK /Marcus Thanks. Rebased, retested on aarch64-none-elf and committed as r217114. Tejas. Sorry, forgot to attach the rebased patch - here you go. 
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 527445c..c0881e6 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -363,257 +363,335 @@ static GTY(()) tree aarch64_builtin_decls[AARCH64_BUILTIN_MAX];
 #define NUM_DREG_TYPES 6
 #define NUM_QREG_TYPES 6
 
-/* Return a tree for a signed or unsigned argument of either
-   the mode specified by MODE, or the inner mode of MODE.  */
-tree
-aarch64_build_scalar_type (machine_mode mode,
-                           bool unsigned_p,
-                           bool poly_p)
-{
-#undef INT_TYPES
-#define INT_TYPES \
-  AARCH64_TYPE_BUILDER (QI) \
-  AARCH64_TYPE_BUILDER (HI) \
-  AARCH64_TYPE_BUILDER (SI) \
-  AARCH64_TYPE_BUILDER (DI) \
-  AARCH64_TYPE_BUILDER (EI) \
-  AARCH64_TYPE_BUILDER (OI) \
-  AARCH64_TYPE_BUILDER (CI) \
-  AARCH64_TYPE_BUILDER (XI) \
-  AARCH64_TYPE_BUILDER (TI) \
-
-/* Statically declare all the possible types we might need.  */
-#undef AARCH64_TYPE_BUILDER
-#define AARCH64_TYPE_BUILDER(X) \
-  static tree X##_aarch64_type_node_p = NULL; \
-  static tree X##_aarch64_type_node_s = NULL; \
-  static tree X##_aarch64_type_node_u = NULL;
-
-  INT_TYPES
-
-  static tree float_aarch64_type_node = NULL;
-  static tree double_aarch64_type_node = NULL;
-
-  gcc_assert (!VECTOR_MODE_P (mode));
-
-/* If we've already initialised this type, don't initialise it again,
-   otherwise ask for a new type of the correct size.  */
-#undef AARCH64_TYPE_BUILDER
-#define AARCH64_TYPE_BUILDER(X) \
-  case X##mode: \
-    if (unsigned_p) \
-      return (X##_aarch64_type_node_u \
-              ? X##_aarch64_type_node_u \
-              : X##_aarch64_type_node_u \
-                  = make_unsigned_type (GET_MODE_PRECISION (mode))); \
-    else if (poly_p) \
-      return (X##_aarch64_type_node_p \
-              ? X##_aarch64_type_node_p \
-              : X##_aarch64_type_node_p \
-                  = make_unsigned_type (GET_MODE_PRECISION (mode))); \
-    else \
-      return (X##_aarch64_type_node_s \
-              ? X##_aarch64_type_node_s \
-              : X##_aarch64_type_node_s \
-                  = make_signed_type (GET_MODE_PRECISION (mode))); \
-    break;
+/* Internal scalar builtin types.  These types are used to support
+   neon intrinsic builtins.  They are _not_ user-visible types.  Therefore
+   the mangling for these types are implementation defined.  */
+const char *aarch64_scalar_builtin_types[] = {
+  "__builtin_aarch64_simd_qi",
+  "__builtin_aarch64_simd_hi",
+  "__builtin_aarch64_simd_si",
+  "__builtin_aarch64_simd_sf",
+  "__builtin_aarch64_simd_di",
+  "__builtin_aarch64_simd_df",
+  "__builtin_aarch64_simd_poly8",
+  "__builtin_aarch64_simd_poly16",
+  "__builtin_aarch64_simd_poly64",
+  "__builtin_aarch64_simd_poly128",
+  "__builtin_aarch64_simd_ti",
+  "__builtin_aarch64_simd_uqi",
+  "__builtin_aarch64_simd_uhi",
+  "__builtin_aarch64_simd_usi",
+  "__builtin_aarch64_simd_udi",
+  "__builtin_aarch64_simd_ei",
+  "__builtin_aarch64_simd_oi",
+  "__builtin_aarch64_simd_ci",
+  "__builtin_aarch64_simd_xi",
+
RE: [Ping] [PATCH, 9/10] aarch64: generate conditional compare instructions
I have retested all the ccmp patches. Bootstrap and no make check regression on X86-64. Bootstrap and no make check regression on AARCH64 qemu. OK for trunk? Thanks! -Zhenqiang

-Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen Sent: Monday, October 27, 2014 3:50 PM To: 'Richard Henderson' Cc: gcc-patches@gcc.gnu.org Subject: RE: [Ping] [PATCH, 9/10] aarch64: generate conditional compare instructions

-Original Message- From: Richard Henderson [mailto:r...@redhat.com] Sent: Sunday, October 12, 2014 4:46 AM To: Zhenqiang Chen; gcc-patches@gcc.gnu.org Subject: Re: [Ping] [PATCH, 9/10] aarch64: generate conditional compare instructions

On 09/22/2014 11:46 PM, Zhenqiang Chen wrote:
+static bool
+aarch64_convert_mode (rtx* op0, rtx* op1, int unsignedp)
+{
+  enum machine_mode mode;
+
+  mode = GET_MODE (*op0);
+  if (mode == VOIDmode)
+    mode = GET_MODE (*op1);
+
+  if (mode == QImode || mode == HImode)
+    {
+      *op0 = convert_modes (SImode, mode, *op0, unsignedp);
+      *op1 = convert_modes (SImode, mode, *op1, unsignedp);
+    }
+  else if (mode != SImode && mode != DImode)
+    return false;
+
+  return true;
+}

Hum. I'd rather not replicate too much of the expander logic here. We could avoid that by using struct expand_operand, create_input_operand et al, then expand_insn. That does require that the target hooks be given trees rather than rtl as input.

I had tried to use tree/gimple as input, but the code was more complex than the current one. And comments in https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02027.html

I suspect it might be better to just hoist any preparation operations above the entire conditional compare sequence, so that by the time we start the ccmp expansion we're dealing with operands that are in the 'natural' sizes for the machine (breaking up the conditional compare sequence for what are almost side-effect operations sounds like a source of potential bugs).
This would also ensure that the back-end can safely re-order at least some comparison operations if this leads a workable conditional compare sequence. I think the mode conversion (to SImode or DImode) is target dependent. Thanks! -Zhenqiang
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/03/2014 05:27 PM, Marek Polacek wrote: Another shot at optimizing redundant UBSAN_NULL statements. This time we walk the dominator tree - that should result in more effective optimization - and keep a list of UBSAN_NULL statements that dominate the current block, see the comment before sanopt_optimize_walker. Marek, A general question - have you considered coding this as a dataflow loop instead of dominator walk? That would allow to also remove checks for variables defined via PHI nodes provided that all arguments of PHI have already been checked. -Y
Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass
On Sat, Nov 1, 2014 at 4:29 AM, Jeff Law l...@redhat.com wrote: On 09/30/14 03:22, Bin Cheng wrote: 2014-09-30 Bin Chengbin.ch...@arm.com Mike Stumpmikest...@comcast.net * timevar.def (TV_SCHED_FUSION): New time var. * passes.def (pass_sched_fusion): New pass. * config/arm/arm.c (TARGET_SCHED_FUSION_PRIORITY): New. (extract_base_offset_in_addr, fusion_load_store): New. (arm_sched_fusion_priority): New. (arm_option_override): Disable scheduling fusion on non-armv7 processors by default. * sched-int.h (struct _haifa_insn_data): New field. (INSN_FUSION_PRIORITY, FUSION_MAX_PRIORITY, sched_fusion): New. * sched-rgn.c (rest_of_handle_sched_fusion): New. (pass_data_sched_fusion, pass_sched_fusion): New. (make_pass_sched_fusion): New. * haifa-sched.c (sched_fusion): New. (insn_cost): Handle sched_fusion. (priority): Handle sched_fusion by calling target hook. (enum rfs_decision): New enum value. (rfs_str): New element for RFS_FUSION. (rank_for_schedule): Support sched_fusion. (schedule_insn, max_issue, prune_ready_list): Handle sched_fusion. (schedule_block, fix_tick_ready): Handle sched_fusion. * common.opt (flag_schedule_fusion): New. * tree-pass.h (make_pass_sched_fusion): New. * target.def (fusion_priority): New. * doc/tm.texi.in (TARGET_SCHED_FUSION_PRIORITY): New. * doc/tm.texi: Regenerated. * doc/invoke.texi (-fschedule-fusion): New. gcc/testsuite/ChangeLog 2014-09-30 Bin Chengbin.ch...@arm.com * gcc.target/arm/ldrd-strd-pair-1.c: New test. * gcc.target/arm/vfp-1.c: Improve scanning string. sched-fusion-20140929.txt Index: gcc/doc/tm.texi === --- gcc/doc/tm.texi (revision 215662) +++ gcc/doc/tm.texi (working copy) @@ -6677,6 +6677,29 @@ This hook is called by tree reassociator to determ parallelism required in output calculations chain. @end deftypefn +@deftypefn {Target Hook} void TARGET_SCHED_FUSION_PRIORITY (rtx_insn *@var{insn}, int @var{max_pri}, int *@var{fusion_pri}, int *@var{pri}) +This hook is called by scheduling fusion pass. 
It calculates fusion +priorities for each instruction passed in by parameter. The priorities +are returned via pointer parameters. + +@var{insn} is the instruction whose priorities need to be calculated. +@var{max_pri} is the maximum priority can be returned in any cases. +@var{fusion_pri} is the pointer parameter through which @var{insn}'s +fusion priority should be calculated and returned. +@var{pri} is the pointer parameter through which @var{insn}'s priority +should be calculated and returned. + +Same @var{fusion_pri} should be returned for instructions which should +be scheduled together. Different @var{pri} should be returned for +instructions with same @var{fusion_pri}. All instructions will be +scheduled according to the two priorities. @var{fusion_pri} is the major +sort key, @var{pri} is the minor sort key. All priorities calculated +should be between 0 (exclusive) and @var{max_pri} (inclusive). To avoid +false dependencies, @var{fusion_pri} of instructions which need to be +scheduled together should be smaller than @var{fusion_pri} of irrelevant +instructions. +@end deftypefn + @node Sections @section Dividing the Output into Sections (Texts, Data, @dots{}) @c the above section title is WAY too long. maybe cut the part between So I think we need to clarify that this hook is useful when fusing to related insns, but which don't have a data dependency. Somehow we need to describe that the insns to be fused should have the same (or +-1) priority. It may be useful to use code from the ARM implementation to show how to use this to pair up loads as an example. It may also be useful to refer to the code which reorders insns in the ready queue for cases where we want to fuse two truly independent insns. + if (sched_fusion) +{ + /* The instruction that has the same fusion priority as the last +instruction is the instruction we picked next. If that is not +the case, we sort ready list firstly by fusion priority, then +by priority, and at last by INSN_LUID. 
*/ + int a = INSN_FUSION_PRIORITY (tmp); + int b = INSN_FUSION_PRIORITY (tmp2); + int last = -1; + + if (last_nondebug_scheduled_insn + && !NOTE_P (last_nondebug_scheduled_insn) + && BLOCK_FOR_INSN (tmp) + == BLOCK_FOR_INSN (last_nondebug_scheduled_insn)) + last = INSN_FUSION_PRIORITY (last_nondebug_scheduled_insn); + + if (a != last && b != last) + { + if (a == b) + { + a = INSN_PRIORITY (tmp); + b = INSN_PRIORITY (tmp2); + } + if
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 12:19:22PM +0300, Yury Gribov wrote: On 11/03/2014 05:27 PM, Marek Polacek wrote: Another shot at optimizing redundant UBSAN_NULL statements. This time we walk the dominator tree - that should result in more effective optimization - and keep a list of UBSAN_NULL statements that dominate the current block, see the comment before sanopt_optimize_walker. Marek, A general question - have you considered coding this as a dataflow loop instead of dominator walk? That would allow to also remove checks for variables defined via PHI nodes provided that all arguments of PHI have already been checked. I'd be afraid that we'd turn sanopt into another var-tracking that way, with possibly huge hash tables being copied on write, merging of the tables etc., with big memory and time requirements, having to add --param limits to give up if the sum size of the tables go over certain limit. Or can you explain how this problem is different from the var-tracking problem? The way Marek has coded it up is pretty cheap optimization. BTW, as discussed privately with Marek last time, we probably want to optimize UBSAN_NULL (etc.) only if -fno-sanitize-recover=null (etc.) or if location_t is the same, otherwise such optimizations lead to only one problem being reported instead of all of them. Jakub
Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target
Hi, Having stage1 close to end, may we make some decision regarding this patch? Having a couple of working variants, may we choose and use one of them? Thanks, Ilya 2014-07-15 17:38 GMT+04:00 Uros Bizjak ubiz...@gmail.com: On Tue, Jul 15, 2014 at 1:01 PM, Ilya Enkovich enkovich@gmail.com wrote: On 15 Jul 10:42, Uros Bizjak wrote: On Tue, Jul 15, 2014 at 10:25 AM, Ilya Enkovich enkovich@gmail.com wrote: Also fully restrict xmm8-15 does not seem right. It is just costly but not fully disallowed. As said earlier, you can try Ya*x as a constraint. I tried it. It does not seem to affect allocation much. I do not see any gain on targeted tests. Strange, because the documentation claims: '*' Says that the following character should be ignored when choosing register preferences. '*' has no effect on the meaning of the constraint as a constraint, and no effect on reloading. For LRA '*' additionally disparages slightly the alternative if the following character matches the operand. Let me rethink this a bit. Perhaps we could reconsider Jakub's proposal with Ya,!x (with two alternatives). IIRC this approach was needed for some MMX alternatives, where we didn't want RA to allocate a MMX register when the value could be passed in integer regs, but the value was still allowed in MMX register. That is what my patch already does, but with '?' instead of '!'. Yes, I know. The problem is that Ya*x type conditional allocation worked OK in the past for not preferred, but still allowed regclass registers. There are several patterns in i386.md that live by this premise, including movsf_internal and movdf_internal. If this approach doesn't work anymore, then we have to either figure out what is the reason, or invent a new strategy that will be applicable to all cases. Can you please post a small test that illustrates the case where Ya,!x works, but Ya*x doesn't? It's hard to compose a small testcase which will have SSE4 instructions generated with required register usage. 
I use tcpjumbo test from TCPmark for initial check of how my patch works. This test has a lot of pmovzxwd instructions generated and many of them use xmm8-15. I tried two versions of a simple patch which modifies only pmovzxwd instruction. Patch1: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d907353..6b03b72 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -11852,10 +11852,10 @@ (set_attr mode OI)]) (define_insn sse4_1_codev4hiv4si2 - [(set (match_operand:V4SI 0 register_operand =x) + [(set (match_operand:V4SI 0 register_operand =Yr,!x) (any_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 1 nonimmediate_operand xm) + (match_operand:V8HI 1 nonimmediate_operand Yr,!xm) (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)]] TARGET_SSE4_1 Patch2: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d907353..b3721c4 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -11852,10 +11852,10 @@ (set_attr mode OI)]) (define_insn sse4_1_codev4hiv4si2 - [(set (match_operand:V4SI 0 register_operand =x) + [(set (match_operand:V4SI 0 register_operand =Yr*x) (any_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 1 nonimmediate_operand xm) + (match_operand:V8HI 1 nonimmediate_operand Yr*xm) (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)]] TARGET_SSE4_1 Here are results of looking for pmovzxwd in resulting binaries: #objdump -d tcpjumbo-orig | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 76 #objdump -d tcpjumbo-patch1 | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 0 #objdump -d tcpjumbo-patch2 | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 76 Therefore I make a conclusion that Yr*x does not really differ much from x. 
Just FTR: Using Yr,*x is also a viable option: #objdump -d tcpjumbo-patch3 | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 0 I believe that the above is the way to go with LRA. Vladimir, what do you think? Uros.
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/05/2014 12:33 PM, Jakub Jelinek wrote: On Wed, Nov 05, 2014 at 12:19:22PM +0300, Yury Gribov wrote: On 11/03/2014 05:27 PM, Marek Polacek wrote: Another shot at optimizing redundant UBSAN_NULL statements. This time we walk the dominator tree - that should result in more effective optimization - and keep a list of UBSAN_NULL statements that dominate the current block, see the comment before sanopt_optimize_walker. Marek, A general question - have you considered coding this as a dataflow loop instead of dominator walk? That would allow to also remove checks for variables defined via PHI nodes provided that all arguments of PHI have already been checked. I'd be afraid that we'd turn sanopt into another var-tracking that way, with possibly huge hash tables being copied on write, merging of the tables etc., with big memory and time requirements, having to add --param limits to give up if the sum size of the tables go over certain limit. Sure, that would be slower. I was just curious whether you considered alternatives (looks like you did). The way Marek has coded it up is pretty cheap optimization. Right. BTW, as discussed privately with Marek last time, we probably want to optimize UBSAN_NULL (etc.) only if -fno-sanitize-recover=null (etc.) or if location_t is the same, otherwise such optimizations lead to only one problem being reported instead of all of them. Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure. -Y
Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target
On Wed, Nov 5, 2014 at 10:35 AM, Ilya Enkovich enkovich@gmail.com wrote: Hi, Having stage1 close to end, may we make some decision regarding this patch? Having a couple of working variants, may we choose and use one of them? I propose to wait for Vlad for an update about his plans on register preference algorithm that would fix this (and other Ya*r-type issues). In the absence of the fix, we'll go with Yr,*x. Uros. Thanks, Ilya 2014-07-15 17:38 GMT+04:00 Uros Bizjak ubiz...@gmail.com: On Tue, Jul 15, 2014 at 1:01 PM, Ilya Enkovich enkovich@gmail.com wrote: On 15 Jul 10:42, Uros Bizjak wrote: On Tue, Jul 15, 2014 at 10:25 AM, Ilya Enkovich enkovich@gmail.com wrote: Also fully restrict xmm8-15 does not seem right. It is just costly but not fully disallowed. As said earlier, you can try Ya*x as a constraint. I tried it. It does not seem to affect allocation much. I do not see any gain on targeted tests. Strange, because the documentation claims: '*' Says that the following character should be ignored when choosing register preferences. '*' has no effect on the meaning of the constraint as a constraint, and no effect on reloading. For LRA '*' additionally disparages slightly the alternative if the following character matches the operand. Let me rethink this a bit. Perhaps we could reconsider Jakub's proposal with Ya,!x (with two alternatives). IIRC this approach was needed for some MMX alternatives, where we didn't want RA to allocate a MMX register when the value could be passed in integer regs, but the value was still allowed in MMX register. That is what my patch already does, but with '?' instead of '!'. Yes, I know. The problem is that Ya*x type conditional allocation worked OK in the past for not preferred, but still allowed regclass registers. There are several patterns in i386.md that live by this premise, including movsf_internal and movdf_internal. 
If this approach doesn't work anymore, then we have to either figure out what is the reason, or invent a new strategy that will be applicable to all cases. Can you please post a small test that illustrates the case where Ya,!x works, but Ya*x doesn't? It's hard to compose a small testcase which will have SSE4 instructions generated with required register usage. I use tcpjumbo test from TCPmark for initial check of how my patch works. This test has a lot of pmovzxwd instructions generated and many of them use xmm8-15. I tried two versions of a simple patch which modifies only pmovzxwd instruction. Patch1: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d907353..6b03b72 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -11852,10 +11852,10 @@ (set_attr mode OI)]) (define_insn sse4_1_codev4hiv4si2 - [(set (match_operand:V4SI 0 register_operand =x) + [(set (match_operand:V4SI 0 register_operand =Yr,!x) (any_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 1 nonimmediate_operand xm) + (match_operand:V8HI 1 nonimmediate_operand Yr,!xm) (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)]] TARGET_SSE4_1 Patch2: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d907353..b3721c4 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -11852,10 +11852,10 @@ (set_attr mode OI)]) (define_insn sse4_1_codev4hiv4si2 - [(set (match_operand:V4SI 0 register_operand =x) + [(set (match_operand:V4SI 0 register_operand =Yr*x) (any_extend:V4SI (vec_select:V4HI - (match_operand:V8HI 1 nonimmediate_operand xm) + (match_operand:V8HI 1 nonimmediate_operand Yr*xm) (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)]] TARGET_SSE4_1 Here are results of looking for pmovzxwd in resulting binaries: #objdump -d tcpjumbo-orig | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 76 #objdump -d tcpjumbo-patch1 | grep pmovzxwd | grep 
xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 0 #objdump -d tcpjumbo-patch2 | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 76 Therefore I make a conclusion that Yr*x does not really differ much from x. Just FTR: Using Yr,*x is also a viable option: #objdump -d tcpjumbo-patch3 | grep pmovzxwd | grep xmm8\|xmm9\|xmm10\|xmm11\|xmm12\|xmm13\|xmm14\|xmm15 | wc -l 0 I believe that the above is the way to go with LRA. Vladimir, what do you think? Uros.
Re: [gofrontend-dev] [PATCH 4/4] Gccgo port to s390[x] -- part II
On Tue, Nov 04, 2014 at 08:16:51PM -0800, Ian Taylor wrote: I committed the change to go-test.exp. Thanks. The other changes are not OK. As described in gcc/testsuite/go.test/test/README.gcc, the files in gcc/testsuite/go.test/test are an exact copy of the master Go testsuite. Any changes must be made to the master Go testsuite first. I understand that, but I'm unsure how to handle a set of patches that all depend on each other but refer to three different repositories. So I posted this patch intentionally in the wrong place, not knowing how to do it in a better way. I don't know what's up with the complex number change. In general the Go compiler and libraries go to some effort to produce the same answers on all platforms. We need to understand why we get different answers on s390 (you may understand the differences, but I don't). I won't change the tests without a clear understanding of why we are changing them. It's actually not a Go specific problem, the same deviation occurs in C code too. The cause is that constant folding is done with a higher precision and may yield a different result than the run time calculations. There is a Gcc bug report for that issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60181 The nilptr test doesn't run on some other platforms when using gccgo--search for nilptr in go-test.exp. If you want to work out a way to change the master Go testsuite such that the nilptr test passes on more platforms, that would be great. I don't have the slightest clue how this could be done in a platform independent way because the test heavily depends on the target's memory map layout. The way to do it is not by copying the test. If the test needs to be customized, add additional files that use // +build lines to pick which file is built. Move them into a directory, like method4.go or other tests that use rundir. I'll check that. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
Re: Small multiplier support in Cortex-M0/1/+
On Tue, Oct 21, 2014 at 11:01 AM, Hale Wang hale.w...@arm.com wrote: Hi, Some configurations of the Cortex-M0 and Cortex-M1 come with a high latency multiplier. This patch adds support for such configurations. Small multiplier means using add/sub/shift instructions to replace the mul instruction for the MCU that has no fast multiplier. The following strategies are adopted in this patch:
1. Define new CPUs as -mcpu=cortex-m0.small-multiply,cortex-m0plus.small-multiply,cortex-m1.small-multiply to support small multiplier.
2. -Os means size is preferred. A threshold of 5 is set, which means splitting is prevented if it would end up with more than 5 instructions. Without -Os, there is no such limit.
Some test cases are also added in the testsuite to verify this function. Is it ok for trunk?

This is OK. Ramana

Thanks and Best Regards, Hale Wang

gcc/ChangeLog:
2014-08-29 Hale Wang hale.w...@arm.com
* config/arm/arm-cores.def: Add support for -mcpu=cortex-m0.small-multiply,cortex-m0plus.small-multiply, cortex-m1.small-multiply.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/arm.c: Update the rtx-costs for MUL.
* config/arm/bpabi.h: Handle -mcpu=cortex-m0.small-multiply,cortex-m0plus.small-multiply, cortex-m1.small-multiply.
* doc/invoke.texi: Document -mcpu=cortex-m0.small-multiply,cortex-m0plus.small-multiply, cortex-m1.small-multiply.
* testsuite/gcc.target/arm/small-multiply-m0-1.c: New test case.
* testsuite/gcc.target/arm/small-multiply-m0-2.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m0-3.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m0plus-1.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m0plus-2.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m0plus-3.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m1-1.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m1-2.c: Likewise.
* testsuite/gcc.target/arm/small-multiply-m1-3.c: Likewise.
=== diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index a830a83..af4b373 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -137,6 +137,11 @@ ARM_CORE(cortex-m1, cortexm1, cortexm1, 6M, FL_LDSCHED, v6m) ARM_CORE(cortex-m0, cortexm0, cortexm0, 6M, FL_LDSCHED, v6m) ARM_CORE(cortex-m0plus, cortexm0plus, cortexm0plus, 6M, FL_LDSCHED, v6m) +/* V6M Architecture Processors for small-multiply implementations. */ +ARM_CORE(cortex-m1.small-multiply, cortexm1smallmultiply, cortexm1, 6M, FL_LDSCHED | FL_SMALLMUL, v6m) +ARM_CORE(cortex-m0.small-multiply, cortexm0smallmultiply, cortexm0, 6M, FL_LDSCHED | FL_SMALLMUL, v6m) +ARM_CORE(cortex-m0plus.small-multiply,cortexm0plussmallmultiply, cortexm0plus,6M, FL_LDSCHED | FL_SMALLMUL, v6m) + /* V7 Architecture Processors */ ARM_CORE(generic-armv7-a,genericv7a, genericv7a, 7A, FL_LDSCHED, cortex) ARM_CORE(cortex-a5, cortexa5, cortexa5, 7A, FL_LDSCHED, cortex_a5) diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index bc046a0..bd65bd2 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -241,6 +241,15 @@ EnumValue Enum(processor_type) String(cortex-m0plus) Value(cortexm0plus) EnumValue +Enum(processor_type) String(cortex-m1.small-multiply) Value(cortexm1smallmultiply) + +EnumValue +Enum(processor_type) String(cortex-m0.small-multiply) Value(cortexm0smallmultiply) + +EnumValue +Enum(processor_type) String(cortex-m0plus.small-multiply) Value(cortexm0plussmallmultiply) + +EnumValue Enum(processor_type) String(generic-armv7-a) Value(genericv7a) EnumValue diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index 954cab8..8b5c778 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -25,6 +25,7 @@ arm1176jzs,arm1176jzfs,mpcorenovfp, mpcore,arm1156t2s,arm1156t2fs, cortexm1,cortexm0,cortexm0plus, + cortexm1smallmultiply,cortexm0smallmultiply,cortexm0plussmallmultiply, genericv7a,cortexa5,cortexa7, 
     cortexa8,cortexa9,cortexa12,
     cortexa15,cortexr4,cortexr4f,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 93b989d..5062c85 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -751,6 +751,8 @@ static int thumb_call_reg_needed;
 #define FL_ARCH8 (1 << 24) /* Architecture 8. */
 #define FL_CRC32 (1 << 25) /* ARMv8 CRC32 instructions. */

+#define FL_SMALLMUL (1 << 26) /* Small multiply supported. */
+
 #define FL_IWMMXT (1 << 29) /* XScale v2 or Intel Wireless MMX
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote: Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure. I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated. Marek
Re: [gofrontend-dev] [PATCH 1/4] Gccgo port to s390[x] -- part II
On Tue, Nov 04, 2014 at 02:39:34PM -0800, Ian Taylor wrote:
> Note that libgo/runtime/runtime.c now refers to S390_HAVE_STCKF. It's not obvious to me that that is defined anywhere. Perhaps it is in a later patch in this series--I haven't looked.

This chunk is broken but harmless (because S390_HAVE_STCKF is never defined anyway). The code needs to check at runtime whether the stckf instruction is available. I'll provide a patch for that later. In the meantime there's no immediate need to back out the flawed chunk.

Ciao

Dominik ^_^ ^_^

-- 
Dominik Vogt
IBM Germany
RE: [PATCH][MIPS] Fix P5600 memory cost
> The patch below fixes the memory cost for P5600.
>
> ChangeLog:
> 2014-11-05 Prachi Godbole prachi.godb...@imgtec.com
>
> * config/mips/mips.c (mips_rtx_cost_data): Fix memory_letency cost for p5600.

Please follow these instructions to add yourself to MAINTAINERS in the write-after-approval section now that you have write access to GCC: https://gcc.gnu.org/svnwrite.html#authenticated

OK with fixes to the changelog entry: latency not letency. Remember to tab in the changelog entry and split the line as it will exceed 80 chars. Also two spaces between the date/name and name/email. E.g.

2014-11-05  Prachi Godbole  <prachi.godb...@imgtec.com>

	* config/mips/mips.c (mips_rtx_cost_data): Fix memory_latency cost for
	p5600.

Thanks,
Matthew

> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index af6a913..558ba2f 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -1193,7 +1193,7 @@ static const struct mips_rtx_cost_data
>      COSTS_N_INSNS (8),/* int_div_si */
>      COSTS_N_INSNS (8),/* int_div_di */
>      2,/* branch_cost */
> -    10 /* memory_latency */
> +    4 /* memory_latency */
>    }
>  };
[PATCH, libfortran] PR 47007, 61847 Locale failures in libgfortran
Hi,

the attached patch fixes a few locale-related failures in libgfortran, in the case where the POSIX 2008 extended locale functionality and the strto{f,d,ld}_l extensions are present. These failures typically occur when libgfortran is used from a program which has set the locale with setlocale(), and the locale uses a different decimal separator than the C locale. The patch fixes this by creating a C locale which is then used by strto{f,d,ld}_l, and which is also installed as the per-thread locale when starting a formatted IO, then reset to the previous value when the IO is done.

I have chosen not to fall back to calling setlocale() in case the POSIX 2008 locale stuff isn't available, as that could create nasty, hard-to-debug race conditions in a multi-threaded program. (I think Jerry's proposed patch which checks the locale for the decimal separator is still useful as a fallback in case the POSIX 2008 locale stuff isn't available.)

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2014-11-06  Janne Blomqvist  <j...@gcc.gnu.org>

	PR libfortran/47007
	PR libfortran/61847
	* config.h.in: Regenerated.
	* configure: Regenerated.
	* configure.ac (AC_CHECK_HEADERS_ONCE): Check for locale.h.
	(AC_CHECK_FUNCS_ONCE): Check for newlocale, freelocale, uselocale,
	strtof_l, strtod_l, strtold_l.
	* io/io.h (locale.h): Include if present.
	(c_locale): New variable.
	(gfc_strtof): Move macro from libgfortran.h, use strtof_l if present.
	(gfc_strtod): Likewise.
	(gfc_strtold): Likewise.
	(st_parameter_dt): Add old_locale member.
	* io/transfer.c (data_transfer_init): Set thread locale to c_locale if
	doing formatted transfer.
	(finalize_transfer): Reset thread locale to previous.
	* io/unit.c (c_locale): New variable.
	(init_units): Init c_locale.
	(close_units): Free c_locale.
	* libgfortran.h (gfc_strto{f,d,ld}): Move macros to io/io.h.
-- 
Janne Blomqvist

diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index b3150f4..5550380 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -255,7 +255,7 @@ AC_CHECK_TYPES([ptrdiff_t])
 # check header files (we assume C89 is available, so don't check for that)
 AC_CHECK_HEADERS_ONCE(unistd.h sys/time.h sys/times.h sys/resource.h \
 sys/types.h sys/stat.h sys/wait.h floatingpoint.h ieeefp.h fenv.h fptrap.h \
-fpxcp.h pwd.h complex.h)
+fpxcp.h pwd.h complex.h locale.h)

 GCC_HEADER_STDINT(gstdint.h)

@@ -290,7 +290,8 @@ else
    strcasestr getrlimit gettimeofday stat fstat lstat getpwuid vsnprintf dup \
    getcwd localtime_r gmtime_r getpwuid_r ttyname_r clock_gettime \
    readlink getgid getpid getppid getuid geteuid umask getegid \
-   secure_getenv __secure_getenv mkostemp strnlen strndup strtok_r)
+   secure_getenv __secure_getenv mkostemp strnlen strndup strtok_r newlocale \
+   freelocale uselocale strtof_l strtod_l strtold_l)
 fi

 # Check strerror_r, cannot be above as versions with two and three arguments exist
diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 1e0d092..a638daf 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -32,6 +32,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see

 #include <gthr.h>

+#ifdef HAVE_LOCALE_H
+#include <locale.h>
+#endif
+
 /* Forward declarations. */
 struct st_parameter_dt;
 typedef struct stream stream;
@@ -40,6 +44,35 @@ struct format_data;
 typedef struct fnode fnode;
 struct gfc_unit;

+#ifdef HAVE_NEWLOCALE
+/* We have POSIX 2008 extended locale stuff. */
+extern locale_t c_locale;
+#endif
+
+#ifdef __MINGW32__
+extern float __strtof (const char *, char **);
+#define gfc_strtof __strtof
+extern double __strtod (const char *, char **);
+#define gfc_strtod __strtod
+extern long double __strtold (const char *, char **);
+#define gfc_strtold __strtold
+#else
+#ifdef HAVE_STRTOF_L
+#define gfc_strtof(nptr, endptr) strtof_l(nptr, endptr, c_locale)
+#else
+#define gfc_strtof strtof
+#endif
+#ifdef HAVE_STRTOD_L
+#define gfc_strtod(nptr, endptr) strtod_l(nptr, endptr, c_locale)
+#else
+#define gfc_strtod strtod
+#endif
+#ifdef HAVE_STRTOLD_L
+#define gfc_strtold(nptr, endptr) strtold_l(nptr, endptr, c_locale)
+#else
+#define gfc_strtold strtold
+#endif
+#endif

 /* Macros for testing what kinds of I/O we are doing. */

@@ -450,6 +483,9 @@ typedef struct st_parameter_dt
   char *line_buffer;
   struct format_data *fmt;
   namelist_info *ionml;
+#ifdef HAVE_NEWLOCALE
+  locale_t old_locale;
+#endif
   /* Current position within the look-ahead line buffer. */
   int line_buffer_pos;
   /* Storage area for values except for strings. Must be
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index dc1b6f4..4706865 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -2874,7 +2874,12 @@ data_transfer_init (st_parameter_dt *dtp, int read_flag)
   if (dtp->u.p.current_unit->flags.form == FORM_FORMATTED
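A minimal sketch of the approach described in the message above, assuming a glibc-style system where newlocale, uselocale and freelocale are declared in locale.h. The helper name parse_double_c_locale is hypothetical and is not libgfortran code. Unlike the patch, which creates c_locale once and keeps it, this sketch creates and frees the locale object on every call, purely for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libgfortran: parse STR as a double
   using "C" locale rules, whatever locale the caller has selected.  */
static double
parse_double_c_locale (const char *str, char **endptr)
{
  /* Build a locale object that behaves like the "C" locale.  */
  locale_t c_loc = newlocale (LC_ALL_MASK, "C", (locale_t) 0);
  assert (c_loc != (locale_t) 0);

  /* Install it for the calling thread only and remember the old one.  */
  locale_t old = uselocale (c_loc);

  double value = strtod (str, endptr);

  /* Put the previous thread locale back before freeing ours.  */
  uselocale (old);
  freelocale (c_loc);
  return value;
}
```

Because uselocale only changes the locale of the calling thread, other threads are not affected, which is the property the message relies on to avoid the setlocale() race.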
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote:
> On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote:
> > Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure.
>
> I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated.

Note, the algorithm we were discussing with Honza for the "is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK" problem was to compute it lazily; have flags for asan per-bb:
1) bb might contain a !nonfreeing_call_p
2) there is a bb with flag 1) set in some path between imm(bb) and bb
3) flag whether 2) has been computed already
4) some temporary "being visited" flag
and the algorithm:
1) when walking a bb, if you encounter a !nonfreeing_call_p call, either immediately nuke recorded earlier ASAN_CHECKs from the current bb, or use gimple_uids for lazily doing that; but in any case, record the flag 1) for the current bb
2) if you are considering ASAN_CHECK in a different bb than ASAN_CHECK it is dominating, check the 2) flag on the current bb, then on get_immediate_dominator (bb) etc. until you reach the bb with the dominating bb, if the 2) flag is set on any of them, don't optimize; if the 2) flag is not computed on any of these (i.e. flag 3) unset), then compute it recursively; set the 4) flag on a bb, for incoming edges if the src bb is not the imm(bb) of the original bb, and does not have 4) flag set: if it has 1) set, use 1, if it has 3) flag set, use 2), otherwise recurse (and or the result); unset 4) flag before returning; or so.

For tsan, pretty much the same thing, just with different 1)/2)/3) flags and different test for that (instead of !nonfreeing_call_p we are interested in: uses atomics or calls that might use atomics or other pthread_* synchronization primitives).

	Jakub
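The lazy per-bb computation outlined above could look roughly like the following sketch. It is not the sanopt.c implementation: the toy_bb array, the flag names and all function names are hypothetical, the CFG is a plain array indexed by block number, and checks are assumed to sit at the end of the dominating block and the start of the dominated block. As a simplification it also leaves out the shortcut of reusing a predecessor's already computed flag 2), so blocks may be revisited.

```c
#include <assert.h>
#include <stdbool.h>

/* The four per-block flags from the message, as hypothetical bit names.  */
enum
{
  BB_HAS_FREEING_CALL = 1 << 0, /* 1) block may contain a freeing call.  */
  BB_FREEING_ON_PATH = 1 << 1,  /* 2) such a block lies between imm(bb) and bb.  */
  BB_PATH_COMPUTED = 1 << 2,    /* 3) flag 2) has been computed.  */
  BB_BEING_VISITED = 1 << 3     /* 4) temporary marker during the walk.  */
};

#define TOY_MAX_PREDS 4

/* Toy basic block: blocks are array indices, idom is -1 for the entry.  */
struct toy_bb
{
  int idom;
  int preds[TOY_MAX_PREDS];
  int n_preds;
  unsigned flags;
};

/* Walk predecessors backwards from BB without crossing STOP.  Return true
   if a block with flag 1) can be reached that way.  */
static bool
toy_walk_preds (struct toy_bb *cfg, int bb, int stop)
{
  bool found = false;

  cfg[bb].flags |= BB_BEING_VISITED;
  for (int i = 0; i < cfg[bb].n_preds && !found; i++)
    {
      int src = cfg[bb].preds[i];
      if (src == stop || (cfg[src].flags & BB_BEING_VISITED))
        continue;
      if (cfg[src].flags & BB_HAS_FREEING_CALL)
        found = true;
      else
        found = toy_walk_preds (cfg, src, stop);
    }
  cfg[bb].flags &= ~(unsigned) BB_BEING_VISITED;
  return found;
}

/* Return flag 2) for BB, computing and caching it on first use.  */
static bool
toy_freeing_on_path (struct toy_bb *cfg, int bb)
{
  if (!(cfg[bb].flags & BB_PATH_COMPUTED))
    {
      if (toy_walk_preds (cfg, bb, cfg[bb].idom))
        cfg[bb].flags |= BB_FREEING_ON_PATH;
      cfg[bb].flags |= BB_PATH_COMPUTED;
    }
  return (cfg[bb].flags & BB_FREEING_ON_PATH) != 0;
}

/* Can a check in USE_BB be dropped in favour of the same check in DOM_BB,
   which must dominate USE_BB?  Walk the dominator chain upwards.  */
static bool
toy_can_drop_check (struct toy_bb *cfg, int dom_bb, int use_bb)
{
  for (int bb = use_bb; bb != dom_bb; bb = cfg[bb].idom)
    {
      assert (bb >= 0);
      /* An intermediate dominator is itself on every path.  */
      if (bb != use_bb && (cfg[bb].flags & BB_HAS_FREEING_CALL))
        return false;
      if (toy_freeing_on_path (cfg, bb))
        return false;
    }
  return true;
}
```

On a diamond-shaped CFG where one arm contains a freeing call, a check in the join block would be kept, while a check in the other arm could still be dropped in favour of the one in the entry block.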
Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian
On 05/11/14 07:09, Yangfei (Felix) wrote: Hi, This patch fixes PR63742 by improving arm *movhi_insn_arch4 pattern to make it works under big-endian. The idea is simple: Use movw for certain const source operand instead of ldrh. And exclude the const values which cannot be handled by mov/mvn/movw. I am doing regression test for this patch. Assuming no issue pops up, OK for trunk? So, doesn't that makes the bug latent for architectures older than armv6t2 and big endian and only fixed this in ARM state ? I'd prefer a complete solution please. What about *thumb2_movhi_insn in thumb2.md ? Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 216838) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,12 @@ +2014-11-05 Felix Yang felix.y...@huawei.com + Shanyao Chen chenshan...@huawei.com + I'm assuming you have copyright assignments sorted. + PR target/63742 + * config/arm/predicates.md (arm_hi_operand): New predicate. + (arm_movw_immediate_operand): Similarly. + * config/arm/arm.md (*movhi_insn_arch4): Use arm_hi_operand instead of + general_operand and add movw to the output template. + 2014-10-29 Richard Sandiford richard.sandif...@arm.com * addresses.h, alias.c, asan.c, auto-inc-dec.c, bt-load.c, builtins.c, Index: gcc/config/arm/predicates.md === --- gcc/config/arm/predicates.md(revision 216838) +++ gcc/config/arm/predicates.md(working copy) @@ -144,6 +144,12 @@ (and (match_code const_int) (match_test INTVAL (op) == 0))) +(define_predicate arm_movw_immediate_operand + (and (match_test TARGET_32BIT arm_arch_thumb2) + (ior (match_code high) I don't see why you need to check for high here ? 
+(and (match_code const_int) + (match_test (INTVAL (op) 0x) == 0) + ;; Something valid on the RHS of an ARM data-processing instruction (define_predicate arm_rhs_operand (ior (match_operand 0 s_register_operand) @@ -211,6 +217,11 @@ (ior (match_operand 0 arm_rhs_operand) (match_operand 0 arm_not_immediate_operand))) +(define_predicate arm_hi_operand + (ior (match_operand 0 arm_rhsm_operand) + (ior (match_operand 0 arm_not_immediate_operand) +(match_operand 0 arm_movw_immediate_operand + (define_predicate arm_di_operand (ior (match_operand 0 s_register_operand) (match_operand 0 arm_immediate_di_operand))) Index: gcc/config/arm/arm.md === --- gcc/config/arm/arm.md (revision 216838) +++ gcc/config/arm/arm.md (working copy) @@ -6285,8 +6285,8 @@ ;; Pattern to recognize insn generated default case above (define_insn *movhi_insn_arch4 - [(set (match_operand:HI 0 nonimmediate_operand =r,r,m,r) - (match_operand:HI 1 general_operand rIk,K,r,mi))] + [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m,r) + (match_operand:HI 1 arm_hi_operand rIk,K,j,r,mi))] TARGET_ARM arm_arch4 (register_operand (operands[0], HImode) @@ -6294,16 +6294,18 @@ @ mov%?\\t%0, %1\\t%@ movhi mvn%?\\t%0, #%B1\\t%@ movhi + movw%?\\t%0, %1\\t%@ movhi str%(h%)\\t%1, %0\\t%@ movhi ldr%(h%)\\t%0, %1\\t%@ movhi [(set_attr predicable yes) - (set_attr pool_range *,*,*,256) - (set_attr neg_pool_range *,*,*,244) + (set_attr pool_range *,*,*,*,256) + (set_attr neg_pool_range *,*,*,*,244) (set_attr_alternative type [(if_then_else (match_operand 1 const_int_operand ) (const_string mov_imm ) (const_string mov_reg)) (const_string mvn_imm) + (const_string mov_imm) (const_string store1) (const_string load1)])] ) Ramana
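As a rough illustration of the constraint behind arm_movw_immediate_operand: movw encodes a 16-bit immediate and zeroes the upper half of the register, so a constant is loadable that way only if its upper bits are clear. The mask in the quoted match_test did not survive in this copy, so the test below is an assumption about the intent, and fits_movw_immediate is a hypothetical helper rather than anything in predicates.md.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if VALUE could be materialised by a single
   movw, i.e. everything above the low 16 bits is zero.  */
static bool
fits_movw_immediate (int64_t value)
{
  return (value & ~(int64_t) 0xffff) == 0;
}
```

Under this test 0xffff qualifies while 0x10000 and negative values do not; negative HImode constants would have to go through one of the other alternatives.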
Re: [PATCH, libfortran] PR 47007, 61847 Locale failures in libgfortran
On Wed, Nov 05, 2014 at 12:48:01PM +0200, Janne Blomqvist wrote:
> @@ -3528,6 +3533,11 @@ finalize_transfer (st_parameter_dt *dtp)
>    if ((dtp->common.flags & IOPARM_DT_HAS_SIZE) != 0)
>      *dtp->size = dtp->u.p.size_used;
> +#ifdef HAVE_USELOCALE
> +  if (dtp->u.p.old_locale != (locale_t) 0)
> +    uselocale (dtp->u.p.old_locale);
> +#endif

I wonder if you shouldn't clear dtp->u.p.old_locale here too, so that uselocale isn't called again.

	Jakub
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 11:50:20AM +0100, Jakub Jelinek wrote: On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote: On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote: Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure. I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated. Note, the algorithm we were discussing with Honza for the is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK Right. Let me see then if I can implement the following soon, maybe it makes sense to rebase Yuri's patch only on top of this algorithm. problem was to compute it lazily; have flags for asan per-bb: 1) bb might contain a !nonfreeing_call_p 2) there is a bb with flag 1) set in some path between imm(bb) and bb 3) flag whether 2) has been computed already 4) some temporary being visited flag and the algorithm: 1) when walking a bb, if you encounter a !nonfreeing_call_p call, either immediately nuke recorded earlier ASAN_CHECKs from the current bb, or use gimple_uids for lazily doing that; but in any case, record the flag 1) for the current bb 2) if you are considering ASAN_CHECK in a different bb than ASAN_CHECK it is dominating, check the 2) flag on the current bb, then on get_immediate_dominator (bb) etc. until you reach the bb with the dominating bb, if the 2) flag is set on any of them, don't optimize; if the 2) flag is not computed on any of these (i.e. flag 3) unset), then compute it recursively; set the 4) flag on a bb, for incoming edges if the src bb is not the imm(bb) of the original bb, and does not have 4) flag set: if it has 1) set, use 1, if it has 3) flag set, use 2), otherwise recurse (and or the result); unset 4) flag before returning; or so. 
For tsan, pretty much the same thing, just with different 1)/2)/3) flags and different test for that (instead of !nonfreeing_call_p we are interested in: uses atomics or calls that might use atomics or other pthread_* synchronization primitives). Marek
Re: [PATCH][AARCH64]Add ACLE arch-related predefined macros
On 31 October 2014 14:37, Renlin Li renlin...@arm.com wrote: Hi all, This is a simple patch to add arch-related macros defined ACLE 2.0. aarch64-none-elf target is tested on the model, no new issues. Is this Okay for trunk? gcc/ChangeLog: 2014-10-31 Renlin Li renlin...@arm.com * config/aarch64/aarch64.c (aarch64_architecture_version): New. (processor): New architecture_version field. (aarch64_override_options): Initialize aarch64_architecture_version. * config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_ARCH, __ARM_ARCH_PROFILE, aarch64_arch_name macro. OK /Marcus
Re: [4.9 RFA PATCH, RTL-optimization]: Backport recent AND-alignment alias fixes to 4.9 branch
On Wed, 5 Nov 2014, Uros Bizjak wrote: Ping for [1], quoted below. Ok. Thanks, Richard. [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg03189.html Thanks, Uros. On Thu, Oct 30, 2014 at 11:38 AM, Uros Bizjak ubiz...@gmail.com wrote: Hello! I would like to backport recent alias fixes to correctly handle memory references with AND-alignment to 4.9 branch. These patches fix hundreds of failures in gfortran testsuite on alpha-linux-gnu due to invalid aliasing of AND-aligned memory references of two QImode flags. These patches were baking for a couple of weeks in the mainline without problems. Modulo removal of old and unnecessary functionality, these changes affect only alpha target. 2014-10-30 Uros Bizjak ubiz...@gmail.com Backport from mainline: 2014-10-20 Uros Bizjak ubiz...@gmail.com * varasm.c (const_alias_set): Remove. (init_varasm_once): Remove initialization of const_alias_set. (build_constant_desc): Do not set alias set to const_alias_set. Backport from mainline: 2014-10-14 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/63475 * alias.c (true_dependence_1): Always use get_addr to extract true address operands from x_addr and mem_addr. Use extracted address operands to check for references with alignment ANDs. Use extracted address operands with find_base_term and base_alias_check. For noncanonicalized operands call canon_rtx with extracted address operand. (write_dependence_1): Ditto. (may_alias_p): Ditto. Remove unused calls to canon_rtx. Backport from mainline: 2014-10-10 Uros Bizjak ubiz...@gmail.com PR rtl-optimization/63483 * alias.c (true_dependence_1): Do not exit early for MEM_READONLY_P references when alignment ANDs are involved. (write_dependence_p): Ditto. (may_alias_p): Ditto. The complete backport was tested on alpha-linux-gnu, alphaev68-linux-gnu and x86_64-linux-gnu on 4.9 branch. OK for branch? Uros. 
-- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284 (AG Nuernberg) Maxfeldstrasse 5, 90409 Nuernberg, Germany
Re: [PATCH, libfortran] PR 47007, 61847 Locale failures in libgfortran
On Wed, Nov 5, 2014 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Nov 05, 2014 at 12:48:01PM +0200, Janne Blomqvist wrote:
> > @@ -3528,6 +3533,11 @@ finalize_transfer (st_parameter_dt *dtp)
> >    if ((dtp->common.flags & IOPARM_DT_HAS_SIZE) != 0)
> >      *dtp->size = dtp->u.p.size_used;
> > +#ifdef HAVE_USELOCALE
> > +  if (dtp->u.p.old_locale != (locale_t) 0)
> > +    uselocale (dtp->u.p.old_locale);
> > +#endif
>
> I wonder if you shouldn't clear dtp->u.p.old_locale here too, so that uselocale isn't called again.

Sure. In principle this shouldn't be needed, since IIRC the entire dtp->u.p structure is set to 0 when starting an IO operation, but OTOH the cost is insignificant.

-- 
Janne Blomqvist
[ARM] RFA: Use new rtl iterators in arm_find_sub_rtx_with_code
I think these functions only want to iterate over instruction patterns rather than whole instructions (which would include things like REG_EQUAL notes), since only the patterns are relevant for finding dependencies. There's then no need to check for null rtxes. Tested by making sure there were no code changes for gcc.dg, gcc.c-torture and g++.dg for plain arm-linux-gnueabi and aarch64-linux-gnu. Ramana also asked me to try -mcpu=cortex-a7, -mcpu=cortex-a9, -mcpu=arm9tdmi and -mcpu=cortex-a15. There were differences in: gcc.c-torture/execute/20060110-2.c gcc.c-torture/execute/ashrdi-1.c and gcc.dg/tree-ssa/pr24627.c for -mcpu=cortex-a7 and no differences for the other combinations. The A7 differences were due to the way that arm_get_set_operands handles multi-set instructions such as: (set (reg:CC_C 100 cc) (compare:CC_C (plus:SI (reg:SI 8 r8 [orig:121 a ] [121]) (reg:SI 0 r0 [orig:122 b ] [122])) (reg:SI 8 r8 [orig:121 a ] [121]))) (set (reg:SI 2 r2 [orig:120 D.4117 ] [120]) (plus:SI (reg:SI 8 r8 [orig:121 a ] [121]) (reg:SI 0 r0 [orig:122 b ] [122]))) for_each_rtx iterates over the subrtxes in forward order, so arm_get_set_operands would pick the set of CC. The new iterator pushes the contents of a PARALLEL onto a stack and pulls them in reverse order, so arm_get_set_operands would pick the set of r2. This means that after the patch the code sees a producer/consumer relationship that it previously missed. I think the new behaviour is what was intended. This code shouldn't really be relying on a particular iteration order though. There's a dependency if any SET in the potential producer sets a register used by the potential consumer. I think any fix for that should be done separately from the iterator rewrite. OK to install? Thanks, Richard gcc/ * config/arm/aarch-common.c: Include rtl-iter.h. (search_term, arm_find_sub_rtx_with_search_term): Delete. (arm_find_sub_rtx_with_code): Use FOR_EACH_SUBRTX_VAR. 
(arm_get_set_operands): Pass the insn pattern rather than the insn itself. (arm_no_early_store_addr_dep): Likewise. Index: gcc/config/arm/aarch-common.c === --- gcc/config/arm/aarch-common.c 2014-10-25 09:42:00.631168827 +0100 +++ gcc/config/arm/aarch-common.c 2014-10-25 09:51:24.212872553 +0100 @@ -30,6 +30,7 @@ #include tree.h #include c-family/c-common.h #include rtl.h +#include rtl-iter.h /* In ARMv8-A there's a general expectation that AESE/AESMC and AESD/AESIMC sequences of the form: @@ -68,13 +69,6 @@ aarch_crypto_can_dual_issue (rtx_insn *p return 0; } -typedef struct -{ - rtx_code search_code; - rtx search_result; - bool find_any_shift; -} search_term; - /* Return TRUE if X is either an arithmetic shift left, or is a multiplication by a power of two. */ bool @@ -96,68 +90,32 @@ static rtx_code shift_rtx_codes[] = { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT, ROTATERT, ZERO_EXTEND, SIGN_EXTEND }; -/* Callback function for arm_find_sub_rtx_with_code. - DATA is safe to treat as a SEARCH_TERM, ST. This will - hold a SEARCH_CODE. PATTERN is checked to see if it is an - RTX with that code. If it is, write SEARCH_RESULT in ST - and return 1. Otherwise, or if we have been passed a NULL_RTX - return 0. If ST.FIND_ANY_SHIFT then we are interested in - anything which can reasonably be described as a SHIFT RTX. */ -static int -arm_find_sub_rtx_with_search_term (rtx *pattern, void *data) -{ - search_term *st = (search_term *) data; - rtx_code pattern_code; - int found = 0; - - gcc_assert (pattern); - gcc_assert (st); - - /* Poorly formed patterns can really ruin our day. */ - if (*pattern == NULL_RTX) -return 0; - - pattern_code = GET_CODE (*pattern); - - if (st-find_any_shift) -{ - unsigned i = 0; - - /* Left shifts might have been canonicalized to a MULT of some -power of two. Make sure we catch them. 
*/ - if (arm_rtx_shift_left_p (*pattern)) - found = 1; - else - for (i = 0; i ARRAY_SIZE (shift_rtx_codes); i++) - if (pattern_code == shift_rtx_codes[i]) - found = 1; -} - - if (pattern_code == st-search_code) -found = 1; - - if (found) -st-search_result = *pattern; - - return found; -} - -/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE. */ +/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE. + If FIND_ANY_SHIFT then we are interested in anything which can + reasonably be described as a SHIFT RTX. */ static rtx arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift) { - search_term st; - int result = 0; + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, pattern, NONCONST) +{ + rtx x = *iter; + if
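The iteration-order point made in the message above can be reproduced on a toy tree. This is only a sketch of the behaviour as described there, not of rtl-iter.h itself: toy_rtx, find_forward and find_with_stack are hypothetical names, and the worklist simply pushes operands in order so that they are popped in reverse.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_PARALLEL, TOY_SET, TOY_REG };

/* Toy expression node with at most two operands.  */
struct toy_rtx
{
  enum toy_code code;
  int id;
  struct toy_rtx *ops[2];
  int n_ops;
};

/* Recursive search that visits operands first to last, as the old
   callback-based walk is described as doing.  */
static struct toy_rtx *
find_forward (struct toy_rtx *x, enum toy_code code)
{
  if (x->code == code)
    return x;
  for (int i = 0; i < x->n_ops; i++)
    {
      struct toy_rtx *hit = find_forward (x->ops[i], code);
      if (hit != NULL)
        return hit;
    }
  return NULL;
}

/* Worklist search: operands are pushed in order, hence popped last to
   first.  */
static struct toy_rtx *
find_with_stack (struct toy_rtx *x, enum toy_code code)
{
  struct toy_rtx *stack[16];
  int top = 0;

  stack[top++] = x;
  while (top > 0)
    {
      struct toy_rtx *cur = stack[--top];
      if (cur->code == code)
        return cur;
      for (int i = 0; i < cur->n_ops; i++)
        {
          assert (top < 16);
          stack[top++] = cur->ops[i];
        }
    }
  return NULL;
}
```

For a PARALLEL holding two SETs, find_forward returns the first SET and find_with_stack the second, which mirrors the CC-versus-r2 difference reported for -mcpu=cortex-a7. Code that must not depend on the order would need to look at every SET instead of the first one found.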
[ARM] RFA: Use new rtl iterators in arm_cannot_copy_insn
Tested in the same way as the aarch-common.c patch. OK to install?

Thanks,
Richard


gcc/
	* config/arm/arm.c (arm_note_pic_base): Delete.
	(arm_cannot_copy_insn_p): Use FOR_EACH_SUBRTX.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	2014-11-05 11:48:55.030053470 +
+++ gcc/config/arm/arm.c	2014-11-05 11:48:57.406073646 +
@@ -13157,16 +13157,6 @@ tls_mentioned_p (rtx x)

 /* Must not copy any rtx that uses a pc-relative address. */

-static int
-arm_note_pic_base (rtx *x, void *date ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == UNSPEC
-      && (XINT (*x, 1) == UNSPEC_PIC_BASE
-          || XINT (*x, 1) == UNSPEC_PIC_UNIFIED))
-    return 1;
-  return 0;
-}
-
 static bool
 arm_cannot_copy_insn_p (rtx_insn *insn)
 {
@@ -13175,7 +13165,16 @@ arm_cannot_copy_insn_p (rtx_insn *insn)
   if (recog_memoized (insn) == CODE_FOR_tlscall)
     return true;

-  return for_each_rtx (PATTERN (insn), arm_note_pic_base, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == UNSPEC
+          && (XINT (x, 1) == UNSPEC_PIC_BASE
+              || XINT (x, 1) == UNSPEC_PIC_UNIFIED))
+        return true;
+    }
+  return false;
 }

 enum rtx_code
[ARM] RFA: Use new rtl iterators in arm_tls_referenced_p
Tested in the same way as the aarch-common.c patch. OK to install?

Thanks,
Richard


gcc/
	* config/arm/arm.c: Include rtl-iter.h.
	(arm_tls_referenced_p_1): Delete.
	(arm_tls_referenced_p): Use FOR_EACH_SUBRTX.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	2014-11-02 19:59:27.588237213 +
+++ gcc/config/arm/arm.c	2014-11-05 11:48:55.030053470 +
@@ -82,6 +82,7 @@
 #include "gimple-expr.h"
 #include "builtins.h"
 #include "tm-constrs.h"
+#include "rtl-iter.h"

 /* Forward definitions of types. */
 typedef struct minipool_node    Mnode;
@@ -8078,25 +8079,6 @@ thumb_legitimize_reload_address (rtx *x_
   return NULL;
 }

-/* Test for various thread-local symbols. */
-
-/* Helper for arm_tls_referenced_p. */
-
-static int
-arm_tls_operand_p_1 (rtx *x, void *data ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == SYMBOL_REF)
-    return SYMBOL_REF_TLS_MODEL (*x) != 0;
-
-  /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
-     TLS offsets, not real symbol references. */
-  if (GET_CODE (*x) == UNSPEC
-      && XINT (*x, 1) == UNSPEC_TLS)
-    return -1;
-
-  return 0;
-}
-
 /* Return TRUE if X contains any TLS symbol references. */

 bool
@@ -8105,7 +8087,19 @@ arm_tls_referenced_p (rtx x)
   if (! TARGET_HAVE_TLS)
     return false;

-  return for_each_rtx (x, arm_tls_operand_p_1, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0)
+        return true;
+
+      /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
+         TLS offsets, not real symbol references. */
+      if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_TLS)
+        iter.skip_subrtxes ();
+    }
+  return false;
 }

 /* Implement TARGET_LEGITIMATE_CONSTANT_P.
[AArch64] RFA: Use new rtl iterators in arm_cannot_copy_insn
This is part of a series to remove uses of for_each_rtx from the ports.

Tested by making sure there were no code changes for gcc.dg, gcc.c-torture and g++.dg for aarch64-linux-gnu. OK to install?

Thanks,
Richard


gcc/
	* config/aarch64/aarch64.c: Include rtl-iter.h.
	(aarch64_tls_operand_p_1): Delete.
	(aarch64_tls_operand_p): Use FOR_EACH_SUBRTX.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c	2014-11-02 19:59:26.977231633 +
+++ gcc/config/aarch64/aarch64.c	2014-11-05 11:48:59.982095520 +
@@ -2791,28 +2791,23 @@ aarch64_output_mi_thunk (FILE *file, tre
   reload_completed = 0;
 }

-static int
-aarch64_tls_operand_p_1 (rtx *x, void *data ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == SYMBOL_REF)
-    return SYMBOL_REF_TLS_MODEL (*x) != 0;
-
-  /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
-     TLS offsets, not real symbol references. */
-  if (GET_CODE (*x) == UNSPEC
-      && XINT (*x, 1) == UNSPEC_TLS)
-    return -1;
-
-  return 0;
-}
-
 static bool
 aarch64_tls_referenced_p (rtx x)
 {
   if (!TARGET_HAVE_TLS)
     return false;
-
-  return for_each_rtx (x, aarch64_tls_operand_p_1, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0)
+        return true;
+      /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
+         TLS offsets, not real symbol references. */
+      if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_TLS)
+        iter.skip_subrtxes ();
+    }
+  return false;
 }
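The "return -1 from the callback" idiom and its replacement, iter.skip_subrtxes (), both mean "look at this node but do not descend into it". A self-contained sketch of that idea on a toy tree follows; tls_node and tls_node_referenced_p are hypothetical names, and the explicit worklist only stands in for the real iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum tls_code { TLS_PLUS, TLS_SYMBOL_REF, TLS_UNSPEC_TLS };

/* Toy expression node; is_tls is only meaningful for TLS_SYMBOL_REF.  */
struct tls_node
{
  enum tls_code code;
  int is_tls;
  const struct tls_node *ops[2];
  int n_ops;
};

/* Return true if X contains a TLS symbol that is not wrapped in a
   TLS_UNSPEC_TLS node.  */
static bool
tls_node_referenced_p (const struct tls_node *x)
{
  const struct tls_node *stack[32];
  int top = 0;

  stack[top++] = x;
  while (top > 0)
    {
      const struct tls_node *cur = stack[--top];
      if (cur->code == TLS_SYMBOL_REF && cur->is_tls)
        return true;
      /* Counterpart of skip_subrtxes: do not queue the operands.  */
      if (cur->code == TLS_UNSPEC_TLS)
        continue;
      for (int i = 0; i < cur->n_ops; i++)
        {
          assert (top < 32);
          stack[top++] = cur->ops[i];
        }
    }
  return false;
}
```

A TLS symbol directly under a PLUS is reported, while the same symbol hidden inside a TLS_UNSPEC_TLS wrapper is not, matching the comment in the patch that those are TLS offsets rather than real symbol references.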
Re: [PATCH, libfortran] PR 47007, 61847 Locale failures in libgfortran
On Wed, Nov 05, 2014 at 01:48:32PM +0200, Janne Blomqvist wrote:
> On Wed, Nov 5, 2014 at 1:07 PM, Jakub Jelinek ja...@redhat.com wrote:
> > On Wed, Nov 05, 2014 at 12:48:01PM +0200, Janne Blomqvist wrote:
> > > @@ -3528,6 +3533,11 @@ finalize_transfer (st_parameter_dt *dtp)
> > >    if ((dtp->common.flags & IOPARM_DT_HAS_SIZE) != 0)
> > >      *dtp->size = dtp->u.p.size_used;
> > > +#ifdef HAVE_USELOCALE
> > > +  if (dtp->u.p.old_locale != (locale_t) 0)
> > > +    uselocale (dtp->u.p.old_locale);
> > > +#endif
> >
> > I wonder if you shouldn't clear dtp->u.p.old_locale here too, so that uselocale isn't called again.
>
> Sure. In principle this shouldn't be needed, since IIRC the entire dtp->u.p structure is set to 0 when starting an IO operation, but OTOH the cost is insignificant.

If finalize_transfer is guaranteed to be called exactly once on a structure on which dtp->u.p.old_locale has been set, then it is ok as is, though clearing it after such a restoration operation is good style; you could e.g. then assert it is NULL when done with a particular dtp structure in debugging builds etc.

	Jakub
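The restore-then-clear style suggested above might look like the following sketch. toy_transfer stands in for the relevant part of st_parameter_dt and both functions are hypothetical; the only library calls used are the POSIX 2008 uselocale, newlocale and freelocale, assumed to be declared in locale.h as on glibc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>

/* Hypothetical stand-in for the locale part of st_parameter_dt.  */
struct toy_transfer
{
  locale_t old_locale; /* (locale_t) 0 when nothing needs restoring.  */
};

/* Start a formatted transfer: install C_LOC for this thread and remember
   the locale that was active before.  */
static void
toy_begin_formatted (struct toy_transfer *t, locale_t c_loc)
{
  assert (t->old_locale == (locale_t) 0);
  t->old_locale = uselocale (c_loc);
}

/* Finish a transfer: restore the saved locale once, then clear the field
   so that a second call is a no-op.  */
static void
toy_finalize (struct toy_transfer *t)
{
  if (t->old_locale != (locale_t) 0)
    {
      uselocale (t->old_locale);
      t->old_locale = (locale_t) 0;
    }
}
```

With the field cleared, the assert in toy_begin_formatted doubles as the kind of debugging check mentioned in the message: it fires if a structure is reused without having been finalized.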
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
Now if your argument is that IRA/LRA handle this, that's fine, a pointer to that code would be appreciated so that it can be quickly audited. Certainly the old local-alloc/global-alloc had magic for setjmp/longjmp and maybe IRA/LRA does too, but it's better to be sure than just assume. See ira-lives.c:1217 and below. -- Eric Botcazou
Re: The nvptx port [8/11+] Write undefined decls.
On 10/22/2014 08:11 PM, Jeff Law wrote: I'm not going to insist you do this in the same way as the PA. That was a different era -- we had significant motivation to make things work in such a way that everything could be buried in the pa specific files. That sometimes led to less than optimal approaches to fix certain problems. So... is this patch approved? Bernd
Re: [Patch, testsuite] [AArch64,ARM] support bswap tests on aarch64_be
On 31 October 2014 18:15, Ramana Radhakrishnan ramana@googlemail.com wrote: On Wed, Oct 29, 2014 at 1:22 PM, Christophe Lyon christophe.l...@linaro.org wrote: Hi, Following discussions after Thomas's patches improving bswap support https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01279.html I noticed that: * the associated tests weren't executed on aarch64_be * ARM targets older than v6 do not support the needed instructions. The attached patch changes check_effective_target_bswap(): - accept aarch64*-*-* instead of aarch64-*-* - when target is arm*-*-*, check __ARM_ARCH >= 6 2014-10-29 Christophe Lyon christophe.l...@linaro.org * lib/target-supports.exp (check_effective_target_bswap): Update conditions for AArch64 and ARM targets. OK? The ARM (AArch32) changes are ok. Ramana Thank you Ramana. Marcus, I believe the AArch64 part is obvious. OK? Christophe. Christophe.
Re: [Patch ARM-AArch64/testsuite v3 00/21] Neon intrinsics executable tests
ping? On 26 October 2014 17:50, Christophe Lyon christophe.l...@linaro.org wrote: On 24 October 2014 10:07, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 21 October 2014 14:02, Christophe Lyon christophe.l...@linaro.org wrote: This patch series is an updated version of the series I sent here: https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00022.html I addressed comments from Marcus and Richard, and decided to skip support for half-precision variants for the time being. I'll post dedicated patches later. Compared to v2: - the directory containing the new tests is named gcc.target/aarch64/adv-simd instead of gcc.target/aarch64/neon-intrinsics. - the driver is named adv-simd.exp instead of neon-intrinsics.exp - the driver is guarded against the new test parallelization framework - the README file uses 'Advanced SIMD (Neon)' instead of 'Neon' Thank you Christophe. Please commit all 21 patches in the series. Thanks, I have committed the whole series. I've just realized afterwards that the tests aren't guarded against targets not supporting Neon. How about adding the attached small patch? (ChangeLog incorrectly formatted :-() 2014-10-26 Christophe Lyon christophe.l...@linaro.org gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp: Skip tests if target does not support Neon. Christophe /Marcus
Re: [Patch, testsuite] [AArch64,ARM] support bswap tests on aarch64_be
On 5 November 2014 12:08, Christophe Lyon christophe.l...@linaro.org wrote:
> On 31 October 2014 18:15, Ramana Radhakrishnan ramana@googlemail.com wrote:
>> On Wed, Oct 29, 2014 at 1:22 PM, Christophe Lyon christophe.l...@linaro.org wrote:
>>> Hi,
>>>
>>> Following discussions after Thomas's patches improving bswap support
>>> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01279.html
>>> I noticed that:
>>> * the associated tests weren't executed on aarch64_be
>>> * ARM targets older than v6 do not support the needed instructions.
>>>
>>> The attached patch changes check_effective_target_bswap():
>>> - accept aarch64*-*-* instead of aarch64-*-*
>>> - when target is arm*-*-*, check __ARM_ARCH >= 6
>>>
>>> 2014-10-29 Christophe Lyon christophe.l...@linaro.org
>>>
>>> * lib/target-supports.exp (check_effective_target_bswap):
>>> Update conditions for AArch64 and ARM targets.
>>>
>>> OK?
>>
>> The ARM (AArch32) changes are ok.
>>
>> Ramana
>
> Thank you Ramana.
>
> Marcus, I believe the AArch64 part is obvious. OK?

Yep, it looks obvious to me.

/Marcus
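For readers without the testsuite at hand, here is a rough idea of the kind of code the bswap tests care about. This is only a sketch, not taken from the testsuite: `swap32_open_coded` and `target_may_have_bswap_insn` are made-up names, and the preprocessor guard mirrors only the ARM half of the condition described above (other targets are simply assumed to be fine for the sake of the illustration).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: an open-coded 32-bit byte swap, i.e. the sort of
   shift-and-mask pattern a bswap optimization is meant to recognise.  */
static uint32_t
swap32_open_coded (uint32_t x)
{
  return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Hypothetical helper mirroring the ARM part of the effective-target
   check: pre-v6 ARM is treated as lacking the needed instructions.
   Returning 1 for every other target is an assumption made purely for
   illustration.  */
static int
target_may_have_bswap_insn (void)
{
#if defined (__arm__) && (!defined (__ARM_ARCH) || __ARM_ARCH < 6)
  return 0;
#else
  return 1;
#endif
}
```

Whether the compiler actually turns the open-coded form into a single byte-reverse instruction depends on the target and optimization level; the sketch only shows the source-level pattern and the shape of the architecture guard.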
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/05/2014 02:23 PM, Marek Polacek wrote:
> On Wed, Nov 05, 2014 at 11:50:20AM +0100, Jakub Jelinek wrote:
>> On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote:
>>> On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote:
>>>> Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure.
>>> I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated.
>> Note, the algorithm we were discussing with Honza for the "is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK"
> Right. Let me see then if I can implement the following soon, maybe it makes sense to rebase Yuri's patch only on top of this algorithm.

The algorithm looks like should_hoist_expr_to_dom in gcse.c btw.

BTW have you considered relaxing the non-freeing restriction to not drop accesses to globals and stack variables? I wonder if we could win something there.

-Y
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 03:16:49PM +0300, Yury Gribov wrote:
> On 11/05/2014 02:23 PM, Marek Polacek wrote:
>> On Wed, Nov 05, 2014 at 11:50:20AM +0100, Jakub Jelinek wrote:
>>> On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote:
>>>> On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote:
>>>>> Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure.
>>>> I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated.
>>> Note, the algorithm we were discussing with Honza for the "is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK"
>> Right. Let me see then if I can implement the following soon, maybe it makes sense to rebase Yuri's patch only on top of this algorithm.
> The algorithm looks like should_hoist_expr_to_dom in gcse.c btw.
>
> BTW have you considered relaxing the non-freeing restriction to not drop accesses to globals and stack variables? I wonder if we could win something there.

Wouldn't it break most uses of __asan_poison_memory_region ?

Jakub
Re: [RTL, Patch] Int div by constant compilation enhancement
> 2014-11-03 Alex Velenko alex.vele...@arm.com
>
> * simplify-rtx.c (simplify_binary_operation_1): Div check added.
> * rtl.h (SUBREG_P): New macro added.

Present tense in ChangeLog entries:

* rtl.h (SUBREG_P): New macro.
* simplify-rtx.c (simplify_binary_operation_1): Simplify consecutive
right shifts in combination with a low-part operation.

Can't the 'c1 == size (M2) - size (M1)' condition be relaxed?

-- Eric Botcazou
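The question about relaxing the condition can be explored with ordinary C integers. The following is only a sketch based on a reading of the ChangeLog wording, not on the patch's RTL code: it assumes M2 is a 64-bit mode, M1 a 32-bit mode and both shifts are logical, and the function names are invented for the illustration. Callers must keep c1 below 64, c2 below 32 and c1 + c2 below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Shift in the wide mode, take the low part, then shift again in the
   narrow mode.  */
static uint32_t
two_shifts (uint64_t x, unsigned c1, unsigned c2)
{
  uint32_t low = (uint32_t) (x >> c1);
  return low >> c2;
}

/* One combined shift in the wide mode followed by the low part.  */
static uint32_t
one_shift (uint64_t x, unsigned c1, unsigned c2)
{
  return (uint32_t) (x >> (c1 + c2));
}
```

With c1 = 32, which is size (M2) - size (M1) under these assumptions, the two forms agree. They also appear to agree for larger c1, because the bits above the low part are already zero after the first shift, while for a smaller c1 such as 16 the low-part operation discards bits that the combined shift would have brought in, so the results differ.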
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/05/2014 03:21 PM, Jakub Jelinek wrote:
> On Wed, Nov 05, 2014 at 03:16:49PM +0300, Yury Gribov wrote:
>> On 11/05/2014 02:23 PM, Marek Polacek wrote:
>>> On Wed, Nov 05, 2014 at 11:50:20AM +0100, Jakub Jelinek wrote:
>>>> On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote:
>>>>> On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote:
>>>>>> Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure.
>>>>> I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated.
>>>> Note, the algorithm we were discussing with Honza for the "is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK"
>>> Right. Let me see then if I can implement the following soon, maybe it makes sense to rebase Yuri's patch only on top of this algorithm.
>> The algorithm looks like should_hoist_expr_to_dom in gcse.c btw.
>>
>> BTW have you considered relaxing the non-freeing restriction to not drop accesses to globals and stack variables? I wonder if we could win something there.
> Wouldn't it break most uses of __asan_poison_memory_region ?

Most probably but I wonder if we should ask people to simply do asm volatile with memory clobber in this case? And we probably shouldn't call the whole thing is_nonfreeing anyway.

-Y
Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming
On 03 Nov 10:24, Jakub Jelinek wrote:
> On Tue, Oct 28, 2014 at 10:30:47PM +0300, Ilya Verbin wrote:
>> @@ -474,6 +475,13 @@ cgraph_node::create (tree decl)
>>    gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
>>    node->decl = decl;
>> +
>> +  if (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
>> +    {
>> +      node->offloadable = 1;
>> +      g->have_offload = true;
>> +    }
>
> I wonder if we shouldn't optimize here and call lookup_attribute only if there is a chance that the attribute might be present, so guard with flag_openmp (and flag_openacc later on?). During LTO the cgraph nodes are streamed in and supposedly the flag offloadable too.
>
>> @@ -2129,8 +2141,12 @@ symbol_table::compile (void)
>>      fprintf (stderr, "Performing interprocedural optimizations\n");
>>    state = IPA;
>> +  /* OpenMP offloading requires LTO infrastructure. */
>> +  if (!in_lto_p && flag_openmp && g->have_offload)
>> +    flag_generate_lto = 1;
>
> On the other side, do you need flag_openmp here? Supposedly g->have_offload would already have been set if needed.

Done, flag_openmp moved from symbol_table::compile to cgraph_node::create and varpool_node::get_create. OK for trunk?

Maybe also with this change?
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 4e9ed25..beae5b5 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1653,8 +1653,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
           if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
               && DECL_P (decl)
               && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
-              && lookup_attribute ("omp declare target",
-                                   DECL_ATTRIBUTES (decl)))
+              && varpool_node::get_create (decl)->offloadable)
             break;
           if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
               && OMP_CLAUSE_MAP_KIND (c) == OMP_CLAUSE_MAP_POINTER)
@@ -1794,8 +1793,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
           decl = OMP_CLAUSE_DECL (c);
           if (DECL_P (decl)
               && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
-              && lookup_attribute ("omp declare target",
-                                   DECL_ATTRIBUTES (decl)))
+              && varpool_node::get_create (decl)->offloadable)
             break;
           if (DECL_P (decl))
             {

Thanks,
-- Ilya

---
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 9a47ba2..a491886 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -70,6 +70,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-dfa.h"
 #include "profile.h"
 #include "params.h"
+#include "context.h"
 /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. */
 #include "tree-pass.h"
@@ -474,6 +475,14 @@ cgraph_node::create (tree decl)
   gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
   node->decl = decl;
+
+  if (flag_openmp
+      && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
+    {
+      node->offloadable = 1;
+      g->have_offload = true;
+    }
+
   node->register_symbol ();
   if (DECL_CONTEXT (decl) && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 377adce..4988f2d 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -463,6 +463,13 @@ public:
   /* Set when init priority is set. */
   unsigned in_init_priority_hash : 1;
+  /* Set when symbol needs to be streamed into LTO bytecode for LTO, or in case
+     of offloading, for separate compilation for a different target. */
+  unsigned need_lto_streaming : 1;
+
+  /* Set when symbol can be streamed into bytecode for offloading. */
+  unsigned offloadable : 1;
+
   /* Ordering of all symtab entries. */
   int order;
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 3e76bf0..83ab419 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -218,6 +218,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-nested.h"
 #include "gimplify.h"
 #include "dbgcnt.h"
+#include "lto-section-names.h"
 /* Queue of cgraph nodes scheduled to be added into cgraph. This is a
    secondary queue used during optimization to accommodate passes that
@@ -2049,7 +2050,18 @@ ipa_passes (void)
     targetm.asm_out.lto_start ();
   if (!in_lto_p)
-    ipa_write_summaries ();
+    {
+      if (g->have_offload)
+        {
+          section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
+          ipa_write_summaries (true);
+        }
+      if (flag_lto)
+        {
+          section_name_prefix = LTO_SECTION_NAME_PREFIX;
+          ipa_write_summaries (false);
+        }
+    }
   if (flag_generate_lto)
     targetm.asm_out.lto_end ();
@@ -2129,8 +2141,12 @@ symbol_table::compile (void)
     fprintf (stderr, "Performing interprocedural optimizations\n");
   state = IPA;
+  /* Offloading requires LTO infrastructure. */
+  if (!in_lto_p && g->have_offload)
+    flag_generate_lto = 1;
+
+  /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE. */
-  if (flag_lto)
+  if (flag_generate_lto)
Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming
On Wed, Nov 05, 2014 at 03:46:55PM +0300, Ilya Verbin wrote:
> Maybe also with this change?
>
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 4e9ed25..beae5b5 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -1653,8 +1653,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
>            if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
>                && DECL_P (decl)
>                && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
> -              && lookup_attribute ("omp declare target",
> -                                   DECL_ATTRIBUTES (decl)))
> +              && varpool_node::get_create (decl)->offloadable)
>              break;
>            if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
>                && OMP_CLAUSE_MAP_KIND (c) == OMP_CLAUSE_MAP_POINTER)
> @@ -1794,8 +1793,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
>            decl = OMP_CLAUSE_DECL (c);
>            if (DECL_P (decl)
>                && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
> -              && lookup_attribute ("omp declare target",
> -                                   DECL_ATTRIBUTES (decl)))
> +              && varpool_node::get_create (decl)->offloadable)
>              break;
>            if (DECL_P (decl))
>              {

That looks reasonable (of course if the other patch is committed).

> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -70,6 +70,7 @@ along with GCC; see the file COPYING3. If not see
>  #include "tree-dfa.h"
>  #include "profile.h"
>  #include "params.h"
> +#include "context.h"
>  /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. */
>  #include "tree-pass.h"
> @@ -474,6 +475,14 @@ cgraph_node::create (tree decl)
>    gcc_assert (TREE_CODE (decl) == FUNCTION_DECL);
>    node->decl = decl;
> +
> +  if (flag_openmp
> +      && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
> +    {
> +      node->offloadable = 1;
> +      g->have_offload = true;
> +    }
> +
>    node->register_symbol ();

LGTM.

Jakub
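The point of guarding the attribute lookup with flag_openmp can be shown with a small stand-alone mock. Everything below is a stand-in invented for illustration (the struct layouts, `has_attr`, `mark_if_offloadable` and the `_demo` variables are not GCC's real declarations); the sketch only demonstrates that, with the cheap flag tested first, the attribute list is never walked when OpenMP is disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for a decl's attribute list and a symbol-table node.  */
struct attr { const char *name; struct attr *next; };
struct node { struct attr *attrs; unsigned offloadable : 1; };

static bool flag_openmp_demo;   /* stand-in for flag_openmp */
static bool have_offload_demo;  /* stand-in for g->have_offload */
static unsigned lookups;        /* counts how often the list is walked */

/* Stand-in for lookup_attribute: linear walk over the attribute list.  */
static bool
has_attr (const struct attr *a, const char *name)
{
  lookups++;
  for (; a != NULL; a = a->next)
    if (strcmp (a->name, name) == 0)
      return true;
  return false;
}

/* The && short-circuits, so the walk happens only when the flag is set.  */
static void
mark_if_offloadable (struct node *n)
{
  if (flag_openmp_demo && has_attr (n->attrs, "omp declare target"))
    {
      n->offloadable = 1;
      have_offload_demo = true;
    }
}
```

Once have_offload_demo is set here, a later consumer can test that single flag and does not need to re-check the OpenMP flag, which is the shape of the change discussed for symbol_table::compile.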
Re: [gofrontend-dev] [PATCH 1/4] Gccgo port to s390[x] -- part II
On Wed, Nov 05, 2014 at 11:31:28AM +0100, Dominik Vogt wrote:
> On Tue, Nov 04, 2014 at 02:39:34PM -0800, Ian Taylor wrote:
>> Note that libgo/runtime/runtime.c now refers to S390_HAVE_STCKF. It's not obvious to me that that is defined anywhere. Perhaps it is in a later patch in this series--I haven't looked.

The attached patch fixes this (by using stckf unconditionally).

Ciao

Dominik ^_^ ^_^

--
Dominik Vogt
IBM Germany

ChangeLog

2014-11-05 Dominik Vogt v...@linux.vnet.ibm.com

* libgo/runtime/runtime.c (runtime_cputicks): s390: use stckf unconditionally

From 9b787b6190b5dd12e5be42e4c65a6907ca99bb59 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Wed, 5 Nov 2014 13:38:23 +0100
Subject: [PATCH 1/2] libgo: Use stckf unconditionally on s390[x].

---
 libgo/runtime/runtime.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libgo/runtime/runtime.c b/libgo/runtime/runtime.c
index abc1aff..496e77b 100644
--- a/libgo/runtime/runtime.c
+++ b/libgo/runtime/runtime.c
@@ -195,12 +195,12 @@ runtime_cputicks(void)
   asm("rdtsc" : "=a" (low), "=d" (high));
   return (int64)(((uint64)high << 32) | (uint64)low);
 #elif defined (__s390__) || defined (__s390x__)
-  uint64 clock;
-#ifdef S390_HAVE_STCKF
-  asm("stckf\t%0" : "=Q" (clock) : : );
-#else
-  clock = 0;
-#endif
+  uint64 clock = 0;
+  /* stckf may not write the return variable in case of a clock error, so make
+     it read-write to prevent that the initialisation is optimised out.
+     Note: Targets below z9-109 will crash when executing store clock fast, i.e.
+     we don't support Go for machines older than that. */
+  asm volatile(".insn s,0xb27c,%0" /* stckf */ : "+Q" (clock) : : "cc" );
   return (int64)clock;
 #else
   // FIXME: implement for other processors.
--
1.8.4.2
Re: [PATCH 1/2, x86, PR63534] Fix darwin bootstrap
On Tue, Nov 4, 2014 at 1:40 AM, Jeff Law l...@redhat.com wrote:
> On 11/01/14 06:39, Evgeny Stupachenko wrote:
>> When PIC register is pseudo there is nothing special about its value that setjmp can hurt. So if the pseudo register lives across setjmp_receiver RA should care about correct allocation (in case it is not saved/restored, it should go on stack). gcc.dg tests and specs I've tested behave like this.
>
> If the allocator picked a call-clobbered register for the PIC register, then we're obviously OK since the setjmp has to be expected to clobber the PIC register. But if the PIC register is in a call-saved register, then it's going to be assumed to not be clobbered across calls and I don't believe that is guaranteed for builtin setjmp/longjmp. Those restore SP, FP and an ARGP, but not anything else by default.

I still don't see what is special for PIC register here. PIC pseudo now behaves as every other pseudo register. If we assume that setjmp can change a pseudo register value we need IRA/LRA magic for each pseudo register. I believe that when we had EBX fixed, IRA/LRA didn't save/restore it anywhere. Therefore we had to care about EBX value in special cases like setjmp/non-local goto. Now RA cares about PIC pseudo as well as about correct allocation for any pseudo register.

> So the callee might have clobbered the call saved hard register, expecting to restore its value in its epilogue. But due to the longjmp, that epilogue never gets called and thus the call-saved register won't have the right value in the receiver.
>
> Now if your argument is that IRA/LRA handle this, that's fine, a pointer to that code would be appreciated so that it can be quickly audited. Certainly the old local-alloc/global-alloc had magic for setjmp/longjmp and maybe IRA/LRA does too, but it's better to be sure than just assume.
>
>> The initial problem comes from non-local goto as it tries to emit pseudo PIC register after reload.
>
> ? You mean it emits a reference to the pseudo into RTL? That would indicate that the allocators never put the pseudo into a hard register?!? RTL dumps with a few pointers to key insns would help here.

Correct, that is why Darwin crashes with ICE on non-local goto. We still have:

(define_insn_and_split "nonlocal_goto_receiver"
  [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]
  "TARGET_MACHO && !TARGET_64BIT && flag_pic"
  "#"
  "reload_completed"
  [(const_int 0)]
{
  if (crtl->uses_pic_offset_table)
    {
      rtx xops[3];
      rtx label_rtx = gen_label_rtx ();
      rtx tmp;

      /* Get a new pic base. */
      emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx));
      ...

Here for MAC only we are trying to use pseudo PIC: pic_offset_table_rtx when reload_completed.

> jeff
Re: [PATCH 1/2, x86, PR63534] Fix darwin bootstrap
We don't emit extra SET_GOT. That is beneficial. As for stack usage, it is up to RA to decide which register is more beneficial to put on the stack.

On Sat, Nov 1, 2014 at 8:33 PM, Mike Stump mikest...@comcast.net wrote:
> On Nov 1, 2014, at 5:39 AM, Evgeny Stupachenko evstu...@gmail.com wrote:
>> When PIC register is pseudo there is nothing special about its value that setjmp can hurt. So if the pseudo register lives across setjmp_receiver RA should care about correct allocation (in case it is not saved/restored, it should go on stack).
>
> So, why is consuming more stack space beneficial?
Re: The nvptx port [10/11+] Target files
On 11/04/2014 05:51 PM, Bernd Schmidt wrote:
> On 11/04/2014 05:48 PM, Richard Henderson wrote:
>> On 10/28/2014 03:56 PM, Bernd Schmidt wrote:
>>> +nvptx_ptx_type_from_mode (enum machine_mode mode, bool promote)
>>> +{
>>> +  switch (mode)
>>> +    {
>>> +    case BLKmode:
>>> +      return ".b8";
>>> +    case BImode:
>>> +      return ".pred";
>>> +    case QImode:
>>> +      if (promote)
>>> +        return ".u32";
>>> +      else
>>> +        return ".u8";
>>> +    case HImode:
>>> +      return ".u16";
>>
>> Promote here too? Or does this have nothing to do with
>>
>>> +static enum machine_mode
>>> +arg_promotion (enum machine_mode mode)
>>> +{
>>> +  if (mode == QImode || mode == HImode)
>>> +    return SImode;
>>> +  return mode;
>>> +}
>
> No, these are different problems - the one in arg promotion is purely about K&R C and trying to match untyped function decls with calls, while the type_from_mode bit was about some ptx idiosyncrasy. Although I forget what the problem was, that code is more than a year old - I'll see if I can get rid of this.

Err, no, it's quite necessary. From the manual:

  "The .u8, .s8 and .b8 instruction types are restricted to ld, st and cvt instructions."

This means that if the compiler generates reasonable-looking code along the lines of

  .reg .u8 %r70;
  mov.u8 %r70,48;

you get

  ptxas 2211-1.o, line 191; error : Arguments mismatch for instruction 'mov'

Now, one _could_ write .cvt.u8.u32 for the load immediate, but then one would also have to write .cvt.u8.u8 for register-register moves, and that's starting to look iffy. I don't really want to rely on the ptx assembler to do the right thing for conversions from one type to itself.

Bernd
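To make the difference between the two helpers concrete, here is a stand-alone sketch that mirrors only the cases visible in the quoted hunks. The `demo_` names and the reduced mode enum are invented for the illustration and are not the port's real types; modes the quoted code does not show are deliberately left unmapped rather than guessed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the few machine modes discussed above.  */
enum demo_mode { DEMO_BLK, DEMO_BI, DEMO_QI, DEMO_HI, DEMO_SI };

/* Mirrors the quoted type mapping: only QImode depends on PROMOTE,
   because 8-bit instruction types are restricted to ld, st and cvt.  */
static const char *
demo_ptx_type (enum demo_mode mode, bool promote)
{
  switch (mode)
    {
    case DEMO_BLK:
      return ".b8";
    case DEMO_BI:
      return ".pred";
    case DEMO_QI:
      return promote ? ".u32" : ".u8";
    case DEMO_HI:
      return ".u16";
    default:
      /* Modes not shown in the quoted hunk are not mapped here.  */
      return NULL;
    }
}

/* Mirrors the quoted argument promotion: a separate concern that widens
   both 8-bit and 16-bit arguments.  */
static enum demo_mode
demo_arg_promotion (enum demo_mode mode)
{
  if (mode == DEMO_QI || mode == DEMO_HI)
    return DEMO_SI;
  return mode;
}
```

Under this reading, a register holding a QImode value would be declared with the promoted type so that plain mov instructions stay legal, while loads and stores can still use the unpromoted .u8 type.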
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/05/2014 03:34 PM, Yury Gribov wrote:
> On 11/05/2014 03:21 PM, Jakub Jelinek wrote:
>> On Wed, Nov 05, 2014 at 03:16:49PM +0300, Yury Gribov wrote:
>>> On 11/05/2014 02:23 PM, Marek Polacek wrote:
>>>> On Wed, Nov 05, 2014 at 11:50:20AM +0100, Jakub Jelinek wrote:
>>>>> On Wed, Nov 05, 2014 at 11:29:19AM +0100, Marek Polacek wrote:
>>>>>> On Wed, Nov 05, 2014 at 12:54:37PM +0300, Yury Gribov wrote:
>>>>>>> Are you going to work on ASan soon? I could rebase my patches on top of Marek's infrastructure.
>>>>>> I'm not going to work on ASan today or tomorrow, but it'd be nice to get this ASan opt in in this stage1. So if you can rebase your patch, I think that will be appreciated.
>>>>> Note, the algorithm we were discussing with Honza for the "is there any possibility of a freeing call on the path between a dominating and dominated ASAN_CHECK"
>>>> Right. Let me see then if I can implement the following soon, maybe it makes sense to rebase Yuri's patch only on top of this algorithm.
>>> The algorithm looks like should_hoist_expr_to_dom in gcse.c btw.
>>>
>>> BTW have you considered relaxing the non-freeing restriction to not drop accesses to globals and stack variables? I wonder if we could win something there.
>> Wouldn't it break most uses of __asan_poison_memory_region ?
> Most probably but I wonder if we should ask people to simply do asm volatile with memory clobber in this case? And we probably shouldn't call the whole thing is_nonfreeing anyway.

Added Kostya to maybe comment on this.

-Y
Re: [Patch ARM-AArch64/testsuite v3 00/21] Neon intrinsics executable tests
On Sun, Oct 26, 2014 at 4:50 PM, Christophe Lyon christophe.l...@linaro.org wrote: On 24 October 2014 10:07, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 21 October 2014 14:02, Christophe Lyon christophe.l...@linaro.org wrote: This patch series is an updated version of the series I sent here: https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00022.html I addressed comments from Marcus and Richard, and decided to skip support for half-precision variants for the time being. I'll post dedicated patches later. Compared to v2: - the directory containing the new tests is named gcc.target/aarch64/adv-simd instead of gcc.target/aarch64/neon-intrinsics. - the driver is named adv-simd.exp instead of neon-intrinsics.exp - the driver is guarded against the new test parallelization framework - the README file uses 'Advanced SIMD (Neon)' instead of 'Neon' Thank you Christophe. Please commit all 21 patches in the series. Thanks, I have committed the whole series. I've just realized afterwards that the tests aren't guarded against targets not supporting Neon. How about adding the attached small patch? (ChangeLog incorrectly formatted :-() 2014-10-26 Christophe Lyon christophe.l...@linaro.org gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp: Skip tests if target does not support Neon. Ok by me ... Ramana Christophe /Marcus
Re: [Patch ARM-AArch64/testsuite v3 00/21] Neon intrinsics executable tests
On 26 October 2014 16:50, Christophe Lyon christophe.l...@linaro.org wrote:
> I've just realized afterwards that the tests aren't guarded against targets not supporting Neon. How about adding the attached small patch?
>
> +if {[istarget arm*-*-*]
> +    && ![check_effective_target_arm_neon_ok]} then {
> +  return
> +}
> +

Umm, first thought was that this is a bit of a hack, but having just discussed it with Ramana we don't have a better alternative to hand, so OK.

/Marcus
Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables
On 08 Oct 11:23, Jakub Jelinek wrote: LGTM, with the requested var/section renames. Would like if Honza and/or Richard had a look at the cgraph/LTO stuff in the patch though. Since patch 2 was updated, this patch also should be updated. Now the offload_vars array is filled in varpool_node::get_create . Richard, is it OK for trunk? Thanks, -- Ilya --- diff --git a/gcc/Makefile.in b/gcc/Makefile.in index f31af05..3db30bf 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2303,6 +2303,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \ $(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \ $(srcdir)/tree-parloops.c \ $(srcdir)/omp-low.c \ + $(srcdir)/omp-low.h \ $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \ $(srcdir)/cgraphclones.c \ $(srcdir)/tree-phinodes.c \ diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 83ab419..bafbadb 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -218,6 +218,7 @@ along with GCC; see the file COPYING3. If not see #include tree-nested.h #include gimplify.h #include dbgcnt.h +#include omp-low.h #include lto-section-names.h /* Queue of cgraph nodes scheduled to be added into cgraph. This is a diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 5036d4f..6a5a031 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -11205,6 +11205,12 @@ If defined, this function returns an appropriate alignment in bits for an atomic ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence. This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence. 
This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}. The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place. The default implementation leaves all three expressions as @code{NULL_TREE}. The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}. @end deftypefn +@deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree) +Used when offloaded functions are seen in the compilation unit and no named +sections are available. It is called once for each symbol that must be +recorded in the offload function and variable table. +@end deftypefn + @defmac TARGET_SUPPORTS_WIDE_INT On older ports, large integers are stored in @code{CONST_DOUBLE} rtl diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 5674e6c..cadf05d 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -8167,6 +8167,8 @@ and the associated definitions of those functions. 
@hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV +@hook TARGET_RECORD_OFFLOAD_SYMBOL + @defmac TARGET_SUPPORTS_WIDE_INT On older ports, large integers are stored in @code{CONST_DOUBLE} rtl diff --git a/gcc/gengtype.c b/gcc/gengtype.c index e48b448..06c37d5 100644 --- a/gcc/gengtype.c +++ b/gcc/gengtype.c @@ -1843,7 +1843,7 @@ open_base_files (void) tree-ssa.h, reload.h, cpp-id-data.h, tree-chrec.h, except.h, output.h, cfgloop.h, target.h, ipa-prop.h, lto-streamer.h, target-globals.h, - ipa-inline.h, dwarf2out.h, NULL + ipa-inline.h, dwarf2out.h, omp-low.h, NULL }; const char *const *ifp; outf_p gtype_desc_c; diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index 45655ba..6c442cf 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see #include context.h #include pass_manager.h #include ipa-utils.h +#include omp-low.h /* True when asm nodes has been output. */ bool asm_nodes_output = false; @@ -1057,6 +1058,49 @@ read_string (struct lto_input_block *ib) return str; } +/* Output function/variable tables that will allow libgomp to look up offload + target code. OFFLOAD_FUNCS is filled in expand_omp_target, OFFLOAD_VARS is + filled in ipa_passes. In WHOPR (partitioned) mode during the WPA stage both + OFFLOAD_FUNCS and OFFLOAD_VARS are filled by input_offload_tables. */ + +void +output_offload_tables (void) +{ + if (vec_safe_is_empty (offload_funcs) vec_safe_is_empty (offload_vars)) +return; + + struct lto_simple_output_block *ob += lto_create_simple_output_block (LTO_section_offload_table); + + for (unsigned i = 0; i vec_safe_length (offload_funcs); i++) +{ + streamer_write_enum (ob-main_stream, LTO_symtab_tags, + LTO_symtab_last_tag,
Re: [PATCH][AArch64][4.8] LINK_SPEC changes for Cortex-A53 erratum 835769 workaround
On 22 October 2014 15:20, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, This is the 4.8 backport of the LINK_SPEC changes to pass down the linker option --fix-cortex-a53-835769 Bootstrapped and tested on aarch64-none-linux-gnu. This depends on the patches under review at: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01757.html and https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01758.html Ok for that branch after the prerequisites go in? Thanks, Kyrill 2014-10-22 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64-elf-raw.h (CA53_ERR_835769_SPEC): Define. (LINK_SPEC): Include CA53_ERR_835769_SPEC. * config/aarch64/aarch64-linux.h (CA53_ERR_835769_SPEC): Define. (LINK_SPEC): Include CA53_ERR_835769_SPEC. OK /Marcus
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On Wed, Nov 05, 2014 at 04:13:01PM +0300, Yury Gribov wrote:
>>> Wouldn't it break most uses of __asan_poison_memory_region ?
>> Most probably but I wonder if we should ask people to simply do asm volatile with memory clobber in this case? And we probably shouldn't call the whole thing is_nonfreeing anyway.
> Added Kostya to maybe comment on this.

Well, right now !nonfreeing_call_p is any non-builtin call or non-leaf builtin call (i.e. builtin that might call functions in the current CU), or free/tm_free/realloc/stack_restore. So, by this definition __asan_poison_memory_region is also a !nonfreeing_call_p. Where would you like to see the volatile with memory clobber? You might very well just call some function that does the free for you.

  *p = 1; foo (p); *p = 2;

and foo (p) could asm volatile ("" : : : "memory"); somewhere and free (p) somewhere else.

If in the future we e.g. IPA-prop propagate the nonfreeing_call_p property through the callgraph (as in, if the function you call is non-overridable and you know the flag for it, use it), things would still work unless you LTOed libasan together with your app (to make that work you'd probably want to add asm volatile to those calls, but doing it now would be very premature, or make __asan_*poison_memory_region a builtin that would be handled explicitly).

Anyway, what I mean, ATM most of the calls are still going to be considered possibly freeing, and it will be pretty much the same calls that are considered as potentially calling __asan_poison_memory_region, so the optimization wouldn't change much.

Jakub
Re: [Ping] [PATCH, 9/10] aarch64: generate conditional compare instructions
On 11/05/2014 10:05 AM, Zhenqiang Chen wrote: I had retested all the ccmp patches. Bootstrap and no make check regression on X86-64. Bootstrap and no make check regression on AARCH64 qemu. OK for trunk? No patch? Or what is it that you're wanting approval for? r~
Re: [PATCH] Optimize UBSAN_NULL checks, add sanopt.c
On 11/05/2014 04:23 PM, Jakub Jelinek wrote:
> On Wed, Nov 05, 2014 at 04:13:01PM +0300, Yury Gribov wrote:
>>>> Wouldn't it break most uses of __asan_poison_memory_region ?
>>> Most probably but I wonder if we should ask people to simply do asm volatile with memory clobber in this case? And we probably shouldn't call the whole thing is_nonfreeing anyway.
>> Added Kostya to maybe comment on this.
>
> Well, right now !nonfreeing_call_p is any non-builtin call or non-leaf builtin call (i.e. builtin that might call functions in the current CU), or free/tm_free/realloc/stack_restore. So, by this definition __asan_poison_memory_region is also a !nonfreeing_call_p. Where would you like to see the volatile with memory clobber? You might very well just call some function that does the free for you.
>
>   *p = 1; foo (p); *p = 2;
>
> and foo (p) could asm volatile ("" : : : "memory"); somewhere and free (p) somewhere else.

I was thinking about e.g. removing the check for the second access in

  extern int x[];
  void bar (int i) { x[i] = 1; foo (); x[i] = 2; }

because accessibility of x[i] obviously can't be changed by any call to free () inside foo (). But you are probably right that __asan_poison_memory_region could potentially be called inside foo (however rare it is), which would preclude this sort of optimization.

> If in the future we e.g. IPA-prop propagate the nonfreeing_call_p property through the callgraph (as in, if the function you call is non-overridable and you know the flag for it, use it),

FYI we tried this on SPEC and some other apps but saw no performance improvements.

-Y
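For reference, the global-array case above can be written out as a compilable sketch. The names `store_twice`, `opaque_callee` and `calls` are invented for the illustration, and `opaque_callee` merely stands in for a call the compiler cannot see into; the comments describe where ASan checks would conceptually sit, not what any particular compiler version emits.

```c
#include <assert.h>

int x[8];          /* a global: free () cannot change whether x[i] is accessible */
static int calls;  /* records that the callee really ran */

/* Hypothetical stand-in for an opaque function.  In the discussion it
   might free heap memory, or (rarely) poison part of x.  */
static void
opaque_callee (void)
{
  calls++;
}

void
store_twice (int i)
{
  x[i] = 1;          /* first access: checked */
  opaque_callee ();  /* cannot free x, but could in principle poison it */
  x[i] = 2;          /* second access: the check one would like to drop */
}
```

The second check is redundant with respect to freeing, which is the argument for relaxing the restriction for globals and stack variables; it stops being redundant only if the callee can reach something like __asan_poison_memory_region, which is the objection raised in the thread.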
Re: [libcc1, build] Enable libcc1 on Solaris
On 03/11/14 16:54, Rainer Orth wrote: I noticed that the new libcc1 wasn't built on Solaris. This happens because socketpair doesn't live in libc, but in libsocket instead. To deal with this, I've copied the libgo (and libjava) code to detect the need for libsocket and libnsl. Once the build was attempted, two failures had to be dealt with: * FD_ZERO and friends need string.h for a memset declaration. * On Solaris 10, AF_LOCAL isn't defined in system headers, while AF_UNIX is. In both libgo and libjava, there are unconditional uses of AF_UNIX, so I've followed their lead. Those changes allowed libcc1.so to build. Bootstrapped without regressions on i386-pc-solaris2.1[01] and x86_64-unknown-linux-gnu, ok for mainline? Btw., MAINTAINERS doesn't currently list a libcc1 maintainer. I believe it should. Rainer 2014-10-31 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.ac (libcc1_cv_lib_sockets): Check for -lsocket -lnsl. * configure: Regenerate. * connection.cc: Include string.h. * libcc1.cc (libcc1_compile): Use AF_UNIX instead of AF_LOCAL. The configure change is fine. Also the include. (From a libcc authorship point of view). I am not aware of the history of AF_UNIX over AF_LOCAL so I have no comment on that. Please await permission from a GCC maintainer (I am not one). Cheers Phil
[PATCH] PR 63721 IPA ICF cause atomic-comp-swap-release-acquire.c ICE
The same ICE happens on x86-64 when compiling with -O2 -fPIC. The reason is that the following two functions are identical, so the IPA-ICF pass tries to transform the second function to call the first one directly. int atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b) { return __atomic_compare_exchange (&v, &a, &b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE); } int atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b) { return __atomic_compare_exchange_n (&v, &a, b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE); } During this transformation, something appears to go wrong with the function argument handling. Take a for example: because &a is used later in the body, it is marked as addressable. After the transformation, if we turn the second function into int atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b) { return atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (a, b); } then argument a is no longer addressable. So, in cgraph_node::release_body, when making the wrapper, besides clearing the function body, we should also clear the addressable flag for the function args, because it is determined by the function body, which has been cleared. Bootstrap OK on x86-64 with no regressions. Bootstrap OK on aarch64 juno. The ICE is gone on arm and x86-64. OK for trunk? gcc/ PR tree-optimization/63721 * cgraph.c (cgraph_node::release_body): Clear addressable flag for function args. 
diff --git a/gcc/cgraph.c b/gcc/cgraph.c index d430bc5..7ac0b2a 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1668,6 +1668,7 @@ release_function_body (tree decl) void cgraph_node::release_body (bool keep_arguments) { + tree arg_p; ipa_transforms_to_apply.release (); if (!used_as_abstract_origin && symtab->state != PARSING) { @@ -1676,6 +1677,10 @@ cgraph_node::release_body (bool keep_arguments) if (!keep_arguments) DECL_ARGUMENTS (decl) = NULL; } + + for (arg_p = DECL_ARGUMENTS (decl); arg_p; arg_p = DECL_CHAIN (arg_p)) +TREE_ADDRESSABLE (arg_p) = 0; + /* If the node is abstract and needed, then do not clear DECL_INITIAL of its associated function function declaration because it's needed to emit debug info later. */
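To make the addressability point concrete, here is a small user-level sketch of the situation, written against the documented `__atomic_compare_exchange` built-in. It is not the testcase from the PR: the names `cas_generic`, `cas_wrapper` and `check_wrapper` are hypothetical, and `__ATOMIC_SEQ_CST` is used for both memory orders only to keep the sketch simple.

```c
#include <assert.h>

static int v;   /* shared variable, stand-in for the testcase's global */

/* Both parameters have their address taken (&a, &b), so the compiler
   has to treat them as addressable while this body exists.  */
int cas_generic (int a, int b)
{
  return __atomic_compare_exchange (&v, &a, &b, 0,
                                    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* What an identical function looks like once it has been turned into a
   wrapper: a and b are only forwarded by value, so nothing in this body
   takes their address any more.  */
int cas_wrapper (int a, int b)
{
  return cas_generic (a, b);
}

/* Small self-check showing that the wrapper behaves like the original.  */
void check_wrapper (void)
{
  v = 0;
  assert (cas_wrapper (0, 5) == 1 && v == 5);   /* expected matches: swap */
  assert (cas_wrapper (0, 7) == 0 && v == 5);   /* expected differs: no swap */
}
```

The patch above is about keeping the compiler's internal flag on the parameters consistent with the second form; the sketch only shows why the two bodies differ in that respect.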
libstdc++ new deque failures
Jonathan, I still am seeing new failures in the libstdc++ deque testsuite as of last night. I don't know if you still are working through the fallout from the earlier patches, but I wanted to make you aware. AIX defaults to 32 bit. A template was not initialized for int? FAIL: 23_containers/deque/requirements/dr438/insert_neg.cc (test for errors, line 1943) FAIL: 23_containers/deque/requirements/dr438/insert_neg.cc (test for excess errors) Excess errors: /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/stl_deque.h:1931:25: error: no matching function for call to 'std::dequeA::_M_fill_insert(std::dequeA::iterator, int, int)' FAIL: 23_containers/deque/requirements/dr438/assign_neg.cc (test for errors, line 1859) FAIL: 23_containers/deque/requirements/dr438/assign_neg.cc (test for excess errors) Excess errors: /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/stl_deque.h:1847:25: error: no matching function for call to 'std::dequeA::_M_fill_assign(int, int)' FAIL: 23_containers/deque/requirements/dr438/constructor_1_neg.cc (test for errors, line 1792) FAIL: 23_containers/deque/requirements/dr438/constructor_1_neg.cc (test for excess errors) Excess errors: /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/stl_deque.h:1780:22: error: no matching function for call to 'std::dequestd::dequeint ::_M_fill_initialize(int)' FAIL: 23_containers/deque/requirements/dr438/constructor_2_neg.cc (test for errors, line 1792) FAIL: 23_containers/deque/requirements/dr438/constructor_2_neg.cc (test for excess errors) Excess errors: /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/bits/stl_deque.h:1780:22: error: no matching function for call to 'std::dequestd::dequestd::pairchar, char ::_M_fill_initialize(char)' And these are not related to deque, but appear to be additional issues in the libstdc++ implementation: FAIL: 20_util/tuple/comparison_operators/overloaded.cc (test for excess errors) Excess errors: 
/tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/tuple:102:12: error: 'constexpr std::_Head_base_Idx, _Head, false::_Head_base(const std::_Head_base_Idx, _Head, false) [with long unsigned int _Idx = 0ul; _Head = std::nullptr_t]' conflicts with a previous declaration /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/tuple:102:12: error: 'constexpr std::_Head_base_Idx, _Head, false::_Head_base(const std::_Head_base_Idx, _Head, false) [with long unsigned int _Idx = 0ul; _Head = std::nullptr_t]' conflicts with a previous declaration FAIL: 20_util/tuple/creation_functions/tuple_cat.cc (test for excess errors) Excess errors: /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/tuple:102:12: error: 'constexpr std::_Head_base_Idx, _Head, false::_Head_base(const std::_Head_base_Idx, _Head, false) [with long unsigned int _Idx = 6ul; _Head = std::nullptr_t]' conflicts with a previous declaration /tmp/20141104/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/tuple:102:12: error: 'constexpr std::_Head_base_Idx, _Head, false::_Head_base(const std::_Head_base_Idx, _Head, false) [with long unsigned int _Idx = 6ul; _Head = std::nullptr_t]' conflicts with a previous declaration If you would prefer that I open Bugzilla issues, let me know. Thanks, David
[PATCH, committed] Fix AIX testsuite failures
The appended patch XFAILs or adjusts testcases to avoid spurious warnings on AIX. Bootstrapped on powerpc-ibm-aix7.1.0.0 Thanks, David * gcc.dg/torture/pr59166.c: XFAIL on AIX. * g++.dg/ext/visibility/anon1.C: XFAIL on AIX. * g++.dg/opt/pr60002.C: XFAIL on AIX. * g++.dg/torture/pr63419.C: Ignore non-standard ABI warning. * g++.dg/ipa/ipa-icf-5.C: Require visibility support. Index: torture/pr59166.c === --- torture/pr59166.c (revision 217109) +++ torture/pr59166.c (working copy) @@ -1,5 +1,6 @@ /* PR rtl-optimization/59166 */ /* { dg-additional-options -fcompare-debug } */ +/* { dg-xfail-if { powerpc-ibm-aix* } { * } { } } */ int a, b, c, f, g; Index: ext/visibility/anon1.C === --- ext/visibility/anon1.C (revision 217109) +++ ext/visibility/anon1.C (working copy) @@ -3,6 +3,7 @@ // { dg-do compile } // { dg-final { scan-assembler-not globl.*_ZN.*1fEv } } +// { dg-xfail-if { powerpc-ibm-aix* } { * } { } } namespace { Index: opt/pr60002.C === --- opt/pr60002.C (revision 217109) +++ opt/pr60002.C (working copy) @@ -1,6 +1,7 @@ // PR tree-optimization/60002 // { dg-do compile } // { dg-options -O2 -fcompare-debug -fdeclone-ctor-dtor -fipa-cp-clone } +// { dg-xfail-if { powerpc-ibm-aix* } { * } { } } struct A {}; Index: torture/pr63419.C === --- torture/pr63419.C (revision 217109) +++ torture/pr63419.C (working copy) @@ -1,5 +1,7 @@ // { dg-do compile } // { dg-additional-options -Wno-psabi } +// Ignore warning on some powerpc-linux configurations. +// { dg-prune-output non-standard ABI extension } typedef float __m128 __attribute__ ((__vector_size__ (16))); const int a = 0; Index: ipa/ipa-icf-5.C === --- ipa/ipa-icf-5.C (revision 217109) +++ ipa/ipa-icf-5.C (working copy) @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-require-visibility } */ /* { dg-options -O2 -fdump-ipa-icf } */ struct test
Re: [PATCH, committed] Fix AIX testsuite failures
Hi David, The appended patch XFAILs or adjusts testcases to avoid spurious warnings on AIX. Bootstrapped on powerpc-ibm-aix7.1.0.0 Thanks, David * gcc.dg/torture/pr59166.c: XFAIL on AIX. * g++.dg/ext/visibility/anon1.C: XFAIL on AIX. * g++.dg/opt/pr60002.C: XFAIL on AIX. * g++.dg/torture/pr63419.C: Ignore non-standard ABI warning. * g++.dg/ipa/ipa-icf-5.C: Require visibility support. Index: torture/pr59166.c === --- torture/pr59166.c (revision 217109) +++ torture/pr59166.c (working copy) @@ -1,5 +1,6 @@ /* PR rtl-optimization/59166 */ /* { dg-additional-options -fcompare-debug } */ +/* { dg-xfail-if { powerpc-ibm-aix* } { * } { } } */ please omit the default args to dg-xfail-if/dg-skip-if ({ * } { }) here and in several other testcases: they have been unnecessary for quite some time. Also, please include an explanation of why you are xfailing/skipping the test in the comment field so others can understand what's happening and eventually add their targets to the list if appropriate. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCHv5][Kasan] Allow to override Asan shadow offset from command line
Hello, On 24 Oct 17:56, Yury Gribov wrote: ... +const struct test_data_t test_data[] = { + { STRTOL, -0x8000, 0, -0x8000L, 0 }, ... + switch (test_data[i].fun) + { + case STRTOL: + res = strtol (test_data[i].nptr, 0, test_data[i].base); + break; Since `res' may be a `long long int' on 32-bit targets, it will fail to compare equal with the corresponding `test_data[i].base'. A tiny patch fixes it. -- Thanks, K diff --git a/libiberty/testsuite/test-strtol.c b/libiberty/testsuite/test-strtol.c index 96d6871..6faf81b 100644 --- a/libiberty/testsuite/test-strtol.c +++ b/libiberty/testsuite/test-strtol.c @@ -132,7 +132,8 @@ run_tests (const struct test_data_t *test_data, size_t ntests) switch (test_data[i].fun) { case STRTOL: - res = strtol (test_data[i].nptr, 0, test_data[i].base); + res = (unsigned long) strtol (test_data[i].nptr, + 0, test_data[i].base); break; case STRTOUL: res = strtoul (test_data[i].nptr, 0, test_data[i].base);
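The effect of the added cast can be reproduced with fixed-width types. The sketch below assumes that `res` is a 64-bit unsigned accumulator and that `long` is 32 bits wide, which is my reading of the message and not something the quoted diff states; `widen_directly`, `widen_via_unsigned` and `check_widening` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Models `res = strtol (...)' with a 32-bit long and a 64-bit res:
   the negative value is sign-extended.  */
uint64_t widen_directly (int32_t value)
{
  return (uint64_t) value;
}

/* Models `res = (unsigned long) strtol (...)': the value is first
   converted to the unsigned 32-bit type, then zero-extended.  */
uint64_t widen_via_unsigned (int32_t value)
{
  return (uint64_t) (uint32_t) value;
}

/* The two conversions disagree for negative inputs only.  */
void check_widening (void)
{
  assert (widen_directly (INT32_MIN) == UINT64_C (0xFFFFFFFF80000000));
  assert (widen_via_unsigned (INT32_MIN) == UINT64_C (0x80000000));
  assert (widen_directly (42) == widen_via_unsigned (42));
}
```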
Re: nvptx offloading patches [1/n]
Hi, On Tue, 4 Nov 2014, Jeff Law wrote: They still need to agree on the layout of the structure. And assuming it'll always be memcpy perhaps isn't wise. Consider the possibility that one day (perhaps soon) the host and GPU may share address space memory. Not only soon, there is already hardware out that does exactly that. HSA. Ciao, Michael.
RE: [PATCH, i686] Fix for asan test failures with -m32 happened after EBX enabling in PIC mode
Hi! The following patch (moving initialization of pic_offset_table_rtx earlier) fixes failures for asan tests on 32 bits in PIC mode mentioned here - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534#c48 Bootstrapped/regtested on x86_64, i686 Is it ok for trunk? ChangeLog: 2014-10-30 Igor Zamyatin igor.zamya...@intel.com * function.c (assign_parms): Move init of pic_offset_table_rtx from here to... * cfgexpand.c (expand_used_vars): ...here. The patch is probably fine. However, it would be good to have the analysis why you want to move initialization of the PIC register earlier. Asan (and anybody else) can emit global variable(s) in expand_used_vars during function expansion, while the PIC register is currently initialized later, during expand_function_start in assign_parms, which is too late in the asan case in PIC mode. So to avoid such cases we put the PIC register initialization at the beginning of expand_used_vars. This seems to be early enough. Thanks, Igor
Re: [patch,gomp-4_0-branch] acc nested function support
On Tue, 2014-11-04 at 16:45 -0800, Cesar Philippidis wrote: Here's an updated version of my nested function patch. David, I tweaked the gimple class hierarchy a little bit. Here's what the updated class diagram looks like:

+ gimple_statement_omp
|   |    layout: GSS_OMP.  Used for code GIMPLE_OMP_SECTION
|   |
|   + gimple_statement_omp_parallel_layout
|   |   |    layout: GSS_OMP_PARALLEL_LAYOUT
|   |   |
|   |   + gimple_statement_omp_targetreg
|   |   |   |
|   |   |   + gimple_statement_oacc_kernels
|   |   |   |        code: GIMPLE_OACC_KERNELS
|   |   |   |
|   |   |   + gimple_statement_oacc_parallel
|   |   |   |        code: GIMPLE_OACC_PARALLEL
|   |   |   |
|   |   |   + gimple_statement_omp_target
|   |   |            code: GIMPLE_OMP_TARGET

Basically, I've introduced gimple_statement_omp_targetreg and made GIMPLE_OACC_{PARALLEL,KERNELS} and GIMPLE_OMP_TARGET inherit it. This seems to work out pretty good. It cleans up both {lower,expand}_oacc_offload in omp-low.c and allows OpenACC kernel and parallel regions to be treated as OpenMP target regions in tree-nested.c. Are these changes to gimple.h OK? I'm not a reviewer, so it's not directly up to me, but if it simplifies the code then it seems reasonable. I'm interested in Jakub's opinion. Thomas, assuming these gimple changes are OK, should I commit this change to gomp-4_0-branch, or do you want to include this patch with your middle end trunk submission?
[gomp4] Remove unused BUILT_IN_OMP_SET_NUM_THREADS (was: various OpenACC/PTX built-ins and a reduction tweak)
Hi! On Thu, 18 Sep 2014 20:43:20 +0200, I wrote: On Tue, 16 Sep 2014 17:32:54 -0700, Cesar Philippidis ce...@codesourcery.com wrote: The patch [...] --- a/gcc/omp-builtins.def +++ b/gcc/omp-builtins.def @@ -236,6 +236,3 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_UPDATE, GOMP_target_update, BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST) DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS, GOMP_teams, BT_FN_VOID_UINT_UINT, ATTR_NOTHROW_LIST) - -DEF_GOMP_BUILTIN (BUILT_IN_OMP_SET_NUM_THREADS, omp_set_num_threads, - BT_FN_VOID_INT, ATTR_CONST_NOTHROW_LEAF_LIST) To avoid confusion: that has been added to gomp-4_0-branch earlier, and is now reverted to the trunk state. I have now actually removed this; r217135: commit d2579456a7b9008ba19cabc88393f83334324bdd Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 15:38:02 2014 + Remove unused BUILT_IN_OMP_SET_NUM_THREADS. gcc/ * omp-builtins.def (BUILT_IN_OMP_SET_NUM_THREADS): Remove. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217135 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 4 gcc/omp-builtins.def | 3 --- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index ce98a18..ae1afd0 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,3 +1,7 @@ +2014-11-05 Thomas Schwinge tho...@codesourcery.com + + * omp-builtins.def (BUILT_IN_OMP_SET_NUM_THREADS): Remove. + 2014-11-03 Cesar Philippidis ce...@codesourcery.com * builtins.def (DEF_GOACC_BUILTIN): Revert erroneous checkin. 
diff --git gcc/omp-builtins.def gcc/omp-builtins.def index 698dc79..08b825c 100644 --- gcc/omp-builtins.def +++ gcc/omp-builtins.def @@ -236,6 +236,3 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_UPDATE, GOMP_target_update, BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST) DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS, GOMP_teams, BT_FN_VOID_UINT_UINT, ATTR_NOTHROW_LIST) - -DEF_GOMP_BUILTIN (BUILT_IN_OMP_SET_NUM_THREADS, omp_set_num_threads, - BT_FN_VOID_INT, ATTR_CONST_NOTHROW_LEAF_LIST) Grüße, Thomas
[gomp4] Remove unused OACC_WAIT (was: acc dealloc map)
Hi! On Mon, 20 Oct 2014 13:30:23 -0700, Cesar Philippidis ce...@codesourcery.com wrote: 2014-10-20 Cesar Philippidis ce...@codesourcery.com gcc/ * gimplify.c [...] (gimplify_expr): Remove OACC_WAIT, since it handled directly by the front ends. In r217136, I have now completely removed it: commit 1f7efc1d102a69676af101767271664a5788664e Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 15:43:08 2014 + Remove unused OACC_WAIT. gcc/ * tree.def (OACC_WAIT): Remove. Update all users. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217136 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 2 ++ gcc/doc/generic.texi| 5 - gcc/gimplify.c | 2 -- gcc/tree-pretty-print.c | 5 - gcc/tree.def| 4 gcc/tree.h | 3 --- 6 files changed, 2 insertions(+), 19 deletions(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index ae1afd0..5b2bade 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,5 +1,7 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * tree.def (OACC_WAIT): Remove. Update all users. + * omp-builtins.def (BUILT_IN_OMP_SET_NUM_THREADS): Remove. 2014-11-03 Cesar Philippidis ce...@codesourcery.com diff --git gcc/doc/generic.texi gcc/doc/generic.texi index e756cf3..a638b87 100644 --- gcc/doc/generic.texi +++ gcc/doc/generic.texi @@ -2058,7 +2058,6 @@ edge. Rethrowing the exception is represented using @code{RESX_EXPR}. @tindex OACC_UPDATE @tindex OACC_ENTER_DATA @tindex OACC_EXIT_DATA -@tindex OACC_WAIT @tindex OACC_CACHE @tindex OMP_PARALLEL @tindex OMP_FOR @@ -2115,10 +2114,6 @@ Represents @code{#pragma acc enter data [clause1 @dots{} clauseN]}. Represents @code{#pragma acc exit data [clause1 @dots{} clauseN]}. -@item OACC_WAIT - -Represents @code{#pragma acc wait [(num @dots{})]}. - @item OACC_CACHE Represents @code{#pragma acc cache (var @dots{})}. 
diff --git gcc/gimplify.c gcc/gimplify.c index bdf4f4a..bfd7f66 100644 --- gcc/gimplify.c +++ gcc/gimplify.c @@ -4425,7 +4425,6 @@ is_gimple_stmt (tree t) case OACC_UPDATE: case OACC_ENTER_DATA: case OACC_EXIT_DATA: -case OACC_WAIT: case OACC_CACHE: case OMP_PARALLEL: case OMP_FOR: @@ -8755,7 +8754,6 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, code != OACC_UPDATE code != OACC_ENTER_DATA code != OACC_EXIT_DATA - code != OACC_WAIT code != OACC_CACHE code != OMP_CRITICAL code != OMP_FOR diff --git gcc/tree-pretty-print.c gcc/tree-pretty-print.c index 6f80e80..f311ed9 100644 --- gcc/tree-pretty-print.c +++ gcc/tree-pretty-print.c @@ -2546,11 +2546,6 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags, dump_omp_clauses (buffer, OACC_EXIT_DATA_CLAUSES (node), spc, flags); break; -case OACC_WAIT: - pp_string (buffer, #pragma acc wait); - dump_omp_clauses (buffer, OACC_WAIT_CLAUSES (node), spc, flags); - break; - case OACC_CACHE: pp_string (buffer, #pragma acc cache); dump_omp_clauses (buffer, OACC_CACHE_CLAUSES(node), spc, flags); diff --git gcc/tree.def gcc/tree.def index 1af7d81..871a7fb 100644 --- gcc/tree.def +++ gcc/tree.def @@ -1163,10 +1163,6 @@ DEFTREECODE (OACC_ENTER_DATA, oacc_enter_data, tcc_statement, 1) Operand 0: OACC_EXIT_DATA_CLAUSES: List of clauses. */ DEFTREECODE (OACC_EXIT_DATA, oacc_exit_data, tcc_statement, 1) -/* OpenACC - #pragma acc wait [clause1 ... clauseN] - Operand 0: OACC_WAIT_CLAUSES: List of clauses. */ -DEFTREECODE (OACC_WAIT, oacc_wait, tcc_statement, 1) - /* OpenACC - #pragma acc cache [clause1 ... clauseN] Operand 0: OACC_CACHE_CLAUSES: List of clauses. 
*/ DEFTREECODE (OACC_CACHE, oacc_cache, tcc_statement, 1) diff --git gcc/tree.h gcc/tree.h index ba5fc83..c91e716 100644 --- gcc/tree.h +++ gcc/tree.h @@ -1199,9 +1199,6 @@ extern void protected_set_expr_location (tree, location_t); #define OACC_UPDATE_CLAUSES(NODE) \ TREE_OPERAND (OACC_UPDATE_CHECK (NODE), 0) -#define OACC_WAIT_CLAUSES(NODE) \ - TREE_OPERAND (OACC_WAIT_CHECK (NODE), 0) - #define OACC_CACHE_CLAUSES(NODE) \ TREE_OPERAND (OACC_CACHE_CHECK (NODE), 0) Grüße, Thomas
Re: [gofrontend-dev] [PATCH 4/4] Gccgo port to s390[x] -- part II
On Wed, Nov 5, 2014 at 2:05 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote: On Tue, Nov 04, 2014 at 08:16:51PM -0800, Ian Taylor wrote: I committed the change to go-test.exp. Thanks. The other changes are not OK. As described in gcc/testsuite/go.test/test/README.gcc, the files in gcc/testsuite/go.test/test are an exact copy of the master Go testsuite. Any changes must be made to the master Go testsuite first. I understand that, but I'm unsure how to handle a set of patches that all depend on each other but refer to three different repositories. So I posted this patch intentionally in the wrong place, not knowing how to do it in a better way. Changes to the master Go repository must follow the procedure described at http://golang.org/doc/contribute.html. I don't know what's up with the complex number change. In general the Go compiler and libraries go to some effort to produce the same answers on all platforms. We need to understand why we get different answers on s390 (you may understand the differences, but I don't). I won't change the tests without a clear understanding of why we are changing them. It's actually not a Go specific problem, the same deviation occurs in C code too. The cause is that constant folding is done with a higher precision and may yield a different result than the run time calculations. There is a GCC bug report for that issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60181 So far it doesn't sound appropriate to change the Go testsuite for this. If the immediate goal is simply to get the s390 tests to pass, let's change go-test.exp to xfail the test unless and until somebody figures out the whole issue. Ian
[jit] Use ISALPHA and ISALNUM rather than writing our own
On Tue, 2014-11-04 at 14:39 -0700, Jeff Law wrote: On 11/04/14 09:57, David Malcolm wrote: +#define IS_ASCII_DIGIT(CHAR) \ + ((CHAR) = '0' (CHAR) ='9') + +#define IS_ASCII_ALNUM(CHAR) \ + (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR)) Can't we rely on the C library to give us equivalents? I've been burned in the past by the C library using locales, in particular the two lowercase i variants in Turkish. These macros are used by gcc_jit_context_new_function to enforce C's naming restrictions, to avoid errors from the assembler. The comment I put there was: /* The assembler can only handle certain names, so for now, enforce C's rules for identifiers upon the name. Eventually we'll need some way to interact with e.g. C++ name mangling. */ Am I right in thinking that for the assembler we need to enforce the C naming rules specifically on *ASCII*. (clearly another comment is needed here). I guess you've got to do it somewhere. Presumably there isn't something already in GCC that enforces an input character set? I guess I just dislike seeing something that feels like it ought to already be available. It turns out that locale-independent tests for this did already exist in libiberty, in safe-ctype.h, so I've committed this to the jit branch: gcc/jit/ChangeLog.jit: * libgccjit.c: Include safe-ctype.h from libiberty. (IS_ASCII_ALPHA): Delete. (IS_ASCII_DIGIT): Delete. (IS_ASCII_ALNUM): Delete. (gcc_jit_context_new_function): Replace use of IS_ASCII_ALPHA and IS_ASCII_ALNUM with ISALPHA and ISALNUM respectively, from libiberty. --- gcc/jit/ChangeLog.jit | 10 ++ gcc/jit/libgccjit.c | 24 +++- 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit index 90fccdb..3d6361c 100644 --- a/gcc/jit/ChangeLog.jit +++ b/gcc/jit/ChangeLog.jit @@ -1,3 +1,13 @@ +2014-11-05 David Malcolm dmalc...@redhat.com + + * libgccjit.c: Include safe-ctype.h from libiberty. + (IS_ASCII_ALPHA): Delete. + (IS_ASCII_DIGIT): Delete. 
+ (IS_ASCII_ALNUM): Delete. + (gcc_jit_context_new_function): Replace use of IS_ASCII_ALPHA and + IS_ASCII_ALNUM with ISALPHA and ISALNUM respectively, from + libiberty. + 2014-10-30 David Malcolm dmalc...@redhat.com * dummy-frontend.c (jit_langhook_init): Remove some dead code. diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c index 286a85e..d9f259e 100644 --- a/gcc/jit/libgccjit.c +++ b/gcc/jit/libgccjit.c @@ -22,24 +22,12 @@ along with GCC; see the file COPYING3. If not see #include system.h #include coretypes.h #include opts.h +#include safe-ctype.h #include libgccjit.h #include jit-common.h #include jit-recording.h -#define IS_ASCII_ALPHA(CHAR) \ - (\ -((CHAR) = 'a' (CHAR) ='z')\ -|| \ -((CHAR) = 'A' (CHAR) = 'Z') \ - ) - -#define IS_ASCII_DIGIT(CHAR) \ - ((CHAR) = '0' (CHAR) ='9') - -#define IS_ASCII_ALNUM(CHAR) \ - (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR)) - struct gcc_jit_context : public gcc::jit::recording::context { gcc_jit_context (gcc_jit_context *parent_ctxt) : @@ -589,13 +577,15 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt, RETURN_NULL_IF_FAIL (return_type, ctxt, loc, NULL return_type); RETURN_NULL_IF_FAIL (name, ctxt, loc, NULL name); /* The assembler can only handle certain names, so for now, enforce - C's rules for identiers upon the name. - Eventually we'll need some way to interact with e.g. C++ name mangling. */ + C's rules for identiers upon the name, using ISALPHA and ISALNUM + from safe-ctype.h to ignore the current locale. + Eventually we'll need some way to interact with e.g. C++ name + mangling. 
*/ { /* Leading char: */ char ch = *name; RETURN_NULL_IF_FAIL_PRINTF2 ( - IS_ASCII_ALPHA (ch) || ch == '_', + ISALPHA (ch) || ch == '_', ctxt, loc, "name \"%s\" contains invalid character: '%c'", name, ch); @@ -603,7 +593,7 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt, for (const char *ptr = name + 1; (ch = *ptr); ptr++) { RETURN_NULL_IF_FAIL_PRINTF2 ( - IS_ASCII_ALNUM (ch) || ch == '_', + ISALNUM (ch) || ch == '_', ctxt, loc, "name \"%s\" contains invalid character: '%c'", name, ch); -- 1.7.11.7
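For readers without the libiberty headers at hand, the check that the patch performs can be sketched in standalone C. This is not the libgccjit code: it spells out the ASCII ranges by hand where the patch uses ISALPHA and ISALNUM from safe-ctype.h, and `is_valid_c_identifier` and its helpers are hypothetical names. Like the deleted macros, it assumes an ASCII-compatible execution character set.

```c
#include <assert.h>
#include <stddef.h>

/* Locale-independent classification: only plain ASCII letters count.  */
static int is_ascii_alpha (char ch)
{
  return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

static int is_ascii_digit (char ch)
{
  return ch >= '0' && ch <= '9';
}

/* Returns 1 if NAME follows C's rules for identifiers restricted to
   ASCII (leading letter or underscore, then letters, digits or
   underscores), 0 otherwise.  */
int is_valid_c_identifier (const char *name)
{
  assert (name != NULL);

  /* Leading char: this also rejects the empty string.  */
  if (!(is_ascii_alpha (*name) || *name == '_'))
    return 0;

  /* Subsequent chars.  */
  for (const char *ptr = name + 1; *ptr; ptr++)
    if (!(is_ascii_alpha (*ptr) || is_ascii_digit (*ptr) || *ptr == '_'))
      return 0;

  return 1;
}
```

Unlike `isalpha` from the C library, nothing here consults the current locale, which is the property the thread was after.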
[gomp4] OpenACC Fortran testsuite: Expect some things to work by now.
Hi! Committed to gomp-4_0-branch in r217137: commit 83c3ae92fb16c23a782f012a49dd7aa1fcd01287 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 15:54:45 2014 + OpenACC Fortran testsuite: Expect some things to work by now. gcc/testsuite/ * gfortran.dg/goacc/data-tree.f95: Remove dg-prune-output directive. * gfortran.dg/goacc/kernels-tree.f95: Likewise. * gfortran.dg/goacc/loop-tree-1.f90: Likewise. * gfortran.dg/goacc/parallel-tree.f95: Likewise. * gfortran.dg/goacc/private-1.f95: Likewise. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217137 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/testsuite/ChangeLog.gomp | 8 gcc/testsuite/gfortran.dg/goacc/data-tree.f95 | 1 - gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 | 1 - gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90 | 1 - gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 | 1 - gcc/testsuite/gfortran.dg/goacc/private-1.f95 | 1 - 6 files changed, 8 insertions(+), 5 deletions(-) diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp index 4dd6b83..25be821 100644 --- gcc/testsuite/ChangeLog.gomp +++ gcc/testsuite/ChangeLog.gomp @@ -1,3 +1,11 @@ +2014-11-05 Thomas Schwinge tho...@codesourcery.com + + * gfortran.dg/goacc/data-tree.f95: Remove dg-prune-output directive. + * gfortran.dg/goacc/kernels-tree.f95: Likewise. + * gfortran.dg/goacc/loop-tree-1.f90: Likewise. + * gfortran.dg/goacc/parallel-tree.f95: Likewise. + * gfortran.dg/goacc/private-1.f95: Likewise. + 2014-11-04 Cesar Philippidis ce...@codesourcery.com * gfortran.dg/goacc/routine-1.f90: New test. diff --git gcc/testsuite/gfortran.dg/goacc/data-tree.f95 gcc/testsuite/gfortran.dg/goacc/data-tree.f95 index a5c012a..32c50fd 100644 --- gcc/testsuite/gfortran.dg/goacc/data-tree.f95 +++ gcc/testsuite/gfortran.dg/goacc/data-tree.f95 @@ -12,7 +12,6 @@ program test !$acc end data end program test -! { dg-prune-output unimplemented } ! { dg-final { scan-tree-dump-times pragma acc data 1 original } } ! 
{ dg-final { scan-tree-dump-times if 1 original } } diff --git gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 index 73f172c..7585a16 100644 --- gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 +++ gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 @@ -12,7 +12,6 @@ program test !$acc end kernels end program test -! { dg-prune-output unimplemented } ! { dg-final { scan-tree-dump-times pragma acc kernels 1 original } } ! { dg-final { scan-tree-dump-times if 1 original } } diff --git gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90 gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90 index 14779b6..47ff77e 100644 --- gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90 +++ gcc/testsuite/gfortran.dg/goacc/loop-tree-1.f90 @@ -35,7 +35,6 @@ program test !$acc end parallel end program test -! { dg-prune-output sorry } ! { dg-final { scan-tree-dump-times pragma acc loop 5 original } } ! { dg-final { scan-tree-dump-times collapse\\(2\\) 1 original } } diff --git gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 index f004702..48061b1 100644 --- gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 +++ gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 @@ -15,7 +15,6 @@ program test !$acc end parallel end program test -! { dg-prune-output unimplemented } ! { dg-final { scan-tree-dump-times pragma acc parallel 1 original } } ! { dg-final { scan-tree-dump-times if 1 original } } diff --git gcc/testsuite/gfortran.dg/goacc/private-1.f95 gcc/testsuite/gfortran.dg/goacc/private-1.f95 index 5aeee3b..54c027d 100644 --- gcc/testsuite/gfortran.dg/goacc/private-1.f95 +++ gcc/testsuite/gfortran.dg/goacc/private-1.f95 @@ -31,7 +31,6 @@ program test end do !$acc end parallel end program test -! { dg-prune-output unimplemented } ! { dg-final { scan-tree-dump-times pragma acc parallel 3 omplower } } ! { dg-final { scan-tree-dump-times private\\(i\\) 3 omplower } } ! 
{ dg-final { scan-tree-dump-times private\\(j\\) 2 omplower } } Grüße, Thomas
Re: [PATCH AVX512] Fix dg.torture tests with avx512
On 03 Nov 11:21, Jakub Jelinek wrote: On Fri, Oct 31, 2014 at 11:17:07AM +0100, Uros Bizjak wrote: I'd like to ask Jakub for a review of the above two parts, other parts are OK with a rename (as mentioned above). Looks ok to me. Where the ICEs discovered just by normal make check or only with GCC_TEST_RUN_EXPENSIVE ? If the latter, can you promote one of the permutations that caused the ICEs to normal tests? If not and GCC_TEST_RUN_EXPENSIVE has not been tested, can you try that? This was discovered without GCC_TEST_RUN_EXPENSIVE, but I've tested it with it enabled, and didn't see any fails. I've committed version below. --- gcc/config/i386/i386.c | 59 -- gcc/config/i386/sse.md | 54 ++--- 2 files changed, 98 insertions(+), 15 deletions(-) diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index c528599..aaffe9d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -45943,6 +45943,42 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) { if (!TARGET_AVX512BW) return false; + + /* If vpermq didn't work, vpshufb won't work either. */ + if (d-vmode == V8DFmode || d-vmode == V8DImode) + return false; + + vmode = V64QImode; + if (d-vmode == V16SImode + || d-vmode == V32HImode + || d-vmode == V64QImode) + { + /* First see if vpermq can be used for +V16SImode/V32HImode/V64QImode. */ + if (valid_perm_using_mode_p (V8DImode, d)) + { + for (i = 0; i 8; i++) + perm[i] = (d-perm[i * nelt / 8] * 8 / nelt) 7; + if (d-testing_p) + return true; + target = gen_reg_rtx (V8DImode); + if (expand_vselect (target, gen_lowpart (V8DImode, d-op0), + perm, 8, false)) + { + emit_move_insn (d-target, + gen_lowpart (d-vmode, target)); + return true; + } + return false; + } + + /* Next see if vpermd can be used. */ + if (valid_perm_using_mode_p (V16SImode, d)) + vmode = V16SImode; + } + /* Or if vpermps can be used. 
*/ + else if (d-vmode == V16SFmode) + vmode = V16SImode; if (vmode == V64QImode) { /* vpshufb only works intra lanes, it is not @@ -45962,6 +45998,9 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) if (vmode == V8SImode) for (i = 0; i 8; ++i) rperm[i] = GEN_INT ((d-perm[i * nelt / 8] * 8 / nelt) 7); + else if (vmode == V16SImode) +for (i = 0; i 16; ++i) + rperm[i] = GEN_INT ((d-perm[i * nelt / 16] * 16 / nelt) 15); else { eltsz = GET_MODE_SIZE (GET_MODE_INNER (d-vmode)); @@ -46000,8 +46039,14 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d) emit_insn (gen_avx512bw_pshufbv64qi3 (target, op0, vperm)); else if (vmode == V8SFmode) emit_insn (gen_avx2_permvarv8sf (target, op0, vperm)); - else + else if (vmode == V8SImode) emit_insn (gen_avx2_permvarv8si (target, op0, vperm)); + else if (vmode == V16SFmode) + emit_insn (gen_avx512f_permvarv16sf (target, op0, vperm)); + else if (vmode == V16SImode) + emit_insn (gen_avx512f_permvarv16si (target, op0, vperm)); + else + gcc_unreachable (); } else { @@ -46055,21 +46100,21 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d) { case V64QImode: if (TARGET_AVX512BW) - gen = gen_avx512bw_vec_dupv64qi; + gen = gen_avx512bw_vec_dupv64qi_1; break; case V32QImode: gen = gen_avx2_pbroadcastv32qi_1; break; case V32HImode: if (TARGET_AVX512BW) - gen = gen_avx512bw_vec_dupv32hi; + gen = gen_avx512bw_vec_dupv32hi_1; break; case V16HImode: gen = gen_avx2_pbroadcastv16hi_1; break; case V16SImode: if (TARGET_AVX512F) - gen = gen_avx512f_vec_dupv16si; + gen = gen_avx512f_vec_dupv16si_1; break; case V8SImode: gen = gen_avx2_pbroadcastv8si_1; @@ -46082,18 +46127,18 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d) break; case V16SFmode: if (TARGET_AVX512F) - gen = gen_avx512f_vec_dupv16sf; + gen = gen_avx512f_vec_dupv16sf_1; break; case V8SFmode: gen = gen_avx2_vec_dupv8sf_1; break; case V8DFmode: if (TARGET_AVX512F) -
Re: [gomp4] OpenACC / C++
Hi!

On Wed, 15 Oct 2014 11:21:05 -0500, James Norris jnor...@codesourcery.com wrote:
> This patch adds OpenACC support to C++ in the gomp4 branch.

I found a few missing pieces; applied to gomp-4_0-branch in r217139:

commit 09b8ef34550c377610c7a01aa2057fc8297e1b0a
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Nov 5 15:58:09 2014 +

    OpenACC C++: Add a few missing pieces.

    gcc/cp/
    * parser.c (cp_parser_omp_clause_name): Also look for "pcopy",
    "pcopyin", "pcopyout", "pcreate".  Look for "wait" instead of "WAIT".

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217139 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/cp/ChangeLog.gomp | 6 ++
 gcc/cp/parser.c | 14 +-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git gcc/cp/ChangeLog.gomp gcc/cp/ChangeLog.gomp
index 89f3f22..024e6a5 100644
--- gcc/cp/ChangeLog.gomp
+++ gcc/cp/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2014-11-05  Thomas Schwinge  tho...@codesourcery.com
+
+        * parser.c (cp_parser_omp_clause_name): Also look for "pcopy",
+        "pcopyin", "pcopyout", "pcreate".  Look for "wait" instead of
+        "WAIT".
+
 2014-11-03  Cesar Philippidis  ce...@codesourcery.com
 
         * parser.c (cp_parser_omp_clause_name): Also consider CPP_KEYWORD
diff --git gcc/cp/parser.c gcc/cp/parser.c
index e3afdca..3ef2de7 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -27535,13 +27535,17 @@ cp_parser_omp_clause_name (cp_parser *parser)
             result = PRAGMA_OMP_CLAUSE_PARALLEL;
           else if (!strcmp ("present", p))
             result = PRAGMA_OMP_CLAUSE_PRESENT;
-          else if (!strcmp ("present_or_copy", p))
+          else if (!strcmp ("present_or_copy", p)
+                   || !strcmp ("pcopy", p))
             result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY;
-          else if (!strcmp ("present_or_copyin", p))
+          else if (!strcmp ("present_or_copyin", p)
+                   || !strcmp ("pcopyin", p))
             result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN;
-          else if (!strcmp ("present_or_copyout", p))
+          else if (!strcmp ("present_or_copyout", p)
+                   || !strcmp ("pcopyout", p))
             result = PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYOUT;
-          else if (!strcmp ("present_or_create", p))
+          else if (!strcmp ("present_or_create", p)
+                   || !strcmp ("pcreate", p))
             result = PRAGMA_OMP_CLAUSE_PRESENT_OR_CREATE;
           else if (!strcmp ("private", p))
             result = PRAGMA_OMP_CLAUSE_PRIVATE;
@@ -27587,7 +27591,7 @@ cp_parser_omp_clause_name (cp_parser *parser)
             result = PRAGMA_CILK_CLAUSE_VECTORLENGTH;
           break;
         case 'w':
-          if (!strcmp ("WAIT", p))
+          if (!strcmp ("wait", p))
             result = PRAGMA_OMP_CLAUSE_WAIT;
           break;
         }

Grüße,
 Thomas
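The effect of the parser.c change can be pictured as a spelling table in which the long and short clause names map to the same clause kind, and in which "wait" is matched in lower case. This is only a sketch: the enum values and the lookup_clause function are hypothetical stand-ins for the PRAGMA_OMP_CLAUSE_* constants and for cp_parser_omp_clause_name, which in GCC dispatches on the first character rather than scanning a table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the PRAGMA_OMP_CLAUSE_* values.  */
enum clause_kind
{
  CLAUSE_NONE,
  CLAUSE_PRESENT_OR_COPY,
  CLAUSE_PRESENT_OR_COPYIN,
  CLAUSE_PRESENT_OR_COPYOUT,
  CLAUSE_PRESENT_OR_CREATE,
  CLAUSE_WAIT
};

struct clause_spelling
{
  const char *name;
  enum clause_kind kind;
};

/* Each present_or_* clause has a short alias that maps to the same kind.  */
static const struct clause_spelling spellings[] = {
  { "present_or_copy", CLAUSE_PRESENT_OR_COPY },
  { "pcopy", CLAUSE_PRESENT_OR_COPY },
  { "present_or_copyin", CLAUSE_PRESENT_OR_COPYIN },
  { "pcopyin", CLAUSE_PRESENT_OR_COPYIN },
  { "present_or_copyout", CLAUSE_PRESENT_OR_COPYOUT },
  { "pcopyout", CLAUSE_PRESENT_OR_COPYOUT },
  { "present_or_create", CLAUSE_PRESENT_OR_CREATE },
  { "pcreate", CLAUSE_PRESENT_OR_CREATE },
  { "wait", CLAUSE_WAIT },
};

/* Return the clause kind for the identifier P, or CLAUSE_NONE.  The
   comparison is case-sensitive, so "WAIT" is not recognized.  */
enum clause_kind
lookup_clause (const char *p)
{
  for (size_t i = 0; i < sizeof spellings / sizeof spellings[0]; i++)
    if (!strcmp (spellings[i].name, p))
      return spellings[i].kind;
  return CLAUSE_NONE;
}
```

With this table, lookup_clause ("pcopyin") and lookup_clause ("present_or_copyin") give the same result, while lookup_clause ("WAIT") falls through to CLAUSE_NONE, which mirrors why the old upper-case comparison never matched a clause written in source code.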
[gomp4] libgomp OpenACC testsuite: Remove two obsolete test cases.
Hi!

Applied to gomp-4_0-branch in r217140:

commit 3838e13dedee9217b067cf2ab4b3fb5bb7d5cf68
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Nov 5 16:02:30 2014 +

    libgomp OpenACC testsuite: Remove two obsolete test cases.

    libgomp/
    * testsuite/libgomp.oacc-c/goacc_kernels.c: Remove file.
    * testsuite/libgomp.oacc-c/goacc_parallel.c: Remove file.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217140 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp | 5
 libgomp/testsuite/libgomp.oacc-c/goacc_kernels.c | 28 ---
 libgomp/testsuite/libgomp.oacc-c/goacc_parallel.c | 28 ---
 3 files changed, 5 insertions(+), 56 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index bd8c119..d24ef43 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-11-05  Thomas Schwinge  tho...@codesourcery.com
+
+        * testsuite/libgomp.oacc-c/goacc_kernels.c: Remove file.
+        * testsuite/libgomp.oacc-c/goacc_parallel.c: Remove file.
+
 2014-11-04  Cesar Philippidis  ce...@codesourcery.com
 
         * testsuite/libgomp.oacc-fortran/routine-1.f90: New test.
diff --git libgomp/testsuite/libgomp.oacc-c/goacc_kernels.c libgomp/testsuite/libgomp.oacc-c/goacc_kernels.c
deleted file mode 100644
index 683fefa..000
--- libgomp/testsuite/libgomp.oacc-c/goacc_kernels.c
+++ /dev/null
@@ -1,28 +0,0 @@
-/* { dg-do run } */
-/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_DEVICE_TYPE_host=1" } } */
-
-#include "libgomp_g.h"
-
-extern void abort ();
-
-volatile int i;
-
-void
-f (void *data)
-{
-  if (i != -1)
-    abort ();
-  i = 42;
-}
-
-int main(void)
-{
-  i = -1;
-  GOACC_kernels (0, f, (const void *) 0,
-                 0, (void *) 0, (void *) 0, (void *) 0,
-                 1, 1, 1, -2, -1);
-  if (i != 42)
-    abort ();
-
-  return 0;
-}
diff --git libgomp/testsuite/libgomp.oacc-c/goacc_parallel.c libgomp/testsuite/libgomp.oacc-c/goacc_parallel.c
deleted file mode 100644
index 232ce8a..000
--- libgomp/testsuite/libgomp.oacc-c/goacc_parallel.c
+++ /dev/null
@@ -1,28 +0,0 @@
-/* { dg-do run } */
-/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_DEVICE_TYPE_host=1" } } */
-
-#include "libgomp_g.h"
-
-extern void abort ();
-
-volatile int i;
-
-void
-f (void *data)
-{
-  if (i != -1)
-    abort ();
-  i = 42;
-}
-
-int main(void)
-{
-  i = -1;
-  GOACC_parallel (0, f, (const void *) 0,
-                  0, (void *) 0, (void *) 0, (void *) 0,
-                  1, 1, 1, -2, -1);
-  if (i != 42)
-    abort ();
-
-  return 0;
-}

Grüße,
 Thomas
Re: [PING][PATCH] Don't call fatal_error before error reporting has been initialized.
Ping.

On 20 Oct 19:25, Ilya Tocar wrote:
> Same in collect2.
>
> On 09 Oct 15:40, Ilya Tocar wrote:
> > Ping.
> >
> > On 29 Sep 18:02, Ilya Tocar wrote:
> > > Hi,
> > >
> > > Currently, if the call to atexit (lto_wrapper_cleanup) fails, we won't
> > > report an error, because we haven't initialized the error-reporting
> > > infrastructure yet. This patch moves the call after diagnostic_initialize.
> > > I hope that we can't exit inside diagnostic_initialize; otherwise we
> > > won't clean up after it.
> > >
> > > Ok for trunk?

---
 gcc/collect2.c| 6 +++---
 gcc/lto-wrapper.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/collect2.c b/gcc/collect2.c
index c54e6fb..b0784e8 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -955,9 +955,6 @@ main (int argc, char **argv)
   signal (SIGCHLD, SIG_DFL);
 #endif
 
-  if (atexit (collect_atexit) != 0)
-    fatal_error ("atexit failed");
-
   /* Unlock the stdio streams.  */
   unlock_std_streams ();
 
@@ -965,6 +962,9 @@ main (int argc, char **argv)
 
   diagnostic_initialize (global_dc, 0);
 
+  if (atexit (collect_atexit) != 0)
+    fatal_error ("atexit failed");
+
   /* Do not invoke xcalloc before this point, since locale needs to be
      set first, in case a diagnostic is issued.  */
 
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 8033b15..d97f617 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -879,13 +879,13 @@ main (int argc, char *argv[])
 
   xmalloc_set_program_name (progname);
 
-  if (atexit (lto_wrapper_cleanup) != 0)
-    fatal_error ("atexit failed");
-
   gcc_init_libintl ();
 
   diagnostic_initialize (global_dc, 0);
 
+  if (atexit (lto_wrapper_cleanup) != 0)
+    fatal_error ("atexit failed");
+
   if (signal (SIGINT, SIG_IGN) != SIG_IGN)
     signal (SIGINT, fatal_signal);
 #ifdef SIGHUP
-- 
1.8.3.1
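The ordering problem the patch addresses can be sketched outside of GCC as follows. Everything here is a hypothetical stand-in: init_diagnostics plays the role of diagnostic_initialize, report_fatal plays the role of fatal_error (without exiting, so the behaviour can be observed), and tool_startup is an invented name for the start of main. The only real library call is atexit, which returns nonzero when registration fails.

```c
#include <stdio.h>
#include <stdlib.h>

/* Nonzero once the (toy) error-reporting machinery is usable.  */
static int diagnostics_ready;

/* Stand-in for diagnostic_initialize.  */
static void
init_diagnostics (void)
{
  diagnostics_ready = 1;
}

/* Stand-in for fatal_error.  Returns 1 if the message was emitted and 0
   if it was lost because diagnostics were not initialized yet.  */
int
report_fatal (const char *msg)
{
  if (!diagnostics_ready)
    return 0;
  fprintf (stderr, "fatal error: %s\n", msg);
  return 1;
}

/* Stand-in for the cleanup handler (e.g. removing temporary files).  */
static void
cleanup (void)
{
}

/* Startup in the order the patch establishes: initialize diagnostics
   first, then register the handler, so a failure can be reported.
   Returns 0 on success and -1 if atexit failed.  */
int
tool_startup (void)
{
  init_diagnostics ();
  if (atexit (cleanup) != 0)
    {
      report_fatal ("atexit failed");
      return -1;
    }
  return 0;
}
```

Calling report_fatal before tool_startup returns 0, that is, the message is silently dropped, which corresponds to the situation before the patch; after tool_startup the same call emits the message.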
Re: [debug-early] emit locals early patchset
On Tue, Oct 28, 2014 at 03:57:43PM +0100, Richard Biener wrote:
> On Tue, Oct 28, 2014 at 1:00 AM, Aldy Hernandez al...@redhat.com wrote:
> > Gentlemen!
> >
> > My apologies for the big patch. In getting locals emitted early
> > (parameters and locally scoped variables), I ran into many things which
> > were in need of surgery, many of which couldn't happen without the other.
> > Consequently, I ended up fixing everything such that we are now back to
> > no guality.exp failures for any language.
> >
> > [Curiously, I hadn't noticed locals were not being dumped early because
> > they were being picked up by the late dwarf pass. Fixing this oversight
> > is what caused this entire patch.]
> >
> > There are a lot of changes here, and I would greatly appreciate feedback,
> > so let me at least explain what's going on at a high level...
> >
> > 1. Changes to gen_subprogram_die() to handle early generation of locals,
> > and amending location information on the second pass. Everything else in
> > this patch basically stems from this change.
> >
> > 2. Changes to gen_variable_die() to handle multiple passes (early/late
> > dwarf generation). A lot of this is complicated by the fact that
> > old_die's are cached and keyed by `tree', but an abstract instance and an
> > inline instance share trees, while dwarf2out_abstract_function() sets
> > DECL_ABSTRACT_P behind the scenes. The current support (and my changes)
> > maintain this shared and delicate design. I wonder whether we could
> > simplify a lot of this code by unsharing these trees, but this may be
> > beyond the scope of this work. Richi perhaps you can comment?
>
> I think that the abstract and inline instances are cases that are _only_
> generated early - that is, they don't contain any locations or whatever
> and thus do not need to be amended in the late dwarf pass. So I'd simply
> generate dwarf for them and not remember their DIEs.

Please have a look at PR 63722, which is a complaint I got from people
exploring the possibilities of using dwarf to analyze what gcc did to the
Linux kernel. They discovered that DIEs of abstract origins do not have
their DW_AT_inline set correctly for functions that were IPA-SRAed but not
inlined. On the original testcase before reduction I've seen the same
problem also when using -fno-ipa-sra -fno-ipa-cp, so it is not just my
passes but also probably ipa-split ;-)

In any case, if we want this field set correctly in abstract origins, we
need to have the possibility to modify it after IPA.

Martin
[PATCH] Correctly check dg-require-effective-target in avx512 tests.
Hi,

Currently we only check for dg-require-effective-target avx512vl in
avx512vl tests. We should also check for avx512dq/avx512bw. The patch
below does this. Ok for trunk?

2014-11-05  Ilya Tocar  ilya.to...@intel.com

        * gcc.target/i386/avx512vl-vandnpd-2.c: Fix
        dg-require-effective-target check.
        * gcc.target/i386/avx512vl-vandnps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vandpd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vandps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcastf32x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcastf32x4-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcastf64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcasti32x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcasti32x4-2.c: Ditto.
        * gcc.target/i386/avx512vl-vbroadcasti64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtpd2qq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtpd2uqq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtps2qq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtps2uqq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtqq2pd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtqq2ps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvttpd2qq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvttpd2uqq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvttps2qq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvttps2uqq-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtuqq2pd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vcvtuqq2ps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vdbpsadbw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vextractf64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vextracti64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vfpclasspd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vfpclassps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vinsertf64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vinserti64x2-2.c: Ditto.
        * gcc.target/i386/avx512vl-vmovdqu16-2.c: Ditto.
        * gcc.target/i386/avx512vl-vmovdqu8-2.c: Ditto.
        * gcc.target/i386/avx512vl-vorpd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vorps-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpabsb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpabsw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpackssdw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpacksswb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpackusdw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpackuswb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddsb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddsw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddusb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddusw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpaddw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpalignr-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpavgb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpavgw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpblendmb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpblendmw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpbroadcastb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpbroadcastw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpeqb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpequb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpequw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpeqw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpgtb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpgtub-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpgtuw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpgtw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpub-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpuw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpcmpw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpermi2w-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpermt2w-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpermw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaddubsw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaddwd-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaxsb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaxsw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaxub-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmaxuw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpminsb-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpminsw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpminub-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpminuw-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovb2m-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovd2m-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovm2b-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovm2d-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovm2q-2.c: Ditto.
        * gcc.target/i386/avx512vl-vpmovm2w-2.c: Ditto.
        *
[gomp4] OpenACC documentation updates.
Hi! Applied to gomp-4_0-branch in r217142: commit 0c5178ff5207bf1ede83070629c7d76fbbdf1afb Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 16:12:51 2014 + OpenACC documentation updates. gcc/ * invoke.texi: Update for OpenACC. * sourcebuild.texi: Likewise. gcc/fortran/ * gfortran.texi: Update for OpenACC. * intrinsic.texi: Likewise. * invoke.texi: Likewise. libgomp/ * libgomp.texi: Update for OpenACC. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217142 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 3 +++ gcc/doc/invoke.texi| 4 ++-- gcc/doc/sourcebuild.texi | 2 +- gcc/fortran/ChangeLog.gomp | 6 ++ gcc/fortran/gfortran.texi | 38 ++ gcc/fortran/intrinsic.texi | 31 ++- gcc/fortran/invoke.texi| 7 ++- libgomp/ChangeLog.gomp | 2 ++ libgomp/libgomp.texi | 10 ++ 9 files changed, 90 insertions(+), 13 deletions(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index 5b2bade..fc624c8 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,5 +1,8 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * invoke.texi: Update for OpenACC. + * sourcebuild.texi: Likewise. + * tree.def (OACC_WAIT): Remove. Update all users. * omp-builtins.def (BUILT_IN_OMP_SET_NUM_THREADS): Remove. diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi index 4cd4f4a..0fe875b 100644 --- gcc/doc/invoke.texi +++ gcc/doc/invoke.texi @@ -1872,8 +1872,8 @@ freestanding and hosted environments. @item -fopenacc @opindex fopenacc @cindex OpenACC accelerator programming -Enable handling of OpenACC directives @code{#pragma acc} in C. -When @option{-fopenacc} is specified, the +Enable handling of OpenACC directives @code{#pragma acc} in C/C++ and +@code{!$acc} in Fortran. When @option{-fopenacc} is specified, the compiler generates accelerated code according to the OpenACC Application Programming Interface v2.0 @w{@uref{http://www.openacc.org/}}. 
This option implies @option{-pthread}, and thus is only supported on targets that diff --git gcc/doc/sourcebuild.texi gcc/doc/sourcebuild.texi index 5d1625d..d27fac0 100644 --- gcc/doc/sourcebuild.texi +++ gcc/doc/sourcebuild.texi @@ -89,7 +89,7 @@ The Go runtime library. The bulk of this library is mirrored from the @uref{http://code.google.com/@/p/@/go/, master Go repository}. @item libgomp -The GNU OpenMP runtime library. +The GNU OpenACC and OpenMP runtime library. @item libiberty The @code{libiberty} library, used for portability and for some diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp index 5f2e9ba..98e3971 100644 --- gcc/fortran/ChangeLog.gomp +++ gcc/fortran/ChangeLog.gomp @@ -1,3 +1,9 @@ +2014-11-05 Thomas Schwinge tho...@codesourcery.com + + * gfortran.texi: Update for OpenACC. + * intrinsic.texi: Likewise. + * invoke.texi: Likewise. + 2014-11-04 Cesar Philippidis ce...@codesourcery.com * gfortran.h (ST_OACC_ROUTINE): New statement enum. diff --git gcc/fortran/gfortran.texi gcc/fortran/gfortran.texi index 41d6559..c3e7518 100644 --- gcc/fortran/gfortran.texi +++ gcc/fortran/gfortran.texi @@ -474,7 +474,8 @@ The GNU Fortran compiler is able to compile nearly all standard-compliant Fortran 95, Fortran 90, and Fortran 77 programs, including a number of standard and non-standard extensions, and can be used on real-world programs. In particular, the supported extensions -include OpenMP, Cray-style pointers, and several Fortran 2003 and Fortran +include OpenACC, OpenMP, Cray-style pointers, and several Fortran 2003 +and Fortran 2008 features, including TR 15581. However, it is still under development and has a few remaining rough edges. @@ -531,7 +532,8 @@ The current status of the support is can be found in the @ref{Fortran 2003 status}, @ref{Fortran 2008 status} and @ref{TS 29113 status} sections of the documentation. 
-Additionally, the GNU Fortran compilers supports the OpenMP specification +Additionally, the GNU Fortran compilers supports the OpenACC specification +(version 2.0, @url{http://www.openacc.org/}), and OpenMP specification (version 4.0, @url{http://openmp.org/@/wp/@/openmp-specifications/}). @node Varying Length Character Strings @@ -963,7 +965,8 @@ module. @cindex statement, @code{ISO_FORTRAN_ENV} @code{USE} statement with @code{INTRINSIC} and @code{NON_INTRINSIC} attribute; supported intrinsic modules: @code{ISO_FORTRAN_ENV}, -@code{ISO_C_BINDING}, @code{OMP_LIB} and @code{OMP_LIB_KINDS}. +@code{ISO_C_BINDING}, @code{OMP_LIB} and @code{OMP_LIB_KINDS}, +and @code{OPENACC}. @item Renaming of operators in the @code{USE} statement. @@ -1358,6 +1361,7 @@ without warning. * Hollerith constants support:: * Cray pointers:: * CONVERT specifier:: +* OpenACC:: * OpenMP::
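As an illustration of what the updated -fopenacc description refers to, here is a small C function annotated with an OpenACC directive. It is a sketch only: sum_array is a hypothetical name, and it assumes a compiler whose -fopenacc support covers the combined parallel loop construct with the reduction and copyin clauses, which this message does not confirm for the branch.

```c
/* Sum the first N elements of A.  With -fopenacc the loop may be
   offloaded to an accelerator; without it the directive is ignored and
   the loop runs serially on the host.  */
float
sum_array (const float *a, int n)
{
  float sum = 0.0f;

#pragma acc parallel loop reduction(+:sum) copyin(a[0:n])
  for (int i = 0; i < n; i++)
    sum += a[i];

  return sum;
}
```

The Fortran counterpart mentioned in the documentation would use the !$acc sentinel in place of #pragma acc.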
[gomp4] libgomp testsuite: OpenACC C++ testing (was: [2/3] OpenACC 2.0 support for libgomp - new tests)
Hi! Applied to gomp-4_0-branch in r217143: commit a78a06124f4047ec46a85e539e83640cc973aec1 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 16:16:14 2014 + libgomp testsuite: OpenACC C++ testing. libgomp/ * testsuite/libgomp.oacc-c++/c++.exp: Enable libgomp.oacc-c-c++-common testing. * testsuite/libgomp.oacc-c/c.exp: Likewise. * testsuite/libgomp.oacc-c/abort-2.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/abort-2.c: ... this. * testsuite/libgomp.oacc-c/abort.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/abort.c: ... this. * testsuite/libgomp.oacc-c/acc_on_device-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... this. * testsuite/libgomp.oacc-c/clauses-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/clauses-1.c: ... this. * testsuite/libgomp.oacc-c/clauses-2.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/clauses-2.c: ... this. * testsuite/libgomp.oacc-c/context-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/context-1.c: ... this. * testsuite/libgomp.oacc-c/context-2.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/context-2.c: ... this. * testsuite/libgomp.oacc-c/context-3.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/context-3.c: ... this. * testsuite/libgomp.oacc-c/context-4.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/context-4.c: ... this. * testsuite/libgomp.oacc-c/data-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/data-1.c: ... this. * testsuite/libgomp.oacc-c/data-2.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/data-2.c: ... this. * testsuite/libgomp.oacc-c/data-3.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/data-3.c: ... this. * testsuite/libgomp.oacc-c/deviceptr-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c: ... this. * testsuite/libgomp.oacc-c/if-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/if-1.c: ... this. * testsuite/libgomp.oacc-c/kernels-1.c: Rename to... 
* testsuite/libgomp.oacc-c-c++-common/kernels-1.c: ... this. * testsuite/libgomp.oacc-c/lib-1.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-1.c: ... this. * testsuite/libgomp.oacc-c/lib-10.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-10.c: ... this. * testsuite/libgomp.oacc-c/lib-11.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-11.c: ... this. * testsuite/libgomp.oacc-c/lib-12.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-12.c: ... this. * testsuite/libgomp.oacc-c/lib-13.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-13.c: ... this. * testsuite/libgomp.oacc-c/lib-14.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-14.c: ... this. * testsuite/libgomp.oacc-c/lib-15.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-15.c: ... this. * testsuite/libgomp.oacc-c/lib-16.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-16.c: ... this. * testsuite/libgomp.oacc-c/lib-17.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-17.c: ... this. * testsuite/libgomp.oacc-c/lib-18.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-18.c: ... this. * testsuite/libgomp.oacc-c/lib-19.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-19.c: ... this. * testsuite/libgomp.oacc-c/lib-2.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-2.c: ... this. * testsuite/libgomp.oacc-c/lib-20.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-20.c: ... this. * testsuite/libgomp.oacc-c/lib-21.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-21.c: ... this. * testsuite/libgomp.oacc-c/lib-22.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-22.c: ... this. * testsuite/libgomp.oacc-c/lib-23.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-23.c: ... this. * testsuite/libgomp.oacc-c/lib-24.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-24.c: ... this. * testsuite/libgomp.oacc-c/lib-25.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-25.c: ... 
this. * testsuite/libgomp.oacc-c/lib-26.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-26.c: ... this. * testsuite/libgomp.oacc-c/lib-27.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-27.c: ... this. * testsuite/libgomp.oacc-c/lib-28.c: Rename to... * testsuite/libgomp.oacc-c-c++-common/lib-28.c: ... this. * testsuite/libgomp.oacc-c/lib-29.c: Rename to...
Re: [PATCH] Correctly check dg-require-effective-target in avx512 tests.
On Wed, Nov 5, 2014 at 5:14 PM, Ilya Tocar tocarip.in...@gmail.com wrote: Hi, Currently we only check for dg-require-effective-target avx512vl in avx512vl tests. We should also check for avx512dq/avx512bw. The patch below does this. Ok for trunk? 2014-11-05 Ilya Tocar ilya.to...@intel.com * gcc.target/i386/avx512vl-vandnpd-2.c: Fix dg-require-effective-target check. * gcc.target/i386/avx512vl-vandnps-2.c: Ditto. * gcc.target/i386/avx512vl-vandpd-2.c: Ditto. * gcc.target/i386/avx512vl-vandps-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcastf32x2-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcastf32x4-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcastf64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcasti32x2-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcasti32x4-2.c: Ditto. * gcc.target/i386/avx512vl-vbroadcasti64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtpd2qq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtpd2uqq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtps2qq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtps2uqq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtqq2pd-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtqq2ps-2.c: Ditto. * gcc.target/i386/avx512vl-vcvttpd2qq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvttpd2uqq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvttps2qq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvttps2uqq-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtuqq2pd-2.c: Ditto. * gcc.target/i386/avx512vl-vcvtuqq2ps-2.c: Ditto. * gcc.target/i386/avx512vl-vdbpsadbw-2.c: Ditto. * gcc.target/i386/avx512vl-vextractf64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vextracti64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vfpclasspd-2.c: Ditto. * gcc.target/i386/avx512vl-vfpclassps-2.c: Ditto. * gcc.target/i386/avx512vl-vinsertf64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vinserti64x2-2.c: Ditto. * gcc.target/i386/avx512vl-vmovdqu16-2.c: Ditto. * gcc.target/i386/avx512vl-vmovdqu8-2.c: Ditto. * gcc.target/i386/avx512vl-vorpd-2.c: Ditto. * gcc.target/i386/avx512vl-vorps-2.c: Ditto. 
* gcc.target/i386/avx512vl-vpabsb-2.c: Ditto. * gcc.target/i386/avx512vl-vpabsw-2.c: Ditto. * gcc.target/i386/avx512vl-vpackssdw-2.c: Ditto. * gcc.target/i386/avx512vl-vpacksswb-2.c: Ditto. * gcc.target/i386/avx512vl-vpackusdw-2.c: Ditto. * gcc.target/i386/avx512vl-vpackuswb-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddb-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddsb-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddsw-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddusb-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddusw-2.c: Ditto. * gcc.target/i386/avx512vl-vpaddw-2.c: Ditto. * gcc.target/i386/avx512vl-vpalignr-2.c: Ditto. * gcc.target/i386/avx512vl-vpavgb-2.c: Ditto. * gcc.target/i386/avx512vl-vpavgw-2.c: Ditto. * gcc.target/i386/avx512vl-vpblendmb-2.c: Ditto. * gcc.target/i386/avx512vl-vpblendmw-2.c: Ditto. * gcc.target/i386/avx512vl-vpbroadcastb-2.c: Ditto. * gcc.target/i386/avx512vl-vpbroadcastw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpb-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpeqb-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpequb-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpequw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpeqw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtb-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtub-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtuw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpub-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpuw-2.c: Ditto. * gcc.target/i386/avx512vl-vpcmpw-2.c: Ditto. * gcc.target/i386/avx512vl-vpermi2w-2.c: Ditto. * gcc.target/i386/avx512vl-vpermt2w-2.c: Ditto. * gcc.target/i386/avx512vl-vpermw-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaddubsw-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaddwd-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaxsb-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaxsw-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaxub-2.c: Ditto. * gcc.target/i386/avx512vl-vpmaxuw-2.c: Ditto. * gcc.target/i386/avx512vl-vpminsb-2.c: Ditto. 
* gcc.target/i386/avx512vl-vpminsw-2.c: Ditto. * gcc.target/i386/avx512vl-vpminub-2.c: Ditto. * gcc.target/i386/avx512vl-vpminuw-2.c: Ditto. * gcc.target/i386/avx512vl-vpmovb2m-2.c: Ditto. * gcc.target/i386/avx512vl-vpmovd2m-2.c: Ditto. * gcc.target/i386/avx512vl-vpmovm2b-2.c: Ditto. *
Enable -fextended-identifiers by default
As proposed at https://gcc.gnu.org/ml/gcc/2014-11/msg00014.html, this patch enables -fextended-identifiers by default for all standard versions including this feature (all C++ versions, C99 and above for C, but not C90 / C94 / gnu89 / preprocessing assembler). It adds a couple of tests for areas where I previously noted testsuite coverage for extended identifiers was lacking, removes -fextended-identifiers from existing tests, adds -g to various such tests to verify that extended identifiers don't break debug info generation and removes the test that was only there to verify that the feature was off by default. The current state of the feature may not correspond exactly to any particular checklist from 2004/5 (see bug 9449) of what was wanted before enabling the feature by default, but I don't think it's any worse than plenty of other features supported by default before every corner case is fully functional, and think problems can readily be fixed incrementally. The following aspects of extended identifiers could still do with more work (and should be straightforward): * C -aux-info (output should use UCNs). * ObjC -gen-decls (output should use UCNs; associated diagnostics from the ObjC front end should use extended characters or UCNs as appropriate to the locale, via using %qE or identifier_to_locale). * Use DW_AT_use_UTF8 in DWARF-3 debug info for compilation units built with extended identifiers enabled (or unconditionally). * cpplib diagnostics (outputting characters or UCNs as appropriate depending on the locale, as done for identifiers in non-cpplib diagnostics). * C++ test for UCN linking with C and extern C. * Check GDB support / file issues for support if needed. * Actual UTF-8 in identifiers (?). (Be careful about not affecting performance for the normal fast path of lexing identifiers, if possible.) 
The following may be trickier: * cpplib spelling preservation (required to diagnose macro redefinition with different spellings of the same identifier in the definition or argument names; different spellings of the name of the macro itself are OK, however; also required for correct handling of multiple stringizing in C++); correct output for -d (UCNs), DWARF debug info for macros (UCNs), PCH and PCH tests. (Spelling preservation is the issue that needs fixing to remove references to corner cases in the documentation of -std=c99 and -std=c11 and in c99status.html.) The idea would be to add a second pointer to cpp_identifier that stores the original spelling (whether for extended identifiers only, or for all identifiers); this does not enlarge cpp_token because the resulting larger cpp_identifier structure is no bigger than cpp_string. * C++ translation of extended characters (including $@` and various control characters) to UCNs in phase 1 (note diagnostics thus needed, but not for C++11, for control characters in strings / character constants as those UCNs invalid); a likely implementation approach is to do translation when identifiers / strings / character constants are lexed, together with errors for stray $@` / control characters in program as not being valid UCNs in identifiers ($ only if not accepted in identifiers); note that this translation should not take place inside raw string literals. Bootstrapped with no regressions on x86_64-unknown-linux-gnu. Applied to mainline. libcpp: 2014-11-05 Joseph Myers jos...@codesourcery.com PR preprocessor/9449 * init.c (lang_defaults): Enable extended identifiers for C++ and C99-based standards. gcc: 2014-11-05 Joseph Myers jos...@codesourcery.com PR preprocessor/9449 * doc/cpp.texi (Character sets, Tokenization) (Implementation-defined behavior): Don't refer to UCNs in identifiers requiring -fextended-identifiers. * doc/cppopts.texi (-fextended-identifiers): Document as enabled by default for C99 and later and C++. 
* doc/invoke.texi (-std=c99, -std=c11): Don't refer to extended identifiers needing -fextended-identifiers. gcc/testsuite: 2014-11-05 Joseph Myers jos...@codesourcery.com PR preprocessor/9449 * lib/target-supports.exp (check_effective_target_ucn_nocache): Don't use -fextended-identifiers. * c-c++-common/cpp/normalize-3.c, c-c++-common/cpp/ucnid-2011-1.c, g++.dg/cpp/ucn-1.C, g++.dg/cpp/ucnid-1.C, g++.dg/other/ucnid-1.C, gcc.dg/cpp/normalize-1.c, gcc.dg/cpp/normalize-2.c, gcc.dg/cpp/normalize-4.c: Don't use -fextended-identifiers. * gcc.dg/cpp/ucnid-1.c: Don't use -fextended-identifiers. Use -g3. * gcc.dg/cpp/ucnid-10.c, gcc.dg/cpp/ucnid-2.c, gcc.dg/cpp/ucnid-3.c, gcc.dg/cpp/ucnid-4.c, gcc.dg/cpp/ucnid-5.c, gcc.dg/cpp/ucnid-7.c, gcc.dg/cpp/ucnid-9.c, gcc.dg/cpp/warn-normalized-1.c, gcc.dg/cpp/warn-normalized-2.c, gcc.dg/cpp/warn-normalized-3.c: Don't use
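For readers unfamiliar with the feature being enabled, the fragment below spells an identifier with a universal character name (UCN). It is a minimal sketch, assuming a compiler in a C99-or-later mode that accepts UCNs in identifiers without an extra option, which is the default this patch establishes; the identifier names are invented for illustration.

```c
/* \u00e9 is LATIN SMALL LETTER E WITH ACUTE, one of the characters C99
   permits in identifiers.  Before this change, GCC needed
   -fextended-identifiers to accept the spellings below.  */
static int caf\u00e9_count = 3;

int
get_caf\u00e9_count (void)
{
  return caf\u00e9_count;
}
```

Per the description above, the same source in C90 or gnu89 mode would still need the option, since those standard versions do not include the feature.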
[gomp4] OpenACC cache directive for C.
Hi!

In r217145, I applied Jim's patch to gomp-4_0-branch:

commit 4361f9b6b2c74c2961c3a5290a4945abe2d7a444
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Nov 5 16:26:47 2014 +

    OpenACC cache directive for C.

        gcc/c-family/
        * c-pragma.c (oacc_pragmas): Add cache.
        gcc/c/
        * c-parser.c (c_parser_omp_variable_list): Handle
        OMP_NO_CLAUSE_CACHE.
        (c_parser_oacc_cache): New function.
        (c_parser_omp_construct): Use it for PRAGMA_OACC_CACHE.
        libgomp/
        * testsuite/libgomp.oacc-c/cache-1.c: New file.
        * testsuite/libgomp.oacc-c++/cache-1.C: Likewise.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217145 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/c-family/ChangeLog.gomp                  |  4 ++
 gcc/c-family/c-pragma.c                      |  1 +
 gcc/c/ChangeLog.gomp                         |  7 +++
 gcc/c/c-parser.c                             | 50
 libgomp/ChangeLog.gomp                       |  5 ++
 libgomp/testsuite/libgomp.oacc-c++/cache-1.C | 50
 libgomp/testsuite/libgomp.oacc-c/cache-1.c   | 69
 7 files changed, 186 insertions(+)

diff --git gcc/c-family/ChangeLog.gomp gcc/c-family/ChangeLog.gomp
index 2f8c8a6..5f3b641 100644
--- gcc/c-family/ChangeLog.gomp
+++ gcc/c-family/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2014-11-05  James Norris  jnor...@codesourcery.com
+
+        * c-pragma.c (oacc_pragmas): Add cache.
+
 2014-11-03  Cesar Philippidis  ce...@codesourcery.com
 
         * c-pragma.c (oacc_pragmas): Add entries for PRAGMA_OACC_ENTER_DATA
diff --git gcc/c-family/c-pragma.c gcc/c-family/c-pragma.c
index a28727e..e98b555 100644
--- gcc/c-family/c-pragma.c
+++ gcc/c-family/c-pragma.c
@@ -1181,6 +1181,7 @@ static vec<pragma_ns_name> registered_pp_pragmas;
 struct omp_pragma_def { const char *name; unsigned int id; };
 static const struct omp_pragma_def oacc_pragmas[] = {
+  { "cache", PRAGMA_OACC_CACHE },
   { "data", PRAGMA_OACC_DATA },
   { "enter", PRAGMA_OACC_ENTER_DATA },
   { "exit", PRAGMA_OACC_EXIT_DATA },
diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp
index f4e2010..7acd7b3 100644
--- gcc/c/ChangeLog.gomp
+++ gcc/c/ChangeLog.gomp
@@ -1,3 +1,10 @@
+2014-11-05  James Norris  jnor...@codesourcery.com
+
+        * c-parser.c (c_parser_omp_variable_list): Handle
+        OMP_NO_CLAUSE_CACHE.
+        (c_parser_oacc_cache): New function.
+        (c_parser_omp_construct): Use it for PRAGMA_OACC_CACHE.
+
 2014-11-03  Cesar Philippidis  ce...@codesourcery.com
 
         * c-parser.c (c_parser_oacc_enter_exit_data): New function.
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 6ac1ace..410b19f 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10053,6 +10053,14 @@ c_parser_omp_variable_list (c_parser *parser,
             {
               switch (kind)
                 {
+                case OMP_NO_CLAUSE_CACHE:
+                  if (c_parser_peek_token (parser)->type != CPP_OPEN_SQUARE)
+                    {
+                      c_parser_error (parser, "expected %<[%>");
+                      t = error_mark_node;
+                      break;
+                    }
+                  /* FALL THROUGH.  */
                 case OMP_CLAUSE_MAP:
                 case OMP_CLAUSE_FROM:
                 case OMP_CLAUSE_TO:
@@ -10091,6 +10099,29 @@ c_parser_omp_variable_list (c_parser *parser,
                       t = error_mark_node;
                       break;
                     }
+
+                  if (kind == OMP_NO_CLAUSE_CACHE)
+                    {
+                      mark_exp_read (low_bound);
+                      mark_exp_read (length);
+
+                      if (TREE_CODE (low_bound) != INTEGER_CST
+                          && !TREE_READONLY (low_bound))
+                        {
+                          error_at (clause_loc,
+                                    "%qD is not a constant", low_bound);
+                          t = error_mark_node;
+                        }
+
+                      if (TREE_CODE (length) != INTEGER_CST
+                          && !TREE_READONLY (length))
+                        {
+                          error_at (clause_loc,
+                                    "%qD is not a constant", length);
+                          t = error_mark_node;
+                        }
+                    }
+
                   t = tree_cons (low_bound, length, t);
                 }
               break;
@@ -11864,6 +11895,21 @@ c_parser_omp_structured_block (c_parser *parser)
 }
 
 /* OpenACC 2.0:
+   # pragma acc cache (variable-list) new-line
+
+   LOC is the location of the #pragma token.
+*/
+
+static tree
+c_parser_oacc_cache (location_t loc __attribute__((unused)), c_parser *parser)
+{
+  c_parser_omp_var_list_parens (parser, OMP_NO_CLAUSE_CACHE, NULL);
+  c_parser_skip_to_pragma_eol (parser);
+
+  return NULL_TREE;
+}
+
+/* OpenACC 2.0:
    # pragma acc data oacc-data-clause[optseq] new-line
      structured-block
 
@@ -14506,6 +14552,10
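
For readers who have not used the directive, here is a minimal sketch of how the cache directive handled by this patch might appear in user code, modelled on the forms the test case accepts. The function name copy_with_cache and the array size are made up for illustration. A compiler without OpenACC support simply ignores the pragmas, so the function computes the same result either way.

```c
#include <assert.h>

#define N 8

/* Hypothetical helper, not part of the patch: copy SRC to DST inside an
   OpenACC parallel region, hinting that SRC[0:N] should be cached.
   Returns the sum of the copied elements.  */
static int
copy_with_cache (const int *src, int *dst)
{
  int i, sum = 0;

  assert (src != 0 && dst != 0);

#pragma acc parallel copyin (src[0:N]) copyout (dst[0:N])
  {
    int ii;

    for (ii = 0; ii < N; ii++)
      {
        /* Lower bound and length are integer constants here, which is
           what the new check in c_parser_omp_variable_list asks for.  */
#pragma acc cache (src[0:N])
        dst[ii] = src[ii];
      }
  }

  for (i = 0; i < N; i++)
    sum += dst[i];
  return sum;
}
```

Writing the subarray with a loop variable as the bound, for example src[ii:N], is the kind of use the "is not a constant" diagnostic in the patch is meant to reject.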
[gomp4] OpenACC cache directive maintenance (was: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end)
Hi!

On Fri, 24 Jan 2014 20:33:35 +0100, I wrote:
> On Thu, 23 Jan 2014 22:04:45 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
> > Subject: [PATCH 4/6] OpenACC GENERIC nodes
>
> > --- a/gcc/tree-core.h
> > +++ b/gcc/tree-core.h
> > @@ -216,12 +216,18 @@ enum omp_clause_code {
>
> > +  /* Internal structure to hold OpenACC cache directive's variable-list.
> > +     #pragma acc cache (variable-list).  */
> > +  OACC_NO_CLAUSE_CACHE,
>
> Hmm, yeah, while *_NO_CLAUSE_* perhaps isn't the most beautiful approach,
> I think it's fine at least for now.

In r217146, I applied the following to gomp-4_0-branch:

commit e8e44b733808997d06c0cdf9bf5756ce03530f42
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Nov 5 16:35:30 2014 +

    OpenACC cache directive maintenance.

        gcc/c/
        * c-parser.c (c_parser_oacc_cache): Generate OACC_CACHE.
        * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_.
        gcc/cp/
        * parser.c (cp_parser_oacc_cache): Generate OACC_CACHE.
        * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE__CACHE_.
        gcc/
        * gimplify.c (gimplify_oacc_cache): New function.
        (gimplify_expr): Use it for OACC_CACHE.
        (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle
        OMP_CLAUSE__CACHE_.

        gcc/c/
        * c-parser.c (c_parser_omp_variable_list) OMP_CLAUSE__CACHE_:
        Remove explicit mark_exp_read invocations.
        gcc/cp/
        * parser.c (cp_parser_omp_var_list_no_open) OMP_CLAUSE__CACHE_:
        Remove explicit mark_exp_read invocations.

        gcc/
        * tree-core.h (enum omp_clause_code): Move OMP_NO_CLAUSE_CACHE
        next to, and handle it like a data clause.  Rename it to
        OMP_CLAUSE__CACHE_.  Update all users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217146 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 9 + gcc/c/ChangeLog.gomp | 8 gcc/c/c-parser.c | 23 +++ gcc/c/c-typeck.c | 1 + gcc/cp/ChangeLog.gomp | 6 ++ gcc/cp/parser.c| 24 +++- gcc/cp/semantics.c | 1 + gcc/fortran/trans-openmp.c | 2 +- gcc/gimplify.c | 25 ++--- gcc/omp-low.c | 4 ++-- gcc/tree-core.h| 8 gcc/tree-pretty-print.c| 11 +++ gcc/tree.c | 6 +++--- gcc/tree.def | 5 +++-- gcc/tree.h | 2 +- 15 files changed, 98 insertions(+), 37 deletions(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index fc624c8..2c2b349 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,5 +1,14 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * gimplify.c (gimplify_oacc_cache): New function. + (gimplify_expr): Use it for OACC_CACHE. + (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle + OMP_CLAUSE__CACHE_. + + * tree-core.h (enum omp_clause_code): Move OMP_NO_CLAUSE_CACHE + next to, and handle it like a data clause. Rename it to + OMP_CLAUSE__CACHE_. Update all users. + * invoke.texi: Update for OpenACC. * sourcebuild.texi: Likewise. diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp index 7acd7b3..70278b9 100644 --- gcc/c/ChangeLog.gomp +++ gcc/c/ChangeLog.gomp @@ -1,3 +1,11 @@ +2014-11-05 Thomas Schwinge tho...@codesourcery.com + + * c-parser.c (c_parser_oacc_cache): Generate OACC_CACHE. + * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_. + + * c-parser.c (c_parser_omp_variable_list) OMP_CLAUSE__CACHE_: + Remove explicit mark_exp_read invocations. 
+ 2014-11-05 James Norris jnor...@codesourcery.com * c-parser.c (c_parser_omp_variable_list): Handle diff --git gcc/c/c-parser.c gcc/c/c-parser.c index 410b19f..40d4314 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -10053,7 +10053,7 @@ c_parser_omp_variable_list (c_parser *parser, { switch (kind) { - case OMP_NO_CLAUSE_CACHE: + case OMP_CLAUSE__CACHE_: if (c_parser_peek_token (parser)-type != CPP_OPEN_SQUARE) { c_parser_error (parser, expected %[%); @@ -10100,11 +10100,8 @@ c_parser_omp_variable_list (c_parser *parser, break; } - if (kind == OMP_NO_CLAUSE_CACHE) + if (kind == OMP_CLAUSE__CACHE_) { - mark_exp_read (low_bound); - mark_exp_read (length); - if (TREE_CODE (low_bound) != INTEGER_CST !TREE_READONLY (low_bound)) { @@ -11901,12 +11898,22 @@ c_parser_omp_structured_block (c_parser *parser) */ static tree -c_parser_oacc_cache (location_t loc __attribute__((unused)), c_parser *parser)
[PING 2] Enhance array types debug info. for Ada
Hello, Ping for https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00694.html Thanks in advance! Regards, -- Pierre-Marie de Rodat
Re: [gomp4] OpenACC cache directive maintenance
Hi Cesar!

On Wed, 05 Nov 2014 17:36:46 +0100, I wrote:
> In r217146, I applied the following to gomp-4_0-branch:
>
> commit e8e44b733808997d06c0cdf9bf5756ce03530f42
> Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
> Date:   Wed Nov 5 16:35:30 2014 +
>
>     OpenACC cache directive maintenance.
>
>         gcc/c/
>         * c-parser.c (c_parser_oacc_cache): Generate OACC_CACHE.
>         * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_.
>         gcc/cp/
>         * parser.c (cp_parser_oacc_cache): Generate OACC_CACHE.
>         * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE__CACHE_.
>         gcc/
>         * gimplify.c (gimplify_oacc_cache): New function.
>         (gimplify_expr): Use it for OACC_CACHE.
>         (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle
>         OMP_CLAUSE__CACHE_.
>
> --- gcc/c/c-parser.c
> +++ gcc/c/c-parser.c
>
>  static tree
> -c_parser_oacc_cache (location_t loc __attribute__((unused)), c_parser *parser)
> +c_parser_oacc_cache (location_t loc, c_parser *parser)
>  {
> -  c_parser_omp_var_list_parens (parser, OMP_NO_CLAUSE_CACHE, NULL);
> +  tree stmt, clauses;
> +
> +  clauses = c_parser_omp_var_list_parens (parser, OMP_CLAUSE__CACHE_, NULL);
> +  clauses = c_finish_omp_clauses (clauses);
> +
>    c_parser_skip_to_pragma_eol (parser);
>
> -  return NULL_TREE;
> +  stmt = make_node (OACC_CACHE);
> +  TREE_TYPE (stmt) = void_type_node;
> +  OACC_CACHE_CLAUSES (stmt) = clauses;
> +  SET_EXPR_LOCATION (stmt, loc);
> +  add_stmt (stmt);
> +
> +  return stmt;
>  }
>
> --- gcc/c/c-typeck.c
> +++ gcc/c/c-typeck.c
> @@ -12204,6 +12204,7 @@ c_finish_omp_clauses (tree clauses)
>         case OMP_CLAUSE_MAP:
>         case OMP_CLAUSE_TO:
>         case OMP_CLAUSE_FROM:
> +       case OMP_CLAUSE__CACHE_:
>           t = OMP_CLAUSE_DECL (c);
>           if (TREE_CODE (t) == TREE_LIST)
>             {

[The same for C++.]

I also tried to make this work for Fortran, but did not manage to in the (admittedly small) amount of time I had allocated for it. ;-)  Would you please have a look at this?  It's not urgent.

commit 7cf14ddb307a5b271e098f3a3fdb0452f6036f91 Author: Thomas Schwinge tho...@codesourcery.com Date: Wed Nov 5 09:16:12 2014 +0100 cache: Fortran experimenting. --- gcc/fortran/frontend-passes.c | 3 ++- gcc/fortran/openmp.c | 12 ++-- gcc/fortran/trans-openmp.c| 18 -- gcc/testsuite/gfortran.dg/goacc/cache-1.f95 | 1 - gcc/testsuite/gfortran.dg/goacc/coarray.f95 | 1 - gcc/testsuite/gfortran.dg/goacc/cray.f95 | 2 -- gcc/testsuite/gfortran.dg/goacc/loop-1.f95| 1 - gcc/testsuite/gfortran.dg/goacc/parameter.f95 | 1 - 8 files changed, 16 insertions(+), 23 deletions(-) diff --git gcc/fortran/frontend-passes.c gcc/fortran/frontend-passes.c index 97a9164..729629e 100644 --- gcc/fortran/frontend-passes.c +++ gcc/fortran/frontend-passes.c @@ -2190,7 +2190,8 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t codefn, walk_expr_fn_t exprfn, gfc_omp_namelist *n; static int list_types[] = { OMP_LIST_ALIGNED, OMP_LIST_LINEAR, OMP_LIST_DEPEND, - OMP_LIST_MAP, OMP_LIST_TO, OMP_LIST_FROM }; + OMP_LIST_MAP, OMP_LIST_TO, OMP_LIST_FROM, + OMP_LIST_CACHE }; size_t idx; WALK_SUBEXPR (co-ext.omp_clauses-if_expr); WALK_SUBEXPR (co-ext.omp_clauses-final_expr); diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c index 959798a..167331a 100644 --- gcc/fortran/openmp.c +++ gcc/fortran/openmp.c @@ -3102,6 +3102,7 @@ resolve_omp_clauses (gfc_code *code, locus *where, case OMP_LIST_MAP: case OMP_LIST_TO: case OMP_LIST_FROM: + case OMP_LIST_CACHE: for (; n != NULL; n = n-next) { if (n-expr) @@ -4594,13 +4595,6 @@ resolve_oacc_loop(gfc_code *code) } -static void -resolve_oacc_cache (gfc_code *code) -{ - gfc_error (Sorry, !$ACC cache unimplemented yet at %L, code-loc); -} - - void gfc_resolve_oacc_declare (gfc_namespace *ns) { @@ -4675,6 +4669,7 @@ gfc_resolve_oacc_directive (gfc_code *code, gfc_namespace *ns ATTRIBUTE_UNUSED) case EXEC_OACC_ENTER_DATA: case EXEC_OACC_EXIT_DATA: case EXEC_OACC_WAIT: +case EXEC_OACC_CACHE: resolve_omp_clauses (code, code-loc, code-ext.omp_clauses, NULL, true); break; @@ 
-4683,9 +4678,6 @@ gfc_resolve_oacc_directive (gfc_code *code, gfc_namespace *ns ATTRIBUTE_UNUSED) case EXEC_OACC_LOOP: resolve_oacc_loop (code); break; -case EXEC_OACC_CACHE: - resolve_oacc_cache (code); - break; default: break; } diff --git gcc/fortran/trans-openmp.c gcc/fortran/trans-openmp.c index 7dd4498..e39e903 100644 --- gcc/fortran/trans-openmp.c +++ gcc/fortran/trans-openmp.c @@ -1806,9 +1806,9 @@ gfc_trans_omp_clauses
[PATCH] Fix libbacktrace and libiberty tests fail on sanitized GCC due to wrong link options.
Hi,

When I ran the Asan tests under an Asan-bootstrapped GCC 5.0, I noticed that the tests for libiberty and libbacktrace fail to link against the sanitized libbacktrace.a and libiberty.a because the -static-libasan -fsanitize=address linker flags are missing. This patch adds the necessary flags so that these tests link in the bootstrap-asan case.

I've checked that the regression tests pass with bootstrap disabled, with a normal bootstrap (stage1, stage3) and with an Asan bootstrap (stage1, stage3) on x86_64-unknown-linux-gnu.

Does the patch look sane?

-Maxim

libiberty/ChangeLog:

2014-11-05  Max Ostapenko  m.ostape...@partner.samsung.com

        * testsuite/Makefile.in (LIBCFLAGS): Add LDFLAGS.

ChangeLog:

2014-11-05  Max Ostapenko  m.ostape...@partner.samsung.com

        * Makefile.tpl (EXTRA_HOST_EXPORTS): New variables.
        (EXTRA_BOOTSTRAP_FLAGS): Likewise.
        (check-[+module+]): Add EXTRA_HOST_EXPORTS and EXTRA_BOOTSTRAP_FLAGS.
        * Makefile.in: Regenerate.

diff --git a/Makefile.in b/Makefile.in
index 4564dbe..62c8301 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -832,6 +832,14 @@ POSTSTAGE1_FLAGS_TO_PASS = \
        $(LTO_FLAGS_TO_PASS) \
        `echo 'ADAFLAGS=$(BOOT_ADAFLAGS)' | sed -e s'/[^=][^=]*=$$/XFOO=/'`
 
+@if gcc-bootstrap
+EXTRA_HOST_EXPORTS = if [ $(current_stage) != stage1 ]; then \
+                      $(POSTSTAGE1_HOST_EXPORTS) \
+                    fi ;
+
+EXTRA_BOOTSTRAP_FLAGS = CC=$$CC CXX=$$CXX LDFLAGS=$$LDFLAGS
+@endif gcc-bootstrap
+
 # Flags to pass down to makes which are built with the target environment.
 # The double $ decreases the length of the command line; those variables
 # are set in BASE_FLAGS_TO_PASS, and the sub-make will expand them.
The @@ -3597,9 +3605,9 @@ check-bfd: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/bfd \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif bfd @@ -4471,9 +4479,9 @@ check-opcodes: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/opcodes \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif opcodes @@ -5345,9 +5353,9 @@ check-binutils: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/binutils \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif binutils @@ -5775,9 +5783,9 @@ check-bison: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/bison \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif bison @@ -6217,7 +6225,7 @@ check-cgen: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/cgen \ $(MAKE) $(FLAGS_TO_PASS) check) @@ -6658,7 +,7 @@ check-dejagnu: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/dejagnu \ $(MAKE) $(FLAGS_TO_PASS) check) @@ -7099,7 +7107,7 @@ check-etc: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/etc \ $(MAKE) 
$(FLAGS_TO_PASS) check) @@ -7542,9 +7550,9 @@ check-fastjar: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/fastjar \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif fastjar @@ -8430,9 +8438,9 @@ check-fixincludes: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) $(EXTRA_HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/fixincludes \ - $(MAKE) $(FLAGS_TO_PASS) check) + $(MAKE) $(FLAGS_TO_PASS) $(EXTRA_BOOTSTRAP_FLAGS) check) @endif fixincludes @@ -8845,9 +8853,9 @@ check-flex: @if [ '$(host)' = '$(target)' ] ; then \ r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(HOST_EXPORTS) \ + $(HOST_EXPORTS) \ (cd $(HOST_SUBDIR)/flex \ - $(MAKE) $(FLAGS_TO_PASS) check); \ + $(MAKE) $(FLAGS_TO_PASS) check) fi @endif flex @@ -9733,9 +9741,9 @@ check-gas: @: $(MAKE); $(unstage) @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s;
[gomp4] Testing of C/C++ OpenACC cache directive (was: OpenACC cache directive for C)
Hi! On Wed, 05 Nov 2014 17:29:19 +0100, I wrote: In r217145, I applied Jim's patch to gomp-4_0-branch: * testsuite/libgomp.oacc-c/cache-1.c: New file. * testsuite/libgomp.oacc-c++/cache-1.C: Likewise. Applied to gomp-4_0-branch in r217147: commit 267cffc4105255a8372fb5788fdcbb54560f493b Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Nov 5 16:46:23 2014 + Testing of C/C++ OpenACC cache directive. gcc/testsuite/ * c-c++-common/goacc/cache-1.c: New file. libgomp/ * testsuite/libgomp.oacc-c/cache-1.c: Remove directives that are expected to fail, and rename the file to... * testsuite/libgomp.oacc-c-c++-common/cache-1.c: ... this. * testsuite/libgomp.oacc-c++/cache-1.C: Remove file. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217147 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/testsuite/ChangeLog.gomp | 2 + gcc/testsuite/c-c++-common/goacc/cache-1.c | 88 ++ libgomp/ChangeLog.gomp | 7 ++ libgomp/testsuite/libgomp.oacc-c++/cache-1.C | 50 .../testsuite/libgomp.oacc-c-c++-common/cache-1.c | 48 libgomp/testsuite/libgomp.oacc-c/cache-1.c | 69 - 6 files changed, 145 insertions(+), 119 deletions(-) diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp index 25be821..1faf0fa 100644 --- gcc/testsuite/ChangeLog.gomp +++ gcc/testsuite/ChangeLog.gomp @@ -1,5 +1,7 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * c-c++-common/goacc/cache-1.c: New file. + * gfortran.dg/goacc/data-tree.f95: Remove dg-prune-output directive. * gfortran.dg/goacc/kernels-tree.f95: Likewise. * gfortran.dg/goacc/loop-tree-1.f90: Likewise. 
diff --git gcc/testsuite/c-c++-common/goacc/cache-1.c gcc/testsuite/c-c++-common/goacc/cache-1.c new file mode 100644 index 000..9503341 --- /dev/null +++ gcc/testsuite/c-c++-common/goacc/cache-1.c @@ -0,0 +1,88 @@ +int +main (int argc, char **argv) +{ +#define N 2 +int a[N], b[N]; +int i; + +for (i = 0; i N; i++) +{ +a[i] = 3; +b[i] = 0; +} + +#pragma acc parallel copyin (a[0:N]) copyout (b[0:N]) +{ +int ii; + +for (ii = 0; ii N; ii++) +{ +const int idx = ii; +int n = 1; +const int len = n; + +#pragma acc cache /* { dg-error expected '\\\(' before end of line } */ + +#pragma acc cache a[0:N] /* { dg-error expected '\\\(' before 'a' } */ + /* { dg-bogus expected end of line before 'a' { xfail c++ } 26 } */ + +#pragma acc cache (a) /* { dg-error expected '\\\[' } */ + +#pragma acc cache ( /* { dg-error expected (identifier|unqualified-id) before end of line } */ + +#pragma acc cache () /* { dg-error expected (identifier|unqualified-id) before '\\\)' token } */ + +#pragma acc cache (,) /* { dg-error expected (identifier|unqualified-id) before '(,|\\\))' token } */ + +#pragma acc cache (a[0:N] /* { dg-error expected '\\\)' before end of line } */ + +#pragma acc cache (a[0:N],) /* { dg-error expected (identifier|unqualified-id) before '(,|\\\))' token { xfail c } } */ + +#pragma acc cache (a[0:N]) copyin (a[0:N]) /* { dg-error expected end of line before 'copyin' } */ + +#pragma acc cache () /* { dg-error expected (identifier|unqualified-id) before '\\\)' token } */ + +#pragma acc cache (a[0:N] b[0:N]) /* { dg-error expected '\\\)' before 'b' } */ + +#pragma acc cache (a[0:N] b[0:N}) /* { dg-error expected '\\\)' before 'b' } */ + /* { dg-bogus expected end of line before '\\\}' token { xfail c++ } 47 } */ + +#pragma acc cache (a[0:N] /* { dg-error expected '\\\)' before end of line } */ + +#pragma acc cache (a[ii]) /* { dg-error 'ii' is not a constant } */ + +#pragma acc cache (a[idx:n]) /* { dg-error 'n' is not a constant } */ + +#pragma acc cache (a[0:N]) ( /* { 
dg-error expected end of line before '\\(' token } */ + +#pragma acc cache (a[0:N]) ii /* { dg-error expected end of line before 'ii' } */ + +#pragma acc cache (a[0:N] ii) /* { dg-error expected '\\)' before 'ii' } */ + +#pragma acc cache (a[0:N]) + +#pragma acc cache (a[0:N], a[0:N]) + +#pragma acc cache (a[0:N], b[0:N]) + +#pragma acc cache (a[0]) + +#pragma acc cache (a[0], a[1], b[0:N]) + +#pragma acc cache (a[idx]) + +#pragma acc cache (a[idx:len]) + +b[ii] = a[ii]; +} +} + + +for (i = 0; i N; i++) +{ +if (a[i] != b[i]) +__builtin_abort (); +} + +return 0; +} diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp index 4ac348d..096a2a9 100644 --- libgomp/ChangeLog.gomp +++ libgomp/ChangeLog.gomp @@ -1,3 +1,10 @@ +2014-11-05 Thomas Schwinge tho...@codesourcery.com + + * testsuite/libgomp.oacc-c/cache-1.c: Remove directives that are + expected to fail, and rename the file to... + * testsuite/libgomp.oacc-c-c++-common/cache-1.c:
[gomp4] OpenACC update host/self maintenance (was: acc enter/exit data)
Hi!

On Thu, 30 Oct 2014 17:11:04 -0700, Cesar Philippidis ce...@codesourcery.com wrote:
>       gcc/fortran/
>       * gfortran.h (enum OMP_LIST_HOST): Remove.
>       (enum OMP_LIST_DEVICE, OMP_LIST_DEVICE): Remove.
>       * dump-parse-tree.c (show_omp_clauses): Remove OMP_LIST_HOST and
>       OMP_LIST_DEVICE from here also.
>       * openmp.c (OMP_CLAUSE_SELF): New define.
>       (gfc_match_omp_clauses): Update handling of OMP_CLAUSE_HOST and
>       OMP_CLAUSE_DEVICE.  Add support for OMP_CLAUSE_SELF.
>       * trans-openmp.c (gfc_trans_omp_clauses): Remove support for
>       OMP_LIST_HOST and OMP_LIST_DEVICE since they are treated as memory
>       maps now.
>       (gfc_trans_oacc_executable_directive): Remove stale EXEC_OACC_WAIT.

Applied to gomp-4_0-branch in r217148:

commit a7bba5ecc7c62a022616f55ff1d8fb48266fcb67
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Nov 5 16:54:07 2014 +

    OpenACC update host/self maintenance.

        gcc/c/
        * c-parser.c (c_parser_omp_clause_name) host: Return
        PRAGMA_OMP_CLAUSE_HOST.

        gcc/c/
        (c_parser_oacc_data_clause): Group PRAGMA_OMP_CLAUSE_SELF next to
        PRAGMA_OMP_CLAUSE_HOST.
        gcc/cp/
        * parser.c (cp_parser_oacc_data_clause): Group
        PRAGMA_OMP_CLAUSE_SELF next to PRAGMA_OMP_CLAUSE_HOST.
        gcc/fortran/
        * openmp.c (OMP_CLAUSE_HOST, OMP_CLAUSE_SELF): Merge into the new
        OMP_CLAUSE_HOST_SELF.  Update all users.

        gcc/
        * tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE_HOST and
        OMP_CLAUSE_OACC_DEVICE.  Update all users.

        gcc/testsuite/
        * c-c++-common/goacc/update-1.c: Extend.
        * gfortran.dg/goacc/assumed.f95: Likewise.
        * gfortran.dg/goacc/coarray.f95: Likewise.
        * gfortran.dg/goacc/cray.f95: Likewise.
        * gfortran.dg/goacc/literal.f95: Likewise.
        * gfortran.dg/goacc/parameter.f95: Likewise.
        libgomp/
        * testsuite/libgomp.oacc-c-c++-common/update-1-2.c: New file.
        * testsuite/libgomp.oacc-fortran/data-4-2.f90: Likewise.
        * testsuite/libgomp.oacc-fortran/data-4.f90: In one instance, use
        the self clause instead of host clause.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@217148 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 3 + gcc/c/ChangeLog.gomp | 6 + gcc/c/c-parser.c | 10 +- gcc/cp/ChangeLog.gomp | 3 + gcc/cp/parser.c| 8 +- gcc/fortran/ChangeLog.gomp | 3 + gcc/fortran/openmp.c | 21 +- gcc/gimplify.c | 4 - gcc/omp-low.c | 4 - gcc/testsuite/ChangeLog.gomp | 7 + gcc/testsuite/c-c++-common/goacc/update-1.c| 5 + gcc/testsuite/gfortran.dg/goacc/assumed.f95| 8 +- gcc/testsuite/gfortran.dg/goacc/coarray.f95| 3 +- gcc/testsuite/gfortran.dg/goacc/cray.f95 | 6 +- gcc/testsuite/gfortran.dg/goacc/literal.f95| 5 +- gcc/testsuite/gfortran.dg/goacc/parameter.f95 | 3 +- gcc/tree-core.h| 10 +- gcc/tree-pretty-print.c| 6 - gcc/tree.c | 6 - libgomp/ChangeLog.gomp | 5 + .../libgomp.oacc-c-c++-common/update-1-2.c | 282 + .../{data-4.f90 = data-4-2.f90} | 8 +- libgomp/testsuite/libgomp.oacc-fortran/data-4.f90 | 2 +- 23 files changed, 353 insertions(+), 65 deletions(-) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index 2c2b349..d140a35 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,5 +1,8 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE_HOST and + OMP_CLAUSE_OACC_DEVICE. Update all users. + * gimplify.c (gimplify_oacc_cache): New function. (gimplify_expr): Use it for OACC_CACHE. (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp index 70278b9..a223a17 100644 --- gcc/c/ChangeLog.gomp +++ gcc/c/ChangeLog.gomp @@ -1,5 +1,11 @@ 2014-11-05 Thomas Schwinge tho...@codesourcery.com + * c-parser.c (c_parser_omp_clause_name) host: Return + PRAGMA_OMP_CLAUSE_HOST. + + * c-parser.c (c_parser_oacc_data_clause): Group + PRAGMA_OMP_CLAUSE_SELF next to PRAGMA_OMP_CLAUSE_HOST. + * c-parser.c (c_parser_oacc_cache): Generate OACC_CACHE. * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_. 
diff --git gcc/c/c-parser.c gcc/c/c-parser.c index 40d4314..bd2864f 100644 --- gcc/c/c-parser.c +++
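
The commit groups the self clause with the host clause of the update directive. As a rough illustration of what that means for user code, the sketch below refreshes the host copy of an array once with each spelling. The function scale_and_update and its data are invented for this example and are not taken from the new update-1-2.c test, whose contents are not shown in this mail. Without OpenACC support the pragmas are ignored and the result is the same.

```c
#include <assert.h>

#define N 4

/* Hypothetical example: double each element of A inside a data region and
   refresh the host copy, first with "self", then with "host".  Returns
   the sum of the doubled elements.  */
static int
scale_and_update (int *a)
{
  int i, ii, sum = 0;

  assert (a != 0);

#pragma acc data copy (a[0:N])
  {
#pragma acc parallel loop present (a[0:N])
    for (ii = 0; ii < N; ii++)
      a[ii] *= 2;

    /* OpenACC 2.0 spelling.  */
#pragma acc update self (a[0:N])
    for (i = 0; i < N; i++)
      sum += a[i];

    /* Older spelling, same meaning.  */
#pragma acc update host (a[0:N])
  }

  return sum;
}
```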
RE: [PATCHv3][MIPS] Implement O32 ABI extensions (GCC)
Hi Catherine,

The full patch is attached and the delta from v2 is inline below.

Testing (O32):
MIPS I - FP32, MIPS II - FP32, MIPS II - FPXX,
MIPS32 - FP32, MIPS32 - FPXX, MIPS32 - FPXX ODDSPREG,
MIPS32R2 - FP32, MIPS32R2 - FPXX, MIPS32R2 - FPXX ODDSPREG,
MIPS32R2 - FP64, MIPS32R2 - FP64A

One of the new tests fails in the two ODDSPREG configurations but I don't plan to resolve that as it makes very little sense to pre-configure GCC to explicitly use -modd-spreg (it only makes sense to use -mno-odd-spreg). The failure is merely that the expected behaviour does not match up because of the implicit -modd-spreg option. This same issue affects one of the pre-existing loongson tests in the same configurations but I am leaving that for the same reason.

The FP64 configurations enable extra tests which failed before for FP64 and still fail after, so they will be cleaned up in later work. In particular there are some missed vectorization cases which are to do with loongson.

Testing (N64):
The testsuite is still running after a rebase but it had no regressions last time.

Changes (on top of addressing all comments):

* Handle the fact that -msingle-float, -msoft-float, -mfp32, -mfpxx,
  -mfp64 are all part of the FPU ABI selection. As such if one of these
  is given which conflicts with an implicit option then the implicit
  option is not added. This is necessary to provide a stable and
  consistent behaviour when writing assembly code such that (for example)
  building a file with -msoft-float and using .set hardfloat gets you
  into the standard hard-float ABI which is FP32. This issue was
  identified as part of resolving kernel issues when built with
  FPXX/.MIPS.abiflags aware tools:
  http://www.linux-mips.org/archives/linux-mips/2014-10/msg00886.html
  There are lots of other strange specs issues to resolve for MIPS but
  this is a step in the right direction.
* Fix a build warning
* Fix a testcase for FP64 configs

OK to commit (pending successful results for the N64 testrun)?

Thanks, Matthew --- gcc/config.in | 2 +- gcc/config/mips/mips.c | 49 +++- gcc/config/mips/mips.h | 51 +- gcc/config/mips/mips.md| 8 ++--- gcc/config/mips/mips.opt | 2 +- gcc/config/mips/mti-elf.h | 6 ++-- gcc/config/mips/mti-linux.h| 6 ++-- gcc/configure | 20 ++-- gcc/configure.ac | 12 +++ gcc/doc/invoke.texi| 17 +- gcc/testsuite/gcc.target/mips/oddspreg-3.c | 2 +- 11 files changed, 93 insertions(+), 82 deletions(-) diff --git a/gcc/config.in b/gcc/config.in index 40dd6f5..8fd0e0e 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -474,7 +474,7 @@ /* Define if the assembler understands .module. */ #ifndef USED_FOR_TARGET -#undef HAVE_AS_MODULE +#undef HAVE_AS_DOT_MODULE #endif diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 2dc6725..46f3890 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -5466,7 +5466,7 @@ mips_function_arg_boundary (machine_mode mode, const_tree type) static machine_mode mips_get_reg_raw_mode (int regno) { - if (mips_abi == ABI_32 !TARGET_FLOAT32 FP_REG_P (regno)) + if (TARGET_FLOATXX FP_REG_P (regno)) return DFmode; return default_get_reg_raw_mode (regno); } @@ -5796,19 +5796,24 @@ mips_libcall_value (machine_mode mode, const_rtx fun ATTRIBUTE_UNUSED) static bool mips_function_value_regno_p (const unsigned int regno) { + /* Most types only require one GPR or one FPR for return values but for + hard-float two FPRs can be used for _Complex types (for all ABIs) + and long doubles (for n64). */ if (regno == GP_RETURN || regno == FP_RETURN - || regno == FP_RETURN + 2 - || (LONG_DOUBLE_TYPE_SIZE == 128 - FP_RETURN != GP_RETURN + || (FP_RETURN != GP_RETURN regno == FP_RETURN + 2)) return true; - if ((regno == FP_RETURN + 1 - || regno == FP_RETURN + 3) + /* For o32 FP32, _Complex double will be returned in four 32-bit registers. 
+ This does not apply to o32 FPXX as floating-point function argument and + return registers are described as 64-bit even though floating-point + registers are primarily described as 32-bit internally. + See: mips_get_reg_raw_mode. */ + if ((mips_abi == ABI_32 TARGET_FLOAT32) FP_RETURN != GP_RETURN - (mips_abi == ABI_32 TARGET_FLOAT32) - FP_REG_P (regno)) + (regno == FP_RETURN + 1 + || regno == FP_RETURN + 3)) return true; return false; @@ -8723,14 +8728,13 @@ mips_dwarf_register_span (rtx reg) machine_mode mode; /* TARGET_FLOATXX is implemented as 32-bit floating-point registers but - ensures that double precision registers are treated as if they were + ensures that
Re: [PATCH] PR36312
On 4 November 2014 23:40, Jeff Law l...@redhat.com wrote:
> On 10/25/14 04:20, Anthony Brandon wrote:
> > Hi,
> >
> > Sorry for the delay. Here are the updated diff and changelog.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2014-10-25  Anthony Brandon  anthony.bran...@gmail.com
> >
> >       PR driver/36312
> >       * gcc.misc-tests/output.exp: New test case for identical input and
> >       output files.
> >
> > include/ChangeLog:
> >
> > 2014-10-25  Anthony Brandon  anthony.bran...@gmail.com
> >
> >       PR driver/36312
> >       * filenames.h: Add prototype for canonical_filename_eq.
> >
> > gcc/ChangeLog:
> >
> > 2014-10-25  Anthony Brandon  anthony.bran...@gmail.com
> >
> >       PR driver/36312
> >       * diagnostic-core.h: Add prototype for fatal_error.
> >       * diagnostic.c (fatal_error): New function fatal_error.
> >       * gcc.c (store_arg): Remove have_o_argbuf_index.
> >       (process_command): Check if input and output files are the same.
> >       * toplev.c (init_asm_output): Check if input and output files are
> >       the same.
> >
> > libiberty/ChangeLog:
> >
> > 2014-10-25  Anthony Brandon  anthony.bran...@gmail.com
> >
> >       PR driver/36312
> >       * filename_cmp.c (canonical_filename_eq): New function to check if
> >       file names are the same.
> >       * functions.texi: Updated with documentation for new function.
>
> This is fine for the trunk.  Please install.
>
> Thanks,
> Jeff

I committed this as r217149.

Anthony, Thanks!

Cheers,

Manuel.
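
For context, the patch makes the driver refuse to run when the output file is also one of the inputs. The libiberty helper canonical_filename_eq is not shown in this mail, so the following is only a sketch of one way such a condition can be detected on a POSIX system, by comparing device and inode numbers from stat. The name same_file is invented, and this is not claimed to be how the committed function works.

```c
#include <assert.h>
#include <sys/stat.h>

/* Hypothetical helper: return 1 if PATH_A and PATH_B name the same
   existing file, 0 otherwise (including when either path cannot be
   stat'ed, e.g. because the output file does not exist yet).  */
static int
same_file (const char *path_a, const char *path_b)
{
  struct stat sa, sb;

  assert (path_a != 0 && path_b != 0);

  if (stat (path_a, &sa) != 0 || stat (path_b, &sb) != 0)
    return 0;

  /* Two paths refer to the same file when both the device and the inode
     number match, regardless of how the paths are spelled.  */
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Unlike a textual comparison, this also catches different spellings of one file, such as foo.c and ./foo.c, at the cost of touching the file system.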
Re: libstdc++ new deque failures
On 5 November 2014 14:14, David Edelsohn wrote:
> Jonathan,
>
> I still am seeing new failures in the libstdc++ deque testsuite as of
> last night.  I don't know if you still are working through the fallout
> from the earlier patches, but I wanted to make you aware.

Yes, those tests are meant to fail but I need to adjust the dg-error line numbers after one of my earlier patches. I'm working on a patch (I might make other changes to std::deque, which would require changing the dg-error line numbers yet again, so I'm holding off until the other changes are ready ... or I decide not to make them and just fix the tests.)

Sorry for the noise in the testresults.

> And these are not related to deque, but appear to be additional issues
> in the libstdc++ implementation:

I hadn't seen these ones, I'll take a look, thanks.
[gomp4] Use GOMP_OFFLOAD_ prefix for (OpenACC) plugin hooks
Hi,

Mirroring changes in Ilya Verbin's libgomp offloading pieces posted to trunk, this patch adds a prefix of GOMP_OFFLOAD_ to the OpenACC plugin hooks. Some of these bits will not be needed for a trunk version of the patch once Ilya's patch is approved (I'm hoping other incompatibilities haven't crept in other than the renaming!).

I will apply to the gomp4 branch shortly.

Thanks,

Julian

ChangeLog

    libgomp/
    * oacc-host.c: Add GOMP_OFFLOAD_ prefix for plugin hooks.  Rename
    device_init to init_device, device_fini to fini_device,
    offload_register to register_image and remove extraneous device_
    from device_alloc, device_free, device_dev2host, device_host2dev and
    device_run.
    (host_dispatch): Use new names for hooks.
    * oacc-init.c: Use new names for hooks, throughout.
    * plugin-nvptx.c: Likewise.
    * target.c: Likewise.
    (gomp_load_plugin_for_device): Likewise.  Look for new hook names.
    * target.h (gomp_device_descr): Use new hook names.

commit 4e1b71a5e0d15de4c6e89ab5139964e32b563d68
Author: Julian Brown jul...@codesourcery.com
Date:   Wed Nov 5 02:34:22 2014 -0800

    Use GOMP_OFFLOAD_ prefix for plugin hooks.

diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c index fc3e77c..02794bb 100644 --- a/libgomp/oacc-host.c +++ b/libgomp/oacc-host.c @@ -60,7 +60,7 @@ static struct gomp_device_descr host_dispatch; #endif STATIC const char * -get_name (void) +GOMP_OFFLOAD_get_name (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -74,7 +74,7 @@ get_name (void) } STATIC int -get_type (void) +GOMP_OFFLOAD_get_type (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -88,7 +88,7 @@ get_type (void) } STATIC unsigned int -get_caps (void) +GOMP_OFFLOAD_get_caps (void) { unsigned int caps = TARGET_CAP_OPENACC_200 | TARGET_CAP_OPENMP_400 | TARGET_CAP_NATIVE_EXEC; @@ -105,7 +105,7 @@ get_caps (void) } STATIC int -get_num_devices (void) +GOMP_OFFLOAD_get_num_devices (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -115,7 +115,7 @@ get_num_devices (void) } STATIC void -offload_register (void *host_table, void *target_data) +GOMP_OFFLOAD_register_image (void *host_table, void *target_data) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%p, %p)\n, __FILE__, __FUNCTION__, host_table, @@ -124,17 +124,17 @@ offload_register (void *host_table, void *target_data) } STATIC int -device_init (void) +GOMP_OFFLOAD_init_device (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); #endif - return get_num_devices (); + return GOMP_OFFLOAD_get_num_devices (); } STATIC int -device_fini (void) +GOMP_OFFLOAD_fini_device (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -144,7 +144,7 @@ device_fini (void) } STATIC int -device_get_table (struct mapping_table **table) +GOMP_OFFLOAD_get_table (struct mapping_table **table) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%p)\n, __FILE__, __FUNCTION__, table); @@ -154,7 +154,7 @@ device_get_table (struct mapping_table **table) } STATIC bool -openacc_avail (void) +GOMP_OFFLOAD_openacc_avail (void) { #ifdef DEBUG fprintf 
(stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -164,7 +164,7 @@ openacc_avail (void) } STATIC void * -openacc_open_device (int n) +GOMP_OFFLOAD_openacc_open_device (int n) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%u)\n, __FILE__, __FUNCTION__, n); @@ -174,7 +174,7 @@ openacc_open_device (int n) } STATIC int -openacc_close_device (void *hnd) +GOMP_OFFLOAD_openacc_close_device (void *hnd) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%p)\n, __FILE__, __FUNCTION__, hnd); @@ -184,7 +184,7 @@ openacc_close_device (void *hnd) } STATIC int -openacc_get_device_num (void) +GOMP_OFFLOAD_openacc_get_device_num (void) { #ifdef DEBUG fprintf (stderr, SELF %s:%s\n, __FILE__, __FUNCTION__); @@ -194,7 +194,7 @@ openacc_get_device_num (void) } STATIC void -openacc_set_device_num (int n) +GOMP_OFFLOAD_openacc_set_device_num (int n) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%u)\n, __FILE__, __FUNCTION__, n); @@ -205,7 +205,7 @@ openacc_set_device_num (int n) } STATIC void * -device_alloc (size_t s) +GOMP_OFFLOAD_alloc (size_t s) { void *ptr = GOMP(malloc) (s); @@ -217,7 +217,7 @@ device_alloc (size_t s) } STATIC void -device_free (void *p) +GOMP_OFFLOAD_free (void *p) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%p)\n, __FILE__, __FUNCTION__, p); @@ -227,7 +227,7 @@ device_free (void *p) } STATIC void * -device_host2dev (void *d, const void *h, size_t s) +GOMP_OFFLOAD_host2dev (void *d, const void *h, size_t s) { #ifdef DEBUG fprintf (stderr, SELF %s:%s (%p, %p, %zd)\n, __FILE__, __FUNCTION__, d, h, @@ -242,7 +242,7 @@ device_host2dev (void *d, const void *h, size_t s) } STATIC void * -device_dev2host (void *h, const void *d, size_t s)
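The renaming described in the ChangeLog above can be summarised as a small lookup. The following is only an illustrative sketch, not code from libgomp: the names `kRenamedHooks` and `prefixed_hook_name` are made up for this example, and the table lists just the renames mentioned in the ChangeLog and visible in the diff. Hooks that are not listed are assumed to keep their short name and only gain the prefix.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table (not part of libgomp): old short hook name -> new
// short hook name, for the hooks that were renamed beyond gaining the
// GOMP_OFFLOAD_ prefix.
static const std::map<std::string, std::string> kRenamedHooks = {
  {"device_init", "init_device"},
  {"device_fini", "fini_device"},
  {"offload_register", "register_image"},
  {"device_get_table", "get_table"},
  {"device_alloc", "alloc"},
  {"device_free", "free"},
  {"device_dev2host", "dev2host"},
  {"device_host2dev", "host2dev"},
  {"device_run", "run"},
};

// Hypothetical helper: returns the symbol name a plugin would export for
// a hook that used to be called OLD_NAME.
std::string prefixed_hook_name(const std::string& old_name)
{
  const auto it = kRenamedHooks.find(old_name);
  const std::string& short_name
    = (it != kRenamedHooks.end()) ? it->second : old_name;
  return "GOMP_OFFLOAD_" + short_name;
}
```

With a helper like this, a loader in the style of gomp_load_plugin_for_device could compute the name to look up in the plugin, for example turning device_init into GOMP_OFFLOAD_init_device.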
[gomp4] Move libgomp plugins into subdirectory
Hi,

This patch moves plugin-nvptx.c and plugin-host.c (from oacc-host.c) into a new plugin subdirectory, as requested by Jakub, and to match more closely the layout of the Intel MIC pieces. This also moves the autotools bits to enable the NVPTX plugin and locate CUDA libraries into the plugin directory's (new) configury bits.

So far this only changes the location of the source files: the plugins themselves are still installed to the same place as before (alongside libgomp itself).

Test results look reasonable with my (patched for PTX support) version of the gomp4 branch. I'll apply it there shortly.

Thanks,

Julian

ChangeLog

	libgomp/
	* Makefile.am (SUBDIRS): Add plugin.
	(DIST_SUBDIRS): Define.
	(libgomp_plugin_nvptx_*): Remove nvptx support from here.
	(libgomp_plugin_host_nonshm_*): Likewise.
	* Makefile.in: Regenerate.
	* configure: Regenerate.
	* oacc-host.c: Replace with #include of plugin/plugin-host.c code,
	move implementation to the latter.
	* plugin/plugin-host.c: New file.
	* plugin-nvptx.c: Move to...
	* plugin/plugin-nvptx.c: New file.
	* plugin/Makefile.am: New.
	* plugin/Makefile.in: Regenerate.
	* plugin/aclocal.m4: Regenerate.
	* plugin/configure: Regenerate.

commit 8994fb8c1b9d52cb9c82a61227a450df29e61806
Author: Julian Brown jul...@codesourcery.com
Date:   Wed Nov 5 02:54:30 2014 -0800

    Move libgomp plugins into their own directory.

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index e0ab763..f265c5d 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -1,7 +1,8 @@
 ## Process this file with automake to produce Makefile.in
 
 ACLOCAL_AMFLAGS = -I .. -I ../config
-SUBDIRS = testsuite
+SUBDIRS = testsuite plugin
+DIST_SUBDIRS = plugin
 
 ## May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
@@ -21,27 +22,6 @@ AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 toolexeclib_LTLIBRARIES = libgomp.la
 nodist_toolexeclib_HEADERS = libgomp.spec
 
-if PLUGIN_NVPTX
-# Nvidia PTX OpenACC plugin.
-libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION) -toolexeclib_LTLIBRARIES += libgomp-plugin-nvptx.la -libgomp_plugin_nvptx_la_SOURCES = plugin-nvptx.c -libgomp_plugin_nvptx_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_NVPTX_CPPFLAGS) -libgomp_plugin_nvptx_la_LDFLAGS = $(libgomp_plugin_nvptx_version_info) \ - $(lt_host_flags) -libgomp_plugin_nvptx_la_LDFLAGS += $(PLUGIN_NVPTX_LDFLAGS) -libgomp_plugin_nvptx_la_LIBADD = $(PLUGIN_NVPTX_LIBS) -libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static -endif - -libgomp_plugin_host_nonshm_version_info = -version-info $(libtool_VERSION) -toolexeclib_LTLIBRARIES += libgomp-plugin-host_nonshm.la -libgomp_plugin_host_nonshm_la_SOURCES = oacc-host.c -libgomp_plugin_host_nonshm_la_CPPFLAGS = $(AM_CPPFLAGS) -DHOST_NONSHM_PLUGIN -libgomp_plugin_host_nonshm_la_LDFLAGS = \ - $(libgomp_plugin_host_nonshm_version_info) $(lt_host_flags) -libgomp_plugin_host_nonshm_la_LIBTOOLFLAGS = --tag=disable-static - if LIBGOMP_BUILD_VERSIONED_SHLIB # -Wc is only a libtool option. comma = , diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in index d12376e..ea3e1ca 100644 diff --git a/libgomp/configure b/libgomp/configure index 7daccd9..11a7ae0 100755 diff --git a/libgomp/configure.ac b/libgomp/configure.ac index 89c6b31..e883945 100644 --- a/libgomp/configure.ac +++ b/libgomp/configure.ac @@ -30,42 +30,6 @@ LIBGOMP_ENABLE(generated-files-in-srcdir, no, , AC_MSG_RESULT($enable_generated_files_in_srcdir) AM_CONDITIONAL(GENINSRC, test $enable_generated_files_in_srcdir = yes) -# Look for the CUDA driver package. -CUDA_DRIVER_INCLUDE= -CUDA_DRIVER_LIB= -AC_SUBST(CUDA_DRIVER_INCLUDE) -AC_SUBST(CUDA_DRIVER_LIB) -CUDA_DRIVER_CPPFLAGS= -CUDA_DRIVER_LDFLAGS= -AC_ARG_WITH(cuda-driver, - [AS_HELP_STRING([--with-cuda-driver=PATH], - [specify prefix directory for installed CUDA driver package. 
- Equivalent to --with-cuda-driver-include=PATH/include - plus --with-cuda-driver-lib=PATH/lib])]) -AC_ARG_WITH(cuda-driver-include, - [AS_HELP_STRING([--with-cuda-driver-include=PATH], - [specify directory for installed CUDA driver include files])]) -AC_ARG_WITH(cuda-driver-lib, - [AS_HELP_STRING([--with-cuda-driver-lib=PATH], - [specify directory for the installed CUDA driver library])]) -if test x$with_cuda_driver != x; then - CUDA_DRIVER_INCLUDE=$with_cuda_driver/include - CUDA_DRIVER_LIB=$with_cuda_driver/lib -fi -if test x$with_cuda_driver_include != x; then - CUDA_DRIVER_INCLUDE=$with_cuda_driver_include -fi -if test x$with_cuda_driver_lib != x; then - CUDA_DRIVER_LIB=$with_cuda_driver_lib -fi -if test x$CUDA_DRIVER_INCLUDE != x; then - CUDA_DRIVER_CPPFLAGS=-I$CUDA_DRIVER_INCLUDE -fi -if test x$CUDA_DRIVER_LIB != x; then - CUDA_DRIVER_LDFLAGS=-L$CUDA_DRIVER_LIB -fi - - # --- # --- @@ -241,52 +205,7 @@ elif test x$enable_accelerator != xno; then AC_MSG_ERROR([Can't have support for accelerators without support for plugins]) fi
[jit] Verify enum values earlier
It wasn't clear to me that all of these enum values were being fully validated by the internals, and it's better to fail early (so we can report which function was at fault), so explicitly validate enum values at the API entrypoints.

The new testcases bring the number of expected passes in jit.sum from 4663 to 4711.

Committed to the dmalcolm/jit branch.

gcc/jit/ChangeLog.jit:
	* libgccjit.c (gcc_jit_context_get_type): Verify that type
	is valid immediately, rather than relying on called code.
	(gcc_jit_context_new_function): Likewise for kind.
	(gcc_jit_context_new_unary_op): Likewise for op.
	(valid_binary_op_p): New.
	(gcc_jit_context_new_binary_op): Verify that op is valid
	immediately, rather than relying on called code.
	(gcc_jit_context_new_comparison): Likewise.
	(gcc_jit_block_add_assignment_op): Likewise.

gcc/testsuite/ChangeLog.jit:
	* jit.dg/test-error-get-type-bad-enum.c: New test case.
	* jit.dg/test-error-new-binary-op-bad-op.c: Likewise.
	* jit.dg/test-error-new-function-bad-kind.c: Likewise.
	* jit.dg/test-error-new-unary-op-bad-op.c: Likewise.
--- gcc/jit/ChangeLog.jit | 12 ++ gcc/jit/libgccjit.c| 49 +++--- gcc/testsuite/ChangeLog.jit| 7 .../jit.dg/test-error-get-type-bad-enum.c | 27 .../jit.dg/test-error-new-binary-op-bad-op.c | 37 .../jit.dg/test-error-new-function-bad-kind.c | 41 ++ .../jit.dg/test-error-new-unary-op-bad-op.c| 36 7 files changed, 204 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/jit.dg/test-error-get-type-bad-enum.c create mode 100644 gcc/testsuite/jit.dg/test-error-new-binary-op-bad-op.c create mode 100644 gcc/testsuite/jit.dg/test-error-new-function-bad-kind.c create mode 100644 gcc/testsuite/jit.dg/test-error-new-unary-op-bad-op.c diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit index 3d6361c..ce927d6 100644 --- a/gcc/jit/ChangeLog.jit +++ b/gcc/jit/ChangeLog.jit @@ -1,5 +1,17 @@ 2014-11-05 David Malcolm dmalc...@redhat.com + * libgccjit.c (gcc_jit_context_get_type): Verify that type + is valid immediately, rather than relying on called code. + (gcc_jit_context_new_function): Likewise for kind. + (gcc_jit_context_new_unary_op): Likewise for op. + (valid_binary_op_p): New. + (gcc_jit_context_new_binary_op): Verify that op is valid + immediately, rather than relying on called code. + (gcc_jit_context_new_comparison): Likewise. + (gcc_jit_block_add_assignment_op): Likewise. + +2014-11-05 David Malcolm dmalc...@redhat.com + * libgccjit.c: Include safe-ctype.h from libiberty. (IS_ASCII_ALPHA): Delete. (IS_ASCII_DIGIT): Delete. diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c index d9f259e..c109ba6 100644 --- a/gcc/jit/libgccjit.c +++ b/gcc/jit/libgccjit.c @@ -316,7 +316,11 @@ gcc_jit_context_get_type (gcc_jit_context *ctxt, enum gcc_jit_types type) { RETURN_NULL_IF_FAIL (ctxt, NULL, NULL, NULL context); - /* The inner function checks type for us. 
*/ + RETURN_NULL_IF_FAIL_PRINTF1 ( +(type = GCC_JIT_TYPE_VOID + type = GCC_JIT_TYPE_FILE_PTR), +ctxt, NULL, +unrecognized value for enum gcc_jit_types: %i, type); return (gcc_jit_type *)ctxt-get_type (type); } @@ -574,6 +578,12 @@ gcc_jit_context_new_function (gcc_jit_context *ctxt, int is_variadic) { RETURN_NULL_IF_FAIL (ctxt, NULL, loc, NULL context); + RETURN_NULL_IF_FAIL_PRINTF1 ( +((kind = GCC_JIT_FUNCTION_EXPORTED) + (kind = GCC_JIT_FUNCTION_ALWAYS_INLINE)), +ctxt, loc, +unrecognized value for enum gcc_jit_function_kind: %i, +kind); RETURN_NULL_IF_FAIL (return_type, ctxt, loc, NULL return_type); RETURN_NULL_IF_FAIL (name, ctxt, loc, NULL name); /* The assembler can only handle certain names, so for now, enforce @@ -835,13 +845,29 @@ gcc_jit_context_new_unary_op (gcc_jit_context *ctxt, gcc_jit_rvalue *rvalue) { RETURN_NULL_IF_FAIL (ctxt, NULL, loc, NULL context); - /* op is checked by the inner function. */ + RETURN_NULL_IF_FAIL_PRINTF1 ( +(op = GCC_JIT_UNARY_OP_MINUS + op = GCC_JIT_UNARY_OP_LOGICAL_NEGATE), +ctxt, loc, +unrecognized value for enum gcc_jit_unary_op: %i, +op); RETURN_NULL_IF_FAIL (result_type, ctxt, loc, NULL result_type); RETURN_NULL_IF_FAIL (rvalue, ctxt, loc, NULL rvalue); return (gcc_jit_rvalue *)ctxt-new_unary_op (loc, op, result_type, rvalue); } +/* Determine if OP is a valid value for enum gcc_jit_binary_op. + For use by both gcc_jit_context_new_binary_op and + gcc_jit_block_add_assignment_op. */ + +static bool +valid_binary_op_p (enum gcc_jit_binary_op op) +{ + return (op =
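The idea of validating an enum at the API entrypoint, so that the error message can name the function at fault, can be sketched as follows. This is a stand-alone illustration and not libgccjit code: `sketch_unary_op`, `sketch_context`, `valid_unary_op_p` and `sketch_new_unary_op` are hypothetical names, and the real entrypoints report errors through the RETURN_NULL_IF_FAIL macros rather than a string member.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum standing in for a public API enum.
enum sketch_unary_op
{
  SKETCH_UNARY_OP_MINUS,
  SKETCH_UNARY_OP_BITWISE_NEGATE,
  SKETCH_UNARY_OP_LOGICAL_NEGATE
};

// Hypothetical context that remembers the first error reported on it.
struct sketch_context
{
  std::string first_error;
};

// Range check: OP is taken as int because a caller can pass any integer
// cast to the enum type.
static bool
valid_unary_op_p (int op)
{
  return op >= SKETCH_UNARY_OP_MINUS
	 && op <= SKETCH_UNARY_OP_LOGICAL_NEGATE;
}

// Hypothetical entrypoint: validates its arguments before doing any work,
// so the recorded error names this function rather than some inner one.
const char *
sketch_new_unary_op (sketch_context *ctxt, int op)
{
  if (!ctxt)
    return nullptr;
  if (!valid_unary_op_p (op))
    {
      if (ctxt->first_error.empty ())
	ctxt->first_error
	  = std::string (__func__)
	    + ": unrecognized value for enum sketch_unary_op: "
	    + std::to_string (op);
      return nullptr;
    }
  static const char *const names[]
    = { "minus", "bitwise-negate", "logical-negate" };
  return names[op];
}
```

Checking against the first and last enumerators only works while the enum values are contiguous, which is an assumption of this sketch.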
Add uniform_inside_sphere_distribution
This distribution has come in handy for me. It relies on uniform_on_sphere_distribution and, like it, min and max have no real meaning. Unlike uniform_on_sphere_distribution, which really is a random multidimensional unit vector, users often want to pick the radius of this distribution. Unit radius is a good default and is provided, but the user can specify the radius. Like uniform_on_sphere_distribution, which is used inside, the 2-dimensional case uses rejection and higher dimensions use a transform.

Built and tested clean on x86_64-linux.

OK?

Ed

2014-09-05  Edward Smith-Rowland  3dw...@verizon.net

	* include/ext/random: Add uniform_inside_sphere_distribution.
	* include/ext/random.tcc: Out-of-line implementation of
	uniform_inside_sphere_distribution.
	* testsuite/ext/random/uniform_inside_sphere_distribution/cons/
	default.cc: New.
	* testsuite/ext/random/uniform_inside_sphere_distribution/cons/
	parms.cc: New.
	* testsuite/ext/random/uniform_inside_sphere_distribution/operators/
	equal.cc: New.
	* testsuite/ext/random/uniform_inside_sphere_distribution/operators/
	generate.cc: New.
	* testsuite/ext/random/uniform_inside_sphere_distribution/operators/
	inequal.cc: New.
	* testsuite/ext/random/uniform_inside_sphere_distribution/operators/
	serialize.cc: New.

Index: include/ext/random
===================================================================
--- include/ext/random (revision 216942)
+++ include/ext/random (working copy)
@@ -3487,6 +3487,218 @@
 	       _RealType __d2)
     { return !(__d1 == __d2); }
 
+
+  /**
+   * @brief A distribution for random coordinates inside a unit sphere.
+   */
+  template<std::size_t _Dimen, typename _RealType = double>
+    class uniform_inside_sphere_distribution
+    {
+      static_assert(std::is_floating_point<_RealType>::value,
+		    "template argument not a floating point type");
+      static_assert(_Dimen != 0, "dimension is zero");
+
+    public:
+      /** The type of the range of the distribution. */
+      using result_type = std::array<_RealType, _Dimen>;
+
+      /** Parameter type.
*/ + struct param_type + { + using distribution_type + = uniform_inside_sphere_distribution_Dimen, _RealType; + friend class uniform_inside_sphere_distribution_Dimen, _RealType; + + explicit + param_type(_RealType __radius = _RealType(1)) + : _M_radius(__radius) + { + _GLIBCXX_DEBUG_ASSERT(_M_radius _RealType(0)); + } + + _RealType + radius() const + { return _M_radius; } + + friend bool + operator==(const param_type __p1, const param_type __p2) + { return __p1._M_radius == __p2._M_radius; } + + private: + _RealType _M_radius; + }; + + /** + * @brief Constructors. + */ + explicit + uniform_inside_sphere_distribution(_RealType __radius = _RealType(1)) + : _M_param(__radius), _M_uosd() + { } + + explicit + uniform_inside_sphere_distribution(const param_type __p) + : _M_param(__p), _M_uosd() + { } + + /** + * @brief Resets the distribution state. + */ + void + reset() + { _M_uosd.reset(); } + + /** + * @brief Returns the @f$radius@f$ of the distribution. + */ + _RealType + radius() const + { return _M_param.radius(); } + + /** + * @brief Returns the parameter set of the distribution. + */ + param_type + param() const + { return _M_param; } + + /** + * @brief Sets the parameter set of the distribution. + * @param __param The new parameter set of the distribution. + */ + void + param(const param_type __param) + { _M_param = __param; } + + /** + * @brief Returns the greatest lower bound value of the distribution. + * This function makes no sense for this distribution. + */ + result_type + min() const + { + result_type __res; + __res.fill(0); + return __res; + } + + /** + * @brief Returns the least upper bound value of the distribution. + * This function makes no sense for this distribution. + */ + result_type + max() const + { + result_type __res; + __res.fill(0); + return __res; + } + + /** + * @brief Generating functions. 
+ */ + templatetypename _UniformRandomNumberGenerator + result_type + operator()(_UniformRandomNumberGenerator __urng) + { return this-operator()(__urng, _M_param); } + + templatetypename _UniformRandomNumberGenerator + result_type + operator()(_UniformRandomNumberGenerator __urng, + const param_type __p); + + templatetypename _ForwardIterator, + typename _UniformRandomNumberGenerator + void +
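For readers unfamiliar with the two approaches mentioned in the message (rejection in two dimensions, a transform in higher dimensions), here is a minimal stand-alone sketch of both. It is not the code from this patch, whose out-of-line implementation is not shown here. The function names are invented for illustration, and the transform used (normalise a Gaussian vector to get a direction, then scale the length by radius times U^(1/N)) is one standard technique that is assumed, not confirmed, to resemble what the patch does.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Rejection: draw points in the square [-1, 1]^2 until one falls inside
// the unit circle, then scale it by the radius.
template<typename Real, typename Urng>
  std::array<Real, 2>
  sample_inside_circle(Urng& urng, Real radius)
  {
    std::uniform_real_distribution<Real> coord(Real(-1), Real(1));
    Real x, y;
    do
      {
	x = coord(urng);
	y = coord(urng);
      }
    while (x * x + y * y > Real(1));
    return {{radius * x, radius * y}};
  }

// Transform: a normalised Gaussian vector is uniform on the sphere;
// scaling its length by radius * U^(1/N), with U uniform in [0, 1),
// makes the point uniform inside the N-dimensional ball.
template<std::size_t N, typename Real, typename Urng>
  std::array<Real, N>
  sample_inside_ball(Urng& urng, Real radius)
  {
    std::normal_distribution<Real> normal(Real(0), Real(1));
    std::uniform_real_distribution<Real> unit(Real(0), Real(1));
    std::array<Real, N> point;
    Real norm2;
    do
      {
	norm2 = Real(0);
	for (Real& c : point)
	  {
	    c = normal(urng);
	    norm2 += c * c;
	  }
      }
    while (norm2 == Real(0));	// Avoid dividing by zero below.
    const Real scale
      = radius * std::pow(unit(urng), Real(1) / Real(N)) / std::sqrt(norm2);
    for (Real& c : point)
      c *= scale;
    return point;
  }
```

Rejection is attractive in low dimensions because most candidates are accepted (about 79% for the circle), while the acceptance rate drops quickly as the dimension grows, which is why a transform is the usual choice there.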
Re: [Ada] Changes related to back-end inlining
> 2014-10-31  Eric Botcazou  ebotca...@adacore.com
>
> 	* inline.adb (Has_Excluded_Declaration): With back-end inlining,
> 	only return true for nested packages.
> 	(Cannot_Inline): Issue errors/warnings whatever the optimization
> 	level for back-end inlining and remove assertion.

Here is a follow-up patch for the case of nested subprograms, as well as a bunch of testcases for the gnat.dg testsuite.

Tested on x86_64-suse-linux, applied on the mainline.

2014-11-05  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/utils.c (create_subprog_decl): Move code dealing with
	conflicting inlining status of nested subprograms to...
	* gcc-interface/trans.c (check_inlining_for_nested_subprog): ...here.
	(Attribute_to_gnu) <Attr_Access>: Call it.
	(Call_to_gnu): Likewise.
	(Subprogram_Body_to_gnu): Drop the body if it is an inlined external
	function that has been marked uninlinable.

2014-11-05  Eric Botcazou  ebotca...@adacore.com

	* gnat.dg/inline1.adb: New test.
	* gnat.dg/inline1_pkg.ad[sb]: New helper.
	* gnat.dg/inline2.adb: New test.
	* gnat.dg/inline2_pkg.ad[sb]: New helper.
	* gnat.dg/inline3.adb: New test.
	* gnat.dg/inline3_pkg.ad[sb]: New helper.
	* gnat.dg/inline4.adb: New test.
	* gnat.dg/inline4_pkg.ad[sb]: New helper.
	* gnat.dg/inline5.adb: New test.
	* gnat.dg/inline5_pkg.ad[sb]: New helper.
	* gnat.dg/inline6.adb: New test.
	* gnat.dg/inline6_pkg.ad[sb]: New helper.
	* gnat.dg/inline7.adb: New test.
	* gnat.dg/inline7_pkg1.ad[sb]: New helper.
	* gnat.dg/inline7_pkg2.ad[sb]: Likewise.
	* gnat.dg/inline8.adb: New test.
	* gnat.dg/inline8_pkg1.ad[sb]: New helper.
	* gnat.dg/inline8_pkg2.ad[sb]: New helper.
	* gnat.dg/inline9.adb: New test.
	* gnat.dg/inline9_pkg.ad[sb]: New helper.
	* gnat.dg/inline10.adb: New test.
	* gnat.dg/inline10_pkg.ad[sb]: New helper.
	* gnat.dg/inline11.adb: New test.
	* gnat.dg/inline11_pkg.ad[sb]: New helper.
-- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 217119) +++ gcc-interface/utils.c (working copy) @@ -3027,18 +3027,6 @@ create_subprog_decl (tree subprog_name, TREE_TYPE (subprog_type)); DECL_ARGUMENTS (subprog_decl) = param_decl_list; - /* If this is a non-inline function nested inside an inlined external - function, we cannot honor both requests without cloning the nested - function in the current unit since it is private to the other unit. - We could inline the nested function as well but it's probably better - to err on the side of too little inlining. */ - if ((inline_status == is_suppressed || inline_status == is_disabled) - !public_flag - current_function_decl - DECL_DECLARED_INLINE_P (current_function_decl) - DECL_EXTERNAL (current_function_decl)) -DECL_DECLARED_INLINE_P (current_function_decl) = 0; - DECL_ARTIFICIAL (subprog_decl) = artificial_flag; DECL_EXTERNAL (subprog_decl) = extern_flag; Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 217119) +++ gcc-interface/trans.c (working copy) @@ -1481,6 +1481,49 @@ Pragma_to_gnu (Node_Id gnat_node) return gnu_result; } + +/* Check the inlining status of nested function FNDECL in the current context. + + If a non-inline nested function is referenced from an inline external + function, we cannot honor both requests at the same time without cloning + the nested function in the current unit since it is private to its unit. + We could inline it as well but it's probably better to err on the side + of too little inlining. + + This must be invoked only on nested functions present in the source code + and not on nested functions generated by the compiler, e.g. finalizers, + because they are not marked inline and we don't want them to block the + inlining of the parent function. 
*/ + +static void +check_inlining_for_nested_subprog (tree fndecl) +{ + if (!DECL_DECLARED_INLINE_P (fndecl) + current_function_decl + DECL_EXTERNAL (current_function_decl) + DECL_DECLARED_INLINE_P (current_function_decl)) +{ + const location_t loc1 = DECL_SOURCE_LOCATION (fndecl); + const location_t loc2 = DECL_SOURCE_LOCATION (current_function_decl); + + if (lookup_attribute (always_inline, + DECL_ATTRIBUTES (current_function_decl))) + { + error_at (loc1, subprogram %q+F not marked Inline_Always, fndecl); + error_at (loc2, parent subprogram cannot be inlined); + } + else + { + warning_at (loc1, OPT_Winline, subprogram %q+F not marked Inline, + fndecl); + warning_at (loc2, OPT_Winline, parent subprogram cannot be inlined); + } + + DECL_DECLARED_INLINE_P
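The decision made by check_inlining_for_nested_subprog can be modelled without any GCC internals. The sketch below is a simplified illustration only: `FnInfo`, `Diagnostic` and `check_nested_inlining_sketch` are hypothetical names, and the final step (clearing the parent's inline flag) is an assumption, since the quoted hunk is cut off at that point; it mirrors what the code removed from create_subprog_decl did.

```cpp
#include <cassert>

// Hypothetical stand-ins for the properties the check looks at.
struct FnInfo
{
  bool declared_inline;	// Models DECL_DECLARED_INLINE_P.
  bool external;	// Models DECL_EXTERNAL.
  bool always_inline;	// Models the always_inline attribute.
};

enum class Diagnostic { None, Warning, Error };

// NESTED is the nested function being referenced; CURRENT is the function
// being compiled (may be null at top level).  If a non-inline nested
// function is referenced from an inline external function, report an
// error when inlining was mandatory and a warning otherwise, then give up
// on inlining the parent (assumed behaviour, see the lead-in).
Diagnostic
check_nested_inlining_sketch (const FnInfo &nested, FnInfo *current)
{
  if (nested.declared_inline
      || !current
      || !current->external
      || !current->declared_inline)
    return Diagnostic::None;

  const Diagnostic result
    = current->always_inline ? Diagnostic::Error : Diagnostic::Warning;
  current->declared_inline = false;
  return result;
}
```

Erring on the side of too little inlining, as the comment in the patch puts it, means the parent loses its inline status instead of the nested function being cloned or inlined into the current unit.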
[Ada] Fix location information of exception block
This makes it so that an exception block doesn't inherit bogus location information in SJLJ mode.

Tested on x86_64-suse-linux, applied on the mainline.

2014-11-05  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/trans.c (Handled_Sequence_Of_Statements_to_gnu): Set
	the SLOC of the node on the call to set_jmpbuf_address_soft emitted
	on block entry with SJLJ.

-- 
Eric Botcazou

Index: gcc-interface/trans.c
===================================================================
--- gcc-interface/trans.c (revision 217151)
+++ gcc-interface/trans.c (working copy)
@@ -4629,9 +4629,13 @@ Handled_Sequence_Of_Statements_to_gnu (N
   start_stmt_group ();
 
   if (setjmp_longjmp)
-    add_stmt (build_call_n_expr (set_jmpbuf_decl, 1,
-				 build_unary_op (ADDR_EXPR, NULL_TREE,
-						 gnu_jmpbuf_decl)));
+    {
+      gnu_expr = build_call_n_expr (set_jmpbuf_decl, 1,
+				    build_unary_op (ADDR_EXPR, NULL_TREE,
+						    gnu_jmpbuf_decl));
+      set_expr_location_from_node (gnu_expr, gnat_node);
+      add_stmt (gnu_expr);
+    }
 
   if (Present (First_Real_Statement (gnat_node)))
     process_decls (Statements (gnat_node), Empty,
Re: [gomp4] Use GOMP_OFFLOAD_ prefix for (OpenACC) plugin hooks
Hi,

On 05 Nov 17:56, Julian Brown wrote:
> +GOMP_OFFLOAD_register_image (void *host_table, void *target_data)
> +GOMP_OFFLOAD_get_table (struct mapping_table **table)

FYI, these interfaces may change in the near future. Currently GOMP_OFFLOAD_get_table returns a joint table for all images offloaded to a device. But this doesn't work properly with offloading from dlopened libs. Do you plan to support such cases for PTX?

Perhaps it's worth replacing them with a function like GOMP_OFFLOAD_load_image, which will offload one image and return a target table for this image. In this case there is no need to pass host_table to the plugin and return a joint table, since libgomp will join the host and target tables itself.

Another question is what to do with multiple devices of the same type. Can they have different images? There are 2 options:

1. GOMP_OFFLOAD_load_image will offload one image to one device and receive a table from it; or
2. GOMP_OFFLOAD_register_image will register one image in the plugin for all devices of the same type, and GOMP_OFFLOAD_get_table will return a table for one image and for one device.

Multiple MICs can't have different images, but for generality we can use option #1.

-- Ilya
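To make option 1 above more concrete, here is a rough sketch of the division of labour it implies: the plugin loads one image onto one device and returns only a target table, and the caller (libgomp in the proposal) joins it with the host table. Everything in this sketch is hypothetical. The types, the function names `load_image_sketch` and `join_tables`, and the fake address layout are invented for illustration and are not the libgomp interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types, not taken from libgomp.
struct AddrRange { std::uintptr_t start, end; };
using Table = std::vector<AddrRange>;
struct Mapping { AddrRange host, target; };

// Stand-in for a per-image, per-device plugin hook.  A real plugin would
// upload the image; here the "device" just lays the objects out back to
// back from a made-up base address that depends on the device number.
Table
load_image_sketch (int device, const std::vector<std::size_t> &object_sizes)
{
  Table target;
  std::uintptr_t addr = 0x1000u * static_cast<std::uintptr_t> (device + 1);
  for (std::size_t size : object_sizes)
    {
      const std::uintptr_t end = addr + static_cast<std::uintptr_t> (size);
      target.push_back ({addr, end});
      addr = end;
    }
  return target;
}

// Done by the caller rather than the plugin: pair up host and target
// entries by position.  Returns an empty result if the tables disagree
// in length.
std::vector<Mapping>
join_tables (const Table &host, const Table &target)
{
  std::vector<Mapping> joined;
  if (host.size () != target.size ())
    return joined;
  for (std::size_t i = 0; i < host.size (); ++i)
    joined.push_back ({host[i], target[i]});
  return joined;
}
```

Because each call handles a single image on a single device, images coming from dlopened libraries can be loaded and joined one at a time, which is the case the current joint table is said not to handle.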
[Ada] Fix ICE on type derived from private discriminated type
The compiler aborts on a record type derived from a private discriminated record type without discriminant constraints, if the private discriminated record type is itself derived from another discriminated record type.

Tested on x86_64-suse-linux, applied on the mainline.

2014-11-05  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Record_Type>: For a
	derived untagged type that renames discriminants, be prepared for
	a type derived from a private discriminated type when changing the
	type of the stored discriminants.

2014-11-05  Eric Botcazou  ebotca...@adacore.com

	* gnat.dg/specs/private2.ads: New test.
	* gnat.dg/specs/private2_pkg.ads: New helper.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c (revision 217119)
+++ gcc-interface/decl.c (working copy)
@@ -3056,7 +3056,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 		 gnat_field = Next_Stored_Discriminant (gnat_field))
 	      if (Present (Corresponding_Discriminant (gnat_field)))
 		{
-		  Entity_Id field = Empty;
+		  Entity_Id field;
 		  for (field = First_Stored_Discriminant (gnat_parent);
 		       Present (field);
 		       field = Next_Stored_Discriminant (field))
@@ -3138,8 +3138,30 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 		    && Ekind (Entity (Node (gnat_constr))) == E_Discriminant)
 		  {
 		    Entity_Id gnat_discr = Entity (Node (gnat_constr));
-		    tree gnu_discr_type = gnat_to_gnu_type (Etype (gnat_discr));
-		    tree gnu_ref
+		    tree gnu_discr_type, gnu_ref;
+
+		    /* If the scope of the discriminant is not the record type,
+		       this means that we're processing the implicit full view
+		       of a type derived from a private discriminated type: in
+		       this case, the Stored_Constraint list is simply copied
+		       from the partial view, see Build_Derived_Private_Type.
+		       So we need to retrieve the corresponding discriminant
+		       of the implicit full view, otherwise we will abort.  */
+		    if (Scope (gnat_discr) != gnat_entity)
+		      {
+			Entity_Id field;
+			for (field = First_Entity (gnat_entity);
+			     Present (field);
+			     field = Next_Entity (field))
+			  if (Ekind (field) == E_Discriminant
+			      && same_discriminant_p (gnat_discr, field))
+			    break;
+			gcc_assert (Present (field));
+			gnat_discr = field;
+		      }
+
+		    gnu_discr_type = gnat_to_gnu_type (Etype (gnat_discr));
+		    gnu_ref
 		      = gnat_to_gnu_entity
 			(Original_Record_Component (gnat_discr),
 			 NULL_TREE, 0);

-- { dg-do compile }

with Private2_Pkg; use Private2_Pkg;

package Private2 is
   type R is new Rec2;
end Private2;

package Private2_Pkg is
   type Rec2 (D : Natural) is private;
private
   type Rec1 (D : Natural) is null record;
   type Rec2 (D : Natural) is new Rec1 (D);
end Private2_Pkg;
[Ada] Fix crash on function with In-Out param returning discriminated type
The compiler crashes on a function with an In-Out parameter which returns a discriminated record type with default discriminant. Tested on x86_64-suse-linux, applied on the mainline. 2014-11-05 Eric Botcazou ebotca...@adacore.com * gcc-interface/trans.c (Subprogram_Body_to_gnu): For a function with copy-in/copy-out parameters and which returns by invisible reference, do not create the variable for the return value; instead, manually generate the indirect copy out statements on exit. (gnat_to_gnu) N_Simple_Return_Statement: Adjust accordingly and build a simple indirect assignment for the return value. 2014-11-05 Eric Botcazou ebotca...@adacore.com * gnat.dg/discr42.adb: New test. * gnat.dg/discr42_pkg.ad[sb]: New helper. -- Eric Botcazou-- { dg-do run } with Discr42_Pkg; use Discr42_Pkg; procedure Discr42 is R : Rec; Pos : Natural := 1; begin R := F (Pos); if Pos /= 2 then raise Program_Error; end if; if R /= (D = True, N = 4) then raise Program_Error; end if; end;package body Discr42_Pkg is function F (Pos : in out Natural) return Rec is begin Pos := Pos + 1; if Pos 1 then return (D = True, N = Pos * 2); else return (D = False); end if; end; end Discr42_Pkg;Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 217152) +++ gcc-interface/trans.c (working copy) @@ -3547,13 +3547,12 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod gnu_result_decl = DECL_RESULT (gnu_subprog_decl); gnu_subprog_type = TREE_TYPE (gnu_subprog_decl); gnu_cico_list = TYPE_CI_CO_LIST (gnu_subprog_type); - if (gnu_cico_list) -gnu_return_var_elmt = value_member (void_type_node, gnu_cico_list); + if (gnu_cico_list TREE_VALUE (gnu_cico_list) == void_type_node) +gnu_return_var_elmt = gnu_cico_list; /* If the function returns by invisible reference, make it explicit in the - function body. See gnat_to_gnu_entity, E_Subprogram_Type case. - Handle the explicit case here and the copy-in/copy-out case below. */ - if (TREE_ADDRESSABLE (gnu_subprog_type) !gnu_return_var_elmt) + function body. 
See gnat_to_gnu_entity, E_Subprogram_Type case. */ + if (TREE_ADDRESSABLE (gnu_subprog_type)) { TREE_TYPE (gnu_result_decl) = build_reference_type (TREE_TYPE (gnu_result_decl)); @@ -3573,9 +3572,10 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod begin_subprog_body (gnu_subprog_decl); - /* If there are In Out or Out parameters, we need to ensure that the return - statement properly copies them out. We do this by making a new block and - converting any return into a goto to a label at the end of the block. */ + /* If there are copy-in/copy-out parameters, we need to ensure that they are + properly copied out by the return statement. We do this by making a new + block and converting any return into a goto to a label at the end of the + block. */ if (gnu_cico_list) { tree gnu_return_var = NULL_TREE; @@ -3586,19 +3586,14 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod start_stmt_group (); gnat_pushlevel (); - /* If this is a function with In Out or Out parameters, we also need a - variable for the return value to be placed. */ - if (gnu_return_var_elmt) + /* If this is a function with copy-in/copy-out parameters and which does + not return by invisible reference, we also need a variable for the + return value to be placed. */ + if (gnu_return_var_elmt !TREE_ADDRESSABLE (gnu_subprog_type)) { tree gnu_return_type = TREE_TYPE (TREE_PURPOSE (gnu_return_var_elmt)); - /* If the function returns by invisible reference, make it - explicit in the function body. See gnat_to_gnu_entity, - E_Subprogram_Type case. */ - if (TREE_ADDRESSABLE (gnu_subprog_type)) - gnu_return_type = build_reference_type (gnu_return_type); - gnu_return_var = create_var_decl (get_identifier (RETVAL), NULL_TREE, gnu_return_type, NULL_TREE, false, false, @@ -3693,7 +3688,8 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod the label and copy statement. 
*/ if (gnu_cico_list) { - tree gnu_retval; + const Node_Id gnat_end_label + = End_Label (Handled_Statement_Sequence (gnat_node)); gnu_return_var_stack-pop (); @@ -3701,14 +3697,45 @@ Subprogram_Body_to_gnu (Node_Id gnat_nod add_stmt (build1 (LABEL_EXPR, void_type_node, gnu_return_label_stack-last ())); - if (list_length (gnu_cico_list) == 1) - gnu_retval = TREE_VALUE (gnu_cico_list); + /* If this is a function which returns by invisible reference, the + return value has already been dealt with at the return statements, + so we only need to indirectly copy out the parameters. */ + if (TREE_ADDRESSABLE (gnu_subprog_type)) + { + tree gnu_ret_deref + = build_unary_op
[jit] Add comments throughout libgccjit.c, and in libgccjit.h
Committed to branch dmalcolm/jit: Also, add checking to ensure that gcc_jit_context_new_array_type fails with an error if given a negative size. gcc/jit/ChangeLog.jit: * docs/topics/expressions.rst (Type-coercion): Casts between pointer types are valid. * libgccjit.c: Document that gcc_jit_context et al are actually subclasses of the gcc::jit::recording classes. (RETURN_VAL_IF_FAIL): Add top-level descriptive comment. (RETURN_IF_NOT_VALID_BLOCK): Likewise. (RETURN_NULL_IF_NOT_VALID_BLOCK): Likewise. (jit_error): Likewise. (compatible_types): Likewise. (gcc_jit_context_acquire): Likewise. (gcc_jit_context_release): Likewise. (gcc_jit_context_new_child_context): Likewise. (gcc_jit_context_new_location): Likewise. (gcc_jit_location_as_object): Likewise. (gcc_jit_type_as_object): Likewise. (gcc_jit_context_get_type): Likewise. (gcc_jit_context_get_int_type): Likewise. (gcc_jit_type_get_pointer): Likewise. (gcc_jit_type_get_const): Likewise. (gcc_jit_type_get_volatile): Likewise. (gcc_jit_context_new_array_type): Likewise. Also document that LOC can be NULL. Fail with an error on negative size. (gcc_jit_context_new_field): Add top-level descriptive comment and document that LOC can be NULL. (gcc_jit_field_as_object): Add top-level descriptive comment. (gcc_jit_context_new_struct_type): Likewise. Also document that LOC can be NULL. (gcc_jit_context_new_opaque_struct): Likewise. (gcc_jit_struct_as_type): Add top-level descriptive comment. (gcc_jit_struct_set_fields): Likewise. Also document that LOC can be NULL. (gcc_jit_context_new_union_type): Likewise. (gcc_jit_context_new_function_ptr_type): Likewise. (gcc_jit_context_new_param): Likewise. (gcc_jit_param_as_object): Add top-level descriptive comment. (gcc_jit_param_as_lvalue): Likewise. (gcc_jit_param_as_rvalue): Likewise. (gcc_jit_context_new_function): Likewise. Also document that LOC can be NULL. (gcc_jit_context_get_builtin_function): Add top-level descriptive comment. (gcc_jit_function_as_object): Likewise. 
(gcc_jit_function_get_param): Likewise. (gcc_jit_function_dump_to_dot): Likewise. (gcc_jit_function_new_block): Likewise. (gcc_jit_block_as_object): Likewise. (gcc_jit_block_get_function): Likewise. (gcc_jit_context_new_global): Likewise. Also document that LOC can be NULL. (gcc_jit_lvalue_as_object): Add top-level descriptive comment. (gcc_jit_lvalue_as_rvalue): Likewise. (gcc_jit_rvalue_as_object): Likewise. (gcc_jit_rvalue_get_type): Likewise. (RETURN_NULL_IF_FAIL_NONNULL_NUMERIC_TYPE): Likewise. (gcc_jit_context_new_rvalue_from_int): Likewise. (gcc_jit_context_zero): Likewise. (gcc_jit_context_one): Likewise. (gcc_jit_context_new_rvalue_from_double): Likewise. (gcc_jit_context_new_rvalue_from_ptr): Likewise. (gcc_jit_context_null): Likewise. (gcc_jit_context_new_string_literal): Likewise. (gcc_jit_context_new_unary_op): Likewise. Also document that LOC can be NULL. (gcc_jit_context_new_binary_op): Likewise. (gcc_jit_context_new_comparison): Likewise. (gcc_jit_context_new_call): Likewise. (gcc_jit_context_new_call_through_ptr): Likewise. (is_valid_cast): Add top-level descriptive comment. (gcc_jit_context_new_cast): Likewise. Also document that LOC can be NULL. (gcc_jit_context_new_array_access): Likewise. (gcc_jit_object_get_context): Add top-level descriptive comment. (gcc_jit_object_get_debug_string): Likewise. (gcc_jit_lvalue_access_field): Likewise. Also document that LOC can be NULL. (gcc_jit_rvalue_access_field): Likewise. (gcc_jit_rvalue_dereference_field): Likewise. (gcc_jit_rvalue_dereference): Likewise. (gcc_jit_lvalue_get_address): Likewise. (gcc_jit_function_new_local): Likewise. (gcc_jit_block_add_eval): Likewise. (gcc_jit_block_add_assignment): Likewise. (gcc_jit_block_add_assignment_op): Likewise. (is_bool): Add top-level descriptive comment. (gcc_jit_block_end_with_conditional): Likewise. Also document that LOC can be NULL. (gcc_jit_block_add_comment): Likewise. (gcc_jit_block_end_with_jump): Likewise. (gcc_jit_block_end_with_return): Likewise. 
(gcc_jit_block_end_with_void_return): Likewise. (gcc_jit_context_set_str_option): Add top-level descriptive comment. (gcc_jit_context_set_int_option): Likewise. (gcc_jit_context_set_bool_option): Likewise. (gcc_jit_context_compile):
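As an illustration of the "fail with an error on negative size" check mentioned in the ChangeLog above, here is a minimal sketch in plain C. Everything in it is hypothetical: `struct ctxt`, `struct array_type`, `CHECK_OR_RETURN_NULL` and `new_array_type` are simplified stand-ins invented for this example, not the actual libgccjit types, macros or entrypoints. It only shows the general shape of validating an argument, recording an error on the context, and returning NULL.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only; the real
   libgccjit context and type classes are considerably richer.  */
struct ctxt
{
  const char *first_error;	/* NULL until an error has been recorded.  */
};

struct array_type
{
  int num_elements;
};

/* Hypothetical checking macro: if TEST fails, record MSG on CTXT
   (keeping only the first error) and return NULL from the caller.  */
#define CHECK_OR_RETURN_NULL(TEST, CTXT, MSG)		\
  do							\
    {							\
      if (!(TEST))					\
        {						\
          if ((CTXT)->first_error == NULL)		\
            (CTXT)->first_error = (MSG);		\
          return NULL;					\
        }						\
    }							\
  while (0)

/* Fill in STORAGE as an array type with NUM_ELEMENTS elements,
   rejecting a NULL STORAGE and a negative size.  */
static struct array_type *
new_array_type (struct ctxt *c, struct array_type *storage, int num_elements)
{
  if (c == NULL)
    return NULL;
  CHECK_OR_RETURN_NULL (storage != NULL, c, "NULL storage");
  CHECK_OR_RETURN_NULL (num_elements >= 0, c, "negative size");
  storage->num_elements = num_elements;
  return storage;
}
```

With this shape, a caller passing -1 gets NULL back and can inspect the context for the recorded message, while a valid size leaves the context's error untouched.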
Re: [PATCH 10/27] New file: gcc/jit/libgccjit.c
On Tue, 2014-11-04 at 14:39 -0700, Jeff Law wrote: On 11/04/14 09:57, David Malcolm wrote:

+#define IS_ASCII_DIGIT(CHAR) \
+  ((CHAR) >= '0' && (CHAR) <= '9')
+
+#define IS_ASCII_ALNUM(CHAR) \
+  (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR))

Can't we rely on the C library to give us equivalents?

I've been burned in the past by the C library using locales, in particular the two lowercase i variants in Turkish. These macros are used by gcc_jit_context_new_function to enforce C's naming restrictions, to avoid errors from the assembler. The comment I put there was:

/* The assembler can only handle certain names, so for now, enforce C's rules for identifiers upon the name. Eventually we'll need some way to interact with e.g. C++ name mangling. */

Am I right in thinking that for the assembler we need to enforce the C naming rules specifically on *ASCII*? (clearly another comment is needed here).

I guess you've got to do it somewhere. Presumably there isn't something already in GCC that enforces an input character set? I guess I just dislike seeing something that feels like it ought to already be available.

Presumably by marking it with __attribute__((cold)) ? (with a suitable macro in case we're not being compiled with a gcc that supports it).

Yup. That's precisely what you want since that gives the predictors enough information to mark paths as unlikely without having to mark each path yourself.

Sorry. I'll post a followup with comments added.

Thanks. I probably rely more on those for this kind of review than anything, so the lack of them really stood out.

Many of the functions are public API entrypoints, where there's a comment in the public header. Should I simply duplicate the comment from there into the .c file, or put a comment like:

Good question. Normally in the past we'd have you duplicate the comment, but with this new usage scenario that may not make a lot of sense since one or the other will likely get out of sync at some point.
At this point a snarky comment about generating documentation and the interface from a single definition would be appropriate.

/* Public entrypoint. See description in libgccjit.h. */

for each of these? (perhaps with additional text giving implementation notes?)

Let's go with this. If folks want the comment duplicated, they can argue for it after the fact :-)

Thanks for all the reviews. Looks like this and patch 16 are now the only non-approved parts of the kit (I didn't see a review of 16).

Right. I didn't get to #16 yesterday.

Thanks. I've added comments throughout the file. I didn't bother adding __attribute__((cold)), instead simply dropping that TODO. Attached is the current state of the file gcc/jit/libgccjit.c (on the branch) for review. OK for trunk? (conditional on all the rest being approved, and usual bootstrap and regrtesting; I've merely verified a non-bootstrap compile and successful make check-jit so far). There were a few other changes relative to what you've approved, which I'll post for review shortly.

Dave

/* Implementation of the C API; all wrappers into the internal C++ API
   Copyright (C) 2013-2014 Free Software Foundation, Inc.
   Contributed by David Malcolm dmalc...@redhat.com.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.

GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.
*/

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "opts.h"
#include "safe-ctype.h"

#include "libgccjit.h"
#include "jit-common.h"
#include "jit-recording.h"

/* The opaque types used by the public API are actually subclasses
   of the gcc::jit::recording classes.  */

struct gcc_jit_context : public gcc::jit::recording::context
{
  gcc_jit_context (gcc_jit_context *parent_ctxt) :
    context (parent_ctxt)
  {}
};

struct gcc_jit_result : public gcc::jit::result
{
};

struct gcc_jit_object : public gcc::jit::recording::memento
{
};

struct gcc_jit_location : public gcc::jit::recording::location
{
};

struct gcc_jit_type : public gcc::jit::recording::type
{
};

struct gcc_jit_struct : public gcc::jit::recording::struct_
{
};

struct gcc_jit_field : public gcc::jit::recording::field
{
};

struct gcc_jit_function : public gcc::jit::recording::function
{
};

struct gcc_jit_block : public gcc::jit::recording::block
{
};

struct gcc_jit_rvalue : public gcc::jit::recording::rvalue
{
};

struct gcc_jit_lvalue :
Re: [jit] Use ISALPHA and ISALNUM rather than writing our own
On 11/05/14 08:48, David Malcolm wrote: On Tue, 2014-11-04 at 14:39 -0700, Jeff Law wrote: On 11/04/14 09:57, David Malcolm wrote:

+#define IS_ASCII_DIGIT(CHAR) \
+  ((CHAR) >= '0' && (CHAR) <= '9')
+
+#define IS_ASCII_ALNUM(CHAR) \
+  (IS_ASCII_ALPHA (CHAR) || IS_ASCII_DIGIT (CHAR))

Can't we rely on the C library to give us equivalents?

I've been burned in the past by the C library using locales, in particular the two lowercase i variants in Turkish. These macros are used by gcc_jit_context_new_function to enforce C's naming restrictions, to avoid errors from the assembler. The comment I put there was:

/* The assembler can only handle certain names, so for now, enforce C's rules for identifiers upon the name. Eventually we'll need some way to interact with e.g. C++ name mangling. */

Am I right in thinking that for the assembler we need to enforce the C naming rules specifically on *ASCII*? (clearly another comment is needed here).

I guess you've got to do it somewhere. Presumably there isn't something already in GCC that enforces an input character set? I guess I just dislike seeing something that feels like it ought to already be available.

It turns out that locale-independent tests for this did already exist in libiberty, in safe-ctype.h, so I've committed this to the jit branch:

gcc/jit/ChangeLog.jit:
	* libgccjit.c: Include safe-ctype.h from libiberty.
	(IS_ASCII_ALPHA): Delete.
	(IS_ASCII_DIGIT): Delete.
	(IS_ASCII_ALNUM): Delete.
	(gcc_jit_context_new_function): Replace use of IS_ASCII_ALPHA and
	IS_ASCII_ALNUM with ISALPHA and ISALNUM respectively, from
	libiberty.

Excellent. Thanks for the cleanup.

Jeff
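For readers following the locale concern in this thread, the idea of a locale-independent identifier check can be sketched in standalone C as below. This is not the libgccjit code and does not use libiberty's ISALPHA/ISALNUM; `ascii_is_alpha`, `ascii_is_digit` and `is_valid_c_identifier` are hypothetical names made up for the example, and the range comparisons assume an ASCII-compatible execution character set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers (not the libiberty macros): ASCII-only tests that
   never consult the current locale.  Assumes an ASCII-compatible
   execution character set, where 'a'..'z' and 'A'..'Z' are contiguous.  */
static bool
ascii_is_alpha (char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool
ascii_is_digit (char c)
{
  return c >= '0' && c <= '9';
}

/* Return true if NAME is a non-empty identifier of the form
   [A-Za-z_][A-Za-z0-9_]*, i.e. C's basic rules restricted to ASCII.  */
static bool
is_valid_c_identifier (const char *name)
{
  if (name == NULL || name[0] == '\0')
    return false;

  /* The first character must be a letter or underscore.  */
  if (!ascii_is_alpha (name[0]) && name[0] != '_')
    return false;

  /* Every later character must be a letter, digit or underscore.  */
  for (const char *p = name + 1; *p != '\0'; p++)
    if (!ascii_is_alpha (*p) && !ascii_is_digit (*p) && *p != '_')
      return false;

  return true;
}
```

Because the helpers compare against fixed character ranges, bytes outside ASCII (for example the UTF-8 encoding of the Turkish dotless i) are rejected regardless of what setlocale has been called with, which is the property the thread is after.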
Re: The nvptx port [8/11+] Write undefined decls.
On 11/05/14 05:01, Bernd Schmidt wrote: On 10/22/2014 08:11 PM, Jeff Law wrote: I'm not going to insist you do this in the same way as the PA. That was a different era -- we had significant motivation to make things work in such a way that everything could be buried in the pa specific files. That sometimes led to less than optimal approaches to fix certain problems. So... is this patch approved? Yes, sorry for not being explicit. Jeff
[jit] Drop the disabled debugging code within handle_locations
On Tue, 2014-11-04 at 15:21 -0700, Jeff Law wrote: On 10/31/14 11:02, David Malcolm wrote:

This file implements the gcc::jit::playback internal API, called by the dummy frontend to replay the public API calls made to the library. A thin wrapper around trees.

gcc/jit/
	* jit-playback.c: New.

+  /* line_table should now be populated; every playback::location should
+     now have an m_srcloc.  */
+
+  if (0)
+    line_table_dump (stderr,
+                     line_table,
+                     LINEMAPS_ORDINARY_USED (line_table),
+                     LINEMAPS_MACRO_USED (line_table));
+
+  /* Now assign them to tree nodes as appropriate.  */
+  std::pair<tree, location *> *cached_location;
+
+  FOR_EACH_VEC_ELT (m_cached_locations, i, cached_location)
+    {
+      tree t = cached_location->first;
+      source_location srcloc = cached_location->second->m_srcloc;
+#if 0
+      inform (srcloc, "location of ");
+      debug_tree (t);
+#endif

Put the if () #if0 under some kind of debugging control or remove them. Similarly for later instances. With that change, this is good for the trunk.

Thanks. I removed them in the following commit (on the branch):

gcc/jit/ChangeLog.jit:
	* jit-playback.c (gcc::jit::playback::context::handle_locations):
	Drop the disabled debugging code.
---
 gcc/jit/ChangeLog.jit  |  5 +++++
 gcc/jit/jit-playback.c | 16 ----------------
 2 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 8de64ac..6212b53 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,10 @@
 2014-11-05 David Malcolm dmalc...@redhat.com
 
+	* jit-playback.c (gcc::jit::playback::context::handle_locations):
+	Drop the disabled debugging code.
+
+2014-11-05 David Malcolm dmalc...@redhat.com
+
 	* docs/topics/expressions.rst (Type-coercion): Casts between
 	pointer types are valid.
 	* libgccjit.c: Document that gcc_jit_context et al are actually
diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index dc1b468..1dbb778 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -1858,12 +1858,6 @@ handle_locations ()
   /* line_table should now be populated; every playback::location should
      now have an m_srcloc.  */
 
-  if (0)
-    line_table_dump (stderr,
-                     line_table,
-                     LINEMAPS_ORDINARY_USED (line_table),
-                     LINEMAPS_MACRO_USED (line_table));
-
   /* Now assign them to tree nodes as appropriate.  */
   std::pair<tree, location *> *cached_location;
 
@@ -1871,10 +1865,6 @@ handle_locations ()
     {
       tree t = cached_location->first;
       source_location srcloc = cached_location->second->m_srcloc;
-#if 0
-      inform (srcloc, "location of ");
-      debug_tree (t);
-#endif
 
       /* This covers expressions: */
       if (CAN_HAVE_LOCATION_P (t))
@@ -1884,12 +1874,6 @@ handle_locations ()
       else
         {
           /* Don't know how to set location on this node.  */
-          if (0)
-            {
-              fprintf (stderr, "can't set location on:");
-              debug_tree (t);
-              fprintf (stderr, "\n");
-            }
         }
     }
 }
-- 
1.7.11.7
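For comparison, the other option raised in the review (keeping the dumps but putting them "under some kind of debugging control") might look roughly like the sketch below. It is only an illustration in plain C: the flag `debug_locations` and the function `maybe_dump_location` are hypothetical names invented here, and the fprintf call merely stands in for dumps such as line_table_dump or debug_tree. The commit above chose to delete the code instead.

```c
#include <stdio.h>

/* Hypothetical runtime switch; a real implementation might instead hang
   this off a context option or a command-line flag.  Off by default.  */
static int debug_locations = 0;

/* Stand-in for a debugging dump.  Writes one line describing WHAT at
   LINE to OUT when the switch is on.  Returns 1 if something was
   printed, 0 otherwise.  */
static int
maybe_dump_location (FILE *out, const char *what, int line)
{
  if (!debug_locations)
    return 0;
  fprintf (out, "location of %s: line %d\n", what, line);
  return 1;
}
```

Compared with `if (0)` or `#if 0`, the code stays compiled and can be enabled without editing the source, at the cost of carrying a flag that nothing may ever set.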
Re: Re: [PATCH] Add missing requirement to crossmodule-indircall-1a.c
Jeff Law l...@redhat.com: On 10/23/14 08:30, jb...@gmx.de wrote: Jeff Law l...@redhat.com: On 10/21/14 12:21, jb...@gmx.de wrote: Jeff Law l...@redhat.com: On 10/21/14 16:13, Haswell wrote:

The additional source must have the same requirement crossmodule-indircall-1.c has.

	* crossmodule-indircall-1a.c: Add missing requirement.

Why? When used by crossmodule-indircall-1.c we'll have already tested the marker and when used by itself, it does nothing. So I don't see why you think a marker is needed for this source file.

When configuring --disable-lto it gets compiled twice:

FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE
UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-generate -D_PROFILE_GENERATE
UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-use -D_PROFILE_USE

I'd recommend looking deeper. I believe that file should be collapsing down to main () { return 0; } when LTO is not enabled.

I'm not a dejagnu expert, but this is what happens:

/tmp/build/gcc/xgcc -B/tmp/build/gcc/ /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fprofile-generate -D_PROFILE_GENERATE -lm -o /tmp/build/gcc/testsuite/gcc/crossmodule-indircall-1a.x01
/tmp/cc4rrWCn.o: In function `main':
crossmodule-indircall-1a.c:(.text+0x0): multiple definition of `main'
/tmp/ccgMlXGi.o:crossmodule-indircall-1a.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
compiler exited with status 1

Thanks. What's weird here is the source file is listed twice on the command line! No wonder it's failing. I can't typically decipher tcl code without trace info and some send_user commands to see what the values of various things are. [...]

Though I have no idea how that's expected to work in an LTO enabled compile.

With LTO enabled it runs just fine (which is the reason for the patch I suggested):

spawn /tmp/build/gcc/xgcc -B/tmp/build/gcc/ /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never -fprofile-generate -D_PROFILE_GENERATE -lm -o /tmp/build/gcc/testsuite/gcc/crossmodule-indircall-1a.x01
PASS: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE
PASS: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-generate -D_PROFILE_GENERATE
spawn /tmp/build/gcc/xgcc -B/tmp/build/gcc/ /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never -fprofile-use -D_PROFILE_USE -lm -o /tmp/build/gcc/testsuite/gcc/crossmodule-indircall-1a.x02
PASS: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-use -D_PROFILE_USE
PASS: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-use -D_PROFILE_USE