RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being run for the n32 and 64 ABIs
> This is OK now. Committed as SVN 232980. Regards, Andrew
[PATCH] Fix wide_int unsigned division (PR tree-optimization/69546)
Hi! As the testcase shows, wide_int unsigned division is broken for > 64bit precision division of unsigned dividend which have 63rd bit set, and all higher bits cleared (thus is normalized as 2 HWIs, first with MSB set, the second 0) and divisor of 1, we return just a single HWI, which is equivalent to all higher bits set too. If the divisor is > 1, there is no such problem, as the MSB will not be set after the division. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-01-29 Jakub JelinekPR tree-optimization/69546 * wide-int.cc (wi::divmod_internal): For unsigned division where both operands fit into uhwi, if o1 is 1 and o0 has msb set, if divident_prec is larger than bits per hwi, clear another quotient word and return 2 instead of 1. * gcc.dg/torture/pr69546.c: New test. --- gcc/wide-int.cc.jj 2016-01-26 11:46:39.0 +0100 +++ gcc/wide-int.cc 2016-01-29 11:59:33.348852003 +0100 @@ -1788,15 +1788,25 @@ wi::divmod_internal (HOST_WIDE_INT *quot { unsigned HOST_WIDE_INT o0 = dividend.to_uhwi (); unsigned HOST_WIDE_INT o1 = divisor.to_uhwi (); + unsigned int quotient_len = 1; if (quotient) - quotient[0] = o0 / o1; + { + quotient[0] = o0 / o1; + if (o1 == 1 + && (HOST_WIDE_INT) o0 < 0 + && dividend_prec > HOST_BITS_PER_WIDE_INT) + { + quotient[1] = 0; + quotient_len = 2; + } + } if (remainder) { remainder[0] = o0 % o1; *remainder_len = 1; } - return 1; + return quotient_len; } /* Make the divisor and dividend positive and remember what we --- gcc/testsuite/gcc.dg/torture/pr69546.c.jj 2016-01-29 12:06:03.148516651 +0100 +++ gcc/testsuite/gcc.dg/torture/pr69546.c 2016-01-29 12:08:17.847672967 +0100 @@ -0,0 +1,26 @@ +/* PR tree-optimization/69546 */ +/* { dg-do run { target int128 } } */ + +unsigned __int128 __attribute__ ((noinline, noclone)) +foo (unsigned long long x) +{ + unsigned __int128 y = ~0ULL; + x >>= 63; + return y / (x | 1); +} + +unsigned __int128 __attribute__ ((noinline, noclone)) +bar (unsigned long long x) +{ + unsigned __int128 y = ~33ULL; + x >>= 63; + return y / (x | 1); +} + +int +main () +{ + if (foo (1) != ~0ULL || bar (17) != ~33ULL) +__builtin_abort (); + return 0; +} Jakub
[PATCH][AArch64] PR target/69161: Don't use special predicate for CCmode comparisons in expressions that require matching modes
Hi all, In this PR we ICE during combine when trying to propagate a comparison into a vec_duplicate, that is we end up creating the rtx: (vec_duplicate:V4SI (eq:CC_NZ (reg:CC_NZ 66 cc) (const_int 0 [0]))) The documentation for vec_duplicate says: "The output vector mode must have the same submodes as the input vector mode or the scalar modes" So this is invalid RTL, which triggers an assert in simplify-rtx to that effect. It has been suggested on the PR that this is because we use a special_predicate for aarch64_comparison_operator which means that it ignores the mode when matching. This is fine when used in RTXes that don't need it, like if_then_else expressions but can cause trouble when used in places where the modes do matter, like in SET operations. In this particular ICE the cause was the conditional store patterns that could end up matching an intermediate rtx during combine of (set (reg:SI) (eq:CC_NZ x y)). The suggested solution is to define a separate predicate with the same conditions as aarch64_comparison_operator but make it not special, so it gets automatic mode checks to prevent such a situation. This patch does that. Bootstrapped and tested on aarch64-linux-gnu. SPEC2006 codegen did not change with this patch, so there shouldn't be any code quality regressions. Ok for trunk? Thanks, Kyrill 2016-01-29 Kyrylo TkachovPR target/69161 * config/aarch64/predicates.md (aarch64_comparison_operator_mode): New predicate. (aarch64_comparison_operator): Break overly long line into two. (aarch64_comparison_operation): Likewise. * config/aarch64/aarch64.md (cstorecc4): Use aarch64_comparison_operator_mode instead of aarch64_comparison_operator. (cstore4): Likewise. (aarch64_cstore): Likewise. (*cstoresi_insn_uxtw): Likewise. (cstore_neg): Likewise. (*cstoresi_neg_uxtw): Likewise. 2016-01-29 Kyrylo Tkachov PR target/69161 * gcc.c-torture/compile/pr69161.c: New test. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f900c12cfb4d108fd4c1671c75b465966befee06..46ca6588c93d793668808fa2e3accaa038ea71d4 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2957,7 +2957,7 @@ (define_expand "cstore4" (define_expand "cstorecc4" [(set (match_operand:SI 0 "register_operand") - (match_operator 1 "aarch64_comparison_operator" + (match_operator 1 "aarch64_comparison_operator_mode" [(match_operand 2 "cc_register") (match_operand 3 "const0_operand")]))] "" @@ -2969,7 +2969,7 @@ (define_expand "cstorecc4" (define_expand "cstore4" [(set (match_operand:SI 0 "register_operand" "") - (match_operator:SI 1 "aarch64_comparison_operator" + (match_operator:SI 1 "aarch64_comparison_operator_mode" [(match_operand:GPF 2 "register_operand" "") (match_operand:GPF 3 "aarch64_fp_compare_operand" "")]))] "" @@ -2982,7 +2982,7 @@ (define_expand "cstore4" (define_insn "aarch64_cstore" [(set (match_operand:ALLI 0 "register_operand" "=r") - (match_operator:ALLI 1 "aarch64_comparison_operator" + (match_operator:ALLI 1 "aarch64_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)]))] "" "cset\\t%0, %m1" @@ -3027,7 +3027,7 @@ (define_insn_and_split "*compare_cstore_insn" (define_insn "*cstoresi_insn_uxtw" [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI - (match_operator:SI 1 "aarch64_comparison_operator" + (match_operator:SI 1 "aarch64_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)])))] "" "cset\\t%w0, %m1" @@ -3036,7 +3036,7 @@ (define_insn "*cstoresi_insn_uxtw" (define_insn "cstore_neg" [(set (match_operand:ALLI 0 "register_operand" "=r") - (neg:ALLI (match_operator:ALLI 1 "aarch64_comparison_operator" + (neg:ALLI (match_operator:ALLI 1 "aarch64_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)])))] "" "csetm\\t%0, %m1" @@ -3047,7 +3047,7 @@ (define_insn "cstore_neg" (define_insn "*cstoresi_neg_uxtw" [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI - (neg:SI (match_operator:SI 1 "aarch64_comparison_operator" + (neg:SI (match_operator:SI 1 "aarch64_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)]] "" "csetm\\t%w0, %m1" diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index e96dc000bea8470daa187dfd7c44e9c9993dbb0f..04cb695b4f038c9a9be5470598939e1ad20f36f4 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -229,10 +229,17 @@ (define_predicate "aarch64_reg_or_imm" ;; True for integer comparisons and for FP comparisons other than LTGT or UNEQ. (define_special_predicate "aarch64_comparison_operator" - (match_code "eq,ne,le,lt,ge,gt,geu,gtu,leu,ltu,unordered,ordered,unlt,unle,unge,ungt")) + (match_code
[committed] Fix SSE1 V4SImode vector insert (PR target/69551)
Hi! The following patch fixes a bug in the V4SImode ix86_expand_vector_set SSE1 handling, before we recurse, we need to copy the original target to the temporary, otherwise we set just the single element and leave the rest of the elements uninitialized. Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by Uros in the PR, committed to trunk and 5.x so far. 2016-01-29 Jakub JelinekPR target/69551 * config/i386/i386.c (ix86_expand_vector_set) : For SSE1, copy target into the temporary reg first before recursing on it. * gcc.target/i386/pr69551.c: New test. --- gcc/config/i386/i386.c.jj 2016-01-28 15:07:25.0 +0100 +++ gcc/config/i386/i386.c 2016-01-29 13:02:32.100788474 +0100 @@ -46744,6 +46744,7 @@ ix86_expand_vector_set (bool mmx_ok, rtx { /* For SSE1, we have to reuse the V4SF code. */ rtx t = gen_reg_rtx (V4SFmode); + emit_move_insn (t, gen_lowpart (V4SFmode, target)); ix86_expand_vector_set (false, t, gen_lowpart (SFmode, val), elt); emit_move_insn (target, gen_lowpart (mode, t)); } --- gcc/testsuite/gcc.target/i386/pr69551.c.jj 2016-01-29 13:10:46.338993771 +0100 +++ gcc/testsuite/gcc.target/i386/pr69551.c 2016-01-29 13:09:49.0 +0100 @@ -0,0 +1,23 @@ +/* PR target/69551 */ +/* { dg-do run { target sse_runtime } } */ +/* { dg-options "-O2 -mno-sse2 -msse" } */ + +typedef unsigned char v16qi __attribute__ ((vector_size (16))); +typedef unsigned int v4si __attribute__ ((vector_size (16))); + +char __attribute__ ((noinline, noclone)) +test (v4si vec) +{ + vec[1] = 0x5fb856; + return ((v16qi) vec)[0]; +} + +int +main () +{ + char z = test ((v4si) { -1, -1, -1, -1 }); + + if (z != -1) +__builtin_abort (); + return 0; +} Jakub
Re: [PATCH] Fix PR69547
On Fri, Jan 29, 2016 at 09:40:48AM +0100, Richard Biener wrote: > I am testing the following patch to fix a regression that we no longer > remove some empty loops. Doesn't this mean that DCE will remove the clobbers as unnecessary, even when they aren't in empty loops? Jakub
Re: [PATCH] S/390: Require a hardware vector support for test to succeed.
On Wed, Jan 27, 2016 at 11:04:32AM +0100, Dominik Vogt wrote: > gcc/testsuite/ChangeLog > > * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Require a hardware vector support for > test to succeed. Applied. Thanks! -Andreas-
Re: [PATCH] s390: Add -fsplit-stack support
Hi Marcin, sorry for the late feedback. A few comments regarding the split stack implementation: The GNU coding style requires to replace every 8 leading blanks on a line with a tab. There are many lines in your patch violating this. In case you are an emacs user `whitespace-cleanup' will fix this for you. Could you please add a testcase checking the different variants. I.e. with early exit, no-alloc in __morestack, and with an actual allocation? There are a few more comments inline. Bye, -Andreas- > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index c881d52..71f6f38 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,5 +1,38 @@ > 2016-01-16 Marcin Kościelnicki> > + * common/config/s390/s390-common.c (s390_supports_split_stack): > + New function. > + (TARGET_SUPPORTS_SPLIT_STACK): New macro. > + * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue. > + * config/s390/s390.c (struct machine_function): New field > + split_stack_varargs_pointer. > + (s390_register_info): Mark r12 as clobbered if it'll be used as temp > + in s390_emit_prologue. > + (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack > + vararg pointer. > + (morestack_ref): New global. > + (SPLIT_STACK_AVAILABLE): New macro. > + (s390_expand_split_stack_prologue): New function. > + (s390_expand_split_stack_call): New function. > + (s390_live_on_entry): New function. > + (s390_va_start): Use split-stack vararg pointer if appropriate. > + (s390_reorg): Lower the split-stack pseudo-insns. > + (s390_asm_file_end): Emit the split-stack note sections. > + (TARGET_EXTRA_LIVE_ON_ENTRY): New macro. > + * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec. > + (UNSPECV_SPLIT_STACK_CALL): New unspec. > + (UNSPECV_SPLIT_STACK_SIBCALL): New unspec. > + (UNSPECV_SPLIT_STACK_MARKER): New unspec. > + (split_stack_prologue): New expand. > + (split_stack_call_*): New insn. > + (split_stack_cond_call_*): New insn. > + (split_stack_space_check): New expand. > + (split_stack_sibcall_*): New insn. > + (split_stack_cond_sibcall_*): New insn. > + (split_stack_marker): New insn. > + > +2016-01-02 Marcin Kościelnicki > + > * cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps > with side effects. > > diff --git a/gcc/common/config/s390/s390-common.c > b/gcc/common/config/s390/s390-common.c > index 4519c21..1e497e6 100644 > --- a/gcc/common/config/s390/s390-common.c > +++ b/gcc/common/config/s390/s390-common.c > @@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts > ATTRIBUTE_UNUSED, > } > } > > +/* -fsplit-stack uses a field in the TCB, available with glibc-2.23. > + We don't verify it, since earlier versions just have padding at > + its place, which works just as well. */ > + > +static bool > +s390_supports_split_stack (bool report ATTRIBUTE_UNUSED, > +struct gcc_options *opts ATTRIBUTE_UNUSED) > +{ > + return true; > +} > + > #undef TARGET_DEFAULT_TARGET_FLAGS > #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT) > > @@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts > ATTRIBUTE_UNUSED, > #undef TARGET_OPTION_INIT_STRUCT > #define TARGET_OPTION_INIT_STRUCT s390_option_init_struct > > +#undef TARGET_SUPPORTS_SPLIT_STACK > +#define TARGET_SUPPORTS_SPLIT_STACK s390_supports_split_stack > + > struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER; > diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h > index 633bc1e..09032c9 100644 > --- a/gcc/config/s390/s390-protos.h > +++ b/gcc/config/s390/s390-protos.h > @@ -42,6 +42,7 @@ extern bool s390_handle_option (struct gcc_options *opts > ATTRIBUTE_UNUSED, > extern HOST_WIDE_INT s390_initial_elimination_offset (int, int); > extern void s390_emit_prologue (void); > extern void s390_emit_epilogue (bool); > +extern void s390_expand_split_stack_prologue (void); > extern bool s390_can_use_simple_return_insn (void); > extern bool s390_can_use_return_insn (void); > extern void s390_function_profiler (FILE *, int); > diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c > index 3be64de..6afce7c 100644 > --- a/gcc/config/s390/s390.c > +++ b/gcc/config/s390/s390.c > @@ -426,6 +426,13 @@ struct GTY(()) machine_function >/* True if the current function may contain a tbegin clobbering > FPRs. */ >bool tbegin_p; > + > + /* For -fsplit-stack support: A stack local which holds a pointer to > + the stack arguments for a function with a variable number of > + arguments. This is set at the start of the function and is used > + to initialize the overflow_arg_area field of the va_list > + structure. */ > + rtx split_stack_varargs_pointer; > }; > > /* Few accessor macros for struct cfun->machine->s390_frame_layout. */ > @@ -9316,9 +9323,13 @@
[committed] Add testcase for PR66137
Hi! This PR has been fixed by the PR68701, I've committed the testcase as obvious to trunk. 2016-01-29 Jakub JelinekPR target/66137 * gcc.target/i386/pr66137.c: New test. --- gcc/testsuite/gcc.target/i386/pr66137.c.jj 2016-01-29 15:05:19.804958974 +0100 +++ gcc/testsuite/gcc.target/i386/pr66137.c 2016-01-29 15:04:53.0 +0100 @@ -0,0 +1,11 @@ +/* PR target/66137 */ +/* { dg-do compile } */ +/* { dg-options "-mavx -O3 -ffixed-ebp" } */ + +void +foo (char *x, char *y, char *z, int a) +{ + int i; + for (i = a; i > 0; i--) +*x++ = *y++ = *z++; +} Jakub
[hsa] Atomic assess memory model fixes
Hi, this is a followup to comments by Jakub and Richi on handling of memory models in HSA atomic operations: - I have made user-visible diagnostics lower case simple words, rather than constant identifiers. - I have added masking by MEMMODEL_BASE_MASK where appropriate. - I have made sure that warning code does not crash even when it encounters an unknown model and that it never warns multiple times. - I have fixed handling of atomic load operations which wrongly insisted on release semantics instead of acquire (apart from relaxed). - And last but not least, after looking at the respective documentations, I have convinced myself that __ATOMIC_SEQ_CST can be implemented using the HSA scacq, screl and scar memory orders, so I implemented that. Bootstrapped and tested on x86_64-linux. Since all of the above seems to be worth fixing and low risk, I am going to commit it trunk even at this stage, even though of course nothing in HSA is a regression. Thanks, Martin 2016-01-29 Martin Jambor* hsa-gen.c (get_memory_order_name): Mask with MEMMODEL_BASE_MASK. Use short lowercase names. (get_memory_order): Mask with MEMMODEL_BASE_MASK. Support MEMMODEL_CONSUME with acquire semantics and MEMMODEL_SEQ_CST with acq_rel one. Protect warning agains segfaults if get_memory_order_name returns NULL. (gen_hsa_ternary_atomic_for_builtin): Support with MEMMODEL_SEQ_CST with release semantics. Do not warn if get_memory_order already did. (gen_hsa_insns_for_call): Support with MEMMODEL_SEQ_CST with acquire semantics. Fix check for relaxed or acquire semantics. Do not warn if get_memory_order already did. --- gcc/hsa-gen.c | 59 --- 1 file changed, 40 insertions(+), 19 deletions(-) diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c index e8f80da..768c2cf 100644 --- a/gcc/hsa-gen.c +++ b/gcc/hsa-gen.c @@ -4415,20 +4415,20 @@ get_address_from_value (tree val, hsa_bb *hbb) static const char * get_memory_order_name (unsigned memmodel) { - switch (memmodel) + switch (memmodel & MEMMODEL_BASE_MASK) { case MEMMODEL_RELAXED: - return "__ATOMIC_RELAXED"; + return "relaxed"; case MEMMODEL_CONSUME: - return "__ATOMIC_CONSUME"; + return "consume"; case MEMMODEL_ACQUIRE: - return "__ATOMIC_ACQUIRE"; + return "acquire"; case MEMMODEL_RELEASE: - return "__ATOMIC_RELEASE"; + return "release"; case MEMMODEL_ACQ_REL: - return "__ATOMIC_ACQ_REL"; + return "acq_rel"; case MEMMODEL_SEQ_CST: - return "__ATOMIC_SEQ_CST"; + return "seq_cst"; default: return NULL; } @@ -4440,21 +4440,31 @@ get_memory_order_name (unsigned memmodel) static BrigMemoryOrder get_memory_order (unsigned memmodel, location_t location) { - switch (memmodel) + switch (memmodel & MEMMODEL_BASE_MASK) { case MEMMODEL_RELAXED: return BRIG_MEMORY_ORDER_RELAXED; +case MEMMODEL_CONSUME: + /* HSA does not have an equivalent, but we can use the slightly stronger +ACQUIRE. */ case MEMMODEL_ACQUIRE: return BRIG_MEMORY_ORDER_SC_ACQUIRE; case MEMMODEL_RELEASE: return BRIG_MEMORY_ORDER_SC_RELEASE; case MEMMODEL_ACQ_REL: +case MEMMODEL_SEQ_CST: + /* Callers implementing a simple load or store need to remove the release +or acquire part respectively. */ return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE; default: - HSA_SORRY_ATV (location, -"support for HSA does not implement memory model: %s", -get_memory_order_name (memmodel)); - return BRIG_MEMORY_ORDER_NONE; + { + const char *mmname = get_memory_order_name (memmodel); + HSA_SORRY_ATV (location, + "support for HSA does not implement the specified " + " memory model%s %s", + mmname ? ": " : "", mmname ? mmname : ""); + return BRIG_MEMORY_ORDER_NONE; + } } } @@ -4523,13 +4533,20 @@ gen_hsa_ternary_atomic_for_builtin (bool ret_orig, nops = 2; } - if (acode == BRIG_ATOMIC_ST && memorder != BRIG_MEMORY_ORDER_RELAXED - && memorder != BRIG_MEMORY_ORDER_SC_RELEASE) + if (acode == BRIG_ATOMIC_ST) { - HSA_SORRY_ATV (gimple_location (stmt), -"support for HSA does not implement memory model for " -"ATOMIC_ST: %s", get_memory_order_name (mmodel)); - return; + if (memorder == BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE) + memorder = BRIG_MEMORY_ORDER_SC_RELEASE; + + if (memorder != BRIG_MEMORY_ORDER_RELAXED + && memorder != BRIG_MEMORY_ORDER_SC_RELEASE + && memorder != BRIG_MEMORY_ORDER_NONE) + { + HSA_SORRY_ATV (gimple_location (stmt), +"support for HSA does not implement memory
Re: [off-list] Re: [PATCH PR68542]
Uros, Here is update patch which includes (1) couple changes proposed by Richard in tree-vect-loop.c and (2) the changes in back-end proposed by you. Is it OK for trunk? Bootstrap and regression testing dis not show any new failures. ChangeLog: 2016-01-29 Yuri RumyantsevPR middle-end/68542 * config/i386/i386.c (ix86_expand_branch): Add support for conditional branch with vector comparison. *config/i386/sse.md (Vi48_AVX): New mode iterator. (define_expand "cbranch4): Add support for conditional branch with vector comparison. * tree-vect-loop.c (optimize_mask_stores): New function. * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize has_mask_store field of vect_info. * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for vectorized loops having masked stores after vec_info destroy. * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and correspondent macros. (optimize_mask_stores): Add prototype. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-mask-store-move-1.c: New test. * testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c: Likewise. 2016-01-29 15:26 GMT+03:00 Uros Bizjak : > On Fri, Jan 29, 2016 at 1:20 PM, Yuri Rumyantsev wrote: >> Uros, >> >> Thanks for your comments. >> I deleted swap of operands as you told. >> Let me explain my point in adding support for conditional branches >> with vector comparison. >> This feature is used to put vectorized masked stores and its >> producers under guard that checks that mask is not zero, i.e. if mask >> which is result of other vector computations is zero we don't need to >> execute correspondent masked store and its producers if they don't >> have other uses. It means that only integer 128-bit and 256-bit >> vectors must be accepted as operands of cbranch. I did not introduce >> new iterator but simply used existence iterator V48_AVX2. BTW you >> proposed to add new iterator VI_AVX but it would be better to ad >> VI48_AVX as >> >> (define_mode_iterator Vi48_AVX >> [(V4SI "TARGET_AVX") (V2DI "TARGET_AVX") >> (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")]) >> >> I also don't think that we need to add support in expand_compare since >> such comparisons are not generated. > > OK with me. If there is no need for cstore pattern, then the > comparison can be integrated with existing code in expand_branch by > using ""goto simple" as is already the case there. > > BR, > Uros. > >> 2016-01-28 20:08 GMT+03:00 Uros Bizjak : >>> Yuri, >>> >>> please find attached a target-dependent patch that illustrates my >>> review remarks. The patch is lightly tested, and it produces desired >>> ptest insns on the testcases you provided. >>> >>> Some further remarks: >>> >>> + tmp = gen_rtx_fmt_ee (code, VOIDmode, flag, const0_rtx); >>> + if (code == EQ) >>> +tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp, >>> +gen_rtx_LABEL_REF (VOIDmode, label), pc_rtx); >>> + else >>> +tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp, >>> +pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label)); >>> + emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); >>> + return; >>> >>> The above code is IMO wrong. You don't need to swap the arms of the >>> target, since "code" will generate je or jne. Please see the attached >>> patch. >>> >>> BTW: Maybe we can introduce corresponding cstore pattrn to use ptest >>> in order to more efficiently vectorize code like: >>> >>> --cut here-- >>> int a[256]; >>> >>> int foo (void) >>> { >>> int ret = 0; >>> int i; >>> >>> for (i = 0; i < 256; i++) >>> { >>> if (a[i] != 0) >>> ret = 1; >>> } >>> return ret; >>> } >>> --cut here-- >>> >>> Uros. PR68542.patch.3 Description: Binary data
[PATCH][ARM] PR target/69161: Don't ignore mode when matching comparison operator in cstore-like patterns
Hi all, Similar to aarch64, the arm port also suffers from PR target/69161 when combine tries to propagate a CCmode comparison into a vec_duplicate, creating invalid RTL that ICEs. Please refer to the PR and the aarch64 fix for more info. The fix for arm is very similar. We define a new predicate identical to arm_comparison_operator but make it not special so that it gets the normal mode checks. This prevents combine from matching an intermediate CCmode cstore (where it's doing an SImode SET of a CCmode source) which it then tries to propagate into a V4SImode vec_duplicate. The offending patterns are the cstore patterns, so this patch updates them to use the new predicate with mode checks. Both arm and thumb patterns are updated. There was no codegen difference observed on SPEC2006 for arm. Bootstrapped and tested on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill 2016-01-29 Kyrylo TkachovPR target/69161 * config/arm/predicates.md (arm_comparison_operator_mode): New predicate. * config/arm/arm.md (*mov_scc): Use arm_comparison_operator_mode instead of arm_comparison_operator. (*mov_negscc): Likewise. (*mov_notscc): Likewise. * config/arm/thumb2.md (*thumb2_mov_scc): Likewise. (*thumb2_mov_negscc): Likewise. (*thumb2_mov_negscc_strict_it): Likewise. (*thumb2_mov_notscc): Likewise. (*thumb2_mov_notscc_strict_it): Likewise. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 5129e858578dd3f3c3c46b089c96011f6a6423c3..15b4a4a1278c6be14dca1887aff8dc7c7a8fc16d 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -7190,7 +7190,7 @@ (define_expand "cstore_cc" (define_insn_and_split "*mov_scc" [(set (match_operand:SI 0 "s_register_operand" "=r") - (match_operator:SI 1 "arm_comparison_operator" + (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)]))] "TARGET_ARM" "#" ; "mov%D1\\t%0, #0\;mov%d1\\t%0, #1" @@ -7207,7 +7207,7 @@ (define_insn_and_split "*mov_scc" (define_insn_and_split "*mov_negscc" [(set (match_operand:SI 0 "s_register_operand" "=r") - (neg:SI (match_operator:SI 1 "arm_comparison_operator" + (neg:SI (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)])))] "TARGET_ARM" "#" ; "mov%D1\\t%0, #0\;mvn%d1\\t%0, #0" diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index c66c31d5c6047aa7decfe7e95d111d5fbf6fb52e..b8f09ab6b109f80abe2df08a8b7f954f521ec1bf 100644 --- a/gcc/config/arm/predicates.md +++ b/gcc/config/arm/predicates.md @@ -341,6 +341,11 @@ (define_special_predicate "arm_comparison_operator" (and (match_operand 0 "expandable_comparison_operator") (match_test "maybe_get_arm_condition_code (op) != ARM_NV"))) +;; Likewise, but don't ignore the mode. +(define_predicate "arm_comparison_operator_mode" + (and (match_operand 0 "expandable_comparison_operator") + (match_test "maybe_get_arm_condition_code (op) != ARM_NV"))) + (define_special_predicate "lt_ge_comparison_operator" (match_code "lt,ge")) diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md index 3e762018e4d3761852dda6434d3a8e31166a8678..7368d0658da1afde05b2dfc40eda79e1c228df1f 100644 --- a/gcc/config/arm/thumb2.md +++ b/gcc/config/arm/thumb2.md @@ -370,7 +370,7 @@ (define_insn "*thumb2_cmpsi_neg_shiftsi" (define_insn_and_split "*thumb2_mov_scc" [(set (match_operand:SI 0 "s_register_operand" "=l,r") - (match_operator:SI 1 "arm_comparison_operator" + (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)]))] "TARGET_THUMB2" "#" ; "ite\\t%D1\;mov%D1\\t%0, #0\;mov%d1\\t%0, #1" @@ -388,7 +388,7 @@ (define_insn_and_split "*thumb2_mov_scc" (define_insn_and_split "*thumb2_mov_negscc" [(set (match_operand:SI 0 "s_register_operand" "=r") - (neg:SI (match_operator:SI 1 "arm_comparison_operator" + (neg:SI (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)])))] "TARGET_THUMB2 && !arm_restrict_it" "#" ; "ite\\t%D1\;mov%D1\\t%0, #0\;mvn%d1\\t%0, #0" @@ -407,7 +407,7 @@ (define_insn_and_split "*thumb2_mov_negscc" (define_insn_and_split "*thumb2_mov_negscc_strict_it" [(set (match_operand:SI 0 "low_register_operand" "=l") - (neg:SI (match_operator:SI 1 "arm_comparison_operator" + (neg:SI (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "") (const_int 0)])))] "TARGET_THUMB2 && arm_restrict_it" "#" ; ";mvn\\t%0, #0 ;it\\t%D1\;mov%D1\\t%0, #0\" @@ -436,7 +436,7 @@ (define_insn_and_split "*thumb2_mov_negscc_strict_it" (define_insn_and_split "*thumb2_mov_notscc" [(set (match_operand:SI 0 "s_register_operand" "=r") - (not:SI (match_operator:SI 1 "arm_comparison_operator" + (not:SI (match_operator:SI 1 "arm_comparison_operator_mode" [(match_operand 2 "cc_register" "")
Re: [PATCH] Remove PTX link option
On Mon, 11 Jan 2016, Alexander Monakov wrote: > On Mon, 11 Jan 2016, Thomas Schwinge wrote: > > Alexander, would you please also submit a fix for that for nvptx-tools' > > nvptx-run.c? (Or want me to do that?) > > I can do that, along with another small change I used for -mgomp testing. I have now done that as part of nvptx-tools pull request #10 here: https://github.com/MentorEmbedded/nvptx-tools/pull/10 (the pull request also addresses other issues noted on the project's tracker) Thanks. Alexander
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
On 29.01.2016 02:09, Bernd Schmidt wrote: > On 01/28/2016 12:36 AM, Eric Botcazou wrote: >>> [cc-ing Eric as RTL maintainer] >> >> Sorry for the delay, the message apparently bounced] >> >>> IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large >>> bit of LRA/reload logic: >>> >>> [...] >>> >>> Under the current interface macros like INITIAL_ELIMINATION_OFFSET >>> are expected to trigger the calculation of the target's frame layout >>> (since you need that information to answer the question). >>> To me it seems wrong that we're attempting to call that sort of >>> macro in a query routine like rtx_addr_can_trap_p_1. >> >> I'm a little uncomfortable stepping in here because, while I >> essentially share >> your objections and was opposed to the patch (I was almost sure that >> it would >> introduce maintainance issues for no practical benefit), I didn't >> imagine that >> such a failure mode would have been possible (computing an >> approximation of >> the frame layout after reload being problematic) so I didn't really >> object to >> being overruled after seeing Bernd's patch... >> >>> IMO we should cache the information we need@the start of each >>> LRA/reload cycle. This will be more robust, because there will >>> be no accidental changes to global state either during or after >>> LRA/reload. It will also be more efficient because >>> rtx_addr_can_trap_p_1 can read the cached variables rather >>> than calling back into the target. >> >> That would be a better design for sure and would eliminate the >> maintainance >> issues I was originally afraid of. >> >> My recommendation would be to back out Bernd's patch for GCC 6.0 >> (again, it >> doesn't fix any regression and, more importantly, any bug in real >> software, >> but only corner case bugs in dumb computer-generated testcases) but >> with the >> commitment to address the underlying issue for GCC 7.0 and backport >> the fix to >> GCC 6.x unless really impracticable. That being said, it's ultimately >> Jakub >> and Richard's call. > > I'm on the fence; I do think the original problem is an issue we should > fix, but I'm also not terribly happy with the implementation we have > right now. Besides the issues already mentioned, doesn't it kind of > assume these offsets are constant (which they definitely aren't, > consider arg pushes for example)? > Yes that is right. I saw it as a thing that could possibly happen more often than we know, because it is difficult to spot a wrong code by ordinary tests. Even the reproducer from pr61047 does not crash when it runs in the gcc testsuite, I had to tweak the example first to use a larger offset to compensate the large number of environment values that are passed from the test suite in each test execution. Yes, rtx_addr_can_trap_p_1 does not know how many bytes are pushed on the stack and I saw no easy way how to get to the REG_NOTES for instance, because they are attached to the INSN and rtx_addr_can_trap_p has only access to a MEM rtx. It is no exact science, but the error is on the safe side. Nevertheless, with the data from the target hook the approximation is good enough, to not pessimize any single bit of code that is generated in stage2 vs. stage3, which would not have been the case without some help from the target. > I think a better approach might be to just mark accesses at known > locations in the frame, or arg pushes, as MEM_NOTRAP_P, and consider > accesses with non-constant or calculated offsets as potentially trapping. > Yes I think also that might be a next step. It would probably be good to somehow fixate the result of rtx_addr_can_trap_p immediately before reload when the RTX's are still FRAMEP+x and ARGP+x, and annotate that somehow to the reloaded RTX's, that way it would finally be superfluous to call the target hook at all, because the actual addresses should not change during or after reload. It will probably have to be a spare bit on the RTX that is currently unused, because MEM_NOTRAP_P is already used for something different. It will however not be simple to find a valid piece of C code where the current implementation with all of its limitations generates different code compared to an implementation that has access to the exact offsets. As I said, I tried already, but could not find an example of a missed optimization due to my patch. Thanks Bernd. > > Bernd
Re: Default compute dimensions
On Thu, Jan 28, 2016 at 10:38:51AM -0500, Nathan Sidwell wrote: > This patch adds default compute dimension handling. Users rarely specify > compute dimensions, expecting the toolchain to DTRT. More savvy users would > like to specify global defaults. This patch permits both. Isn't it better to be able to override the defaults on the library side? I mean, when when somebody is compiling the code, often he doesn't know the exact properties of the hw it will be run on, if he does, I think it is better to specify them explicitly in the code. But if he doesn't, one just has to hope libgomp will figure out the best defaults. So, wouldn't it be better to add some env var that would allow to control this instead? Jakub
Re: [gomp4, PR68977, Committed] Don't gimplify in ssa mode if seen_error in oacc_xform_loop
On 01/29/16 02:48, Richard Biener wrote: I see. Is it possible to simply scrub the whole OACC region in this case instead? Do you mean jettison the body of the offloaded fn, or something else? I guess the former's doable. (Throwing away the fn entirely could result in unresolved symbol errors, which might confuse?) Or even better, report those errors earlier? Well, one could split the pass into two passes (I think) and move the first half. But in general, these errors are only discoverable in the device compiler. nathan -- Nathan Sidwell - Director, Sourcery Services - Mentor Embedded
Re: [PATCH] New flag for dumping information about constexpr function calls memoization (GCC 5.2.0)
2016-01-28 17:54 GMT-03:00 Joseph Myers: > Any patch adding a new option needs to add documentation for it to > invoke.texi (both substantive documentation, and inclusion in the summary > lists of options). > > -- > Joseph S. Myers > jos...@codesourcery.com Hi, Thanks for the feedback, I've just updated the documentation. patch: diff --git a/gcc/common.opt b/gcc/common.opt index 1218a71..bf0c7df 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1168,6 +1168,10 @@ fdump-passes Common Var(flag_dump_passes) Init(0) Dump optimization passes +fdump-memoization-hits +Common Var(flag_dump_memoization_hits) Init(0) +Dump info about constexpr calls memoized. + fdump-unnumbered Common Report Var(flag_dump_unnumbered) Suppress output of instruction numbers, line number notes and addresses in debugging dumps diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index e250726..41ae5b3 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -42,6 +42,8 @@ along with GCC; see the file COPYING3. If not see #include "builtins.h" #include "tree-inline.h" #include "ubsan.h" +#include "tree-pretty-print.h" +#include "dumpfile.h" static bool verify_constant (tree, bool, bool *, bool *); #define VERIFY_CONSTANT(X)\ @@ -1173,6 +1175,14 @@ cx_error_context (void) return r; } +static void +dump_memoization_hit (FILE *file, tree call, int flags) +{ + fprintf(file, "Memoized call:\n"); + print_generic_decl(file, call, flags); + fprintf(file, "\n"); +} + /* Subroutine of cxx_eval_constant_expression. Evaluate the call expression tree T in the context of OLD_CALL expression evaluation. */ @@ -1338,7 +1348,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t, entry->result = result = error_mark_node; } else -result = entry->result; +{ + if (flag_dump_memoization_hits) + dump_memoization_hit(stderr, t, 0); + result = entry->result; +} } if (!depth_ok) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index f84a199..b78bd4b 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -322,6 +322,7 @@ Objective-C and Objective-C++ Dialects}. -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol -fdump-passes @gol +-fdump-memoization-hits @gol -fdump-statistics @gol -fdump-tree-all @gol -fdump-tree-original@r{[}-@var{n}@r{]} @gol @@ -6771,6 +6772,10 @@ Dump after function inlining. Dump the list of optimization passes that are turned on and off by the current command-line options. +@item -fdump-memoization-hits +@opindex fdump-memoization-hits +Dump the list of constexpr function calls that were memoized. + @item -fdump-statistics-@var{option} @opindex fdump-statistics Enable and control dumping of pass statistics in a separate file. The
Re: [PING, PATCH] Reduce accuracy of bessel_6.f90.
On Mon, Jan 11, 2016 at 03:40:56PM +0100, Dominik Vogt wrote: > Another patch reducing the accuracy required in the bessel_6 test. Fixes the test case for S/390. Can this be committed? > gcc/testsuite/ChangeLog > > * gfortran.dg/bessel_6.f90: Reduce accuracy for S/390. > >From 70a35dd6f6bf906d8e5907667ad0f04f981a61ac Mon Sep 17 00:00:00 2001 > From: Dominik Vogt> Date: Mon, 11 Jan 2016 15:36:38 +0100 > Subject: [PATCH] S/390: Reduce accuracy of bessel_6.f90. > > --- > gcc/testsuite/gfortran.dg/bessel_6.f90 | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/gfortran.dg/bessel_6.f90 > b/gcc/testsuite/gfortran.dg/bessel_6.f90 > index e0220f7..da917ff 100644 > --- a/gcc/testsuite/gfortran.dg/bessel_6.f90 > +++ b/gcc/testsuite/gfortran.dg/bessel_6.f90 > @@ -12,7 +12,7 @@ > implicit none > real,parameter :: values(*) = [0.0, 0.5, 1.0, 0.9, > 1.8,2.0,3.0,4.0,4.25,8.0,34.53, 475.78] > real,parameter :: myeps(size(values)) = epsilon(0.0) & > - * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 92, 15 ] > + * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 98, 15 ] > ! The following is sufficient for me - the values above are a bit > ! more tolerant > ! * [0, 5, 3, 4, 6, 7, 7, 5, 5, 6, 66, 4 ] > -- > 2.3.0 Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
Re: [PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.
On Wed, Jan 27, 2016 at 12:01:26PM +0100, Dominik Vogt wrote: > gcc/ChangeLog > > PR other/69006 > * config/s390/s390-c.c (s390_resolve_overloaded_builtin): Remove > trailing blank line from error message. Applied. Thanks! -Andreas-
Fix some i386 testcases for -frename-registers
This patch corrects some tests that can fail with -frename-registers. The problems typically are of the form "xmm[0-7]+", disallowing registers 8 and 9, and "xmm[0-9]". disallowing numbers higher than 9. Most the patch was automatically generated, but there were some other cases as well. Enabling register renaming would help with PR57193, I'll propose that separately after a few more tests. This was bootstrapped and tested on x86_64-linux. Ok? Bernd * gcc.target/i386/avx512bw-vptestmb-1.c: Correct [xyz]mm register number scans. * gcc.target/i386/avx512bw-vptestmw-1.c: Likewise. * gcc.target/i386/avx512bw-vptestnmb-1.c: Likewise. * gcc.target/i386/avx512bw-vptestnmw-1.c: Likewise. * gcc.target/i386/avx512cd-vpbroadcastmb2q-1.c: Likewise. * gcc.target/i386/avx512cd-vpbroadcastmw2d-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise. * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise. * gcc.target/i386/avx512dq-vinsertf64x2-1.c: Likewise. * gcc.target/i386/avx512dq-vinserti64x2-1.c: Likewise. * gcc.target/i386/avx512f-gather-5.c: Likewise. * gcc.target/i386/avx512f-vptestmd-1.c: Likewise. * gcc.target/i386/avx512f-vptestmq-1.c: Likewise. * gcc.target/i386/avx512f-vptestnmd-1.c: Likewise. * gcc.target/i386/avx512f-vptestnmq-1.c: Likewise. * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise. * gcc.target/i386/avx512vl-vpbroadcastmb2q-1.c: Likewise. * gcc.target/i386/avx512vl-vpbroadcastmw2d-1.c: Likewise. * gcc.target/i386/avx512vl-vptestmd-1.c: Likewise. * gcc.target/i386/avx512vl-vptestmq-1.c: Likewise. * gcc.target/i386/avx512vl-vptestnmd-1.c: Likewise. * gcc.target/i386/avx512vl-vptestnmq-1.c: Likewise. * gcc.target/i386/pr32219-2.c: Allow registers other than %eax in scans. * gcc.target/i386/pr32219-4.c: Likewise. * gcc.target/i386/pr32219-6.c: Likewise. * gcc.target/i386/pr32219-8.c: Likewise. Index: gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c === --- gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c (revision 232689) +++ gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c (working copy) @@ -1,11 +1,11 @@ /* { dg-do compile } */ /* { dg-options "-mavx512bw -mavx512vl -O2" } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ #include Index: gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c === --- gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c (revision 232689) +++ gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c (working copy) @@ -1,11 +1,11 @@ /* { dg-do compile } */ /* { dg-options "-mavx512bw -mavx512vl -O2" } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ -/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */ +/* { dg-final { scan-assembler-times "vptestmw\[
[Patch] Update GCC Internals: remove section Preserving virtual SSA form
Hi, The section "12.3.2 Preserving the virtual SSA form" in GCC Internals is outdated. The two functions it documents push_stmt_changes and pop_stmt_changes have been removed. The functionality have been replaced with update_stmt. update_stmt is documented elsewhere in internals. I therefore propose to remove section 12.3.2. Furthermore, the function mark_stmt_modified have been replaced by gimple_set_modified. Please install this patch. Best, Nicklas 2016-01-29 Nicklas Bo Jensen* doc/tree-ssa.texi (Preserving the virtual SSA form): Remove outdated section index d795090..7ca607d 100644 --- a/gcc/doc/tree-ssa.texi +++ b/gcc/doc/tree-ssa.texi @@ -432,7 +432,7 @@ dominator optimizations currently do this. When lazy updating is being used, the immediate use information is out of date and cannot be used reliably. Lazy updating is achieved by simply marking -statements modified via calls to @code{mark_stmt_modified} instead of +statements modified via calls to @code{gimple_set_modified} instead of @code{update_stmt}. When lazy updating is no longer required, all the modified statements must have @code{update_stmt} called in order to bring them up to date. This must be done before the optimization is finished, or @@ -654,40 +654,6 @@ are explicitly destroyed and only the symbols marked for renaming are processed@. @end itemize -@subsection Preserving the virtual SSA form -@cindex preserving virtual SSA form - -The virtual SSA form is harder to preserve than the non-virtual SSA form -mainly because the set of virtual operands for a statement may change at -what some would consider unexpected times. In general, statement -modifications should be bracketed between calls to -@code{push_stmt_changes} and @code{pop_stmt_changes}. For example, - -@smallexample -munge_stmt (tree stmt) -@{ - push_stmt_changes (); - @dots{} rewrite STMT @dots{} - pop_stmt_changes (); -@} -@end smallexample - -The call to @code{push_stmt_changes} saves the current state of the -statement operands and the call to @code{pop_stmt_changes} compares -the saved state with the current one and does the appropriate symbol -marking for the SSA renamer. - -It is possible to modify several statements at a time, provided that -@code{push_stmt_changes} and @code{pop_stmt_changes} are called in -LIFO order, as when processing a stack of statements. - -Additionally, if the pass discovers that it did not need to make -changes to the statement after calling @code{push_stmt_changes}, it -can simply discard the topmost change buffer by calling -@code{discard_stmt_changes}. This will avoid the expensive operand -re-scan operation and the buffer comparison that determines if symbols -need to be marked for renaming. - @subsection Examining @code{SSA_NAME} nodes @cindex examining SSA_NAMEs
Fix c/69522, memory management issue in c-parser
Let's say we have struct a { int x[1]; int y[1]; } x = { 0, { 0 } }; ^ When we reach the marked brace, we call into push_init_level, where we notice that we have implicit initializers (for x[]) lying around that we should deal with now that we've seen another open brace. The problem is that we've created a new obstack for the initializer of y, and this is where we also put data for the inits of x, freeing it when we see the close brace for the initialization of y. In the actual testcase, which is a little more complex to actually demonstrate the issue, we end up allocating two init elts at the same address (because of premature freeing) and place them in the same tree, which ends up containing a cycle because of this. Then we hang. Fixed by this patch, which splits off a new function finish_implicit_inits from push_init_level and ensures it's called with the outer obstack instead of the new one in the problematic case. Bootstrapped and tested on x86_64-linux, ok? Bernd c/ PR c/69522 * c-parser.c (c_parser_braced_init): New arg outer_obstack. All callers changed. If nested_p is true, use it to call finish_implicit_inits. * c-tree.h (finish_implicit_inits): Declare. * c-typeck.c (finish_implicit_inits): New function. Move code from ... (push_init_level): ... here. (set_designator, process_init_element): Call finish_implicit_inits. testsuite/ PR c/69522 gcc.dg/pr69522.c: New test. diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 43c26ae..eac0b1c 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -1284,7 +1284,8 @@ static tree c_parser_simple_asm_expr (c_parser *); static tree c_parser_attributes (c_parser *); static struct c_type_name *c_parser_type_name (c_parser *); static struct c_expr c_parser_initializer (c_parser *); -static struct c_expr c_parser_braced_init (c_parser *, tree, bool); +static struct c_expr c_parser_braced_init (c_parser *, tree, bool, + struct obstack *); static void c_parser_initelt (c_parser *, struct obstack *); static void c_parser_initval (c_parser *, struct c_expr *, struct obstack *); @@ -4289,7 +4290,7 @@ static struct c_expr c_parser_initializer (c_parser *parser) { if (c_parser_next_token_is (parser, CPP_OPEN_BRACE)) -return c_parser_braced_init (parser, NULL_TREE, false); +return c_parser_braced_init (parser, NULL_TREE, false, NULL); else { struct c_expr ret; @@ -4309,7 +4310,8 @@ c_parser_initializer (c_parser *parser) top-level initializer in a declaration. */ static struct c_expr -c_parser_braced_init (c_parser *parser, tree type, bool nested_p) +c_parser_braced_init (c_parser *parser, tree type, bool nested_p, + struct obstack *outer_obstack) { struct c_expr ret; struct obstack braced_init_obstack; @@ -4318,7 +4320,10 @@ c_parser_braced_init (c_parser *parser, tree type, bool nested_p) gcc_assert (c_parser_next_token_is (parser, CPP_OPEN_BRACE)); c_parser_consume_token (parser); if (nested_p) -push_init_level (brace_loc, 0, _init_obstack); +{ + finish_implicit_inits (brace_loc, outer_obstack); + push_init_level (brace_loc, 0, _init_obstack); +} else really_start_incremental_init (type); if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE)) @@ -4576,7 +4581,8 @@ c_parser_initval (c_parser *parser, struct c_expr *after, location_t loc = c_parser_peek_token (parser)->location; if (c_parser_next_token_is (parser, CPP_OPEN_BRACE) && !after) -init = c_parser_braced_init (parser, NULL_TREE, true); +init = c_parser_braced_init (parser, NULL_TREE, true, + braced_init_obstack); else { init = c_parser_expr_no_commas (parser, after); @@ -8060,7 +8066,7 @@ c_parser_postfix_expression_after_paren_type (c_parser *parser, error_at (type_loc, "compound literal has variable size"); type = error_mark_node; } - init = c_parser_braced_init (parser, type, false); + init = c_parser_braced_init (parser, type, false, NULL); finish_init (); maybe_warn_string_init (type_loc, type, init); diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h index 00e72b1..5902fd2 100644 --- a/gcc/c/c-tree.h +++ b/gcc/c/c-tree.h @@ -625,6 +625,7 @@ extern void maybe_warn_string_init (location_t, tree, struct c_expr); extern void start_init (tree, tree, int); extern void finish_init (void); extern void really_start_incremental_init (tree); +extern void finish_implicit_inits (location_t, struct obstack *); extern void push_init_level (location_t, int, struct obstack *); extern struct c_expr pop_init_level (location_t, int, struct obstack *); extern void set_init_index (location_t, tree, tree, struct obstack *); diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index a147ac6..e4e2944 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -7447,6 +7447,30 @@ really_start_incremental_init (tree type) } } +/* Called when we see an open brace for a nested initializer. Finish +
Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)
On Fri, Jan 29, 2016 at 11:35:07AM +0100, Jakub Jelinek wrote: > I can try to stick there an assert whether for FUNCTION_DECL > (DECL_INITIAL (decl) == 0) == DECL_EXTERNAL (decl). Tried that, but cancelled that quickly, I see lots of cases where DECL_INITIAL is non-NULL, but DECL_EXTERNAL is set, and some where DECL_INITIAL is NULL, and DECL_EXTERNAL is not set, at least in the other two spots (check_global_declaration in cgraphunit.c and c-decl.c). Haven't waited long enough to find out if the C++ FE is some exception. Jakub
Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)
On 01/29/2016 11:35 AM, Jakub Jelinek wrote: On Thu, Jan 28, 2016 at 09:51:34PM -0500, Jason Merrill wrote: On 01/28/2016 03:15 PM, Jakub Jelinek wrote: + if (TREE_CODE (decl) == FUNCTION_DECL + && DECL_INITIAL (decl) == 0 + && DECL_EXTERNAL (decl) + && !TREE_PUBLIC (decl) + && !DECL_ARTIFICIAL (decl) + && !TREE_NO_WARNING (decl)) Do we need to check both DECL_INITIAL and DECL_EXTERNAL? Dunno, but that is what cgraphunit.c does, c-decl.c too, what the old toplev.c (check_global_declaration_1) did (back to at least r26593 from ~ 1999), so I think we want some consistency. OK. Jason
[patch] libstdc++/69506 Fix Cygwin bootstrap error due to TM symbols
Another target that doesn't have the necessary weak ref support for the TM-aware exception-handling. Bootrapped successfully by the reporter, committed to trunk. commit b7e2e38ab1d938ee19280eba11ed3643a140f86d Author: Jonathan WakelyDate: Fri Jan 29 10:38:45 2016 + Fix Cygwin bootstrap error due to TM symbols PR libstdc++/69506 * config/os/newlib/os_defines.h (_GLIBCXX_USE_WEAK_REF): Define. diff --git a/libstdc++-v3/config/os/newlib/os_defines.h b/libstdc++-v3/config/os/newlib/os_defines.h index 4a09dd1..2a87e74 100644 --- a/libstdc++-v3/config/os/newlib/os_defines.h +++ b/libstdc++-v3/config/os/newlib/os_defines.h @@ -53,6 +53,9 @@ // their dtors are called #define _GLIBCXX_THREAD_ATEXIT_WIN32 1 +// See libstdc++/69506 +#define _GLIBCXX_USE_WEAK_REF 0 + #endif #endif
Re: Fix some i386 testcases for -frename-registers
Hello! > * gcc.target/i386/avx512bw-vptestmb-1.c: Correct [xyz]mm register > number scans. > * gcc.target/i386/avx512bw-vptestmw-1.c: Likewise. > * gcc.target/i386/avx512bw-vptestnmb-1.c: Likewise. > * gcc.target/i386/avx512bw-vptestnmw-1.c: Likewise. > * gcc.target/i386/avx512cd-vpbroadcastmb2q-1.c: Likewise. > * gcc.target/i386/avx512cd-vpbroadcastmw2d-1.c: Likewise. > * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise. > * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise. > * gcc.target/i386/avx512dq-vinsertf64x2-1.c: Likewise. > * gcc.target/i386/avx512dq-vinserti64x2-1.c: Likewise. > * gcc.target/i386/avx512f-gather-5.c: Likewise. > * gcc.target/i386/avx512f-vptestmd-1.c: Likewise. > * gcc.target/i386/avx512f-vptestmq-1.c: Likewise. > * gcc.target/i386/avx512f-vptestnmd-1.c: Likewise. > * gcc.target/i386/avx512f-vptestnmq-1.c: Likewise. > * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise. > * gcc.target/i386/avx512vl-vpbroadcastmb2q-1.c: Likewise. > * gcc.target/i386/avx512vl-vpbroadcastmw2d-1.c: Likewise. > * gcc.target/i386/avx512vl-vptestmd-1.c: Likewise. > * gcc.target/i386/avx512vl-vptestmq-1.c: Likewise. > * gcc.target/i386/avx512vl-vptestnmd-1.c: Likewise. > * gcc.target/i386/avx512vl-vptestnmq-1.c: Likewise. > * gcc.target/i386/pr32219-2.c: Allow registers other than %eax in > scans. > * gcc.target/i386/pr32219-4.c: Likewise. > * gcc.target/i386/pr32219-6.c: Likewise. > * gcc.target/i386/pr32219-8.c: Likewise. OK. Thanks, Uros.
Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.
On 28/01/16 15:42 +0100, Jakub Jelinek wrote: On Thu, Jan 28, 2016 at 01:32:18PM +, Jonathan Wakely wrote: On 28/01/16 13:40 +0100, Dominik Vogt wrote: >The attached patch (written by Jonathan, not me) makes >FLT_EVAL_METHOD and DECIMAL_DIG available in C++-11 as they should >be. > >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462 > >Can this be committed (should it wait for stage1)? I've just noticed we should also do the following, although this can definitely wait for stage 1 as it works fine as is (unlike Dominik's case which is a conformance bug). --- a/gcc/ginclude/stdarg.h +++ b/gcc/ginclude/stdarg.h @@ -47,7 +47,7 @@ typedef __builtin_va_list __gnuc_va_list; #define va_start(v,l) __builtin_va_start(v,l) #define va_end(v) __builtin_va_end(v) #define va_arg(v,l)__builtin_va_arg(v,l) -#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__) +#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || __cplusplus + 0 >= 201103L #define va_copy(d,s) __builtin_va_copy(d,s) #endif #define __va_copy(d,s) __builtin_va_copy(d,s) This is ok, but please fix up the formatting (avoid too long line). I've committed the attached version, wrapping the line. commit 92f4d9e8d059d2d1bad1dcea30ec44b60a5c35e7 Author: Jonathan WakelyDate: Fri Jan 29 11:58:17 2016 + Test __cplusplus instead of __GXX_EXPERIMENTAL_CXX0X__ * ginclude/stdarg.h: Test __cplusplus instead of __GXX_EXPERIMENTAL_CXX0X__. diff --git a/gcc/ginclude/stdarg.h b/gcc/ginclude/stdarg.h index 74a234d..6525152 100644 --- a/gcc/ginclude/stdarg.h +++ b/gcc/ginclude/stdarg.h @@ -47,7 +47,8 @@ typedef __builtin_va_list __gnuc_va_list; #define va_start(v,l) __builtin_va_start(v,l) #define va_end(v) __builtin_va_end(v) #define va_arg(v,l) __builtin_va_arg(v,l) -#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__) +#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L \ +|| __cplusplus + 0 >= 201103L #define va_copy(d,s) __builtin_va_copy(d,s) #endif #define __va_copy(d,s) __builtin_va_copy(d,s)
Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)
On Thu, Jan 28, 2016 at 09:51:34PM -0500, Jason Merrill wrote: > On 01/28/2016 03:15 PM, Jakub Jelinek wrote: > >+if (TREE_CODE (decl) == FUNCTION_DECL > >+&& DECL_INITIAL (decl) == 0 > >+&& DECL_EXTERNAL (decl) > >+&& !TREE_PUBLIC (decl) > >+&& !DECL_ARTIFICIAL (decl) > >+&& !TREE_NO_WARNING (decl)) > > Do we need to check both DECL_INITIAL and DECL_EXTERNAL? Dunno, but that is what cgraphunit.c does, c-decl.c too, what the old toplev.c (check_global_declaration_1) did (back to at least r26593 from ~ 1999), so I think we want some consistency. Either it is needed, or if it is not needed, then all the spots should change, not just this one. I can try to stick there an assert whether for FUNCTION_DECL (DECL_INITIAL (decl) == 0) == DECL_EXTERNAL (decl). Jakub
Re: Martin Jambor appointed HSA Maintainer
Hi, On Fri, Dec 18, 2015 at 08:41:41AM -0500, David Edelsohn wrote: > I am pleased to announce that the GCC Steering Committee has > appointed Martin Jambor as HSA maintainer. > > Please join me in congratulating Martin on his new role. > Martin, please update your listing in the MAINTAINERS file. > thank you very much for your trust. I will do my best when carrying out the associated duties. Now that HSA is also in, I have committed the following change to the MAINTAINERS file. Martin 2016-01-29 Martin Jambor* MAINTAINERS (hsa maintainers): Add myself. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index a5afeb7..aa757ea 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -209,6 +209,7 @@ fixincludes Bruce Korb *gimpl*Jason Merrill gcse.c Jeff Law global opt framework Jeff Law +hsaMartin Jambor jump.c David S. Miller web pages Gerald Pfeifer config.sub/config.guessBen Elliston -- 2.7.0
Re: [PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.
On Wed, Jan 27, 2016 at 09:22:19AM -0500, David Malcolm wrote: > On Wed, 2016-01-27 at 12:01 +0100, Dominik Vogt wrote: > > The attached patch removes a blank line after an error message. > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69006 > > Presumably this was exposed by the stricter testing I added to lib/gcc > -dg.exp in r232837? Yes, of course. > > - error_at (loc, "ambiguous overload for intrinsic: %s\n", > > + error_at (loc, "ambiguous overload for intrinsic: %s", > > IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); > Should this code be using %qs rather than %s? (or somesuch, or is that > a gcc 7 thing) Yes, probably. I'll make a separate patch for this. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany
[PATCH] Fix PR69547
I am testing the following patch to fix a regression that we no longer remove some empty loops. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2016-01-19 Richard BienerPR tree-optimization/69547 * tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1): Do not mark clobbers necessary. (mark_all_reaching_defs_necessary_1): Likewise. * g++.dg/tree-ssa/pr69547.C: New testcase. Index: gcc/tree-ssa-dce.c === *** gcc/tree-ssa-dce.c (revision 232928) --- gcc/tree-ssa-dce.c (working copy) *** mark_aliased_reaching_defs_necessary_1 ( *** 462,468 gimple *def_stmt = SSA_NAME_DEF_STMT (vdef); /* All stmts we visit are necessary. */ ! mark_operand_necessary (vdef); /* If the stmt lhs kills ref, then we can stop walking. */ if (gimple_has_lhs (def_stmt) --- 462,469 gimple *def_stmt = SSA_NAME_DEF_STMT (vdef); /* All stmts we visit are necessary. */ ! if (! gimple_clobber_p (def_stmt)) ! mark_operand_necessary (vdef); /* If the stmt lhs kills ref, then we can stop walking. */ if (gimple_has_lhs (def_stmt) *** mark_all_reaching_defs_necessary_1 (ao_r *** 584,590 } } ! mark_operand_necessary (vdef); return false; } --- 585,592 } } ! if (! gimple_clobber_p (def_stmt)) ! mark_operand_necessary (vdef); return false; } Index: gcc/testsuite/g++.dg/tree-ssa/pr69547.C === *** gcc/testsuite/g++.dg/tree-ssa/pr69547.C (revision 0) --- gcc/testsuite/g++.dg/tree-ssa/pr69547.C (working copy) *** *** 0 --- 1,15 + // { dg-do compile } + // { dg-options "-O2 -fdump-tree-cddce1" } + + struct A { A () { } }; + + void foo (void*, int); + + void bar () + { + enum { N = 64 }; + A a [N]; + foo (, N); + } + + // { dg-final { scan-tree-dump-not "if" "cddce1" } }
[Patch,microblaze]: Better register allocation to minimize the spill and fetch.
This patch improves the allocation of registers in the given function. The allocation is optimized for the conditional branches. The temporary register used in the conditional branches to store the comparison results and use of temporary in the conditional branch is optimized. Such temporary registers are allocated with a fixed register r18. Currently such temporaries are allocated with a free registers in the given function. Due to this one of the free register is reserved for the temporaries and given function is left with a few registers. This is unoptimized with respect to microblaze. In Microblaze r18 is marked as fixed and cannot be allocated to pseudos' in the given function. Instead r18 can be used as a temporary for the conditional branches with compare and branch. Use of r18 as a temporary for conditional branches will save one of the free registers to be allocated. The free registers can be used for other pseudos' and hence the better register allocation. The usage of r18 as above reduces the spill and fetch because of the availability of one of the free registers to other pseudos instead of being used for conditional temporaries. The advantage of the above is that the scope of the temporaries is limited to the conditional branches and hence the usage of r18 as temporary for such conditional branches is optimized and preserve the functionality of the function. Regtested for Microblaze target. Performance runs are done with Mibench/EEMBC benchmarks. Following gains are achieved. Benchmarks Gains automotive_qsort1 1.630730524% network_dijkstra 1.527506256% office_stringsearch 1 1.81356288% security_rijndael_d 3.26129357% basefp01_lite 4.465120185% a2time01_lite 1.893862857% cjpeg_lite 3.286496675% djpeg_lite 3.120150612% qos_lite 2.63964381% office_ispell 1.531340405% Code Size improvements: Reduction in number of instructions for Mibench : 12927. Reduction in number of instructions for EEMBC : 212. ChangeLog: 2016-01-29 Ajit Agarwal* config/microblaze/microblaze.c (microblaze_expand_conditional_branch): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. (microblaze_expand_conditional_branch_reg): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. (microblaze_expand_conditional_branch_sf): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. Signed-off-by:Ajit Agarwal ajit...@xilinx.com. --- gcc/config/microblaze/microblaze.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c index baff67a..b4277ad 100644 --- a/gcc/config/microblaze/microblaze.c +++ b/gcc/config/microblaze/microblaze.c @@ -3402,7 +3402,7 @@ microblaze_expand_conditional_branch (machine_mode mode, rtx operands[]) rtx cmp_op0 = operands[1]; rtx cmp_op1 = operands[2]; rtx label1 = operands[3]; - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); rtx condition; gcc_assert ((GET_CODE (cmp_op0) == REG) || (GET_CODE (cmp_op0) == SUBREG)); @@ -3439,7 +3439,7 @@ microblaze_expand_conditional_branch_reg (enum machine_mode mode, rtx cmp_op0 = operands[1]; rtx cmp_op1 = operands[2]; rtx label1 = operands[3]; - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); rtx condition; gcc_assert ((GET_CODE (cmp_op0) == REG) @@ -3483,7 +3483,7 @@ microblaze_expand_conditional_branch_sf (rtx operands[]) rtx condition; rtx cmp_op0 = XEXP (operands[0], 0); rtx cmp_op1 = XEXP (operands[0], 1); - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); emit_insn (gen_cstoresf4 (comp_reg, operands[0], cmp_op0, cmp_op1)); condition = gen_rtx_NE (SImode, comp_reg, const0_rtx); -- 1.7.1 Thanks & Regards Ajit better-reg-alloc.patch Description: better-reg-alloc.patch
Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.
On Fri, Jan 29, 2016 at 09:27:46AM +0100, Dominik Vogt wrote: > On Thu, Jan 28, 2016 at 03:41:29PM +0100, Jakub Jelinek wrote: > > On Thu, Jan 28, 2016 at 01:40:12PM +0100, Dominik Vogt wrote: > > > -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L > > > +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \ > > > + || (defined (__cplusplus) && __cplusplus >= 201103L) > > > > The formatting is wrong, there is a tab before || when it should be aligned > > below defined on the previous line. > > Attached. > gcc/ChangeLog > > PR c++/69462 > * ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for > C++-11. Ok, thanks. > --- a/gcc/ginclude/float.h > +++ b/gcc/ginclude/float.h > @@ -127,7 +127,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. > If not, see > #undef FLT_ROUNDS > #define FLT_ROUNDS 1 > > -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L > +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \ > + || (defined (__cplusplus) && __cplusplus >= 201103L) > /* The floating-point expression evaluation method. > -1 indeterminate > 0 evaluate all operations and constants just to the range and Jakub
Re: [PATCH] S/390: Use %qs in error messages.
On Fri, Jan 29, 2016 at 10:06:47AM +0100, Dominik Vogt wrote: > gcc/ChangeLog > > * config/s390/s390-c.c (s390_resolve_overloaded_builtin): Format > declaration name with %qs and print it in both error messages. Also > fix indentation. Applied. Thanks! -Andreas-
Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.
On Thu, Jan 28, 2016 at 03:41:29PM +0100, Jakub Jelinek wrote: > On Thu, Jan 28, 2016 at 01:40:12PM +0100, Dominik Vogt wrote: > > -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L > > +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \ > > + || (defined (__cplusplus) && __cplusplus >= 201103L) > > The formatting is wrong, there is a tab before || when it should be aligned > below defined on the previous line. Attached. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany gcc/ChangeLog PR c++/69462 * ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for C++-11. >From dcaa02429122bd66916caf730b34b0f86d9b1a8f Mon Sep 17 00:00:00 2001 From: Dominik VogtDate: Mon, 25 Jan 2016 11:59:06 +0100 Subject: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h. --- gcc/ginclude/float.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/ginclude/float.h b/gcc/ginclude/float.h index 18f5aac..862f7cc 100644 --- a/gcc/ginclude/float.h +++ b/gcc/ginclude/float.h @@ -127,7 +127,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #undef FLT_ROUNDS #define FLT_ROUNDS 1 -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \ + || (defined (__cplusplus) && __cplusplus >= 201103L) /* The floating-point expression evaluation method. -1 indeterminate 0 evaluate all operations and constants just to the range and -- 2.3.0
[PATCH] S/390: Use %qs in error messages.
The attached patch replaces %qs instead of %s in an error message, adds %qs to another and fixes indentation in one of the messages. Compiled and checked that no tests rely on the changed error messages on a zEC12. Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany gcc/ChangeLog * config/s390/s390-c.c (s390_resolve_overloaded_builtin): Format declaration name with %qs and print it in both error messages. Also fix indentation. >From 2455f3dd89b84e9e93ca30d0d52733f1f05a1802 Mon Sep 17 00:00:00 2001 From: Dominik VogtDate: Fri, 29 Jan 2016 09:58:55 +0100 Subject: [PATCH] S/390: Use %qs in error messages. --- gcc/config/s390/s390-c.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c index 2b6e405..cd3584b 100644 --- a/gcc/config/s390/s390-c.c +++ b/gcc/config/s390/s390-c.c @@ -904,13 +904,14 @@ s390_resolve_overloaded_builtin (location_t loc, if (last_match_type == INT_MAX) { - error_at (loc, "invalid parameter combination for intrinsic"); + error_at (loc, "invalid parameter combination for intrinsic %qs", + IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); return error_mark_node; } else if (num_matches > 1) { - error_at (loc, "ambiguous overload for intrinsic: %s", - IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); + error_at (loc, "ambiguous overload for intrinsic %qs", + IDENTIFIER_POINTER (DECL_NAME (ob_fndecl))); return error_mark_node; } -- 2.3.0
Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.
On Fri, Jan 29, 2016 at 09:27:46AM +0100, Dominik Vogt wrote: > gcc/ChangeLog > > PR c++/69462 > * ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for > C++-11. Applied. Thanks! -Andreas-
Re: [PATCH] Fix use of declare'd vars by routine procedures.
On Thu, Jan 28, 2016 at 12:26:38PM -0600, James Norris wrote: > I think the attached change is what you had in mind with > regard to doing the check at gimplification time. Nope, this is still a wrong location for that. If you look at the next line after the block you've added, you'll see if (gimplify_omp_ctxp && omp_notice_variable (gimplify_omp_ctxp, decl, true)) And that function fairly early calls is_global_var (decl). So you know already gimplify_omp_ctxp && is_global_var (decl), just put the rest into that block. The TREE_CODE (decl) == VAR_DECL check is VAR_P (decl). What do you want to achieve with > + && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl)) > + || (!TREE_STATIC (decl) && DECL_EXTERNAL (decl ? is_global_var already guarantees you that it is either TREE_STATIC or DECL_EXTERNAL, why is that not good enough? > diff --git a/trunk/gcc/gimplify.c b/trunk/gcc/gimplify.c > --- a/trunk/gcc/gimplify.c(revision 232802) > +++ b/trunk/gcc/gimplify.c(working copy) > @@ -1841,6 +1841,33 @@ >return GS_ERROR; > } > > + /* Validate variable for use within routine function. */ > + if (gimplify_omp_ctxp && gimplify_omp_ctxp->region_type == ORT_TARGET > + && get_oacc_fn_attrib (current_function_decl) If you only care about the implicit target region of acc routine, I think you also want to check that gimplify_omp_ctxp->outer_context == NULL. > + && TREE_CODE (decl) == VAR_DECL > + && is_global_var (decl) > + && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl)) > + || (!TREE_STATIC (decl) && DECL_EXTERNAL (decl > +{ > + location_t loc = DECL_SOURCE_LOCATION (decl); > + > + if (lookup_attribute ("omp declare target link", DECL_ATTRIBUTES > (decl))) > + { > + error_at (loc, > + "%qE with % clause used in %function", > + DECL_NAME (decl)); > + return GS_ERROR; > + } > + else if (!lookup_attribute ("omp declare target", DECL_ATTRIBUTES > (decl))) > + { > + error_at (loc, > + "storage class of %qE cannot be ", DECL_NAME (decl)); > + error_at (gimplify_omp_ctxp->location, > + "used in enclosing % function"); And I'm really confused by this error message. If you are complaining that the variable is not listed in acc declare clauses, why don't you say that? What does the error have to do with its storage class? Also, splitting one error into two is weird, usually there would be one error message and perhaps inform after it. Jakub
Re: [PATCH] Fix PR69547
On Fri, 29 Jan 2016, Jakub Jelinek wrote: > On Fri, Jan 29, 2016 at 09:40:48AM +0100, Richard Biener wrote: > > I am testing the following patch to fix a regression that we no longer > > remove some empty loops. > > Doesn't this mean that DCE will remove the clobbers as unnecessary, even > when they aren't in empty loops? No, as with other cases (we never mark clobbers as necessary during propagation) they will be retained if all required operands are still there (SSA names used in the LHS). So the idea is that they should not keeping other stuff live but remain in the IL if all uses are still live after DCE. Richard.
Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)
On 01/28/2016 03:15 PM, Jakub Jelinek wrote: + if (TREE_CODE (decl) == FUNCTION_DECL + && DECL_INITIAL (decl) == 0 + && DECL_EXTERNAL (decl) + && !TREE_PUBLIC (decl) + && !DECL_ARTIFICIAL (decl) + && !TREE_NO_WARNING (decl)) Do we need to check both DECL_INITIAL and DECL_EXTERNAL? Jason
Re: [C++ patch] report better diagnostic for static following '[' in parameter declaration
On 29 January 2016 at 05:03, Marek Polacekwrote: > On Fri, Jan 29, 2016 at 04:46:56AM +0530, Prathamesh Kulkarni wrote: >> @@ -19016,10 +19017,22 @@ cp_parser_direct_declarator (cp_parser* parser, >> cp_lexer_consume_token (parser->lexer); >> /* Peek at the next token. */ >> token = cp_lexer_peek_token (parser->lexer); >> + >> + /* If static keyword immediately follows [, report error. */ >> + if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC) >> + && current_binding_level->kind == sk_function_parms) >> + { >> + error_at (token->location, >> + "static array size is a C99 feature," >> + "not permitted in C++"); >> + bounds = error_mark_node; >> + } >> + > > I think this isn't sufficient as-is; if we're changing the diagnostics here, > we should also handle e.g. void f(int a[const 10]); where clang++ says > g.C:1:13: error: qualifier in array size is a C99 feature, not permitted in > C++ > > And also e.g. > void f(int a[const static 10]); > void f(int a[static const 10]); > and similar. Thanks for the review. AFAIK the type-qualifiers would be const, restrict, volatile and _Atomic (n1570 p 6.7.3) ? I added a check for those and for variable length array. I am having issues with writing the test-case, some cases pass with -std=c++11 but fail with -std=c++98. Could you please have a look ? Thanks, Prathamesh > > Marek diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 33f1df3..04137b3 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -982,6 +982,24 @@ cp_lexer_next_token_is_decl_specifier_keyword (cp_lexer *lexer) } } +static bool +cp_lexer_next_token_is_c_type_qual (cp_lexer *lexer) +{ + if (cp_lexer_next_token_is_keyword (lexer, RID_CONST) + || cp_lexer_next_token_is_keyword (lexer, RID_VOLATILE)) +return true; + + cp_token *token = cp_lexer_peek_token (lexer); + if (token->type == CPP_NAME) +{ + tree name = token->u.value; + const char *p = IDENTIFIER_POINTER (name); + return !strcmp (p, "restrict") || !strcmp (p, "_Atomic"); +} + + return false; +} + /* Returns TRUE iff the token T begins a decltype type. */ static bool @@ -18998,10 +19016,40 @@ cp_parser_direct_declarator (cp_parser* parser, cp_lexer_consume_token (parser->lexer); /* Peek at the next token. */ token = cp_lexer_peek_token (parser->lexer); + + /* If static or type-qualifier or * immediately follows [, +report error. */ + if (current_binding_level->kind == sk_function_parms) + { + if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)) + { + error_at (token->location, + "static array size is a C99 feature, " + "not permitted in C++"); + bounds = error_mark_node; + } + else if (cp_lexer_next_token_is_c_type_qual (parser->lexer)) + { + error_at (token->location, + "qualifier in array size is a C99 feature, " + "not permitted in C++"); + bounds = error_mark_node; + } + + else if (token->type == CPP_MULT) + { + error_at (token->location, + "variable-length array size is a C99 feature, " + "not permitted in C++"); + bounds = error_mark_node; + } + } + /* If the next token is `]', then there is no constant-expression. */ - if (token->type != CPP_CLOSE_SQUARE) + if (token->type != CPP_CLOSE_SQUARE && bounds != error_mark_node) { + bool non_constant_p; bounds = cp_parser_constant_expression (parser, diff --git a/gcc/testsuite/g++.dg/parse/static-array-error.C b/gcc/testsuite/g++.dg/parse/static-array-error.C new file mode 100644 index 000..028320d --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/static-array-error.C @@ -0,0 +1,33 @@ +// { dg-do compile } + +void f1(int a[static 10]); /* { dg-error "static array size is a C99 feature" } */ +/* { dg-error "expected '\\]' before 'static'" "" { target *-*-* } 3 } */ +/* { dg-error "expected '\\)' before 'static'" "" { target *-*-* } 3 } */ +/* { dg-error "expected initializer before 'static'" "" { target *-*-* } 3 } */ + +void f2(int a[const 10]); /* { dg-error "qualifier in array size is a C99 feature" } */ +/* { dg-error "expected '\\]' before 'const'" "" { target *-*-* } 8 } */ +/* { dg-error "expected '\\)' before 'const'" "" { target *-*-* } 8 } */ +/* { dg-error "expected initializer before numeric constant" "" { target *-*-* } 8 } */ + +void f3(int a[restrict 10]); /* { dg-error
[PATCH PR67921]Convert pointer expr to proper type before negating it
Hi, Function fold_binary_loc calls split_tree to split a tree into constant, literal and variable parts. Function split_tree deals with minus_expr by negating different parts into NEXGATE_EXPR. Since tree exprs fed to split_tree are with NOP conversions stripped, this could result in illegal expr for pointer expressions. Given below example as described by PR67921: op0: (4 - (sizetype) ) code: MINUS_EXPR op1: (sizetype)b fold_binary_loc calls split_tree for both op0 and op1 and gets below from the function calls (it also flips the code): op0: 4, -(sizetype) code: PLUS_EXPR op1: -b Here we generate NEGATIVE_EXPR of pointer variable (b) which is illegal. If "-b" can not be canceled by following call to associate_trees, it will be passed along in IR resulting in ICE somewhere. This patch fixes it by converting pointer expression to proper type before negating it. Note the proper type is the outer type stripped in fold_binary_loc before calling split_tree. I also included a test which is heavily reduced from the original ffmpeg code in the original PR. Considering it's stage4, I restricted the patch to the smallest change. As a matter of fact, we may need to do the same thing for signed int types because -TYPE_MIN is undefined. Unfortunately, I failed to create a test in this case. Bootstrap and test on x64_64, is it OK? 2016-01-27 Bin ChengPR tree-optimization/67921 * fold-const.c (split_tree): New parameters. Convert pointer type variable part to proper type before negating. (fold_binary_loc): Pass new arguments to split_tree. gcc/testsuite/ChangeLog 2016-01-27 Bin Cheng PR tree-optimization/67921 * c-c++-common/ubsan/pr67921.c: New test. diff --git a/gcc/fold-const.c b/gcc/fold-const.c index bece8d7..e34bc81 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -109,7 +109,8 @@ enum comparison_code { static bool negate_expr_p (tree); static tree negate_expr (tree); -static tree split_tree (tree, enum tree_code, tree *, tree *, tree *, int); +static tree split_tree (location_t, tree, tree, enum tree_code, + tree *, tree *, tree *, int); static tree associate_trees (location_t, tree, tree, enum tree_code, tree); static enum comparison_code comparison_to_compcode (enum tree_code); static enum tree_code compcode_to_comparison (enum comparison_code); @@ -767,7 +768,10 @@ negate_expr (tree t) literal for which we use *MINUS_LITP instead. If NEGATE_P is true, we are negating all of IN, again except a literal - for which we use *MINUS_LITP instead. + for which we use *MINUS_LITP instead. If a variable part is of pointer + type, it is negated after converting to TYPE. This prevents us from + generating illegal MINUS pointer expression. LOC is the location of + the converted variable part. If IN is itself a literal or constant, return it as appropriate. @@ -775,8 +779,8 @@ negate_expr (tree t) same type as IN, but they will have the same signedness and mode. */ static tree -split_tree (tree in, enum tree_code code, tree *conp, tree *litp, - tree *minus_litp, int negate_p) +split_tree (location_t loc, tree in, tree type, enum tree_code code, + tree *conp, tree *litp, tree *minus_litp, int negate_p) { tree var = 0; @@ -833,7 +837,12 @@ split_tree (tree in, enum tree_code code, tree *conp, tree *litp, if (neg_conp_p) *conp = negate_expr (*conp); if (neg_var_p) - var = negate_expr (var); + { + /* Convert to TYPE before negating a pointer type expr. */ + if (var && POINTER_TYPE_P (TREE_TYPE (var))) + var = fold_convert_loc (loc, type, var); + var = negate_expr (var); + } } else if (TREE_CODE (in) == BIT_NOT_EXPR && code == PLUS_EXPR) @@ -854,6 +863,9 @@ split_tree (tree in, enum tree_code code, tree *conp, tree *litp, else if (*minus_litp) *litp = *minus_litp, *minus_litp = 0; *conp = negate_expr (*conp); + /* Convert to TYPE before negating a pointer type expr. */ + if (var && POINTER_TYPE_P (TREE_TYPE (var))) + var = fold_convert_loc (loc, type, var); var = negate_expr (var); } @@ -9621,9 +9633,10 @@ fold_binary_loc (location_t loc, then the result with variables. This increases the chances of literals being recombined later and of generating relocatable expressions for the sum of a constant and literal. */ - var0 = split_tree (arg0, code, , , _lit0, 0); - var1 = split_tree (arg1, code, , , _lit1, -code == MINUS_EXPR); + var0 = split_tree (loc, arg0, type, code, +, , _lit0, 0); + var1 = split_tree (loc, arg1, type, code, +, , _lit1, code == MINUS_EXPR); /* Recombine
Re: Default compute dimensions
On 01/29/16 10:18, Jakub Jelinek wrote: On Thu, Jan 28, 2016 at 10:38:51AM -0500, Nathan Sidwell wrote: This patch adds default compute dimension handling. Users rarely specify compute dimensions, expecting the toolchain to DTRT. More savvy users would like to specify global defaults. This patch permits both. Isn't it better to be able to override the defaults on the library side? I mean, when when somebody is compiling the code, often he doesn't know the exact properties of the hw it will be run on, if he does, I think it is better to specify them explicitly in the code. But if he doesn't, one just has to hope libgomp will figure out the best defaults. So, wouldn't it be better to add some env var that would allow to control this instead? You have anticipated part 2 of this patch, which would allow a default to be deferred to runtime in the manner you describe. Generally, one can know at compile time the upper bound on workers (it's part of the chip specification), but the number of physical gangs depends on the accelerator card. (That is true for PTX and IIUC for other GPGPUs too.) So, you may want defer num gangs to runtime -- but of course then you lose constant folding opportunities. nathan
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
On 01/29/2016 04:41 PM, Jakub Jelinek wrote: On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote: I think a better approach might be to just mark accesses at known locations in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with non-constant or calculated offsets as potentially trapping. I don't see how that would work generally. Sure, if there is e.g. a constant offset array access, it could be checked easily, but if there is variable offset array access, that is at some point later on changed into a constant offset access, you'd need to be conservative, unless you can prove it is in range. Yes. What is the problem with that? If we have (plus sfp const_int) at any point before reload, we can check whether that offset is inside frame_size. If it isn't or if the offset isn't known, it could trap. Bernd
[PATCH] [graphite] document that isl-0.16 is supported
* config/isl.m4: Add comments about isl-0.16. * configure: Regenerate. gcc/ * doc/install.texi: Document that isl-0.16 is supported. --- config/isl.m4| 6 +++--- configure| 12 ++-- gcc/doc/install.texi | 2 +- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/config/isl.m4 b/config/isl.m4 index 0103f1f..92524af 100644 --- a/config/isl.m4 +++ b/config/isl.m4 @@ -106,7 +106,7 @@ AC_DEFUN([ISL_CHECK_VERSION], LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs} ${gmplibs}" LIBS="${_isl_saved_LIBS} -lisl -lgmp" -AC_MSG_CHECKING([for isl 0.15 (or deprecated 0.14)]) +AC_MSG_CHECKING([for isl 0.16, 0.15, or deprecated 0.14]) AC_TRY_LINK([#include ], [isl_ctx_get_max_operations (isl_ctx_alloc ());], [gcc_cv_isl=yes], @@ -114,10 +114,10 @@ AC_DEFUN([ISL_CHECK_VERSION], AC_MSG_RESULT([$gcc_cv_isl]) if test "${gcc_cv_isl}" = no ; then - AC_MSG_RESULT([recommended isl version is 0.15, minimum required isl version 0.14 is deprecated]) + AC_MSG_RESULT([recommended isl version is 0.16 or 0.15, the minimum required isl version 0.14 is deprecated]) fi -AC_MSG_CHECKING([for isl-0.15]) +AC_MSG_CHECKING([for isl 0.16 or 0.15]) AC_TRY_LINK([#include ], [isl_options_set_schedule_serialize_sccs (NULL, 0);], [ac_has_isl_options_set_schedule_serialize_sccs=yes], diff --git a/configure b/configure index b9a4b51..89c863c 100755 --- a/configure +++ b/configure @@ -6021,8 +6021,8 @@ $as_echo "$as_me: WARNING: using in-tree isl, disabling version check" >&2;} LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs} ${gmplibs}" LIBS="${_isl_saved_LIBS} -lisl -lgmp" -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.15 (or deprecated 0.14)" >&5 -$as_echo_n "checking for isl 0.15 (or deprecated 0.14)... " >&6; } +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.16, 0.15, or deprecated 0.14" >&5 +$as_echo_n "checking for isl 0.16, 0.15, or deprecated 0.14... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include @@ -6045,12 +6045,12 @@ rm -f core conftest.err conftest.$ac_objext \ $as_echo "$gcc_cv_isl" >&6; } if test "${gcc_cv_isl}" = no ; then - { $as_echo "$as_me:${as_lineno-$LINENO}: result: recommended isl version is 0.15, minimum required isl version 0.14 is deprecated" >&5 -$as_echo "recommended isl version is 0.15, minimum required isl version 0.14 is deprecated" >&6; } + { $as_echo "$as_me:${as_lineno-$LINENO}: result: recommended isl version is 0.16 or 0.15, the minimum required isl version 0.14 is deprecated" >&5 +$as_echo "recommended isl version is 0.16 or 0.15, the minimum required isl version 0.14 is deprecated" >&6; } fi -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl-0.15" >&5 -$as_echo_n "checking for isl-0.15... " >&6; } +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.16 or 0.15" >&5 +$as_echo_n "checking for isl 0.16 or 0.15... " >&6; } cat confdefs.h - <<_ACEOF >conftest.$ac_ext /* end confdefs.h. */ #include diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 062f42c..3df7974 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -383,7 +383,7 @@ installed but it is not in your default library search path, the @option{--with-mpc} configure option should be used. See also @option{--with-mpc-lib} and @option{--with-mpc-include}. -@item isl Library version 0.15 or 0.14. +@item isl Library version 0.16, 0.15, or 0.14. Necessary to build GCC with the Graphite loop optimizations. It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/}. -- 2.5.0
Re: [PATCH, RFC] New memory usage statistics infrastructure
On Thu, May 28, 2015 at 1:07 PM, Jeff Lawwrote: > On 05/28/2015 06:29 AM, Martin Liška wrote: > >>> >> >> Hello. >> >> Thank you for pointing about missing copyright. >> Following patch adds that. >> >> Ready for trunk? > > Yes. > jeff > It looks like this patch was never committed. gcc/mem-stats.h and gcc/mem-stats-traits.h don't yet have copyright headers.
[RS6000] ABI_V4 init of toc section
Since 4c4a180d, LTO has turned off flag_pic when linking a fixed position executable. This results in flag_pic being zero in rs6000_file_start, and no definition of ".LCTOC1". However, when we get to actually emitting code, flag_pic may be on again, and references made to ".LCTOC1". How flag_pic comes to be enabled again is quite a story. It goes like this.. If a function is compiled with -fPIC then sysv4.h SUBTARGET_OVERRIDE_OPTIONS will set TARGET_RELOCATABLE. Conversely, if TARGET_RELOCATABLE is set and flag_pic is zero, then SUBTARGET_OVERRIDE_OPTIONS will set flag_pic=2. It also happens that TARGET_RELOCATABLE is a bit in rs6000_isa_flags, which is handled by rs6000_function_specific_save and rs6000_function_specific_restore. That last fact means lto streaming keeps track of the state of TARGET_RELOCATABLE for functions, and when options are restored for a given function we'll set flag_pic=2 if the function was originally compiled with -fPIC. That's bad because it defeats the purpose of the 4c4a180d lto change, resulting in worse optimization of ppc32 executables. What's more, we don't seem to turn off flag_pic once it is on. We should really untangle the flag_pic/TARGET_RELOCATABLE mess, but that change is probably a little dangerous for stage4. Instead, this patch removes the toc symbol initialization from file_start and does so when the first item is emitted to the toc, or after the function epilogue in the cases where we emit code to initialize a toc pointer but don't actually use it (-O0 mostly, I think). Bootstrapped and regression tested powerpc64-linux biarch with all languages enabled. OK to apply? PR target/68662 * config/rs6000/rs6000.c (need_toc_init): New var, set it whenever toc_label_name used. (rs6000_file_start): Don't set up toc section here, (rs6000_output_function_epilogue): do so here instead, (rs6000_xcoff_file_start): and here. * config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init. (load_toc_aix_di): Likewise. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 4704d00..4ea4efb 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -209,7 +209,7 @@ tree rs6000_builtin_types[RS6000_BTI_MAX]; tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT]; /* Flag to say the TOC is initialized */ -int toc_initialized; +int toc_initialized, need_toc_init; char toc_label_name[10]; /* Cached value of rs6000_variable_issue. This is cached in @@ -5682,13 +5682,6 @@ rs6000_file_start (void) if (DEFAULT_ABI == ABI_ELFv2) fprintf (file, "\t.abiversion 2\n"); - - if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2 - || (TARGET_ELF && flag_pic == 2)) -{ - switch_to_section (toc_section); - switch_to_section (text_section); -} } @@ -20375,6 +20368,7 @@ rs6000_output_addr_const_extra (FILE *file, rtx x) { putc ('-', file); assemble_name (file, toc_label_name); + need_toc_init = 1; } else if (TARGET_ELF) fputs ("@toc", file); @@ -24003,7 +23997,10 @@ rs6000_emit_load_toc_table (int fromprolog) ASM_GENERATE_INTERNAL_LABEL (buf, "L", CODE_LABEL_NUMBER (lab)); lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf)); if (flag_pic == 2) - got = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name)); + { + got = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name)); + need_toc_init = 1; + } else got = rs6000_got_sym (); tmp1 = tmp2 = dest; @@ -24048,6 +24045,7 @@ rs6000_emit_load_toc_table (int fromprolog) rtx tocsym, lab; tocsym = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name)); + need_toc_init = 1; lab = gen_label_rtx (); emit_insn (gen_load_toc_v4_PIC_1b (tocsym, lab)); emit_move_insn (dest, gen_rtx_REG (Pmode, LR_REGNO)); @@ -24062,6 +24060,7 @@ rs6000_emit_load_toc_table (int fromprolog) /* This is for AIX code running in non-PIC ELF32. */ rtx realsym = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name)); + need_toc_init = 1; emit_insn (gen_elf_high (dest, realsym)); emit_insn (gen_elf_low (dest, dest, realsym)); } @@ -27598,6 +27597,17 @@ rs6000_output_function_epilogue (FILE *file, fputs ("\t.align 2\n", file); } + + /* Arrange to define .LCTOC1 label, if not already done. */ + if (need_toc_init) +{ + need_toc_init = 0; + if (!toc_initialized) + { + switch_to_section (toc_section); + switch_to_section (current_function_section ()); + } +} } /* -fsplit-stack support. */ @@ -31745,6 +31755,7 @@ rs6000_elf_declare_function_name (FILE *file, const char *name, tree decl) fprintf (file, "\t.long "); assemble_name (file, toc_label_name); + need_toc_init = 1; putc ('-', file);
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote: > I'm on the fence; I do think the original problem is an issue we should fix, > but I'm also not terribly happy with the implementation we have right now. The fact that it has been only reported from generated testcases only means we are lucky nobody encountered it yet in real-world programs. Plus, we need to be thankful to people working on those generators that keep reporting GCC bugs, they significantly improve the compiler. > Besides the issues already mentioned, doesn't it kind of assume these > offsets are constant (which they definitely aren't, consider arg pushes for > example)? Sure, for some registers the offsets aren't constant. In some cases we e.g. have REG_ARGS_SIZE notes, but in other cases don't and don't have anything else that would help us understand the difference between current sp value and one at the end of the prologue or so. > I think a better approach might be to just mark accesses at known locations > in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with > non-constant or calculated offsets as potentially trapping. I don't see how that would work generally. Sure, if there is e.g. a constant offset array access, it could be checked easily, but if there is variable offset array access, that is at some point later on changed into a constant offset access, you'd need to be conservative, unless you can prove it is in range. Or perhaps we could also use some other flag (or turn it into __builtin_trap or __builtin_unreachable or whatever) to mark MEMs that are always invalid if executed. Jakub
Re: [off-list] Re: [PATCH PR68542]
On Fri, Jan 29, 2016 at 3:13 PM, Yuri Rumyantsevwrote: > Uros, > > Here is update patch which includes (1) couple changes proposed by > Richard in tree-vect-loop.c and (2) the changes in back-end proposed > by you. > > Is it OK for trunk? > Bootstrap and regression testing dis not show any new failures. > > ChangeLog: > > 2016-01-29 Yuri Rumyantsev > > PR middle-end/68542 > * config/i386/i386.c (ix86_expand_branch): Add support for conditional > branch with vector comparison. > *config/i386/sse.md (Vi48_AVX): New mode iterator. > (define_expand "cbranch4): Add support for conditional branch > with vector comparison. > * tree-vect-loop.c (optimize_mask_stores): New function. > * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize > has_mask_store field of vect_info. > * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for > vectorized loops having masked stores after vec_info destroy. > * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and > correspondent macros. > (optimize_mask_stores): Add prototype. > > gcc/testsuite/ChangeLog: > * gcc.dg/vect/vect-mask-store-move-1.c: New test. > * testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c: Likewise. +(define_mode_iterator Vi48_AVX + [(V4SI "TARGET_AVX") (V2DI "TARGET_AVX") + (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")]) + Please name this iterator with all caps: VI48_AVX. Also, there is no need for a condition at V4SI and V2DI: (define_mode_iterator VI48_AVX [V4SI V2DI (V8SI "TARGET_AVX") (V4DI "TARGET_AVX") x86 part is OK with the above change. Thanks, Uros.
Re: [PATCH] s390: Add -fsplit-stack support
On 29/01/16 14:33, Andreas Krebbel wrote: Hi Marcin, sorry for the late feedback. A few comments regarding the split stack implementation: The GNU coding style requires to replace every 8 leading blanks on a line with a tab. There are many lines in your patch violating this. In case you are an emacs user `whitespace-cleanup' will fix this for you. OK, will do. Could you please add a testcase checking the different variants. I.e. with early exit, no-alloc in __morestack, and with an actual allocation? The testsuite with -fsplit-stack already hits all of them, and checking them manually is rather tricky (I don't know if it could be done in target-independent way at all), but I think it'd be reasonable to make assembly testcases calling __morestack for the last two cases, to check if the registers are being preserved, etc. There are a few more comments inline. Bye, -Andreas- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c881d52..71f6f38 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,38 @@ 2016-01-16 Marcin Kościelnicki+ * common/config/s390/s390-common.c (s390_supports_split_stack): + New function. + (TARGET_SUPPORTS_SPLIT_STACK): New macro. + * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue. + * config/s390/s390.c (struct machine_function): New field + split_stack_varargs_pointer. + (s390_register_info): Mark r12 as clobbered if it'll be used as temp + in s390_emit_prologue. + (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack + vararg pointer. + (morestack_ref): New global. + (SPLIT_STACK_AVAILABLE): New macro. + (s390_expand_split_stack_prologue): New function. + (s390_expand_split_stack_call): New function. + (s390_live_on_entry): New function. + (s390_va_start): Use split-stack vararg pointer if appropriate. + (s390_reorg): Lower the split-stack pseudo-insns. + (s390_asm_file_end): Emit the split-stack note sections. + (TARGET_EXTRA_LIVE_ON_ENTRY): New macro. + * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec. + (UNSPECV_SPLIT_STACK_CALL): New unspec. + (UNSPECV_SPLIT_STACK_SIBCALL): New unspec. + (UNSPECV_SPLIT_STACK_MARKER): New unspec. + (split_stack_prologue): New expand. + (split_stack_call_*): New insn. + (split_stack_cond_call_*): New insn. + (split_stack_space_check): New expand. + (split_stack_sibcall_*): New insn. + (split_stack_cond_sibcall_*): New insn. + (split_stack_marker): New insn. + +2016-01-02 Marcin Kościelnicki + * cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps with side effects. diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c index 4519c21..1e497e6 100644 --- a/gcc/common/config/s390/s390-common.c +++ b/gcc/common/config/s390/s390-common.c @@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED, } } +/* -fsplit-stack uses a field in the TCB, available with glibc-2.23. + We don't verify it, since earlier versions just have padding at + its place, which works just as well. */ + +static bool +s390_supports_split_stack (bool report ATTRIBUTE_UNUSED, + struct gcc_options *opts ATTRIBUTE_UNUSED) +{ + return true; +} + #undef TARGET_DEFAULT_TARGET_FLAGS #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT) @@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED, #undef TARGET_OPTION_INIT_STRUCT #define TARGET_OPTION_INIT_STRUCT s390_option_init_struct +#undef TARGET_SUPPORTS_SPLIT_STACK +#define TARGET_SUPPORTS_SPLIT_STACK s390_supports_split_stack + struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER; diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h index 633bc1e..09032c9 100644 --- a/gcc/config/s390/s390-protos.h +++ b/gcc/config/s390/s390-protos.h @@ -42,6 +42,7 @@ extern bool s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED, extern HOST_WIDE_INT s390_initial_elimination_offset (int, int); extern void s390_emit_prologue (void); extern void s390_emit_epilogue (bool); +extern void s390_expand_split_stack_prologue (void); extern bool s390_can_use_simple_return_insn (void); extern bool s390_can_use_return_insn (void); extern void s390_function_profiler (FILE *, int); diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 3be64de..6afce7c 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -426,6 +426,13 @@ struct GTY(()) machine_function /* True if the current function may contain a tbegin clobbering FPRs. */ bool tbegin_p; + + /* For -fsplit-stack support: A stack local which holds a pointer to + the stack arguments for a function with a variable number of + arguments. This is set at the
Re: [PATCH] s390: Add -fsplit-stack support
On 01/29/2016 04:43 PM, Marcin Kościelnicki wrote: > The testsuite with -fsplit-stack already hits all of them, and checking > them manually is rather tricky (I don't know if it could be done in > target-independent way at all), but I think it'd be reasonable to make > assembly testcases calling __morestack for the last two cases, to check > if the registers are being preserved, etc. Sounds good. Thanks! ... >>> + if (frame_size <= 0x7fff || (TARGET_EXTIMM && frame_size <= 0xu)) >> The agfi immediate value is a signed 32 bit integer. So you can only >> add up to 2G-1. I think it would be more readable to write this as: > > We're emitting ALGFI here, which accepts unsigned 32-bit integer. Ah right. Then it would be: if (CONST_OK_FOR_K (frame_size) || CONST_OK_FOR_Op (frame_size)) instead. >> >> if (CONST_OK_FOR_K (frame_size) || CONST_OK_FOR_Os (frame_size)) >> >> as in s390_emit_prologue. The Os check will check for TARGET_EXTIMM as well. > > Alright. ... >> I'm wondering if it is really necessary to expand the call in that >> two-step approach?! We do the general literal pool handling in >> s390_reorg because we need all the insn lengths to be finalized before >> performing the branch/pool splitting loop. But this shouldn't be necessary >> in this case. Would it be possible to expand the call already in >> emit_prologue phase and get rid of the s390_reorg part? > > There's an internal literal pool involved, which needs to be emitted as > one chunk. Optimizations are also very likely to destroy the sequence: > consider the target address that __morestack will call - the control > flow change happens in __morestack jump instruction, but the address > itself is encoded in one of the pool literals. Just not worth the risk. Ok. ... >>> + # OK, no stack allocation needed. We still follow the protocol and >>> + # call our caller - it doesn't cost much and makes sure vararg works. >>> + # No need to set any registers here - %r0 and %r2-%r6 weren't modified. >>> + basr%r14, %r10 # Call our caller. >> The comment confuses me. It somewhat sounds to me like the call >> wouldn't be really needed but in fact it cannot even remotely work >> without jumping back to the function body right?! > > Certainly. __morestack's task is to call the given function entry point > once the necessary stack space is established. In fact, in the no > allocation case, a sibling-call would actually be possible, if it > weren't for one annoying detail: there are no free GPRs we could use to > keep the address of the entry point - %r0 may be used to keep static > chain, %r1 may have to be the argument pointer, %r2-%r5 may be used to > keep parameters, and %r6-%r15 are callee-saved. Ok. The comment isn't about no-call vs. call it is about sibcall vs. call - got it. Bye, -Andreas-
[PING] Re: [RFC][PATCH, ARM 8/8] Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic
On 26/12/15 01:59, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch adds support ARMv8-M's Security Extension's cmse_nonsecure_caller intrinsic. This intrinsic is used to check whether an entry function was called from a non-secure state. See Section 5.4.3 of ARM®v8-M Security Extensions: Requirements on Development Tools (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html) for further details. *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm-builtins.c (arm_builtins): Define ARM_BUILTIN_CMSE_NONSECURE_CALLER. (bdesc_2arg): Add line for cmse_nonsecure_caller. (arm_init_builtins): Init for cmse_nonsecure_caller. (arm_expand_builtin): Handle cmse_nonsecure_caller. * gcc/config/arm/arm_cmse.h (cmse_nonsecure_caller): New. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-1.c: Added test for cmse_nonsecure_caller. diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 11cd17d0b8f3c29ccbe16cb463a17d55ba0fa1e3..7934cf1d4d96c40255d3e93dc9902b4568014984 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -515,6 +515,8 @@ enum arm_builtins ARM_BUILTIN_GET_FPSCR, ARM_BUILTIN_SET_FPSCR, + ARM_BUILTIN_CMSE_NONSECURE_CALLER, + #undef CRYPTO1 #undef CRYPTO2 #undef CRYPTO3 @@ -1263,6 +1265,10 @@ static const struct builtin_description bdesc_2arg[] = FP_BUILTIN (set_fpscr, SET_FPSCR) #undef FP_BUILTIN + {ARM_FSET_MAKE_CPU2 (FL2_CMSE), CODE_FOR_andsi3, + "__builtin_arm_cmse_nonsecure_caller", ARM_BUILTIN_CMSE_NONSECURE_CALLER, + UNKNOWN, 0}, + #define CRC32_BUILTIN(L, U) \ {ARM_FSET_EMPTY, CODE_FOR_##L, "__builtin_arm_"#L, \ ARM_BUILTIN_##U, UNKNOWN, 0}, @@ -1797,6 +1803,17 @@ arm_init_builtins (void) = add_builtin_function ("__builtin_arm_stfscr", ftype_set_fpscr, ARM_BUILTIN_SET_FPSCR, BUILT_IN_MD, NULL, NULL_TREE); } + + if (arm_arch_cmse) +{ + tree ftype_cmse_nonsecure_caller + = build_function_type_list (unsigned_type_node, NULL); + arm_builtin_decls[ARM_BUILTIN_CMSE_NONSECURE_CALLER] + = add_builtin_function ("__builtin_arm_cmse_nonsecure_caller", + ftype_cmse_nonsecure_caller, + ARM_BUILTIN_CMSE_NONSECURE_CALLER, BUILT_IN_MD, + NULL, NULL_TREE); +} } /* Return the ARM builtin for CODE. */ @@ -2356,6 +2373,14 @@ arm_expand_builtin (tree exp, emit_insn (pat); return target; +case ARM_BUILTIN_CMSE_NONSECURE_CALLER: + icode = CODE_FOR_andsi3; + target = gen_reg_rtx (SImode); + op0 = arm_return_addr (0, NULL_RTX); + pat = GEN_FCN (icode) (target, op0, const1_rtx); + emit_insn (pat); + return target; + case ARM_BUILTIN_TEXTRMSB: case ARM_BUILTIN_TEXTRMUB: case ARM_BUILTIN_TEXTRMSH: diff --git a/gcc/config/arm/arm_cmse.h b/gcc/config/arm/arm_cmse.h index ab20a3ec46025f268a1e9bed895d27da9af7aab6..0bdff668d03d54e1acf2bdd3b5ff1bfb2b463bd8 100644 --- a/gcc/config/arm/arm_cmse.h +++ b/gcc/config/arm/arm_cmse.h @@ -163,6 +163,13 @@ __attribute__ ((__always_inline__)) cmse_TTAT (void *p) CMSE_TT_ASM (at) +//TODO: diagnose use outside cmse_nonsecure_entry functions +__extension__ static __inline int __attribute__ ((__always_inline__)) +cmse_nonsecure_caller (void) +{ + return __builtin_arm_cmse_nonsecure_caller (); +} + #define CMSE_AU_NONSECURE 2 #define CMSE_MPU_NONSECURE16 #define CMSE_NONSECURE18 diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c index 1c3d4e9e934f4b1166d4d98383cf4ae8c3515117..ccecf396d3cda76536537b4d146bbb5f70589fd5 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c +++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c @@ -66,3 +66,32 @@ int foo (char * p) /* { dg-final { scan-assembler-times "ttat " 2 } } */ /* { dg-final { scan-assembler-times "bl.cmse_check_address_range" 7 } } */ /* { dg-final { scan-assembler-not "cmse_check_pointed_object" } } */ + +typedef int (*int_ret_funcptr_t) (void); +typedef int __attribute__ ((cmse_nonsecure_call)) (*int_ret_nsfuncptr_t) (void); + +int __attribute__ ((cmse_nonsecure_entry)) +baz (void) +{ + return cmse_nonsecure_caller (); +} + +int __attribute__ ((cmse_nonsecure_entry)) +qux (int_ret_funcptr_t int_ret_funcptr) +{ + int_ret_nsfuncptr_t int_ret_nsfunc_ptr; + + if (cmse_is_nsfptr (int_ret_funcptr)) +{ + int_ret_nsfunc_ptr = cmse_nsfptr_create (int_ret_funcptr); + return int_ret_nsfunc_ptr (); +} + return 0; +} +/* {
[PATCHv2] Re: [RFC][PATCH, ARM 7/8] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call]
On 19/01/16 15:28, Andre Vieira (lists) wrote: On 16/01/16 14:49, Senthil Kumar Selvaraj wrote: User-agent: mu4e 0.9.13; emacs 24.5.1 Hi, Apologies for the bad posting style (I don't have the original email handy), but shouldn't _gnu_cmse_nonsecure_call be defined with the .global directive in the below hunk (to make it visible when linking)? diff --git a/libgcc/config/arm/cmse_nonsecure_call.S b/libgcc/config/arm/cm= se_nonsecure_call.S new file mode 100644 index ..bdc140f5bbe87c6599db225b1b9= b7bbc7d606710 --- /dev/null +++ b/libgcc/config/arm/cmse_nonsecure_call.S @@ -0,0 +1,87 @@ +.syntax unified +.thumb +__gnu_cmse_nonsecure_call: Right now, it ends up as a local symbol, and compiling and linking a program with cmse_nonsecure_call (say cmse-11.c), results in a linker error - the linker doesn't find the symbol even if it is present in libgcc.a. I found the problem that way - dumping symbols for my variant of libgcc.a and grepping showed the symbol to be available but local. Regards Senthil Hi Senthil, Thanks for catching that! Cheers, Andre Hi there, Added missing global symbol. Is this OK? Cheers, Andre *** gcc/ChangeLog *** 2016-01-29 Andre VieiraThomas Preud'homme * gcc/config/arm/arm.c (detect_cmse_nonsecure_call): New. (cmse_nonsecure_call_clear_caller_saved): New. * gcc/config/arm/arm-protos.h (detect_cmse_nonsecure_call): New. * gcc/config/arm/arm.md (call): Handle cmse_nonsecure_entry. (call_value): Likewise. (nonsecure_call_internal): New. (nonsecure_call_value_internal): New. * gcc/config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New. (*nonsecure_call_value_reg_thumb1_v5): New. * gcc/config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New. (*nonsecure_call_value_reg_thumb2): New. * gcc/config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New. * libgcc/config/arm/cmse_nonsecure_call.S: New. * libgcc/config/arm/t-arm: Compile cmse_nonsecure_call.S *** gcc/testsuite/ChangeLog *** 2016-01-29 Andre Vieira Thomas Preud'homme * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 4fb4261794668752a8224e2d4a2363162ae9cb94..402313c5f4aeb9d2d26ea7d4a0412609142d490b 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -132,6 +132,7 @@ extern int arm_const_double_inline_cost (rtx); extern bool arm_const_double_by_parts (rtx); extern bool arm_const_double_by_immediates (rtx); extern void arm_emit_call_insn (rtx, rtx, bool); +bool detect_cmse_nonsecure_call (tree); extern const char *output_call (rtx *); void arm_emit_movpair (rtx, rtx); extern const char *output_mov_long_double_arm_from_arm (rtx *); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index da33ba1136b97c5f534c135e6d39f8b5777b3f36..153c746ad1910ad8ea7527e74369930ca14d2594 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -17417,6 +17417,129 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT address, int do_pushes) return; } +/* Saves callee saved registers, clears callee saved registers and caller saved + registers not used to pass arguments before a cmse_nonsecure_call. And + restores the callee saved registers after. */ + +static void +cmse_nonsecure_call_clear_caller_saved (void) +{ + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) +{ + rtx_insn *insn; + + FOR_BB_INSNS (bb, insn) + { + uint64_t to_clear_mask, float_mask; + rtx_insn *seq; + rtx pat, call,
Enabling -frename-registers?
So PR57193 has an example of sub-optimal code generation, with some unnecessary register moves left after LRA. These seem to be difficult to prevent, but last year Robert Suchanek made some modifications to regrename that allow it to clean up such cases. Enabling -frename-registers removes one of the two unnecessary copies, and I'm pretty sure I could make it eliminate the other one as well with a bit more work. Hence, this patch. The renamer has seen a lot of fixes over the years and should be in pretty good shape IMO. Still, I won't deny that this is a bit riskier than the usual bugfix patch at this stage. Bootstrapped and tested on x86_64-linux, with my earlier patch to fix some i386 tests. Thoughts? Should we do this for gcc-7 at least? Bernd PR rtl-optimization/57193 * opts.c (default_options_table): Add OPT_frename_registers at -O2 and above. * doc/invoke.texi (-frename-registers): Update documentation. Index: gcc/opts.c === --- gcc/opts.c (revision 232689) +++ gcc/opts.c (working copy) @@ -498,6 +498,7 @@ static const struct default_options defa { OPT_LEVELS_2_PLUS, OPT_fstrict_overflow, NULL, 1 }, { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_freorder_blocks_algorithm_, NULL, REORDER_BLOCKS_ALGORITHM_STC }, +{ OPT_LEVELS_2_PLUS, OPT_frename_registers, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 }, Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 232689) +++ gcc/doc/invoke.texi (working copy) @@ -8467,7 +8467,8 @@ debug information format adopted by the make debugging impossible, since variables no longer stay in a ``home register''. -Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}. +Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}, +and also enabled at levels @option{-O2} and @option{-O3}. @item -fschedule-fusion @opindex fschedule-fusion
Re: RFA: patch to fix PR69299
On 01/28/2016 09:07 AM, Vladimir Makarov wrote: On 01/28/2016 08:05 AM, Jakub Jelinek wrote: On Wed, Jan 27, 2016 at 04:01:23PM -0500, Vladimir Makarov wrote: The following patch fixes PR69299. The details of the problem is described on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69299 The patch was successfully bootstrapped and tested on x86/x86-64. The patch introduces a new type of constraints define_special_memory_constraint for memory constraints whose address reload can not make memory to satisfy the constraint. It is useful when specifically aligned memory is necessary or desirable. I don't know what is the best name for this constraint. I use special_memory_constraint but it could be more specific, e.g. aligned_memory_constraint. Please let me know what is the best name for you. Is the patch ok to commit? I support the general idea and for naming will defer to Jeff, I actually like the name special_memory_constraint. I'm sure it will eventually be use for more than just aligned memories. Off the top of my head I can imagine constraints for specific address spaces. --- ira-costs.c(revision 232571) +++ ira-costs.c(working copy) @@ -777,6 +777,7 @@ record_reg_classes (int n_alts, int n_op break; case CT_MEMORY: +case CT_SPECIAL_MEMORY: /* Every MEM can be reloaded to fit. */ insn_allows_mem[i] = allows_mem[i] = 1; if (MEM_P (op)) The comment is true only for CT_MEMORY. Wonder if it wouldn't be better to handle CT_SPECIAL_MEMORY separately, perhaps as: case CT_SPECIAL_MEMORY: if (MEM_P (op) && constraint_satisfied_p (op, cn)) { insn_allows_mem[i] = allows_mem[i] = 1; win = 1; } break; ? I.e. if the constraint is already satisfied, treat it like memory constraint, otherwise treat like (unsatisfied) fixed form constraint. Or, if you want to account for the possibility that it doesn't satisfy the constraint yet due to address that if reloaded would make it satisfy, consider !memory_operand (op, ...) case as unknown, with no need to check the constraint. Because if op satisfies already memory_operand, but doesn't constraint_satisfied_p, it means it will never satisfy. The difference in code most probably does not affect generated code correctness but may change code quality. After some thinking, I decided to change it to case CT_SPECIAL_MEMORY: if (MEM_P (op) && constraint_satisfied_p (op, cn)) win = 1; allows_mem[i] = insn_allows_mem[i] = 1; break; ... But taking complexity of RA we can do unaligned memory -> pseudo first assigning a hard reg to the pseudo and on later subpasses to spill the pseudo for some reasons. Removing such RA freedom might result in having LRA stuck. I believe that I follow your reasoning here, and the possible paths through LRA that would make this happen, and I agree that we should allow LRA the freedom to spill the pseudo that we just allocated. The patch with that change looks good to me. r~
Re: [Patch,microblaze]: Better register allocation to minimize the spill and fetch.
On 01/29/2016 02:31 AM, Ajit Kumar Agarwal wrote: This patch improves the allocation of registers in the given function. The allocation is optimized for the conditional branches. The temporary register used in the conditional branches to store the comparison results and use of temporary in the conditional branch is optimized. Such temporary registers are allocated with a fixed register r18. Currently such temporaries are allocated with a free registers in the given function. Due to this one of the free register is reserved for the temporaries and given function is left with a few registers. This is unoptimized with respect to microblaze. In Microblaze r18 is marked as fixed and cannot be allocated to pseudos' in the given function. Instead r18 can be used as a temporary for the conditional branches with compare and branch. Use of r18 as a temporary for conditional branches will save one of the free registers to be allocated. The free registers can be used for other pseudos' and hence the better register allocation. The usage of r18 as above reduces the spill and fetch because of the availability of one of the free registers to other pseudos instead of being used for conditional temporaries. The advantage of the above is that the scope of the temporaries is limited to the conditional branches and hence the usage of r18 as temporary for such conditional branches is optimized and preserve the functionality of the function. Regtested for Microblaze target. Performance runs are done with Mibench/EEMBC benchmarks. Following gains are achieved. Benchmarks Gains automotive_qsort1 1.630730524% network_dijkstra 1.527506256% office_stringsearch 1 1.81356288% security_rijndael_d 3.26129357% basefp01_lite 4.465120185% a2time01_lite 1.893862857% cjpeg_lite 3.286496675% djpeg_lite 3.120150612% qos_lite 2.63964381% office_ispell 1.531340405% Code Size improvements: Reduction in number of instructions for Mibench : 12927. Reduction in number of instructions for EEMBC : 212. ChangeLog: 2016-01-29 Ajit Agarwal* config/microblaze/microblaze.c (microblaze_expand_conditional_branch): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. (microblaze_expand_conditional_branch_reg): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. (microblaze_expand_conditional_branch_sf): Use of MB_ABI_ASM_TEMP_REGNUM for temporary conditional branch. You can combine these ChangeLog entries: * config/microblaze/microblaze.c (microblaze_expand_conditional_branch, microblaze_expand_conditional_branch_reg, microblaze_expand_conditional_branch_sf): Use MB_ABI_ASM_TEMP_REGNUM for temp reg. Otherwise, OK. Signed-off-by:Ajit Agarwal ajit...@xilinx.com. --- gcc/config/microblaze/microblaze.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c index baff67a..b4277ad 100644 --- a/gcc/config/microblaze/microblaze.c +++ b/gcc/config/microblaze/microblaze.c @@ -3402,7 +3402,7 @@ microblaze_expand_conditional_branch (machine_mode mode, rtx operands[]) rtx cmp_op0 = operands[1]; rtx cmp_op1 = operands[2]; rtx label1 = operands[3]; - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); rtx condition; gcc_assert ((GET_CODE (cmp_op0) == REG) || (GET_CODE (cmp_op0) == SUBREG)); @@ -3439,7 +3439,7 @@ microblaze_expand_conditional_branch_reg (enum machine_mode mode, rtx cmp_op0 = operands[1]; rtx cmp_op1 = operands[2]; rtx label1 = operands[3]; - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); rtx condition; gcc_assert ((GET_CODE (cmp_op0) == REG) @@ -3483,7 +3483,7 @@ microblaze_expand_conditional_branch_sf (rtx operands[]) rtx condition; rtx cmp_op0 = XEXP (operands[0], 0); rtx cmp_op1 = XEXP (operands[0], 1); - rtx comp_reg = gen_reg_rtx (SImode); + rtx comp_reg = gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM); emit_insn (gen_cstoresf4 (comp_reg, operands[0], cmp_op0, cmp_op1)); condition = gen_rtx_NE (SImode, comp_reg, const0_rtx); -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
[Patch, fortran, pr67451, v1] [5/6 Regression] ICE with sourced allocation from coarray
Hi all, attached is a patch to fix a regression in current gfortran when a coarray is used in the source=-expression of an allocate(). The ICE was caused by the class information, i.e., _vptr and so on, not at the expected place. The patch fixes this. The patch also fixes pr69418, which I will flag as a duplicate in a second. Bootstrapped and regtested ok on x86_64-linux-gnu/F23. Ok for trunk? Backport to gcc-5 is pending, albeit more difficult, because the allocate() implementation on 5 is not as advanced the one in 6. Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index c5ae4c5..8f63d34 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -1103,7 +1103,14 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited) } else { - from_data = gfc_class_data_get (from); + /* Check that from is a class. When the class is part of a coarray, + then from is a common pointer and is to be used as is. */ + tmp = POINTER_TYPE_P (TREE_TYPE (from)) + ? build_fold_indirect_ref (from) : from; + from_data = + (GFC_CLASS_TYPE_P (TREE_TYPE (tmp)) + || (DECL_P (tmp) && GFC_DECL_CLASS (tmp))) + ? gfc_class_data_get (from) : from; is_from_desc = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (from_data)); } } diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c index 310d2cd..5143c31 100644 --- a/gcc/fortran/trans-stmt.c +++ b/gcc/fortran/trans-stmt.c @@ -5358,7 +5358,8 @@ gfc_trans_allocate (gfc_code * code) expression. */ if (code->expr3) { - bool vtab_needed = false, temp_var_needed = false; + bool vtab_needed = false, temp_var_needed = false, + is_coarray = gfc_is_coarray (code->expr3); /* Figure whether we need the vtab from expr3. */ for (al = code->ext.alloc.list; !vtab_needed && al != NULL; @@ -5392,9 +5393,9 @@ gfc_trans_allocate (gfc_code * code) with the POINTER_PLUS_EXPR in this case. */ if (code->expr3->ts.type == BT_CLASS && TREE_CODE (se.expr) == NOP_EXPR - && TREE_CODE (TREE_OPERAND (se.expr, 0)) - == POINTER_PLUS_EXPR) - //&& ! GFC_CLASS_TYPE_P (TREE_TYPE (se.expr))) + && (TREE_CODE (TREE_OPERAND (se.expr, 0)) + == POINTER_PLUS_EXPR + || is_coarray)) se.expr = TREE_OPERAND (se.expr, 0); } /* Create a temp variable only for component refs to prevent @@ -5435,7 +5436,7 @@ gfc_trans_allocate (gfc_code * code) if (se.expr != NULL_TREE && temp_var_needed) { tree var, desc; - tmp = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr)) ? + tmp = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr)) || is_coarray ? se.expr : build_fold_indirect_ref_loc (input_location, se.expr); @@ -5448,7 +5449,7 @@ gfc_trans_allocate (gfc_code * code) { /* When an array_ref was in expr3, then the descriptor is the first operand. */ - if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) + if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)) || is_coarray) { desc = TREE_OPERAND (tmp, 0); } @@ -5460,11 +5461,12 @@ gfc_trans_allocate (gfc_code * code) e3_is = E3_DESC; } else - desc = se.expr; + desc = !is_coarray ? se.expr + : TREE_OPERAND (TREE_OPERAND (se.expr, 0), 0); /* We need a regular (non-UID) symbol here, therefore give a prefix. */ var = gfc_create_var (TREE_TYPE (tmp), "source"); - if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp))) + if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)) || is_coarray) { gfc_allocate_lang_decl (var); GFC_DECL_SAVED_DESCRIPTOR (var) = desc; diff --git a/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08 b/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08 new file mode 100644 index 000..7a712a9 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08 @@ -0,0 +1,26 @@ +! { dg-do run } +! { dg-options "-fcoarray=single" } +! +! Contributed by Ian Harvey+! Extended by Andre Vehreschild +! to test that coarray references in allocate work now +! PR fortran/67451 + + program main +implicit none +type foo + integer :: bar = 99 +end type +class(foo), allocatable :: foobar[:] +class(foo), allocatable :: some_local_object +allocate(foobar[*]) + +allocate(some_local_object, source=foobar) + +if (.not. allocated(foobar)) call abort() +if (.not. allocated(some_local_object)) call abort() + +deallocate(some_local_object) +deallocate(foobar) + end program + diff --git a/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08 b/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08 new file mode 100644 index 000..46f34c0 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08 @@ -0,0 +1,28 @@ +! { dg-do run } +! { dg-options "-fcoarray=single" } +! +! Contributed by Ian Harvey
[PING] Re: [RFC][PATCH, ARM 5/8] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
On 26/12/15 01:54, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch extends support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute to safeguard against leak of information through unbanked registers. When returning from a nonsecure entry function we clear all caller-saved registers that are not used to pass return values, by writing either the LR, in case of general purpose registers, or the value 0, in case of FP registers. We use the LR to write to APSR and FPSCR too. We currently only support 32 FP registers as in we only clear D0-D7. We currently do not support entry functions that pass arguments or return variables on the stack and we diagnose this. This patch relies on the existing code to make sure callee-saved registers used in cmse_nonsecure_entry functions are saved and restored thus retaining their nonsecure mode value, this should be happening already as it is required by AAPCS. *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm.c (output_return_instruction): Clear registers. (thumb2_expand_return): Likewise. (thumb1_expand_epilogue): Likewise. (arm_expand_epilogue): Likewise. (cmse_nonsecure_entry_clear_before_return): New. * gcc/config/arm/arm.h (TARGET_DSP_ADD): New macro define. * gcc/config/arm/thumb1.md (*epilogue_insns): Change length attribute. * gcc/config/arm/thumb2.md (*thumb2_return): Likewise. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate. * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are cleared. * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New. * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New. * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New. * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index f12e3c93bbe24b10ed8eee6687161826773ef649..b06e0586a3da50f57645bda13629bc4dbd3d53b7 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -230,6 +230,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void); /* Integer SIMD instructions, and extend-accumulate instructions. */ #define TARGET_INT_SIMD \ (TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em)) +/* Parallel addition and subtraction instructions. */ +#define TARGET_DSP_ADD \ + (TARGET_ARM_ARCH >= 6 && (arm_arch_notm || arm_arch7em)) /* Should MOVW/MOVT be used in preference to a constant pool. */ #define TARGET_USE_MOVT \ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e530b772e3cc053c16421a2a2861d815d53ebb01..0700478ca38307f35d0cb01f83ea182802ba28fa 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -19755,6 +19755,24 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, default: if (IS_CMSE_ENTRY (func_type)) { + char flags[12] = "APSR_nzcvq"; + /* Check if we have to clear the 'GE bits' which is only used if +parallel add and subtraction instructions are available. */ + if (TARGET_DSP_ADD) + { + /* If so also clear the ge flags. */ + flags[10] = 'g'; + flags[11] = '\0'; + } + snprintf (instr, sizeof (instr), "msr%s\t%s, %%|lr", conditional, + flags); + output_asm_insn (instr, & operand); + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + snprintf (instr, sizeof (instr), "vmsr%s\tfpscr, %%|lr", + conditional); + output_asm_insn (instr, & operand); + } snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional); } /* Use bx if it's available. */ @@ -23999,6 +24017,17 @@ thumb_pop (FILE *f, unsigned long mask) static void thumb1_cmse_nonsecure_entry_return (FILE *f, int reg_containing_return_addr) { + char flags[12] = "APSR_nzcvq"; + /* Check if we have to clear the 'GE bits' which is only used if + parallel add and subtraction instructions are available. */ + if (TARGET_DSP_ADD) +{ + flags[10] = 'g'; + flags[11] = '\0'; +} + asm_fprintf (f, "\tmsr\t%s, %r\n", flags, reg_containing_return_addr); + if (TARGET_HARD_FLOAT && TARGET_VFP) +asm_fprintf (f, "\tvmsr\tfpscr, %r\n", reg_containing_return_addr); asm_fprintf (f, "\tbxns\t%r\n", reg_containing_return_addr); } @@ -25140,6 +25169,139 @@
[PING] Re: [RFC][PATCH, ARM 6/8] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
On 26/12/15 01:55, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_call' attribute. This attribute may only be used for function types and when used in combination with the '-mcmse' compilation flag. See Section 5.5 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). We currently do not support cmse_nonsecure_call functions that pass arguments or return variables on the stack and we diagnose this. *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm.c (gimplify.h): New include. (arm_handle_cmse_nonsecure_call): New. (arm_attribute_table): Added cmse_nonsecure_call. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: Add tests. * gcc.target/arm/cmse/cmse-4.c: Add tests. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0700478ca38307f35d0cb01f83ea182802ba28fa..4b4eea88cbec8e04d5b92210f0af2440ce6fb6e4 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -61,6 +61,7 @@ #include "builtins.h" #include "tm-constrs.h" #include "rtl-iter.h" +#include "gimplify.h" /* This file should be included last. */ #include "target-def.h" @@ -136,6 +137,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, int, bool *); static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *); #endif static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *); +static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *); static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT); static void arm_output_function_prologue (FILE *, HOST_WIDE_INT); static int arm_comp_type_attributes (const_tree, const_tree); @@ -347,6 +349,8 @@ static const struct attribute_spec arm_attribute_table[] = /* ARMv8-M Security Extensions support. */ { "cmse_nonsecure_entry", 0, 0, true, false, false, arm_handle_cmse_nonsecure_entry, false }, + { "cmse_nonsecure_call", 0, 0, true, false, false, +arm_handle_cmse_nonsecure_call, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -6667,6 +6671,76 @@ arm_handle_cmse_nonsecure_entry (tree *node, tree name, return NULL_TREE; } + +/* Called upon detection of the use of the cmse_nonsecure_call attribute, this + function will check whether the attribute is allowed here and will add the + attribute to the function type tree or otherwise issue a diagnose. The + reason we check this at declaration time is to only allow the use of the + attribute with declartions of function pointers and not function + declartions. */ + +static tree +arm_handle_cmse_nonsecure_call (tree *node, tree name, +tree /* args */, +int /* flags */, +bool *no_add_attrs) +{ + tree decl = NULL_TREE; + tree type, fntype, main_variant; + + if (!use_cmse) +{ + *no_add_attrs = true; + return NULL_TREE; +} + + if (TREE_CODE (*node) == VAR_DECL || TREE_CODE (*node) == TYPE_DECL) +{ + decl = *node; + type = TREE_TYPE (decl); +} + + if (!decl + || (!(TREE_CODE (type) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE) + && TREE_CODE (type) != FUNCTION_TYPE)) +{ + warning (OPT_Wattributes, "%qE attribute only applies to base type of a " +"function pointer", name); + *no_add_attrs = true; + return NULL_TREE; +} + + /* type is either a function pointer, when the attribute is used on a function + * pointer, or a function type when used in a typedef. */ + if (TREE_CODE (type) == FUNCTION_TYPE) +fntype = type; + else +fntype = TREE_TYPE (type); + + *no_add_attrs |= cmse_func_args_or_return_in_stack (NULL, name, fntype); + + if (*no_add_attrs) +return NULL_TREE; + + /* Prevent tree's being shared among function types with and without + cmse_nonsecure_call attribute. Do however make sure they keep the same + main_variant, this is required for correct DIE output. */ + main_variant = TYPE_MAIN_VARIANT (fntype); + fntype = build_distinct_type_copy (fntype); + TYPE_MAIN_VARIANT (fntype) = main_variant; + if (TREE_CODE (type) == FUNCTION_TYPE) +TREE_TYPE (decl) = fntype; + else +TREE_TYPE (type) = fntype; + + /* Construct a type attribute and add it to the function type. */ + tree attrs = tree_cons (get_identifier ("cmse_nonsecure_call"), NULL_TREE, + TYPE_ATTRIBUTES (fntype)); + TYPE_ATTRIBUTES (fntype) = attrs; + return NULL_TREE; +} + /*
[PING] Re: [RFC][PATCH, ARM 3/8] Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
On 26/12/15 01:47, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute. In this patch we implement the attribute handling and diagnosis around the attribute. See Section 5.4 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm.c (arm_handle_cmse_nonsecure_entry): New. (arm_attribute_table): Added cmse_nonsecure_entry (arm_compute_func_type): Handle cmse_nonsecure_entry. (cmse_func_args_or_return_in_stack): New. (arm_handle_cmse_nonsecure_entry): New. * gcc/config/arm/arm.h (ARM_FT_CMSE_ENTRY): New macro define. (IS_CMSE_ENTRY): Likewise. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: New. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index cf6d9466fb79e4f8a2dbfe725c52d5be8ea24fd2..f12e3c93bbe24b10ed8eee6687161826773ef649 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1375,6 +1375,7 @@ enum reg_class #define ARM_FT_VOLATILE (1 << 4) /* Does not return. */ #define ARM_FT_NESTED (1 << 5) /* Embedded inside another func. */ #define ARM_FT_STACKALIGN (1 << 6) /* Called with misaligned stack. */ +#define ARM_FT_CMSE_ENTRY (1 << 7) /* ARMv8-M non-secure entry function. */ /* Some macros to test these flags. */ #define ARM_FUNC_TYPE(t) (t & ARM_FT_TYPE_MASK) @@ -1383,6 +1384,7 @@ enum reg_class #define IS_NAKED(t) (t & ARM_FT_NAKED) #define IS_NESTED(t) (t & ARM_FT_NESTED) #define IS_STACKALIGN(t) (t & ARM_FT_STACKALIGN) +#define IS_CMSE_ENTRY(t) (t & ARM_FT_CMSE_ENTRY) /* Structure used to hold the function stack frame layout. Offsets are diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 2223101fbf96bceb4beb3a7d6cb04162481dc3bf..5b9e51b10e91eee64e3383c1ed50269c3e6cf24c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -135,6 +135,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, int, bool *); #if TARGET_DLLIMPORT_DECL_ATTRIBUTES static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *); #endif +static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *); static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT); static void arm_output_function_prologue (FILE *, HOST_WIDE_INT); static int arm_comp_type_attributes (const_tree, const_tree); @@ -343,6 +344,9 @@ static const struct attribute_spec arm_attribute_table[] = { "notshared",0, 0, false, true, false, arm_handle_notshared_attribute, false }, #endif + /* ARMv8-M Security Extensions support. */ + { "cmse_nonsecure_entry", 0, 0, true, false, false, +arm_handle_cmse_nonsecure_entry, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -3562,6 +3566,9 @@ arm_compute_func_type (void) else type |= arm_isr_value (TREE_VALUE (a)); + if (lookup_attribute ("cmse_nonsecure_entry", attr)) +type |= ARM_FT_CMSE_ENTRY; + return type; } @@ -6552,6 +6559,109 @@ arm_handle_notshared_attribute (tree *node, } #endif +/* This function is used to check whether functions with attributes + cmse_nonsecure_call or cmse_nonsecure_entry use the stack to pass arguments + or return variables. If the function does indeed use the stack this + function returns true and diagnoses this, otherwise it returns false. */ + +static bool +cmse_func_args_or_return_in_stack (tree fndecl, tree name, tree fntype) +{ + function_args_iterator args_iter; + CUMULATIVE_ARGS args_so_far_v; + cumulative_args_t args_so_far; + bool first_param = true; + tree arg_type, prev_arg_type = NULL_TREE, ret_type; + + /* Error out if any argument is passed on the stack. */ + arm_init_cumulative_args (_so_far_v, fntype, NULL_RTX, fndecl); + args_so_far = pack_cumulative_args (_so_far_v); + FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter) +{ + rtx arg_rtx; + machine_mode arg_mode = TYPE_MODE (arg_type); + + prev_arg_type = arg_type; + if (VOID_TYPE_P (arg_type)) + continue; + + if (!first_param) + arm_function_arg_advance (args_so_far, arg_mode, arg_type, true); + arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type, true); + if (!arg_rtx + || arm_arg_partial_bytes (args_so_far, arg_mode, arg_type, true)) + { + error ("%qE attribute not available to functions with arguments " +"passed on the stack", name); + return true; + } +
Re: [RFC][PATCH , ARM 2/8] Add RTL patterns for thumb1 push/pop
On 26/12/15 01:45, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch adds RTL patterns for the push and pop instructions for thumb1. These are needed by subsequent patches in the series. *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm-ldmstm.nl (constr thumb): Enabled stackpointer to be written/read. * gcc/config/arm/ldmstm.md: Regenerated. * gcc/config/arm/thumb1.md (*thumb1_pop_single): New. (*thumb1_load_multiple_operation): New. * gcc/config/arm/arm.c (thumb_pop): Fix of comment. diff --git a/gcc/config/arm/arm-ldmstm.ml b/gcc/config/arm/arm-ldmstm.ml index 62982df594d5d4a1407df359e927c66986a9788c..f3ee741e93927d8d44a9eccec8970b46a8984216 100644 --- a/gcc/config/arm/arm-ldmstm.ml +++ b/gcc/config/arm/arm-ldmstm.ml @@ -63,7 +63,7 @@ let rec final_offset addrmode nregs = | DB -> -4 * nregs let constr thumb = - if thumb then "l" else "rk" + if thumb then "lk" else "rk" let inout_constr op_type = match op_type with diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 06a6184ee0c4ed1a7cec1de4c1786e297cc57872..2223101fbf96bceb4beb3a7d6cb04162481dc3bf 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -23773,8 +23773,8 @@ thumb1_emit_multi_reg_push (unsigned long mask, unsigned long real_regs) return insn; } -/* Emit code to push or pop registers to or from the stack. F is the - assembly file. MASK is the registers to pop. */ +/* Emit code to pop registers from the stack. F is the assembly file. + MASK is the registers to pop. */ static void thumb_pop (FILE *f, unsigned long mask) { diff --git a/gcc/config/arm/ldmstm.md b/gcc/config/arm/ldmstm.md index ebb09ab86e799f3606e0988980edf3cd0189272b..8c0472e07799bd9d08759e35b6b98f3536d3d013 100644 --- a/gcc/config/arm/ldmstm.md +++ b/gcc/config/arm/ldmstm.md @@ -43,7 +43,7 @@ (define_insn "*thumb_ldm4_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_register_operand" "") - (mem:SI (match_operand:SI 5 "s_register_operand" "l"))) + (mem:SI (match_operand:SI 5 "s_register_operand" "lk"))) (set (match_operand:SI 2 "low_register_operand" "") (mem:SI (plus:SI (match_dup 5) (const_int 4 @@ -80,7 +80,7 @@ (define_insn "*thumb_ldm4_ia_update" [(match_parallel 0 "load_multiple_operation" -[(set (match_operand:SI 5 "s_register_operand" "+") +[(set (match_operand:SI 5 "s_register_operand" "+") (plus:SI (match_dup 5) (const_int 16))) (set (match_operand:SI 1 "low_register_operand" "") (mem:SI (match_dup 5))) @@ -133,7 +133,7 @@ (define_insn "*thumb_stm4_ia_update" [(match_parallel 0 "store_multiple_operation" -[(set (match_operand:SI 5 "s_register_operand" "+") +[(set (match_operand:SI 5 "s_register_operand" "+") (plus:SI (match_dup 5) (const_int 16))) (set (mem:SI (match_dup 5)) (match_operand:SI 1 "low_register_operand" "")) @@ -491,7 +491,7 @@ (define_insn "*thumb_ldm3_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_register_operand" "") - (mem:SI (match_operand:SI 4 "s_register_operand" "l"))) + (mem:SI (match_operand:SI 4 "s_register_operand" "lk"))) (set (match_operand:SI 2 "low_register_operand" "") (mem:SI (plus:SI (match_dup 4) (const_int 4 @@ -522,7 +522,7 @@ (define_insn "*thumb_ldm3_ia_update" [(match_parallel 0 "load_multiple_operation" -[(set (match_operand:SI 4 "s_register_operand" "+") +[(set (match_operand:SI 4 "s_register_operand" "+") (plus:SI (match_dup 4) (const_int 12))) (set (match_operand:SI 1 "low_register_operand" "") (mem:SI (match_dup 4))) @@ -568,7 +568,7 @@ (define_insn "*thumb_stm3_ia_update" [(match_parallel 0 "store_multiple_operation" -[(set (match_operand:SI 4 "s_register_operand" "+") +[(set (match_operand:SI 4 "s_register_operand" "+") (plus:SI (match_dup 4) (const_int 12))) (set (mem:SI (match_dup 4)) (match_operand:SI 1 "low_register_operand" "")) @@ -877,7 +877,7 @@ (define_insn "*thumb_ldm2_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_register_operand" "") - (mem:SI (match_operand:SI 3 "s_register_operand" "l"))) + (mem:SI (match_operand:SI 3 "s_register_operand" "lk"))) (set (match_operand:SI 2 "low_register_operand" "") (mem:SI (plus:SI (match_dup 3) (const_int 4])] @@ -902,7 +902,7 @@ (define_insn "*thumb_ldm2_ia_update" [(match_parallel 0 "load_multiple_operation" -[(set (match_operand:SI 3 "s_register_operand" "+") +[(set (match_operand:SI 3
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
On 28.01.2016 23:17, Richard Sandiford wrote: > Bernd Edlingerwrites: >> On 26.01.2016 22:18, Richard Sandiford wrote: >>> [cc-ing Eric as RTL maintainer] >>> >>> Matthew Fortune writes: Bernd Edlinger writes: > Matthew Fortune writes: >> Has the patch been tested beyond just building GCC? I can do a >> test run for you if you don't have things set up to do one yourself. > > I built a cross-gcc with all languages and a cross-glibc, but I have > not set up an emulation environment, so if you could give it a test > that would be highly welcome. mipsel-linux-gnu test results are the same before and after this patch. Please go ahead and commit. >>> >>> I still object to this. And it feels like the patch was posted >>> as though it was a new one in order to avoid answering the objections >>> that were raised when it was last posted: >>> >>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02218.html >>> >> >> Richard, I am really sorry when you feel now like I did not take your >> objections seriously. Let me first explain what happened from my point >> of view: >> >> When I posted this response to your objections here: >> >>https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00235.html >> >> I was waiting for your response, but nothing happened, so I kind of >> forgot about the issue. In the meantime Ubuntu and Debian began to >> roll out GCC-6 and they got stuck at the same issue, but instead of >> waiting for pr69012 to be eventually resolved they created a duplicate >> pr69129, and then last Thursday Nick applied my initial patch without >> my intervention, probably because of the pressure they put on us. > > Ah, I'd missed that, sorry. It's not obvious from the email thread > that the patch was actually approved. I just see a message from Matthias > saying that it worked for him and then a message from Nick saying that > he'd applied it. > > Ah well. I guess at some point I have to get over the fact that I'm no > longer the MIPS maintainer :-) > >> That changed significantly how I looked at the issue after that point, >> as there was no actual regression anymore, the revised patch still made >> sense, but for another reason. When you look at the 20+ targets in the >> gcc tree you'll see that almost all of them have a frame-layout >> computation function and all except mips have a shortcut >> "if (reload_completed) return;" in that function. And OTOH mips has >> one of the most complicated frame layout functions of all targets. >> >> For all of these reasons I posted a new patch which tries to resolve >> differences between mips and other targets inital_elimination_offset >> functions. > > OK. But the point still stands that the patch is only useful because > we're now calling mips_compute_frame_info in cases where we wouldn't > previously, because of the rtx_addr_can_trap_p changes. > Yes of course, but it cannot hurt to have all targets behave identical in such a central point. Even if at some point also the implementation in rtx_addr_can_trap_p will eventually be improved in one way or the other. >> I still think that it is not OK for the mips target to do the frame >> layout already in mips_frame_pointer_required because the frame layout >> will change several times, until reload is completed, and that function >> is only called right in the beginning. > > I don't think it's any better or worse than doing the frame layout in > INITIAL_ELIMINATION_OFFSET (which is common practice and pretty much > required). They're both part of the initial setup phase -- > targetm.frame_layout_required determines frame_pointer_needed, which is > a vital input to the code that decides which eliminations to make. > Hmm... That can also be a difference between LRA and traditional reload. Reload calls targetm.frame_pointer_required in update_eliminables, and can apparently handle 0=>1 and 1=>0 transitions here. While LRA calls frame_pointer_required only once in ira_setup_eliminable_regset and can only handle 0=>1 transitions of frame_pointer_needed in setup_can_eliminate, but not based on targetm.frame_pointer_required but targetm.can_eliminate(FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM). At least it I read the source correctly, I could be wrong of course. > And this is an example of us doing the kind of caching that I was > suggesting. Code that wants to know whether the frame pointer is needed > for the current function should use frame_pointer_needed. Only the code > that sets up frame_pointer_needed should call frame_pointer_required > directly. > But frame_pointer_needed adds at least some value to the raw targetm.frame_pointer_required, while a cached result of initial_elimination_offset would not, just conserve the current interface exactly as it is. >> And I think that it is not really well-designed to have a frame layout >>
Re: [PATCH, rs6000] Fix PR65546
On Thu, Jan 28, 2016 at 5:41 PM, Bill Schmidtwrote: > Hi, > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65546 identifies a failure > in gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c. The test case hasn't > kept up with changes in the vectorizer, so it's looking for the wrong > error message. Also, the error message should be conditioned by a check > for support of unaligned memory accesses. This patch corrects these > problems. > > For 4.9 and 5, the error message needs to be similarly changed. > However, for these earlier releases, the check for misalignment support > doesn't apply. > > Verified on powerpc64le-unknown-linux-gnu for both -mcpu=power7 and > -mcpu=power8, which differ in their support for misalignment. Is this > ok for trunk? Provided verification succeeds on 4.9 and 5, is the > revised test ok for those releases? > > Thanks, > Bill > > > 2016-01-28 Bill Schmidt > > PR target/65546 > * gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Correct > condition being checked, and disable it when the target supports > misaligned loads and stores. Okay. Thanks, David
Re: [RS6000] ABI_V4 init of toc section
On Fri, Jan 29, 2016 at 11:38 AM, Alan Modrawrote: > Since 4c4a180d, LTO has turned off flag_pic when linking a fixed > position executable. This results in flag_pic being zero in > rs6000_file_start, and no definition of ".LCTOC1". > > However, when we get to actually emitting code, flag_pic may be on > again, and references made to ".LCTOC1". How flag_pic comes to be > enabled again is quite a story. It goes like this.. If a function is > compiled with -fPIC then sysv4.h SUBTARGET_OVERRIDE_OPTIONS will set > TARGET_RELOCATABLE. Conversely, if TARGET_RELOCATABLE is set and > flag_pic is zero, then SUBTARGET_OVERRIDE_OPTIONS will set flag_pic=2. > It also happens that TARGET_RELOCATABLE is a bit in rs6000_isa_flags, > which is handled by rs6000_function_specific_save and > rs6000_function_specific_restore. That last fact means lto streaming > keeps track of the state of TARGET_RELOCATABLE for functions, and when > options are restored for a given function we'll set flag_pic=2 if the > function was originally compiled with -fPIC. That's bad because it > defeats the purpose of the 4c4a180d lto change, resulting in worse > optimization of ppc32 executables. What's more, we don't seem to turn > off flag_pic once it is on. > > We should really untangle the flag_pic/TARGET_RELOCATABLE mess, but > that change is probably a little dangerous for stage4. Instead, this > patch removes the toc symbol initialization from file_start and does > so when the first item is emitted to the toc, or after the function > epilogue in the cases where we emit code to initialize a toc pointer > but don't actually use it (-O0 mostly, I think). > > Bootstrapped and regression tested powerpc64-linux biarch with all > languages enabled. OK to apply? > > PR target/68662 > * config/rs6000/rs6000.c (need_toc_init): New var, set it > whenever toc_label_name used. > (rs6000_file_start): Don't set up toc section here, > (rs6000_output_function_epilogue): do so here instead, > (rs6000_xcoff_file_start): and here. > * config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init. > (load_toc_aix_di): Likewise. I'm worried about how this is going to interact with AIX. AIX assembler is single pass and this patch moves the initialization from the beginning of the file to the end of the file, which means there will be references to a label whose definition is delayed until the end. - David
[PATCHv2] Re: [RFC][PATCH, ARM 1/8] Add support for ARMv8-M's Security Extensions flag and intrinsics
On 05/01/16 14:38, Andre Vieira wrote: On 31/12/15 20:54, Joseph Myers wrote: On Sat, 26 Dec 2015, Thomas Preud'homme wrote: +#define CMSE_TT_ASM(flags) \ +{ \ + cmse_address_info_t result; \ + __asm__ ("tt" # flags " %0,%1" \ + : "=r"(result) \ + : "r"(p) \ + : "memory"); \ + return result; \ Are the identifiers "result" and "p" really meant to be reserved by this header (so that users can't have macros with those names before including it), or should they actually be __result and __p (and likewise for any other identifiers in this file not specified as reserved)? +__extension__ void * +cmse_check_address_range (void *p, size_t size, int flags); Are "size" and "flags" really meant to be reserved? +@item -mcmse +@opindex mcmse +Generate secure code as per ARMv8-M Security Extensions. I think you also need a section in extend.texi much like the existing ACLE section, to describe support for this as a language extension. I'll change all non-reserved and 'not-ment-for-export' identifiers to be preceded by '__' and Ill also look into adding a section for ARMv8-M Security Extensions (CMSE) to extend.texi. Thank you for your feedback. BR, Andre Hi there, Forgot to send the reworked patch upstream, here it is following Joseph's comments. Thank you again. Is this OK? Cheers, Andre *** gcc/ChangeLog *** 2016-01-29 Andre VieiraThomas Preud'homme * gcc/config.gcc (extra_headers): Added arm_cmse.h. * gcc/config/arm/arm-arches.def (ARM_ARCH): (armv8-m): Add FL2_CMSE. (armv8-m.main): Likewise. (armv8-m.main+dsp): Likewise. * gcc/config/arm/arm-c.c (arm_cpu_builtins): Added __ARM_FEATURE_CMSE macro. * gcc/config/arm/arm-protos.h (arm_is_constant_pool_ref): Define FL2_CMSE. * gcc/config/arm.c (arm_arch_cmse): New. (arm_option_override): New error for unsupported cmse target. * gcc/config/arm/arm.h (arm_arch_cmse): New. * gcc/config/arm/arm.opt (mcmse): New. * gcc/doc/invoke.texi (ARM Options): Add -mcmse. * gcc/doc/extend.texi (ACLE): Add CMSE. * gcc/config/arm/arm_cmse.h: New file. * libgcc/config/arm/cmse.c: Likewise. * libgcc/config/arm/t-arm (HAVE_CMSE): New. *** gcc/testsuite/ChangeLog *** 2016-01-29 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse.exp: New. * gcc.target/arm/cmse/cmse-1.c: New. * gcc.target/arm/cmse/cmse-12.c: New. * lib/target-supports.exp (check_effective_target_arm_cmse_ok): New. diff --git a/gcc/config.gcc b/gcc/config.gcc index 7c3ad8984d8032b984b0acb21e9c05fdcc40579a..5d42d00819e74ff1c5b665f36e1b6f4033fe357d 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -323,7 +323,7 @@ arc*-*-*) arm*-*-*) cpu_type=arm extra_objs="arm-builtins.o aarch-common.o" - extra_headers="mmintrin.h arm_neon.h arm_acle.h" + extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_cmse.h" target_type_format_char='%' c_target_objs="arm-c.o" cxx_target_objs="arm-c.o" diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def index be46521c9eaea54f9ad78a92874567589289dbdf..0e523959551cc3b1da31411ccdd1105b830db845 100644 --- a/gcc/config/arm/arm-arches.def +++ b/gcc/config/arm/arm-arches.def @@ -63,11 +63,11 @@ ARM_ARCH("armv8.1-a+crc",cortexa53, 8A, ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A)) ARM_ARCH("armv8-m.base", cortexm0, 8M_BASE, - ARM_FSET_MAKE_CPU1 ( FL_FOR_ARCH8M_BASE)) + ARM_FSET_MAKE ( FL_FOR_ARCH8M_BASE, FL2_CMSE)) ARM_ARCH("armv8-m.main", cortexm7, 8M_MAIN, - ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_FOR_ARCH8M_MAIN)) + ARM_FSET_MAKE (FL_CO_PROC | FL_FOR_ARCH8M_MAIN, FL2_CMSE)) ARM_ARCH("armv8-m.main+dsp", cortexm7, 8M_MAIN, - ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN)) + ARM_FSET_MAKE (FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN, FL2_CMSE)) ARM_ARCH("iwmmxt", iwmmxt, 5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)) ARM_ARCH("iwmmxt2", iwmmxt2,5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)) diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c index 195905fa25b36cd35fe9bc843c695333892106be..862bd095cb1c34626872194a03892ff915d18916 100644 --- a/gcc/config/arm/arm-c.c +++ b/gcc/config/arm/arm-c.c @@ -76,6 +76,14 @@ arm_cpu_builtins (struct cpp_reader* pfile) def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT); + if (arm_arch8 && !arm_arch_notm) +{ + if (arm_arch_cmse && use_cmse) + builtin_define_with_int_value ("__ARM_FEATURE_CMSE", 3); + else + builtin_define ("__ARM_FEATURE_CMSE"); +} + if (TARGET_ARM_FEATURE_LDREX)
[PING] Re: [RFC][PATCH, ARM 4/8] ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and bxns return
On 26/12/15 01:52, Thomas Preud'homme wrote: [Sending on behalf of Andre Vieira] Hello, This patch extends support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute in two ways: 1) Generate two labels for the function, the regular function name and one with the function's name appended to '__acle_se_', this will trigger the linker to create a secure gateway veneer for this entry function. 2) Return from cmse_nonsecure_entry marked functions using bxns. See Section 5.4 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). *** gcc/ChangeLog *** 2015-10-27 Andre VieiraThomas Preud'homme * gcc/config/arm/arm.c (use_return_insn): Change to return with bxns when cmse_nonsecure_entry. (output_return_instruction): Likewise. (arm_output_function_prologue): Likewise. (thumb_pop): Likewise. (thumb_exit): Likewise. (arm_function_ok_for_sibcall): Disable sibcall for entry functions. (arm_asm_declare_function_name): New. (thumb1_cmse_nonsecure_entry_return): New. * gcc/config/arm/arm-protos.h (arm_asm_declare_function_name): New. * gcc/config/arm/elf.h (ASM_DECLARE_FUNCTION_NAME): Redefine to use arm_asm_declare_function_name. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-2.c: New. * gcc.target/arm/cmse/cmse-4.c: New. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 85dca057d63544c672188db39b05a33b1be10915..9ee8c333046d9a5bb0487f7b710a5aff42d2 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -31,6 +31,7 @@ extern int arm_volatile_func (void); extern void arm_expand_prologue (void); extern void arm_expand_epilogue (bool); extern void arm_declare_function_name (FILE *, const char *, tree); +extern void arm_asm_declare_function_name (FILE *, const char *, tree); extern void thumb2_expand_return (bool); extern const char *arm_strip_name_encoding (const char *); extern void arm_asm_output_labelref (FILE *, const char *); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5b9e51b10e91eee64e3383c1ed50269c3e6cf24c..e530b772e3cc053c16421a2a2861d815d53ebb01 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -3795,6 +3795,11 @@ use_return_insn (int iscond, rtx sibling) return 0; } + /* ARMv8-M nonsecure entry function need to use bxns to return and thus need + several instructions if anything needs to be popped. */ + if (saved_int_regs && IS_CMSE_ENTRY (func_type)) +return 0; + /* If there are saved registers but the LR isn't saved, then we need two instructions for the return. */ if (saved_int_regs && !(saved_int_regs & (1 << LR_REGNUM))) @@ -6820,6 +6825,11 @@ arm_function_ok_for_sibcall (tree decl, tree exp) if (IS_INTERRUPT (func_type)) return false; + /* ARMv8-M non-secure entry functions need to return with bxns which is only + generated for entry functions themselves. */ + if (IS_CMSE_ENTRY (arm_current_func_type ())) +return false; + if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (cfun->decl { /* Check that the return value locations are the same. For @@ -19607,6 +19617,7 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, (e.g. interworking) then we can load the return address directly into the PC. Otherwise we must load it into LR. */ if (really_return + && !IS_CMSE_ENTRY (func_type) && (IS_INTERRUPT (func_type) || !TARGET_INTERWORK)) return_reg = reg_names[PC_REGNUM]; else @@ -19742,8 +19753,12 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, break; default: + if (IS_CMSE_ENTRY (func_type)) + { + snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional); + } /* Use bx if it's available. */ - if (arm_arch5 || arm_arch4t) + else if (arm_arch5 || arm_arch4t) sprintf (instr, "bx%s\t%%|lr", conditional); else sprintf (instr, "mov%s\t%%|pc, %%|lr", conditional); @@ -19756,6 +19771,42 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, return ""; } +/* Output in FILE asm statements needed to declare the NAME of the function + defined by its DECL node. */ + +void +arm_asm_declare_function_name (FILE *file, const char *name, tree decl) +{ + size_t cmse_name_len; + char *cmse_name = 0; + char cmse_prefix[] = "__acle_se_"; + + if (use_cmse && lookup_attribute ("cmse_nonsecure_entry", +
[commited, PATCH] PR target/69530: [6 Regression] ICE: SIGSEGV
in ix86_split_long_move (i386.c:24353) with -fno-split-wide-types -mavx Reply-To: "H.J. Lu"r229087, which caused PR 69530, was supposed to fix PR 67609. r229458 has made r229087 unnecessary. Approved by Vladimir in PR 69530. Checked into trunk. H.J. --- gcc/ PR target/69530 * lra-splill.c (lra_final_code_change): Revert r229087 by removing all sub-registers. gcc/testsuite/ PR target/69530 * gcc.target/i386/pr69530.c: New test. --- gcc/lra-spills.c| 46 ++--- gcc/testsuite/gcc.target/i386/pr69530.c | 11 2 files changed, 19 insertions(+), 38 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr69530.c diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c index fa0a579..5709ef1 100644 --- a/gcc/lra-spills.c +++ b/gcc/lra-spills.c @@ -760,44 +760,14 @@ lra_final_code_change (void) struct lra_static_insn_data *static_id = id->insn_static_data; bool insn_change_p = false; - - for (i = id->insn_static_data->n_operands - 1; i >= 0; i--) - { - if (! DEBUG_INSN_P (insn) && static_id->operand[i].is_operator) - continue; - - rtx op = *id->operand_loc[i]; - - if (static_id->operand[i].type == OP_OUT - && GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op)) - && ! LRA_SUBREG_P (op)) - { - hard_regno = REGNO (SUBREG_REG (op)); - /* We can not always remove sub-registers of -hard-registers as we may lose information that -only a part of registers is changed and -subsequent optimizations may do wrong -transformations (e.g. dead code eliminations). -We can not also keep all sub-registers as the -subsequent optimizations can not handle all such -cases. Here is a compromise which works. */ - if ((GET_MODE_SIZE (GET_MODE (op)) - < GET_MODE_SIZE (GET_MODE (SUBREG_REG (op - && (hard_regno_nregs[hard_regno][GET_MODE (SUBREG_REG (op))] - == hard_regno_nregs[hard_regno][GET_MODE (op)]) -#ifdef STACK_REGS - && (hard_regno < FIRST_STACK_REG - || hard_regno > LAST_STACK_REG) -#endif - ) - continue; - } - if (alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn))) - { - lra_update_dup (id, i); - insn_change_p = true; - } - } + + for (i = id->insn_static_data->n_operands - 1; i >= 0; i--) + if ((DEBUG_INSN_P (insn) || ! static_id->operand[i].is_operator) + && alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn))) + { + lra_update_dup (id, i); + insn_change_p = true; + } if (insn_change_p) lra_update_operator_dups (id); } diff --git a/gcc/testsuite/gcc.target/i386/pr69530.c b/gcc/testsuite/gcc.target/i386/pr69530.c new file mode 100644 index 000..9146d1d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr69530.c @@ -0,0 +1,11 @@ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O -fno-forward-propagate -fno-split-wide-types -mavx " } */ + +typedef unsigned __int128 v32u128 __attribute__ ((vector_size (32))); + +v32u128 +foo (v32u128 v32u128_0) +{ + v32u128_0[0] *= v32u128_0[1]; + return v32u128_0; +} -- 2.5.0
[wwwdocs] fortran/index.html - remove local styles
These local styles feel a bit odd to begin with, and if we skip the ... within ..., the originally perceived/ addresses issue should go away. Unless there are objections from the Fortran side, I plan on committing this in a couple of days. Gerald Index: fortran/index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/fortran/index.html,v retrieving revision 1.34 diff -u -r1.34 index.html --- fortran/index.html 29 Jun 2014 11:31:33 - 1.34 +++ fortran/index.html 30 Jan 2016 04:02:12 - @@ -91,52 +91,51 @@ people: -Paul Brook -Steven Bosscher -Bud Davis -Jerry DeLisle -Toon Moene -Tobias Schlueter -Janne Blomqvist -Steve Kargl -Thomas Koenig -Paul Thomas -Janus Weil -Daniel Kraft -Daniel Franke +Paul Brook +Steven Bosscher +Bud Davis +Jerry DeLisle +Toon Moene +Tobias Schlueter +Janne Blomqvist +Steve Kargl +Thomas Koenig +Paul Thomas +Janus Weil +Daniel Kraft +Daniel Franke Under the rules specified below: -All normal +All normal requirements for patch submission (assignment of copyright to the FSF, testing, ChangeLog entries, etc) still apply, and reviewers should ensure that these have been met before approving -changes. -Approval should be necessary for +changes. +Approval should be necessary for patches which don't fall under the obvious rule. So, with the approver list put in place, everybody (except maintainers) should still seek approval for his/her patches. We have found the mutual peer review process really -works well. -Patches should only be reviewed by +works well. +Patches should only be reviewed by people who know the affected parts of the compiler. (i.e. the reviewer has to be sure he/she knows stuff well enough to make a -good judgment.) -Large/complicated patches should -still go by one of our maintainers, or team consensus. -We are all reasonable people, and nobody is working under +good judgment.) +Large/complicated patches should +still go by one of our maintainers, or team consensus. +We are all reasonable people, and nobody is working under employer pressure or needs an ego-boost badly, so in general we -assume that no-one deliberately does anything stupid :-) +assume that no-one deliberately does anything stupid :-) The directories involved are: -gcc/gcc/fortran/ -gcc/gcc/testsuite/gfortran.dg/ -gcc/gcc/testsuite/gfortran.fortran-torture/ - -gcc/libgfortran/ +gcc/gcc/fortran/ +gcc/gcc/testsuite/gfortran.dg/ +gcc/gcc/testsuite/gfortran.fortran-torture/ +gcc/libgfortran/ Documentation
[wwwdocs] Use external CSS for the News and Status panes on the main page
This fixes the worst of what the new server settings broke; our main page should look quite more reasonable with this. And it simplifies things a little. Applied. Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.992 diff -u -r1.992 index.html --- index.html 24 Jan 2016 23:54:36 - 1.992 +++ index.html 30 Jan 2016 05:58:12 - @@ -45,12 +45,10 @@ - - -News - - + +News + GCC 5.3 released [2015-12-04] @@ -98,13 +96,9 @@ - - - - -Release Series and Status - - + +Release Series and Status + GCC 5.3 (changes) Index: gcc.css === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc.css,v retrieving revision 1.30 diff -u -r1.30 gcc.css --- gcc.css 30 Jan 2016 04:28:47 - 1.30 +++ gcc.css 30 Jan 2016 05:58:12 - @@ -14,15 +14,19 @@ .highlight{ color: darkslategray; font-weight:bold; } -dl.news { margin-top:0; } -dl.news dt { color:darkslategrey; font-weight:bold; margin-top:0.3em; } -dl.news dd { margin-left:3ex; margin-top:0.1em; margin-bottom:0.1em; } -dl.news .date { color:darkslategrey; font-size:90%; margin-left:0.1ex; } - -dl.status{ margin-top:0; } -dl.status .version { font-weight:bold; } -dl.status .regress { font-size: 90%; } -dl.status dd { margin-left:3ex; } +td.news { width: 50%; padding-right: 8px; } +td.news h2 { font-size: 1.2em; margin-top: 0; margin-bottom: 2%; } +td.news dl { margin-top:0; } +td.news dt { color:darkslategrey; font-weight:bold; margin-top:0.3em; } +td.news dd { margin-left:3ex; margin-top:0.1em; margin-bottom:0.1em; } +td.news .date { color:darkslategrey; font-size:90%; margin-left:0.1ex; } + +td.status{ width: 50%; padding-left: 12px; border-left: #3366cc thin solid; } +td.status h2 { font-size: 1.2em; margin-top:0; margin-bottom: 1%; } +td.status dl { margin-top:0; } +td.status .version { font-weight:bold; } +td.status .regress { font-size: 90%; } +td.status dd { margin-left:3ex; } .td_title { border-color: #3366cc;
Re: [PATCH] Fix up _Pragma GCC diagnostics regressions (PR preprocessor/69543, PR c/69558)
On Fri, 2016-01-29 at 20:50 +0100, Jakub Jelinek wrote: > Hi! > > This patch reverts one tiny change from r228049 changes (which hasn't > been > mentioned in the ChangeLog or patch description). We definitely need > to > revisit this for GCC 7, but stage4 is probably not the right time for > that, > and the patch fixes e.g. tons of warnings (or with -Werror errors on > including pretty much all glib2 headers). > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2016-01-29 Jakub Jelinek> > PR preprocessor/69543 > PR c/69558 > * c-pragma.c (handle_pragma_diagnostic): Pass input_location > instead of loc to control_warning_option. > > * gcc.dg/pr69543.c: New test. > * gcc.dg/pr69558.c: New test. This touches c-family; shouldn't the new tests be in c-c++-common, rather than gcc.dg? (presumably we need to ensure that the glib2 headers are sane from C++ also) I've been attempting to fix these by fixing linemap_compare_locations, but I don't have that approach working, so fwiw I don't object to this patch.
[wwwdocs] Avoid local styles in the standard footer.
Our standard footer also got hit by the (needlessly for for us) stricter server settings. This again gets rid of extra vertical space in that box. Installed, and afterwards I rebuilt our website using the /www/gcc/bin/preprocess script on gcc.gnu.org. Gerald Index: gcc.css === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc.css,v retrieving revision 1.28 diff -u -r1.28 gcc.css --- gcc.css 22 Jan 2016 05:28:41 - 1.28 +++ gcc.css 30 Jan 2016 04:27:59 - @@ -49,6 +49,7 @@ border-width: thin; padding: 4px; } +div.copyright p:nth-child(3) { margin-bottom: 0; } .boldcyan{ font-weight:bold; color:cyan; } .boldlime{ font-weight:bold; color:lime; } Index: style.mhtml === RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v retrieving revision 1.125 diff -u -r1.125 style.mhtml --- style.mhtml 29 Jun 2014 11:31:32 - 1.125 +++ style.mhtml 30 Jan 2016 04:16:21 - @@ -241,7 +241,7 @@ -For questions related to the use of GCC, +For questions related to the use of GCC, please consult these web pages and the https://gcc.gnu.org/onlinedocs/;>GCC manuals. If that fails, the mailto:gcc-h...@gcc.gnu.org;>gcc-h...@gcc.gnu.org @@ -257,7 +257,7 @@ Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved. -These pages are +These pages are https://gcc.gnu.org/about.html;>maintained by the GCC team. Last modified http://validator.w3.org/check/referer;>.
Re: Enabling -frename-registers?
On January 29, 2016 6:34:50 PM GMT+01:00, Bernd Schmidtwrote: >So PR57193 has an example of sub-optimal code generation, with some >unnecessary register moves left after LRA. These seem to be difficult >to >prevent, but last year Robert Suchanek made some modifications to >regrename that allow it to clean up such cases. Enabling >-frename-registers removes one of the two unnecessary copies, and I'm >pretty sure I could make it eliminate the other one as well with a bit >more work. > >Hence, this patch. The renamer has seen a lot of fixes over the years >and should be in pretty good shape IMO. Still, I won't deny that this >is >a bit riskier than the usual bugfix patch at this stage. > >Bootstrapped and tested on x86_64-linux, with my earlier patch to fix >some i386 tests. Thoughts? Should we do this for gcc-7 at least? I don't think it's appropriate at this stage. Richard. > >Bernd
patch to fix PR69299
The following patch is the final version of the patch for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69299 The patch was approved by Richard Henderson and Jakub. Committed as rev. 232993. Index: ChangeLog === --- ChangeLog (revision 232992) +++ ChangeLog (working copy) @@ -1,3 +1,37 @@ +2016-01-29 Vladimir Makarov+ + PR target/69299 + * config/i386/constraints.md (Bm): Describe as special memory + constraint. + * doc/md.texi (DEFINE_SPECIAL_MEMORY_CONSTRAINT): Describe it. + * genoutput.c (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT. + * genpreds.c (struct constraint_data): Add is_special_memory. + (have_special_memory_constraints, special_memory_start): New + static vars. + (special_memory_end): Ditto. + (add_constraint): Add new arg is_special_memory. Add code to + process its true value. Update have_special_memory_constraints. + (process_define_constraint): Pass the new arg. + (process_define_register_constraint): Ditto. + (choose_enum_order): Process special memory. + (write_tm_preds_h): Generate enum const CT_SPECIAL_MEMORY and + function insn_extra_special_memory_constraint. + (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT. + * gensupport.c (process_rtx): Process + DEFINE_SPECIAL_MEMORY_CONSTRAINT. + * ira-costs.c (record_reg_classes): Process CT_SPECIAL_MEMORY. + * ira-lives.c (single_reg_class): Use + insn_extra_special_memory_constraint. + * ira.c (ira_setup_alts): Process CT_SPECIAL_MEMORY. + * lra-constraints.c (process_alt_operands): Ditto. + (curr_insn_transform): Use insn_extra_special_memory_constraint. + * recog.c (asm_operand_ok, preprocess_constraints): Process + CT_SPECIAL_MEMORY. + * reload.c (find_reloads): Ditto. + * rtl.def (DEFINE_SPECIFAL_MEMORY_CONSTRAINT): New. + * stmt.c (parse_input_constraint): Use + insn_extra_special_memory_constraint. + 2016-01-29 H.J. Lu PR target/69530 Index: config/i386/constraints.md === --- config/i386/constraints.md (revision 232990) +++ config/i386/constraints.md (working copy) @@ -162,7 +162,7 @@ "@internal GOT memory operand." (match_operand 0 "GOT_memory_operand")) -(define_constraint "Bm" +(define_special_memory_constraint "Bm" "@internal Vector memory operand." (match_operand 0 "vector_memory_operand")) Index: doc/md.texi === --- doc/md.texi (revision 232990) +++ doc/md.texi (working copy) @@ -4424,6 +4424,20 @@ The syntax and semantics are otherwise i @code{define_constraint}. @end deffn +@deffn {MD Expression} define_special_memory_constraint name docstring exp +Use this expression for constraints that match a subset of all memory +operands: that is, @code{reload} can not make them match by reloading +the address as it is described for @code{define_memory_constraint} or +such address reload is undesirable with the performance point of view. + +For example, @code{define_special_memory_constraint} can be useful if +specifically aligned memory is necessary or desirable for some insn +operand. + +The syntax and semantics are otherwise identical to +@code{define_constraint}. +@end deffn + @deffn {MD Expression} define_address_constraint name docstring exp Use this expression for constraints that match a subset of all address operands: that is, @code{reload} can make the constraint match by Index: genoutput.c === --- genoutput.c (revision 232990) +++ genoutput.c (working copy) @@ -1019,6 +1019,7 @@ main (int argc, char **argv) case DEFINE_REGISTER_CONSTRAINT: case DEFINE_ADDRESS_CONSTRAINT: case DEFINE_MEMORY_CONSTRAINT: + case DEFINE_SPECIAL_MEMORY_CONSTRAINT: note_constraint (); break; Index: genpreds.c === --- genpreds.c (revision 232990) +++ genpreds.c (working copy) @@ -659,11 +659,11 @@ write_one_predicate_function (struct pre /* Constraints fall into two categories: register constraints (define_register_constraint), and others (define_constraint, - define_memory_constraint, define_address_constraint). We - work out automatically which of the various old-style macros - they correspond to, and produce appropriate code. They all - go in the same hash table so we can verify that there are no - duplicate names. */ + define_memory_constraint, define_special_memory_constraint, + define_address_constraint). We work out automatically which of the + various old-style macros they correspond to, and produce + appropriate code. They all go in the same hash table so we can + verify that there are no duplicate names. */ /* All data from one constraint definition. */ struct constraint_data @@ -681,6 +681,7 @@ struct constraint_data unsigned int is_const_dbl : 1; unsigned
Re: [C++ patch] report better diagnostic for static following '[' in parameter declaration
On 29/01/16 17:01, Prathamesh Kulkarni wrote: Thanks for the review. AFAIK the type-qualifiers would be const, restrict, volatile and _Atomic (n1570 p 6.7.3) ? I added a check for those and for variable length array. I am having issues with writing the test-case, some cases pass with -std=c++11 but fail with -std=c++98. Could you please have a look ? Is there _Atomic in C++? Also, why not simply reuse cp_parser_cv_qualifier_seq_opt (cp_parser* parser), perhaps adding a complain parameter that defaults to tf_error and calling it here with tf_none. I think you will get nicer errors if you don't set bounds to error-mark, just give the error, consume the tokens and continue as usual. Ideally, smart error-recovery should only be done when things already go wrong, thus after bounds = cp_parser_constant_expression (parser, /*allow_non_constant=*/true, _constant_p); if (!non_constant_p) /* OK */; fails, however our C++ parser tends to give errors quite deep in the stack instead of letting the caller decide what to do, which makes this too noisy in this case. Nonetheless, moving this error-recovery within: if (token->type != CPP_CLOSE_SQUARE){ } but before the above can only make the parser (marginally) faster for correct code. Cheers, Manuel.
Re: [C PATCH] Clear C_TYPE_INCOMPLETE_VARS even on variant types (PR debug/69518)
On Fri, Jan 29, 2016 at 08:41:18PM +0100, Jakub Jelinek wrote: > Hi! > > We ICE on the following testcase, because the C FE abuses TYPE_VFIELD > for its FE stuff, but may leak it to the middle-end. > We clear it for TYPE_MAIN_VARIANT (and only use it for that), but > for the other variants it could be non-NULL, because build_variant_type* > would just copy that field over. > > Fixed by clearing it on all the variant types. > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2016-01-29 Jakub Jelinek> > PR debug/69518 > * c-decl.c (finish_struct): Clear C_TYPE_INCOMPLETE_VARS in > all type variants, not just TYPE_MAIN_VARIANT. > > * gcc.dg/torture/pr69518.c: New test. Ok. Marek
[C PATCH] Clear C_TYPE_INCOMPLETE_VARS even on variant types (PR debug/69518)
Hi! We ICE on the following testcase, because the C FE abuses TYPE_VFIELD for its FE stuff, but may leak it to the middle-end. We clear it for TYPE_MAIN_VARIANT (and only use it for that), but for the other variants it could be non-NULL, because build_variant_type* would just copy that field over. Fixed by clearing it on all the variant types. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-01-29 Jakub JelinekPR debug/69518 * c-decl.c (finish_struct): Clear C_TYPE_INCOMPLETE_VARS in all type variants, not just TYPE_MAIN_VARIANT. * gcc.dg/torture/pr69518.c: New test. --- gcc/c/c-decl.c.jj 2016-01-27 20:32:17.0 +0100 +++ gcc/c/c-decl.c 2016-01-29 13:54:52.776583200 +0100 @@ -7842,6 +7842,14 @@ finish_struct (location_t loc, tree t, t } } + /* Note: C_TYPE_INCOMPLETE_VARS overloads TYPE_VFIELD which is used + in dwarf2out via rest_of_decl_compilation below and means + something totally different. Since we will be clearing + C_TYPE_INCOMPLETE_VARS shortly after we iterate through them, + clear it ahead of time and avoid problems in dwarf2out. Ideally, + C_TYPE_INCOMPLETE_VARS should use some language specific + node. */ + tree incomplete_vars = C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t)); for (x = TYPE_MAIN_VARIANT (t); x; x = TYPE_NEXT_VARIANT (x)) { TYPE_FIELDS (x) = TYPE_FIELDS (t); @@ -7849,6 +7857,7 @@ finish_struct (location_t loc, tree t, t C_TYPE_FIELDS_READONLY (x) = C_TYPE_FIELDS_READONLY (t); C_TYPE_FIELDS_VOLATILE (x) = C_TYPE_FIELDS_VOLATILE (t); C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t); + C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE; } /* If this was supposed to be a transparent union, but we can't @@ -7862,17 +7871,7 @@ finish_struct (location_t loc, tree t, t } /* If this structure or union completes the type of any previous - variable declaration, lay it out and output its rtl. - - Note: C_TYPE_INCOMPLETE_VARS overloads TYPE_VFIELD which is used - in dwarf2out via rest_of_decl_compilation below and means - something totally different. Since we will be clearing - C_TYPE_INCOMPLETE_VARS shortly after we iterate through them, - clear it ahead of time and avoid problems in dwarf2out. Ideally, - C_TYPE_INCOMPLETE_VARS should use some language specific - node. */ - tree incomplete_vars = C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t)); - C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t)) = 0; + variable declaration, lay it out and output its rtl. */ for (x = incomplete_vars; x; x = TREE_CHAIN (x)) { tree decl = TREE_VALUE (x); --- gcc/testsuite/gcc.dg/torture/pr69518.c.jj 2016-01-29 13:52:22.547656581 +0100 +++ gcc/testsuite/gcc.dg/torture/pr69518.c 2016-01-29 13:52:03.0 +0100 @@ -0,0 +1,11 @@ +/* PR debug/69518 */ +/* { dg-do compile } */ +/* { dg-options "-g" } */ + +struct A a; +typedef struct A B; +struct A {} +foo (B x) +{ + __builtin_abort (); +} Jakub
[PATCH] Fix up _Pragma GCC diagnostics regressions (PR preprocessor/69543, PR c/69558)
Hi! This patch reverts one tiny change from r228049 changes (which hasn't been mentioned in the ChangeLog or patch description). We definitely need to revisit this for GCC 7, but stage4 is probably not the right time for that, and the patch fixes e.g. tons of warnings (or with -Werror errors on including pretty much all glib2 headers). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-01-29 Jakub JelinekPR preprocessor/69543 PR c/69558 * c-pragma.c (handle_pragma_diagnostic): Pass input_location instead of loc to control_warning_option. * gcc.dg/pr69543.c: New test. * gcc.dg/pr69558.c: New test. --- gcc/c-family/c-pragma.c.jj 2016-01-15 21:57:00.0 +0100 +++ gcc/c-family/c-pragma.c 2016-01-29 18:34:51.743943283 +0100 @@ -817,9 +817,12 @@ handle_pragma_diagnostic(cpp_reader *ARG const char *arg = NULL; if (cl_options[option_index].flags & CL_JOINED) arg = option_string + 1 + cl_options[option_index].opt_len; + /* FIXME: input_location isn't the best location here, but it is + what we used to do here before and changing it breaks e.g. + PR69543 and PR69558. */ control_warning_option (option_index, (int) kind, arg, kind != DK_IGNORED, - loc, lang_mask, , + input_location, lang_mask, , _options, _options_set, global_dc); } --- gcc/testsuite/gcc.dg/pr69558.c.jj 2016-01-29 18:43:32.191665058 +0100 +++ gcc/testsuite/gcc.dg/pr69558.c 2016-01-29 18:40:05.0 +0100 @@ -0,0 +1,17 @@ +/* PR c/69558 */ +/* { dg-do compile } */ +/* { dg-options "-Wdeprecated-declarations" } */ + +#define A \ + _Pragma ("GCC diagnostic push") \ + _Pragma ("GCC diagnostic ignored \"-Wdeprecated-declarations\"") +#define B \ + _Pragma ("GCC diagnostic pop") +#define C(x) \ + A \ + static inline void bar (void) { x (); } \ + B + +__attribute__((deprecated)) void foo (void); /* { dg-bogus "declared here" } */ + +C (foo) /* { dg-bogus "is deprecated" } */ --- gcc/testsuite/gcc.dg/pr69543.c.jj 2016-01-29 18:45:09.520323395 +0100 +++ gcc/testsuite/gcc.dg/pr69543.c 2016-01-29 18:44:56.0 +0100 @@ -0,0 +1,18 @@ +/* PR preprocessor/69543 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -Wuninitialized" } */ + +# define YY_IGNORE_MAYBE_UNINITIALIZED_BEGIN \ +_Pragma ("GCC diagnostic push") \ +_Pragma ("GCC diagnostic ignored \"-Wuninitialized\"")\ +_Pragma ("GCC diagnostic ignored \"-Wmaybe-uninitialized\"") +# define YY_IGNORE_MAYBE_UNINITIALIZED_END \ +_Pragma ("GCC diagnostic pop") + +void test (char yylval) +{ + char *yyvsp; + YY_IGNORE_MAYBE_UNINITIALIZED_BEGIN + *++yyvsp = yylval; + YY_IGNORE_MAYBE_UNINITIALIZED_END +} Jakub
Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)
On 29.01.2016 16:47 Bernd Schmidt wrote: > On 01/29/2016 04:41 PM, Jakub Jelinek wrote: >> On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote: > >>> I think a better approach might be to just mark accesses at known >>> locations >>> in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with >>> non-constant or calculated offsets as potentially trapping. >> >> I don't see how that would work generally. Sure, if there is e.g. a >> constant >> offset array access, it could be checked easily, but if there is >> variable offset >> array access, that is at some point later on changed into a constant >> offset >> access, you'd need to be conservative, unless you can prove it is in >> range. > > Yes. What is the problem with that? If we have (plus sfp const_int) at > any point before reload, we can check whether that offset is inside > frame_size. If it isn't or if the offset isn't known, it could trap. > > Usually we have "if (x==1234) { read MEM[FP+x]; }", so wo don't know, and then after reload: "if (x==1234) { read MEM[SP+x+sp_fp_offset]; }" but wait, in the if statement we know, that x==1234, so everything turns in one magic constant, and we have a totally new constant offset from the SP register "if (x==1234) { read MEM[SP+1234+sp_fp_offset]; }". Now if rtx_addr_can_trap_p(MEM[SP+1234+sp_fp_offset]) says it cannot trap we think we do not need the if at all => BANG. Bernd.
[Patch, MIPS] Fix PR target/68273, passing args in wrong regs
This is a patch for PR 68273, where MIPS is passing arguments in the wrong registers. The problem is that when passing an argument by value, mips_function_arg_boundary was looking at the type of the argument (and not just the mode), and if that argument was a variable with extra alignment info (say an integer variable with 8 byte alignment), MIPS was aligning the argument on an 8 byte boundary instead of a 4 byte boundary the way it should. Since we are passing values (not variables), the alignment of the variable that the value is copied from should not affect the alignment of the value being passed. This patch fixes the problem and it could change what registers arguments are passed in, which means it could affect backwards compatibility with older programs. But since the current behaviour is not compliant with the MIPS ABI and does not match what LLVM does, I think we have to make this change. For the most part this will only affect arguments which are copied from variables that have non-standard alignments set by the aligned attribute, however the SRA optimization pass can create aligned variables as it splits aggregates apart and that was what triggered this bug report. This is basically the same bug as the ARM bug PR 65956 and the fix is pretty much the same too. Rather than create MIPS specific tests that check the use of specific registers I created two tests to put in gcc.c-torture/execute that were failing before because GCC on MIPS was not consistent in where arguments were passed and which now work with this patch. Tested with mips-mti-linux-gnu and no regressions. OK to checkin? Steve Ellcey sell...@imgtec.com 2016-01-29 Steve EllceyPR target/68273 * config/mips/mips.c (mips_function_arg_boundary): Fix argument alignment. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index dd54d6a..ecce3cd 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -5643,8 +5643,9 @@ static unsigned int mips_function_arg_boundary (machine_mode mode, const_tree type) { unsigned int alignment; - - alignment = type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode); + alignment = type && mode == BLKmode + ? TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) + : GET_MODE_ALIGNMENT (mode); if (alignment < PARM_BOUNDARY) alignment = PARM_BOUNDARY; if (alignment > STACK_BOUNDARY) 2016-01-29 Steve Ellcey PR target/68273 * gcc.c-torture/execute/pr68273-1.c: New test. * gcc.c-torture/execute/pr68273-2.c: New test. diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c index e69de29..3ce07c6 100644 --- a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c @@ -0,0 +1,74 @@ +/* Make sure that the alignment attribute on an argument passed by + value does not affect the calling convention and what registers + arguments are passed in. */ + +extern void exit (int); +extern void abort (void); + +typedef int alignedint __attribute__((aligned(8))); + +int __attribute__((noinline)) +foo1 (int a, alignedint b) +{ return a + b; } + +int __attribute__((noinline)) +foo2 (int a, int b) +{ + return a + b; +} + +int __attribute__((noinline)) +bar1 (alignedint x) +{ + return foo1 (1, x); +} + +int __attribute__((noinline)) +bar2 (alignedint x) +{ + return foo1 (1, (alignedint) 99); +} + +int __attribute__((noinline)) +bar3 (alignedint x) +{ + return foo1 (1, x + (alignedint) 1); +} + +alignedint q = 77; + +int __attribute__((noinline)) +bar4 (alignedint x) +{ + return foo1 (1, q); +} + + +int __attribute__((noinline)) +bar5 (alignedint x) +{ + return foo2 (1, x); +} + +int __attribute__((noinline)) +use_arg_regs (int i, int j, int k) +{ + return i+j-k; +} + +int main() +{ + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (foo1 (19,13) != 32) abort (); + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (bar1 (-33) != -32) abort (); + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (bar2 (1) != 100) abort (); + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (bar3 (17) != 19) abort (); + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (bar4 (-33) != 78) abort (); + if (use_arg_regs (999, 999, 999) != 999) abort (); + if (bar5 (-84) != -83) abort (); + exit (0); +} diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c index e69de29..1661be9 100644 --- a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c +++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c @@ -0,0 +1,109 @@ +/* Make sure that the alignment attribute on an argument passed by + value does not affect the calling convention and what registers + arguments are passed in. */ + +extern void exit (int); +extern void abort (void); + +typedef struct s { + char c; + char