RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being run for the n32 and 64 ABIs

2016-01-29 Thread Andrew Bennett
> This is OK now.

Committed as SVN 232980.

Regards,



Andrew


[PATCH] Fix wide_int unsigned division (PR tree-optimization/69546)

2016-01-29 Thread Jakub Jelinek
Hi!

As the testcase shows, wide_int unsigned division is broken for > 64bit
precision division of unsigned dividend which have 63rd bit set, and all
higher bits cleared (thus is normalized as 2 HWIs, first with MSB set,
the second 0) and divisor of 1, we return just a single HWI, which is
equivalent to all higher bits set too.  If the divisor is > 1, there is
no such problem, as the MSB will not be set after the division.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-29  Jakub Jelinek  

PR tree-optimization/69546
* wide-int.cc (wi::divmod_internal): For unsigned division
where both operands fit into uhwi, if o1 is 1 and o0 has
msb set, if divident_prec is larger than bits per hwi,
clear another quotient word and return 2 instead of 1.

* gcc.dg/torture/pr69546.c: New test.

--- gcc/wide-int.cc.jj  2016-01-26 11:46:39.0 +0100
+++ gcc/wide-int.cc 2016-01-29 11:59:33.348852003 +0100
@@ -1788,15 +1788,25 @@ wi::divmod_internal (HOST_WIDE_INT *quot
 {
   unsigned HOST_WIDE_INT o0 = dividend.to_uhwi ();
   unsigned HOST_WIDE_INT o1 = divisor.to_uhwi ();
+  unsigned int quotient_len = 1;
 
   if (quotient)
-   quotient[0] = o0 / o1;
+   {
+ quotient[0] = o0 / o1;
+ if (o1 == 1
+ && (HOST_WIDE_INT) o0 < 0
+ && dividend_prec > HOST_BITS_PER_WIDE_INT)
+   {
+ quotient[1] = 0;
+ quotient_len = 2;
+   }
+   }
   if (remainder)
{
  remainder[0] = o0 % o1;
  *remainder_len = 1;
}
-  return 1;
+  return quotient_len;
 }
 
   /* Make the divisor and dividend positive and remember what we
--- gcc/testsuite/gcc.dg/torture/pr69546.c.jj   2016-01-29 12:06:03.148516651 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr69546.c  2016-01-29 12:08:17.847672967 
+0100
@@ -0,0 +1,26 @@
+/* PR tree-optimization/69546 */
+/* { dg-do run { target int128 } } */
+
+unsigned __int128 __attribute__ ((noinline, noclone))
+foo (unsigned long long x)
+{
+  unsigned __int128 y = ~0ULL;
+  x >>= 63;
+  return y / (x | 1);
+}
+
+unsigned __int128 __attribute__ ((noinline, noclone))
+bar (unsigned long long x)
+{
+  unsigned __int128 y = ~33ULL;
+  x >>= 63;
+  return y / (x | 1);
+}
+
+int
+main ()
+{
+  if (foo (1) != ~0ULL || bar (17) != ~33ULL)
+__builtin_abort ();
+  return 0;
+}

Jakub


[PATCH][AArch64] PR target/69161: Don't use special predicate for CCmode comparisons in expressions that require matching modes

2016-01-29 Thread Kyrill Tkachov

Hi all,

In this PR we ICE during combine when trying to propagate a comparison into a 
vec_duplicate,
that is we end up creating the rtx:
(vec_duplicate:V4SI (eq:CC_NZ (reg:CC_NZ 66 cc)
(const_int 0 [0])))

The documentation for vec_duplicate says:
"The output vector mode must have the same submodes as the input vector mode or the 
scalar modes"
So this is invalid RTL, which triggers an assert in simplify-rtx to that effect.

It has been suggested on the PR that this is because we use a special_predicate 
for
aarch64_comparison_operator which means that it ignores the mode when matching.
This is fine when used in RTXes that don't need it, like if_then_else 
expressions
but can cause trouble when used in places where the modes do matter, like in
SET operations. In this particular ICE the cause was the conditional store
patterns that could end up matching an intermediate rtx during combine of
(set (reg:SI) (eq:CC_NZ x y)).

The suggested solution is to define a separate predicate with the same
conditions as aarch64_comparison_operator but make it not special, so it gets
automatic mode checks to prevent such a situation.

This patch does that.
Bootstrapped and tested on aarch64-linux-gnu.
SPEC2006 codegen did not change with this patch, so there shouldn't be
any code quality regressions.

Ok for trunk?

Thanks,
Kyrill

2016-01-29  Kyrylo Tkachov  

PR target/69161
* config/aarch64/predicates.md (aarch64_comparison_operator_mode):
New predicate.
(aarch64_comparison_operator): Break overly long line into two.
(aarch64_comparison_operation): Likewise.
* config/aarch64/aarch64.md (cstorecc4): Use
aarch64_comparison_operator_mode instead of
aarch64_comparison_operator.
(cstore4): Likewise.
(aarch64_cstore): Likewise.
(*cstoresi_insn_uxtw): Likewise.
(cstore_neg): Likewise.
(*cstoresi_neg_uxtw): Likewise.

2016-01-29  Kyrylo Tkachov  

PR target/69161
* gcc.c-torture/compile/pr69161.c: New test.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f900c12cfb4d108fd4c1671c75b465966befee06..46ca6588c93d793668808fa2e3accaa038ea71d4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2957,7 +2957,7 @@ (define_expand "cstore4"
 
 (define_expand "cstorecc4"
   [(set (match_operand:SI 0 "register_operand")
-   (match_operator 1 "aarch64_comparison_operator"
+   (match_operator 1 "aarch64_comparison_operator_mode"
 	[(match_operand 2 "cc_register")
  (match_operand 3 "const0_operand")]))]
   ""
@@ -2969,7 +2969,7 @@ (define_expand "cstorecc4"
 
 (define_expand "cstore4"
   [(set (match_operand:SI 0 "register_operand" "")
-	(match_operator:SI 1 "aarch64_comparison_operator"
+	(match_operator:SI 1 "aarch64_comparison_operator_mode"
 	 [(match_operand:GPF 2 "register_operand" "")
 	  (match_operand:GPF 3 "aarch64_fp_compare_operand" "")]))]
   ""
@@ -2982,7 +2982,7 @@ (define_expand "cstore4"
 
 (define_insn "aarch64_cstore"
   [(set (match_operand:ALLI 0 "register_operand" "=r")
-	(match_operator:ALLI 1 "aarch64_comparison_operator"
+	(match_operator:ALLI 1 "aarch64_comparison_operator_mode"
 	 [(match_operand 2 "cc_register" "") (const_int 0)]))]
   ""
   "cset\\t%0, %m1"
@@ -3027,7 +3027,7 @@ (define_insn_and_split "*compare_cstore_insn"
 (define_insn "*cstoresi_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
-	 (match_operator:SI 1 "aarch64_comparison_operator"
+	 (match_operator:SI 1 "aarch64_comparison_operator_mode"
 	  [(match_operand 2 "cc_register" "") (const_int 0)])))]
   ""
   "cset\\t%w0, %m1"
@@ -3036,7 +3036,7 @@ (define_insn "*cstoresi_insn_uxtw"
 
 (define_insn "cstore_neg"
   [(set (match_operand:ALLI 0 "register_operand" "=r")
-	(neg:ALLI (match_operator:ALLI 1 "aarch64_comparison_operator"
+	(neg:ALLI (match_operator:ALLI 1 "aarch64_comparison_operator_mode"
 		  [(match_operand 2 "cc_register" "") (const_int 0)])))]
   ""
   "csetm\\t%0, %m1"
@@ -3047,7 +3047,7 @@ (define_insn "cstore_neg"
 (define_insn "*cstoresi_neg_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(zero_extend:DI
-	 (neg:SI (match_operator:SI 1 "aarch64_comparison_operator"
+	 (neg:SI (match_operator:SI 1 "aarch64_comparison_operator_mode"
 		  [(match_operand 2 "cc_register" "") (const_int 0)]]
   ""
   "csetm\\t%w0, %m1"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index e96dc000bea8470daa187dfd7c44e9c9993dbb0f..04cb695b4f038c9a9be5470598939e1ad20f36f4 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -229,10 +229,17 @@ (define_predicate "aarch64_reg_or_imm"
 
 ;; True for integer comparisons and for FP comparisons other than LTGT or UNEQ.
 (define_special_predicate "aarch64_comparison_operator"
-  (match_code "eq,ne,le,lt,ge,gt,geu,gtu,leu,ltu,unordered,ordered,unlt,unle,unge,ungt"))
+  (match_code 

[committed] Fix SSE1 V4SImode vector insert (PR target/69551)

2016-01-29 Thread Jakub Jelinek
Hi!

The following patch fixes a bug in the V4SImode ix86_expand_vector_set
SSE1 handling, before we recurse, we need to copy the original target
to the temporary, otherwise we set just the single element and leave
the rest of the elements uninitialized.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved
by Uros in the PR, committed to trunk and 5.x so far.

2016-01-29  Jakub Jelinek  

PR target/69551
* config/i386/i386.c (ix86_expand_vector_set) : For
SSE1, copy target into the temporary reg first before recursing
on it.

* gcc.target/i386/pr69551.c: New test.

--- gcc/config/i386/i386.c.jj   2016-01-28 15:07:25.0 +0100
+++ gcc/config/i386/i386.c  2016-01-29 13:02:32.100788474 +0100
@@ -46744,6 +46744,7 @@ ix86_expand_vector_set (bool mmx_ok, rtx
{
  /* For SSE1, we have to reuse the V4SF code.  */
  rtx t = gen_reg_rtx (V4SFmode);
+ emit_move_insn (t, gen_lowpart (V4SFmode, target));
  ix86_expand_vector_set (false, t, gen_lowpart (SFmode, val), elt);
  emit_move_insn (target, gen_lowpart (mode, t));
}
--- gcc/testsuite/gcc.target/i386/pr69551.c.jj  2016-01-29 13:10:46.338993771 
+0100
+++ gcc/testsuite/gcc.target/i386/pr69551.c 2016-01-29 13:09:49.0 
+0100
@@ -0,0 +1,23 @@
+/* PR target/69551 */
+/* { dg-do run { target sse_runtime } } */
+/* { dg-options "-O2 -mno-sse2 -msse" } */
+
+typedef unsigned char v16qi __attribute__ ((vector_size (16)));
+typedef unsigned int v4si __attribute__ ((vector_size (16)));
+
+char __attribute__ ((noinline, noclone))
+test (v4si vec)
+{
+  vec[1] = 0x5fb856;
+  return ((v16qi) vec)[0];
+}
+
+int
+main ()
+{
+  char z = test ((v4si) { -1, -1, -1, -1 });
+
+  if (z != -1)
+__builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH] Fix PR69547

2016-01-29 Thread Jakub Jelinek
On Fri, Jan 29, 2016 at 09:40:48AM +0100, Richard Biener wrote:
> I am testing the following patch to fix a regression that we no longer
> remove some empty loops.

Doesn't this mean that DCE will remove the clobbers as unnecessary, even
when they aren't in empty loops?

Jakub


Re: [PATCH] S/390: Require a hardware vector support for test to succeed.

2016-01-29 Thread Andreas Krebbel
On Wed, Jan 27, 2016 at 11:04:32AM +0100, Dominik Vogt wrote:
> gcc/testsuite/ChangeLog
> 
>   * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Require a hardware vector support for
>   test to succeed.

Applied.  Thanks!

-Andreas-



Re: [PATCH] s390: Add -fsplit-stack support

2016-01-29 Thread Andreas Krebbel
Hi Marcin,

sorry for the late feedback.

A few comments regarding the split stack implementation:

The GNU coding style requires to replace every 8 leading blanks on a
line with a tab.  There are many lines in your patch violating this.
In case you are an emacs user `whitespace-cleanup' will fix this for
you.

Could you please add a testcase checking the different
variants. I.e. with early exit, no-alloc in __morestack, and with an
actual allocation?

There are a few more comments inline.

Bye,

-Andreas-

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index c881d52..71f6f38 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,5 +1,38 @@
>  2016-01-16  Marcin Kościelnicki  
> 
> + * common/config/s390/s390-common.c (s390_supports_split_stack):
> + New function.
> + (TARGET_SUPPORTS_SPLIT_STACK): New macro.
> + * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
> + * config/s390/s390.c (struct machine_function): New field
> + split_stack_varargs_pointer.
> + (s390_register_info): Mark r12 as clobbered if it'll be used as temp
> + in s390_emit_prologue.
> + (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
> + vararg pointer.
> + (morestack_ref): New global.
> + (SPLIT_STACK_AVAILABLE): New macro.
> + (s390_expand_split_stack_prologue): New function.
> + (s390_expand_split_stack_call): New function.
> + (s390_live_on_entry): New function.
> + (s390_va_start): Use split-stack vararg pointer if appropriate.
> + (s390_reorg): Lower the split-stack pseudo-insns.
> + (s390_asm_file_end): Emit the split-stack note sections.
> + (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
> + * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
> + (UNSPECV_SPLIT_STACK_CALL): New unspec.
> + (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
> + (UNSPECV_SPLIT_STACK_MARKER): New unspec.
> + (split_stack_prologue): New expand.
> + (split_stack_call_*): New insn.
> + (split_stack_cond_call_*): New insn.
> + (split_stack_space_check): New expand.
> + (split_stack_sibcall_*): New insn.
> + (split_stack_cond_sibcall_*): New insn.
> + (split_stack_marker): New insn.
> +
> +2016-01-02  Marcin Kościelnicki  
> +
>   * cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
>   with side effects.
> 
> diff --git a/gcc/common/config/s390/s390-common.c 
> b/gcc/common/config/s390/s390-common.c
> index 4519c21..1e497e6 100644
> --- a/gcc/common/config/s390/s390-common.c
> +++ b/gcc/common/config/s390/s390-common.c
> @@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
> ATTRIBUTE_UNUSED,
>  }
>  }
> 
> +/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
> +   We don't verify it, since earlier versions just have padding at
> +   its place, which works just as well.  */
> +
> +static bool
> +s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
> +struct gcc_options *opts ATTRIBUTE_UNUSED)
> +{
> +  return true;
> +}
> +
>  #undef TARGET_DEFAULT_TARGET_FLAGS
>  #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)
> 
> @@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts 
> ATTRIBUTE_UNUSED,
>  #undef TARGET_OPTION_INIT_STRUCT
>  #define TARGET_OPTION_INIT_STRUCT s390_option_init_struct
> 
> +#undef TARGET_SUPPORTS_SPLIT_STACK
> +#define TARGET_SUPPORTS_SPLIT_STACK s390_supports_split_stack
> +
>  struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
> diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
> index 633bc1e..09032c9 100644
> --- a/gcc/config/s390/s390-protos.h
> +++ b/gcc/config/s390/s390-protos.h
> @@ -42,6 +42,7 @@ extern bool s390_handle_option (struct gcc_options *opts 
> ATTRIBUTE_UNUSED,
>  extern HOST_WIDE_INT s390_initial_elimination_offset (int, int);
>  extern void s390_emit_prologue (void);
>  extern void s390_emit_epilogue (bool);
> +extern void s390_expand_split_stack_prologue (void);
>  extern bool s390_can_use_simple_return_insn (void);
>  extern bool s390_can_use_return_insn (void);
>  extern void s390_function_profiler (FILE *, int);
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 3be64de..6afce7c 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -426,6 +426,13 @@ struct GTY(()) machine_function
>/* True if the current function may contain a tbegin clobbering
>   FPRs.  */
>bool tbegin_p;
> +
> +  /* For -fsplit-stack support: A stack local which holds a pointer to
> + the stack arguments for a function with a variable number of
> + arguments.  This is set at the start of the function and is used
> + to initialize the overflow_arg_area field of the va_list
> + structure.  */
> +  rtx split_stack_varargs_pointer;
>  };
> 
>  /* Few accessor macros for struct cfun->machine->s390_frame_layout.  */
> @@ -9316,9 +9323,13 @@ 

[committed] Add testcase for PR66137

2016-01-29 Thread Jakub Jelinek
Hi!

This PR has been fixed by the PR68701, I've committed
the testcase as obvious to trunk.

2016-01-29  Jakub Jelinek  

PR target/66137
* gcc.target/i386/pr66137.c: New test.

--- gcc/testsuite/gcc.target/i386/pr66137.c.jj  2016-01-29 15:05:19.804958974 
+0100
+++ gcc/testsuite/gcc.target/i386/pr66137.c 2016-01-29 15:04:53.0 
+0100
@@ -0,0 +1,11 @@
+/* PR target/66137 */
+/* { dg-do compile } */
+/* { dg-options "-mavx -O3 -ffixed-ebp" } */
+
+void
+foo (char *x, char *y, char *z, int a)
+{
+  int i;
+  for (i = a; i > 0; i--)
+*x++ = *y++ = *z++;
+}

Jakub


[hsa] Atomic assess memory model fixes

2016-01-29 Thread Martin Jambor
Hi,

this is a followup to comments by Jakub and Richi on handling of
memory models in HSA atomic operations:

- I have made user-visible diagnostics lower case simple words, rather
  than constant identifiers.

- I have added masking by MEMMODEL_BASE_MASK where appropriate.

- I have made sure that warning code does not crash even when it
  encounters an unknown model and that it never warns multiple times.

- I have fixed handling of atomic load operations which wrongly
  insisted on release semantics instead of acquire (apart from
  relaxed).

- And last but not least, after looking at the respective
  documentations, I have convinced myself that __ATOMIC_SEQ_CST can be
  implemented using the HSA scacq, screl and scar memory orders, so I
  implemented that.

Bootstrapped and tested on x86_64-linux.  Since all of the above seems
to be worth fixing and low risk, I am going to commit it trunk even at
this stage, even though of course nothing in HSA is a regression.

Thanks,

Martin


2016-01-29  Martin Jambor  

* hsa-gen.c (get_memory_order_name): Mask with MEMMODEL_BASE_MASK.
Use short lowercase names.
(get_memory_order): Mask with MEMMODEL_BASE_MASK.  Support
MEMMODEL_CONSUME with acquire semantics and MEMMODEL_SEQ_CST with
acq_rel one.  Protect warning agains segfaults if
get_memory_order_name returns NULL.
(gen_hsa_ternary_atomic_for_builtin): Support with MEMMODEL_SEQ_CST
with release semantics.  Do not warn if get_memory_order already did.
(gen_hsa_insns_for_call): Support with MEMMODEL_SEQ_CST with acquire
semantics.  Fix check for relaxed or acquire semantics.  Do not warn
if get_memory_order already did.
---
 gcc/hsa-gen.c | 59 ---
 1 file changed, 40 insertions(+), 19 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index e8f80da..768c2cf 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -4415,20 +4415,20 @@ get_address_from_value (tree val, hsa_bb *hbb)
 static const char *
 get_memory_order_name (unsigned memmodel)
 {
-  switch (memmodel)
+  switch (memmodel & MEMMODEL_BASE_MASK)
 {
 case MEMMODEL_RELAXED:
-  return "__ATOMIC_RELAXED";
+  return "relaxed";
 case MEMMODEL_CONSUME:
-  return "__ATOMIC_CONSUME";
+  return "consume";
 case MEMMODEL_ACQUIRE:
-  return "__ATOMIC_ACQUIRE";
+  return "acquire";
 case MEMMODEL_RELEASE:
-  return "__ATOMIC_RELEASE";
+  return "release";
 case MEMMODEL_ACQ_REL:
-  return "__ATOMIC_ACQ_REL";
+  return "acq_rel";
 case MEMMODEL_SEQ_CST:
-  return "__ATOMIC_SEQ_CST";
+  return "seq_cst";
 default:
   return NULL;
 }
@@ -4440,21 +4440,31 @@ get_memory_order_name (unsigned memmodel)
 static BrigMemoryOrder
 get_memory_order (unsigned memmodel, location_t location)
 {
-  switch (memmodel)
+  switch (memmodel & MEMMODEL_BASE_MASK)
 {
 case MEMMODEL_RELAXED:
   return BRIG_MEMORY_ORDER_RELAXED;
+case MEMMODEL_CONSUME:
+  /* HSA does not have an equivalent, but we can use the slightly stronger
+ACQUIRE.  */
 case MEMMODEL_ACQUIRE:
   return BRIG_MEMORY_ORDER_SC_ACQUIRE;
 case MEMMODEL_RELEASE:
   return BRIG_MEMORY_ORDER_SC_RELEASE;
 case MEMMODEL_ACQ_REL:
+case MEMMODEL_SEQ_CST:
+  /* Callers implementing a simple load or store need to remove the release
+or acquire part respectively.  */
   return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
 default:
-  HSA_SORRY_ATV (location,
-"support for HSA does not implement memory model: %s",
-get_memory_order_name (memmodel));
-  return BRIG_MEMORY_ORDER_NONE;
+  {
+   const char *mmname = get_memory_order_name (memmodel);
+   HSA_SORRY_ATV (location,
+  "support for HSA does not implement the specified "
+  " memory model%s %s",
+  mmname ? ": " : "", mmname ? mmname : "");
+   return BRIG_MEMORY_ORDER_NONE;
+  }
 }
 }
 
@@ -4523,13 +4533,20 @@ gen_hsa_ternary_atomic_for_builtin (bool ret_orig,
   nops = 2;
 }
 
-  if (acode == BRIG_ATOMIC_ST && memorder != BRIG_MEMORY_ORDER_RELAXED
-  && memorder != BRIG_MEMORY_ORDER_SC_RELEASE)
+  if (acode == BRIG_ATOMIC_ST)
 {
-  HSA_SORRY_ATV (gimple_location (stmt),
-"support for HSA does not implement memory model for "
-"ATOMIC_ST: %s", get_memory_order_name (mmodel));
-  return;
+  if (memorder == BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE)
+   memorder = BRIG_MEMORY_ORDER_SC_RELEASE;
+
+  if (memorder != BRIG_MEMORY_ORDER_RELAXED
+ && memorder != BRIG_MEMORY_ORDER_SC_RELEASE
+ && memorder != BRIG_MEMORY_ORDER_NONE)
+   {
+ HSA_SORRY_ATV (gimple_location (stmt),
+"support for HSA does not implement memory 

Re: [off-list] Re: [PATCH PR68542]

2016-01-29 Thread Yuri Rumyantsev
Uros,

Here is update patch which includes (1) couple changes proposed by
Richard in tree-vect-loop.c and (2) the changes in back-end proposed
by you.

Is it OK for trunk?
Bootstrap and regression testing dis not show any new failures.

ChangeLog:

2016-01-29  Yuri Rumyantsev  

PR middle-end/68542
* config/i386/i386.c (ix86_expand_branch): Add support for conditional
branch with vector comparison.
*config/i386/sse.md (Vi48_AVX): New mode iterator.
(define_expand "cbranch4): Add support for conditional branch
with vector comparison.
* tree-vect-loop.c (optimize_mask_stores): New function.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
vectorized loops having masked stores after vec_info destroy.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
correspondent macros.
(optimize_mask_stores): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-mask-store-move-1.c: New test.
* testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c: Likewise.

2016-01-29 15:26 GMT+03:00 Uros Bizjak :
> On Fri, Jan 29, 2016 at 1:20 PM, Yuri Rumyantsev  wrote:
>> Uros,
>>
>> Thanks for your comments.
>> I deleted swap of operands as you told.
>> Let me explain my point in adding support for conditional branches
>> with vector comparison.
>> This feature is used to put  vectorized masked stores  and its
>> producers under guard that checks that mask is not zero, i.e. if mask
>> which is result of other vector computations is zero we don't need to
>> execute correspondent masked store and its producers if they don't
>> have other uses. It means that only integer 128-bit and 256-bit
>> vectors must be accepted as operands of cbranch. I did not introduce
>> new iterator but simply used existence iterator V48_AVX2. BTW you
>> proposed to add new iterator VI_AVX but it would be better to ad
>> VI48_AVX as
>>
>> (define_mode_iterator Vi48_AVX
>> [(V4SI "TARGET_AVX") (V2DI "TARGET_AVX")
>> (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")])
>>
>> I also don't think that we need to add support in expand_compare since
>> such comparisons are not  generated.
>
> OK with me. If there is no need for cstore pattern, then the
> comparison can be integrated with existing code in expand_branch by
> using ""goto simple" as is already the case there.
>
> BR,
> Uros.
>
>> 2016-01-28 20:08 GMT+03:00 Uros Bizjak :
>>> Yuri,
>>>
>>> please find attached a target-dependent patch that illustrates my
>>> review remarks. The patch is lightly tested, and it produces desired
>>> ptest insns on the testcases you provided.
>>>
>>> Some further remarks:
>>>
>>> +  tmp = gen_rtx_fmt_ee (code, VOIDmode, flag, const0_rtx);
>>> +  if (code == EQ)
>>> +tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp,
>>> +gen_rtx_LABEL_REF (VOIDmode, label), pc_rtx);
>>> +  else
>>> +tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp,
>>> +pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label));
>>> +  emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
>>> +  return;
>>>
>>> The above code is IMO wrong. You don't need to swap the arms of the
>>> target, since "code" will generate je or jne. Please see the attached
>>> patch.
>>>
>>> BTW: Maybe we can introduce corresponding cstore pattrn to use ptest
>>> in order to more efficiently vectorize code like:
>>>
>>> --cut here--
>>> int a[256];
>>>
>>> int foo (void)
>>> {
>>>   int ret = 0;
>>>   int i;
>>>
>>>   for (i = 0; i < 256; i++)
>>> {
>>>   if (a[i] != 0)
>>> ret = 1;
>>> }
>>>   return ret;
>>> }
>>> --cut here--
>>>
>>> Uros.


PR68542.patch.3
Description: Binary data


[PATCH][ARM] PR target/69161: Don't ignore mode when matching comparison operator in cstore-like patterns

2016-01-29 Thread Kyrill Tkachov

Hi all,

Similar to aarch64, the arm port also suffers from PR target/69161 when combine
tries to propagate a CCmode comparison into a vec_duplicate, creating invalid
RTL that ICEs.
Please refer to the PR and the aarch64 fix for more info.
The fix for arm is very similar.
We define a new predicate identical to arm_comparison_operator but
make it not special so that it gets the normal mode checks.
This prevents combine from matching an intermediate CCmode cstore
(where it's doing an SImode SET of a CCmode source) which it then
tries to propagate into a V4SImode vec_duplicate.

The offending patterns are the cstore patterns, so this patch updates them
to use the new predicate with mode checks. Both arm and thumb patterns
are updated.

There was no codegen difference observed on SPEC2006 for arm.
Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2016-01-29  Kyrylo Tkachov  

PR target/69161
* config/arm/predicates.md (arm_comparison_operator_mode):
New predicate.
* config/arm/arm.md (*mov_scc): Use arm_comparison_operator_mode
instead of arm_comparison_operator.
(*mov_negscc): Likewise.
(*mov_notscc): Likewise.
* config/arm/thumb2.md (*thumb2_mov_scc): Likewise.
(*thumb2_mov_negscc): Likewise.
(*thumb2_mov_negscc_strict_it): Likewise.
(*thumb2_mov_notscc): Likewise.
(*thumb2_mov_notscc_strict_it): Likewise.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 5129e858578dd3f3c3c46b089c96011f6a6423c3..15b4a4a1278c6be14dca1887aff8dc7c7a8fc16d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -7190,7 +7190,7 @@ (define_expand "cstore_cc"
 
 (define_insn_and_split "*mov_scc"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(match_operator:SI 1 "arm_comparison_operator"
+	(match_operator:SI 1 "arm_comparison_operator_mode"
 	 [(match_operand 2 "cc_register" "") (const_int 0)]))]
   "TARGET_ARM"
   "#"   ; "mov%D1\\t%0, #0\;mov%d1\\t%0, #1"
@@ -7207,7 +7207,7 @@ (define_insn_and_split "*mov_scc"
 
 (define_insn_and_split "*mov_negscc"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(neg:SI (match_operator:SI 1 "arm_comparison_operator"
+	(neg:SI (match_operator:SI 1 "arm_comparison_operator_mode"
 		 [(match_operand 2 "cc_register" "") (const_int 0)])))]
   "TARGET_ARM"
   "#"   ; "mov%D1\\t%0, #0\;mvn%d1\\t%0, #0"
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index c66c31d5c6047aa7decfe7e95d111d5fbf6fb52e..b8f09ab6b109f80abe2df08a8b7f954f521ec1bf 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -341,6 +341,11 @@ (define_special_predicate "arm_comparison_operator"
   (and (match_operand 0 "expandable_comparison_operator")
(match_test "maybe_get_arm_condition_code (op) != ARM_NV")))
 
+;; Likewise, but don't ignore the mode.
+(define_predicate "arm_comparison_operator_mode"
+  (and (match_operand 0 "expandable_comparison_operator")
+   (match_test "maybe_get_arm_condition_code (op) != ARM_NV")))
+
 (define_special_predicate "lt_ge_comparison_operator"
   (match_code "lt,ge"))
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 3e762018e4d3761852dda6434d3a8e31166a8678..7368d0658da1afde05b2dfc40eda79e1c228df1f 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -370,7 +370,7 @@ (define_insn "*thumb2_cmpsi_neg_shiftsi"
 
 (define_insn_and_split "*thumb2_mov_scc"
   [(set (match_operand:SI 0 "s_register_operand" "=l,r")
-	(match_operator:SI 1 "arm_comparison_operator"
+	(match_operator:SI 1 "arm_comparison_operator_mode"
 	 [(match_operand 2 "cc_register" "") (const_int 0)]))]
   "TARGET_THUMB2"
   "#"   ; "ite\\t%D1\;mov%D1\\t%0, #0\;mov%d1\\t%0, #1"
@@ -388,7 +388,7 @@ (define_insn_and_split "*thumb2_mov_scc"
 
 (define_insn_and_split "*thumb2_mov_negscc"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(neg:SI (match_operator:SI 1 "arm_comparison_operator"
+	(neg:SI (match_operator:SI 1 "arm_comparison_operator_mode"
 		 [(match_operand 2 "cc_register" "") (const_int 0)])))]
   "TARGET_THUMB2 && !arm_restrict_it"
   "#"   ; "ite\\t%D1\;mov%D1\\t%0, #0\;mvn%d1\\t%0, #0"
@@ -407,7 +407,7 @@ (define_insn_and_split "*thumb2_mov_negscc"
 
 (define_insn_and_split "*thumb2_mov_negscc_strict_it"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
-	(neg:SI (match_operator:SI 1 "arm_comparison_operator"
+	(neg:SI (match_operator:SI 1 "arm_comparison_operator_mode"
 		 [(match_operand 2 "cc_register" "") (const_int 0)])))]
   "TARGET_THUMB2 && arm_restrict_it"
   "#"   ; ";mvn\\t%0, #0 ;it\\t%D1\;mov%D1\\t%0, #0\"
@@ -436,7 +436,7 @@ (define_insn_and_split "*thumb2_mov_negscc_strict_it"
 
 (define_insn_and_split "*thumb2_mov_notscc"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
-	(not:SI (match_operator:SI 1 "arm_comparison_operator"
+	(not:SI (match_operator:SI 1 "arm_comparison_operator_mode"
 		 [(match_operand 2 "cc_register" "") 

Re: [PATCH] Remove PTX link option

2016-01-29 Thread Alexander Monakov
On Mon, 11 Jan 2016, Alexander Monakov wrote:

> On Mon, 11 Jan 2016, Thomas Schwinge wrote:
> > Alexander, would you please also submit a fix for that for nvptx-tools'
> > nvptx-run.c?  (Or want me to do that?)
> 
> I can do that, along with another small change I used for -mgomp testing.

I have now done that as part of nvptx-tools pull request #10 here:
https://github.com/MentorEmbedded/nvptx-tools/pull/10
(the pull request also addresses other issues noted on the project's tracker)

Thanks.
Alexander


Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-29 Thread Bernd Edlinger
On 29.01.2016 02:09, Bernd Schmidt wrote:
> On 01/28/2016 12:36 AM, Eric Botcazou wrote:
>>> [cc-ing Eric as RTL maintainer]
>>
>> Sorry for the delay, the message apparently bounced]
>>
>>> IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large
>>> bit of LRA/reload logic:
>>>
>>> [...]
>>>
>>> Under the current interface macros like INITIAL_ELIMINATION_OFFSET
>>> are expected to trigger the calculation of the target's frame layout
>>> (since you need that information to answer the question).
>>> To me it seems wrong that we're attempting to call that sort of
>>> macro in a query routine like rtx_addr_can_trap_p_1.
>>
>> I'm a little uncomfortable stepping in here because, while I
>> essentially share
>> your objections and was opposed to the patch (I was almost sure that
>> it would
>> introduce maintainance issues for no practical benefit), I didn't
>> imagine that
>> such a failure mode would have been possible (computing an
>> approximation of
>> the frame layout after reload being problematic) so I didn't really
>> object to
>> being overruled after seeing Bernd's patch...
>>
>>> IMO we should cache the information we need@the start of each
>>> LRA/reload cycle.  This will be more robust, because there will
>>> be no accidental changes to global state either during or after
>>> LRA/reload.  It will also be more efficient because
>>> rtx_addr_can_trap_p_1 can read the cached variables rather
>>> than calling back into the target.
>>
>> That would be a better design for sure and would eliminate the
>> maintainance
>> issues I was originally afraid of.
>>
>> My recommendation would be to back out Bernd's patch for GCC 6.0
>> (again, it
>> doesn't fix any regression and, more importantly, any bug in real
>> software,
>> but only corner case bugs in dumb computer-generated testcases) but
>> with the
>> commitment to address the underlying issue for GCC 7.0 and backport
>> the fix to
>> GCC 6.x unless really impracticable.  That being said, it's ultimately
>> Jakub
>> and Richard's call.
>
> I'm on the fence; I do think the original problem is an issue we should
> fix, but I'm also not terribly happy with the implementation we have
> right now. Besides the issues already mentioned, doesn't it kind of
> assume these offsets are constant (which they definitely aren't,
> consider arg pushes for example)?
>

Yes that is right.  I saw it as a thing that could possibly happen more
often than we know, because it is difficult to spot a wrong code by
ordinary tests.  Even the reproducer from pr61047 does not crash when
it runs in the gcc testsuite, I had to tweak the example first to use
a larger offset to compensate the large number of environment values
that are passed from the test suite in each test execution.

Yes, rtx_addr_can_trap_p_1 does not know how many bytes are pushed on
the stack and I saw no easy way how to get to the REG_NOTES for
instance, because they are attached to the INSN and rtx_addr_can_trap_p
has only access to a MEM rtx.  It is no exact science, but the error is
on the safe side.

Nevertheless, with the data from the target hook the approximation is
good enough, to not pessimize any single bit of code that is generated
in stage2 vs. stage3, which would not have been the case without some
help from the target.

> I think a better approach might be to just mark accesses at known
> locations in the frame, or arg pushes, as MEM_NOTRAP_P, and consider
> accesses with non-constant or calculated offsets as potentially trapping.
>

Yes I think also that might be a next step.  It would probably be good
to somehow fixate the result of rtx_addr_can_trap_p immediately before
reload when the RTX's are still FRAMEP+x and ARGP+x, and annotate that
somehow to the reloaded RTX's, that way it would finally be superfluous
to call the target hook at all, because the actual addresses should not
change during or after reload.  It will probably have to be a spare bit
on the RTX that is currently unused, because MEM_NOTRAP_P is already
used for something different.

It will however not be simple to find a valid piece of C code where the
current implementation with all of its limitations generates different
code compared to an implementation that has access to the exact offsets.
As I said, I tried already, but could not find an example of a missed
optimization due to my patch.


Thanks
Bernd.

>
> Bernd


Re: Default compute dimensions

2016-01-29 Thread Jakub Jelinek
On Thu, Jan 28, 2016 at 10:38:51AM -0500, Nathan Sidwell wrote:
> This patch adds default compute dimension handling.  Users rarely specify
> compute dimensions, expecting the toolchain to DTRT.  More savvy users would
> like to specify global defaults.  This patch permits both.

Isn't it better to be able to override the defaults on the library side?
I mean, when when somebody is compiling the code, often he doesn't know the
exact properties of the hw it will be run on, if he does, I think it is
better to specify them explicitly in the code.  But if he doesn't, one just
has to hope libgomp will figure out the best defaults.
So, wouldn't it be better to add some env var that would allow to control
this instead?

Jakub


Re: [gomp4, PR68977, Committed] Don't gimplify in ssa mode if seen_error in oacc_xform_loop

2016-01-29 Thread Nathan Sidwell

On 01/29/16 02:48, Richard Biener wrote:


I see.  Is it possible to simply scrub the whole OACC region in this
case instead?


Do you mean jettison the body of the offloaded fn, or something else?  I guess 
the former's doable.  (Throwing away the fn entirely could result in unresolved 
symbol errors, which might confuse?)



Or even better, report those errors earlier?


Well, one could split the pass into two passes (I think) and move the first 
half.  But in general, these errors are only discoverable in the device compiler.


nathan
--
Nathan Sidwell - Director, Sourcery Services - Mentor Embedded


Re: [PATCH] New flag for dumping information about constexpr function calls memoization (GCC 5.2.0)

2016-01-29 Thread Andres Tiraboschi
2016-01-28 17:54 GMT-03:00 Joseph Myers :
> Any patch adding a new option needs to add documentation for it to
> invoke.texi (both substantive documentation, and inclusion in the summary
> lists of options).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

Hi,

Thanks for the feedback, I've just updated the documentation.

patch:


diff --git a/gcc/common.opt b/gcc/common.opt
index 1218a71..bf0c7df 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1168,6 +1168,10 @@ fdump-passes
 Common Var(flag_dump_passes) Init(0)
 Dump optimization passes

+fdump-memoization-hits
+Common Var(flag_dump_memoization_hits) Init(0)
+Dump info about constexpr calls memoized.
+
 fdump-unnumbered
 Common Report Var(flag_dump_unnumbered)
 Suppress output of instruction numbers, line number notes and
addresses in debugging dumps
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index e250726..41ae5b3 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -42,6 +42,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-inline.h"
 #include "ubsan.h"
+#include "tree-pretty-print.h"
+#include "dumpfile.h"

 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X)\
@@ -1173,6 +1175,14 @@ cx_error_context (void)
   return r;
 }

+static void
+dump_memoization_hit (FILE *file, tree call, int flags)
+{
+  fprintf(file, "Memoized call:\n");
+  print_generic_decl(file, call, flags);
+  fprintf(file, "\n");
+}
+
 /* Subroutine of cxx_eval_constant_expression.
Evaluate the call expression tree T in the context of OLD_CALL expression
evaluation.  */
@@ -1338,7 +1348,11 @@ cxx_eval_call_expression (const constexpr_ctx
*ctx, tree t,
   entry->result = result = error_mark_node;
 }
   else
-result = entry->result;
+{
+  if (flag_dump_memoization_hits)
+  dump_memoization_hit(stderr, t, 0);
+  result = entry->result;
+}
 }

   if (!depth_ok)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f84a199..b78bd4b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -322,6 +322,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol
 -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol
 -fdump-passes @gol
+-fdump-memoization-hits @gol
 -fdump-statistics @gol
 -fdump-tree-all @gol
 -fdump-tree-original@r{[}-@var{n}@r{]}  @gol
@@ -6771,6 +6772,10 @@ Dump after function inlining.
 Dump the list of optimization passes that are turned on and off by
 the current command-line options.

+@item -fdump-memoization-hits
+@opindex fdump-memoization-hits
+Dump the list of constexpr function calls that were memoized.
+
 @item -fdump-statistics-@var{option}
 @opindex fdump-statistics
 Enable and control dumping of pass statistics in a separate file.  The


Re: [PING, PATCH] Reduce accuracy of bessel_6.f90.

2016-01-29 Thread Dominik Vogt
On Mon, Jan 11, 2016 at 03:40:56PM +0100, Dominik Vogt wrote:
> Another patch reducing the accuracy required in the bessel_6 test.

Fixes the test case for S/390.  Can this be committed?

> gcc/testsuite/ChangeLog
> 
>   * gfortran.dg/bessel_6.f90: Reduce accuracy for S/390.

> >From 70a35dd6f6bf906d8e5907667ad0f04f981a61ac Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Mon, 11 Jan 2016 15:36:38 +0100
> Subject: [PATCH] S/390: Reduce accuracy of bessel_6.f90.
> 
> ---
>  gcc/testsuite/gfortran.dg/bessel_6.f90 | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gfortran.dg/bessel_6.f90 
> b/gcc/testsuite/gfortran.dg/bessel_6.f90
> index e0220f7..da917ff 100644
> --- a/gcc/testsuite/gfortran.dg/bessel_6.f90
> +++ b/gcc/testsuite/gfortran.dg/bessel_6.f90
> @@ -12,7 +12,7 @@
>  implicit none
>  real,parameter :: values(*) = [0.0, 0.5, 1.0, 0.9, 
> 1.8,2.0,3.0,4.0,4.25,8.0,34.53, 475.78] 
>  real,parameter :: myeps(size(values)) = epsilon(0.0) &
> -  * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 92, 15 ]
> +  * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 98, 15 ]
>  ! The following is sufficient for me - the values above are a bit
>  ! more tolerant
>  !  * [0, 5, 3, 4, 6, 7, 7, 5, 5, 6, 66, 4 ]
> -- 
> 2.3.0

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.

2016-01-29 Thread Andreas Krebbel
On Wed, Jan 27, 2016 at 12:01:26PM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   PR other/69006
>   * config/s390/s390-c.c (s390_resolve_overloaded_builtin): Remove
>   trailing blank line from error message.

Applied.  Thanks!

-Andreas-



Fix some i386 testcases for -frename-registers

2016-01-29 Thread Bernd Schmidt
This patch corrects some tests that can fail with -frename-registers. 
The problems typically are of the form "xmm[0-7]+", disallowing 
registers 8 and 9, and "xmm[0-9]". disallowing numbers higher than 9. 
Most the patch was automatically generated, but there were some other 
cases as well.


Enabling register renaming would help with PR57193, I'll propose that 
separately after a few more tests.


This was bootstrapped and tested on x86_64-linux. Ok?


Bernd
	* gcc.target/i386/avx512bw-vptestmb-1.c: Correct [xyz]mm register
	number scans.
	* gcc.target/i386/avx512bw-vptestmw-1.c: Likewise.
	* gcc.target/i386/avx512bw-vptestnmb-1.c: Likewise.
	* gcc.target/i386/avx512bw-vptestnmw-1.c: Likewise.
	* gcc.target/i386/avx512cd-vpbroadcastmb2q-1.c: Likewise.
	* gcc.target/i386/avx512cd-vpbroadcastmw2d-1.c: Likewise.
	* gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise.
	* gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise.
	* gcc.target/i386/avx512dq-vinsertf64x2-1.c: Likewise.
	* gcc.target/i386/avx512dq-vinserti64x2-1.c: Likewise.
	* gcc.target/i386/avx512f-gather-5.c: Likewise.
	* gcc.target/i386/avx512f-vptestmd-1.c: Likewise.
	* gcc.target/i386/avx512f-vptestmq-1.c: Likewise.
	* gcc.target/i386/avx512f-vptestnmd-1.c: Likewise.
	* gcc.target/i386/avx512f-vptestnmq-1.c: Likewise.
	* gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise.
	* gcc.target/i386/avx512vl-vpbroadcastmb2q-1.c: Likewise.
	* gcc.target/i386/avx512vl-vpbroadcastmw2d-1.c: Likewise.
	* gcc.target/i386/avx512vl-vptestmd-1.c: Likewise.
	* gcc.target/i386/avx512vl-vptestmq-1.c: Likewise.
	* gcc.target/i386/avx512vl-vptestnmd-1.c: Likewise.
	* gcc.target/i386/avx512vl-vptestnmq-1.c: Likewise.
	* gcc.target/i386/pr32219-2.c: Allow registers other than %eax in
	scans.
	* gcc.target/i386/pr32219-4.c: Likewise.
	* gcc.target/i386/pr32219-6.c: Likewise.
	* gcc.target/i386/pr32219-8.c: Likewise.

Index: gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c
===
--- gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c	(revision 232689)
+++ gcc/testsuite/gcc.target/i386/avx512bw-vptestmb-1.c	(working copy)
@@ -1,11 +1,11 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512bw -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmb\[ \\t\]+\[^\{\n\]*%zmm\[0-9\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
 
 #include 
 
Index: gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c
===
--- gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c	(revision 232689)
+++ gcc/testsuite/gcc.target/i386/avx512bw-vptestmw-1.c	(working copy)
@@ -1,11 +1,11 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512bw -mavx512vl -O2" } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\]\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%xmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%ymm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vptestmw\[ \\t\]+\[^\{\n\]*%zmm\[0-7\]+\[^\n\]*k\[1-7\](?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vptestmw\[ 

[Patch] Update GCC Internals: remove section Preserving virtual SSA form

2016-01-29 Thread Nicklas Bo Jensen
Hi,

The section "12.3.2 Preserving the virtual SSA form" in GCC Internals
is outdated. The two functions it documents push_stmt_changes and
pop_stmt_changes have been removed. The functionality have been
replaced with update_stmt. update_stmt is documented elsewhere in
internals. I therefore propose to remove section 12.3.2.

Furthermore, the function mark_stmt_modified have been replaced by
gimple_set_modified.

Please install this patch.

Best,
Nicklas

2016-01-29  Nicklas Bo Jensen  

* doc/tree-ssa.texi (Preserving the virtual SSA form): Remove
outdated section

index d795090..7ca607d 100644
--- a/gcc/doc/tree-ssa.texi
+++ b/gcc/doc/tree-ssa.texi
@@ -432,7 +432,7 @@ dominator optimizations currently do this.

 When lazy updating is being used, the immediate use information is out of date
 and cannot be used reliably.  Lazy updating is achieved by simply marking
-statements modified via calls to @code{mark_stmt_modified} instead of
+statements modified via calls to @code{gimple_set_modified} instead of
 @code{update_stmt}.  When lazy updating is no longer required, all the
 modified statements must have @code{update_stmt} called in order to bring them
 up to date.  This must be done before the optimization is finished, or
@@ -654,40 +654,6 @@ are explicitly destroyed and only the symbols marked for
 renaming are processed@.
 @end itemize

-@subsection Preserving the virtual SSA form
-@cindex preserving virtual SSA form
-
-The virtual SSA form is harder to preserve than the non-virtual SSA form
-mainly because the set of virtual operands for a statement may change at
-what some would consider unexpected times.  In general, statement
-modifications should be bracketed between calls to
-@code{push_stmt_changes} and @code{pop_stmt_changes}.  For example,
-
-@smallexample
-munge_stmt (tree stmt)
-@{
-   push_stmt_changes ();
-   @dots{} rewrite STMT @dots{}
-   pop_stmt_changes ();
-@}
-@end smallexample
-
-The call to @code{push_stmt_changes} saves the current state of the
-statement operands and the call to @code{pop_stmt_changes} compares
-the saved state with the current one and does the appropriate symbol
-marking for the SSA renamer.
-
-It is possible to modify several statements at a time, provided that
-@code{push_stmt_changes} and @code{pop_stmt_changes} are called in
-LIFO order, as when processing a stack of statements.
-
-Additionally, if the pass discovers that it did not need to make
-changes to the statement after calling @code{push_stmt_changes}, it
-can simply discard the topmost change buffer by calling
-@code{discard_stmt_changes}.  This will avoid the expensive operand
-re-scan operation and the buffer comparison that determines if symbols
-need to be marked for renaming.
-
 @subsection Examining @code{SSA_NAME} nodes
 @cindex examining SSA_NAMEs


Fix c/69522, memory management issue in c-parser

2016-01-29 Thread Bernd Schmidt

Let's say we have

struct a {
 int x[1];
 int y[1];
} x = { 0, { 0 } };
   ^

When we reach the marked brace, we call into push_init_level, where we 
notice that we have implicit initializers (for x[]) lying around that we 
should deal with now that we've seen another open brace. The problem is 
that we've created a new obstack for the initializer of y, and this is 
where we also put data for the inits of x, freeing it when we see the 
close brace for the initialization of y.


In the actual testcase, which is a little more complex to actually 
demonstrate the issue, we end up allocating two init elts at the same 
address (because of premature freeing) and place them in the same tree, 
which ends up containing a cycle because of this. Then we hang.


Fixed by this patch, which splits off a new function 
finish_implicit_inits from push_init_level and ensures it's called with 
the outer obstack instead of the new one in the problematic case.


Bootstrapped and tested on x86_64-linux, ok?


Bernd
c/
	PR c/69522
	* c-parser.c (c_parser_braced_init): New arg outer_obstack.  All
	callers changed.  If nested_p is true, use it to call
	finish_implicit_inits.
	* c-tree.h (finish_implicit_inits): Declare.
	* c-typeck.c (finish_implicit_inits): New function.  Move code
	from ...
	(push_init_level): ... here.
	(set_designator, process_init_element): Call finish_implicit_inits.

testsuite/
	PR c/69522
	gcc.dg/pr69522.c: New test.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 43c26ae..eac0b1c 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1284,7 +1284,8 @@ static tree c_parser_simple_asm_expr (c_parser *);
 static tree c_parser_attributes (c_parser *);
 static struct c_type_name *c_parser_type_name (c_parser *);
 static struct c_expr c_parser_initializer (c_parser *);
-static struct c_expr c_parser_braced_init (c_parser *, tree, bool);
+static struct c_expr c_parser_braced_init (c_parser *, tree, bool,
+	   struct obstack *);
 static void c_parser_initelt (c_parser *, struct obstack *);
 static void c_parser_initval (c_parser *, struct c_expr *,
 			  struct obstack *);
@@ -4289,7 +4290,7 @@ static struct c_expr
 c_parser_initializer (c_parser *parser)
 {
   if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
-return c_parser_braced_init (parser, NULL_TREE, false);
+return c_parser_braced_init (parser, NULL_TREE, false, NULL);
   else
 {
   struct c_expr ret;
@@ -4309,7 +4310,8 @@ c_parser_initializer (c_parser *parser)
top-level initializer in a declaration.  */
 
 static struct c_expr
-c_parser_braced_init (c_parser *parser, tree type, bool nested_p)
+c_parser_braced_init (c_parser *parser, tree type, bool nested_p,
+		  struct obstack *outer_obstack)
 {
   struct c_expr ret;
   struct obstack braced_init_obstack;
@@ -4318,7 +4320,10 @@ c_parser_braced_init (c_parser *parser, tree type, bool nested_p)
   gcc_assert (c_parser_next_token_is (parser, CPP_OPEN_BRACE));
   c_parser_consume_token (parser);
   if (nested_p)
-push_init_level (brace_loc, 0, _init_obstack);
+{
+  finish_implicit_inits (brace_loc, outer_obstack);
+  push_init_level (brace_loc, 0, _init_obstack);
+}
   else
 really_start_incremental_init (type);
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
@@ -4576,7 +4581,8 @@ c_parser_initval (c_parser *parser, struct c_expr *after,
   location_t loc = c_parser_peek_token (parser)->location;
 
   if (c_parser_next_token_is (parser, CPP_OPEN_BRACE) && !after)
-init = c_parser_braced_init (parser, NULL_TREE, true);
+init = c_parser_braced_init (parser, NULL_TREE, true,
+ braced_init_obstack);
   else
 {
   init = c_parser_expr_no_commas (parser, after);
@@ -8060,7 +8066,7 @@ c_parser_postfix_expression_after_paren_type (c_parser *parser,
   error_at (type_loc, "compound literal has variable size");
   type = error_mark_node;
 }
-  init = c_parser_braced_init (parser, type, false);
+  init = c_parser_braced_init (parser, type, false, NULL);
   finish_init ();
   maybe_warn_string_init (type_loc, type, init);
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 00e72b1..5902fd2 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -625,6 +625,7 @@ extern void maybe_warn_string_init (location_t, tree, struct c_expr);
 extern void start_init (tree, tree, int);
 extern void finish_init (void);
 extern void really_start_incremental_init (tree);
+extern void finish_implicit_inits (location_t, struct obstack *);
 extern void push_init_level (location_t, int, struct obstack *);
 extern struct c_expr pop_init_level (location_t, int, struct obstack *);
 extern void set_init_index (location_t, tree, tree, struct obstack *);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index a147ac6..e4e2944 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -7447,6 +7447,30 @@ really_start_incremental_init (tree type)
 }
 }
 
+/* Called when we see an open brace for a nested initializer.  Finish
+   

Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-29 Thread Jakub Jelinek
On Fri, Jan 29, 2016 at 11:35:07AM +0100, Jakub Jelinek wrote:
> I can try to stick there an assert whether for FUNCTION_DECL
> (DECL_INITIAL (decl) == 0) == DECL_EXTERNAL (decl).

Tried that, but cancelled that quickly, I see lots of cases where
DECL_INITIAL is non-NULL, but DECL_EXTERNAL is set, and some
where DECL_INITIAL is NULL, and DECL_EXTERNAL is not set,
at least in the other two spots (check_global_declaration in cgraphunit.c
and c-decl.c).  Haven't waited long enough to find out if the C++ FE is some
exception.

Jakub


Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-29 Thread Jason Merrill

On 01/29/2016 11:35 AM, Jakub Jelinek wrote:

On Thu, Jan 28, 2016 at 09:51:34PM -0500, Jason Merrill wrote:

On 01/28/2016 03:15 PM, Jakub Jelinek wrote:

+   if (TREE_CODE (decl) == FUNCTION_DECL
+   && DECL_INITIAL (decl) == 0
+   && DECL_EXTERNAL (decl)
+   && !TREE_PUBLIC (decl)
+   && !DECL_ARTIFICIAL (decl)
+   && !TREE_NO_WARNING (decl))


Do we need to check both DECL_INITIAL and DECL_EXTERNAL?


Dunno, but that is what cgraphunit.c does, c-decl.c too,
what the old toplev.c (check_global_declaration_1) did (back to at least
r26593 from ~ 1999), so I think we want some consistency.


OK.

Jason




[patch] libstdc++/69506 Fix Cygwin bootstrap error due to TM symbols

2016-01-29 Thread Jonathan Wakely

Another target that doesn't have the necessary weak ref support for
the TM-aware exception-handling.

Bootrapped successfully by the reporter, committed to trunk.
commit b7e2e38ab1d938ee19280eba11ed3643a140f86d
Author: Jonathan Wakely 
Date:   Fri Jan 29 10:38:45 2016 +

Fix Cygwin bootstrap error due to TM symbols

	PR libstdc++/69506
	* config/os/newlib/os_defines.h (_GLIBCXX_USE_WEAK_REF): Define.

diff --git a/libstdc++-v3/config/os/newlib/os_defines.h b/libstdc++-v3/config/os/newlib/os_defines.h
index 4a09dd1..2a87e74 100644
--- a/libstdc++-v3/config/os/newlib/os_defines.h
+++ b/libstdc++-v3/config/os/newlib/os_defines.h
@@ -53,6 +53,9 @@
 // their dtors are called
 #define _GLIBCXX_THREAD_ATEXIT_WIN32 1
 
+// See libstdc++/69506
+#define _GLIBCXX_USE_WEAK_REF 0
+
 #endif
 
 #endif


Re: Fix some i386 testcases for -frename-registers

2016-01-29 Thread Uros Bizjak
Hello!

> * gcc.target/i386/avx512bw-vptestmb-1.c: Correct [xyz]mm register
> number scans.
> * gcc.target/i386/avx512bw-vptestmw-1.c: Likewise.
> * gcc.target/i386/avx512bw-vptestnmb-1.c: Likewise.
> * gcc.target/i386/avx512bw-vptestnmw-1.c: Likewise.
> * gcc.target/i386/avx512cd-vpbroadcastmb2q-1.c: Likewise.
> * gcc.target/i386/avx512cd-vpbroadcastmw2d-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise.
> * gcc.target/i386/avx512dq-vinsertf64x2-1.c: Likewise.
> * gcc.target/i386/avx512dq-vinserti64x2-1.c: Likewise.
> * gcc.target/i386/avx512f-gather-5.c: Likewise.
> * gcc.target/i386/avx512f-vptestmd-1.c: Likewise.
> * gcc.target/i386/avx512f-vptestmq-1.c: Likewise.
> * gcc.target/i386/avx512f-vptestnmd-1.c: Likewise.
> * gcc.target/i386/avx512f-vptestnmq-1.c: Likewise.
> * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise.
> * gcc.target/i386/avx512vl-vpbroadcastmb2q-1.c: Likewise.
> * gcc.target/i386/avx512vl-vpbroadcastmw2d-1.c: Likewise.
> * gcc.target/i386/avx512vl-vptestmd-1.c: Likewise.
> * gcc.target/i386/avx512vl-vptestmq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vptestnmd-1.c: Likewise.
> * gcc.target/i386/avx512vl-vptestnmq-1.c: Likewise.
> * gcc.target/i386/pr32219-2.c: Allow registers other than %eax in
> scans.
> * gcc.target/i386/pr32219-4.c: Likewise.
> * gcc.target/i386/pr32219-6.c: Likewise.
> * gcc.target/i386/pr32219-8.c: Likewise.

OK.

Thanks,
Uros.


Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.

2016-01-29 Thread Jonathan Wakely

On 28/01/16 15:42 +0100, Jakub Jelinek wrote:

On Thu, Jan 28, 2016 at 01:32:18PM +, Jonathan Wakely wrote:

On 28/01/16 13:40 +0100, Dominik Vogt wrote:
>The attached patch (written by Jonathan, not me) makes
>FLT_EVAL_METHOD and DECIMAL_DIG available in C++-11 as they should
>be.
>
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462
>
>Can this be committed (should it wait for stage1)?

I've just noticed we should also do the following, although this can
definitely wait for stage 1 as it works fine as is (unlike Dominik's
 case which is a conformance bug).

--- a/gcc/ginclude/stdarg.h
+++ b/gcc/ginclude/stdarg.h
@@ -47,7 +47,7 @@ typedef __builtin_va_list __gnuc_va_list;
#define va_start(v,l)  __builtin_va_start(v,l)
#define va_end(v)  __builtin_va_end(v)
#define va_arg(v,l)__builtin_va_arg(v,l)
-#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || 
defined(__GXX_EXPERIMENTAL_CXX0X__)
+#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || __cplusplus + 
0 >= 201103L
#define va_copy(d,s)   __builtin_va_copy(d,s)
#endif
#define __va_copy(d,s) __builtin_va_copy(d,s)


This is ok, but please fix up the formatting (avoid too long line).


I've committed the attached version, wrapping the line.


commit 92f4d9e8d059d2d1bad1dcea30ec44b60a5c35e7
Author: Jonathan Wakely 
Date:   Fri Jan 29 11:58:17 2016 +

Test __cplusplus instead of __GXX_EXPERIMENTAL_CXX0X__

	* ginclude/stdarg.h: Test __cplusplus instead of
	__GXX_EXPERIMENTAL_CXX0X__.

diff --git a/gcc/ginclude/stdarg.h b/gcc/ginclude/stdarg.h
index 74a234d..6525152 100644
--- a/gcc/ginclude/stdarg.h
+++ b/gcc/ginclude/stdarg.h
@@ -47,7 +47,8 @@ typedef __builtin_va_list __gnuc_va_list;
 #define va_start(v,l)	__builtin_va_start(v,l)
 #define va_end(v)	__builtin_va_end(v)
 #define va_arg(v,l)	__builtin_va_arg(v,l)
-#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__)
+#if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L \
+|| __cplusplus + 0 >= 201103L
 #define va_copy(d,s)	__builtin_va_copy(d,s)
 #endif
 #define __va_copy(d,s)	__builtin_va_copy(d,s)


Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-29 Thread Jakub Jelinek
On Thu, Jan 28, 2016 at 09:51:34PM -0500, Jason Merrill wrote:
> On 01/28/2016 03:15 PM, Jakub Jelinek wrote:
> >+if (TREE_CODE (decl) == FUNCTION_DECL
> >+&& DECL_INITIAL (decl) == 0
> >+&& DECL_EXTERNAL (decl)
> >+&& !TREE_PUBLIC (decl)
> >+&& !DECL_ARTIFICIAL (decl)
> >+&& !TREE_NO_WARNING (decl))
> 
> Do we need to check both DECL_INITIAL and DECL_EXTERNAL?

Dunno, but that is what cgraphunit.c does, c-decl.c too,
what the old toplev.c (check_global_declaration_1) did (back to at least
r26593 from ~ 1999), so I think we want some consistency.
Either it is needed, or if it is not needed, then all the spots should
change, not just this one.
I can try to stick there an assert whether for FUNCTION_DECL
(DECL_INITIAL (decl) == 0) == DECL_EXTERNAL (decl).

Jakub


Re: Martin Jambor appointed HSA Maintainer

2016-01-29 Thread Martin Jambor
Hi,

On Fri, Dec 18, 2015 at 08:41:41AM -0500, David Edelsohn wrote:
>   I am pleased to announce that the GCC Steering Committee has
> appointed Martin Jambor as HSA maintainer.
> 
>   Please join me in congratulating Martin on his new role.
> Martin, please update your listing in the MAINTAINERS file.
> 

thank you very much for your trust.  I will do my best when carrying
out the associated duties.  Now that HSA is also in, I have committed
the following change to the MAINTAINERS file.

Martin

2016-01-29  Martin Jambor  

* MAINTAINERS (hsa maintainers): Add myself.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index a5afeb7..aa757ea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -209,6 +209,7 @@ fixincludes Bruce Korb  
 *gimpl*Jason Merrill   
 gcse.c Jeff Law
 global opt framework   Jeff Law
+hsaMartin Jambor   
 jump.c David S. Miller 
 web pages  Gerald Pfeifer  
 config.sub/config.guessBen Elliston
-- 
2.7.0



Re: [PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.

2016-01-29 Thread Dominik Vogt
On Wed, Jan 27, 2016 at 09:22:19AM -0500, David Malcolm wrote:
> On Wed, 2016-01-27 at 12:01 +0100, Dominik Vogt wrote:
> > The attached patch removes a blank line after an error message.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69006
> 
> Presumably this was exposed by the stricter testing I added to lib/gcc
> -dg.exp in r232837?

Yes, of course.

> > -  error_at (loc, "ambiguous overload for intrinsic: %s\n",
> > +  error_at (loc, "ambiguous overload for intrinsic: %s",
> >  IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
 
> Should this code be using %qs rather than %s? (or somesuch, or is that
> a gcc 7 thing)

Yes, probably.  I'll make a separate patch for this.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[PATCH] Fix PR69547

2016-01-29 Thread Richard Biener

I am testing the following patch to fix a regression that we no longer
remove some empty loops.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-01-19  Richard Biener  

PR tree-optimization/69547
* tree-ssa-dce.c (mark_aliased_reaching_defs_necessary_1):
Do not mark clobbers necessary.
(mark_all_reaching_defs_necessary_1): Likewise.

* g++.dg/tree-ssa/pr69547.C: New testcase.

Index: gcc/tree-ssa-dce.c
===
*** gcc/tree-ssa-dce.c  (revision 232928)
--- gcc/tree-ssa-dce.c  (working copy)
*** mark_aliased_reaching_defs_necessary_1 (
*** 462,468 
gimple *def_stmt = SSA_NAME_DEF_STMT (vdef);
  
/* All stmts we visit are necessary.  */
!   mark_operand_necessary (vdef);
  
/* If the stmt lhs kills ref, then we can stop walking.  */
if (gimple_has_lhs (def_stmt)
--- 462,469 
gimple *def_stmt = SSA_NAME_DEF_STMT (vdef);
  
/* All stmts we visit are necessary.  */
!   if (! gimple_clobber_p (def_stmt))
! mark_operand_necessary (vdef);
  
/* If the stmt lhs kills ref, then we can stop walking.  */
if (gimple_has_lhs (def_stmt)
*** mark_all_reaching_defs_necessary_1 (ao_r
*** 584,590 
  }
  }
  
!   mark_operand_necessary (vdef);
  
return false;
  }
--- 585,592 
  }
  }
  
!   if (! gimple_clobber_p (def_stmt))
! mark_operand_necessary (vdef);
  
return false;
  }
Index: gcc/testsuite/g++.dg/tree-ssa/pr69547.C
===
*** gcc/testsuite/g++.dg/tree-ssa/pr69547.C (revision 0)
--- gcc/testsuite/g++.dg/tree-ssa/pr69547.C (working copy)
***
*** 0 
--- 1,15 
+ // { dg-do compile }
+ // { dg-options "-O2 -fdump-tree-cddce1" }
+ 
+ struct A { A () { } };
+ 
+ void foo (void*, int);
+ 
+ void bar ()
+ {
+   enum { N = 64 };
+   A a [N];
+   foo (, N);
+ }
+ 
+ // { dg-final { scan-tree-dump-not "if" "cddce1" } }


[Patch,microblaze]: Better register allocation to minimize the spill and fetch.

2016-01-29 Thread Ajit Kumar Agarwal

This patch improves the allocation of registers in the given function. The 
allocation
is optimized for the conditional branches. The temporary register used in the
conditional branches to store the comparison results and use of temporary in the
conditional branch is optimized. Such temporary registers are allocated with a 
fixed
register r18.

Currently such temporaries are allocated with a free registers in the given
function. Due to this one of the free register is reserved for the temporaries 
and
given function is left with a few registers. This is unoptimized with respect to
microblaze. In Microblaze r18 is marked as fixed and cannot be allocated to 
pseudos'
in the given function. Instead r18 can be used as a temporary for the 
conditional
branches with compare and branch. Use of r18 as a temporary for conditional 
branches
will save one of the free registers to be allocated. The free registers can be 
used
for other pseudos' and hence the better register allocation.

The usage of r18 as above reduces the spill and fetch because of the 
availability of
one of the free registers to other pseudos instead of being used for conditional
temporaries.

The advantage of the above is that the scope of the temporaries is limited to 
the
conditional branches and hence the usage of r18 as temporary for such 
conditional
branches is optimized and preserve the functionality of the function.

Regtested for Microblaze target.

Performance runs are done with Mibench/EEMBC benchmarks.

Following gains are achieved.

Benchmarks  Gains

automotive_qsort1   1.630730524%
network_dijkstra   1.527506256%
office_stringsearch   1 1.81356288%
security_rijndael_d   3.26129357%
basefp01_lite  4.465120185%
a2time01_lite  1.893862857%
cjpeg_lite  3.286496675%
djpeg_lite 3.120150612%
qos_lite 2.63964381%
office_ispell 1.531340405%

Code Size improvements:

Reduction in number of instructions for Mibench  :  12927.
Reduction in number of instructions for EEMBC :   212.

ChangeLog:
2016-01-29  Ajit Agarwal  

* config/microblaze/microblaze.c
(microblaze_expand_conditional_branch): Use of MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.
(microblaze_expand_conditional_branch_reg): Use of 
MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.
(microblaze_expand_conditional_branch_sf): Use of MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.

Signed-off-by:Ajit Agarwal ajit...@xilinx.com.
---
 gcc/config/microblaze/microblaze.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index baff67a..b4277ad 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -3402,7 +3402,7 @@ microblaze_expand_conditional_branch (machine_mode mode, 
rtx operands[])
   rtx cmp_op0 = operands[1];
   rtx cmp_op1 = operands[2];
   rtx label1 = operands[3];
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);
   rtx condition;
 
   gcc_assert ((GET_CODE (cmp_op0) == REG) || (GET_CODE (cmp_op0) == SUBREG));
@@ -3439,7 +3439,7 @@ microblaze_expand_conditional_branch_reg (enum 
machine_mode mode,
   rtx cmp_op0 = operands[1];
   rtx cmp_op1 = operands[2];
   rtx label1 = operands[3];
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);
   rtx condition;
 
   gcc_assert ((GET_CODE (cmp_op0) == REG)
@@ -3483,7 +3483,7 @@ microblaze_expand_conditional_branch_sf (rtx operands[])
   rtx condition;
   rtx cmp_op0 = XEXP (operands[0], 0);
   rtx cmp_op1 = XEXP (operands[0], 1);
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);
 
   emit_insn (gen_cstoresf4 (comp_reg, operands[0], cmp_op0, cmp_op1));
   condition = gen_rtx_NE (SImode, comp_reg, const0_rtx);
-- 
1.7.1

Thanks & Regards
Ajit




better-reg-alloc.patch
Description: better-reg-alloc.patch


Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.

2016-01-29 Thread Jakub Jelinek
On Fri, Jan 29, 2016 at 09:27:46AM +0100, Dominik Vogt wrote:
> On Thu, Jan 28, 2016 at 03:41:29PM +0100, Jakub Jelinek wrote:
> > On Thu, Jan 28, 2016 at 01:40:12PM +0100, Dominik Vogt wrote:
> > > -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
> > > +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \
> > > + || (defined (__cplusplus) && __cplusplus >= 201103L)
> > 
> > The formatting is wrong, there is a tab before || when it should be aligned
> > below defined on the previous line.
> 
> Attached.

> gcc/ChangeLog
> 
>   PR c++/69462
>   * ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for
>   C++-11.

Ok, thanks.

> --- a/gcc/ginclude/float.h
> +++ b/gcc/ginclude/float.h
> @@ -127,7 +127,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #undef FLT_ROUNDS
>  #define FLT_ROUNDS 1
>  
> -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
> +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \
> + || (defined (__cplusplus) && __cplusplus >= 201103L)
>  /* The floating-point expression evaluation method.
>  -1  indeterminate
>   0  evaluate all operations and constants just to the range and

Jakub


Re: [PATCH] S/390: Use %qs in error messages.

2016-01-29 Thread Andreas Krebbel
On Fri, Jan 29, 2016 at 10:06:47AM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   * config/s390/s390-c.c (s390_resolve_overloaded_builtin): Format
>   declaration name with %qs and print it in both error messages.  Also
>   fix indentation.

Applied.  Thanks!

-Andreas-



Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.

2016-01-29 Thread Dominik Vogt
On Thu, Jan 28, 2016 at 03:41:29PM +0100, Jakub Jelinek wrote:
> On Thu, Jan 28, 2016 at 01:40:12PM +0100, Dominik Vogt wrote:
> > -#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
> > +#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \
> > +   || (defined (__cplusplus) && __cplusplus >= 201103L)
> 
> The formatting is wrong, there is a tab before || when it should be aligned
> below defined on the previous line.

Attached.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

PR c++/69462
* ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for
C++-11.
>From dcaa02429122bd66916caf730b34b0f86d9b1a8f Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 25 Jan 2016 11:59:06 +0100
Subject: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and
 DECIMAL_DIG in float.h.

---
 gcc/ginclude/float.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ginclude/float.h b/gcc/ginclude/float.h
index 18f5aac..862f7cc 100644
--- a/gcc/ginclude/float.h
+++ b/gcc/ginclude/float.h
@@ -127,7 +127,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #undef FLT_ROUNDS
 #define FLT_ROUNDS 1
 
-#if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) \
+ || (defined (__cplusplus) && __cplusplus >= 201103L)
 /* The floating-point expression evaluation method.
 -1  indeterminate
  0  evaluate all operations and constants just to the range and
-- 
2.3.0



[PATCH] S/390: Use %qs in error messages.

2016-01-29 Thread Dominik Vogt
The attached patch replaces %qs instead of %s in an error message,
adds %qs to another and fixes indentation in one of the messages.

Compiled and checked that no tests rely on the changed error
messages on a zEC12.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390-c.c (s390_resolve_overloaded_builtin): Format
declaration name with %qs and print it in both error messages.  Also
fix indentation.
>From 2455f3dd89b84e9e93ca30d0d52733f1f05a1802 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 29 Jan 2016 09:58:55 +0100
Subject: [PATCH] S/390: Use %qs in error messages.

---
 gcc/config/s390/s390-c.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 2b6e405..cd3584b 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -904,13 +904,14 @@ s390_resolve_overloaded_builtin (location_t loc,
 
   if (last_match_type == INT_MAX)
 {
-  error_at (loc, "invalid parameter combination for intrinsic");
+  error_at (loc, "invalid parameter combination for intrinsic %qs",
+		IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
   return error_mark_node;
 }
   else if (num_matches > 1)
 {
-  error_at (loc, "ambiguous overload for intrinsic: %s",
-	 IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
+  error_at (loc, "ambiguous overload for intrinsic %qs",
+		IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
   return error_mark_node;
 }
 
-- 
2.3.0



Re: [PATCH] PR c++/69462: Provide FLT_EVAL_METHOD and DECIMAL_DIG in float.h.

2016-01-29 Thread Andreas Krebbel
On Fri, Jan 29, 2016 at 09:27:46AM +0100, Dominik Vogt wrote:
> gcc/ChangeLog
> 
>   PR c++/69462
>   * ginclude/float.h: Also provide FLT_EVAL_METHOD and DECIMAL_DIG for
>   C++-11.

Applied. Thanks!

-Andreas-



Re: [PATCH] Fix use of declare'd vars by routine procedures.

2016-01-29 Thread Jakub Jelinek
On Thu, Jan 28, 2016 at 12:26:38PM -0600, James Norris wrote:
> I think the attached change is what you had in mind with
> regard to doing the check at gimplification time.

Nope, this is still a wrong location for that.
If you look at the next line after the block you've added, you'll see
if (gimplify_omp_ctxp && omp_notice_variable (gimplify_omp_ctxp, decl, true))
And that function fairly early calls is_global_var (decl).  So you know
already gimplify_omp_ctxp && is_global_var (decl), just put the rest into
that block.
The TREE_CODE (decl) == VAR_DECL check is VAR_P (decl).
What do you want to achieve with

> +  && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl))  
> +   || (!TREE_STATIC (decl) && DECL_EXTERNAL (decl

?  is_global_var already guarantees you that it is either TREE_STATIC
or DECL_EXTERNAL, why is that not good enough?

> diff --git a/trunk/gcc/gimplify.c b/trunk/gcc/gimplify.c
> --- a/trunk/gcc/gimplify.c(revision 232802)
> +++ b/trunk/gcc/gimplify.c(working copy)
> @@ -1841,6 +1841,33 @@
>return GS_ERROR;
>  }
>  
> +  /* Validate variable for use within routine function.  */
> +  if (gimplify_omp_ctxp && gimplify_omp_ctxp->region_type == ORT_TARGET
> +  && get_oacc_fn_attrib (current_function_decl)

If you only care about the implicit target region of acc routine, I think
you also want to check that gimplify_omp_ctxp->outer_context == NULL.

> +  && TREE_CODE (decl) == VAR_DECL
> +  && is_global_var (decl)
> +  && ((TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
> +   || (!TREE_STATIC (decl) && DECL_EXTERNAL (decl
> +{
> +  location_t loc = DECL_SOURCE_LOCATION (decl);
> +
> +  if (lookup_attribute ("omp declare target link", DECL_ATTRIBUTES 
> (decl)))
> + {
> +   error_at (loc,
> + "%qE with % clause used in % function",
> + DECL_NAME (decl));
> +   return GS_ERROR;
> + }
> +  else if (!lookup_attribute ("omp declare target", DECL_ATTRIBUTES 
> (decl)))
> + {
> +   error_at (loc,
> + "storage class of %qE cannot be ", DECL_NAME (decl));
> +   error_at (gimplify_omp_ctxp->location,
> + "used in enclosing % function");

And I'm really confused by this error message.  If you are complaining that
the variable is not listed in acc declare clauses, why don't you say that?
What does the error have to do with its storage class?
Also, splitting one error into two is weird, usually there would be one
error message and perhaps inform after it.

Jakub


Re: [PATCH] Fix PR69547

2016-01-29 Thread Richard Biener
On Fri, 29 Jan 2016, Jakub Jelinek wrote:

> On Fri, Jan 29, 2016 at 09:40:48AM +0100, Richard Biener wrote:
> > I am testing the following patch to fix a regression that we no longer
> > remove some empty loops.
> 
> Doesn't this mean that DCE will remove the clobbers as unnecessary, even
> when they aren't in empty loops?

No, as with other cases (we never mark clobbers as necessary during 
propagation) they will be retained if all required operands are
still there (SSA names used in the LHS).

So the idea is that they should not keeping other stuff live but
remain in the IL if all uses are still live after DCE.

Richard.


Re: [C++ PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-29 Thread Jason Merrill

On 01/28/2016 03:15 PM, Jakub Jelinek wrote:

+   if (TREE_CODE (decl) == FUNCTION_DECL
+   && DECL_INITIAL (decl) == 0
+   && DECL_EXTERNAL (decl)
+   && !TREE_PUBLIC (decl)
+   && !DECL_ARTIFICIAL (decl)
+   && !TREE_NO_WARNING (decl))


Do we need to check both DECL_INITIAL and DECL_EXTERNAL?

Jason



Re: [C++ patch] report better diagnostic for static following '[' in parameter declaration

2016-01-29 Thread Prathamesh Kulkarni
On 29 January 2016 at 05:03, Marek Polacek  wrote:
> On Fri, Jan 29, 2016 at 04:46:56AM +0530, Prathamesh Kulkarni wrote:
>> @@ -19016,10 +19017,22 @@ cp_parser_direct_declarator (cp_parser* parser,
>> cp_lexer_consume_token (parser->lexer);
>> /* Peek at the next token.  */
>> token = cp_lexer_peek_token (parser->lexer);
>> +
>> +   /* If static keyword immediately follows [, report error.  */
>> +   if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC)
>> +   && current_binding_level->kind == sk_function_parms)
>> + {
>> +   error_at (token->location,
>> + "static array size is a C99 feature,"
>> + "not permitted in C++");
>> +   bounds = error_mark_node;
>> + }
>> +
>
> I think this isn't sufficient as-is; if we're changing the diagnostics here,
> we should also handle e.g. void f(int a[const 10]); where clang++ says
> g.C:1:13: error: qualifier in array size is a C99 feature, not permitted in 
> C++
>
> And also e.g.
> void f(int a[const static 10]);
> void f(int a[static const 10]);
> and similar.
Thanks for the review. AFAIK the type-qualifiers would be const,
restrict, volatile and _Atomic (n1570 p 6.7.3) ?
I added a check for those and for variable length array.
I am having issues with writing the test-case,
some cases pass with -std=c++11 but fail with -std=c++98.
Could you please have a look ?

Thanks,
Prathamesh
>
> Marek
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 33f1df3..04137b3 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -982,6 +982,24 @@ cp_lexer_next_token_is_decl_specifier_keyword (cp_lexer 
*lexer)
 }
 }
 
+static bool
+cp_lexer_next_token_is_c_type_qual (cp_lexer *lexer)
+{
+  if (cp_lexer_next_token_is_keyword (lexer, RID_CONST) 
+  || cp_lexer_next_token_is_keyword (lexer, RID_VOLATILE))
+return true;
+
+  cp_token *token = cp_lexer_peek_token (lexer);
+  if (token->type == CPP_NAME)
+{
+  tree name = token->u.value;
+  const char *p = IDENTIFIER_POINTER (name);
+  return !strcmp (p, "restrict") || !strcmp (p, "_Atomic");
+}
+
+  return false;
+}
+
 /* Returns TRUE iff the token T begins a decltype type.  */
 
 static bool
@@ -18998,10 +19016,40 @@ cp_parser_direct_declarator (cp_parser* parser,
  cp_lexer_consume_token (parser->lexer);
  /* Peek at the next token.  */
  token = cp_lexer_peek_token (parser->lexer);
+
+ /* If static or type-qualifier or * immediately follows [,
+report error.  */
+ if (current_binding_level->kind == sk_function_parms)
+   {
+ if (cp_lexer_next_token_is_keyword (parser->lexer, RID_STATIC))
+   {
+   error_at (token->location,
+ "static array size is a C99 feature, "
+ "not permitted in C++");
+   bounds = error_mark_node;
+   }
+ else if (cp_lexer_next_token_is_c_type_qual (parser->lexer))
+   {
+   error_at (token->location,
+ "qualifier in array size is a C99 feature, "
+ "not permitted in C++");
+   bounds = error_mark_node;
+   }
+ 
+ else if (token->type == CPP_MULT)
+   {
+   error_at (token->location,
+ "variable-length array size is a C99 feature, "
+ "not permitted in C++");
+   bounds = error_mark_node;
+   }
+   }
+
  /* If the next token is `]', then there is no
 constant-expression.  */
- if (token->type != CPP_CLOSE_SQUARE)
+ if (token->type != CPP_CLOSE_SQUARE && bounds != error_mark_node)
{
+
  bool non_constant_p;
  bounds
= cp_parser_constant_expression (parser,
diff --git a/gcc/testsuite/g++.dg/parse/static-array-error.C 
b/gcc/testsuite/g++.dg/parse/static-array-error.C
new file mode 100644
index 000..028320d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/static-array-error.C
@@ -0,0 +1,33 @@
+// { dg-do compile }
+
+void f1(int a[static 10]);  /* { dg-error "static array size is a C99 feature" 
} */
+/* { dg-error "expected '\\]' before 'static'" "" { target *-*-* } 3 } */
+/* { dg-error "expected '\\)' before 'static'" "" { target *-*-* } 3 } */
+/* { dg-error "expected initializer before 'static'" "" { target *-*-* } 3 } */
+
+void f2(int a[const 10]); /* { dg-error "qualifier in array size is a C99 
feature" } */
+/* { dg-error "expected '\\]' before 'const'" "" { target *-*-* } 8 } */
+/* { dg-error "expected '\\)' before 'const'" "" { target *-*-* } 8 } */
+/* { dg-error "expected initializer before numeric constant" "" { target *-*-* 
} 8 } */
+
+void f3(int a[restrict 10]); /* { dg-error 

[PATCH PR67921]Convert pointer expr to proper type before negating it

2016-01-29 Thread Bin Cheng
Hi,
Function fold_binary_loc calls split_tree to split a tree into constant, 
literal and variable parts.  Function split_tree deals with minus_expr by 
negating different parts into NEXGATE_EXPR.  Since tree exprs fed to split_tree 
are with NOP conversions stripped, this could result in illegal expr for 
pointer expressions.  Given below example as described by PR67921:
  op0:  (4 - (sizetype) )
  code: MINUS_EXPR
  op1:  (sizetype)b

fold_binary_loc calls split_tree for both op0 and op1 and gets below from the 
function calls (it also flips the code):
  op0:  4, -(sizetype)
  code: PLUS_EXPR
  op1:  -b

Here we generate NEGATIVE_EXPR of pointer variable (b) which is illegal.  If 
"-b" can not be canceled by following call to associate_trees, it will be 
passed along in IR resulting in ICE somewhere.

This patch fixes it by converting pointer expression to proper type before 
negating it.  Note the proper type is the outer type stripped in 
fold_binary_loc before calling split_tree.  I also included a test which is 
heavily reduced from the original ffmpeg code in the original PR.  Considering 
it's stage4, I restricted the patch to the smallest change.  As a matter of 
fact, we may need to do the same thing for signed int types because -TYPE_MIN 
is undefined.  Unfortunately, I failed to create a test in this case.

Bootstrap and test on x64_64, is it OK?

2016-01-27  Bin Cheng  

PR tree-optimization/67921
* fold-const.c (split_tree): New parameters.  Convert pointer
type variable part to proper type before negating. 
(fold_binary_loc): Pass new arguments to split_tree.

gcc/testsuite/ChangeLog
2016-01-27  Bin Cheng  

PR tree-optimization/67921
* c-c++-common/ubsan/pr67921.c: New test.


diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index bece8d7..e34bc81 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -109,7 +109,8 @@ enum comparison_code {
 
 static bool negate_expr_p (tree);
 static tree negate_expr (tree);
-static tree split_tree (tree, enum tree_code, tree *, tree *, tree *, int);
+static tree split_tree (location_t, tree, tree, enum tree_code,
+   tree *, tree *, tree *, int);
 static tree associate_trees (location_t, tree, tree, enum tree_code, tree);
 static enum comparison_code comparison_to_compcode (enum tree_code);
 static enum tree_code compcode_to_comparison (enum comparison_code);
@@ -767,7 +768,10 @@ negate_expr (tree t)
literal for which we use *MINUS_LITP instead.
 
If NEGATE_P is true, we are negating all of IN, again except a literal
-   for which we use *MINUS_LITP instead.
+   for which we use *MINUS_LITP instead.  If a variable part is of pointer
+   type, it is negated after converting to TYPE.  This prevents us from
+   generating illegal MINUS pointer expression.  LOC is the location of
+   the converted variable part.
 
If IN is itself a literal or constant, return it as appropriate.
 
@@ -775,8 +779,8 @@ negate_expr (tree t)
same type as IN, but they will have the same signedness and mode.  */
 
 static tree
-split_tree (tree in, enum tree_code code, tree *conp, tree *litp,
-   tree *minus_litp, int negate_p)
+split_tree (location_t loc, tree in, tree type, enum tree_code code,
+   tree *conp, tree *litp, tree *minus_litp, int negate_p)
 {
   tree var = 0;
 
@@ -833,7 +837,12 @@ split_tree (tree in, enum tree_code code, tree *conp, tree 
*litp,
   if (neg_conp_p)
*conp = negate_expr (*conp);
   if (neg_var_p)
-   var = negate_expr (var);
+   {
+ /* Convert to TYPE before negating a pointer type expr.  */
+ if (var && POINTER_TYPE_P (TREE_TYPE (var)))
+   var = fold_convert_loc (loc, type, var);
+ var = negate_expr (var);
+   }
 }
   else if (TREE_CODE (in) == BIT_NOT_EXPR
   && code == PLUS_EXPR)
@@ -854,6 +863,9 @@ split_tree (tree in, enum tree_code code, tree *conp, tree 
*litp,
   else if (*minus_litp)
*litp = *minus_litp, *minus_litp = 0;
   *conp = negate_expr (*conp);
+  /* Convert to TYPE before negating a pointer type expr.  */
+  if (var && POINTER_TYPE_P (TREE_TYPE (var)))
+   var = fold_convert_loc (loc, type, var);
   var = negate_expr (var);
 }
 
@@ -9621,9 +9633,10 @@ fold_binary_loc (location_t loc,
 then the result with variables.  This increases the chances of
 literals being recombined later and of generating relocatable
 expressions for the sum of a constant and literal.  */
- var0 = split_tree (arg0, code, , , _lit0, 0);
- var1 = split_tree (arg1, code, , , _lit1,
-code == MINUS_EXPR);
+ var0 = split_tree (loc, arg0, type, code,
+, , _lit0, 0);
+ var1 = split_tree (loc, arg1, type, code,
+, , _lit1, code == MINUS_EXPR);
 
  /* Recombine 

Re: Default compute dimensions

2016-01-29 Thread Nathan Sidwell

On 01/29/16 10:18, Jakub Jelinek wrote:

On Thu, Jan 28, 2016 at 10:38:51AM -0500, Nathan Sidwell wrote:

This patch adds default compute dimension handling.  Users rarely specify
compute dimensions, expecting the toolchain to DTRT.  More savvy users would
like to specify global defaults.  This patch permits both.


Isn't it better to be able to override the defaults on the library side?
I mean, when when somebody is compiling the code, often he doesn't know the
exact properties of the hw it will be run on, if he does, I think it is
better to specify them explicitly in the code.  But if he doesn't, one just
has to hope libgomp will figure out the best defaults.
So, wouldn't it be better to add some env var that would allow to control
this instead?


You have anticipated part 2 of this patch, which would allow a default to be 
deferred to runtime in the manner you describe.


Generally, one can know at compile time the upper bound on workers (it's part of 
the chip specification), but the number of physical gangs depends on the 
accelerator card.  (That is true for PTX and IIUC for other GPGPUs too.) So, you 
may want defer num gangs to runtime -- but of course then you lose constant 
folding opportunities.


nathan


Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-29 Thread Bernd Schmidt

On 01/29/2016 04:41 PM, Jakub Jelinek wrote:

On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote:



I think a better approach might be to just mark accesses at known locations
in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with
non-constant or calculated offsets as potentially trapping.


I don't see how that would work generally.  Sure, if there is e.g. a constant
offset array access, it could be checked easily, but if there is variable offset
array access, that is at some point later on changed into a constant offset
access, you'd need to be conservative, unless you can prove it is in range.


Yes. What is the problem with that? If we have (plus sfp const_int) at 
any point before reload, we can check whether that offset is inside 
frame_size. If it isn't or if the offset isn't known, it could trap.



Bernd


[PATCH] [graphite] document that isl-0.16 is supported

2016-01-29 Thread Sebastian Pop
* config/isl.m4: Add comments about isl-0.16.
* configure: Regenerate.

gcc/
* doc/install.texi: Document that isl-0.16 is supported.
---
 config/isl.m4|  6 +++---
 configure| 12 ++--
 gcc/doc/install.texi |  2 +-
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/config/isl.m4 b/config/isl.m4
index 0103f1f..92524af 100644
--- a/config/isl.m4
+++ b/config/isl.m4
@@ -106,7 +106,7 @@ AC_DEFUN([ISL_CHECK_VERSION],
 LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs} ${gmplibs}"
 LIBS="${_isl_saved_LIBS} -lisl -lgmp"
 
-AC_MSG_CHECKING([for isl 0.15 (or deprecated 0.14)])
+AC_MSG_CHECKING([for isl 0.16, 0.15, or deprecated 0.14])
 AC_TRY_LINK([#include ],
 [isl_ctx_get_max_operations (isl_ctx_alloc ());],
 [gcc_cv_isl=yes],
@@ -114,10 +114,10 @@ AC_DEFUN([ISL_CHECK_VERSION],
 AC_MSG_RESULT([$gcc_cv_isl])
 
 if test "${gcc_cv_isl}" = no ; then
-  AC_MSG_RESULT([recommended isl version is 0.15, minimum required isl 
version 0.14 is deprecated])
+  AC_MSG_RESULT([recommended isl version is 0.16 or 0.15, the minimum 
required isl version 0.14 is deprecated])
 fi
 
-AC_MSG_CHECKING([for isl-0.15])
+AC_MSG_CHECKING([for isl 0.16 or 0.15])
 AC_TRY_LINK([#include ],
 [isl_options_set_schedule_serialize_sccs (NULL, 0);],
 [ac_has_isl_options_set_schedule_serialize_sccs=yes],
diff --git a/configure b/configure
index b9a4b51..89c863c 100755
--- a/configure
+++ b/configure
@@ -6021,8 +6021,8 @@ $as_echo "$as_me: WARNING: using in-tree isl, disabling 
version check" >&2;}
 LDFLAGS="${_isl_saved_LDFLAGS} ${isllibs} ${gmplibs}"
 LIBS="${_isl_saved_LIBS} -lisl -lgmp"
 
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.15 (or 
deprecated 0.14)" >&5
-$as_echo_n "checking for isl 0.15 (or deprecated 0.14)... " >&6; }
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.16, 0.15, or 
deprecated 0.14" >&5
+$as_echo_n "checking for isl 0.16, 0.15, or deprecated 0.14... " >&6; }
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 #include 
@@ -6045,12 +6045,12 @@ rm -f core conftest.err conftest.$ac_objext \
 $as_echo "$gcc_cv_isl" >&6; }
 
 if test "${gcc_cv_isl}" = no ; then
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: recommended isl version 
is 0.15, minimum required isl version 0.14 is deprecated" >&5
-$as_echo "recommended isl version is 0.15, minimum required isl version 0.14 
is deprecated" >&6; }
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: recommended isl version 
is 0.16 or 0.15, the minimum required isl version 0.14 is deprecated" >&5
+$as_echo "recommended isl version is 0.16 or 0.15, the minimum required isl 
version 0.14 is deprecated" >&6; }
 fi
 
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl-0.15" >&5
-$as_echo_n "checking for isl-0.15... " >&6; }
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for isl 0.16 or 0.15" >&5
+$as_echo_n "checking for isl 0.16 or 0.15... " >&6; }
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 #include 
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 062f42c..3df7974 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -383,7 +383,7 @@ installed but it is not in your default library search 
path, the
 @option{--with-mpc} configure option should be used.  See also
 @option{--with-mpc-lib} and @option{--with-mpc-include}.
 
-@item isl Library version 0.15 or 0.14.
+@item isl Library version 0.16, 0.15, or 0.14.
 
 Necessary to build GCC with the Graphite loop optimizations.
 It can be downloaded from @uref{ftp://gcc.gnu.org/pub/gcc/infrastructure/}.
-- 
2.5.0



Re: [PATCH, RFC] New memory usage statistics infrastructure

2016-01-29 Thread Patrick Palka
On Thu, May 28, 2015 at 1:07 PM, Jeff Law  wrote:
> On 05/28/2015 06:29 AM, Martin Liška wrote:
>
>>>
>>
>> Hello.
>>
>> Thank you for pointing about missing copyright.
>> Following patch adds that.
>>
>> Ready for trunk?
>
> Yes.
> jeff
>

It looks like this patch was never committed.  gcc/mem-stats.h and
gcc/mem-stats-traits.h don't yet have copyright headers.


[RS6000] ABI_V4 init of toc section

2016-01-29 Thread Alan Modra
Since 4c4a180d, LTO has turned off flag_pic when linking a fixed
position executable.  This results in flag_pic being zero in
rs6000_file_start, and no definition of ".LCTOC1".

However, when we get to actually emitting code, flag_pic may be on
again, and references made to ".LCTOC1".  How flag_pic comes to be
enabled again is quite a story.  It goes like this..  If a function is
compiled with -fPIC then sysv4.h SUBTARGET_OVERRIDE_OPTIONS will set
TARGET_RELOCATABLE.  Conversely, if TARGET_RELOCATABLE is set and
flag_pic is zero, then SUBTARGET_OVERRIDE_OPTIONS will set flag_pic=2.
It also happens that TARGET_RELOCATABLE is a bit in rs6000_isa_flags,
which is handled by rs6000_function_specific_save and
rs6000_function_specific_restore.  That last fact means lto streaming
keeps track of the state of TARGET_RELOCATABLE for functions, and when
options are restored for a given function we'll set flag_pic=2 if the
function was originally compiled with -fPIC.  That's bad because it
defeats the purpose of the 4c4a180d lto change, resulting in worse
optimization of ppc32 executables.  What's more, we don't seem to turn
off flag_pic once it is on.

We should really untangle the flag_pic/TARGET_RELOCATABLE mess, but
that change is probably a little dangerous for stage4.  Instead, this
patch removes the toc symbol initialization from file_start and does
so when the first item is emitted to the toc, or after the function
epilogue in the cases where we emit code to initialize a toc pointer
but don't actually use it (-O0 mostly, I think).

Bootstrapped and regression tested powerpc64-linux biarch with all
languages enabled.  OK to apply?

PR target/68662
* config/rs6000/rs6000.c (need_toc_init): New var, set it
whenever toc_label_name used.
(rs6000_file_start): Don't set up toc section here,
(rs6000_output_function_epilogue): do so here instead,
(rs6000_xcoff_file_start): and here.
* config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init.
(load_toc_aix_di): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4704d00..4ea4efb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -209,7 +209,7 @@ tree rs6000_builtin_types[RS6000_BTI_MAX];
 tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
 
 /* Flag to say the TOC is initialized */
-int toc_initialized;
+int toc_initialized, need_toc_init;
 char toc_label_name[10];
 
 /* Cached value of rs6000_variable_issue. This is cached in
@@ -5682,13 +5682,6 @@ rs6000_file_start (void)
 
   if (DEFAULT_ABI == ABI_ELFv2)
 fprintf (file, "\t.abiversion 2\n");
-
-  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2
-  || (TARGET_ELF && flag_pic == 2))
-{
-  switch_to_section (toc_section);
-  switch_to_section (text_section);
-}
 }
 
 
@@ -20375,6 +20368,7 @@ rs6000_output_addr_const_extra (FILE *file, rtx x)
  {
putc ('-', file);
assemble_name (file, toc_label_name);
+   need_toc_init = 1;
  }
else if (TARGET_ELF)
  fputs ("@toc", file);
@@ -24003,7 +23997,10 @@ rs6000_emit_load_toc_table (int fromprolog)
   ASM_GENERATE_INTERNAL_LABEL (buf, "L", CODE_LABEL_NUMBER (lab));
   lab = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
   if (flag_pic == 2)
-   got = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name));
+   {
+ got = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name));
+ need_toc_init = 1;
+   }
   else
got = rs6000_got_sym ();
   tmp1 = tmp2 = dest;
@@ -24048,6 +24045,7 @@ rs6000_emit_load_toc_table (int fromprolog)
  rtx tocsym, lab;
 
  tocsym = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name));
+ need_toc_init = 1;
  lab = gen_label_rtx ();
  emit_insn (gen_load_toc_v4_PIC_1b (tocsym, lab));
  emit_move_insn (dest, gen_rtx_REG (Pmode, LR_REGNO));
@@ -24062,6 +24060,7 @@ rs6000_emit_load_toc_table (int fromprolog)
   /* This is for AIX code running in non-PIC ELF32.  */
   rtx realsym = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (toc_label_name));
 
+  need_toc_init = 1;
   emit_insn (gen_elf_high (dest, realsym));
   emit_insn (gen_elf_low (dest, dest, realsym));
 }
@@ -27598,6 +27597,17 @@ rs6000_output_function_epilogue (FILE *file,
 
   fputs ("\t.align 2\n", file);
 }
+
+  /* Arrange to define .LCTOC1 label, if not already done.  */
+  if (need_toc_init)
+{
+  need_toc_init = 0;
+  if (!toc_initialized)
+   {
+ switch_to_section (toc_section);
+ switch_to_section (current_function_section ());
+   }
+}
 }
 
 /* -fsplit-stack support.  */
@@ -31745,6 +31755,7 @@ rs6000_elf_declare_function_name (FILE *file, const 
char *name, tree decl)
 
   fprintf (file, "\t.long ");
   assemble_name (file, toc_label_name);
+  need_toc_init = 1;
   putc ('-', file);
   

Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-29 Thread Jakub Jelinek
On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote:
> I'm on the fence; I do think the original problem is an issue we should fix,
> but I'm also not terribly happy with the implementation we have right now.

The fact that it has been only reported from generated testcases only means
we are lucky nobody encountered it yet in real-world programs.
Plus, we need to be thankful to people working on those generators that keep
reporting GCC bugs, they significantly improve the compiler.

> Besides the issues already mentioned, doesn't it kind of assume these
> offsets are constant (which they definitely aren't, consider arg pushes for
> example)?

Sure, for some registers the offsets aren't constant.  In some cases we e.g.
have REG_ARGS_SIZE notes, but in other cases don't and don't have anything
else that would help us understand the difference between current sp value
and one at the end of the prologue or so.

> I think a better approach might be to just mark accesses at known locations
> in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with
> non-constant or calculated offsets as potentially trapping.

I don't see how that would work generally.  Sure, if there is e.g. a constant
offset array access, it could be checked easily, but if there is variable offset
array access, that is at some point later on changed into a constant offset
access, you'd need to be conservative, unless you can prove it is in range.
Or perhaps we could also use some other flag (or turn it into __builtin_trap
or __builtin_unreachable or whatever) to mark MEMs that are always invalid
if executed.

Jakub


Re: [off-list] Re: [PATCH PR68542]

2016-01-29 Thread Uros Bizjak
On Fri, Jan 29, 2016 at 3:13 PM, Yuri Rumyantsev  wrote:
> Uros,
>
> Here is update patch which includes (1) couple changes proposed by
> Richard in tree-vect-loop.c and (2) the changes in back-end proposed
> by you.
>
> Is it OK for trunk?
> Bootstrap and regression testing dis not show any new failures.
>
> ChangeLog:
>
> 2016-01-29  Yuri Rumyantsev  
>
> PR middle-end/68542
> * config/i386/i386.c (ix86_expand_branch): Add support for conditional
> branch with vector comparison.
> *config/i386/sse.md (Vi48_AVX): New mode iterator.
> (define_expand "cbranch4): Add support for conditional branch
> with vector comparison.
> * tree-vect-loop.c (optimize_mask_stores): New function.
> * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
> has_mask_store field of vect_info.
> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
> vectorized loops having masked stores after vec_info destroy.
> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
> correspondent macros.
> (optimize_mask_stores): Add prototype.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/vect/vect-mask-store-move-1.c: New test.
> * testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c: Likewise.


+(define_mode_iterator Vi48_AVX
+ [(V4SI "TARGET_AVX") (V2DI "TARGET_AVX")
+  (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")])
+

Please name this iterator with all caps: VI48_AVX. Also, there is no
need for a condition at V4SI and V2DI:

(define_mode_iterator VI48_AVX
  [V4SI  V2DI
   (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")

x86 part is OK with the above change.

Thanks,
Uros.


Re: [PATCH] s390: Add -fsplit-stack support

2016-01-29 Thread Marcin Kościelnicki

On 29/01/16 14:33, Andreas Krebbel wrote:

Hi Marcin,

sorry for the late feedback.

A few comments regarding the split stack implementation:

The GNU coding style requires to replace every 8 leading blanks on a
line with a tab.  There are many lines in your patch violating this.
In case you are an emacs user `whitespace-cleanup' will fix this for
you.


OK, will do.


Could you please add a testcase checking the different
variants. I.e. with early exit, no-alloc in __morestack, and with an
actual allocation?


The testsuite with -fsplit-stack already hits all of them, and checking 
them manually is rather tricky (I don't know if it could be done in 
target-independent way at all), but I think it'd be reasonable to make 
assembly testcases calling __morestack for the last two cases, to check 
if the registers are being preserved, etc.




There are a few more comments inline.

Bye,

-Andreas-


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c881d52..71f6f38 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,38 @@
  2016-01-16  Marcin Kościelnicki  

+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call_*): New insn.
+   (split_stack_space_check): New expand.
+   (split_stack_sibcall_*): New insn.
+   (split_stack_cond_sibcall_*): New insn.
+   (split_stack_marker): New insn.
+
+2016-01-02  Marcin Kościelnicki  
+
* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.

diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  }
  }

+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
  #undef TARGET_DEFAULT_TARGET_FLAGS
  #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)

@@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  #undef TARGET_OPTION_INIT_STRUCT
  #define TARGET_OPTION_INIT_STRUCT s390_option_init_struct

+#undef TARGET_SUPPORTS_SPLIT_STACK
+#define TARGET_SUPPORTS_SPLIT_STACK s390_supports_split_stack
+
  struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 633bc1e..09032c9 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -42,6 +42,7 @@ extern bool s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  extern HOST_WIDE_INT s390_initial_elimination_offset (int, int);
  extern void s390_emit_prologue (void);
  extern void s390_emit_epilogue (bool);
+extern void s390_expand_split_stack_prologue (void);
  extern bool s390_can_use_simple_return_insn (void);
  extern bool s390_can_use_return_insn (void);
  extern void s390_function_profiler (FILE *, int);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3be64de..6afce7c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -426,6 +426,13 @@ struct GTY(()) machine_function
/* True if the current function may contain a tbegin clobbering
   FPRs.  */
bool tbegin_p;
+
+  /* For -fsplit-stack support: A stack local which holds a pointer to
+ the stack arguments for a function with a variable number of
+ arguments.  This is set at the 

Re: [PATCH] s390: Add -fsplit-stack support

2016-01-29 Thread Andreas Krebbel
On 01/29/2016 04:43 PM, Marcin Kościelnicki wrote:
> The testsuite with -fsplit-stack already hits all of them, and checking 
> them manually is rather tricky (I don't know if it could be done in 
> target-independent way at all), but I think it'd be reasonable to make 
> assembly testcases calling __morestack for the last two cases, to check 
> if the registers are being preserved, etc.
Sounds good. Thanks!

...
>>> +  if (frame_size <= 0x7fff || (TARGET_EXTIMM && frame_size <= 0xu))
>> The agfi immediate value is a signed 32 bit integer.  So you can only
>> add up to 2G-1.  I think it would be more readable to write this as:
> 
> We're emitting ALGFI here, which accepts unsigned 32-bit integer.
Ah right. Then it would be:

if (CONST_OK_FOR_K (frame_size) || CONST_OK_FOR_Op (frame_size))

instead.

>>
>> if (CONST_OK_FOR_K (frame_size) || CONST_OK_FOR_Os (frame_size))
>>
>> as in s390_emit_prologue. The Os check will check for TARGET_EXTIMM as well.
> 
> Alright.

...

>> I'm wondering if it is really necessary to expand the call in that
>> two-step approach?! We do the general literal pool handling in
>> s390_reorg because we need all the insn lengths to be finalized before
>> performing the branch/pool splitting loop.  But this shouldn't be necessary
>> in this case.  Would it be possible to expand the call already in
>> emit_prologue phase and get rid of the s390_reorg part?
> 
> There's an internal literal pool involved, which needs to be emitted as 
> one chunk.  Optimizations are also very likely to destroy the sequence: 
> consider the target address that __morestack will call - the control 
> flow change happens in __morestack jump instruction, but the address 
> itself is encoded in one of the pool literals.  Just not worth the risk.
Ok.

...
>>> +   # OK, no stack allocation needed.  We still follow the protocol and
>>> +   # call our caller - it doesn't cost much and makes sure vararg works.
>>> +   # No need to set any registers here - %r0 and %r2-%r6 weren't modified.
>>> +   basr%r14, %r10  # Call our caller.
>> The comment confuses me.  It somewhat sounds to me like the call
>> wouldn't be really needed but in fact it cannot even remotely work
>> without jumping back to the function body right?!
> 
> Certainly.  __morestack's task is to call the given function entry point 
> once the necessary stack space is established.  In fact, in the no 
> allocation case, a sibling-call would actually be possible, if it 
> weren't for one annoying detail: there are no free GPRs we could use to 
> keep the address of the entry point - %r0 may be used to keep static 
> chain, %r1 may have to be the argument pointer, %r2-%r5 may be used to 
> keep parameters, and %r6-%r15 are callee-saved.
Ok. The comment isn't about no-call vs. call it is about sibcall vs. call - got 
it.

Bye,

-Andreas-



[PING] Re: [RFC][PATCH, ARM 8/8] Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:59, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch adds support ARMv8-M's Security Extension's cmse_nonsecure_caller 
intrinsic. This intrinsic is used to check whether an entry function was called 
from a non-secure state.
See Section 5.4.3 of ARM®v8-M Security Extensions: Requirements on Development 
Tools (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html) 
for further details.

*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm-builtins.c (arm_builtins): Define
   ARM_BUILTIN_CMSE_NONSECURE_CALLER.
   (bdesc_2arg): Add line for cmse_nonsecure_caller.
   (arm_init_builtins): Init for cmse_nonsecure_caller.
   (arm_expand_builtin): Handle cmse_nonsecure_caller.
 * gcc/config/arm/arm_cmse.h (cmse_nonsecure_caller): New.

*** gcc/testsuite/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-1.c: Added test for
   cmse_nonsecure_caller.


diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
11cd17d0b8f3c29ccbe16cb463a17d55ba0fa1e3..7934cf1d4d96c40255d3e93dc9902b4568014984
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -515,6 +515,8 @@ enum arm_builtins
ARM_BUILTIN_GET_FPSCR,
ARM_BUILTIN_SET_FPSCR,

+  ARM_BUILTIN_CMSE_NONSECURE_CALLER,
+
  #undef CRYPTO1
  #undef CRYPTO2
  #undef CRYPTO3
@@ -1263,6 +1265,10 @@ static const struct builtin_description bdesc_2arg[] =
FP_BUILTIN (set_fpscr, SET_FPSCR)
  #undef FP_BUILTIN

+  {ARM_FSET_MAKE_CPU2 (FL2_CMSE), CODE_FOR_andsi3,
+   "__builtin_arm_cmse_nonsecure_caller", ARM_BUILTIN_CMSE_NONSECURE_CALLER,
+   UNKNOWN, 0},
+
  #define CRC32_BUILTIN(L, U) \
{ARM_FSET_EMPTY, CODE_FOR_##L, "__builtin_arm_"#L, \
 ARM_BUILTIN_##U, UNKNOWN, 0},
@@ -1797,6 +1803,17 @@ arm_init_builtins (void)
= add_builtin_function ("__builtin_arm_stfscr", ftype_set_fpscr,
ARM_BUILTIN_SET_FPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
  }
+
+  if (arm_arch_cmse)
+{
+  tree ftype_cmse_nonsecure_caller
+   = build_function_type_list (unsigned_type_node, NULL);
+  arm_builtin_decls[ARM_BUILTIN_CMSE_NONSECURE_CALLER]
+   = add_builtin_function ("__builtin_arm_cmse_nonsecure_caller",
+   ftype_cmse_nonsecure_caller,
+   ARM_BUILTIN_CMSE_NONSECURE_CALLER, BUILT_IN_MD,
+   NULL, NULL_TREE);
+}
  }

  /* Return the ARM builtin for CODE.  */
@@ -2356,6 +2373,14 @@ arm_expand_builtin (tree exp,
emit_insn (pat);
return target;

+case ARM_BUILTIN_CMSE_NONSECURE_CALLER:
+  icode = CODE_FOR_andsi3;
+  target = gen_reg_rtx (SImode);
+  op0 = arm_return_addr (0, NULL_RTX);
+  pat = GEN_FCN (icode) (target, op0, const1_rtx);
+  emit_insn (pat);
+  return target;
+
  case ARM_BUILTIN_TEXTRMSB:
  case ARM_BUILTIN_TEXTRMUB:
  case ARM_BUILTIN_TEXTRMSH:
diff --git a/gcc/config/arm/arm_cmse.h b/gcc/config/arm/arm_cmse.h
index 
ab20a3ec46025f268a1e9bed895d27da9af7aab6..0bdff668d03d54e1acf2bdd3b5ff1bfb2b463bd8
 100644
--- a/gcc/config/arm/arm_cmse.h
+++ b/gcc/config/arm/arm_cmse.h
@@ -163,6 +163,13 @@ __attribute__ ((__always_inline__))
  cmse_TTAT (void *p)
  CMSE_TT_ASM (at)

+//TODO: diagnose use outside cmse_nonsecure_entry functions
+__extension__ static __inline int __attribute__ ((__always_inline__))
+cmse_nonsecure_caller (void)
+{
+  return __builtin_arm_cmse_nonsecure_caller ();
+}
+
  #define CMSE_AU_NONSECURE 2
  #define CMSE_MPU_NONSECURE16
  #define CMSE_NONSECURE18
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c 
b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
index 
1c3d4e9e934f4b1166d4d98383cf4ae8c3515117..ccecf396d3cda76536537b4d146bbb5f70589fd5
 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c
@@ -66,3 +66,32 @@ int foo (char * p)
  /* { dg-final { scan-assembler-times "ttat " 2 } } */
  /* { dg-final { scan-assembler-times "bl.cmse_check_address_range" 7 } } */
  /* { dg-final { scan-assembler-not "cmse_check_pointed_object" } } */
+
+typedef int (*int_ret_funcptr_t) (void);
+typedef int __attribute__ ((cmse_nonsecure_call)) (*int_ret_nsfuncptr_t) 
(void);
+
+int __attribute__ ((cmse_nonsecure_entry))
+baz (void)
+{
+  return cmse_nonsecure_caller ();
+}
+
+int __attribute__ ((cmse_nonsecure_entry))
+qux (int_ret_funcptr_t int_ret_funcptr)
+{
+  int_ret_nsfuncptr_t int_ret_nsfunc_ptr;
+
+  if (cmse_is_nsfptr (int_ret_funcptr))
+{
+  int_ret_nsfunc_ptr = cmse_nsfptr_create (int_ret_funcptr);
+  return int_ret_nsfunc_ptr ();
+}
+  return 0;
+}
+/* { 

[PATCHv2] Re: [RFC][PATCH, ARM 7/8] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call]

2016-01-29 Thread Andre Vieira (lists)

On 19/01/16 15:28, Andre Vieira (lists) wrote:

On 16/01/16 14:49, Senthil Kumar Selvaraj wrote:

User-agent: mu4e 0.9.13; emacs 24.5.1

Hi,

Apologies for the bad posting style (I don't have the
original email handy), but shouldn't _gnu_cmse_nonsecure_call be defined
with the .global directive in the below hunk (to make it visible when
linking)?

diff --git a/libgcc/config/arm/cmse_nonsecure_call.S
b/libgcc/config/arm/cm=
se_nonsecure_call.S
new file mode 100644
index
..bdc140f5bbe87c6599db225b1b9=
b7bbc7d606710
--- /dev/null
+++ b/libgcc/config/arm/cmse_nonsecure_call.S
@@ -0,0 +1,87 @@
+.syntax unified
+.thumb
+__gnu_cmse_nonsecure_call:

Right now, it ends up as a local symbol, and compiling and linking a
program with cmse_nonsecure_call (say cmse-11.c), results in a linker
error - the linker doesn't find the symbol even if it is present in
libgcc.a. I found the problem that way - dumping symbols for my variant
of libgcc.a and grepping showed the symbol to be available but local.

Regards
Senthil


Hi Senthil,

Thanks for catching that!

Cheers,
Andre


Hi there,

Added missing global symbol.

Is this OK?

Cheers,
Andre

*** gcc/ChangeLog ***
2016-01-29  Andre Vieira
Thomas Preud'homme  

* gcc/config/arm/arm.c (detect_cmse_nonsecure_call): New.
  (cmse_nonsecure_call_clear_caller_saved): New.
* gcc/config/arm/arm-protos.h (detect_cmse_nonsecure_call): New.
* gcc/config/arm/arm.md (call): Handle cmse_nonsecure_entry.
  (call_value): Likewise.
  (nonsecure_call_internal): New.
  (nonsecure_call_value_internal): New.
* gcc/config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New.
  (*nonsecure_call_value_reg_thumb1_v5): New.
* gcc/config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New.
  (*nonsecure_call_value_reg_thumb2): New.
* gcc/config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New.
* libgcc/config/arm/cmse_nonsecure_call.S: New.
* libgcc/config/arm/t-arm: Compile cmse_nonsecure_call.S


*** gcc/testsuite/ChangeLog ***
2016-01-29  Andre Vieira
Thomas Preud'homme  

* gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c: New.
* gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c: New.
* gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-13.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-7.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-8.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-13.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-7.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-8.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New.
* gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New.
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 4fb4261794668752a8224e2d4a2363162ae9cb94..402313c5f4aeb9d2d26ea7d4a0412609142d490b 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -132,6 +132,7 @@ extern int arm_const_double_inline_cost (rtx);
 extern bool arm_const_double_by_parts (rtx);
 extern bool arm_const_double_by_immediates (rtx);
 extern void arm_emit_call_insn (rtx, rtx, bool);
+bool detect_cmse_nonsecure_call (tree);
 extern const char *output_call (rtx *);
 void arm_emit_movpair (rtx, rtx);
 extern const char *output_mov_long_double_arm_from_arm (rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index da33ba1136b97c5f534c135e6d39f8b5777b3f36..153c746ad1910ad8ea7527e74369930ca14d2594 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17417,6 +17417,129 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT address, int do_pushes)
   return;
 }
 
+/* Saves callee saved registers, clears callee saved registers and caller saved
+   registers not used to pass arguments before a cmse_nonsecure_call.  And
+   restores the callee saved registers after.  */
+
+static void
+cmse_nonsecure_call_clear_caller_saved (void)
+{
+  basic_block bb;
+
+  FOR_EACH_BB_FN (bb, cfun)
+{
+  rtx_insn *insn;
+
+  FOR_BB_INSNS (bb, insn)
+	{
+	  uint64_t to_clear_mask, float_mask;
+	  rtx_insn *seq;
+	  rtx pat, call, 

Enabling -frename-registers?

2016-01-29 Thread Bernd Schmidt
So PR57193 has an example of sub-optimal code generation, with some 
unnecessary register moves left after LRA. These seem to be difficult to 
prevent, but last year Robert Suchanek made some modifications to 
regrename that allow it to clean up such cases. Enabling 
-frename-registers removes one of the two unnecessary copies, and I'm 
pretty sure I could make it eliminate the other one as well with a bit 
more work.


Hence, this patch. The renamer has seen a lot of fixes over the years 
and should be in pretty good shape IMO. Still, I won't deny that this is 
a bit riskier than the usual bugfix patch at this stage.


Bootstrapped and tested on x86_64-linux, with my earlier patch to fix 
some i386 tests. Thoughts? Should we do this for gcc-7 at least?



Bernd
	PR rtl-optimization/57193
	* opts.c (default_options_table): Add OPT_frename_registers at -O2
	and above.
	* doc/invoke.texi (-frename-registers): Update documentation.

Index: gcc/opts.c
===
--- gcc/opts.c	(revision 232689)
+++ gcc/opts.c	(working copy)
@@ -498,6 +498,7 @@ static const struct default_options defa
 { OPT_LEVELS_2_PLUS, OPT_fstrict_overflow, NULL, 1 },
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_freorder_blocks_algorithm_, NULL,
   REORDER_BLOCKS_ALGORITHM_STC },
+{ OPT_LEVELS_2_PLUS, OPT_frename_registers, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 232689)
+++ gcc/doc/invoke.texi	(working copy)
@@ -8467,7 +8467,8 @@ debug information format adopted by the
 make debugging impossible, since variables no longer stay in
 a ``home register''.
 
-Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
+Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
+and also enabled at levels @option{-O2} and @option{-O3}.
 
 @item -fschedule-fusion
 @opindex fschedule-fusion



Re: RFA: patch to fix PR69299

2016-01-29 Thread Richard Henderson

On 01/28/2016 09:07 AM, Vladimir Makarov wrote:

On 01/28/2016 08:05 AM, Jakub Jelinek wrote:

On Wed, Jan 27, 2016 at 04:01:23PM -0500, Vladimir Makarov wrote:

   The following patch fixes PR69299.

   The details of the problem is described on

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69299

   The patch was successfully bootstrapped and tested on x86/x86-64.

   The patch introduces a new type of constraints
define_special_memory_constraint for memory constraints whose address reload
can not make memory to satisfy the constraint.  It is useful when
specifically aligned memory is necessary or desirable. I don't know what is
the best name for this constraint.  I use special_memory_constraint but it
could be more specific, e.g. aligned_memory_constraint.  Please let me know
what is the best name for you.

   Is the patch ok to commit?

I support the general idea and for naming will defer to Jeff,


I actually like the name special_memory_constraint.  I'm sure it will 
eventually be use for more than just aligned memories.  Off the top of my head 
I can imagine constraints for specific address spaces.




--- ira-costs.c(revision 232571)
+++ ira-costs.c(working copy)
@@ -777,6 +777,7 @@ record_reg_classes (int n_alts, int n_op
break;
  case CT_MEMORY:
+case CT_SPECIAL_MEMORY:
/* Every MEM can be reloaded to fit.  */
insn_allows_mem[i] = allows_mem[i] = 1;
if (MEM_P (op))

The comment is true only for CT_MEMORY.  Wonder if it wouldn't be better to
handle CT_SPECIAL_MEMORY separately, perhaps as:

  case CT_SPECIAL_MEMORY:
if (MEM_P (op) && constraint_satisfied_p (op, cn))
  {
insn_allows_mem[i] = allows_mem[i] = 1;
win = 1;
  }
break;

?  I.e. if the constraint is already satisfied, treat it like memory
constraint, otherwise treat like (unsatisfied) fixed form constraint.
Or, if you want to account for the possibility that it doesn't satisfy the
constraint yet due to address that if reloaded would make it satisfy,
consider !memory_operand (op, ...) case as unknown, with no need to
check the constraint.  Because if op satisfies already memory_operand,
but doesn't constraint_satisfied_p, it means it will never satisfy.


The difference in code most probably does not affect generated code correctness
but may change code quality.  After some thinking, I decided to change it to

   case CT_SPECIAL_MEMORY:
  if (MEM_P (op) && constraint_satisfied_p (op, cn))
win = 1;
  allows_mem[i] = insn_allows_mem[i] = 1;
  break;

...

But taking complexity of RA we can do unaligned memory -> pseudo first
assigning a hard reg to the pseudo and on later subpasses to spill the
pseudo for some reasons.  Removing such RA freedom might result in having
LRA stuck.


I believe that I follow your reasoning here, and the possible paths through LRA 
that would make this happen, and I agree that we should allow LRA the freedom 
to spill the pseudo that we just allocated.


The patch with that change looks good to me.


r~


Re: [Patch,microblaze]: Better register allocation to minimize the spill and fetch.

2016-01-29 Thread Michael Eager

On 01/29/2016 02:31 AM, Ajit Kumar Agarwal wrote:


This patch improves the allocation of registers in the given function. The 
allocation
is optimized for the conditional branches. The temporary register used in the
conditional branches to store the comparison results and use of temporary in the
conditional branch is optimized. Such temporary registers are allocated with a 
fixed
register r18.

Currently such temporaries are allocated with a free registers in the given
function. Due to this one of the free register is reserved for the temporaries 
and
given function is left with a few registers. This is unoptimized with respect to
microblaze. In Microblaze r18 is marked as fixed and cannot be allocated to 
pseudos'
in the given function. Instead r18 can be used as a temporary for the 
conditional
branches with compare and branch. Use of r18 as a temporary for conditional 
branches
will save one of the free registers to be allocated. The free registers can be 
used
for other pseudos' and hence the better register allocation.

The usage of r18 as above reduces the spill and fetch because of the 
availability of
one of the free registers to other pseudos instead of being used for conditional
temporaries.

The advantage of the above is that the scope of the temporaries is limited to 
the
conditional branches and hence the usage of r18 as temporary for such 
conditional
branches is optimized and preserve the functionality of the function.

Regtested for Microblaze target.

Performance runs are done with Mibench/EEMBC benchmarks.

Following gains are achieved.

Benchmarks  Gains

automotive_qsort1   1.630730524%
network_dijkstra   1.527506256%
office_stringsearch   1 1.81356288%
security_rijndael_d   3.26129357%
basefp01_lite  4.465120185%
a2time01_lite  1.893862857%
cjpeg_lite  3.286496675%
djpeg_lite 3.120150612%
qos_lite 2.63964381%
office_ispell 1.531340405%

Code Size improvements:

Reduction in number of instructions for Mibench  :  12927.
Reduction in number of instructions for EEMBC :   212.

ChangeLog:
2016-01-29  Ajit Agarwal  

* config/microblaze/microblaze.c
(microblaze_expand_conditional_branch): Use of MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.
(microblaze_expand_conditional_branch_reg): Use of 
MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.
(microblaze_expand_conditional_branch_sf): Use of MB_ABI_ASM_TEMP_REGNUM
for temporary conditional branch.


You can combine these ChangeLog entries:

* config/microblaze/microblaze.c
(microblaze_expand_conditional_branch, 
microblaze_expand_conditional_branch_reg,
microblaze_expand_conditional_branch_sf): Use MB_ABI_ASM_TEMP_REGNUM 
for temp reg.

Otherwise, OK.



Signed-off-by:Ajit Agarwal ajit...@xilinx.com.
---
  gcc/config/microblaze/microblaze.c |6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index baff67a..b4277ad 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -3402,7 +3402,7 @@ microblaze_expand_conditional_branch (machine_mode mode, 
rtx operands[])
rtx cmp_op0 = operands[1];
rtx cmp_op1 = operands[2];
rtx label1 = operands[3];
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);
rtx condition;

gcc_assert ((GET_CODE (cmp_op0) == REG) || (GET_CODE (cmp_op0) == SUBREG));
@@ -3439,7 +3439,7 @@ microblaze_expand_conditional_branch_reg (enum 
machine_mode mode,
rtx cmp_op0 = operands[1];
rtx cmp_op1 = operands[2];
rtx label1 = operands[3];
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);
rtx condition;

gcc_assert ((GET_CODE (cmp_op0) == REG)
@@ -3483,7 +3483,7 @@ microblaze_expand_conditional_branch_sf (rtx operands[])
rtx condition;
rtx cmp_op0 = XEXP (operands[0], 0);
rtx cmp_op1 = XEXP (operands[0], 1);
-  rtx comp_reg = gen_reg_rtx (SImode);
+  rtx comp_reg =  gen_rtx_REG (SImode, MB_ABI_ASM_TEMP_REGNUM);

emit_insn (gen_cstoresf4 (comp_reg, operands[0], cmp_op0, cmp_op1));
condition = gen_rtx_NE (SImode, comp_reg, const0_rtx);




--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


[Patch, fortran, pr67451, v1] [5/6 Regression] ICE with sourced allocation from coarray

2016-01-29 Thread Andre Vehreschild
Hi all,

attached is a patch to fix a regression in current gfortran when a
coarray is used in the source=-expression of an allocate(). The ICE was
caused by the class information, i.e., _vptr and so on, not at the
expected place. The patch fixes this.

The patch also fixes pr69418, which I will flag as a duplicate in a
second.

Bootstrapped and regtested ok on x86_64-linux-gnu/F23.

Ok for trunk?

Backport to gcc-5 is pending, albeit more difficult, because the
allocate() implementation on 5 is not as advanced the one in 6.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index c5ae4c5..8f63d34 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -1103,7 +1103,14 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
 	}
   else
 	{
-	  from_data = gfc_class_data_get (from);
+	  /* Check that from is a class.  When the class is part of a coarray,
+	 then from is a common pointer and is to be used as is.  */
+	  tmp = POINTER_TYPE_P (TREE_TYPE (from))
+	  ? build_fold_indirect_ref (from) : from;
+	  from_data =
+	  (GFC_CLASS_TYPE_P (TREE_TYPE (tmp))
+	   || (DECL_P (tmp) && GFC_DECL_CLASS (tmp)))
+	  ? gfc_class_data_get (from) : from;
 	  is_from_desc = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (from_data));
 	}
  }
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 310d2cd..5143c31 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -5358,7 +5358,8 @@ gfc_trans_allocate (gfc_code * code)
  expression.  */
   if (code->expr3)
 {
-  bool vtab_needed = false, temp_var_needed = false;
+  bool vtab_needed = false, temp_var_needed = false,
+	  is_coarray = gfc_is_coarray (code->expr3);
 
   /* Figure whether we need the vtab from expr3.  */
   for (al = code->ext.alloc.list; !vtab_needed && al != NULL;
@@ -5392,9 +5393,9 @@ gfc_trans_allocate (gfc_code * code)
 		 with the POINTER_PLUS_EXPR in this case.  */
 		  if (code->expr3->ts.type == BT_CLASS
 		  && TREE_CODE (se.expr) == NOP_EXPR
-		  && TREE_CODE (TREE_OPERAND (se.expr, 0))
-			   == POINTER_PLUS_EXPR)
-		  //&& ! GFC_CLASS_TYPE_P (TREE_TYPE (se.expr)))
+		  && (TREE_CODE (TREE_OPERAND (se.expr, 0))
+			== POINTER_PLUS_EXPR
+			  || is_coarray))
 		se.expr = TREE_OPERAND (se.expr, 0);
 		}
 	  /* Create a temp variable only for component refs to prevent
@@ -5435,7 +5436,7 @@ gfc_trans_allocate (gfc_code * code)
   if (se.expr != NULL_TREE && temp_var_needed)
 	{
 	  tree var, desc;
-	  tmp = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr)) ?
+	  tmp = GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr)) || is_coarray ?
 		se.expr
 	  : build_fold_indirect_ref_loc (input_location, se.expr);
 
@@ -5448,7 +5449,7 @@ gfc_trans_allocate (gfc_code * code)
 	{
 	  /* When an array_ref was in expr3, then the descriptor is the
 		 first operand.  */
-	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
+	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)) || is_coarray)
 		{
 		  desc = TREE_OPERAND (tmp, 0);
 		}
@@ -5460,11 +5461,12 @@ gfc_trans_allocate (gfc_code * code)
 	  e3_is = E3_DESC;
 	}
 	  else
-	desc = se.expr;
+	desc = !is_coarray ? se.expr
+			   : TREE_OPERAND (TREE_OPERAND (se.expr, 0), 0);
 	  /* We need a regular (non-UID) symbol here, therefore give a
 	 prefix.  */
 	  var = gfc_create_var (TREE_TYPE (tmp), "source");
-	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
+	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)) || is_coarray)
 	{
 	  gfc_allocate_lang_decl (var);
 	  GFC_DECL_SAVED_DESCRIPTOR (var) = desc;
diff --git a/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08 b/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08
new file mode 100644
index 000..7a712a9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_allocate_2.f08
@@ -0,0 +1,26 @@
+! { dg-do run }
+! { dg-options "-fcoarray=single" }
+!
+! Contributed by Ian Harvey  
+! Extended by Andre Vehreschild  
+! to test that coarray references in allocate work now
+! PR fortran/67451
+
+  program main
+implicit none
+type foo
+  integer :: bar = 99
+end type
+class(foo), allocatable :: foobar[:]
+class(foo), allocatable :: some_local_object
+allocate(foobar[*])
+
+allocate(some_local_object, source=foobar)
+
+if (.not. allocated(foobar)) call abort()
+if (.not. allocated(some_local_object)) call abort()
+
+deallocate(some_local_object)
+deallocate(foobar)
+  end program
+
diff --git a/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08 b/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08
new file mode 100644
index 000..46f34c0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_allocate_3.f08
@@ -0,0 +1,28 @@
+! { dg-do run }
+! { dg-options "-fcoarray=single" }
+!
+! Contributed by Ian Harvey  

[PING] Re: [RFC][PATCH, ARM 5/8] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:54, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch extends support for the ARMv8-M Security Extensions 
'cmse_nonsecure_entry' attribute to safeguard against leak of information 
through unbanked registers.

When returning from a nonsecure entry function we clear all caller-saved 
registers that are not used to pass return values, by writing either the LR, in 
case of general purpose registers, or the value 0, in case of FP registers. We 
use the LR to write to APSR and FPSCR too. We currently only support 32 FP 
registers as in we only clear D0-D7.
We currently do not support entry functions that pass arguments or return 
variables on the stack and we diagnose this. This patch relies on the existing 
code to make sure callee-saved registers used in cmse_nonsecure_entry functions 
are saved and restored thus retaining their nonsecure mode value, this should 
be happening already as it is required by AAPCS.


*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm.c (output_return_instruction): Clear
   registers.
   (thumb2_expand_return): Likewise.
   (thumb1_expand_epilogue): Likewise.
   (arm_expand_epilogue): Likewise.
   (cmse_nonsecure_entry_clear_before_return): New.
 * gcc/config/arm/arm.h (TARGET_DSP_ADD): New macro define.
 * gcc/config/arm/thumb1.md (*epilogue_insns): Change length attribute.
 * gcc/config/arm/thumb2.md (*thumb2_return): Likewise.

*** gcc/testsuite/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
 * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are 
cleared.
 * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
 * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
f12e3c93bbe24b10ed8eee6687161826773ef649..b06e0586a3da50f57645bda13629bc4dbd3d53b7
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -230,6 +230,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
  /* Integer SIMD instructions, and extend-accumulate instructions.  */
  #define TARGET_INT_SIMD \
(TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em))
+/* Parallel addition and subtraction instructions.  */
+#define TARGET_DSP_ADD \
+  (TARGET_ARM_ARCH >= 6 && (arm_arch_notm || arm_arch7em))

  /* Should MOVW/MOVT be used in preference to a constant pool.  */
  #define TARGET_USE_MOVT \
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
e530b772e3cc053c16421a2a2861d815d53ebb01..0700478ca38307f35d0cb01f83ea182802ba28fa
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19755,6 +19755,24 @@ output_return_instruction (rtx operand, bool 
really_return, bool reverse,
default:
  if (IS_CMSE_ENTRY (func_type))
{
+ char flags[12] = "APSR_nzcvq";
+ /* Check if we have to clear the 'GE bits' which is only used if
+parallel add and subtraction instructions are available.  */
+ if (TARGET_DSP_ADD)
+   {
+ /* If so also clear the ge flags.  */
+ flags[10] = 'g';
+ flags[11] = '\0';
+   }
+ snprintf (instr, sizeof (instr),  "msr%s\t%s, %%|lr", conditional,
+   flags);
+ output_asm_insn (instr, & operand);
+ if (TARGET_HARD_FLOAT && TARGET_VFP)
+   {
+ snprintf (instr, sizeof (instr), "vmsr%s\tfpscr, %%|lr",
+   conditional);
+ output_asm_insn (instr, & operand);
+   }
  snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional);
}
  /* Use bx if it's available.  */
@@ -23999,6 +24017,17 @@ thumb_pop (FILE *f, unsigned long mask)
  static void
  thumb1_cmse_nonsecure_entry_return (FILE *f, int reg_containing_return_addr)
  {
+  char flags[12] = "APSR_nzcvq";
+  /* Check if we have to clear the 'GE bits' which is only used if
+ parallel add and subtraction instructions are available.  */
+  if (TARGET_DSP_ADD)
+{
+  flags[10] = 'g';
+  flags[11] = '\0';
+}
+  asm_fprintf (f, "\tmsr\t%s, %r\n", flags, reg_containing_return_addr);
+  if (TARGET_HARD_FLOAT && TARGET_VFP)
+asm_fprintf (f, "\tvmsr\tfpscr, %r\n", reg_containing_return_addr);
asm_fprintf (f, "\tbxns\t%r\n", reg_containing_return_addr);
  }

@@ -25140,6 +25169,139 @@ 

[PING] Re: [RFC][PATCH, ARM 6/8] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:55, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch adds support for the ARMv8-M Security Extensions 
'cmse_nonsecure_call' attribute. This attribute may only be used for function 
types and when used in combination with the '-mcmse' compilation flag. See 
Section 5.5 of ARM®v8-M Security Extensions 
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

We currently do not support cmse_nonsecure_call functions that pass arguments 
or return variables on the stack and we diagnose this.

*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm.c (gimplify.h): New include.
   (arm_handle_cmse_nonsecure_call): New.
   (arm_attribute_table): Added cmse_nonsecure_call.

*** gcc/testsuite/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-3.c: Add tests.
 * gcc.target/arm/cmse/cmse-4.c: Add tests.


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
0700478ca38307f35d0cb01f83ea182802ba28fa..4b4eea88cbec8e04d5b92210f0af2440ce6fb6e4
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -61,6 +61,7 @@
  #include "builtins.h"
  #include "tm-constrs.h"
  #include "rtl-iter.h"
+#include "gimplify.h"

  /* This file should be included last.  */
  #include "target-def.h"
@@ -136,6 +137,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, 
int, bool *);
  static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *);
  #endif
  static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *);
+static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *);
  static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT);
  static void arm_output_function_prologue (FILE *, HOST_WIDE_INT);
  static int arm_comp_type_attributes (const_tree, const_tree);
@@ -347,6 +349,8 @@ static const struct attribute_spec arm_attribute_table[] =
/* ARMv8-M Security Extensions support.  */
{ "cmse_nonsecure_entry", 0, 0, true, false, false,
  arm_handle_cmse_nonsecure_entry, false },
+  { "cmse_nonsecure_call", 0, 0, true, false, false,
+arm_handle_cmse_nonsecure_call, false },
{ NULL,   0, 0, false, false, false, NULL, false }
  };


@@ -6667,6 +6671,76 @@ arm_handle_cmse_nonsecure_entry (tree *node, tree name,
return NULL_TREE;
  }

+
+/* Called upon detection of the use of the cmse_nonsecure_call attribute, this
+   function will check whether the attribute is allowed here and will add the
+   attribute to the function type tree or otherwise issue a diagnose.  The
+   reason we check this at declaration time is to only allow the use of the
+   attribute with declartions of function pointers and not function
+   declartions.  */
+
+static tree
+arm_handle_cmse_nonsecure_call (tree *node, tree name,
+tree /* args */,
+int /* flags */,
+bool *no_add_attrs)
+{
+  tree decl = NULL_TREE;
+  tree type, fntype, main_variant;
+
+  if (!use_cmse)
+{
+  *no_add_attrs = true;
+  return NULL_TREE;
+}
+
+  if (TREE_CODE (*node) == VAR_DECL || TREE_CODE (*node) == TYPE_DECL)
+{
+  decl = *node;
+  type = TREE_TYPE (decl);
+}
+
+  if (!decl
+  || (!(TREE_CODE (type) == POINTER_TYPE
+   && TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE)
+ && TREE_CODE (type) != FUNCTION_TYPE))
+{
+   warning (OPT_Wattributes, "%qE attribute only applies to base type of a 
"
+"function pointer", name);
+   *no_add_attrs = true;
+   return NULL_TREE;
+}
+
+  /* type is either a function pointer, when the attribute is used on a 
function
+   * pointer, or a function type when used in a typedef.  */
+  if (TREE_CODE (type) == FUNCTION_TYPE)
+fntype = type;
+  else
+fntype = TREE_TYPE (type);
+
+  *no_add_attrs |= cmse_func_args_or_return_in_stack (NULL, name, fntype);
+
+  if (*no_add_attrs)
+return NULL_TREE;
+
+  /* Prevent tree's being shared among function types with and without
+ cmse_nonsecure_call attribute.  Do however make sure they keep the same
+ main_variant, this is required for correct DIE output.  */
+  main_variant = TYPE_MAIN_VARIANT (fntype);
+  fntype = build_distinct_type_copy (fntype);
+  TYPE_MAIN_VARIANT (fntype) = main_variant;
+  if (TREE_CODE (type) == FUNCTION_TYPE)
+TREE_TYPE (decl) = fntype;
+  else
+TREE_TYPE (type) = fntype;
+
+  /* Construct a type attribute and add it to the function type.  */
+  tree attrs = tree_cons (get_identifier ("cmse_nonsecure_call"), NULL_TREE,
+ TYPE_ATTRIBUTES (fntype));
+  TYPE_ATTRIBUTES (fntype) = attrs;
+  return NULL_TREE;
+}
+
  /* 

[PING] Re: [RFC][PATCH, ARM 3/8] Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:47, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch adds support for the ARMv8-M Security Extensions 
'cmse_nonsecure_entry' attribute. In this patch we implement the attribute 
handling and diagnosis around the attribute. See Section 5.4 of ARM®v8-M 
Security Extensions 
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm.c (arm_handle_cmse_nonsecure_entry): New.
   (arm_attribute_table): Added cmse_nonsecure_entry
   (arm_compute_func_type): Handle cmse_nonsecure_entry.
   (cmse_func_args_or_return_in_stack): New.
   (arm_handle_cmse_nonsecure_entry): New.
 * gcc/config/arm/arm.h (ARM_FT_CMSE_ENTRY): New macro define.
   (IS_CMSE_ENTRY): Likewise.

*** gcc/testsuite/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-3.c: New.


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
cf6d9466fb79e4f8a2dbfe725c52d5be8ea24fd2..f12e3c93bbe24b10ed8eee6687161826773ef649
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1375,6 +1375,7 @@ enum reg_class
  #define ARM_FT_VOLATILE   (1 << 4) /* Does not return.  */
  #define ARM_FT_NESTED (1 << 5) /* Embedded inside another func.  */
  #define ARM_FT_STACKALIGN (1 << 6) /* Called with misaligned stack.  */
+#define ARM_FT_CMSE_ENTRY  (1 << 7) /* ARMv8-M non-secure entry function.  
*/

  /* Some macros to test these flags.  */
  #define ARM_FUNC_TYPE(t)  (t & ARM_FT_TYPE_MASK)
@@ -1383,6 +1384,7 @@ enum reg_class
  #define IS_NAKED(t)   (t & ARM_FT_NAKED)
  #define IS_NESTED(t)  (t & ARM_FT_NESTED)
  #define IS_STACKALIGN(t)  (t & ARM_FT_STACKALIGN)
+#define IS_CMSE_ENTRY(t)   (t & ARM_FT_CMSE_ENTRY)


  /* Structure used to hold the function stack frame layout.  Offsets are
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
2223101fbf96bceb4beb3a7d6cb04162481dc3bf..5b9e51b10e91eee64e3383c1ed50269c3e6cf24c
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -135,6 +135,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, 
int, bool *);
  #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
  static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *);
  #endif
+static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *);
  static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT);
  static void arm_output_function_prologue (FILE *, HOST_WIDE_INT);
  static int arm_comp_type_attributes (const_tree, const_tree);
@@ -343,6 +344,9 @@ static const struct attribute_spec arm_attribute_table[] =
{ "notshared",0, 0, false, true, false, arm_handle_notshared_attribute,
  false },
  #endif
+  /* ARMv8-M Security Extensions support.  */
+  { "cmse_nonsecure_entry", 0, 0, true, false, false,
+arm_handle_cmse_nonsecure_entry, false },
{ NULL,   0, 0, false, false, false, NULL, false }
  };


@@ -3562,6 +3566,9 @@ arm_compute_func_type (void)
else
  type |= arm_isr_value (TREE_VALUE (a));

+  if (lookup_attribute ("cmse_nonsecure_entry", attr))
+type |= ARM_FT_CMSE_ENTRY;
+
return type;
  }

@@ -6552,6 +6559,109 @@ arm_handle_notshared_attribute (tree *node,
  }
  #endif

+/* This function is used to check whether functions with attributes
+   cmse_nonsecure_call or cmse_nonsecure_entry use the stack to pass arguments
+   or return variables.  If the function does indeed use the stack this
+   function returns true and diagnoses this, otherwise it returns false.  */
+
+static bool
+cmse_func_args_or_return_in_stack (tree fndecl, tree name, tree fntype)
+{
+  function_args_iterator args_iter;
+  CUMULATIVE_ARGS args_so_far_v;
+  cumulative_args_t args_so_far;
+  bool first_param = true;
+  tree arg_type, prev_arg_type = NULL_TREE, ret_type;
+
+  /* Error out if any argument is passed on the stack.  */
+  arm_init_cumulative_args (_so_far_v, fntype, NULL_RTX, fndecl);
+  args_so_far = pack_cumulative_args (_so_far_v);
+  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
+{
+  rtx arg_rtx;
+  machine_mode arg_mode = TYPE_MODE (arg_type);
+
+  prev_arg_type = arg_type;
+  if (VOID_TYPE_P (arg_type))
+   continue;
+
+  if (!first_param)
+   arm_function_arg_advance (args_so_far, arg_mode, arg_type, true);
+  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type, true);
+  if (!arg_rtx
+ || arm_arg_partial_bytes (args_so_far, arg_mode, arg_type, true))
+   {
+ error ("%qE attribute not available to functions with arguments "
+"passed on the stack", name);
+ return true;
+   }
+ 

Re: [RFC][PATCH , ARM 2/8] Add RTL patterns for thumb1 push/pop

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:45, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch adds RTL patterns for the push and pop instructions for thumb1. 
These are needed by subsequent patches in the series.

*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm-ldmstm.nl (constr thumb): Enabled
   stackpointer to be written/read.
 * gcc/config/arm/ldmstm.md: Regenerated.
 * gcc/config/arm/thumb1.md (*thumb1_pop_single): New.
   (*thumb1_load_multiple_operation): New.
 * gcc/config/arm/arm.c (thumb_pop): Fix of comment.


diff --git a/gcc/config/arm/arm-ldmstm.ml b/gcc/config/arm/arm-ldmstm.ml
index 
62982df594d5d4a1407df359e927c66986a9788c..f3ee741e93927d8d44a9eccec8970b46a8984216
 100644
--- a/gcc/config/arm/arm-ldmstm.ml
+++ b/gcc/config/arm/arm-ldmstm.ml
@@ -63,7 +63,7 @@ let rec final_offset addrmode nregs =
| DB -> -4 * nregs

  let constr thumb =
-  if thumb then "l" else "rk"
+  if thumb then "lk" else "rk"

  let inout_constr op_type =
match op_type with
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
06a6184ee0c4ed1a7cec1de4c1786e297cc57872..2223101fbf96bceb4beb3a7d6cb04162481dc3bf
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -23773,8 +23773,8 @@ thumb1_emit_multi_reg_push (unsigned long mask, 
unsigned long real_regs)
return insn;
  }

-/* Emit code to push or pop registers to or from the stack.  F is the
-   assembly file.  MASK is the registers to pop.  */
+/* Emit code to pop registers from the stack.  F is the assembly file.
+   MASK is the registers to pop.  */
  static void
  thumb_pop (FILE *f, unsigned long mask)
  {
diff --git a/gcc/config/arm/ldmstm.md b/gcc/config/arm/ldmstm.md
index 
ebb09ab86e799f3606e0988980edf3cd0189272b..8c0472e07799bd9d08759e35b6b98f3536d3d013
 100644
--- a/gcc/config/arm/ldmstm.md
+++ b/gcc/config/arm/ldmstm.md
@@ -43,7 +43,7 @@
  (define_insn "*thumb_ldm4_ia"
[(match_parallel 0 "load_multiple_operation"
  [(set (match_operand:SI 1 "low_register_operand" "")
-  (mem:SI (match_operand:SI 5 "s_register_operand" "l")))
+  (mem:SI (match_operand:SI 5 "s_register_operand" "lk")))
   (set (match_operand:SI 2 "low_register_operand" "")
(mem:SI (plus:SI (match_dup 5)
(const_int 4
@@ -80,7 +80,7 @@

  (define_insn "*thumb_ldm4_ia_update"
[(match_parallel 0 "load_multiple_operation"
-[(set (match_operand:SI 5 "s_register_operand" "+")
+[(set (match_operand:SI 5 "s_register_operand" "+")
(plus:SI (match_dup 5) (const_int 16)))
   (set (match_operand:SI 1 "low_register_operand" "")
(mem:SI (match_dup 5)))
@@ -133,7 +133,7 @@

  (define_insn "*thumb_stm4_ia_update"
[(match_parallel 0 "store_multiple_operation"
-[(set (match_operand:SI 5 "s_register_operand" "+")
+[(set (match_operand:SI 5 "s_register_operand" "+")
(plus:SI (match_dup 5) (const_int 16)))
   (set (mem:SI (match_dup 5))
(match_operand:SI 1 "low_register_operand" ""))
@@ -491,7 +491,7 @@
  (define_insn "*thumb_ldm3_ia"
[(match_parallel 0 "load_multiple_operation"
  [(set (match_operand:SI 1 "low_register_operand" "")
-  (mem:SI (match_operand:SI 4 "s_register_operand" "l")))
+  (mem:SI (match_operand:SI 4 "s_register_operand" "lk")))
   (set (match_operand:SI 2 "low_register_operand" "")
(mem:SI (plus:SI (match_dup 4)
(const_int 4
@@ -522,7 +522,7 @@

  (define_insn "*thumb_ldm3_ia_update"
[(match_parallel 0 "load_multiple_operation"
-[(set (match_operand:SI 4 "s_register_operand" "+")
+[(set (match_operand:SI 4 "s_register_operand" "+")
(plus:SI (match_dup 4) (const_int 12)))
   (set (match_operand:SI 1 "low_register_operand" "")
(mem:SI (match_dup 4)))
@@ -568,7 +568,7 @@

  (define_insn "*thumb_stm3_ia_update"
[(match_parallel 0 "store_multiple_operation"
-[(set (match_operand:SI 4 "s_register_operand" "+")
+[(set (match_operand:SI 4 "s_register_operand" "+")
(plus:SI (match_dup 4) (const_int 12)))
   (set (mem:SI (match_dup 4))
(match_operand:SI 1 "low_register_operand" ""))
@@ -877,7 +877,7 @@
  (define_insn "*thumb_ldm2_ia"
[(match_parallel 0 "load_multiple_operation"
  [(set (match_operand:SI 1 "low_register_operand" "")
-  (mem:SI (match_operand:SI 3 "s_register_operand" "l")))
+  (mem:SI (match_operand:SI 3 "s_register_operand" "lk")))
   (set (match_operand:SI 2 "low_register_operand" "")
(mem:SI (plus:SI (match_dup 3)
(const_int 4])]
@@ -902,7 +902,7 @@

  (define_insn "*thumb_ldm2_ia_update"
[(match_parallel 0 "load_multiple_operation"
-[(set (match_operand:SI 3 "s_register_operand" "+")
+[(set (match_operand:SI 3 

Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-29 Thread Bernd Edlinger
On 28.01.2016 23:17, Richard Sandiford wrote:
> Bernd Edlinger  writes:
>> On 26.01.2016 22:18, Richard Sandiford wrote:
>>> [cc-ing Eric as RTL maintainer]
>>>
>>> Matthew Fortune  writes:
 Bernd Edlinger  writes:
> Matthew Fortune  writes:
>> Has the patch been tested beyond just building GCC? I can do a
>> test run for you if you don't have things set up to do one yourself.
>
> I built a cross-gcc with all languages and a cross-glibc, but I have
> not set up an emulation environment, so if you could give it a test
> that would be highly welcome.

 mipsel-linux-gnu test results are the same before and after this patch.

 Please go ahead and commit.
>>>
>>> I still object to this.  And it feels like the patch was posted
>>> as though it was a new one in order to avoid answering the objections
>>> that were raised when it was last posted:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02218.html
>>>
>>
>> Richard, I am really sorry when you feel now like I did not take your
>> objections seriously.  Let me first explain what happened from my point
>> of view:
>>
>> When I posted this response to your objections here:
>>
>>https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00235.html
>>
>> I was waiting for your response, but nothing happened, so I kind of
>> forgot about the issue.  In the meantime Ubuntu and Debian began to
>> roll out GCC-6 and they got stuck at the same issue, but instead of
>> waiting for pr69012 to be eventually resolved they created a duplicate
>> pr69129, and then last Thursday Nick applied my initial patch without
>> my intervention, probably because of the pressure they put on us.
>
> Ah, I'd missed that, sorry.  It's not obvious from the email thread
> that the patch was actually approved.  I just see a message from Matthias
> saying that it worked for him and then a message from Nick saying that
> he'd applied it.
>
> Ah well.  I guess at some point I have to get over the fact that I'm no
> longer the MIPS maintainer :-)
>
>> That changed significantly how I looked at the issue after that point,
>> as there was no actual regression anymore, the revised patch still made
>> sense, but for another reason.  When you look at the 20+ targets in the
>> gcc tree you'll see that almost all of them have a frame-layout
>> computation function and all except mips have a shortcut
>> "if (reload_completed) return;" in that function.  And OTOH mips has
>> one of the most complicated frame layout functions of all targets.
>>
>> For all of these reasons I posted a new patch which tries to resolve
>> differences between mips and other targets inital_elimination_offset
>> functions.
>
> OK.  But the point still stands that the patch is only useful because
> we're now calling mips_compute_frame_info in cases where we wouldn't
> previously, because of the rtx_addr_can_trap_p changes.
>

Yes of course, but it cannot hurt to have all targets behave
identical in such a central point.  Even if at some point also the
implementation in rtx_addr_can_trap_p will eventually be improved in
one way or the other.

>> I still think that it is not OK for the mips target to do the frame
>> layout already in mips_frame_pointer_required because the frame layout
>> will change several times, until reload is completed, and that function
>> is only called right in the beginning.
>
> I don't think it's any better or worse than doing the frame layout in
> INITIAL_ELIMINATION_OFFSET (which is common practice and pretty much
> required).  They're both part of the initial setup phase --
> targetm.frame_layout_required determines frame_pointer_needed, which is
> a vital input to the code that decides which eliminations to make.
>

Hmm... That can also be a difference between LRA and traditional reload.
Reload calls targetm.frame_pointer_required in update_eliminables, and
can apparently handle 0=>1 and 1=>0 transitions here.

While LRA calls frame_pointer_required only once in
ira_setup_eliminable_regset and can only handle 0=>1 transitions of
frame_pointer_needed in setup_can_eliminate, but not based on
targetm.frame_pointer_required but
targetm.can_eliminate(FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM).

At least it I read the source correctly, I could be wrong of course.

> And this is an example of us doing the kind of caching that I was
> suggesting.  Code that wants to know whether the frame pointer is needed
> for the current function should use frame_pointer_needed.  Only the code
> that sets up frame_pointer_needed should call frame_pointer_required
> directly.
>

But frame_pointer_needed adds at least some value to the raw
targetm.frame_pointer_required, while a cached result of
initial_elimination_offset would not, just conserve the current
interface exactly as it is.

>> And I think that it is not really well-designed to have a frame layout
>> 

Re: [PATCH, rs6000] Fix PR65546

2016-01-29 Thread David Edelsohn
On Thu, Jan 28, 2016 at 5:41 PM, Bill Schmidt
 wrote:
> Hi,
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65546 identifies a failure
> in gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c.  The test case hasn't
> kept up with changes in the vectorizer, so it's looking for the wrong
> error message.  Also, the error message should be conditioned by a check
> for support of unaligned memory accesses.  This patch corrects these
> problems.
>
> For 4.9 and 5, the error message needs to be similarly changed.
> However, for these earlier releases, the check for misalignment support
> doesn't apply.
>
> Verified on powerpc64le-unknown-linux-gnu for both -mcpu=power7 and
> -mcpu=power8, which differ in their support for misalignment.  Is this
> ok for trunk?  Provided verification succeeds on 4.9 and 5, is the
> revised test ok for those releases?
>
> Thanks,
> Bill
>
>
> 2016-01-28  Bill Schmidt  
>
> PR target/65546
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Correct
> condition being checked, and disable it when the target supports
> misaligned loads and stores.

Okay.

Thanks, David


Re: [RS6000] ABI_V4 init of toc section

2016-01-29 Thread David Edelsohn
On Fri, Jan 29, 2016 at 11:38 AM, Alan Modra  wrote:
> Since 4c4a180d, LTO has turned off flag_pic when linking a fixed
> position executable.  This results in flag_pic being zero in
> rs6000_file_start, and no definition of ".LCTOC1".
>
> However, when we get to actually emitting code, flag_pic may be on
> again, and references made to ".LCTOC1".  How flag_pic comes to be
> enabled again is quite a story.  It goes like this..  If a function is
> compiled with -fPIC then sysv4.h SUBTARGET_OVERRIDE_OPTIONS will set
> TARGET_RELOCATABLE.  Conversely, if TARGET_RELOCATABLE is set and
> flag_pic is zero, then SUBTARGET_OVERRIDE_OPTIONS will set flag_pic=2.
> It also happens that TARGET_RELOCATABLE is a bit in rs6000_isa_flags,
> which is handled by rs6000_function_specific_save and
> rs6000_function_specific_restore.  That last fact means lto streaming
> keeps track of the state of TARGET_RELOCATABLE for functions, and when
> options are restored for a given function we'll set flag_pic=2 if the
> function was originally compiled with -fPIC.  That's bad because it
> defeats the purpose of the 4c4a180d lto change, resulting in worse
> optimization of ppc32 executables.  What's more, we don't seem to turn
> off flag_pic once it is on.
>
> We should really untangle the flag_pic/TARGET_RELOCATABLE mess, but
> that change is probably a little dangerous for stage4.  Instead, this
> patch removes the toc symbol initialization from file_start and does
> so when the first item is emitted to the toc, or after the function
> epilogue in the cases where we emit code to initialize a toc pointer
> but don't actually use it (-O0 mostly, I think).
>
> Bootstrapped and regression tested powerpc64-linux biarch with all
> languages enabled.  OK to apply?
>
> PR target/68662
> * config/rs6000/rs6000.c (need_toc_init): New var, set it
> whenever toc_label_name used.
> (rs6000_file_start): Don't set up toc section here,
> (rs6000_output_function_epilogue): do so here instead,
> (rs6000_xcoff_file_start): and here.
> * config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init.
> (load_toc_aix_di): Likewise.

I'm worried about how this is going to interact with AIX.  AIX
assembler is single pass and this patch moves the initialization from
the beginning of the file to the end of the file, which means there
will be references to a label whose definition is delayed until the
end.

- David


[PATCHv2] Re: [RFC][PATCH, ARM 1/8] Add support for ARMv8-M's Security Extensions flag and intrinsics

2016-01-29 Thread Andre Vieira (lists)

On 05/01/16 14:38, Andre Vieira wrote:

On 31/12/15 20:54, Joseph Myers wrote:

On Sat, 26 Dec 2015, Thomas Preud'homme wrote:


+#define CMSE_TT_ASM(flags) \
+{ \
+  cmse_address_info_t result; \
+   __asm__ ("tt" # flags " %0,%1" \
+   : "=r"(result) \
+   : "r"(p) \
+   : "memory"); \
+  return result; \


Are the identifiers "result" and "p" really meant to be reserved by this
header (so that users can't have macros with those names before including
it), or should they actually be __result and __p (and likewise for any
other identifiers in this file not specified as reserved)?


+__extension__ void *
+cmse_check_address_range (void *p, size_t size, int flags);


Are "size" and "flags" really meant to be reserved?


+@item -mcmse
+@opindex mcmse
+Generate secure code as per ARMv8-M Security Extensions.


I think you also need a section in extend.texi much like the existing
ACLE
section, to describe support for this as a language extension.



I'll change all non-reserved and 'not-ment-for-export' identifiers to be
preceded by '__' and Ill also look into adding a section for ARMv8-M
Security Extensions (CMSE) to extend.texi.

Thank you for your feedback.

BR,
Andre


Hi there,

Forgot to send the reworked patch upstream, here it is following 
Joseph's comments. Thank you again.


Is this OK?

Cheers,
Andre

*** gcc/ChangeLog ***
2016-01-29  Andre Vieira
Thomas Preud'homme  

* gcc/config.gcc (extra_headers): Added arm_cmse.h.
* gcc/config/arm/arm-arches.def (ARM_ARCH):
  (armv8-m): Add FL2_CMSE.
  (armv8-m.main): Likewise.
  (armv8-m.main+dsp): Likewise.
* gcc/config/arm/arm-c.c
  (arm_cpu_builtins): Added __ARM_FEATURE_CMSE macro.
* gcc/config/arm/arm-protos.h
  (arm_is_constant_pool_ref): Define FL2_CMSE.
* gcc/config/arm.c (arm_arch_cmse): New.
  (arm_option_override): New error for unsupported cmse target.
* gcc/config/arm/arm.h (arm_arch_cmse): New.
* gcc/config/arm/arm.opt (mcmse): New.
* gcc/doc/invoke.texi (ARM Options): Add -mcmse.
* gcc/doc/extend.texi (ACLE): Add CMSE.
* gcc/config/arm/arm_cmse.h: New file.
* libgcc/config/arm/cmse.c: Likewise.
* libgcc/config/arm/t-arm (HAVE_CMSE): New.


*** gcc/testsuite/ChangeLog ***
2016-01-29  Andre Vieira
Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: New.
* gcc.target/arm/cmse/cmse-1.c: New.
* gcc.target/arm/cmse/cmse-12.c: New.
* lib/target-supports.exp
  (check_effective_target_arm_cmse_ok): New.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7c3ad8984d8032b984b0acb21e9c05fdcc40579a..5d42d00819e74ff1c5b665f36e1b6f4033fe357d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -323,7 +323,7 @@ arc*-*-*)
 arm*-*-*)
 	cpu_type=arm
 	extra_objs="arm-builtins.o aarch-common.o"
-	extra_headers="mmintrin.h arm_neon.h arm_acle.h"
+	extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_cmse.h"
 	target_type_format_char='%'
 	c_target_objs="arm-c.o"
 	cxx_target_objs="arm-c.o"
diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index be46521c9eaea54f9ad78a92874567589289dbdf..0e523959551cc3b1da31411ccdd1105b830db845 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -63,11 +63,11 @@ ARM_ARCH("armv8.1-a+crc",cortexa53, 8A,
 	  ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A,
 			 FL2_FOR_ARCH8_1A))
 ARM_ARCH("armv8-m.base", cortexm0, 8M_BASE,
-	 ARM_FSET_MAKE_CPU1 (			  FL_FOR_ARCH8M_BASE))
+	 ARM_FSET_MAKE (			  FL_FOR_ARCH8M_BASE, FL2_CMSE))
 ARM_ARCH("armv8-m.main", cortexm7, 8M_MAIN,
-	 ARM_FSET_MAKE_CPU1(FL_CO_PROC |	  FL_FOR_ARCH8M_MAIN))
+	 ARM_FSET_MAKE (FL_CO_PROC |		  FL_FOR_ARCH8M_MAIN, FL2_CMSE))
 ARM_ARCH("armv8-m.main+dsp", cortexm7, 8M_MAIN,
-	 ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN))
+	 ARM_FSET_MAKE (FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN, FL2_CMSE))
 ARM_ARCH("iwmmxt",  iwmmxt, 5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT))
 ARM_ARCH("iwmmxt2", iwmmxt2,5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2))
 
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 195905fa25b36cd35fe9bc843c695333892106be..862bd095cb1c34626872194a03892ff915d18916 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -76,6 +76,14 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 
   def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT);
 
+  if (arm_arch8 && !arm_arch_notm)
+{
+  if (arm_arch_cmse && use_cmse)
+	builtin_define_with_int_value ("__ARM_FEATURE_CMSE", 3);
+  else
+	builtin_define ("__ARM_FEATURE_CMSE");
+}
+
   if (TARGET_ARM_FEATURE_LDREX)
 

[PING] Re: [RFC][PATCH, ARM 4/8] ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and bxns return

2016-01-29 Thread Andre Vieira (lists)

On 26/12/15 01:52, Thomas Preud'homme wrote:

[Sending on behalf of Andre Vieira]

Hello,

This patch extends support for the ARMv8-M Security Extensions 
'cmse_nonsecure_entry' attribute in two ways:

1) Generate two labels for the function, the regular function name and one with 
the function's name appended to '__acle_se_', this will trigger the linker to 
create a secure gateway veneer for this entry function.
2) Return from cmse_nonsecure_entry marked functions using bxns.

See Section 5.4 of ARM®v8-M Security Extensions 
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).


*** gcc/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc/config/arm/arm.c (use_return_insn): Change to return with  bxns
   when cmse_nonsecure_entry.
   (output_return_instruction): Likewise.
   (arm_output_function_prologue): Likewise.
   (thumb_pop): Likewise.
   (thumb_exit): Likewise.
   (arm_function_ok_for_sibcall): Disable sibcall for entry functions.
   (arm_asm_declare_function_name): New.
   (thumb1_cmse_nonsecure_entry_return): New.
 * gcc/config/arm/arm-protos.h (arm_asm_declare_function_name): New.
 * gcc/config/arm/elf.h (ASM_DECLARE_FUNCTION_NAME): Redefine to
   use arm_asm_declare_function_name.

*** gcc/testsuite/ChangeLog ***
2015-10-27  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-2.c: New.
 * gcc.target/arm/cmse/cmse-4.c: New.


diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 
85dca057d63544c672188db39b05a33b1be10915..9ee8c333046d9a5bb0487f7b710a5aff42d2
 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -31,6 +31,7 @@ extern int arm_volatile_func (void);
  extern void arm_expand_prologue (void);
  extern void arm_expand_epilogue (bool);
  extern void arm_declare_function_name (FILE *, const char *, tree);
+extern void arm_asm_declare_function_name (FILE *, const char *, tree);
  extern void thumb2_expand_return (bool);
  extern const char *arm_strip_name_encoding (const char *);
  extern void arm_asm_output_labelref (FILE *, const char *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
5b9e51b10e91eee64e3383c1ed50269c3e6cf24c..e530b772e3cc053c16421a2a2861d815d53ebb01
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3795,6 +3795,11 @@ use_return_insn (int iscond, rtx sibling)
return 0;
  }

+  /* ARMv8-M nonsecure entry function need to use bxns to return and thus need
+ several instructions if anything needs to be popped.  */
+  if (saved_int_regs && IS_CMSE_ENTRY (func_type))
+return 0;
+
/* If there are saved registers but the LR isn't saved, then we need
   two instructions for the return.  */
if (saved_int_regs && !(saved_int_regs & (1 << LR_REGNUM)))
@@ -6820,6 +6825,11 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
if (IS_INTERRUPT (func_type))
  return false;

+  /* ARMv8-M non-secure entry functions need to return with bxns which is only
+ generated for entry functions themselves.  */
+  if (IS_CMSE_ENTRY (arm_current_func_type ()))
+return false;
+
if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (cfun->decl
  {
/* Check that the return value locations are the same.  For
@@ -19607,6 +19617,7 @@ output_return_instruction (rtx operand, bool 
really_return, bool reverse,
 (e.g. interworking) then we can load the return address
 directly into the PC.  Otherwise we must load it into LR.  */
if (really_return
+ && !IS_CMSE_ENTRY (func_type)
  && (IS_INTERRUPT (func_type) || !TARGET_INTERWORK))
return_reg = reg_names[PC_REGNUM];
else
@@ -19742,8 +19753,12 @@ output_return_instruction (rtx operand, bool 
really_return, bool reverse,
  break;

default:
+ if (IS_CMSE_ENTRY (func_type))
+   {
+ snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional);
+   }
  /* Use bx if it's available.  */
- if (arm_arch5 || arm_arch4t)
+ else if (arm_arch5 || arm_arch4t)
sprintf (instr, "bx%s\t%%|lr", conditional);
  else
sprintf (instr, "mov%s\t%%|pc, %%|lr", conditional);
@@ -19756,6 +19771,42 @@ output_return_instruction (rtx operand, bool 
really_return, bool reverse,
return "";
  }

+/* Output in FILE asm statements needed to declare the NAME of the function
+   defined by its DECL node.  */
+
+void
+arm_asm_declare_function_name (FILE *file, const char *name, tree decl)
+{
+  size_t cmse_name_len;
+  char *cmse_name = 0;
+  char cmse_prefix[] = "__acle_se_";
+
+  if (use_cmse && lookup_attribute ("cmse_nonsecure_entry",
+   

[commited, PATCH] PR target/69530: [6 Regression] ICE: SIGSEGV

2016-01-29 Thread H.J. Lu
in ix86_split_long_move (i386.c:24353) with -fno-split-wide-types -mavx
Reply-To: "H.J. Lu" 

r229087, which caused PR 69530, was supposed to fix PR 67609.  r229458
has made r229087 unnecessary.

Approved by Vladimir in PR 69530.  Checked into trunk.


H.J.
---
gcc/

PR target/69530
* lra-splill.c (lra_final_code_change): Revert r229087 by
removing all sub-registers.

gcc/testsuite/

PR target/69530
* gcc.target/i386/pr69530.c: New test.
---
 gcc/lra-spills.c| 46 ++---
 gcc/testsuite/gcc.target/i386/pr69530.c | 11 
 2 files changed, 19 insertions(+), 38 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr69530.c

diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c
index fa0a579..5709ef1 100644
--- a/gcc/lra-spills.c
+++ b/gcc/lra-spills.c
@@ -760,44 +760,14 @@ lra_final_code_change (void)
  
  struct lra_static_insn_data *static_id = id->insn_static_data;
  bool insn_change_p = false;
- 
-  for (i = id->insn_static_data->n_operands - 1; i >= 0; i--)
-   {
- if (! DEBUG_INSN_P (insn) && static_id->operand[i].is_operator)
-   continue;
- 
- rtx op = *id->operand_loc[i];
- 
- if (static_id->operand[i].type == OP_OUT
- && GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op))
- && ! LRA_SUBREG_P (op))
-   {
- hard_regno = REGNO (SUBREG_REG (op));
- /* We can not always remove sub-registers of
-hard-registers as we may lose information that
-only a part of registers is changed and
-subsequent optimizations may do wrong
-transformations (e.g. dead code eliminations).
-We can not also keep all sub-registers as the
-subsequent optimizations can not handle all such
-cases.  Here is a compromise which works.  */
- if ((GET_MODE_SIZE (GET_MODE (op))
-  < GET_MODE_SIZE (GET_MODE (SUBREG_REG (op
- && (hard_regno_nregs[hard_regno][GET_MODE (SUBREG_REG 
(op))]
- == hard_regno_nregs[hard_regno][GET_MODE (op)])
-#ifdef STACK_REGS
- && (hard_regno < FIRST_STACK_REG
- || hard_regno > LAST_STACK_REG)
-#endif
- )
-   continue;
-   }
- if (alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn)))
-   {
- lra_update_dup (id, i);
- insn_change_p = true;
-   }
-   }
+
+ for (i = id->insn_static_data->n_operands - 1; i >= 0; i--)
+   if ((DEBUG_INSN_P (insn) || ! static_id->operand[i].is_operator)
+   && alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn)))
+ {
+   lra_update_dup (id, i);
+   insn_change_p = true;
+ }
  if (insn_change_p)
lra_update_operator_dups (id);
}
diff --git a/gcc/testsuite/gcc.target/i386/pr69530.c 
b/gcc/testsuite/gcc.target/i386/pr69530.c
new file mode 100644
index 000..9146d1d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69530.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O -fno-forward-propagate -fno-split-wide-types -mavx " } */
+
+typedef unsigned __int128 v32u128 __attribute__ ((vector_size (32)));
+
+v32u128
+foo (v32u128 v32u128_0)
+{
+  v32u128_0[0] *= v32u128_0[1];
+  return v32u128_0;
+}
-- 
2.5.0



[wwwdocs] fortran/index.html - remove local styles

2016-01-29 Thread Gerald Pfeifer
These local styles feel a bit odd to begin with, and if we skip
the ... within ..., the originally perceived/
addresses issue should go away.

Unless there are objections from the Fortran side, I plan on
committing this in a couple of days.

Gerald

Index: fortran/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/fortran/index.html,v
retrieving revision 1.34
diff -u -r1.34 index.html
--- fortran/index.html  29 Jun 2014 11:31:33 -  1.34
+++ fortran/index.html  30 Jan 2016 04:02:12 -
@@ -91,52 +91,51 @@
 people:
 
 
-Paul Brook
-Steven Bosscher
-Bud Davis
-Jerry DeLisle
-Toon Moene
-Tobias Schlueter
-Janne Blomqvist
-Steve Kargl
-Thomas Koenig
-Paul Thomas
-Janus Weil
-Daniel Kraft
-Daniel Franke
+Paul Brook
+Steven Bosscher
+Bud Davis
+Jerry DeLisle
+Toon Moene
+Tobias Schlueter
+Janne Blomqvist
+Steve Kargl
+Thomas Koenig
+Paul Thomas
+Janus Weil
+Daniel Kraft
+Daniel Franke
 
 
 Under the rules specified below: 
 
 
-All normal
+All normal
 requirements for patch submission (assignment of copyright to
 the FSF, testing, ChangeLog entries, etc) still apply, and
 reviewers should ensure that these have been met before approving
-changes.
-Approval should be necessary for
+changes.
+Approval should be necessary for
 patches which don't fall under the obvious rule. So, with the approver list
 put in place, everybody (except maintainers) should still seek approval for 
 his/her patches.  We have found the mutual peer review process really 
-works well.
-Patches should only be reviewed by
+works well.
+Patches should only be reviewed by
 people who know the affected parts of the compiler. (i.e. the
 reviewer has to be sure he/she knows stuff well enough to make a
-good judgment.)
-Large/complicated patches should
-still go by one of our maintainers, or team consensus.
-We are all reasonable people, and nobody is working under
+good judgment.)
+Large/complicated patches should
+still go by one of our maintainers, or team consensus.
+We are all reasonable people, and nobody is working under
 employer pressure or needs an ego-boost badly, so in general we
-assume that no-one deliberately does anything stupid :-) 
+assume that no-one deliberately does anything stupid :-)
 
 
 The directories involved are:
 
-gcc/gcc/fortran/ 
-gcc/gcc/testsuite/gfortran.dg/ 
-gcc/gcc/testsuite/gfortran.fortran-torture/
-
-gcc/libgfortran/
+gcc/gcc/fortran/
+gcc/gcc/testsuite/gfortran.dg/
+gcc/gcc/testsuite/gfortran.fortran-torture/
+gcc/libgfortran/
 
 
 Documentation


[wwwdocs] Use external CSS for the News and Status panes on the main page

2016-01-29 Thread Gerald Pfeifer
This fixes the worst of what the new server settings broke; our
main page should look quite more reasonable with this.  And it
simplifies things a little.

Applied.

Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.992
diff -u -r1.992 index.html
--- index.html  24 Jan 2016 23:54:36 -  1.992
+++ index.html  30 Jan 2016 05:58:12 -
@@ -45,12 +45,10 @@
 
 
 
-
 
-
-News
-
-
+
+News
+
 
 GCC 5.3 released
 [2015-12-04]
@@ -98,13 +96,9 @@
 
 
 
-
-
-
-
-Release Series and Status
-
-
+
+Release Series and Status
+
 
 GCC 5.3
   (changes)
Index: gcc.css
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc.css,v
retrieving revision 1.30
diff -u -r1.30 gcc.css
--- gcc.css 30 Jan 2016 04:28:47 -  1.30
+++ gcc.css 30 Jan 2016 05:58:12 -
@@ -14,15 +14,19 @@
 
 .highlight{ color: darkslategray; font-weight:bold; }
 
-dl.news  { margin-top:0; }
-dl.news dt   { color:darkslategrey; font-weight:bold; margin-top:0.3em; }
-dl.news dd   { margin-left:3ex; margin-top:0.1em; margin-bottom:0.1em; }
-dl.news .date { color:darkslategrey; font-size:90%; margin-left:0.1ex; }
-
-dl.status{ margin-top:0; }
-dl.status .version { font-weight:bold; }
-dl.status .regress { font-size: 90%; }
-dl.status dd { margin-left:3ex; }
+td.news  { width: 50%; padding-right: 8px; }
+td.news h2   { font-size: 1.2em; margin-top: 0; margin-bottom: 2%; }
+td.news dl   { margin-top:0; }
+td.news dt   { color:darkslategrey; font-weight:bold; margin-top:0.3em; }
+td.news dd   { margin-left:3ex; margin-top:0.1em; margin-bottom:0.1em; }
+td.news .date { color:darkslategrey; font-size:90%; margin-left:0.1ex; }
+
+td.status{ width: 50%; padding-left: 12px; border-left: #3366cc thin 
solid; }
+td.status h2 { font-size: 1.2em; margin-top:0; margin-bottom: 1%; }
+td.status dl { margin-top:0; }
+td.status .version { font-weight:bold; }
+td.status .regress { font-size: 90%; }
+td.status dd { margin-left:3ex; }
 
 .td_title {
   border-color: #3366cc;


Re: [PATCH] Fix up _Pragma GCC diagnostics regressions (PR preprocessor/69543, PR c/69558)

2016-01-29 Thread David Malcolm
On Fri, 2016-01-29 at 20:50 +0100, Jakub Jelinek wrote:
> Hi!
> 
> This patch reverts one tiny change from r228049 changes (which hasn't
> been
> mentioned in the ChangeLog or patch description).  We definitely need
> to
> revisit this for GCC 7, but stage4 is probably not the right time for
> that,
> and the patch fixes e.g. tons of warnings (or with -Werror errors on
> including pretty much all glib2 headers).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2016-01-29  Jakub Jelinek  
> 
>   PR preprocessor/69543
>   PR c/69558
>   * c-pragma.c (handle_pragma_diagnostic): Pass input_location
>   instead of loc to control_warning_option.
> 
>   * gcc.dg/pr69543.c: New test.
>   * gcc.dg/pr69558.c: New test.

This touches c-family; shouldn't the new tests be in c-c++-common,
rather than gcc.dg? (presumably we need to ensure that the glib2
headers are sane from C++ also)

I've been attempting to fix these by fixing linemap_compare_locations,
but I don't have that approach working, so fwiw I don't object to this
patch.


[wwwdocs] Avoid local styles in the standard footer.

2016-01-29 Thread Gerald Pfeifer
Our standard footer also got hit by the (needlessly for for us)
stricter server settings.

This again gets rid of extra vertical space in that box.

Installed, and afterwards I rebuilt our website using the
/www/gcc/bin/preprocess script on gcc.gnu.org.

Gerald

Index: gcc.css
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc.css,v
retrieving revision 1.28
diff -u -r1.28 gcc.css
--- gcc.css 22 Jan 2016 05:28:41 -  1.28
+++ gcc.css 30 Jan 2016 04:27:59 -
@@ -49,6 +49,7 @@
   border-width: thin;
   padding: 4px;
 }
+div.copyright p:nth-child(3) { margin-bottom: 0; }
 
 .boldcyan{ font-weight:bold; color:cyan; }
 .boldlime{ font-weight:bold; color:lime; }
Index: style.mhtml
===
RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v
retrieving revision 1.125
diff -u -r1.125 style.mhtml
--- style.mhtml 29 Jun 2014 11:31:32 -  1.125
+++ style.mhtml 30 Jan 2016 04:16:21 -
@@ -241,7 +241,7 @@
 
 
 
-For questions related to the use of GCC,
+For questions related to the use of GCC,
 please consult these web pages and the
 https://gcc.gnu.org/onlinedocs/;>GCC manuals. If that fails,
 the mailto:gcc-h...@gcc.gnu.org;>gcc-h...@gcc.gnu.org
@@ -257,7 +257,7 @@
 Verbatim copying and distribution of this entire article is
 permitted in any medium, provided this notice is preserved.
 
-These pages are
+These pages are
 https://gcc.gnu.org/about.html;>maintained by the GCC team.
 Last modified http://validator.w3.org/check/referer;>.


Re: Enabling -frename-registers?

2016-01-29 Thread Richard Biener
On January 29, 2016 6:34:50 PM GMT+01:00, Bernd Schmidt  
wrote:
>So PR57193 has an example of sub-optimal code generation, with some 
>unnecessary register moves left after LRA. These seem to be difficult
>to 
>prevent, but last year Robert Suchanek made some modifications to 
>regrename that allow it to clean up such cases. Enabling 
>-frename-registers removes one of the two unnecessary copies, and I'm 
>pretty sure I could make it eliminate the other one as well with a bit 
>more work.
>
>Hence, this patch. The renamer has seen a lot of fixes over the years 
>and should be in pretty good shape IMO. Still, I won't deny that this
>is 
>a bit riskier than the usual bugfix patch at this stage.
>
>Bootstrapped and tested on x86_64-linux, with my earlier patch to fix 
>some i386 tests. Thoughts? Should we do this for gcc-7 at least?

I don't think it's appropriate at this stage.

Richard.

>
>Bernd




patch to fix PR69299

2016-01-29 Thread Vladimir Makarov

  The following patch is the final version of the patch for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69299

  The patch was approved by Richard Henderson and Jakub.

 Committed as rev. 232993.


Index: ChangeLog
===
--- ChangeLog	(revision 232992)
+++ ChangeLog	(working copy)
@@ -1,3 +1,37 @@
+2016-01-29  Vladimir Makarov  
+
+	PR target/69299
+	* config/i386/constraints.md (Bm): Describe as special memory
+	constraint.
+	* doc/md.texi (DEFINE_SPECIAL_MEMORY_CONSTRAINT): Describe it.
+	* genoutput.c (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* genpreds.c (struct constraint_data): Add is_special_memory.
+	(have_special_memory_constraints, special_memory_start): New
+	static vars.
+	(special_memory_end): Ditto.
+	(add_constraint): Add new arg is_special_memory.  Add code to
+	process its true value.  Update have_special_memory_constraints.
+	(process_define_constraint): Pass the new arg.
+	(process_define_register_constraint): Ditto.
+	(choose_enum_order): Process special memory.
+	(write_tm_preds_h): Generate enum const CT_SPECIAL_MEMORY and
+	function insn_extra_special_memory_constraint.
+	(main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* gensupport.c (process_rtx): Process
+	DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* ira-costs.c (record_reg_classes): Process CT_SPECIAL_MEMORY.
+	* ira-lives.c (single_reg_class): Use
+	insn_extra_special_memory_constraint.
+	* ira.c (ira_setup_alts): Process CT_SPECIAL_MEMORY.
+	* lra-constraints.c (process_alt_operands): Ditto.
+	(curr_insn_transform): Use insn_extra_special_memory_constraint.
+	* recog.c (asm_operand_ok, preprocess_constraints): Process
+	CT_SPECIAL_MEMORY.
+	* reload.c (find_reloads): Ditto.
+	* rtl.def (DEFINE_SPECIFAL_MEMORY_CONSTRAINT): New.
+	* stmt.c (parse_input_constraint): Use
+	insn_extra_special_memory_constraint.
+
 2016-01-29  H.J. Lu  
 
 	PR target/69530
Index: config/i386/constraints.md
===
--- config/i386/constraints.md	(revision 232990)
+++ config/i386/constraints.md	(working copy)
@@ -162,7 +162,7 @@
   "@internal GOT memory operand."
   (match_operand 0 "GOT_memory_operand"))
 
-(define_constraint "Bm"
+(define_special_memory_constraint "Bm"
   "@internal Vector memory operand."
   (match_operand 0 "vector_memory_operand"))
 
Index: doc/md.texi
===
--- doc/md.texi	(revision 232990)
+++ doc/md.texi	(working copy)
@@ -4424,6 +4424,20 @@ The syntax and semantics are otherwise i
 @code{define_constraint}.
 @end deffn
 
+@deffn {MD Expression} define_special_memory_constraint name docstring exp
+Use this expression for constraints that match a subset of all memory
+operands: that is, @code{reload} can not make them match by reloading
+the address as it is described for @code{define_memory_constraint} or
+such address reload is undesirable with the performance point of view.
+
+For example, @code{define_special_memory_constraint} can be useful if
+specifically aligned memory is necessary or desirable for some insn
+operand.
+
+The syntax and semantics are otherwise identical to
+@code{define_constraint}.
+@end deffn
+
 @deffn {MD Expression} define_address_constraint name docstring exp
 Use this expression for constraints that match a subset of all address
 operands: that is, @code{reload} can make the constraint match by
Index: genoutput.c
===
--- genoutput.c	(revision 232990)
+++ genoutput.c	(working copy)
@@ -1019,6 +1019,7 @@ main (int argc, char **argv)
   case DEFINE_REGISTER_CONSTRAINT:
   case DEFINE_ADDRESS_CONSTRAINT:
   case DEFINE_MEMORY_CONSTRAINT:
+  case DEFINE_SPECIAL_MEMORY_CONSTRAINT:
 	note_constraint ();
 	break;
 
Index: genpreds.c
===
--- genpreds.c	(revision 232990)
+++ genpreds.c	(working copy)
@@ -659,11 +659,11 @@ write_one_predicate_function (struct pre
 
 /* Constraints fall into two categories: register constraints
(define_register_constraint), and others (define_constraint,
-   define_memory_constraint, define_address_constraint).  We
-   work out automatically which of the various old-style macros
-   they correspond to, and produce appropriate code.  They all
-   go in the same hash table so we can verify that there are no
-   duplicate names.  */
+   define_memory_constraint, define_special_memory_constraint,
+   define_address_constraint).  We work out automatically which of the
+   various old-style macros they correspond to, and produce
+   appropriate code.  They all go in the same hash table so we can
+   verify that there are no duplicate names.  */
 
 /* All data from one constraint definition.  */
 struct constraint_data
@@ -681,6 +681,7 @@ struct constraint_data
   unsigned int is_const_dbl	: 1;
   unsigned 

Re: [C++ patch] report better diagnostic for static following '[' in parameter declaration

2016-01-29 Thread Manuel López-Ibáñez

On 29/01/16 17:01, Prathamesh Kulkarni wrote:

Thanks for the review. AFAIK the type-qualifiers would be const,
restrict, volatile and _Atomic (n1570 p 6.7.3) ?
I added a check for those and for variable length array.
I am having issues with writing the test-case,
some cases pass with -std=c++11 but fail with -std=c++98.
Could you please have a look ?


Is there _Atomic in C++?

Also, why not simply reuse cp_parser_cv_qualifier_seq_opt (cp_parser* parser), 
perhaps adding a complain parameter that defaults to tf_error and calling it 
here with tf_none.


I think you will get nicer errors if you don't set bounds to error-mark, just 
give the error, consume the tokens and continue as usual.


Ideally, smart error-recovery should only be done when things already go wrong, 
thus after


 bounds = cp_parser_constant_expression (parser,
 /*allow_non_constant=*/true,
 _constant_p);
 if (!non_constant_p)
/* OK */;

fails, however our C++ parser tends to give errors quite deep in the stack 
instead of letting the caller decide what to do, which makes this too noisy in 
this case. Nonetheless, moving this error-recovery within:

if (token->type != CPP_CLOSE_SQUARE){ }
but before the above can only make the parser (marginally) faster for correct 
code.

Cheers,

Manuel.


Re: [C PATCH] Clear C_TYPE_INCOMPLETE_VARS even on variant types (PR debug/69518)

2016-01-29 Thread Marek Polacek
On Fri, Jan 29, 2016 at 08:41:18PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> We ICE on the following testcase, because the C FE abuses TYPE_VFIELD
> for its FE stuff, but may leak it to the middle-end.
> We clear it for TYPE_MAIN_VARIANT (and only use it for that), but
> for the other variants it could be non-NULL, because build_variant_type*
> would just copy that field over.
> 
> Fixed by clearing it on all the variant types.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2016-01-29  Jakub Jelinek  
> 
>   PR debug/69518
>   * c-decl.c (finish_struct): Clear C_TYPE_INCOMPLETE_VARS in
>   all type variants, not just TYPE_MAIN_VARIANT.
> 
>   * gcc.dg/torture/pr69518.c: New test.

Ok.

Marek


[C PATCH] Clear C_TYPE_INCOMPLETE_VARS even on variant types (PR debug/69518)

2016-01-29 Thread Jakub Jelinek
Hi!

We ICE on the following testcase, because the C FE abuses TYPE_VFIELD
for its FE stuff, but may leak it to the middle-end.
We clear it for TYPE_MAIN_VARIANT (and only use it for that), but
for the other variants it could be non-NULL, because build_variant_type*
would just copy that field over.

Fixed by clearing it on all the variant types.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-29  Jakub Jelinek  

PR debug/69518
* c-decl.c (finish_struct): Clear C_TYPE_INCOMPLETE_VARS in
all type variants, not just TYPE_MAIN_VARIANT.

* gcc.dg/torture/pr69518.c: New test.

--- gcc/c/c-decl.c.jj   2016-01-27 20:32:17.0 +0100
+++ gcc/c/c-decl.c  2016-01-29 13:54:52.776583200 +0100
@@ -7842,6 +7842,14 @@ finish_struct (location_t loc, tree t, t
   }
   }
 
+  /* Note: C_TYPE_INCOMPLETE_VARS overloads TYPE_VFIELD which is used
+ in dwarf2out via rest_of_decl_compilation below and means
+ something totally different.  Since we will be clearing
+ C_TYPE_INCOMPLETE_VARS shortly after we iterate through them,
+ clear it ahead of time and avoid problems in dwarf2out.  Ideally,
+ C_TYPE_INCOMPLETE_VARS should use some language specific
+ node.  */
+  tree incomplete_vars = C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t));
   for (x = TYPE_MAIN_VARIANT (t); x; x = TYPE_NEXT_VARIANT (x))
 {
   TYPE_FIELDS (x) = TYPE_FIELDS (t);
@@ -7849,6 +7857,7 @@ finish_struct (location_t loc, tree t, t
   C_TYPE_FIELDS_READONLY (x) = C_TYPE_FIELDS_READONLY (t);
   C_TYPE_FIELDS_VOLATILE (x) = C_TYPE_FIELDS_VOLATILE (t);
   C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t);
+  C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE;
 }
 
   /* If this was supposed to be a transparent union, but we can't
@@ -7862,17 +7871,7 @@ finish_struct (location_t loc, tree t, t
 }
 
   /* If this structure or union completes the type of any previous
- variable declaration, lay it out and output its rtl.
-
- Note: C_TYPE_INCOMPLETE_VARS overloads TYPE_VFIELD which is used
- in dwarf2out via rest_of_decl_compilation below and means
- something totally different.  Since we will be clearing
- C_TYPE_INCOMPLETE_VARS shortly after we iterate through them,
- clear it ahead of time and avoid problems in dwarf2out.  Ideally,
- C_TYPE_INCOMPLETE_VARS should use some language specific
- node.  */
-  tree incomplete_vars = C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t));
-  C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t)) = 0;
+ variable declaration, lay it out and output its rtl.  */
   for (x = incomplete_vars; x; x = TREE_CHAIN (x))
 {
   tree decl = TREE_VALUE (x);
--- gcc/testsuite/gcc.dg/torture/pr69518.c.jj   2016-01-29 13:52:22.547656581 
+0100
+++ gcc/testsuite/gcc.dg/torture/pr69518.c  2016-01-29 13:52:03.0 
+0100
@@ -0,0 +1,11 @@
+/* PR debug/69518 */
+/* { dg-do compile } */
+/* { dg-options "-g" } */
+
+struct A a;
+typedef struct A B;
+struct A {}
+foo (B x)
+{
+  __builtin_abort ();
+}

Jakub


[PATCH] Fix up _Pragma GCC diagnostics regressions (PR preprocessor/69543, PR c/69558)

2016-01-29 Thread Jakub Jelinek
Hi!

This patch reverts one tiny change from r228049 changes (which hasn't been
mentioned in the ChangeLog or patch description).  We definitely need to
revisit this for GCC 7, but stage4 is probably not the right time for that,
and the patch fixes e.g. tons of warnings (or with -Werror errors on
including pretty much all glib2 headers).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-29  Jakub Jelinek  

PR preprocessor/69543
PR c/69558
* c-pragma.c (handle_pragma_diagnostic): Pass input_location
instead of loc to control_warning_option.

* gcc.dg/pr69543.c: New test.
* gcc.dg/pr69558.c: New test.

--- gcc/c-family/c-pragma.c.jj  2016-01-15 21:57:00.0 +0100
+++ gcc/c-family/c-pragma.c 2016-01-29 18:34:51.743943283 +0100
@@ -817,9 +817,12 @@ handle_pragma_diagnostic(cpp_reader *ARG
   const char *arg = NULL;
   if (cl_options[option_index].flags & CL_JOINED)
 arg = option_string + 1 + cl_options[option_index].opt_len;
+  /* FIXME: input_location isn't the best location here, but it is
+ what we used to do here before and changing it breaks e.g.
+ PR69543 and PR69558.  */
   control_warning_option (option_index, (int) kind,
  arg, kind != DK_IGNORED,
- loc, lang_mask, ,
+ input_location, lang_mask, ,
  _options, _options_set,
  global_dc);
 }
--- gcc/testsuite/gcc.dg/pr69558.c.jj   2016-01-29 18:43:32.191665058 +0100
+++ gcc/testsuite/gcc.dg/pr69558.c  2016-01-29 18:40:05.0 +0100
@@ -0,0 +1,17 @@
+/* PR c/69558 */
+/* { dg-do compile } */
+/* { dg-options "-Wdeprecated-declarations" } */
+
+#define A \
+  _Pragma ("GCC diagnostic push") \
+  _Pragma ("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
+#define B \
+  _Pragma ("GCC diagnostic pop")
+#define C(x) \
+  A \
+  static inline void bar (void) { x (); } \
+  B
+
+__attribute__((deprecated)) void foo (void); /* { dg-bogus "declared here" } */
+
+C (foo) /* { dg-bogus "is deprecated" } */
--- gcc/testsuite/gcc.dg/pr69543.c.jj   2016-01-29 18:45:09.520323395 +0100
+++ gcc/testsuite/gcc.dg/pr69543.c  2016-01-29 18:44:56.0 +0100
@@ -0,0 +1,18 @@
+/* PR preprocessor/69543 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuninitialized" } */
+
+# define YY_IGNORE_MAYBE_UNINITIALIZED_BEGIN \
+_Pragma ("GCC diagnostic push") \
+_Pragma ("GCC diagnostic ignored \"-Wuninitialized\"")\
+_Pragma ("GCC diagnostic ignored \"-Wmaybe-uninitialized\"")
+# define YY_IGNORE_MAYBE_UNINITIALIZED_END \
+_Pragma ("GCC diagnostic pop")
+
+void test (char yylval)
+{
+  char *yyvsp;
+  YY_IGNORE_MAYBE_UNINITIALIZED_BEGIN
+  *++yyvsp = yylval;
+  YY_IGNORE_MAYBE_UNINITIALIZED_END
+}

Jakub


Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-29 Thread Bernd Edlinger
On 29.01.2016 16:47 Bernd Schmidt wrote:
> On 01/29/2016 04:41 PM, Jakub Jelinek wrote:
>> On Fri, Jan 29, 2016 at 02:09:25AM +0100, Bernd Schmidt wrote:
>
>>> I think a better approach might be to just mark accesses at known
>>> locations
>>> in the frame, or arg pushes, as MEM_NOTRAP_P, and consider accesses with
>>> non-constant or calculated offsets as potentially trapping.
>>
>> I don't see how that would work generally.  Sure, if there is e.g. a
>> constant
>> offset array access, it could be checked easily, but if there is
>> variable offset
>> array access, that is at some point later on changed into a constant
>> offset
>> access, you'd need to be conservative, unless you can prove it is in
>> range.
>
> Yes. What is the problem with that? If we have (plus sfp const_int) at
> any point before reload, we can check whether that offset is inside
> frame_size. If it isn't or if the offset isn't known, it could trap.
>
>

Usually we have "if (x==1234) { read MEM[FP+x]; }", so wo don't know,
and then after reload: "if (x==1234) { read MEM[SP+x+sp_fp_offset]; }"
but wait, in the if statement we know, that x==1234, so everything
turns in one magic constant, and we have a totally new constant offset
from the SP register "if (x==1234) { read MEM[SP+1234+sp_fp_offset]; }".
Now if rtx_addr_can_trap_p(MEM[SP+1234+sp_fp_offset]) says it cannot
trap we think we do not need the if at all => BANG.


Bernd.


[Patch, MIPS] Fix PR target/68273, passing args in wrong regs

2016-01-29 Thread Steve Ellcey

This is a patch for PR 68273, where MIPS is passing arguments in the wrong
registers.  The problem is that when passing an argument by value,
mips_function_arg_boundary was looking at the type of the argument (and
not just the mode), and if that argument was a variable with extra alignment
info (say an integer variable with 8 byte alignment), MIPS was aligning the
argument on an 8 byte boundary instead of a 4 byte boundary the way it should.

Since we are passing values (not variables), the alignment of the variable
that the value is copied from should not affect the alignment of the value
being passed.

This patch fixes the problem and it could change what registers arguments
are passed in, which means it could affect backwards compatibility with
older programs.  But since the current behaviour is not compliant with the
MIPS ABI and does not match what LLVM does, I think we have to make this
change.  For the most part this will only affect arguments which are copied
from variables that have non-standard alignments set by the aligned attribute,
however the SRA optimization pass can create aligned variables as it splits
aggregates apart and that was what triggered this bug report.

This is basically the same bug as the ARM bug PR 65956 and the fix is
pretty much the same too.

Rather than create MIPS specific tests that check the use of specific
registers I created two tests to put in gcc.c-torture/execute that
were failing before because GCC on MIPS was not consistent in where
arguments were passed and which now work with this patch.

Tested with mips-mti-linux-gnu and no regressions.  OK to checkin?

Steve Ellcey
sell...@imgtec.com


2016-01-29  Steve Ellcey  

PR target/68273
* config/mips/mips.c (mips_function_arg_boundary): Fix argument
alignment.



diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index dd54d6a..ecce3cd 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -5643,8 +5643,9 @@ static unsigned int
 mips_function_arg_boundary (machine_mode mode, const_tree type)
 {
   unsigned int alignment;
-
-  alignment = type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode);
+  alignment = type && mode == BLKmode
+ ? TYPE_ALIGN (TYPE_MAIN_VARIANT (type))
+ : GET_MODE_ALIGNMENT (mode);
   if (alignment < PARM_BOUNDARY)
 alignment = PARM_BOUNDARY;
   if (alignment > STACK_BOUNDARY)


2016-01-29  Steve Ellcey  

PR target/68273
* gcc.c-torture/execute/pr68273-1.c: New test.
* gcc.c-torture/execute/pr68273-2.c: New test.


diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
index e69de29..3ce07c6 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-1.c
@@ -0,0 +1,74 @@
+/* Make sure that the alignment attribute on an argument passed by
+   value does not affect the calling convention and what registers
+   arguments are passed in.  */
+
+extern void exit (int);
+extern void abort (void);
+
+typedef int alignedint __attribute__((aligned(8)));
+
+int  __attribute__((noinline))
+foo1 (int a, alignedint b)
+{ return a + b; }
+
+int __attribute__((noinline))
+foo2 (int a, int b)
+{
+  return a + b;
+}
+
+int __attribute__((noinline))
+bar1 (alignedint x)
+{
+  return foo1 (1, x);
+}
+
+int __attribute__((noinline))
+bar2 (alignedint x)
+{
+  return foo1 (1, (alignedint) 99);
+}
+
+int __attribute__((noinline))
+bar3 (alignedint x)
+{
+  return foo1 (1, x + (alignedint) 1);
+}
+
+alignedint q = 77;
+
+int __attribute__((noinline))
+bar4 (alignedint x)
+{
+  return foo1 (1, q);
+}
+
+
+int __attribute__((noinline))
+bar5 (alignedint x)
+{
+  return foo2 (1, x);
+}
+
+int __attribute__((noinline))
+use_arg_regs (int i, int j, int k)
+{
+  return i+j-k;
+}
+
+int main()
+{
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (foo1 (19,13) != 32) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar1 (-33) != -32) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar2 (1) != 100) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar3 (17) != 19) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar4 (-33) != 78) abort ();
+   if (use_arg_regs (999, 999, 999) != 999) abort ();
+   if (bar5 (-84) != -83) abort ();
+   exit (0);
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c 
b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
index e69de29..1661be9 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68273-2.c
@@ -0,0 +1,109 @@
+/* Make sure that the alignment attribute on an argument passed by
+   value does not affect the calling convention and what registers
+   arguments are passed in.  */
+
+extern void exit (int);
+extern void abort (void);
+
+typedef struct s {
+   char c;
+   char