Re: Fix ICE in use-after-scope w/ -fno-tree-dce (PR sanitize/79783).

2017-03-05 Thread Martin Liška
On 03/03/2017 01:57 PM, Jakub Jelinek wrote:
> On Thu, Mar 02, 2017 at 06:49:32PM +0100, Martin Liška wrote:
>> It can happen with inlining and -fno-tree-dce that VAR_DECL for a SSA
>> NAME was removed and thus the poisoning should not have any usage.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
> 
>> >From d8aa72dc1d696f5500c00b6c2f532f2a87cf58d2 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Thu, 2 Mar 2017 11:55:00 +0100
>> Subject: [PATCH] Fix ICE in use-after-scope w/ -fno-tree-dce (PR
>>  sanitize/79783).
>>
>> gcc/ChangeLog:
>>
>> 2017-03-02  Martin Liska  
>>
>>  PR sanitize/79783
>>  * asan.c (asan_expand_poison_ifn): Do not expand ASAN_POISON
>>  when having a SSA NAME w/o VAR_DECL assigned to it.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-03-02  Martin Liska  
>>
>>  PR sanitize/79783
>>  * g++.dg/asan/pr79783.C: New test.
>> ---
>>  gcc/asan.c  |  5 -
>>  gcc/testsuite/g++.dg/asan/pr79783.C | 19 +++
>>  2 files changed, 23 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/g++.dg/asan/pr79783.C
>>
>> diff --git a/gcc/asan.c b/gcc/asan.c
>> index 6cdd59b7ea7..307423ced03 100644
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -3107,7 +3107,10 @@ asan_expand_poison_ifn (gimple_stmt_iterator *iter,
>>  {
>>gimple *g = gsi_stmt (*iter);
>>tree poisoned_var = gimple_call_lhs (g);
>> -  if (!poisoned_var)
>> +
>> +  /* It can happen with inlining and -fno-tree-dce that VAR_DECL for a SSA
>> + NAME was removed and thus the poisoning should not have any usage.  */
>> +  if (!poisoned_var || SSA_NAME_VAR (poisoned_var) == NULL_TREE)
> 
> I wonder if it wouldn't be better to do:
>   if (!poisoned_var || has_zero_uses (poisoned_var))
> 
> perhaps with -fno-tree-dce we could end up with SSA_NAME_VAR being
> non-NULL, yet no uses; in that case there is nothing to warn about.
> On the other side, in theory we could also end up with anonymous SSA_NAME
> and some uses - in that case perhaps it would be better to warn.
> So do:
>   if (SSA_NAME_VAR (poisoned_var) == NULL_TREE)
> SSA_NAME_VAR (poisoned_var) = create_tmp_var (TREE_TYPE (poisoned_var));
>   tree shadow_var = create_asan_shadow_var (SSA_NAME_VAR (poisoned_var),
> shadow_vars_mapping);
> or so?  We'll need SSA_NAME_VAR non-NULL so that we can use a default def
> later.
> 
>   Jakub
> 

OK, I've just prepared and tested the following patch, which does what Jakub
suggested.
The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Martin


From bbbd4958fb95071e703efda8119de68cc252523f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 2 Mar 2017 11:55:00 +0100
Subject: [PATCH] Fix ICE in use-after-scope w/ -fno-tree-dce (PR
 sanitize/79783).

gcc/ChangeLog:

2017-03-02  Martin Liska  

	PR sanitize/79783
	* asan.c (asan_expand_poison_ifn): Do not expand ASAN_POISON
	when having a SSA NAME w/o VAR_DECL assigned to it.

gcc/testsuite/ChangeLog:

2017-03-02  Martin Liska  

	PR sanitize/79783
	* g++.dg/asan/pr79783.C: New test.
---
 gcc/asan.c  |  4 
 gcc/testsuite/g++.dg/asan/pr79783.C | 19 +++
 2 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/asan/pr79783.C

diff --git a/gcc/asan.c b/gcc/asan.c
index 6cdd59b7ea7..2e7dd04075f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -3113,6 +3113,10 @@ asan_expand_poison_ifn (gimple_stmt_iterator *iter,
   return true;
 }
 
+  if (SSA_NAME_VAR (poisoned_var) == NULL_TREE)
+SET_SSA_NAME_VAR_OR_IDENTIFIER (poisoned_var,
+create_tmp_var (TREE_TYPE (poisoned_var)));
+
   tree shadow_var = create_asan_shadow_var (SSA_NAME_VAR (poisoned_var),
 	shadow_vars_mapping);
 
diff --git a/gcc/testsuite/g++.dg/asan/pr79783.C b/gcc/testsuite/g++.dg/asan/pr79783.C
new file mode 100644
index 000..939f60b2819
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/pr79783.C
@@ -0,0 +1,19 @@
+// PR sanitizer/79783
+// { dg-options "-fno-tree-dce" }
+
+struct A
+{
+  static void foo(const char&) {}
+};
+
+struct B
+{
+  B() { A::foo(char()); }
+};
+
+struct C
+{
+  virtual void bar() const { B b; }
+};
+
+C c;
-- 
2.11.1



[PATCH][MIPS]MSA min,max insn family RTL fixes

2017-03-05 Thread Prachi Godbole
Hi,

Here are a couple of bugs in the MSA min/max instructions, along with proposed
fixes. The patch is included below.

1. mini_s and maxi_s: assembler error ("invalid operand").
Fix: Change the operand print code so that it prints a signed immediate
instead of an unsigned one.

2. max_a, min_a, fmax_a, fmin_a: the RTL if_then_else operand is missing its
mode; this was discovered while forward propagating the result.
Fix: Add the mode iterator to the if_then_else construct.
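
To illustrate the first fix, here is a minimal sketch of why the choice of
print code matters for a negative immediate such as -15. It is not GCC code:
unsigned_byte_view and format_imm are hypothetical helpers, and the unsigned
form is assumed to be an 8-bit view of the value.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers for illustration only; these are not GCC functions.  */

/* Value shown by an (assumed) unsigned 8-bit view of IMM.  */
static unsigned
unsigned_byte_view (int imm)
{
  return (unsigned) imm & 0xffu;
}

/* Format IMM into BUF (of size N), either as a signed decimal or through
   the unsigned byte view.  Returns the number of characters written.  */
static int
format_imm (char *buf, size_t n, int imm, int is_signed)
{
  assert (buf != NULL && n > 0);
  return is_signed ? snprintf (buf, n, "%d", imm)
		   : snprintf (buf, n, "%u", unsigned_byte_view (imm));
}
```

With the unsigned view, -15 comes out as a large positive number, which a
signed-immediate instruction cannot encode; with the signed form the
assembler sees the value as written in the source.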

OK?

Changelog:

2017-03-06  Prachi Godbole  

gcc/
* config/mips/mips-msa.md (msa_fmax_a_, msa_fmin_a_,
msa_max_a_, msa_min_a_): Introduce mode iterator for
if_then_else.
(smin3, smax3): Change operand print code from 'B' to 'E'.

gcc/testsuite/
* gcc.target/mips/msa-minmax.c: New tests.


Index: config/mips/mips-msa.md
===
--- config/mips/mips-msa.md (revision 245205)
+++ config/mips/mips-msa.md (working copy)
@@ -1688,7 +1688,7 @@
 
 (define_insn "msa_fmax_a_"
   [(set (match_operand:FMSA 0 "register_operand" "=f")
-   (if_then_else
+   (if_then_else:FMSA
   (gt (abs:FMSA (match_operand:FMSA 1 "register_operand" "f"))
   (abs:FMSA (match_operand:FMSA 2 "register_operand" "f")))
   (match_dup 1)
@@ -1709,7 +1709,7 @@
 
 (define_insn "msa_fmin_a_"
   [(set (match_operand:FMSA 0 "register_operand" "=f")
-   (if_then_else
+   (if_then_else:FMSA
   (lt (abs:FMSA (match_operand:FMSA 1 "register_operand" "f"))
   (abs:FMSA (match_operand:FMSA 2 "register_operand" "f")))
   (match_dup 1)
@@ -2174,7 +2174,7 @@
 
 (define_insn "msa_max_a_"
   [(set (match_operand:IMSA 0 "register_operand" "=f")
-   (if_then_else
+   (if_then_else:IMSA
   (gt (abs:IMSA (match_operand:IMSA 1 "register_operand" "f"))
   (abs:IMSA (match_operand:IMSA 2 "register_operand" "f")))
   (match_dup 1)
@@ -2191,7 +2191,7 @@
   "ISA_HAS_MSA"
   "@
max_s.\t%w0,%w1,%w2
-   maxi_s.\t%w0,%w1,%B2"
+   maxi_s.\t%w0,%w1,%E2"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
 
@@ -2208,7 +2208,7 @@
 
 (define_insn "msa_min_a_"
   [(set (match_operand:IMSA 0 "register_operand" "=f")
-   (if_then_else
+   (if_then_else:IMSA
   (lt (abs:IMSA (match_operand:IMSA 1 "register_operand" "f"))
   (abs:IMSA (match_operand:IMSA 2 "register_operand" "f")))
   (match_dup 1)
@@ -2225,7 +2225,7 @@
   "ISA_HAS_MSA"
   "@
min_s.\t%w0,%w1,%w2
-   mini_s.\t%w0,%w1,%B2"
+   mini_s.\t%w0,%w1,%E2"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
Index: testsuite/gcc.target/mips/msa-minmax.c
===
--- testsuite/gcc.target/mips/msa-minmax.c  (revision 0)
+++ testsuite/gcc.target/mips/msa-minmax.c  (revision 0)
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-mips16 -mfp64 -mhard-float -mmsa" } */
+
+typedef int v4i32 __attribute__ ((vector_size(16)));
+typedef float v4f32 __attribute__ ((vector_size(16)));
+
+/* Test MSA signed min/max immediate for correct assembly output.  */
+
+void
+min_s_msa (v4i32 *vx, v4i32 *vy)
+{
+  *vy = __builtin_msa_mini_s_w (*vx, -15);
+}
+/* { dg-final { scan-assembler "-15" } }  */
+
+void
+max_s_msa (v4i32 *vx, v4i32 *vy)
+{
+  *vy = __builtin_msa_maxi_s_w (*vx, -15);
+}
+/* { dg-final { scan-assembler "-15" } }  */
+
+/* Test MSA min_a/max_a instructions for forward propagation optimization.  */
+
+#define FUNC(NAME, TYPE, RETTYPE) RETTYPE NAME##_a_msa (TYPE *vx, TYPE *vy) \
+{ \
+  TYPE dest = __builtin_msa_##NAME##_a_w (*vx, *vy); \
+  return dest[0]; \
+}
+
+FUNC(fmin, v4f32, float)
+/* { dg-final { scan-assembler "fmin_a.w" } }  */
+FUNC(fmax, v4f32, float)
+/* { dg-final { scan-assembler "fmax_a.w" } }  */
+FUNC(min, v4i32, int)
+/* { dg-final { scan-assembler "min_a.w" } }  */
+FUNC(max, v4i32, int)
+/* { dg-final { scan-assembler "max_a.w" } }  */


[PATCH][MIPS]MSA dotp.d, dpadd.d, dpsub.d insn RTL - fix MODE

2017-03-05 Thread Prachi Godbole
Hi,

A bug was discovered in MSA dotp__d, dpadd__d and dpsub__d RTL 
patterns while CSE'ing the result:
Wrong MODE for vec_select in the second mult operand.

The patch below fixes the same.
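
For reference, a minimal C model of what the dotp pattern computes, following
the lane selections visible in the RTL: each vec_select picks two of the four
32-bit lanes, so its result is a two-lane value (hence V2SI rather than V4SI)
before it is widened to 64 bits. This is only a sketch; dotp_s_d_model is a
hypothetical name, not a GCC or MSA function.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of a signed doubleword dot product over four 32-bit lanes.
   Lanes {0, 2} and lanes {1, 3} are each selected as two-element vectors,
   widened to 64 bits, multiplied lane by lane and summed.  Not GCC code.  */
static void
dotp_s_d_model (const int32_t a[4], const int32_t b[4], int64_t out[2])
{
  assert (a && b && out);
  for (int i = 0; i < 2; i++)
    {
      /* Lane i of the {0, 2} selection and lane i of the {1, 3} one.  */
      int64_t even = (int64_t) a[2 * i] * (int64_t) b[2 * i];
      int64_t odd = (int64_t) a[2 * i + 1] * (int64_t) b[2 * i + 1];
      /* Add as unsigned so that the extreme case wraps instead of
	 overflowing a signed type.  */
      out[i] = (int64_t) ((uint64_t) even + (uint64_t) odd);
    }
}
```

With the vectors used in the new test, g = {0, 92, 93, 94} and
h = {12, 24, 36, 48}, the model gives {2208, 7860}.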

OK for trunk?

Changelog:

2017-03-06  Prachi Godbole  

gcc/
* config/mips/mips-msa.md (msa_dotp__d, msa_dpadd__d,
msa_dpsub__d): Fix MODE for vec_select.

gcc/testsuite/
* gcc.target/mips/msa-dotp.c: New tests.

Index: config/mips/mips-msa.md
===
--- config/mips/mips-msa.md (revision 245205)
+++ config/mips/mips-msa.md (working copy)
@@ -1230,10 +1230,10 @@
(parallel [(const_int 0) (const_int 2)]
  (mult:V2DI
(any_extend:V2DI
- (vec_select:V4SI (match_dup 1)
+ (vec_select:V2SI (match_dup 1)
(parallel [(const_int 1) (const_int 3)])))
(any_extend:V2DI
- (vec_select:V4SI (match_dup 2)
+ (vec_select:V2SI (match_dup 2)
(parallel [(const_int 1) (const_int 3)]))]
   "ISA_HAS_MSA"
   "dotp_.d\t%w0,%w1,%w2"
@@ -1319,10 +1319,10 @@
  (parallel [(const_int 0) (const_int 2)]
(mult:V2DI
  (any_extend:V2DI
-   (vec_select:V4SI (match_dup 2)
+   (vec_select:V2SI (match_dup 2)
  (parallel [(const_int 1) (const_int 3)])))
  (any_extend:V2DI
-   (vec_select:V4SI (match_dup 3)
+   (vec_select:V2SI (match_dup 3)
  (parallel [(const_int 1) (const_int 3)])
  (match_operand:V2DI 1 "register_operand" "0")))]
   "ISA_HAS_MSA"
@@ -1414,10 +1414,10 @@
  (parallel [(const_int 0) (const_int 2)]
(mult:V2DI
  (any_extend:V2DI
-   (vec_select:V4SI (match_dup 2)
+   (vec_select:V2SI (match_dup 2)
  (parallel [(const_int 1) (const_int 3)])))
  (any_extend:V2DI
-   (vec_select:V4SI (match_dup 3)
+   (vec_select:V2SI (match_dup 3)
  (parallel [(const_int 1) (const_int 3)])))]
   "ISA_HAS_MSA"
   "dpsub_.d\t%w0,%w2,%w3"
Index: testsuite/gcc.target/mips/msa-dotp.c
===
--- testsuite/gcc.target/mips/msa-dotp.c(revision 0)
+++ testsuite/gcc.target/mips/msa-dotp.c(revision 0)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-mips16 -mfp64 -mhard-float -mmsa" } */
+
+typedef int v4i32 __attribute__ ((vector_size(16)));
+typedef long long v2i64 __attribute__ ((vector_size(16)));
+
+/* Test MSA dot product family for CSE optimization.  */
+
+static v4i32 g = {0, 92, 93, 94};
+static v4i32 h = {12, 24, 36, 48};
+static v2i64 l = {84, 98};
+
+void
+dotp_d_msa (v2i64 *c)
+{
+  *c = __builtin_msa_dotp_s_d (g, h);
+}
+/* { dg-final { scan-assembler "dotp_s.d" } }  */
+
+void
+dpadd_d_msa (v2i64 *c)
+{
+  *c = __builtin_msa_dpadd_s_d (l, g, h);
+}
+/* { dg-final { scan-assembler "dpadd_s.d" } }  */
+
+void
+dpsub_d_msa (v2i64 *c)
+{
+  *c = __builtin_msa_dpsub_s_d (l, g, h);
+}
+/* { dg-final { scan-assembler "dpsub_s.d" } }  */


[PATCH][MIPS]MSA AND.d optimization to generate BCLRI.d

2017-03-05 Thread Prachi Godbole
Hi,

Below is a patch to fix the ICE "output_operand: invalid use of '%V'"
when generating a BCLRI.d instruction from the AND.d pattern.

Proposed fix:
mips_gen_const_int_vector (machine_mode mode, int val): Change the type of
argument VAL from int to HOST_WIDE_INT to allow const vectors of doubleword
type. The function is used by the BCLRI.d alternative in the AND.d pattern for
an immediate const vector operand with only one bit clear.
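
A small sketch of the underlying problem, assuming a 32-bit int and a 64-bit
HOST_WIDE_INT: a doubleword lane constant whose single clear bit lies above
bit 31 cannot pass through an int parameter unchanged. The helpers
mask_with_bit_clear and fits_in_int are hypothetical and do not mirror how
mips_gen_const_int_vector is implemented.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not GCC code.  */

/* 64-bit lane constant with every bit set except BIT (0..63).  */
static int64_t
mask_with_bit_clear (unsigned bit)
{
  assert (bit < 64);
  if (bit == 63)
    return INT64_MAX;
  /* Same bit pattern as ~(1 << BIT), computed without overflow.  */
  return -1 - ((int64_t) 1 << bit);
}

/* Nonzero if V survives being passed through an `int' parameter.  */
static int
fits_in_int (int64_t v)
{
  return v >= INT_MIN && v <= INT_MAX;
}
```

A mask with bit 3 clear still fits in an int, which is why narrower element
sizes are unaffected; a mask with bit 40 or bit 63 clear does not.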

OK?

Changelog:

2017-03-06  Prachi Godbole  

gcc/
* config/mips/mips.c (mips_gen_const_int_vector): Change type of last
argument.
* config/mips/mips-protos.h (mips_gen_const_int_vector): Likewise.

gcc/testsuite/
* gcc.target/mips/msa-bclri.c: New test.


Index: config/mips/mips.c
===
--- config/mips/mips.c  (revision 245205)
+++ config/mips/mips.c  (working copy)
@@ -21608,7 +21608,7 @@
 /* Return a const_int vector of VAL with mode MODE.  */
 
 rtx
-mips_gen_const_int_vector (machine_mode mode, int val)
+mips_gen_const_int_vector (machine_mode mode, HOST_WIDE_INT val)
 {
   int nunits = GET_MODE_NUNITS (mode);
   rtvec v = rtvec_alloc (nunits);
Index: config/mips/mips-protos.h
===
--- config/mips/mips-protos.h   (revision 245205)
+++ config/mips/mips-protos.h   (working copy)
@@ -294,7 +294,7 @@
 extern bool mips_const_vector_bitimm_set_p (rtx, machine_mode);
 extern bool mips_const_vector_bitimm_clr_p (rtx, machine_mode);
 extern rtx mips_msa_vec_parallel_const_half (machine_mode, bool);
-extern rtx mips_gen_const_int_vector (machine_mode, int);
+extern rtx mips_gen_const_int_vector (machine_mode, HOST_WIDE_INT);
 extern bool mips_secondary_memory_needed (enum reg_class, enum reg_class,
  machine_mode);
 extern bool mips_cannot_change_mode_class (machine_mode,
Index: testsuite/gcc.target/mips/msa-bclri.c
===
--- testsuite/gcc.target/mips/msa-bclri.c   (revision 0)
+++ testsuite/gcc.target/mips/msa-bclri.c   (revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-mips16 -mfp64 -mhard-float -mmsa" } */
+
+typedef long long v2i64 __attribute__ ((vector_size(16)));
+
+/* Test MSA AND.d optimization: generate BCLRI.d instead, for immediate const
+   vector operand with only one bit clear.  */
+
+void
+and_d_msa (v2i64 *vx, v2i64 *vy)
+{
+  v2i64 and_vec = {0x7FFF, 0x7FFF};
+  *vy = (*vx) & and_vec;
+}
+/* { dg-final { scan-assembler "bclri.d" } }  */


[PATCH][AArch64] Fix type for 1-element load

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that fixes the type for 1-element load in AArch64.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set): Fix
type for 1-element load.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 878f86a..0443a86 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -561,7 +561,7 @@
 	gcc_unreachable ();
  }
   }
-  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_1reg")]
+  [(set_attr "type" "neon_from_gp, neon_ins, neon_load1_one_lane")]
 )
 
 (define_insn "*aarch64_simd_vec_copy_lane"


[PATCH][AArch64] Add neon_pairwise_add & neon_pairwise_add_q types

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that adds the "neon_pairwise_add" &
"neon_pairwise_add_q" types for AArch64.

The patch doesn't change SPEC but improves other benchmarks.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-simd.md (aarch64_reduc_plus_internal)
(aarch64_reduc_plus_internalv2si, aarch64_addp, aarch64_addpdi):
Use neon_pairwise_add/neon_pairwise_add_q as appropriate.
* config/aarch64/iterators.md (reduc_pairwise): New mode attribute.
* config/aarch64/thunderx.md (thunderx_neon_add, thunderx_neon_add_q):
Tweak for neon_pairwise_add split.
* config/aarch64/thunderx2t99.md (thunderx2t99_asimd_int): Add
neon_pairwise_add/neon_pairwise_add_q types.
* config/arm/cortex-a15-neon.md (cortex_a15_neon_type): Likewise.
* config/arm/cortex-a17-neon.md (cortex_a17_neon_type): Likewise.
* config/arm/cortex-a57.md (cortex_a57_neon_type): Likewise.
* config/arm/cortex-a8-neon.md (cortex_a8_neon_type): Likewise.
* config/arm/cortex-a9-neon.md (cortex_a9_neon_type): Likewise.
* config/arm/xgene1.md (xgene1_neon_arith): Likewise.
* config/arm/types.md (neon_pairwise_add, neon_pairwise_add_q): Add.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 338b9f8..878f86a 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2101,7 +2101,7 @@
 		UNSPEC_ADDV))]
  "TARGET_SIMD"
  "add\\t%0, %1."
-  [(set_attr "type" "neon_reduc_add")]
+  [(set_attr "type" "neon__add")]
 )
 
 (define_insn "aarch64_reduc_plus_internalv2si"
@@ -2110,7 +2110,7 @@
 		UNSPEC_ADDV))]
  "TARGET_SIMD"
  "addp\\t%0.2s, %1.2s, %1.2s"
-  [(set_attr "type" "neon_reduc_add")]
+  [(set_attr "type" "neon_pairwise_add")]
 )
 
 (define_insn "reduc_plus_scal_"
@@ -4405,7 +4405,7 @@
   UNSPEC_ADDP))]
   "TARGET_SIMD"
   "addp\t%0, %1, %2"
-  [(set_attr "type" "neon_reduc_add")]
+  [(set_attr "type" "neon_pairwise_add")]
 )
 
 (define_insn "aarch64_addpdi"
@@ -4415,7 +4415,7 @@
   UNSPEC_ADDP))]
   "TARGET_SIMD"
   "addp\t%d0, %1.2d"
-  [(set_attr "type" "neon_reduc_add")]
+  [(set_attr "type" "neon_pairwise_add")]
 )
 
 ;; sqrt
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index c59d31e..c829cb5 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -790,6 +790,12 @@
 		  (V2SF "p") (V4SF  "v")
 		  (V4HF "v") (V8HF  "v")])
 
+(define_mode_attr reduc_pairwise [(V8QI "reduc") (V16QI "reduc")
+  (V4HI "reduc") (V8HI "reduc")
+  (V2SI "pairwise") (V4SI "reduc")
+  (V2DI "pairwise") (V2DF "pairwise")
+  (V2SF "pairwise") (V4SF "reduc")])
+
 (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
 (define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])
 
diff --git a/gcc/config/aarch64/thunderx.md b/gcc/config/aarch64/thunderx.md
index b67671d..95bfad4 100644
--- a/gcc/config/aarch64/thunderx.md
+++ b/gcc/config/aarch64/thunderx.md
@@ -266,7 +266,8 @@
 
 (define_insn_reservation "thunderx_neon_add" 4
   (and (eq_attr "tune" "thunderx")
-   (eq_attr "type" "neon_reduc_add, neon_reduc_minmax, neon_fp_reduc_add_s, \
+   (eq_attr "type" "neon_reduc_add, neon_pairwise_add, neon_reduc_minmax,\
+			neon_fp_reduc_add_s, \
 			neon_fp_reduc_add_d, neon_fp_to_int_s, neon_fp_to_int_d, \
 			neon_add_halve, neon_sub_halve, neon_qadd, neon_compare, \
 			neon_compare_zero, neon_minmax, neon_abd, neon_add, neon_sub, \
@@ -280,7 +281,8 @@
 
 (define_insn_reservation "thunderx_neon_add_q" 5
   (and (eq_attr "tune" "thunderx")
-   (eq_attr "type" "neon_reduc_add_q, neon_reduc_minmax_q, neon_fp_reduc_add_s_q, \
+   (eq_attr "type" "neon_reduc_add_q, neon_pairwise_add_q,\
+			neon_reduc_minmax_q, neon_fp_reduc_add_s_q, \
 			neon_fp_reduc_add_d_q, neon_fp_to_int_s_q, neon_fp_to_int_d_q, \
 			neon_add_halve_q, neon_sub_halve_q, neon_qadd_q, neon_compare_q, \
 			neon_compare_zero_q, neon_minmax_q, neon_abd_q, neon_add_q, neon_sub_q, \
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index 67011ac..f807547 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -231,6 +231,7 @@
 			neon_abs,neon_abs_q,\
 			neon_add,neon_add_q,\
 			neon_neg,neon_neg_q,\
+			neon_pairwise_add,neon_pairwise_add_q,\
 			neon_add_long,neon_add_widen,\
 			neon_add_halve,neon_add_halve_q,\
 			neon_sub_long,neon_sub_widen,\
diff --git a/gcc/config/arm/cortex-a15-neon.md b/gcc/config/arm/cortex-a15-neon.md
index 73ee84c..1a02fa2 100644
--- a/gcc/config/arm/cortex-a15-neon.md
+++ b/gcc/config/arm/cortex-a15-neon.md
@@ -48,6 +48,7 @@
   (eq_attr "type" "neon_add, neon_add_q, 

[PATCH] [AArch64] Implement automod load and store for Thunderx2t99

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that implements automod load and store for
Thunderx2t99.
The patch doesn't change SPEC but improves other benchmarks.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-protos.h (aarch64_automod_addr_only_dep): Add
prototype.
* config/aarch64/aarch64.c (aarch64_automod_addr_only_dep): New
function.
* config/aarch64/thunderx2t99.md (thunderx2t99_load_basic)
(thunderx2t99_store_basic, thunderx2t99_storepair_basic)
(thunderx2t99_fp_load_basic, thunderx2t99_fp_loadpair_basic)
(thunderx2t99_fp_storepair_basic): Add aarch64_mem_type_p test.
(thunderx2t99_load_automod, thunderx2t99_load_regoffset)
(thunderx2t99_load_scale_ext, thunderx2t99_store_automod)
(thunderx2t99_store_regoffset_scale_ext, thunderx2t99_fp_load_automod)
(thunderx2t99_storepair_automod, thunderx2t99_fp_load_regoffset)
(thunderx2t99_fp_load_scale_ext, thunderx2t99_fp_loadpair_automod)
(thunderx2t99_fp_store_automod, thunderx2t99_fp_storepair_automod)
(thunderx2t99_fp_store_regoffset_scale_ext): New insn reservations.
(thunderx2t99_load_automod, thunderx2t99_fp_load_automod)
(thunderx2t99_fp_loadpair_automod): Add bypass for output address-only
dependencies.
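
As a rough illustration of what "address-only dependency" means for the new
bypass, here is a simplified model. It is not the code from the patch:
registers are modelled as bits in a mask, and regset, automod_load and
addr_only_dep_model are hypothetical names. Unlike the real function, the
model also checks explicitly that the consumer reads the updated base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: a register set is a bit mask with
   bit N standing for register xN.  */
typedef uint32_t regset;

#define REG(n) ((regset) 1u << (n))

struct automod_load
{
  regset loaded;	/* registers receiving the loaded value */
  regset base;		/* base register rewritten by the address update */
};

/* True if a consumer reading CONSUMER_READS depends on LOAD only through
   the updated base register: it reads the base but none of the loaded
   registers.  */
static bool
addr_only_dep_model (struct automod_load load, regset consumer_reads)
{
  assert (load.base != 0);
  bool reads_value = (load.loaded & consumer_reads) != 0;
  bool reads_base = (load.base & consumer_reads) != 0;
  return reads_base && !reads_value;
}
```

For a post-indexed load of x0 through x1, a following instruction that reads
only x1 has an address-only dependency, while one that reads x0 has to wait
for the loaded value.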
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e045df8..7472d98 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -488,5 +488,6 @@ std::string aarch64_get_extension_string_for_isa_flags (unsigned long,
 			unsigned long);
 
 rtl_opt_pass *make_pass_fma_steering (gcc::context *ctxt);
+int aarch64_automod_addr_only_dep (rtx_insn *, rtx_insn *);
 
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 62f5461..c674c51 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14875,6 +14875,94 @@ aarch64_run_selftests (void)
 
 #endif /* #if CHECKING_P */
 
+/* Return nonzero if the CONSUMER has a dependency only on an automodify
+   address in PRODUCER (a load instruction, i.e. the dependency is not on the
+   loaded value).  */
+
+int
+aarch64_automod_addr_only_dep (rtx_insn *producer, rtx_insn *consumer)
+{
+  rtx prod_set = single_set (producer);
+
+  if (prod_set)
+{
+  rtx dst, src = SET_SRC (prod_set);
+
+  if (GET_CODE (src) == ZERO_EXTEND || GET_CODE (src) == SIGN_EXTEND)
+	src = XEXP (src, 0);
+
+  gcc_assert (MEM_P (src));
+
+  dst = XEXP (prod_set, 0);
+
+  rtx cons_set = single_set (consumer);
+  rtx cons_pat = PATTERN (consumer);
+
+  if (cons_set)
+	return !reg_overlap_mentioned_p (dst, cons_set);
+  else if (GET_CODE (cons_pat) == PARALLEL)
+	{
+	  for (int i = 0; i < XVECLEN (cons_pat, 0); i++)
+	{
+	  rtx set = XVECEXP (cons_pat, 0, i);
+
+	  if (GET_CODE (set) != SET)
+		continue;
+
+	  if (reg_overlap_mentioned_p (dst, set))
+		return 0;
+	}
+	}
+  else
+	return 0;
+}
+  else if (GET_CODE (PATTERN (producer)) == PARALLEL)
+{
+  rtx prod_pat = PATTERN (producer);
+  rtx cons_set = single_set (consumer);
+  rtx cons_pat = PATTERN (consumer);
+
+  for (int i = 0; i < XVECLEN (prod_pat, 0); i++)
+	{
+	  prod_set = XVECEXP (prod_pat, 0, i);
+
+	  if (GET_CODE (prod_set) == SET)
+	{
+	  rtx src = XEXP (prod_set, 1), dst = XEXP (prod_set, 0);
+
+	  if (GET_CODE (src) == ZERO_EXTEND
+		  || GET_CODE (src) == SIGN_EXTEND)
+		src = XEXP (src, 0);
+
+	  gcc_assert (MEM_P (src));
+
+	  if (cons_set)
+		{
+		  if (reg_overlap_mentioned_p (dst, cons_set))
+		return 0;
+		}
+	  else if (GET_CODE (cons_pat) == PARALLEL)
+		{
+		  for (int i = 0; i < XVECLEN (cons_pat, 0); i++)
+		{
+		  rtx set = XVECEXP (cons_pat, 0, i);
+
+		  if (GET_CODE (set) != SET)
+		continue;
+
+		  if (reg_overlap_mentioned_p (dst, set))
+			return 0;
+		}
+		}
+	  else
+		return 0;
+	}
+	}
+}
+
+  return 1;
+}
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
 
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index 936078c..add3707 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -123,24 +123,73 @@
 
 (define_insn_reservation "thunderx2t99_load_basic" 4
   (and (eq_attr "tune" "thunderx2t99")
-   (eq_attr "type" "load1"))
+   (eq_attr "type" "load1")
+   (match_test "aarch64_mem_type_p (insn, AARCH64_ADDR_SYMBOLIC
+	  | AARCH64_ADDR_REG_IMM
+	  | AARCH64_ADDR_LO_SUM)"))
   "thunderx2t99_ls01")
 
+(define_insn_reservation "thunderx2t99_load_automod" 4
+  (and 

[PATCH][AArch64] Add aes and sha reservations for Thunderx2t99

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that adds aes and sha reservations for
Thunderx2t99.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
    Naveen H.S  

* config/aarch64/thunderx2t99.md (thunderx2t99_aes, thunderx2t99_sha): New
reservations.
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index f807547..2eb136b 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -443,7 +443,22 @@
(eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
   "thunderx2t99_ls01,thunderx2t99_f01")
 
+;; Crypto extensions.
+
+; FIXME: Forwarding path for aese/aesmc or aesd/aesimc pairs?
+
+(define_insn_reservation "thunderx2t99_aes" 5
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "crypto_aese,crypto_aesmc"))
+  "thunderx2t99_f1")
+
 (define_insn_reservation "thunderx2t99_pmull" 5
   (and (eq_attr "tune" "thunderx2t99")
(eq_attr "type" "crypto_pmull"))
   "thunderx2t99_f1")
+
+(define_insn_reservation "thunderx2t99_sha" 7
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "crypto_sha1_fast,crypto_sha1_xor,crypto_sha1_slow,\
+			crypto_sha256_fast,crypto_sha256_slow"))
+  "thunderx2t99_f1")


[PATCH][AArch64] Implement ALU_BRANCH fusion

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that implements alu_branch fusion
for AArch64.
The patch doesn't change SPEC but improves other benchmarks.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-fusion-pairs.def: Add ALU_BRANCH entry.
* config/aarch64/aarch64.c (AARCH64_FUSE_ALU_BRANCH): New fusion type.
(thunderx2t99_tunings): Set AARCH64_FUSE_ALU_BRANCH flag.
(aarch_macro_fusion_pair_p): Add support for AARCH64_FUSE_ALU_BRANCH.
diff --git a/gcc/config/aarch64/aarch64-fusion-pairs.def b/gcc/config/aarch64/aarch64-fusion-pairs.def
index f0e6dbc..300cd00 100644
--- a/gcc/config/aarch64/aarch64-fusion-pairs.def
+++ b/gcc/config/aarch64/aarch64-fusion-pairs.def
@@ -34,5 +34,6 @@ AARCH64_FUSION_PAIR ("movk+movk", MOVK_MOVK)
 AARCH64_FUSION_PAIR ("adrp+ldr", ADRP_LDR)
 AARCH64_FUSION_PAIR ("cmp+branch", CMP_BRANCH)
 AARCH64_FUSION_PAIR ("aes+aesmc", AES_AESMC)
+AARCH64_FUSION_PAIR ("alu+branch", ALU_BRANCH)
 
 #undef AARCH64_FUSION_PAIR
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fa25d43..62f5461 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -792,7 +792,8 @@ static const struct tune_params thunderx2t99_tunings =
   _approx_modes,
   4, /* memmov_cost.  */
   4, /* issue_rate.  */
-  (AARCH64_FUSE_CMP_BRANCH | AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
+  (AARCH64_FUSE_CMP_BRANCH | AARCH64_FUSE_AES_AESMC
+   | AARCH64_FUSE_ALU_BRANCH), /* fusible_ops  */
   16,	/* function_align.  */
   8,	/* jump_align.  */
   16,	/* loop_align.  */
@@ -14063,6 +14064,37 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
 return true;
 }
 
+  if (aarch64_fusion_enabled_p (AARCH64_FUSE_ALU_BRANCH)
+  && any_uncondjump_p (curr))
+{
+  /* These types correspond to the reservation "vulcan_alu_basic" for
+	 Broadcom Vulcan: these are ALU operations that produce a single uop
+	 during instruction decoding.  */
+  switch (get_attr_type (prev))
+	{
+	case TYPE_ALU_IMM:
+	case TYPE_ALU_SREG:
+	case TYPE_ADC_REG:
+	case TYPE_ADC_IMM:
+	case TYPE_ADCS_REG:
+	case TYPE_ADCS_IMM:
+	case TYPE_LOGIC_REG:
+	case TYPE_LOGIC_IMM:
+	case TYPE_CSEL:
+	case TYPE_ADR:
+	case TYPE_MOV_IMM:
+	case TYPE_SHIFT_REG:
+	case TYPE_SHIFT_IMM:
+	case TYPE_BFM:
+	case TYPE_RBIT:
+	case TYPE_REV:
+	case TYPE_EXTEND:
+	  return true;
+
+	default:;
+	}
+}
+
   return false;
 }
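
The check added above can be summarised with a small standalone model. It is
only a sketch: the flag values are illustrative (the real ones are generated
from aarch64-fusion-pairs.def), alu_branch_fusible is a hypothetical helper,
and TYPE_LOAD and TYPE_MUL are placeholder names for non-ALU classes.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; not the ones GCC generates.  */
enum fuse_flag
{
  FUSE_CMP_BRANCH = 1u << 0,
  FUSE_AES_AESMC = 1u << 1,
  FUSE_ALU_BRANCH = 1u << 2
};

/* A few instruction classes.  The first three appear in the patch;
   TYPE_LOAD and TYPE_MUL are placeholders for anything else.  */
enum insn_type { TYPE_ALU_IMM, TYPE_LOGIC_REG, TYPE_CSEL, TYPE_LOAD, TYPE_MUL };

/* True if PREV followed by an unconditional jump may be fused, given the
   set of fusions ENABLED for the current tuning.  */
static bool
alu_branch_fusible (unsigned enabled, enum insn_type prev,
		    bool curr_is_uncond_jump)
{
  if (!(enabled & FUSE_ALU_BRANCH) || !curr_is_uncond_jump)
    return false;

  switch (prev)
    {
    case TYPE_ALU_IMM:
    case TYPE_LOGIC_REG:
    case TYPE_CSEL:
      return true;

    default:
      return false;
    }
}
```

With only the cmp+branch and aes+aesmc flags set, nothing is fused by this
check; adding the alu+branch flag enables it for the single-uop ALU classes
only.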
 


[PATCH][AArch64] Add crypto_pmull attribute

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that adds "crypto_pmull" for AArch64.

Bootstrapped and regression tested on aarch64-thunder-linux.

Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-simd.md (aarch64_crypto_pmulldi)
(aarch64_crypto_pmullv2di): Change type attribute to crypto_pmull.
* config/aarch64/thunderx2t99.md (thunderx2t99_pmull): New
reservation.
* config/arm/cortex-a57.md (cortex_a57_neon_type): Add crypto_pmull to
attribute type list for neon_multiply.
* config/arm/crypto.md (crypto_vmullp64): Change type to crypto_pmull.
* config/arm/types.md (crypto_pmull): Add.
* config/arm/xgene1.md (xgene1_neon_pmull): Add crypto_pmull to
attribute type list.
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b61f79a..338b9f8 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -5818,7 +5818,7 @@
 		UNSPEC_PMULL))]
  "TARGET_SIMD && TARGET_CRYPTO"
  "pmull\\t%0.1q, %1.1d, %2.1d"
-  [(set_attr "type" "neon_mul_d_long")]
+  [(set_attr "type" "crypto_pmull")]
 )
 
 (define_insn "aarch64_crypto_pmullv2di"
@@ -5828,5 +5828,5 @@
 		  UNSPEC_PMULL2))]
   "TARGET_SIMD && TARGET_CRYPTO"
   "pmull2\\t%0.1q, %1.2d, %2.2d"
-  [(set_attr "type" "neon_mul_d_long")]
+  [(set_attr "type" "crypto_pmull")]
 )
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index 0dd7199..67011ac 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -441,3 +441,8 @@
   (and (eq_attr "tune" "thunderx2t99")
(eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
   "thunderx2t99_ls01,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_pmull" 5
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "crypto_pmull"))
+  "thunderx2t99_f1")
diff --git a/gcc/config/arm/cortex-a57.md b/gcc/config/arm/cortex-a57.md
index fd30758..ebf4a49 100644
--- a/gcc/config/arm/cortex-a57.md
+++ b/gcc/config/arm/cortex-a57.md
@@ -76,7 +76,7 @@
 			   neon_mul_h_scalar_long, neon_mul_s_scalar_long,\
 			   neon_sat_mul_b_long, neon_sat_mul_h_long,\
 			   neon_sat_mul_s_long, neon_sat_mul_h_scalar_long,\
-			   neon_sat_mul_s_scalar_long")
+			   neon_sat_mul_s_scalar_long, crypto_pmull")
 	(const_string "neon_multiply")
 	  (eq_attr "type" "neon_mul_b_q, neon_mul_h_q, neon_mul_s_q,\
 			   neon_mul_h_scalar_q, neon_mul_s_scalar_q,\
diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md
index 46b0715..a5e558b 100644
--- a/gcc/config/arm/crypto.md
+++ b/gcc/config/arm/crypto.md
@@ -81,7 +81,7 @@
  UNSPEC_VMULLP64))]
   "TARGET_CRYPTO"
   "vmull.p64\\t%q0, %P1, %P2"
-  [(set_attr "type" "neon_mul_d_long")]
+  [(set_attr "type" "crypto_pmull")]
 )
 
 (define_insn "crypto_"
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index b0b375c..253f496 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -539,6 +539,7 @@
 ; crypto_sha1_slow
 ; crypto_sha256_fast
 ; crypto_sha256_slow
+; crypto_pmull
 ;
 ; The classification below is for coprocessor instructions
 ;
@@ -1078,6 +1079,7 @@
   crypto_sha1_slow,\
   crypto_sha256_fast,\
   crypto_sha256_slow,\
+  crypto_pmull,\
   coproc"
(const_string "untyped"))
 
diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
index 62a0732..34a13f4 100644
--- a/gcc/config/arm/xgene1.md
+++ b/gcc/config/arm/xgene1.md
@@ -527,5 +527,6 @@
 (define_insn_reservation "xgene1_neon_pmull" 5
   (and (eq_attr "tune" "xgene1")
(eq_attr "type" "neon_mul_d_long,\
-   "))
+			crypto_pmull,\
+		   "))
   "xgene1_decode2op")


[PATCH][AArch64] Add crc reservations for Thunderx2t99

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that adds crc reservations for Thunderx2t99.

Bootstrapped and regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1.

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/thunderx2t99.md (thunderx2t99_crc): New Reservation.
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
index 2eb136b..936078c 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -462,3 +462,10 @@
(eq_attr "type" "crypto_sha1_fast,crypto_sha1_xor,crypto_sha1_slow,\
 			crypto_sha256_fast,crypto_sha256_slow"))
   "thunderx2t99_f1")
+
+;; CRC extension.
+
+(define_insn_reservation "thunderx2t99_crc" 4
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "crc"))
+  "thunderx2t99_i1")


[PATCH][AArch64] Add addr_type attribute

2017-03-05 Thread Hurugalawadi, Naveen
Hi,

Please find attached the patch that adds "addr_type" attribute 
for AArch64.

The patch doesn't change SPEC results but improves other benchmarks.
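
For readers unfamiliar with the mask scheme, here is a minimal standalone sketch (not GCC code) of how the final switch in aarch64_mem_type_p appears to map a classified address to one mask bit. The mask values are copied from the aarch64-protos.h hunk of the patch. The AddrKind enum and the addr_matches function are hypothetical stand-ins for GCC's address classification, and the LoSum and Symbolic cases are assumptions, since that part of the function is cut off in this message.

```cpp
#include <cstdint>

// Mask values as defined in the aarch64-protos.h hunk of the patch.
constexpr std::uint64_t AARCH64_ADDR_REG_IMM       = 0x01;
constexpr std::uint64_t AARCH64_ADDR_REG_WB        = 0x02;
constexpr std::uint64_t AARCH64_ADDR_REG_REG       = 0x04;
constexpr std::uint64_t AARCH64_ADDR_REG_SHIFT     = 0x08;
constexpr std::uint64_t AARCH64_ADDR_REG_EXT       = 0x10;
constexpr std::uint64_t AARCH64_ADDR_REG_SHIFT_EXT = 0x20;
constexpr std::uint64_t AARCH64_ADDR_LO_SUM        = 0x40;
constexpr std::uint64_t AARCH64_ADDR_SYMBOLIC      = 0x80;

// Hypothetical, simplified stand-in for the classified address type.
enum class AddrKind { RegImm, RegWb, RegReg, RegUxtw, RegSxtw, LoSum, Symbolic };

// Return true if an address of the given kind and index shift is
// selected by MASK.  Shifted and unshifted index registers map to
// different bits, as the comment in the patch describes.
bool addr_matches(AddrKind kind, int shift, std::uint64_t mask)
{
  switch (kind)
    {
    case AddrKind::RegImm:
      return (mask & AARCH64_ADDR_REG_IMM) != 0;
    case AddrKind::RegWb:
      return (mask & AARCH64_ADDR_REG_WB) != 0;
    case AddrKind::RegReg:
      return (mask & (shift == 0 ? AARCH64_ADDR_REG_REG
                                 : AARCH64_ADDR_REG_SHIFT)) != 0;
    case AddrKind::RegUxtw:
    case AddrKind::RegSxtw:
      return (mask & (shift == 0 ? AARCH64_ADDR_REG_EXT
                                 : AARCH64_ADDR_REG_SHIFT_EXT)) != 0;
    case AddrKind::LoSum:      // assumed mapping, not shown in the patch text
      return (mask & AARCH64_ADDR_LO_SUM) != 0;
    case AddrKind::Symbolic:   // assumed mapping, not shown in the patch text
      return (mask & AARCH64_ADDR_SYMBOLIC) != 0;
    }
  return false;
}
```

Under these assumptions, a scheduling model that only cares about shifted or extended index registers could pass AARCH64_ADDR_REG_SHIFT | AARCH64_ADDR_REG_SHIFT_EXT as the mask.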

Bootstrapped and Regression tested on aarch64-thunder-linux.
Please review the patch and let us know if it's okay for Stage-1?

Thanks,
Naveen

2017-03-06  Julian Brown  
Naveen H.S  

* config/aarch64/aarch64-protos.h (AARCH64_ADDR_REG_IMM)
(AARCH64_ADDR_REG_WB, AARCH64_ADDR_REG_REG, AARCH64_ADDR_REG_SHIFT)
(AARCH64_ADDR_REG_EXT, AARCH64_ADDR_REG_SHIFT_EXT, AARCH64_ADDR_LO_SUM)
(AARCH64_ADDR_SYMBOLIC): New.
(aarch64_mem_type_p): Add prototype.
* config/aarch64/aarch64.c (aarch64_mem_type_p): New function.
* config/aarch64/aarch64.md (addr_type): New attribute.
(prefetch, *mov_aarch64, *movsi_aarch64, *movdi_aarch64)
(*movti_aarch64, *movtf_aarch64, *movsf_aarch64, *movdf_aarch64)
(load_pairsi, load_pairdi, store_pairsi, store_pairdi, load_pairsf)
(load_pairdf, store_pairsf)
(store_pairdf, loadwb_pair_)
(storewb_pair_, extendsidi2_aarch64)
(*load_pair_extendsidi2_aarch64, *zero_extendsidi2_aarch64)
(*load_pair_zero_extendsidi2_aarch64)
(*extend2_aarch64)
(*zero_extend2_aarch64)
(ldr_got_small_, ldr_got_small_sidi, ldr_got_tiny)
(tlsie_small_, tlsie_small_sidi): Add addr_type attribute.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 9543f8c..e045df8 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -299,6 +299,19 @@ enum aarch64_parse_opt_result
 
 extern struct tune_params aarch64_tune_params;
 
+/* Mask bits to use for aarch64_mem_type_p.  Unshifted/shifted index
+   register variants are separated for scheduling purposes because the
+   distinction matters on some cores.  */
+
+#define AARCH64_ADDR_REG_IMM		0x01
+#define AARCH64_ADDR_REG_WB		0x02
+#define AARCH64_ADDR_REG_REG		0x04
+#define AARCH64_ADDR_REG_SHIFT		0x08
+#define AARCH64_ADDR_REG_EXT		0x10
+#define AARCH64_ADDR_REG_SHIFT_EXT	0x20
+#define AARCH64_ADDR_LO_SUM		0x40
+#define AARCH64_ADDR_SYMBOLIC		0x80
+
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
 int aarch64_get_condition_code (rtx);
 bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
@@ -347,6 +360,7 @@ bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
 bool aarch64_simd_valid_immediate (rtx, machine_mode, bool,
    struct simd_immediate_info *);
 bool aarch64_split_dimode_const_store (rtx, rtx);
+bool aarch64_mem_type_p (rtx_insn *, unsigned HOST_WIDE_INT);
 bool aarch64_symbolic_address_p (rtx);
 bool aarch64_uimm12_shift (HOST_WIDE_INT);
 bool aarch64_use_return_insn_p (void);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 714bb79..fa25d43 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4551,6 +4551,88 @@ aarch64_classify_address (struct aarch64_address_info *info,
 }
 }
 
+/* Return TRUE if INSN uses an address that satisfies any of the (non-strict)
+   addressing modes specified by MASK.  This is intended for use in scheduling
+   models that are sensitive to the form of address used by some particular
+   instruction.  */
+
+bool
+aarch64_mem_type_p (rtx_insn *insn, unsigned HOST_WIDE_INT mask)
+{
+  aarch64_address_info info;
+  bool valid;
+  attr_addr_type addr_type;
+  rtx mem, addr;
+  machine_mode mode;
+
+  addr_type = get_attr_addr_type (insn);
+
+  switch (addr_type)
+{
+case ADDR_TYPE_WB:
+  info.type = ADDRESS_REG_WB;
+  break;
+
+case ADDR_TYPE_LO_SUM:
+  info.type = ADDRESS_LO_SUM;
+  break;
+
+case ADDR_TYPE_OP0:
+case ADDR_TYPE_OP1:
+  extract_insn_cached (insn);
+
+  mem = recog_data.operand[(addr_type == ADDR_TYPE_OP0) ? 0 : 1];
+
+  gcc_assert (MEM_P (mem));
+  
+  addr = XEXP (mem, 0);
+  mode = GET_MODE (mem);
+
+classify:
+  valid = aarch64_classify_address (&info, addr, mode, MEM, false);
+  if (!valid)
+	return false;
+
+  break;
+
+case ADDR_TYPE_OP0ADDR:
+case ADDR_TYPE_OP1ADDR:
+  extract_insn_cached (insn);
+
+  addr = recog_data.operand[(addr_type == ADDR_TYPE_OP0ADDR) ? 0 : 1];
+  mode = DImode;
+  goto classify;
+
+case ADDR_TYPE_NONE:
+  return false;
+}
+
+  switch (info.type)
+{
+case ADDRESS_REG_IMM:
+  return (mask & AARCH64_ADDR_REG_IMM) != 0;
+case ADDRESS_REG_WB:
+  return (mask & AARCH64_ADDR_REG_WB) != 0;
+case ADDRESS_REG_REG:
+  if (info.shift == 0)
+	return (mask & AARCH64_ADDR_REG_REG) != 0;
+  else
+return (mask & AARCH64_ADDR_REG_SHIFT) != 0;
+case ADDRESS_REG_UXTW:
+case ADDRESS_REG_SXTW:
+  if (info.shift == 0)
+	return (mask & AARCH64_ADDR_REG_EXT) != 0;
+  else
+	return (mask & AARCH64_ADDR_REG_SHIFT_EXT) != 

[PATCH] Add std::scoped_lock for C++17

2017-03-05 Thread Jonathan Wakely

This was just approved at the WG21 meeting.

* doc/xml/manual/status_cxx2017.xml: Document P0156R2 status.
* doc/html/*: Regenerate.
* include/std/mutex (scoped_lock): Implement new C++17 template.
* testsuite/30_threads/scoped_lock/cons/1.cc: New test.
* testsuite/30_threads/scoped_lock/requirements/
explicit_instantiation.cc: New test.
* testsuite/30_threads/scoped_lock/requirements/typedefs.cc: New test.

Tested powerpc64le-linux, committed to trunk.
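
As a rough illustration of what the new template is for, here is a minimal usage sketch. The Account type and both functions are hypothetical and exist only for this example. It assumes a C++17 compiler with class template argument deduction. The adopt_lock constructors are left out on purpose.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type used only for this illustration.
struct Account {
  std::mutex m;
  long balance = 0;
};

// Move AMOUNT between two accounts while holding both mutexes.
// The multi-mutex constructor acquires them via std::lock, which
// avoids deadlock regardless of the order of the arguments.
void transfer(Account& from, Account& to, long amount)
{
  std::scoped_lock lock(from.m, to.m);
  from.balance -= amount;
  to.balance += amount;
}  // both mutexes are released here, in the destructor

// Check that the balances moved and that both mutexes are free again.
bool both_unlocked_after_transfer()
{
  Account a, b;
  a.balance = 100;
  transfer(a, b, 30);
  assert(a.balance == 70 && b.balance == 30);

  // try_lock can only succeed if scoped_lock released the mutex.
  // (The standard permits spurious failure; common implementations
  // do not fail here.)
  bool a_free = a.m.try_lock();
  bool b_free = b.m.try_lock();
  if (a_free) a.m.unlock();
  if (b_free) b.m.unlock();
  return a_free && b_free;
}
```

With a single mutex, the specialization in the patch behaves like lock_guard; with none, it does nothing.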


commit 109ebf33d95668712c93627d523afe65354c51f9
Author: Jonathan Wakely 
Date:   Sun Mar 5 17:39:00 2017 +

Add std::scoped_lock for C++17

* doc/xml/manual/status_cxx2017.xml: Document P0156R2 status.
* doc/html/*: Regenerate.
* include/std/mutex (scoped_lock): Implement new C++17 template.
* testsuite/30_threads/scoped_lock/cons/1.cc: New test.
* testsuite/30_threads/scoped_lock/requirements/
explicit_instantiation.cc: New test.
* testsuite/30_threads/scoped_lock/requirements/typedefs.cc: New test.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index add0514..1053f2d 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -719,15 +719,14 @@ Feature-testing recommendations for C++.
 
 
 
-  
Variadic lock_guard 
   
-   http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0156r0.html;>
-   P0156R0
+   http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0156r2.html;>
+   P0156R2

   
-   No 
-   __cpp_lib_lock_guard_variadic >= 201510 
+   7 
+   __cpp_lib_scoped_lock >= 201703 
 
 
   
diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index d6f3899..6c3f920 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -556,6 +556,74 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 }
 
+#if __cplusplus > 201402L
+#define __cpp_lib_scoped_lock 201703
+  /** @brief A scoped lock type for multiple lockable objects.
+   *
+   * A scoped_lock controls mutex ownership within a scope, releasing
+   * ownership in the destructor.
+   */
+  template<typename... _MutexTypes>
+class scoped_lock
+{
+public:
+  explicit scoped_lock(_MutexTypes&... __m) : _M_devices(std::tie(__m...))
+  { std::lock(__m...); }
+
+  explicit scoped_lock(_MutexTypes&... __m, adopt_lock_t) noexcept
+  : _M_devices(std::tie(__m...))
+  { } // calling thread owns mutex
+
+  ~scoped_lock()
+  {
+   std::apply([](_MutexTypes&... __m) {
+ char __i[] __attribute__((__unused__)) = { (__m.unlock(), 0)... };
+   }, _M_devices);
+  }
+
+  scoped_lock(const scoped_lock&) = delete;
+  scoped_lock& operator=(const scoped_lock&) = delete;
+
+private:
+  tuple<_MutexTypes&...> _M_devices;
+};
+
+  template<>
+class scoped_lock<>
+{
+public:
+  explicit scoped_lock() = default;
+  explicit scoped_lock(adopt_lock_t) noexcept { }
+  ~scoped_lock() = default;
+
+  scoped_lock(const scoped_lock&) = delete;
+  scoped_lock& operator=(const scoped_lock&) = delete;
+};
+
+  template<typename _Mutex>
+class scoped_lock<_Mutex>
+{
+public:
+  using mutex_type = _Mutex;
+
+  explicit scoped_lock(mutex_type& __m) : _M_device(__m)
+  { _M_device.lock(); }
+
+  explicit scoped_lock(mutex_type& __m, adopt_lock_t) noexcept
+  : _M_device(__m)
+  { } // calling thread owns mutex
+
+  ~scoped_lock()
+  { _M_device.unlock(); }
+
+  scoped_lock(const scoped_lock&) = delete;
+  scoped_lock& operator=(const scoped_lock&) = delete;
+
+private:
+  mutex_type&  _M_device;
+};
+#endif // C++17
+
 #ifdef _GLIBCXX_HAS_GTHREADS
   /// once_flag
   struct once_flag
diff --git a/libstdc++-v3/testsuite/30_threads/scoped_lock/cons/1.cc 
b/libstdc++-v3/testsuite/30_threads/scoped_lock/cons/1.cc
new file mode 100644
index 000..9f1b48c
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/scoped_lock/cons/1.cc
@@ -0,0 +1,133 @@
+// { dg-options "-std=gnu++17" }
+// { dg-do run { target c++1z } }
+// { dg-require-cstdint "" }
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+

Re: ARC options documentation questions

2017-03-05 Thread Claudiu Zissulescu
Hi,

It looks good, please go ahead and commit your changes.

Thank you for your contribution,
Claudiu

On Wed, Mar 1, 2017 at 5:35 AM, Sandra Loosemore
 wrote:
> On 02/24/2017 12:20 PM, Claudiu Zissulescu wrote:
>>
>> Hi,
>>
>> Indeed, we are not up to speed regarding updating and cleaning the
>> documentation.
>>
>> On 12/02/2017 05:18, Sandra Loosemore wrote:
>>>
>>> I noticed a bunch of copy-editing issues in the "ARC Options" section of
>>> invoke.texi.  I'm willing to take a stab at fixing them, but I need some
>>> technical assistance since I'm not familiar with the details of this
>>> architecture myself.
>>>
>>> * In e.g. "Compile for ARC 600 cpu with norm instruction enabled." is
>>> "norm" literally the name of an instruction, GCC implementor jargon, or
>>> a term that is used and capitalized like that in the processor
>>> documentation?  Ditto for "mul32x16", "mul64", "LR", "SR", "mpy", "mac",
>>> "mulu64", "swap", "DIV/REM", "MPY", "MPYU", "MPYW", "MPYUW", "MPY_S",
>>> "MPYM", "MPYMU".  For other targets, literal names of instructions are
>>> usually marked up with @code{}, and it would be good to be consistent
>>
>>
>> All those names are additional instructions support which are not
>> available in the base ARC configurations. Indeed, we should be
>> consistent here.
>>
>>> * In "FPX: Generate Double Precision FPX instructions", is "Double
>>> Precision FPX" a proper name literally capitalized like that, or is this
>>> a mistake for "double-precision FPX instructions"?  Likewise for "Single
>>> Precision FPX"?
>>
>>
>> It is a mistake, we should use lower letters.
>>
>>>
>>> * In e.g. the discussion of fpuda_div, is "simple precision" a typo for
>>> "single precision"?  Likewise is "multiple and add" a typo for "multiply
>>> and add"?
>>>
>> Here are typos.
>
>
> Thanks for the additional clarifications.
>
> I've committed the attached patch, which has a few more cleanups beyond the
> version I posted a couple weeks ago.  It's not perfect, but I think it's at
> least an incremental improvement overall.
>
> -Sandra
>


Re: [PATCH, Fortran, Coarray, v1] Add support for failed images

2017-03-05 Thread Andre Vehreschild
Hi Jerry,

thanks for seconding my read of the standard and reviewing so quickly.
Committed as r245900.

Regards,
Andre

On Sat, 4 Mar 2017 15:06:25 -0800
Jerry DeLisle  wrote:

> On 03/04/2017 09:58 AM, Andre Vehreschild wrote:
> > Hi all,
> > 
> > attached patch polishes the one begun by Alessandro. It adds documentation
> > and fixes the style issues. Furthermore did I try to interpret the standard
> > according to the FAIL IMAGE statement. IMHO should it just quit the
> > executable without any error code. The caf_single library emits "FAIL
> > IMAGE" to stderr, while in coarray=single mode it just quits. What do you
> > think?
> > 
> > Bootstraps and regtests ok on x86_64-linux/f25. Ok for trunk? (May be
> > later).
> > 
> > Gruß,
> > Andre
> >   
> 
> From my read:
> 
> "A failed image is usually associated with a hardware failure of the
> processor, memory system, or interconnection network"
> 
> Since the FAIL IMAGE statement is intended to simulate such failure, I agree
> with your interpretation as well, it just stops execution.
> 
> Yes OK for trunk now.
> 
> Jerry


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 245899)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,51 @@
+2017-03-05  Andre Vehreschild  ,
+	Alessandro Fanfarillo  
+
+	* check.c (positive_check): Add new function checking constant for
+	being greater than zero.
+	(gfc_check_image_status): Add checking of image_status arguments.
+	(gfc_check_failed_or_stopped_images): Same but for failed_- and
+	stopped_images function.
+	* dump-parse-tree.c (show_code_node): Added output of FAIL IMAGE.
+	* gfortran.h (enum gfc_statement): Added FAIL_IMAGE_ST.
+	(enum gfc_isym_id): Added new intrinsic symbols.
+	(enum gfc_exec_op): Added EXEC_FAIL_IMAGE.
+	* gfortran.texi: Added description for the new API functions. Updated
+	coverage of gfortran of TS18508.
+	* intrinsic.c (add_functions): Added symbols to resolve new intrinsic
+	functions. 
+	* intrinsic.h: Added prototypes.
+	* iresolve.c (gfc_resolve_failed_images): Resolve the failed_images
+	intrinsic.
+	(gfc_resolve_image_status): Same for image_status.
+	(gfc_resolve_stopped_images): Same for stopped_images.
+	* libgfortran.h: Added prototypes.
+	* match.c (gfc_match_if): Added matching of FAIL IMAGE statement.
+	(gfc_match_fail_image): Match a FAIL IMAGE statement.
+	* match.h: Added prototype.
+	* parse.c (decode_statement): Added matching for FAIL IMAGE.
+	(next_statement): Same.
+	(gfc_ascii_statement): Same.
+	* resolve.c: Same.
+	* simplify.c (gfc_simplify_failed_or_stopped_images): For COARRAY=
+	single a constant result can be returned.
+	(gfc_simplify_image_status): For COARRAY=single the result is constant.
+	* st.c (gfc_free_statement): Added FAIL_IMAGE handling.
+	* trans-decl.c (gfc_build_builtin_function_decls): Added decls of the
+	new intrinsics.
+	* trans-expr.c (gfc_conv_procedure_call): This is first time all
+	arguments of a function are optional, which is now handled here
+	correctly.
+	* trans-intrinsic.c (conv_intrinsic_image_status): Translate
+	image_status.
+	(gfc_conv_intrinsic_function): Add support for image_status.
+	(gfc_is_intrinsic_libcall): Add support for the remaining new
+	intrinsics.
+	* trans-stmt.c (gfc_trans_fail_image): Trans a fail image.
+	* trans-stmt.h: Add the prototype for the above.
+	* trans.c (trans_code): Dispatch for fail_image.
+	* trans.h: Add the trees for the new intrinsics.
+
 2017-03-03  Jerry DeLisle  
 
 	PR fortran/79841
Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c	(Revision 245899)
+++ gcc/fortran/check.c	(Arbeitskopie)
@@ -295,6 +295,29 @@
 }
 
 
+/* If expr is a constant, then check to ensure that it is greater than zero.  */
+
+static bool
+positive_check (int n, gfc_expr *expr)
+{
+  int i;
+
+  if (expr->expr_type == EXPR_CONSTANT)
+{
+  gfc_extract_int (expr, &i);
+  if (i <= 0)
+	{
+	  gfc_error ("%qs argument of %qs intrinsic at %L must be positive",
+		 gfc_current_intrinsic_arg[n]->name, gfc_current_intrinsic,
+		 &expr->where);
+	  return false;
+	}
+}
+
+  return true;
+}
+
+
 /* If expr2 is constant, then check that the value is less than
(less than or equal to, if 'or_equal' is true) bit_size(expr1).  */
 
@@ -1138,6 +1161,60 @@
 
 
 bool
+gfc_check_image_status (gfc_expr *image, gfc_expr *team)
+{
+  /* IMAGE has to be a positive, scalar integer.  */
+  if (!type_check (image, 0, BT_INTEGER) || !scalar_check (image, 0)
+  || !positive_check (0, image))
+return false;
+
+  if (team)
+{
+  gfc_error ("%qs argument of %qs intrinsic at %L not yet supported",
+		 gfc_current_intrinsic_arg[1]->name, gfc_current_intrinsic,
+		 &team->where);
+  return false;
+}