date:20160118

Re: [PATCH 1/9] gensupport: Fix define_subst operand renumbering.

2016-01-18 Thread Bernd Schmidt


On 01/14/2016 05:33 PM, Andreas Krebbel wrote:

When processing substitutions the operands are renumbered.  To find a
free operand number the array used_operands_numbers is used to record
the operand numbers already in use.  Currently this array is used to
assign new numbers *before* all the RTXes in the vector have been
processed.
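To illustrate the ordering problem described above, here is a minimal sketch, not the gensupport.c code. Pattern, take_first_unused and renumber are hypothetical names invented for this example. The point is only that every number already in use anywhere in the vector is marked first, and fresh numbers are handed out afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one expression in the vector: the operand
// numbers it already uses, and how many fresh numbers it still needs.
struct Pattern {
  std::vector<int> used;  // numbers already taken (assumed non-negative)
  int fresh_needed;       // how many new numbers must be assigned
};

// Return the lowest number not yet marked in in_use, and mark it.
static int take_first_unused(std::vector<bool>& in_use) {
  for (std::size_t i = 0; i < in_use.size(); ++i)
    if (!in_use[i]) {
      in_use[i] = true;
      return static_cast<int>(i);
    }
  in_use.push_back(true);
  return static_cast<int>(in_use.size()) - 1;
}

// Pass 1 marks every number used anywhere in the vector.  Only then does
// pass 2 hand out fresh numbers, so a number used by a later expression
// is never given away while processing an earlier one.
std::vector<std::vector<int>> renumber(const std::vector<Pattern>& pats) {
  std::vector<bool> in_use;
  for (const Pattern& p : pats)
    for (int n : p.used) {
      if (static_cast<std::size_t>(n) >= in_use.size())
        in_use.resize(n + 1, false);
      in_use[n] = true;
    }

  std::vector<std::vector<int>> assigned;
  for (const Pattern& p : pats) {
    std::vector<int> fresh;
    for (int k = 0; k < p.fresh_needed; ++k)
      fresh.push_back(take_first_unused(in_use));
    assigned.push_back(fresh);
  }
  return assigned;
}
```

With two expressions that use operand 0 and operand 1 respectively, an interleaved mark-and-assign loop would hand operand 1 to the first expression before the second one had been looked at. The split loop avoids that collision.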



* gensupport.c (process_substs_on_one_elem): Split loop to
complete mark_operands_used_in_match_dup on all expressions in the
vector first.
(adjust_operands_numbers): Inline into process_substs_on_one_elem
and remove function.


Mostly ok, I think. As an aside, all the define_subst stuff in 
gensupport looks rather suspiciously clunky and the comments are in 
broken English. We should fix this stuff at some point.



@@ -1976,6 +1986,14 @@ find_first_unused_number_of_operand ()
 It visits all expressions in PATTERN and assigns not-occupied
 operand indexes to MATCH_OPERANDs and MATCH_OPERATORs of this
 PATTERN.  */
+/* If output pattern of define_subst contains MATCH_DUP, then this
+   expression would be replaced with the pattern, matched with
+   MATCH_OPERAND from input pattern.  This pattern could contain any
+   number of MATCH_OPERANDs, MATCH_OPERATORs etc., so it's possible
+   that a MATCH_OPERAND from output_pattern (if any) would have the
+   same number, as MATCH_OPERAND from copied pattern.  To avoid such
+   indexes overlapping, we assign new indexes to MATCH_OPERANDs,
+   laying in the output pattern outside of MATCH_DUPs.  */
  static void
  renumerate_operands_in_pattern (rtx pattern)
  {


If you want to keep this comment, you might want to move it inside the 
function (or into the caller). Ok with or without any such change - this 
looks a bit weird but I don't know what's best.



Bernd

Re: [PATCH PR68542]

2016-01-18 Thread Yuri Rumyantsev

Richard,

Here is the second part of the patch, which actually moves mask stores and
all statements related to them into a new basic block guarded by a test
for a zero mask. A new test is also added.

Is it OK for trunk?

Thanks.
Yuri.

2016-01-18  Yuri Rumyantsev  

PR middle-end/68542
* config/i386/i386.c (ix86_expand_branch): Implement integral vector
comparison with boolean result.
* config/i386/sse.md (define_expand "cbranch4): Add define-expand
for vector comparison with eq/ne only.
* tree-vect-loop.c (is_valid_sink): New function.
(optimize_mask_stores): Likewise.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimize_mask_stores for
vectorized loops having masked stores.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
corresponding macros.
(optimize_mask_stores): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-mask-store-move-1.c: New test.

2016-01-18 17:07 GMT+03:00 Richard Biener :
> On Mon, Jan 18, 2016 at 3:02 PM, Yuri Rumyantsev  wrote:
>> Thanks Richard.
>>
>> I changed the check on type as you proposed.
>>
>> What about the second back-end part of patch (it has been sent 08.12.15).
>
> Can't see it in my inbox - can you reply to the mail with a ping?
>
> Thanks,
> Richard.
>
>> Thanks.
>> Yuri.
>>
>> 2016-01-18 15:44 GMT+03:00 Richard Biener :
>>> On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev  
>>> wrote:
>>>> Hi Richard,
>>>>
>>>> Did you have any chance to look at updated patch?
>>>
>>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>>> index acbb70b..208a752 100644
>>> --- a/gcc/tree-vrp.c
>>> +++ b/gcc/tree-vrp.c
>>> @@ -5771,6 +5771,10 @@ register_edge_assert_for (tree name, edge e,
>>> gimple_stmt_iterator si,
>>> &comp_code, &val))
>>>  return;
>>>
>>> +  /* VRP doesn't track ranges for vector types.  */
>>> +  if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE)
>>> +return;
>>> +
>>>
>>> please instead fix extract_code_and_val_from_cond_with_ops with
>>>
>>> Index: gcc/tree-vrp.c
>>> ===
>>> --- gcc/tree-vrp.c  (revision 232506)
>>> +++ gcc/tree-vrp.c  (working copy)
>>> @@ -5067,8 +5067,9 @@ extract_code_and_val_from_cond_with_ops
>>>if (invert)
>>>  comp_code = invert_tree_comparison (comp_code, 0);
>>>
>>> -  /* VRP does not handle float types.  */
>>> -  if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (val)))
>>> +  /* VRP only handles integral and pointer types.  */
>>> +  if (! INTEGRAL_TYPE_P (TREE_TYPE (val))
>>> +  && ! POINTER_TYPE_P (TREE_TYPE (val)))
>>>  return false;
>>>
>>>/* Do not register always-false predicates.
>>>
>>> Ok with that change.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks.
>>>> Yuri.
>>>>
>>>> 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev :
>>>>> Hi Richard,
>>>>>
>>>>> Here is updated patch for middle-end part of the whole patch which
>>>>> fixes all your remarks I hope.
>>>>>
>>>>> Regression testing and bootstrapping did not show any new failures.
>>>>> Is it OK for trunk?
>>>>>
>>>>> Yuri.
>>>>>
>>>>> ChangeLog:
>>>>> 2015-12-18  Yuri Rumyantsev  
>>>>>
>>>>> PR middle-end/68542
>>>>> * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case
>>>>> of mixed vector and scalar types.
>>>>> (fold_relational_const): Add handling of vector
>>>>> comparison with boolean result.
>>>>> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
>>>>> comparison of vector operands with boolean result for EQ/NE only.
>>>>> (verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
>>>>> (verify_gimple_cond): Likewise.
>>>>> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform


PR68542.patch
Description: Binary data

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Christophe Lyon

On 18 January 2016 at 14:12, Kyrill Tkachov  wrote:
> Hi Alan,
>
>
> On 18/01/16 12:14, Alan Lawrence wrote:
>>
>> This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
>> check_effective_target_arm_neon_fp_16_ok.
>>
>> At present, without the expander, moving v4hf/v8hf values around is done
>> via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On
>> arm-*,
>> moving via two subregs is less efficient than one native insn!)
>>
>> However, adding the expanders, reveals a latent bug in the V4HF variant of
>> *neon_mov, that vector constants are not handled properly in the
>> neon_valid_immediate code. Hence, for now I've used a separate expander
>> that
>> disallows immediates, and disabled VnHF vectors as immediates in
>> neon_valid_immediate_for_move; I'll file a PR for this.
>>
>> Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector
>> modes to
>> the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
>> identical HF elements is now done by loading the scalar + vdup rather than
>> forcing the vector out to the constant pool.
>>
>> On armeb, one of the ICEs this fixes, is in the test program for
>> check_effective_target_arm_neon_fp_16_ok. This means the
>> advsimd-intrinsics
>> vcvt_f16 test now runs (and passes), and also that the other tests now run
>> with neon-fp16, rather than only neon as previously (on armeb).
>> This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
>> failing. Since those tests would previously have failed *if fp16 had been
>> passed in*, I think this is still a step forward; one can still run the
>> tests
>> with an explicit non-fp16 multilib if the old behaviour is desired.
>>
>> Note the previous patch removes other uses of VQXMOV (not strictly a
>> dependency,
>> generating V4HF/V8HF reinterpret patterns is harmless, they just aren't
>> used).
>>
>> Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
>> cross-tested armeb-none-eabi.
>
>
> Seems that you and Christophe have some duplicated work here?
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01031.html
>

Indeed, that's unfortunate :-(
I had filed PR 68620 to avoid duplication.

Both patches look similar, fortunately.

Christophe.


> Kyrill
>
>
>> gcc/ChangeLog:
>>
>> * config/arm/arm.c (neon_valid_immediate): Disallow vectors of
>> HFmode.
>> * config/arm/iterators.md (V_HF): New.
>> (VQXMOV): Add V8HF.
>> (VX): Add V4HF, V8HF.
>> (V_reg): Add cases for V4HF, V8HF.
>> * config/arm/vec-common.md (mov V_HF): New.
>> ---
>>   gcc/config/arm/arm.c |  2 ++
>>   gcc/config/arm/iterators.md  |  8 ++--
>>   gcc/config/arm/vec-common.md | 20 
>>   3 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 3276b03..4fdba38 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode,
>> int inverse,
>> /* Vectors of float constants.  */
>> if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
>>   {
>> +  if (GET_MODE_INNER (mode) == HFmode)
>> +   return -1;
>> rtx el0 = CONST_VECTOR_ELT (op, 0);
>> const REAL_VALUE_TYPE *r0;
>>   diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
>> index 974cf51..c5db868 100644
>> --- a/gcc/config/arm/iterators.md
>> +++ b/gcc/config/arm/iterators.md
>> @@ -59,6 +59,9 @@
>>   ;; Integer and float modes supported by Neon and IWMMXT.
>>   (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI
>> V4SF])
>>   +;; Vectors of half-precision floats.
>> +(define_mode_iterator V_HF [V4HF V8HF])
>> +
>>   ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
>>   (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
>>   @@ -99,7 +102,7 @@
>>   (define_mode_iterator VQI [V16QI V8HI V4SI])
>> ;; Quad-width vector modes, with TImode added, for moves.
>> -(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
>> +(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
>> ;; Opaque structure types wider than TImode.
>>   (define_mode_iterator VSTRUCT [EI OI CI XI])
>> @@ -160,7 +163,7 @@
>>   (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
>> ;; Modes with 8-bit and 16-bit elements.
>> -(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
>> +(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
>> ;; Modes with 8-bit elements.
>>   (define_mode_iterator VE [V8QI V16QI])
>> @@ -428,6 +431,7 @@
>>   ;; Register width from element mode
>>   (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
>>(V4HI "P") (V8HI  "q")
>> +(V4HF "P") (V8HF  "q")
>>(V2SI "P") (V4SI  "q")
>>(V2SF "P") (V4SF  "q")
>>(DI   "P") (V2DI  "q")
>>

C++ PATCH for c++/68586 (rejects-valid with enum in C++11)

2016-01-18 Thread Marek Polacek

In this PR, we find ourselves in a curious situation.  When parsing this enum:

  enum E { x = 1, y = x << 1 };

we process the LSHIFT_EXPR in cp_build_binary_op and call 
fold_non_dependent_expr
on each of the operands.  Then fold_non_dependent_expr calls 
maybe_constant_value
which, for CONST_DECL x, sticks 1 of INTEGER_TYPE to the cache.  But, as 
explained
in finish_enum_value_list:

  /* [dcl.enum]: Following the closing brace of an enum-specifier,
 each enumerator has the type of its enumeration.  Prior to the
 closing brace, the type of each enumerator is the type of its
 initializing value.  */

the type of CONST_DECL x will be different after the whole enumerator-specifier
has been parsed.  This discrepancy can cause problems down the line, as seen in
the testcase.  (It's standard_conversion that says that integer -> unscoped enum
conversion is bad.)

That's why I think we shouldn't put CONST_DECLs into the cache while parsing an
enum.  I feel uneasy about the new check, but I couldn't find anything better;
TYPE_BEING_DEFINED is for classes only, and COMPLETE_TYPE_P won't work here.
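For readers who want to see the [dcl.enum] rule quoted above in isolation, here is a small sketch. It only demonstrates that an enumerator has one type before the closing brace and another after it. It is not a reproducer for the PR, and value_of_y is a helper made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Inside the braces, x still has the type of its initializer (int here),
// so the conditional below selects 10.
enum E {
  x = 1,
  y = std::is_same<decltype(x), int>::value ? 10 : 20
};

// After the closing brace, the same enumerator has the enumeration's type.
static_assert(std::is_same<decltype(x), E>::value,
              "x has type E once the enum is complete");

// Hypothetical helper so the value of y can be inspected at run time.
int value_of_y() { return y; }
```

Any result computed for x while the enumerator-list is still being parsed therefore carries the initializer's type rather than E, which is the discrepancy the cache ends up remembering.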

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-01-18  Marek Polacek  

PR c++/68586
* constexpr.c (maybe_constant_value): Don't put enumerators into the
cv_cache while parsing an enumerator-list.

* g++.dg/cpp0x/enum30.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 6ab4696..d52005b 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -4022,7 +4022,13 @@ maybe_constant_value (tree t, tree decl)
   if (!ret)
 {
   ret = maybe_constant_value_1 (t, decl);
-  cv_cache.put (t, ret);
+  /* While parsing an enumerator-list, the type of each enumerator
+is the type of its initializing value.  After the parsing has
+been done, it will have the type of its enumeration.  And this
+discrepancy could get us in trouble later.  */
+  if (TREE_CODE (t) != CONST_DECL
+ || TREE_TYPE (t) == DECL_CONTEXT (t))
+   cv_cache.put (t, ret);
 }
   return ret;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/enum30.C 
gcc/testsuite/g++.dg/cpp0x/enum30.C
index e69de29..b9bdfd4 100644
--- gcc/testsuite/g++.dg/cpp0x/enum30.C
+++ gcc/testsuite/g++.dg/cpp0x/enum30.C
@@ -0,0 +1,14 @@
+// PR c++/68586
+// { dg-do compile { target c++11 } }
+
+enum E { x = 1, y = x << 1 };
+template struct A {};
+A a;
+
+enum E2 : int { x2 = 1, y2 = x2 << 1 };
+template struct A2 {};
+A2 a2;
+
+enum class E3 { x3 = 1, y3 = x3 << 1 };
+template struct A3 {};
+A3 a3;

Marek

Re: C++ PATCH for c++/68586 (rejects-valid with enum in C++11)

2016-01-18 Thread Jason Merrill

This wouldn't cover cases where this change affects the type or value of 
more complicated expressions, so my preference would be to clear the 
caches when we finish_enum_value_list.


Jason

Re: genattrab.c generate switch

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 03:15:08PM +0100, Bernd Schmidt wrote:
> Secondly, we're currently in a development phase where we only accept bug
> fixes for gcc-6. You should resubmit/ping the patch once stage1 opens again.

I think this is a bug fix, it is a workaround for a broken compiler that
some people use as system compiler to bootstrap gcc.

Jakub

[PING^2][PATCH, 3/16] Ignore reduction clause on kernels directive

2016-01-18 Thread Tom de Vries


On 24/11/15 13:21, Tom de Vries wrote:

On 09/11/15 16:50, Tom de Vries wrote:

On 09/11/15 16:35, Tom de Vries wrote:

Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

  1  Insert new exit block only when needed in
     transform_to_exit_first_loop_alt
  2  Make create_parallel_loop return void
  3  Ignore reduction clause on kernels directive
  4  Implement -foffload-alias
  5  Add in_oacc_kernels_region in struct loop
  6  Add pass_oacc_kernels
  7  Add pass_dominator_oacc_kernels
  8  Add pass_ch_oacc_kernels
  9  Add pass_parallelize_loops_oacc_kernels
 10  Add pass_oacc_kernels pass group in passes.def
 11  Update testcases after adding kernels pass group
 12  Handle acc loop directive
 13  Add c-c++-common/goacc/kernels-*.c
 14  Add gfortran.dg/goacc/kernels-*.f95
 15  Add libgomp.oacc-c-c++-common/kernels-*.c
 16  Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Build and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.


As discussed here (
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00785.html ), the kernels
directive does not allow the reduction clause.  This patch fixes that.





Ping^2.

Thanks,
- Tom

[PATCH] Fix PR69297

2016-01-18 Thread Richard Biener


The following patch fixes us miscounting the number of scalar
instructions for BB vectorization leading to vectorizations that
are not profitable.

A simple fix is to count each scalar stmt at most once.
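As a rough illustration of "count each scalar stmt at most once", here is a sketch under simplifying assumptions. Node, scalar_cost and total_scalar_cost are hypothetical names, statements are plain integer ids, and every statement costs one unit. The patch itself uses the gimple visited flag and clears it afterwards; this sketch uses a local set instead, so there is nothing to reset.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an SLP node: it references scalar statements
// by id, and one statement may be referenced from several nodes.
struct Node {
  std::vector<int> stmts;
  std::vector<const Node*> children;
};

// Add a unit cost per scalar statement, skipping statements already seen.
static int scalar_cost(const Node& n, std::unordered_set<int>& visited) {
  int cost = 0;
  for (int s : n.stmts)
    if (visited.insert(s).second)  // true only on the first visit
      cost += 1;
  for (const Node* c : n.children)
    cost += scalar_cost(*c, visited);
  return cost;
}

// Entry point: the visited set is local, so no flag needs clearing later.
int total_scalar_cost(const Node& root) {
  std::unordered_set<int> visited;
  return scalar_cost(root, visited);
}
```

Without the visited check, a statement shared between nodes inflates the scalar cost, which makes the vector version look cheaper by comparison than it really is.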

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-01-18  Richard Biener  

PR tree-optimization/69297
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Count each scalar
stmt at most once.
(vect_bb_vectorization_profitable_p): Clear visited flag again.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 232496)
--- gcc/tree-vect-slp.c (working copy)
*** vect_bb_slp_scalar_cost (basic_block bb,
*** 2409,2414 
--- 2409,2419 
if ((*life)[i])
continue;
  
+   /* Count scalar stmts only once.  */
+   if (gimple_visited_p (stmt))
+   continue;
+   gimple_set_visited (stmt, true);
+ 
stmt_info = vinfo_for_stmt (stmt);
if (STMT_VINFO_DATA_REF (stmt_info))
  {
*** vect_bb_vectorization_profitable_p (bb_v
*** 2451,2456 
--- 2456,2466 
  );
  }
  
+   /* Unset visited flag.  */
+   for (gimple_stmt_iterator gsi = bb_vinfo->region_begin;
+gsi_stmt (gsi) != gsi_stmt (bb_vinfo->region_end); gsi_next (&gsi))
+ gimple_set_visited  (gsi_stmt (gsi), false);
+ 
/* Complete the target-specific cost calculation.  */
finish_cost (BB_VINFO_TARGET_COST_DATA (bb_vinfo), &vec_prologue_cost,
   &vec_inside_cost, &vec_epilogue_cost);
Index: gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr69297.c
===
*** gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr69297.c  
(revision 0)
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr69297.c  
(working copy)
***
*** 0 
--- 1,83 
+ /* { dg-do compile } */
+ /* { dg-additional-options "-march=core-avx2 -fdump-tree-slp-details" } */
+ 
+ #define abs(x) (x) < 0 ? -(x) : (x)
+ int
+ foo (int* diff)
+ {
+   int k, satd = 0, m[16], d[16];
+   
+ m[ 0] = diff[ 0] + diff[12];
+ m[ 4] = diff[ 4] + diff[ 8];
+ m[ 8] = diff[ 4] - diff[ 8];
+ m[12] = diff[ 0] - diff[12];
+ m[ 1] = diff[ 1] + diff[13];
+ m[ 5] = diff[ 5] + diff[ 9];
+ m[ 9] = diff[ 5] - diff[ 9];
+ m[13] = diff[ 1] - diff[13];
+ m[ 2] = diff[ 2] + diff[14];
+ m[ 6] = diff[ 6] + diff[10];
+ m[10] = diff[ 6] - diff[10];
+ m[14] = diff[ 2] - diff[14];
+ m[ 3] = diff[ 3] + diff[15];
+ m[ 7] = diff[ 7] + diff[11];
+ m[11] = diff[ 7] - diff[11];
+ m[15] = diff[ 3] - diff[15];
+ 
+ d[ 0] = m[ 0] + m[ 4];
+ d[ 8] = m[ 0] - m[ 4];
+ d[ 4] = m[ 8] + m[12];
+ d[12] = m[12] - m[ 8];
+ d[ 1] = m[ 1] + m[ 5];
+ d[ 9] = m[ 1] - m[ 5];
+ d[ 5] = m[ 9] + m[13];
+ d[13] = m[13] - m[ 9];
+ d[ 2] = m[ 2] + m[ 6];
+ d[10] = m[ 2] - m[ 6];
+ d[ 6] = m[10] + m[14];
+ d[14] = m[14] - m[10];
+ d[ 3] = m[ 3] + m[ 7];
+ d[11] = m[ 3] - m[ 7];
+ d[ 7] = m[11] + m[15];
+ d[15] = m[15] - m[11];
+ 
+ m[ 0] = d[ 0] + d[ 3];
+ m[ 1] = d[ 1] + d[ 2];
+ m[ 2] = d[ 1] - d[ 2];
+ m[ 3] = d[ 0] - d[ 3];
+ m[ 4] = d[ 4] + d[ 7];
+ m[ 5] = d[ 5] + d[ 6];
+ m[ 6] = d[ 5] - d[ 6];
+ m[ 7] = d[ 4] - d[ 7];
+ m[ 8] = d[ 8] + d[11];
+ m[ 9] = d[ 9] + d[10];
+ m[10] = d[ 9] - d[10];
+ m[11] = d[ 8] - d[11];
+ m[12] = d[12] + d[15];
+ m[13] = d[13] + d[14];
+ m[14] = d[13] - d[14];
+ m[15] = d[12] - d[15];
+ 
+ d[ 0] = m[ 0] + m[ 1];
+ d[ 1] = m[ 0] - m[ 1];
+ d[ 2] = m[ 2] + m[ 3];
+ d[ 3] = m[ 3] - m[ 2];
+ d[ 4] = m[ 4] + m[ 5];
+ d[ 5] = m[ 4] - m[ 5];
+ d[ 6] = m[ 6] + m[ 7];
+ d[ 7] = m[ 7] - m[ 6];
+ d[ 8] = m[ 8] + m[ 9];
+ d[ 9] = m[ 8] - m[ 9];
+ d[10] = m[10] + m[11];
+ d[11] = m[11] - m[10];
+ d[12] = m[12] + m[13];
+ d[13] = m[12] - m[13];
+ d[14] = m[14] + m[15];
+ d[15] = m[15] - m[14];
+ for (k=0; k<16; k++)
+   satd += abs(d[k]);
+   return satd;
+ }
+ 
+ /* { dg-final { scan-tree-dump "vectorization is not profitable" "slp1" } } */
+ /* { dg-final { scan-tree-dump-not "basic block vectorized" "slp1" } } */

Re: [PING^2][PATCH, 3/16] Ignore reduction clause on kernels directive

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 03:24:21PM +0100, Tom de Vries wrote:
> >>As discussed here (
> >>https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00785.html ), the kernels
> >>directive does not allow the reduction clause.  This patch fixes that.
> >>
> >
> 
> Ping^2.

Ok.

Jakub

[PING^2][PATCH, 12/16] Handle acc loop directive

2016-01-18 Thread Tom de Vries


On 24/11/15 13:26, Tom de Vries wrote:

On 09/11/15 21:06, Tom de Vries wrote:

On 09/11/15 16:35, Tom de Vries wrote:

Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

  1  Insert new exit block only when needed in
     transform_to_exit_first_loop_alt
  2  Make create_parallel_loop return void
  3  Ignore reduction clause on kernels directive
  4  Implement -foffload-alias
  5  Add in_oacc_kernels_region in struct loop
  6  Add pass_oacc_kernels
  7  Add pass_dominator_oacc_kernels
  8  Add pass_ch_oacc_kernels
  9  Add pass_parallelize_loops_oacc_kernels
 10  Add pass_oacc_kernels pass group in passes.def
 11  Update testcases after adding kernels pass group
 12  Handle acc loop directive
 13  Add c-c++-common/goacc/kernels-*.c
 14  Add gfortran.dg/goacc/kernels-*.f95
 15  Add libgomp.oacc-c-c++-common/kernels-*.c
 16  Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Build and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.


this patch deals with loops in an oacc kernels region which are
annotated using "#pragma acc loop". It expands such a loop as a normal
loop, which has the effect of ignoring the "#pragma acc loop".





Ping^2.

Thanks,
- Tom

Re: genattrab.c generate switch

2016-01-18 Thread Jesper Broge Jørgensen



On 18/01/16 15:15, Bernd Schmidt wrote:

On 01/13/2016 01:53 AM, Jesper Broge Jørgensen wrote:

genattrtab.c can generate if statements that have very deep bracket
nesting, causing clang to produce errors (when target=arm-none-eabi) as
explained at https://gcc.gnu.org/ml/gcc/2014-05/msg00032.html
At the above link it was suggested that genattrtab.c generate a switch
statement instead. I have made a patch that does just that.
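To make the difference concrete, here is a small hand-written sketch. The exact shape of the generated code is an assumption, and attr_type, latency_nested and latency_switch are hypothetical names. It shows a chain of alternatives where each one adds a bracket level, next to the same decision written as a switch.

```cpp
#include <cassert>

// Hypothetical attribute values, invented for this example.
enum attr_type { TYPE_A, TYPE_B, TYPE_C, TYPE_D };

// Assumed shape of the nested form: every extra alternative wraps the
// previous condition in one more pair of brackets.
int latency_nested(attr_type t) {
  if ((((t == TYPE_A) || (t == TYPE_B)) || (t == TYPE_C)))
    return 2;
  return 1;
}

// The same decision as a switch: alternatives become sibling case labels,
// so the nesting depth no longer grows with the number of alternatives.
int latency_switch(attr_type t) {
  switch (t) {
    case TYPE_A:
    case TYPE_B:
    case TYPE_C:
      return 2;
    default:
      return 1;
  }
}
```

With three alternatives the difference is cosmetic. It matters once the generated condition has hundreds of alternatives and the bracket depth exceeds what the bootstrap compiler accepts.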


Some preliminaries first - I don't see your name in existing 
ChangeLogs; am I correct in assuming you've not gone through the 
copyright assignment process?


Secondly, we're currently in a development phase where we only accept 
bug fixes for gcc-6. You should resubmit/ping the patch once stage1 
opens again.



2016-01-13  Jesper Broge Jørgensen 

 * genattrtab.c (check_attr_set_switch): implemented the function
 (write_attr_set): Check if expression can be written as a switch


Please review our coding and documentation standards. ChangeLog 
entries should be complete sentences (or sometimes brief short-hands: 
the first one should just be "New function.")



+static int check_attr_set_switch (FILE *outf, rtx exp,
+unsigned int attrs_cached, int write_cases, int
indent);


No reason to declare it if it is defined before its use.

+  while (1)
+  {


This and everything else here looks like it isn't following our 
indentation rules.



Bernd

No, I have not gone through copyright assignment.
This is my first time trying to contribute to a GNU project, so I have 
tried following "Contributing to GCC" at 
https://gcc.gnu.org/contribute.html
There I followed the advice to run the patch through 
contrib/check_GNU_style.sh and it came out clean. Maybe 
contrib/check_GNU_style.sh does not check indentation rules and/or my 
editor is set up wrongly, so it looked to me like I was following the 
coding standard.


I did not know you only accepted bug fixes, though one could argue that 
this fixes a (style) bug in generated code.

[Committed] Allow pass_parallelize_loops to be run outside the loop pipeline

2016-01-18 Thread Tom de Vries


[ was: Re: [PIING][PATCH, 9/16] Add pass_parallelize_loops_oacc_kernels ]

On 14/12/15 16:22, Richard Biener wrote:

Can the pass not just use a pass parameter to switch between oacc/non-oacc?


It can, and that means that parloops is run outside the loops pipeline. 
This patch enables that.


Bootstrapped and reg-tested on x86_64.

Committed to trunk.

Thanks,
- Tom

Allow pass_parallelize_loops to be run outside the loop pipeline

2016-01-18  Tom de Vries  

	* tree-parloops.c (pass_parallelize_loops::execute): Allow
	pass_parallelize_loops to be run outside the loop pipeline.

---
 gcc/tree-parloops.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 46d70ac..885103e 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2844,23 +2844,41 @@ public:
 unsigned
 pass_parallelize_loops::execute (function *fun)
 {
-  if (number_of_loops (fun) <= 1)
-return 0;
-
   tree nthreads = builtin_decl_explicit (BUILT_IN_OMP_GET_NUM_THREADS);
   if (nthreads == NULL_TREE)
 return 0;
 
+  bool in_loop_pipeline = scev_initialized_p ();
+  if (!in_loop_pipeline)
+loop_optimizer_init (LOOPS_NORMAL
+			 | LOOPS_HAVE_RECORDED_EXITS);
+
+  if (number_of_loops (fun) <= 1)
+return 0;
+
+  if (!in_loop_pipeline)
+{
+  rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
+  scev_initialize ();
+}
+
+  unsigned int todo = 0;
   if (parallelize_loops ())
 {
   fun->curr_properties &= ~(PROP_gimple_eomp);
 
   checking_verify_loop_structure ();
 
-  return TODO_update_ssa;
+  todo |= TODO_update_ssa;
+}
+
+  if (!in_loop_pipeline)
+{
+  scev_finalize ();
+  loop_optimizer_finalize ();
 }
 
-  return 0;
+  return todo;
 }
 
 } // anon namespace

[committed] Add oacc_kernels_p argument to pass_parallelize_loops

2016-01-18 Thread Tom de Vries


[was: Re: [PIING][PATCH, 9/16] Add pass_parallelize_loops_oacc_kernels ]

On 14/12/15 16:22, Richard Biener wrote:

On Sun, Dec 13, 2015 at 5:58 PM, Tom de Vries  wrote:

On 24/11/15 13:24, Tom de Vries wrote:


On 16/11/15 12:59, Tom de Vries wrote:


On 09/11/15 20:52, Tom de Vries wrote:


On 09/11/15 16:35, Tom de Vries wrote:


Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

   1  Insert new exit block only when needed in
      transform_to_exit_first_loop_alt
   2  Make create_parallel_loop return void
   3  Ignore reduction clause on kernels directive
   4  Implement -foffload-alias
   5  Add in_oacc_kernels_region in struct loop
   6  Add pass_oacc_kernels
   7  Add pass_dominator_oacc_kernels
   8  Add pass_ch_oacc_kernels
   9  Add pass_parallelize_loops_oacc_kernels
  10  Add pass_oacc_kernels pass group in passes.def
  11  Update testcases after adding kernels pass group
  12  Handle acc loop directive
  13  Add c-c++-common/goacc/kernels-*.c
  14  Add gfortran.dg/goacc/kernels-*.f95
  15  Add libgomp.oacc-c-c++-common/kernels-*.c
  16  Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Build and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.



This patch adds pass_parallelize_loops_oacc_kernels.

There are a number of things we do differently in parloops for oacc
kernels:
- in normal parloops, we generate code to choose between a parallel
version of the loop, and a sequential (low iteration count) version.
Since the code in oacc kernels region is supposed to run on the
accelerator anyway, we skip this check, and don't add a low iteration
count loop.
- in normal parloops, we generate an #pragma omp parallel /
GIMPLE_OMP_RETURN pair to delimit the region which we will split off
into a thread function. Since the oacc kernels region is already
split off, we don't add this pair.
- we indicate the parallelization factor by setting the oacc function
attributes
- we generate an #pragma oacc loop instead of an #pragma omp for, and
we add the gang clause
- in normal parloops, we rewrite the variable accesses in the loop
into accesses relative to a thread function parameter. For the
oacc kernels region, that rewrite has already been done at omp-lower,
so we skip this.
- we need to ensure that the entire kernels region can be run in
parallel. The loop independence check is already present, so for oacc
kernels we add a check between blocks outside the loop and the entire
region.
- we guard stores in the blocks outside the loop with gang_pos == 0.
There's no need for each gang to write to a single location, we can
do this in just one gang. (Typically this is the write of the final
value of the iteration variable if that one is copied back to the
host).



Reposting with loop optimizer init added in
pass_parallelize_loops_oacc_kernels::execute.



Reposting with loop_optimizer_finalize,scev_initialize and scev_finalize
   added in pass_parallelize_loops_oacc_kernels::execute.



Ping.

Anything I can do to facilitate the review?


Document new functions.


Done.

avoid if (1).

Done.


Ideally some refactoring would avoid some of the if (!oacc_kernels_p) spaghetti


Ack. For now, I've tried to minimize the number of oacc_kernels_p tests 
in the code.


Further suggestions on how to improve here are much appreciated.


but I'm considering tree-parloops.c (and its bugs) yours.


Ack.


Can the pass not just use a pass parameter to switch between oacc/non-oacc?



This patch introduces the pass parameter oacc_kernels_p (but does not 
instantiate an oacc_kernels_p == true pass version yet).
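As a rough picture of what switching on a pass parameter means, here is a deliberately simplified sketch. ParallelizeLoopsPass and adds_sequential_fallback are hypothetical and are not GCC's pass classes. The only behaviour modelled is the one described earlier in this thread: the oacc kernels variant skips the low iteration count fallback loop.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for a pass object.
class ParallelizeLoopsPass {
 public:
  ParallelizeLoopsPass() : oacc_kernels_p_(false) {}

  // Parameter slot 0 selects the oacc kernels variant of the pass.
  void set_pass_param(unsigned n, bool value) {
    assert(n == 0);
    oacc_kernels_p_ = value;
  }

  // Only the non-oacc variant adds the sequential (low iteration count)
  // version of the loop next to the parallel one.
  bool adds_sequential_fallback() const { return !oacc_kernels_p_; }

 private:
  bool oacc_kernels_p_;
};
```

One pass implementation can then be instantiated twice with different parameter values, instead of keeping a separate oacc-only copy of the pass.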


Bootstrapped and reg-tested on x86_64.

Committed to trunk.

Thanks,
- Tom

Add oacc_kernels_p argument to pass_parallelize_loops

2015-11-09  Tom de Vries  

	* omp-low.c (set_oacc_fn_attrib): Make extern.
	* omp-low.h (set_oacc_fn_attrib): Declare.
	* tree-parloops.c (struct reduction_info): Add reduc_addr field.
	(create_call_for_reduction_1): Handle case that reduc_addr is non-NULL.
	(create_parallel_loop, gen_parallel_loop, try_create_reduction_list):
	Add and handle function parameter oacc_kernels_p.
	(find_reduc_addr, get_omp_data_i_param): New function.
	(ref_conflicts_with_region, oacc_entry_exit_ok_1)
	(oacc_entry_exit_single_gang, oacc_entry_exit_ok): New function.
	(parallelize_loops): Add and handle

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Kyrill Tkachov


Hi Alan,

On 18/01/16 12:14, Alan Lawrence wrote:

This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
check_effective_target_arm_neon_fp_16_ok.

At present, without the expander, moving v4hf/v8hf values around is done
via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On arm-*,
moving via two subregs is less efficient than one native insn!)

However, adding the expanders, reveals a latent bug in the V4HF variant of
*neon_mov, that vector constants are not handled properly in the
neon_valid_immediate code. Hence, for now I've used a separate expander that
disallows immediates, and disabled VnHF vectors as immediates in
neon_valid_immediate_for_move; I'll file a PR for this.

Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector modes to
the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
identical HF elements is now done by loading the scalar + vdup rather than
forcing the vector out to the constant pool.

On armeb, one of the ICEs this fixes, is in the test program for
check_effective_target_arm_neon_fp_16_ok. This means the advsimd-intrinsics
vcvt_f16 test now runs (and passes), and also that the other tests now run
with neon-fp16, rather than only neon as previously (on armeb).
This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
failing. Since those tests would previously have failed *if fp16 had been
passed in*, I think this is still a step forward; one can still run the tests
with an explicit non-fp16 multilib if the old behaviour is desired.

Note the previous patch removes other uses of VQXMOV (not strictly a dependency,
generating V4HF/V8HF reinterpret patterns is harmless, they just aren't used).

Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.


Seems that you and Christophe have some duplicated work here?
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01031.html

Kyrill


gcc/ChangeLog:

* config/arm/arm.c (neon_valid_immediate): Disallow vectors of HFmode.
* config/arm/iterators.md (V_HF): New.
(VQXMOV): Add V8HF.
(VX): Add V4HF, V8HF.
(V_reg): Add cases for V4HF, V8HF.
* config/arm/vec-common.md (mov V_HF): New.
---
  gcc/config/arm/arm.c |  2 ++
  gcc/config/arm/iterators.md  |  8 ++--
  gcc/config/arm/vec-common.md | 20 
  3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3276b03..4fdba38 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
/* Vectors of float constants.  */
if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
  {
+  if (GET_MODE_INNER (mode) == HFmode)
+   return -1;
rtx el0 = CONST_VECTOR_ELT (op, 0);
const REAL_VALUE_TYPE *r0;
  
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 974cf51..c5db868 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -59,6 +59,9 @@
  ;; Integer and float modes supported by Neon and IWMMXT.
  (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
  
+;; Vectors of half-precision floats.
+(define_mode_iterator V_HF [V4HF V8HF])
+
  ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
  (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
  
@@ -99,7 +102,7 @@
  (define_mode_iterator VQI [V16QI V8HI V4SI])
  
  ;; Quad-width vector modes, with TImode added, for moves.
-(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
+(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
  
  ;; Opaque structure types wider than TImode.
  (define_mode_iterator VSTRUCT [EI OI CI XI])
@@ -160,7 +163,7 @@
  (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
  
  ;; Modes with 8-bit and 16-bit elements.
-(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
+(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
  
  ;; Modes with 8-bit elements.
  (define_mode_iterator VE [V8QI V16QI])
@@ -428,6 +431,7 @@
  ;; Register width from element mode
  (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
   (V4HI "P") (V8HI  "q")
+(V4HF "P") (V8HF  "q")
   (V2SI "P") (V4SI  "q")
   (V2SF "P") (V4SF  "q")
   (DI   "P") (V2DI  "q")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ce98f71..c27578a 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -38,6 +38,26 @@
  }
  })
  
+;; This exists separately from the above pattern to exclude an immediate RHS.
+
+(define_expand "mov"
+  [(set (match_operand:V_HF 0 "nonimmediate_operand" "")
+   (match_operand:V_HF 1 "nonimmediate_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (mode))"
+{
+  if

Re: [PATCH PR68542]

2016-01-18 Thread Yuri Rumyantsev

Thanks Richard.

I changed the check on type as you proposed.

What about the second, back-end part of the patch (it was sent on 08.12.15)?

Thanks.
Yuri.
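
For readers skimming the thread: the type check agreed on here (quoted in
full below) turns a blacklist ("not a scalar float") into a whitelist
("integral or pointer"). The following is only a minimal sketch of that
difference. The enum and both helper names are hypothetical stand-ins, not
GCC's tree codes or macros.

```c
#include <stdbool.h>

/* Hypothetical, much reduced model of the kinds of types a value can have.
   These are NOT GCC's tree codes.  */
enum type_kind
{
  TK_INTEGER, TK_BOOLEAN, TK_ENUM, TK_POINTER,
  TK_REAL, TK_VECTOR, TK_COMPLEX
};

/* Old style: reject only scalar floats.  Anything else slips through,
   including vector types.  */
static bool
old_check_accepts (enum type_kind k)
{
  return k != TK_REAL;
}

/* New style: accept only integral and pointer types, so any other kind of
   type is rejected without having to be listed.  */
static bool
new_check_accepts (enum type_kind k)
{
  bool integral = (k == TK_INTEGER || k == TK_BOOLEAN || k == TK_ENUM);
  return integral || k == TK_POINTER;
}
```

In this model a vector-typed value passes the old check but not the new
one, which is the case the patch needs to exclude.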

2016-01-18 15:44 GMT+03:00 Richard Biener :
> On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev  wrote:
>> Hi Richard,
>>
>> Did you have any chance to look at the updated patch?
>
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index acbb70b..208a752 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -5771,6 +5771,10 @@ register_edge_assert_for (tree name, edge e, gimple_stmt_iterator si,
>                                                 &comp_code, &val))
>  return;
>
> +  /* VRP doesn't track ranges for vector types.  */
> +  if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE)
> +return;
> +
>
> please instead fix extract_code_and_val_from_cond_with_ops with
>
> Index: gcc/tree-vrp.c
> ===
> --- gcc/tree-vrp.c  (revision 232506)
> +++ gcc/tree-vrp.c  (working copy)
> @@ -5067,8 +5067,9 @@ extract_code_and_val_from_cond_with_ops
>if (invert)
>  comp_code = invert_tree_comparison (comp_code, 0);
>
> -  /* VRP does not handle float types.  */
> -  if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (val)))
> +  /* VRP only handles integral and pointer types.  */
> +  if (! INTEGRAL_TYPE_P (TREE_TYPE (val))
> +  && ! POINTER_TYPE_P (TREE_TYPE (val)))
>  return false;
>
>/* Do not register always-false predicates.
>
> Ok with that change.
>
> Thanks,
> Richard.
>
>> Thanks.
>> Yuri.
>>
>> 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev :
>>> Hi Richard,
>>>
>>> Here is updated patch for middle-end part of the whole patch which
>>> fixes all your remarks I hope.
>>>
>>> Regression testing and bootstrapping did not show any new failures.
>>> Is it OK for trunk?
>>>
>>> Yuri.
>>>
>>> ChangeLog:
>>> 2015-12-18  Yuri Rumyantsev  
>>>
>>> PR middle-end/68542
>>> * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case
>>> of mixed vector and scalar types.
>>> (fold_relational_const): Add handling of vector
>>> comparison with boolean result.
>>> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
>>> comparison of vector operands with boolean result for EQ/NE only.
>>> (verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
>>> (verify_gimple_cond): Likewise.
>>> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform

[PING] genattrab.c generate switch

2016-01-18 Thread Jesper Broge Jørgensen


Ping patch:

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00784.html

thanks

[committed] Add pass_parallelize_loops to pass_oacc_kernels

2016-01-18 Thread Tom de Vries

[ was: Re: [committed] Add oacc_kernels_p argument to 
pass_parallelize_loops ]


On 18/01/16 14:07, Tom de Vries wrote:

[was: Re: [PIING][PATCH, 9/16] Add pass_parallelize_loops_oacc_kernels ]

On 14/12/15 16:22, Richard Biener wrote:

On Sun, Dec 13, 2015 at 5:58 PM, Tom de Vries 
wrote:

On 24/11/15 13:24, Tom de Vries wrote:


On 16/11/15 12:59, Tom de Vries wrote:


On 09/11/15 20:52, Tom de Vries wrote:


On 09/11/15 16:35, Tom de Vries wrote:


Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

   1Insert new exit block only when needed in
  transform_to_exit_first_loop_alt
   2Make create_parallel_loop return void
   3Ignore reduction clause on kernels directive
   4Implement -foffload-alias
   5Add in_oacc_kernels_region in struct loop
   6Add pass_oacc_kernels
   7Add pass_dominator_oacc_kernels
   8Add pass_ch_oacc_kernels
   9Add pass_parallelize_loops_oacc_kernels
  10Add pass_oacc_kernels pass group in passes.def
  11Update testcases after adding kernels pass group
  12Handle acc loop directive
  13Add c-c++-common/goacc/kernels-*.c
  14Add gfortran.dg/goacc/kernels-*.f95
  15Add libgomp.oacc-c-c++-common/kernels-*.c
  16Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.



This patch adds pass_parallelize_loops_oacc_kernels.

There are a number of things we do differently in parloops for oacc
kernels:
- in normal parloops, we generate code to choose between a parallel
  version of the loop and a sequential (low iteration count) version.
  Since the code in an oacc kernels region is supposed to run on the
  accelerator anyway, we skip this check, and don't add a low iteration
  count loop.
- in normal parloops, we generate a #pragma omp parallel /
  GIMPLE_OMP_RETURN pair to delimit the region which we will split off
  into a thread function. Since the oacc kernels region is already
  split off, we don't add this pair.
- we indicate the parallelization factor by setting the oacc function
  attributes.
- we generate a #pragma oacc loop instead of a #pragma omp for, and
  we add the gang clause.
- in normal parloops, we rewrite the variable accesses in the loop
  into accesses relative to a thread function parameter. For the
  oacc kernels region, that rewrite has already been done at omp-lower,
  so we skip this.
- we need to ensure that the entire kernels region can be run in
  parallel. The loop independence check is already present, so for oacc
  kernels we add a check between blocks outside the loop and the entire
  region.
- we guard stores in the blocks outside the loop with gang_pos == 0.
  There's no need for each gang to write to a single location; we can
  do this in just one gang. (Typically this is the write of the final
  value of the iteration variable if that one is copied back to the
  host.)
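
The last point in the list above (guarding stores outside the loop with
gang_pos == 0) can be pictured with a small host-side simulation. This is
only a sketch under stated assumptions: the function names, the cyclic
distribution of iterations over gangs, and the sequential "launch" loop are
all made up for illustration and are not what parloops emits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one gang executing a kernels region that holds a
   single parallel loop followed by a store outside the loop.  Returns true
   if this gang performed the guarded store.  */
static bool
run_region_as_gang (unsigned gang_pos, unsigned num_gangs, size_t n,
                    const unsigned *a, const unsigned *b, unsigned *c,
                    size_t *final_i)
{
  /* Loop body: iterations are split across gangs (cyclic split assumed).  */
  for (size_t i = gang_pos; i < n; i += num_gangs)
    c[i] = a[i] + b[i];

  /* Code outside the loop: every gang would write the same value to the
     same location, so only gang 0 does it.  */
  if (gang_pos == 0)
    {
      *final_i = n;
      return true;
    }
  return false;
}

/* Sequential stand-in for launching all gangs; returns how many gangs
   executed the guarded store.  */
static unsigned
run_region (unsigned num_gangs, size_t n, const unsigned *a,
            const unsigned *b, unsigned *c, size_t *final_i)
{
  unsigned stores = 0;
  for (unsigned g = 0; g < num_gangs; g++)
    if (run_region_as_gang (g, num_gangs, n, a, b, c, final_i))
      stores++;
  return stores;
}
```

Whatever the number of gangs, the final value of the counter is written
exactly once while the loop iterations are still shared among all gangs.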



Reposting with loop optimizer init added in
pass_parallelize_loops_oacc_kernels::execute.



Reposting with loop_optimizer_finalize, scev_initialize and scev_finalize
added in pass_parallelize_loops_oacc_kernels::execute.



Ping.

Anything I can do to facilitate the review?


Document new functions.


Done.

avoid if (1).

Done.


Ideally some refactoring would avoid some of the if (!oacc_kernels_p)
spaghetti


Ack. For now, I've tried to minimize the number of oacc_kernels_p tests
in the code.

Further suggestions on how to improve here are much appreciated.


but I'm considering tree-parloops.c (and its bugs) yours.


Ack.


Can the pass not just use a pass parameter to switch between
oacc/non-oacc?



This patch introduces the pass parameter oacc_kernels_p (but does not
instantiate an oacc_kernels_p == true pass version yet).


This patch adds pass_parallelize_loops to pass_oacc_kernels (using pass 
parameter oacc_kernels_p == true).


As a consequence, it needs to update parloops testcases to use dumpfile 
parloops2.
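
As a rough picture of what "one pass, two instances, switched by a pass
parameter" means, here is a minimal sketch in plain C. The struct and the
helper functions are hypothetical. The rule that each instance only looks
at loops whose in_oacc_kernels_region flag matches its parameter is an
assumption made for the example, not a statement about tree-parloops.c.

```c
#include <stdbool.h>

/* Hypothetical model of a pass instance: one implementation, configured
   per instance by a boolean parameter.  */
struct pass_instance
{
  const char *name;
  bool oacc_kernels_p;
};

/* Hypothetical per-loop information (cf. patch 5 of the series, which adds
   in_oacc_kernels_region to struct loop).  */
struct loop_info
{
  bool in_oacc_kernels_region;
};

/* Assumed rule for this sketch: the oacc instance handles only loops in
   kernels regions, the normal instance handles only the others.  */
static bool
instance_handles_loop (const struct pass_instance *pass,
                       const struct loop_info *loop)
{
  return pass->oacc_kernels_p == loop->in_oacc_kernels_region;
}

/* Normal parloops keeps a sequential low-iteration-count version of the
   loop; the oacc kernels variant skips it.  */
static bool
needs_sequential_fallback (const struct pass_instance *pass)
{
  return !pass->oacc_kernels_p;
}
```

With a shared implementation, the oacc/non-oacc differences become tests on
a single flag instead of a second copy of the pass.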


Bootstrapped and reg-tested on x86_64.

Built with nvidia accelerator and tested goacc.exp and libgomp.

Committed to trunk.

Thanks,
- Tom

Add pass_parallelize_loops to pass_oacc_kernels

2016-01-18  Tom de Vries  

	* passes.def: Add pass_parallelize_loops to pass_oacc_kernels.

	* gcc.dg/autopar/outer-1.c: Update for new parloops instantiation.
	* gcc.dg/autopar/outer-2.c: Same.
	*

[committed] Add oacc kernels tests in goacc

2016-01-18 Thread Tom de Vries


[ was: Re: [PATCH, 13/16] Add c-c++-common/goacc/kernels-*.c ]

On 09/11/15 21:07, Tom de Vries wrote:

On 09/11/15 16:35, Tom de Vries wrote:

Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

  1Insert new exit block only when needed in
 transform_to_exit_first_loop_alt
  2Make create_parallel_loop return void
  3Ignore reduction clause on kernels directive
  4Implement -foffload-alias
  5Add in_oacc_kernels_region in struct loop
  6Add pass_oacc_kernels
  7Add pass_dominator_oacc_kernels
  8Add pass_ch_oacc_kernels
  9Add pass_parallelize_loops_oacc_kernels
 10Add pass_oacc_kernels pass group in passes.def
 11Update testcases after adding kernels pass group
 12Handle acc loop directive
 13Add c-c++-common/goacc/kernels-*.c
 14Add gfortran.dg/goacc/kernels-*.f95
 15Add libgomp.oacc-c-c++-common/kernels-*.c
 16Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.


This patch adds C/C++ oacc kernels compilation tests.



This reduced patch contains the test-cases that currently pass.

Bootstrapped and reg-tested on x86_64.

Built with nvidia accelerator and tested goacc.exp and libgomp.

Committed to trunk.

Thanks,
- Tom

Add oacc kernels tests in goacc

2015-11-09  Tom de Vries  

	* c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
	* c-c++-common/goacc/kernels-double-reduction.c: New test.
	* c-c++-common/goacc/kernels-empty.c: New test.
	* c-c++-common/goacc/kernels-eternal.c: New test.
	* c-c++-common/goacc/kernels-loop-2.c: New test.
	* c-c++-common/goacc/kernels-loop-3.c: New test.
	* c-c++-common/goacc/kernels-loop-data-2.c: New test.
	* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: New test.
	* c-c++-common/goacc/kernels-loop-data-enter-exit.c: New test.
	* c-c++-common/goacc/kernels-loop-data-update.c: New test.
	* c-c++-common/goacc/kernels-loop-data.c: New test.
	* c-c++-common/goacc/kernels-loop-g.c: New test.
	* c-c++-common/goacc/kernels-loop-mod-not-zero.c: New test.
	* c-c++-common/goacc/kernels-loop-n.c: New test.
	* c-c++-common/goacc/kernels-loop-nest.c: New test.
	* c-c++-common/goacc/kernels-loop.c: New test.
	* c-c++-common/goacc/kernels-noreturn.c: New test.
	* c-c++-common/goacc/kernels-one-counter-var.c: New test.
	* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: New test.
	* c-c++-common/goacc/kernels-reduction.c: New test.

---
 .../goacc/kernels-counter-vars-function-scope.c| 54 +
 .../goacc/kernels-double-reduction-n.c | 37 
 .../c-c++-common/goacc/kernels-double-reduction.c  | 37 
 gcc/testsuite/c-c++-common/goacc/kernels-empty.c   |  6 ++
 gcc/testsuite/c-c++-common/goacc/kernels-eternal.c | 11 
 gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  | 70 ++
 gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  | 49 +++
 gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  | 17 ++
 .../c-c++-common/goacc/kernels-loop-mod-not-zero.c | 52 
 gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  | 56 +
 .../c-c++-common/goacc/kernels-loop-nest.c | 39 
 gcc/testsuite/c-c++-common/goacc/kernels-loop.c| 56 +
 .../c-c++-common/goacc/kernels-noreturn.c  | 12 
 .../c-c++-common/goacc/kernels-one-counter-var.c   | 54 +
 .../c-c++-common/goacc/kernels-reduction.c | 36 +++
 15 files changed, 586 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
new file mode 100644
index 000..e8b5357
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -0,0 +1,54 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+/* { dg-additional-options "-fdump-tree-parloops1-all" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+int
+main (void)
+{
+  unsigned int *__restrict a;
+  unsigned int *__restrict b;
+  unsigned int *__restrict c;
+  COUNTERTYPE i;
+  COUNTERTYPE ii;
+
+  a = (unsigned int *)malloc (N * sizeof (unsigned int));
+  b = (unsigned int *)malloc (N * sizeof (unsigned

[committed] Add oacc kernels test in libgomp

2016-01-18 Thread Tom de Vries


[ was: Re: [PATCH, 15/16] Add libgomp.oacc-c-c++-common/kernels-*.c ]

On 09/11/15 21:10, Tom de Vries wrote:

On 09/11/15 16:35, Tom de Vries wrote:

Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

  1Insert new exit block only when needed in
 transform_to_exit_first_loop_alt
  2Make create_parallel_loop return void
  3Ignore reduction clause on kernels directive
  4Implement -foffload-alias
  5Add in_oacc_kernels_region in struct loop
  6Add pass_oacc_kernels
  7Add pass_dominator_oacc_kernels
  8Add pass_ch_oacc_kernels
  9Add pass_parallelize_loops_oacc_kernels
 10Add pass_oacc_kernels pass group in passes.def
 11Update testcases after adding kernels pass group
 12Handle acc loop directive
 13Add c-c++-common/goacc/kernels-*.c
 14Add gfortran.dg/goacc/kernels-*.f95
 15Add libgomp.oacc-c-c++-common/kernels-*.c
 16Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.


This patch adds C/C++ oacc kernels execution tests.



Bootstrapped and reg-tested on x86_64.

Built with nvidia accelerator and tested goacc.exp and libgomp.

Committed to trunk as attached (AFAICT, no changes compared to original 
posting, other than commit title).


Thanks,
- Tom

Add oacc kernels test in libgomp

2015-11-09  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit-2.c:
	Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit.c:
	Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-update.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-parallel-loop-data-enter-exit.c:
	Same.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c: Same.

---
 .../libgomp.oacc-c-c++-common/kernels-loop-2.c | 47 ++
 .../libgomp.oacc-c-c++-common/kernels-loop-3.c | 34 
 .../kernels-loop-and-seq-2.c   | 36 +
 .../kernels-loop-and-seq-3.c   | 37 +
 .../kernels-loop-and-seq-4.c   | 36 +
 .../kernels-loop-and-seq-5.c   | 37 +
 .../kernels-loop-and-seq-6.c   | 36 +
 .../kernels-loop-and-seq.c | 37 +
 .../kernels-loop-collapse.c| 40 ++
 .../libgomp.oacc-c-c++-common/kernels-loop-g.c |  5 +++
 .../kernels-loop-mod-not-zero.c| 41 +++
 .../libgomp.oacc-c-c++-common/kernels-loop-n.c | 47 ++
 .../libgomp.oacc-c-c++-common/kernels-loop-nest.c  | 26 
 .../libgomp.oacc-c-c++-common/kernels-loop.c   | 41 +++
 .../libgomp.oacc-c-c++-common/kernels-reduction.c  | 37 +
 15 files changed, 537 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
new file mode 100644
index 000..13e57bd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include 
+
+#define N (1024 * 512)
+#define COUNTERTYPE

Re: [PATCH PR68542]

2016-01-18 Thread Richard Biener

On Mon, Jan 18, 2016 at 3:02 PM, Yuri Rumyantsev  wrote:
> Thanks Richard.
>
> I changed the check on type as you proposed.
>
> What about the second, back-end part of the patch (it was sent on 08.12.15)?

Can't see it in my inbox - can you reply to the mail with a ping?

Thanks,
Richard.

> Thanks.
> Yuri.
>
> 2016-01-18 15:44 GMT+03:00 Richard Biener :
>> On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev  wrote:
>>> Hi Richard,
>>>
>>> Did you have any chance to look at the updated patch?
>>
>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>> index acbb70b..208a752 100644
>> --- a/gcc/tree-vrp.c
>> +++ b/gcc/tree-vrp.c
>> @@ -5771,6 +5771,10 @@ register_edge_assert_for (tree name, edge e, gimple_stmt_iterator si,
>>                                                 &comp_code, &val))
>>  return;
>>
>> +  /* VRP doesn't track ranges for vector types.  */
>> +  if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE)
>> +return;
>> +
>>
>> please instead fix extract_code_and_val_from_cond_with_ops with
>>
>> Index: gcc/tree-vrp.c
>> ===
>> --- gcc/tree-vrp.c  (revision 232506)
>> +++ gcc/tree-vrp.c  (working copy)
>> @@ -5067,8 +5067,9 @@ extract_code_and_val_from_cond_with_ops
>>if (invert)
>>  comp_code = invert_tree_comparison (comp_code, 0);
>>
>> -  /* VRP does not handle float types.  */
>> -  if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (val)))
>> +  /* VRP only handles integral and pointer types.  */
>> +  if (! INTEGRAL_TYPE_P (TREE_TYPE (val))
>> +  && ! POINTER_TYPE_P (TREE_TYPE (val)))
>>  return false;
>>
>>/* Do not register always-false predicates.
>>
>> Ok with that change.
>>
>> Thanks,
>> Richard.
>>
>>> Thanks.
>>> Yuri.
>>>
>>> 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev :
>>>> Hi Richard,
>>>>
>>>> Here is updated patch for middle-end part of the whole patch which
>>>> fixes all your remarks I hope.
>>>>
>>>> Regression testing and bootstrapping did not show any new failures.
>>>> Is it OK for trunk?
>>>>
>>>> Yuri.
>>>>
>>>> ChangeLog:
>>>> 2015-12-18  Yuri Rumyantsev  
>>>>
>>>> PR middle-end/68542
>>>> * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case
>>>> of mixed vector and scalar types.
>>>> (fold_relational_const): Add handling of vector
>>>> comparison with boolean result.
>>>> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
>>>> comparison of vector operands with boolean result for EQ/NE only.
>>>> (verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
>>>> (verify_gimple_cond): Likewise.
>>>> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform

Re: [PATCH, i386, AVX512] PR target/67895: Fix position of embedded rounding/SAE mode in AVX512 vrangep* and vcvt?si2s* instructions.

2016-01-18 Thread Kirill Yukhin

Hello,
On 15 Jan 15:39, Alexander Fomin wrote:
> I've bootstrapped and regtested it against GCC v5 on x86_64-gnu-linux.
> OK for 5-branch?
Yes, it is ok for gcc-5-branch

--
Thanks, K
> 
> --
> Thanks,
> Alexander
> 
> On Fri, Oct 09, 2015 at 05:24:56PM +0300, Kirill Yukhin wrote:
> > Hello,
> > On 08 Oct 20:31, Alexander Fomin wrote:
> > > Hi All,
> > > 
> > > This patch addresses PR target/67895. For some AVX512 instructions
> > > we used to emit the embedded rounding/SAE specifier in the wrong place.
> > > The patch fixes its position for vrange* and vcvt?si2s* instructions.
> > > I've also updated regular expressions for corresponding assembly in
> > > i386 testsuite, so they act like regression tests now.
> > > 
> > > Bootstrap is OK, waiting for regression testing now.
> > > If the last one is fine, is this patch OK for trunk and 5 branch?
> > I am OK.
> > 
> > --
> > Thanks, K

[PATCH] Fix PR69308

2016-01-18 Thread Richard Biener


This fixes missing handling of GIMPLE_COND in gimple_could_trap_p[_1].

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.
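
Some background on why a condition can trap at all: under IEEE 754, the
ordered relational operators (<, <=, >, >=) raise the "invalid" exception
when an operand is a NaN, whereas == and != do so only for signaling NaNs.
The sketch below is a deliberately simplified model of such a query. The
enum, the function name and the trapping_math parameter are hypothetical,
signaling NaNs are ignored, and none of this is GCC's implementation.

```c
#include <stdbool.h>

/* Hypothetical comparison codes; not GCC's tree codes.  */
enum cmp_code { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_EQ, CMP_NE };

/* Simplified model: can evaluating "a CODE b" raise a floating-point
   exception?  Signaling NaNs are deliberately left out of the model.  */
static bool
comparison_could_trap (enum cmp_code code, bool fp_operands,
                       bool trapping_math)
{
  /* Integer and pointer comparisons never trap, and neither do FP ones if
     the user has said that FP exceptions do not matter.  */
  if (!fp_operands || !trapping_math)
    return false;

  switch (code)
    {
    case CMP_LT:
    case CMP_LE:
    case CMP_GT:
    case CMP_GE:
      /* Ordered relational operators signal on a quiet NaN operand.  */
      return true;
    case CMP_EQ:
    case CMP_NE:
      /* Quiet comparisons: only a signaling NaN would raise "invalid".  */
      return false;
    }
  return false;
}
```

The point of the model is only that a conditional such as "if (x < y)" with
floating-point operands needs the same care as the equivalent assignment.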

2016-01-18  Richard Biener  

PR middle-end/69308
* gimple.c (gimple_could_trap_p_1): Handle GIMPLE_COND.

Index: gcc/gimple.c
===
*** gcc/gimple.c(revision 232496)
--- gcc/gimple.c(working copy)
*** gimple_could_trap_p_1 (gimple *s, bool i
*** 1931,1936 
--- 1931,1941 
   && TYPE_OVERFLOW_TRAPS (t)),
  div));
  
+ case GIMPLE_COND:
+   t = TREE_TYPE (gimple_cond_lhs (s));
+   return operation_could_trap_p (gimple_cond_code (s),
+FLOAT_TYPE_P (t), false, NULL_TREE);
+ 
  default:
break;
  }

Re: [Patch, fortran] (4/5-regression) PR61831 side-effect deallocation of variable components)

2016-01-18 Thread Dominique d'Humières

The failures with -m32 are

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Without the closing brace, I get

UNRESOLVED: gfortran.dg/derived_constructor_comps_6.f90   -O0   
scan-tree-dump-times original "__builtin_free" 33

Dominique

> On 18 Jan 2016, at 13:48, Paul Richard Thomas wrote:
> 
> Hi Dominique,
> 
> Late or not, thanks for the feedback. I'll fix the right brace. More
> worrying is the failure with -m32. I presume that the failure with
> -O0/O1 is at runtime? If not, the correction of the missing right
> brace is a mysterious trigger for a fault that is optimization
> dependent.
> 
> Cheers
> 
> Paul

Re: [PATCH v2] libstdc++: Make certain exceptions transaction_safe.

2016-01-18 Thread Torvald Riegel

On Sun, 2016-01-17 at 18:30 -0500, David Edelsohn wrote:
> On Sun, Jan 17, 2016 at 3:21 PM, Torvald Riegel  wrote:
> > On Sat, 2016-01-16 at 15:38 -0500, David Edelsohn wrote:
> >> On Sat, Jan 16, 2016 at 8:35 AM, Jakub Jelinek  wrote:
> >> > On Sat, Jan 16, 2016 at 07:47:33AM -0500, David Edelsohn wrote:
> >> >> stage1 libstdc++ builds just fine.  the problem is stage2 configure
> >> >> fails due to missing ITM_xxx symbols when configure tries to compile
> >> >> and run conftest programs.
> >> >
> >> > On x86_64-linux, the _ITM_xxx symbols are undef weak ones and thus it is
> >> > fine to load libstdc++ without libitm and libstdc++ doesn't depend on
> >> > libitm.
> >> >
> >> > So, is AIX defining __GXX_WEAK__ or not?  Perhaps some other macro or
> >> > configure check needs to be used to determine if undefined weak symbols
> >> > work the way libstdc++ needs them to.
> >>
> >> __GXX_WEAK__ appears to be defined by gcc/c-family/c-cppbuiltin.c
> >> based on  SUPPORTS_ONE_ONLY.  gcc/defaults.h defines SUPPORTS_ONE_ONLY
> >> if the target supports MAKE_DECL_ONE_ONLY and link-once semantics.
> >> AIX weak correctly supports link-once semantics.  AIX also supports
> >> the definition of __GXX_WEAK__ in gcc/doc/cpp.texi, namely collapsing
> >> symbols with vague linkage in multiple translation units.
> >>
> >> libstdc++/src/c++11/cow-stdexcept.cc appears to be using __GXX_WEAK__
> >> and __attribute__ ((weak)) for references to symbols that may not be
> >> defined at link time or run time.  AIX does not allow undefined symbol
> >> errors by default.  And libstdc++'s inference about the semantics of
> >> __GXX_WEAK__ is different from the documentation.
> >>
> >> AIX supports MAKE_DECL_ONE_ONLY and the documented meaning of
> >> __GXX_WEAK__.  AIX does not support extension of the meaning to
> >> additional SVR4 semantics not specified in the documentation.
> >
> > I see, so we might be assuming that __GXX_WEAK__ means more than it
> > actually does (I'm saying "might" because personally, I don't know; your
> > information supports this is the case, but the initial info I got was
> > that __GXX_WEAK__ would mean we could have weak decls without
> > definitions).
> 
> I believe that libstdc++ must continue with the weak undefined
> references to the symbols as designed, but protect them with a
> different macro.  For example, __GXX_WEAK_REF__ or __GXX_WEAK_UNDEF__
> defined in defaults.h based on configure test or simply overridden in
> config/rs6000/aix.h.  Or the macro could be local to libstdc++ and
> overridden in config/os/aix/os_defines.h.

OK.  I'm currently testing the attached patch on x86_64-linux.  David,
if there are no objections from you and those CC'ed, could you give this
one a try on AIX, please?
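
To make the distinction concrete: "weak" in the sense libstdc++ relies on
here means a reference that may stay undefined at link time and then
compares equal to a null pointer at run time. A minimal sketch follows,
assuming an ELF target with a GNU-compatible compiler. The macro
USE_WEAK_REF and the function optional_runtime_hook are made-up names for
the example, not libstdc++ or libitm identifiers.

```c
#include <stddef.h>

/* Hypothetical feature macro in the spirit of _GLIBCXX_USE_WEAK_REF: only
   rely on undefined weak references where they are known to work.  */
#if defined(__ELF__) && defined(__GNUC__)
# define USE_WEAK_REF 1
#else
# define USE_WEAK_REF 0
#endif

#if USE_WEAK_REF
/* Declared but deliberately never defined in this program.  */
extern int optional_runtime_hook (int) __attribute__ ((weak));
#endif

/* Call the hook if some library provides it, else return FALLBACK.  */
static int
call_hook_or_default (int x, int fallback)
{
#if USE_WEAK_REF
  if (optional_runtime_hook != NULL)
    return optional_runtime_hook (x);
#endif
  return fallback;
}
```

On a target where the feature macro is 0, the reference is never emitted,
so the link cannot fail on a missing definition.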

* include/bits/c++config (_GLIBCXX_USE_WEAK_REF): New.
(_GLIBCXX_TXN_SAFE, _GLIBCXX_TXN_SAFE_DYN): Use _GLIBCXX_USE_WEAK_REF
and move after its definition.
* config/os/aix/os_defines.h (_GLIBCXX_USE_WEAK_REF): Override.
* src/c++11/cow-stdexcept.cc: Use _GLIBCXX_USE_WEAK_REF instead of
__GXX_WEAK__, and only provide transactional clones if
_GLIBCXX_USE_WEAK_REF is true.  Don't provide stubs of libitm
functions.

commit a5a8819bce824815a94ef8d58f6d4123db92f1d4
Author: Torvald Riegel 
Date:   Mon Jan 18 14:42:21 2016 +0100

libstdc++: Fix usage of __GXX_WEAK__ in TM TS support.

	* include/bits/c++config (_GLIBCXX_USE_WEAK_REF): New.
	(_GLIBCXX_TXN_SAFE, _GLIBCXX_TXN_SAFE_DYN): Use	_GLIBCXX_USE_WEAK_REF
	and move after its definition.
	* config/os/aix/os_defines.h (_GLIBCXX_USE_WEAK_REF): Override.
	* src/c++11/cow-stdexcept.cc: Use _GLIBCXX_USE_WEAK_REF instead of
	__GXX_WEAK__, and only provide transactional clones if
	_GLIBCXX_USE_WEAK_REF is true.  Don't provide stubs of libitm
	functions.

diff --git a/libstdc++-v3/config/os/aix/os_defines.h b/libstdc++-v3/config/os/aix/os_defines.h
index d895471..0949446 100644
--- a/libstdc++-v3/config/os/aix/os_defines.h
+++ b/libstdc++-v3/config/os/aix/os_defines.h
@@ -48,4 +48,7 @@
 #define __COMPATMATH__
 #endif
 
+// No support for referencing weak symbols without a definition.
+#define _GLIBCXX_USE_WEAK_REF 0
+
 #endif
diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index 387a7bb..57024e4 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -483,20 +483,6 @@ namespace std
 
 #define _GLIBCXX_USE_ALLOCATOR_NEW
 
-// Conditionally enable annotations for the Transactional Memory TS on C++11.
-// Most of the following conditions are due to limitations in the current
-// implementation.
-#if __cplusplus >= 201103L && _GLIBCXX_USE_CXX11_ABI			\
-  && _GLIBCXX_USE_DUAL_ABI && __cpp_transactional_memory >= 201505L	\
-  &&  !_GLIBCXX_FULLY_DYNAMIC_STRING && __GXX_WEAK__ 			\
-  && _GLIBCXX_USE_ALLOCATOR_NEW
-#define

Re: genattrab.c generate switch

2016-01-18 Thread Bernd Schmidt


On 01/13/2016 01:53 AM, Jesper Broge Jørgensen wrote:

genattrtab.c can generate if statements with very deep bracket
nesting, which causes clang to produce errors (when target=arm-none-eabi), as
explained at https://gcc.gnu.org/ml/gcc/2014-05/msg00032.html
At the above link it was suggested that genattrtab.c generate a switch
statement instead. I have made a patch that does just that.
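
For illustration, the shape of the problem and of the proposed fix looks
roughly like this. It is a hand-written sketch, not actual generator
output. The function names and the values are hypothetical.

```c
/* Nested form: every further alternative adds one more level of bracket
   nesting, so a long chain becomes arbitrarily deep.  */
static int
attr_value_nested_if (int alternative)
{
  if (alternative == 0)
    {
      return 1;
    }
  else
    {
      if (alternative == 1)
        {
          return 2;
        }
      else
        {
          if (alternative == 2)
            {
              return 4;
            }
          else
            {
              return 8;
            }
        }
    }
}

/* Switch form: the same mapping with constant nesting depth.  */
static int
attr_value_switch (int alternative)
{
  switch (alternative)
    {
    case 0:
      return 1;
    case 1:
      return 2;
    case 2:
      return 4;
    default:
      return 8;
    }
}
```

Both functions compute the same result. Only the nesting depth of the
source text differs, which is what the compiler's bracket-depth limit
objects to.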


Some preliminaries first - I don't see your name in existing ChangeLogs; 
am I correct in assuming you've not gone through the copyright 
assignment process?


Secondly, we're currently in a development phase where we only accept 
bug fixes for gcc-6. You should resubmit/ping the patch once stage1 
opens again.



2016-01-13  Jesper Broge Jørgensen  

 * genattrtab.c (check_attr_set_switch): implemented the function
 (write_attr_set): Check if expression can be written as a switch


Please review our coding and documentation standards. ChangeLog entries 
should be complete sentences (or sometimes brief short-hands: the first 
one should just be "New function.")



+static int check_attr_set_switch (FILE *outf, rtx exp,
+unsigned int attrs_cached, int write_cases, int
indent);


No reason to declare it if it is defined before its use.

+  while (1)
+  {


This and everything else here looks like it isn't following our 
indentation rules.



Bernd

Re: [hsa merge 08/10] HSAIL BRIG description header file

2016-01-18 Thread Martin Jambor

Hi,

On Sat, Jan 16, 2016 at 12:43:07PM +0100, Jakub Jelinek wrote:
> On Fri, Jan 15, 2016 at 06:23:05PM +0100, Martin Jambor wrote:
> >   BRIG_KIND_OPERAND_REGISTER = 0x300a,
> >   BRIG_KIND_OPERAND_STRING = 0x300b,
> >   BRIG_KIND_OPERAND_WAVESIZE = 0x3009c,
> >   BRIG_KIND_OPERAND_END = 0x300d
> 
> The above looks weird, I'd have expected BRIG_KIND_OPERAND_WAVESIZE
> to be 0x300c instead.  Bug in the standard?
> As typedef uint16_t BrigKind16_t;, I'm afraid this doesn't even fit
> into the data type.  Note the original brig header you've posted
> had this fixed.
> 

That is clearly a bug.  We did not catch it when comparing the compiler
binary because we never use this constant.  Have you found this by
hand or did you do any more systematic comparison?

BRIG is always validated when finalized and I believe that fortunately
this particular bug would be caught by that as would majority of
similar "random" ones.
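
Jakub's point about the data type can be checked mechanically. The
following is a small sketch. The two helper functions are hypothetical,
while the typedef and the constants are taken from the quoted text.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t BrigKind16_t;

/* Does VALUE survive being stored in a BrigKind16_t?  */
static bool
fits_in_brig_kind16 (unsigned long value)
{
  return value <= UINT16_MAX;
}

/* What actually ends up in a BrigKind16_t when VALUE is stored in it.  */
static BrigKind16_t
store_as_brig_kind16 (unsigned long value)
{
  return (BrigKind16_t) value;
}
```

The mistyped 0x3009c needs more than 16 bits and would be stored as 0x009c,
whereas the corrected 0x300c fits and sits between 0x300b and 0x300d as
expected.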

I am going to commit the following patch to the branch.

Thanks for spotting this.

Martin


2016-01-18  Martin Jambor  

* hsa-brig-format.h (BrigKind): Fix the value of
BRIG_KIND_OPERAND_WAVESIZE.
---
 gcc/hsa-brig-format.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/hsa-brig-format.h b/gcc/hsa-brig-format.h
index 247799b..e1c6cd2 100644
--- a/gcc/hsa-brig-format.h
+++ b/gcc/hsa-brig-format.h
@@ -303,7 +303,7 @@ enum BrigKind
   BRIG_KIND_OPERAND_OPERAND_LIST = 0x3009,
   BRIG_KIND_OPERAND_REGISTER = 0x300a,
   BRIG_KIND_OPERAND_STRING = 0x300b,
-  BRIG_KIND_OPERAND_WAVESIZE = 0x3009c,
+  BRIG_KIND_OPERAND_WAVESIZE = 0x300c,
   BRIG_KIND_OPERAND_END = 0x300d
 };
 
-- 
2.6.4

Re: [hsa merge 09/10] Majority of the HSA back-end

2016-01-18 Thread Martin Jambor

Hi,

On Sat, Jan 16, 2016 at 09:58:51AM +0100, Jakub Jelinek wrote:
> On Sat, Jan 16, 2016 at 12:49:12AM +0100, Martin Jambor wrote:
> > bootstrapping on i686-linux revealed the need for the following simple
> > patch.  I've run into two types of compilation errors on
> > powerpc-ibm-aix (no htolenn functions and ASM_GENERATE_INTERNAL_LABEL
> > somehow expanding to undeclared rs6000_xcoff_strip_dollar).  I plan to
> > workaround them quickly by making most of the contents of hsa-*.c
> > files compiled only conditionally (and leave potential hsa support on
> > non-linux platforms for later), but I will not have time to do the
> > change and test it properly until Monday.
> > 
> > But that will hopefully really be it,
> 
> IMHO you'd be best to write your own helpers for conversion to little
> endian (and back).
> gcc configure already has AC_C_BIGENDIAN (dunno how it handles pdp endian
> host though, so not sure if it is safe to rely on that), for recent GCC
> you can use __BYTE_ORDER__ macro to check endianity and __builtin_bswap*.
> So perhaps just
> #if GCC_VERSION >= 4006
> // use __BYTE_ORDER__ and __builtin_bswap or nothing
> #else
> // provide a safe slower default, with shifts and masking
> #endif
> 
> As for rs6000_xcoff_strip_dollar, look at other sources that use it what
> headers they do include, bet you want to #include "tm_p.h" to make it work.
> 

thanks for the suggestion.  With the following two patches, I can
compile HSA branch on powerpc-aix.  I'm going to prepare a new patch
with them, bootstrap it on x86_64, i686 and ppc-aix and unless
something new pops up again, I will commit it either tonight or
early tomorrow morning.

I have tested the slow paths of little endian conversion only very
rudimentarily but I did.  OTOH, I am actually not quite sure how 64
bit-wide numbers are spaced out on PDP-endian systems.  But I guess it
is OK to fix those only later if I am wrong.

I am also willing to incorporate any feedback later, even if it is
only a matter of style.

Thanks,

Martin


2016-01-18  Martin Jambor  

* hsa-brig.c: Include target.h and tm_p.h.
---
 gcc/hsa-brig.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 9260c21..ee06804 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -23,6 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tm.h"
+#include "target.h"
+#include "tm_p.h"
 #include "is-a.h"
 #include "vec.h"
 #include "hash-table.h"
-- 
2.6.4


2016-01-18  Martin Jambor  

* hsa-brig.c (lendian16): New function.  Changed all uses of htole16
to use it.
(lendian32): New function.  Changed all uses of htole32 to use it.
(lendian64): New function.  Changed all uses of htole64 to use it.
---
 gcc/hsa-brig.c | 412 ++---
 1 file changed, 245 insertions(+), 167 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index d4e644f..9260c21 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -44,6 +44,83 @@ along with GCC; see the file COPYING3.  If not see
 #include "hsa.h"
 #include "gomp-constants.h"
 
+/* Convert VAL to little endian form, if necessary.  */
+
+static uint16_t
+lendian16 (uint16_t val)
+{
+#if GCC_VERSION >= 4006
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  return val;
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  return __builtin_bswap16 (val);
+#else   /* __ORDER_PDP_ENDIAN__ */
+  return val;
+#endif
+#else
+// provide a safe slower default, with shifts and masking
+#ifndef WORDS_BIGENDIAN
+  return val;
+#else
+  return (val >> 8) | (val << 8);
+#endif
+#endif
+}
+
+/* Convert VAL to little endian form, if necessary.  */
+
+static uint32_t
+lendian32 (uint32_t val)
+{
+#if GCC_VERSION >= 4006
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  return val;
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  return __builtin_bswap32 (val);
+#else  /* __ORDER_PDP_ENDIAN__ */
+  return (val >> 16) | (val << 16);
+#endif
+#else
+// provide a safe slower default, with shifts and masking
+#ifndef WORDS_BIGENDIAN
+  return val;
+#else
+  val  = ((val & 0xff00ff00) >> 8) | ((val & 0xff00ff) << 8);
+  return (val >> 16) | (val << 16);
+#endif
+#endif
+}
+
+/* Convert VAL to little endian form, if necessary.  */
+
+static uint64_t
+lendian64 (uint64_t val)
+{
+#if GCC_VERSION >= 4006
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  return val;
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  return __builtin_bswap64 (val);
+#else  /* __ORDER_PDP_ENDIAN__ */
+  return (((val & 0x) << 48)
+ | ((val & 0x) << 16)
+ | ((val & 0x) >> 16)
+ | ((val & 0x) >> 48));
+#endif
+#else
+// provide a safe slower default, with shifts and masking
+#ifndef WORDS_BIGENDIAN
+  return val;
+#else
+  val = (((val & 0xff00ff00ff00ff00ll) >> 8)
+| ((val & 0x00ff00ff00ff00ffll) <<

Re: [PATCH] Fix the remaining PR c++/24666 blockers (arrays decay to pointers too early)

2016-01-18 Thread Jason Merrill


On 12/25/2015 12:37 PM, Patrick Palka wrote:

That alone would not be sufficient because more_specialized_fn()
doesn't call maybe_adjust_types_for_deduction() beforehand, yet we
have to do the decaying there too (and on both types, not just one of
them).

And maybe_adjust_types_for_deduction() seems to operate on the
presumption that one type is the parameter type and one is the
argument type. But in more_specialized_fn() and in get_bindings() we
are really working with two parameter types and have to decay them
both. So sometimes we have to decay one of the types that are
eventually going to get passed to unify(), and other times we want to
decay both types that are going to get passed to unify().
maybe_adjust_types_for_deduction() seems to only expect the former
case.

Finally, maybe_adjust_types_for_deduction() is not called when
unifying a nested function declarator (because it is guarded by the
subr flag in unify_one_argument), so doing it there we would also
regress in the following test case:


Ah, that makes sense.

How about keeping the un-decayed type in the PARM_DECLs, so that we get 
the substitution failure in instantiate_template, but having the decayed 
type in the TYPE_ARG_TYPES, probably by doing the decay in grokparms, so 
it's already decayed when we're doing unification?


Jason

Re: [hsa merge 09/10] Majority of the HSA back-end

2016-01-18 Thread Jakub Jelinek

Hi!

PDP endian is
  gcc_assert (!BYTES_BIG_ENDIAN);
  gcc_assert (WORDS_BIG_ENDIAN);
and 16-bit words, thus within uint16_t it is little endian, and the
16-bit words are ordered in larger units in big endian order.
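
In other words, going from PDP endian to little endian only has to reverse
the order of the 16-bit words; the bytes inside each word are already in
place.  A minimal sketch for the 64-bit case follows.  The function name and
the masks are mine, written out for illustration, and are not copied from the
patch.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the four 16-bit words in VAL.
   Every 64-bit mask carries an explicit ULL suffix.  */
static uint64_t
swap_words64 (uint64_t val)
{
  return (((val & 0x000000000000ffffULL) << 48)
          | ((val & 0x00000000ffff0000ULL) << 16)
          | ((val & 0x0000ffff00000000ULL) >> 16)
          | ((val & 0xffff000000000000ULL) >> 48));
}
```

For example, swap_words64 (0x0011223344556677ULL) yields 0x6677445522330011,
and applying it twice gives back the original value.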

> +#else
> +  val  = ((val & 0xff00ff00) >> 8) | ((val & 0xff00ff) << 8);

Too many spaces before =?

> +  return (val >> 16) | (val << 16);
> +#endif
> +#endif
> +}
> +
> +/* Convert VAL to little endian form, if necessary.  */
> +
> +static uint64_t
> +lendian64 (uint64_t val)
> +{
> +#if GCC_VERSION >= 4006
> +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +  return val;
> +#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> +  return __builtin_bswap64 (val);
> +#else  /* __ORDER_PDP_ENDIAN__ */
> +  return (((val & 0x) << 48)
> +   | ((val & 0x) << 16)
> +   | ((val & 0x) >> 16)
> +   | ((val & 0x) >> 48));

You are missing ll suffixes on the large constants.

That said, PDP endian host will not work with your patch if the system
compiler is not GCC >= 4.6, and most likely you are relying on __CHAR_BIT__
== 8 on the host too.  Guess it can be handled incrementally though.
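
One possible incremental direction, as a sketch only: instead of converting a
value in place, store it byte by byte with shifts, which gives the same output
on little, big and PDP endian hosts without any preprocessor checks.  This is
a different technique from the one in the patch, and store_le32 is a
hypothetical name.  It still assumes 8-bit bytes in the output buffer.

```c
#include <stdint.h>

/* Hypothetical helper: write VAL to BUF in little-endian byte order.
   Shifts operate on the value, not on memory, so the result does not
   depend on the byte order of the host.  */
static void
store_le32 (unsigned char *buf, uint32_t val)
{
  buf[0] = val & 0xff;
  buf[1] = (val >> 8) & 0xff;
  buf[2] = (val >> 16) & 0xff;
  buf[3] = (val >> 24) & 0xff;
}
```

The cost is a function that writes to a buffer rather than one that returns a
converted value, so call sites would need more changes than with the lendian*
approach.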

Jakub

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955)

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 10:06:44AM +0100, Eric Botcazou wrote:
> > The following testcase is miscompiled on i686-linux at -O3.
> > The bug is in DSE record_store, which for group_id < 0 uses mem_addr
> > set to result of get_addr (base->val_rtx) (plus optional offset),
> > which is fine for canon_true_dependence with other MEMs in that function,
> > but we also store that address in store_info.  The problem is if later on
> > e.g. some read uses the same e.g. hard register as get_addr returned, but
> > that register contains at that later point a different value.
> > canon_true_dependence then happily returns the read does not alias the
> > store, although it might.
> > The fix is to store the VALUE (plus optional offset) into
> > store_info->mem_addr instead, then at some later insn when get_addr is
> > called on it it will either return the same register or expression (if it
> > has not changed), or some different one otherwise.
> 
> I presume that the origin of the bug is:
> 
>   /* get_addr can only handle VALUE but cannot handle expr like:
>VALUE + OFFSET, so call get_addr to get original addr for
>mem_addr before plus_constant.  */
>   mem_addr = get_addr (mem_addr);
>   if (offset)
>   mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
> 
> both in record_store and check_mem_read_rtx, so I wonder if we shouldn't bite 
> the bullet and try enhancing get_addr since it's a mainline-only regression.

So, do you suggest to tweak get_addr like the patch below, and remove the
  mem_addr = get_addr (mem_addr);
line above and the comment?

--- gcc/alias.c.jj  2016-01-14 17:01:09.316932111 +0100
+++ gcc/alias.c 2016-01-18 10:30:46.780994699 +0100
@@ -2203,7 +2203,23 @@ get_addr (rtx x)
   struct elt_loc_list *l;
 
   if (GET_CODE (x) != VALUE)
-return x;
+{
+  if ((GET_CODE (x) == PLUS || GET_CODE (x) == MINUS)
+ && GET_CODE (XEXP (x, 0)) == VALUE
+ && CONST_SCALAR_INT_P (XEXP (x, 1)))
+   {
+ rtx op0 = get_addr (XEXP (x, 0));
+ if (op0 != XEXP (x, 0))
+   {
+ if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 1)) == CONST_INT)
+   return plus_constant (GET_MODE (x), op0, INTVAL (XEXP (x, 1)));
+ return simplify_gen_binary (GET_CODE (x), GET_MODE (x),
+ op0, XEXP (x, 1));
+   }
+   }
+  return x;
+}
   v = CSELIB_VAL_PTR (x);
   if (v)
 {


Jakub

Re: [PATCH] Fix a warning in mpx wrappers

2016-01-18 Thread Ilya Enkovich

2016-01-17 20:53 GMT+03:00 Jakub Jelinek :
> Hi!
>
> The following patch fixes a warning in libmpx:
> ../../../../libmpx/mpxwrap/mpx_wrappers.c:492:8: warning: assignment discards 
> 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
>  *d = *s;
> ^
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. Thanks for the fix!
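
For the archives, a stand-alone sketch of why the new declaration is quiet.
With "const void **s", *s has type "const void *", and assigning that to a
"void *" drops the qualifier.  With "void *const *s", only the slot being read
is const and *s is a plain "void *".  The function name below is made up; the
body mirrors the special case in the quoted patch.

```c
/* Hypothetical stand-alone version of the one-pointer special case:
   copy exactly one pointer-sized object from SRC to DST.  */
static void *
copy_one_pointer (void *dst, const void *src)
{
  /* Pointer to a const slot holding a void *: reading through it yields
     a plain void *, so the assignment below discards no qualifier.  */
  void *const *s = (void *const *) src;
  void **d = (void **) dst;
  *d = *s;
  return dst;
}
```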

Ilya

>
> 2016-01-17  Jakub Jelinek  
>
> * mpxwrap/mpx_wrappers.c (__mpx_wrapper_memmove): Avoid
> -Wdiscarded-qualifiers warning.  Fix up formatting.
>
> --- libmpx/mpxwrap/mpx_wrappers.c.jj2015-12-31 01:11:17.0 +0100
> +++ libmpx/mpxwrap/mpx_wrappers.c   2016-01-16 10:37:54.488048781 +0100
> @@ -486,12 +486,12 @@ __mpx_wrapper_memmove (void *dst, const
>/* When we copy exactly one pointer it is faster to
>   just use bndldx + bndstx.  */
>if (n == sizeof (void *))
> -  {
> -const void **s = (const void**)src;
> -void **d = (void**)dst;
> -*d = *s;
> -return dst;
> -  }
> +{
> +  void *const *s = (void *const *) src;
> +  void **d = (void **) dst;
> +  *d = *s;
> +  return dst;
> +}
>
>memmove (dst, src, n);
>
>
>
> Jakub

RE: [PATCH] [ARC] Add basic support for double load and store instructions

2016-01-18 Thread Claudiu Zissulescu

>  >if (n_pieces >= (unsigned int) (optimize_size ? 3 : 15))
>  >  return false;
>  > -  if (piece > 4)
>  > +  if (TARGET_LL64 && (piece != 8) && (align >= 4))
>  > +piece = 8;
>  > +  else if (piece > 4)
>  >  piece = 4;
>  >dst_addr = force_offsettable (XEXP (operands[0], 0), size, 0);
> 
> That bit doesn't make sense to me.
> Assume the alignment is 8.  Thus, piece becomes 8 too.  Then the above
> conditional gets processed, and it sets piece to 4.
> I think instead of "(piece != 8) && (align >= 4)" it should be:
> "(piece >= 8)"

Right. My intention is to force 64-bit transfers also for 32-bit data. Hence, 
the condition should be like this:

if (TARGET_LL64 && (piece >= 4))
  piece = 8;
...

So, whenever the alignment is 32 bits or larger (as piece is set to align), we
use 64-bit transfers. The number of pieces is computed a few lines above.
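
As a sketch of the intended behaviour, assuming the remaining "else if" branch
from the quoted hunk stays as it is: the function below is a hypothetical
stand-alone model, with TARGET_LL64 turned into a parameter, and is not the
actual code from arc.c.

```c
/* Hypothetical model of the piece-size choice.  ALIGN is the known
   alignment in bytes; TARGET_LL64 says whether ldd/std are available.  */
static unsigned int
choose_piece (unsigned int align, int target_ll64)
{
  unsigned int piece = align;

  if (target_ll64 && piece >= 4)
    piece = 8;          /* Use 64-bit transfers from 32-bit alignment up.  */
  else if (piece > 4)
    piece = 4;          /* Without LL64, cap the piece at one word.  */
  return piece;
}
```

So choose_piece (4, 1) and choose_piece (8, 1) both give 8, choose_piece (8, 0)
gives 4, and alignments below 4 are left alone in either mode.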

> 
>   * config/arc/arc.md (*movdi_insn): Emit ldd/std instructions.
> 
> 
>  > -  "&& reload_completed && optimize"
>  > -  [(set (match_dup 2) (match_dup 3)) (set (match_dup 4) (match_dup 5))]
> > -  "arc_split_move (operands);"
>  > +  "reload_completed"
>  > +  [(match_dup 2)]
>  > +  "operands[2] = arc_split_move (operands);"
> 
> arc_split_move uses, inter alia, operands[2]..operands[5].
> Thus, it is not safe to stop mentioning these in the pattern.
> 
> > (*movdf_insn): Likewise.
> Likewise.
> 

Noted.

> When you say 'basic support', I suppose you have a plan to re-visit this later
> to get the register allocator to use register pairs, and stop regrename
> breaking them up?

Indeed, I am preparing a patch for (new) floating point support, on which I 
would very much like your feedback. The double-precision floating-point 
operations use the double registers. Hence, breaking up the register pairs 
and afterwards introducing moves to get them right is not desirable. Thus, I 
will introduce a new ABI variant which passes the arguments in even-odd 
register pairs, and change hard_regno_mode_ok to keep the registers in proper 
pairs. But more in the upcoming patch.

Re: [PATCH] DWARF: add abstract origin links on lexical blocks DIEs

2016-01-18 Thread Richard Biener

On Sun, Jan 17, 2016 at 9:09 PM, Eric Botcazou  wrote:
>> Sounds like a good excuse to add a guality for Ada (which has unique
>> needs for dwarf).
>
> Well, the guality testsuite is a pain to maintain so I'd rather not.
> The GDB testsuite is clearly the right place for this kind of testcases.

But that tests GDB and not GCC's generation of DWARF ... which means
take the other option of writing a scan-assembler testcase looking for the
previously missing DWARF.

It would be nice if we'd support dropping in gdb/testsuite into
testsuite/gdb/ or so and include that in testing (plus in the test_summary
report).

Richard.

> --
> Eric Botcazou

Re: [PATCH] DWARF: add abstract origin links on lexical blocks DIEs

2016-01-18 Thread Eric Botcazou

> But that tests GDB and not GCCs generation of DWARF ...

But GDB only consumes the DWARF generated by GCC, it cannot synthesize it. ;-)

> which means take the other option of writing a scan-assembler testcase
> looking for the previously missing DWARF.

Fine with me (either Ada or C as far as I'm concerned).

-- 
Eric Botcazou

Re: [PING][PATCH] Fix line number that is expected to generate an error.

2016-01-18 Thread Dominik Vogt

On Mon, Jan 11, 2016 at 12:28:40PM +0100, Dominik Vogt wrote:
> The attached patch fixes a test failure caused by expecting the
> error message for the wrong line.

Can this be committed?

> gcc/testsuite/ChangeLog
> 
>   * g++.dg/cpp0x/constexpr-reinterpret1.C: Fix line number that is
>   expected to generate an error.

> >From 7405de336d2e22e68d39fdfc2c1520ea0c3cc91a Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Mon, 11 Jan 2016 12:26:56 +0100
> Subject: [PATCH] Fix line number that is expected to generate an error.
> 
> ---
>  gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C 
> b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C
> index a5e3c1f1..0ea42a0 100644
> --- a/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C
> +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C
> @@ -15,10 +15,8 @@ public:
>};
>  
>constexpr static Inner & getInner()
> -  {
> -/* I am surprised this is considered a constexpr */
> -return *((Inner *)4);
> -  } // { dg-error "reinterpret_cast" "" }
> +  /* I am surprised this is considered a constexpr */
> +  { return *((Inner *)4); } // { dg-error "reinterpret_cast" "" }
>  };
>  
>  B B::instance;

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [PING][PATCH] Fix line number that is expected to generate an error.

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 10:53:51AM +0100, Dominik Vogt wrote:
> On Mon, Jan 11, 2016 at 12:28:40PM +0100, Dominik Vogt wrote:
> > The attached patch fixes a test failure caused by expecting the
> > error message for the wrong line.
> 
> Can this be committed?
> 
> > gcc/testsuite/ChangeLog
> > 

Missing
PR c++/68810

> > * g++.dg/cpp0x/constexpr-reinterpret1.C: Fix line number that is
> > expected to generate an error.

Ok with that change, I'm afraid we can't do better for GCC 6.

> > --- a/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C
> > +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret1.C
> > @@ -15,10 +15,8 @@ public:
> >};
> >  
> >constexpr static Inner & getInner()
> > -  {
> > -/* I am surprised this is considered a constexpr */
> > -return *((Inner *)4);
> > -  } // { dg-error "reinterpret_cast" "" }
> > +  /* I am surprised this is considered a constexpr */
> > +  { return *((Inner *)4); } // { dg-error "reinterpret_cast" "" }
> >  };
> >  
> >  B B::instance;

Jakub

RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2016-01-18 Thread Ajit Kumar Agarwal



-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: Saturday, January 16, 2016 12:03 PM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
representation

On 01/04/2016 07:32 AM, Ajit Kumar Agarwal wrote:
>
>
> -Original Message- From: Jeff Law [mailto:l...@redhat.com]
> Sent: Wednesday, December 23, 2015 12:06 PM To: Ajit Kumar Agarwal; 
> Richard Biener Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; 
> Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re:
> [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
> representation
>
> On 12/11/2015 02:11 AM, Ajit Kumar Agarwal wrote:
>>
>> Mibench/EEMBC benchmarks (Target Microblaze)
>>
>> Automotive_qsort1(4.03%), Office_ispell(4.29%), 
>> Office_stringsearch1(3.5%). Telecom_adpcm_d( 1.37%), 
>> ospfv2_lite(1.35%).
>>> I'm having a real tough time reproducing any of these results.
>>> In fact, I'm having a tough time seeing cases where path splitting 
>>> even applies to the Mibench/EEMBC benchmarks
>>> mentioned above.
>
>>> In the very few cases where split-paths might apply, the net 
>>> resulting assembly code I get is the same with and without 
>>> split-paths.
>
>>> How consistent are these results?
>
> I am consistently getting the gains for office_ispell and 
> office_stringsearch1, telcom_adpcm_d. I ran it again today and we see 
> gains in the same bench mark tests with the split path changes.
>
>>> What functions are being affected that in turn impact performance?
>
> For office_ispell: The function are Function "linit (linit, 
> funcdef_no=0, decl_uid=2535, cgraph_uid=0, symbol_order=2) for 
> lookup.c file". "Function checkfile (checkfile, funcdef_no=1, 
> decl_uid=2478, cgraph_uid=1, symbol_order=4)" " Function correct 
> (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)" 
> " Function askmode (askmode, funcdef_no=24, decl_uid=2464, 
> cgraph_uid=24, symbol_order=27)" for correct.c file.
>
> For office_stringsearch1: The function is Function "bmhi_search 
> (bmhi_search, funcdef_no=1, decl_uid=2178, cgraph_uid=1, 
> symbol_order=5)" for bmhisrch.c file.
>>Can you send me the pre-processed lookup.c, correct.c and bmhi_search.c?

>>I generated mine using x86 and that may be affecting my ability to reproduce 
>>your results on the microblaze target.  Looking specifically at bmhi_search.c 
>>and correct.c, I see they are going to be sensitive to the target headers, 
>>if (for example) they use FORTIFY_SOURCE or macros for toupper.

>>In the bmhi_search I'm looking at, I don't see any opportunities for the path 
>>splitter to do anything.  The CFG just doesn't have the right shape.  Again, 
>>that may be an artifact of how toupper is implemented in the system header 
>>files -- hence my request for the cpp output on each of the important files.

Would you like me to send the above files, pre-processed with the -E option?

Thanks & Regards
Ajit
Jeff

Re: [aarch64] Fix target/69176

2016-01-18 Thread Richard Earnshaw (lists)

> +(define_constraint "Upl"
> +  "A constraint that matches two uses of add instructions."

That's not a particularly helpful description for external users of the
compiler.  I think that either needs to be sufficiently precise that
people who understand the ISA but not the guts of GCC can use it, or it
should be marked @internal.

Otherwise OK.
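
For readers of the archive who know the ISA but not the expander: the
two-instruction add splits the immediate into a low part of at most 12 bits
and the remainder, using the same arithmetic as the stack pointer case in the
quoted expander.  A stand-alone sketch follows; the function name is made up,
and the exact range accepted by Upl is not restated here.

```c
/* Hypothetical helper: split immediate I so that I == *HIGH + *LOW,
   where *LOW has magnitude below 0x1000 and the same sign as I.
   Mirrors "s = (i >= 0 ? i & 0xfff : -(-i & 0xfff))" from the patch.  */
static void
split_add_immediate (long long i, long long *high, long long *low)
{
  long long s = i >= 0 ? (i & 0xfff) : -(-i & 0xfff);

  *high = i - s;
  *low = s;
}
```

For i = 0x123456 this gives 0x123000 and 0x456; for i = -0x123456 it gives
-0x123000 and -0x456.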

R.

On 15/01/16 21:36, Richard Henderson wrote:
> See the PR for details, but basically, the plus operations are special so you
> can't just split out one of the alternatives to a different pattern.
> 
> This merges the two-instruction add case back into the main plus pattern, and
> then adds peepholes and splitters to generate the same code as before.
> 
> Ok?
> 
> 
> r~
> 
> 
> d-69176
> 
> 
>   * config/aarch64/aarch64.md (add3): Move long immediate
>   operands to pseudo only if CSE is expected.  Split long immediate
>   operands only after reload, and for the stack pointer.
>   (*add3_pluslong): Remove.
>   (*addsi3_aarch64, *adddi3_aarch64): Merge into...
>   (*add3_aarch64): ... here.  Add r/rk/Upl alternative.
>   (*addsi3_aarch64_uxtw): Add r/rk/Upl alternative.
>   (*add3 peepholes): New.
>   (*add3 splitters): New.
>   * config/aarch64/constraints.md (Upl): New.
>   * config/aarch64/predicates.md (aarch64_pluslong_strict_immedate): New.
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index f6c8eb1..bde231b 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1590,96 +1590,120 @@
>  (plus:GPI (match_operand:GPI 1 "register_operand" "")
> (match_operand:GPI 2 "aarch64_pluslong_operand" "")))]
>""
> -  "
> -  if (!aarch64_plus_operand (operands[2], VOIDmode))
> +{
> +  if (aarch64_pluslong_strict_immedate (operands[2], mode))
>  {
> -  if (can_create_pseudo_p ())
> - {
> -   rtx tmp = gen_reg_rtx (mode);
> -   emit_move_insn (tmp, operands[2]);
> -   operands[2] = tmp;
> - }
> -  else
> +  /* Give CSE the opportunity to share this constant across additions.  */
> +  if (!cse_not_expected && can_create_pseudo_p ())
> +operands[2] = force_reg (mode, operands[2]);
> +
> +  /* Split will refuse to operate on a modification to the stack pointer.
> +  Aid the prologue and epilogue expanders by splitting this now.  */
> +  else if (reload_completed && operands[0] == stack_pointer_rtx)
>   {
> -   HOST_WIDE_INT imm = INTVAL (operands[2]);
> -   imm = imm >= 0 ? imm & 0xfff : -(-imm & 0xfff);
> -   emit_insn (gen_add3 (operands[0], operands[1],
> -  GEN_INT (INTVAL (operands[2]) - imm)));
> +   HOST_WIDE_INT i = INTVAL (operands[2]);
> +   HOST_WIDE_INT s = (i >= 0 ? i & 0xfff : -(-i & 0xfff));
> +   emit_insn (gen_rtx_SET (operands[0],
> +   gen_rtx_PLUS (mode, operands[1],
> + GEN_INT (i - s))));
> operands[1] = operands[0];
> -   operands[2] = GEN_INT (imm);
> +   operands[2] = GEN_INT (s);
>   }
>  }
> -  "
> -)
> -
> -;; Find add with a 2-instruction immediate and merge into 2 add instructions.
> -
> -(define_insn_and_split "*add3_pluslong"
> -  [(set
> -(match_operand:GPI 0 "register_operand" "=r")
> -(plus:GPI (match_operand:GPI 1 "register_operand" "r")
> -   (match_operand:GPI 2 "aarch64_pluslong_immediate" "i")))]
> -  "!aarch64_plus_operand (operands[2], VOIDmode)
> -   && !aarch64_move_imm (INTVAL (operands[2]), mode)"
> -  "#"
> -  "&& true"
> -  [(set (match_dup 0) (plus:GPI (match_dup 1) (match_dup 3)))
> -   (set (match_dup 0) (plus:GPI (match_dup 0) (match_dup 4)))]
> -  "
> -{
> -  HOST_WIDE_INT imm = INTVAL (operands[2]);
> -  imm = imm >= 0 ? imm & 0xfff : -(-imm & 0xfff);
> -  operands[3] = GEN_INT (INTVAL (operands[2]) - imm);
> -  operands[4] = GEN_INT (imm);
> -}
> -  "
> -)
> +})
>  
> -(define_insn "*addsi3_aarch64"
> +(define_insn "*add3_aarch64"
>[(set
> -(match_operand:SI 0 "register_operand" "=rk,rk,w,rk")
> -(plus:SI
> - (match_operand:SI 1 "register_operand" "%rk,rk,w,rk")
> - (match_operand:SI 2 "aarch64_plus_operand" "I,r,w,J")))]
> +(match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r")
> +(plus:GPI
> + (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk")
> + (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Upl")))]
>""
>"@
> -  add\\t%w0, %w1, %2
> -  add\\t%w0, %w1, %w2
> -  add\\t%0.2s, %1.2s, %2.2s
> -  sub\\t%w0, %w1, #%n2"
> -  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm")
> -   (set_attr "simd" "*,*,yes,*")]
> +  add\\t%0, %1, %2
> +  add\\t%0, %1, %2
> +  add\\t%0, %1, %2
> +  sub\\t%0, %1, #%n2
> +  #"
> +  [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple")
> +   (set_attr "simd" "*,*,yes,*,*")]
>  )
>  
>  ;; zero_extend version of above
>

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955)

2016-01-18 Thread Eric Botcazou

> The following testcase is miscompiled on i686-linux at -O3.
> The bug is in DSE record_store, which for group_id < 0 uses mem_addr
> set to result of get_addr (base->val_rtx) (plus optional offset),
> which is fine for canon_true_dependence with other MEMs in that function,
> but we also store that address in store_info.  The problem is if later on
> e.g. some read uses the same e.g. hard register as get_addr returned, but
> that register contains at that later point a different value.
> canon_true_dependence then happily returns the read does not alias the
> store, although it might.
> The fix is to store the VALUE (plus optional offset) into
> store_info->mem_addr instead, then at some later insn when get_addr is
> called on it it will either return the same register or expression (if it
> has not changed), or some different one otherwise.

I presume that the origin of the bug is:

  /* get_addr can only handle VALUE but cannot handle expr like:
 VALUE + OFFSET, so call get_addr to get original addr for
 mem_addr before plus_constant.  */
  mem_addr = get_addr (mem_addr);
  if (offset)
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);

both in record_store and check_mem_read_rtx, so I wonder if we shouldn't bite 
the bullet and try enhancing get_addr since it's a mainline-only regression.

-- 
Eric Botcazou

Re: Thoughts on memcmp expansion (PR43052)

2016-01-18 Thread Richard Biener

On Fri, Jan 15, 2016 at 5:58 PM, Bernd Schmidt  wrote:
> PR43052 is a PR complaining about how the rep cmpsb expansion that gcc uses
> for memcmp is slower than the library function. As is so often the case, if
> you investigate a bit, you can find a lot of issues with the current
> situation in the compiler.
>
> This PR was accidentally fixed by a patch by Nick which disabled the use of
> cmpstrnsi for memcmp expansion, on the grounds that cmpstrnsi could stop
> looking after seeing a null byte, which would be invalid for memcmp, so only
> cmpmemsi should be used. This fix was for an out-of-tree target.
>
> I believe the rep cmpsb sequence used by i386 would actually be valid, so we
> could duplicate the cmpstrn pattern to also match cmpmem and be done - but
> that would then again cause the performance problem described in the PR, so
> it's probably not a good idea.
>
> One question Richard posed in the comments: why aren't we optimizing small
> constant size memcmps other than size 1 to *s == *q? The reason is the
> return value of memcmp, which implies byte-sized operation (incidentally,
> the use of SImode in the cmpmem/cmpstr patterns is really odd). It's
> possible to work around this, but expansion becomes a little more tricky
> (subtract after bswap, maybe). Still, the current code generation is lame.
>
> So, for gcc-6, I think we shouldn't do anything. The PR is fixed, and
> there's no easy bug-fix that can be done to improve matters. Not sure
> whether to keep the PR open or create a new one for the remaining issues.
> For the next stage1, I'm attaching a proof-of-concept patch that does the
> following:
>  * notice if memcmp results are only used for equality comparison
>against zero
>  * if so, replace with a different builtin __memcmp_eq
>  * Expand __memcmp_eq for small constant sizes with loads and
>comparison, fall back to a memcmp call.
>
> The whole thing could be extended to work for sizes larger than an int,
> along the lines of memcpy expansion controlled by move ratio etc. Thoughts?

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the inline
expansion for small sizes and equality compares should be done on GIMPLE.
Today the strlen pass might be an appropriate place to do this given its
superior knowledge about string lengths.

The idea of turning eq feeding memcmp into a special memcmp_eq is good but
you have to avoid doing that too early - otherwise you'd lose on

  res = memcmp (p, q, sz);
  if (memcmp (p, q, sz) == 0)
   ...

that is, you have to make sure CSE got the chance to common the two calls.
This is why I think this kind of transform needs to happen in specific places
(like during strlen opt) rather than in generic folding.
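
To make the equality-only case concrete, here is a minimal sketch of what such
an expansion amounts to for a constant size of four bytes.  The name equal4 is
hypothetical and this is plain C, not the proposed builtin.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical equality-only comparison of four bytes: nonzero if equal.
   The memcpy calls are the portable way to express unaligned loads and
   are normally folded into single loads by the compiler.  */
static int
equal4 (const void *p, const void *q)
{
  uint32_t a, b;

  memcpy (&a, p, sizeof a);
  memcpy (&b, q, sizeof b);
  return a == b;
}
```

This only works because the result is used for equality.  The sign of a real
memcmp result depends on the first differing byte, which a plain word compare
does not give you on a little-endian host.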

Richard.

>
> Bernd

Fix PR ada/69219

2016-01-18 Thread Eric Botcazou

This is a spurious error on nested subprograms with pragma Inline_Always and 
Intrinsic, which comes from a thinko in check_inlining_for_nested_subprog, so 
it's a regression present on the mainline and 5 branch.

Tested on x86_64-suse-linux, applied on the mainline and 5 branch.


2016-01-18  Eric Botcazou  

PR ada/69219
* gcc-interface/trans.c (check_inlining_for_nested_subprog): Consider
the parent function instead of the current function in order to issue
the warning or the error.  Add guard for ignored functions.


2016-01-18  Eric Botcazou  

* gnat.dg/inline12.adb: New test.


-- 
Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 232465)
+++ gcc-interface/trans.c	(working copy)
@@ -1487,7 +1487,7 @@ Pragma_to_gnu (Node_Id gnat_node)
 }
 
 
-/* Check the inlining status of nested function FNDECL in the current context.
+/* Check the inline status of nested function FNDECL wrt its parent function.
 
If a non-inline nested function is referenced from an inline external
function, we cannot honor both requests at the same time without cloning
@@ -1495,24 +1495,27 @@ Pragma_to_gnu (Node_Id gnat_node)
We could inline it as well but it's probably better to err on the side
of too little inlining.
 
-   This must be invoked only on nested functions present in the source code
+   This must be done only on nested functions present in the source code
and not on nested functions generated by the compiler, e.g. finalizers,
-   because they are not marked inline and we don't want them to block the
-   inlining of the parent function.  */
+   because they may be not marked inline and we don't want them to block
+   the inlining of the parent function.  */
 
 static void
 check_inlining_for_nested_subprog (tree fndecl)
 {
-  if (!DECL_DECLARED_INLINE_P (fndecl)
-  && current_function_decl
-  && DECL_EXTERNAL (current_function_decl)
-  && DECL_DECLARED_INLINE_P (current_function_decl))
+  if (DECL_IGNORED_P (current_function_decl) || DECL_IGNORED_P (fndecl))
+return;
+
+  if (DECL_DECLARED_INLINE_P (fndecl))
+return;
+
+  tree parent_decl = decl_function_context (fndecl);
+  if (DECL_EXTERNAL (parent_decl) && DECL_DECLARED_INLINE_P (parent_decl))
 {
   const location_t loc1 = DECL_SOURCE_LOCATION (fndecl);
-  const location_t loc2 = DECL_SOURCE_LOCATION (current_function_decl);
+  const location_t loc2 = DECL_SOURCE_LOCATION (parent_decl);
 
-  if (lookup_attribute ("always_inline",
-			DECL_ATTRIBUTES (current_function_decl)))
+  if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (parent_decl)))
 	{
 	  error_at (loc1, "subprogram %q+F not marked Inline_Always", fndecl);
 	  error_at (loc2, "parent subprogram cannot be inlined");
@@ -1524,8 +1527,8 @@ check_inlining_for_nested_subprog (tree
 	  warning_at (loc2, OPT_Winline, "parent subprogram cannot be inlined");
 	}
 
-  DECL_DECLARED_INLINE_P (current_function_decl) = 0;
-  DECL_UNINLINABLE (current_function_decl) = 1;
+  DECL_DECLARED_INLINE_P (parent_decl) = 0;
+  DECL_UNINLINABLE (parent_decl) = 1;
 }
 }
 
-- PR ada/69219
-- Testcae by yuta tomino  */

-- { dg-do compile }

procedure Inline12 is

   procedure NI;

   procedure IA;
   pragma Convention (Intrinsic, IA);
   pragma Inline_Always (IA);

   procedure IA is
   begin
  NI;
   end;

   procedure NI is null;

begin
  IA;
end;
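
The decision made by the patched check_inlining_for_nested_subprog can be modelled outside the compiler. The following is only a sketch: FnFlags, Diagnostic and check_nested are hypothetical names that stand in for the tree flags and the error_at/warning_at calls, and the real function works on GCC trees rather than on a plain struct.

```cpp
// Hypothetical stand-in for the FUNCTION_DECL flags the patch tests.
struct FnFlags {
  bool ignored = false;          // models DECL_IGNORED_P
  bool declared_inline = false;  // models DECL_DECLARED_INLINE_P
  bool external = false;         // models DECL_EXTERNAL
  bool always_inline = false;    // models the "always_inline" attribute
  bool uninlinable = false;      // models DECL_UNINLINABLE
};

enum class Diagnostic { None, Warning, Error };

// Mirrors the control flow of the patched function: bail out early for
// ignored or inline nested functions, otherwise diagnose the parent (not
// the current function) and stop it from being inlined.
Diagnostic check_nested(const FnFlags &current, const FnFlags &nested,
                        FnFlags &parent) {
  if (current.ignored || nested.ignored)
    return Diagnostic::None;
  if (nested.declared_inline)
    return Diagnostic::None;
  if (!(parent.external && parent.declared_inline))
    return Diagnostic::None;

  const Diagnostic result =
      parent.always_inline ? Diagnostic::Error : Diagnostic::Warning;
  parent.declared_inline = false;
  parent.uninlinable = true;
  return result;
}
```

In this model an Inline_Always parent with a non-inline nested subprogram yields an error, a plain Inline parent yields a warning, and in both cases the parent is marked as not inlinable.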

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955)

2016-01-18 Thread Eric Botcazou

> So, do you suggest to tweak get_addr like the patch below, and remove the
>   mem_addr = get_addr (mem_addr);
> line above and the comment?

Yes, exactly.  And if that doesn't easily work, then go for your solution and 
add a blurb to the comment explaining why get_addr cannot be easily changed.

-- 
Eric Botcazou

[PATCH PR66796]Obvious, revise check condition in test case

2016-01-18 Thread Bin Cheng

Hi,
It turns out that the number of iv_uses is still too large for the check on the 
hppa target, which only supports small offsets in the REG+offset addressing mode 
for floating-point loads/stores.  Even with this restriction, the grouped version 
is better than before, so I am going to relax the check condition further for it.
Tested on HPPA.  Applied as an obvious change.

Thanks,
bin

gcc/testsuite/ChangeLog
2016-01-18  Bin Cheng  

PR tree-optimization/66797
* gcc.c-torture/execute/pr65447.c: Relax check condition.

Re: [PATCH] DWARF: add abstract origin links on lexical blocks DIEs

2016-01-18 Thread Pierre-Marie de Rodat


On 01/18/2016 10:45 AM, Eric Botcazou wrote:

which means take the other option of writing a scan-assembler testcase
looking for the previously missing DWARF.


Fine with me (either Ada or C as far as I'm concerned).


Thank you for your input! I’m going to try that, then. I hope this test 
will not be too fragile…


--
Pierre-Marie de Rodat

Re: [PATCH] S/390: Reduce accuracy of bessel_6.f90.

2016-01-18 Thread Dominik Vogt

On Mon, Jan 11, 2016 at 03:40:56PM +0100, Dominik Vogt wrote:
> Another patch reducing the accuracy required in the bessel_6 test.

Can this be committed?

> gcc/testsuite/ChangeLog
> 
>   * gfortran.dg/bessel_6.f90: Reduce accuracy for S/390.

> From 70a35dd6f6bf906d8e5907667ad0f04f981a61ac Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Mon, 11 Jan 2016 15:36:38 +0100
> Subject: [PATCH] S/390: Reduce accuracy of bessel_6.f90.
> 
> ---
>  gcc/testsuite/gfortran.dg/bessel_6.f90 | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gfortran.dg/bessel_6.f90 b/gcc/testsuite/gfortran.dg/bessel_6.f90
> index e0220f7..da917ff 100644
> --- a/gcc/testsuite/gfortran.dg/bessel_6.f90
> +++ b/gcc/testsuite/gfortran.dg/bessel_6.f90
> @@ -12,7 +12,7 @@
>  implicit none
>  real,parameter :: values(*) = [0.0, 0.5, 1.0, 0.9, 1.8,2.0,3.0,4.0,4.25,8.0,34.53, 475.78] 
>  real,parameter :: myeps(size(values)) = epsilon(0.0) &
> -  * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 92, 15 ]
> +  * [2, 7, 5, 6, 9, 12, 12, 7, 7, 8, 98, 15 ]
>  ! The following is sufficient for me - the values above are a bit
>  ! more tolerant
>  !  * [0, 5, 3, 4, 6, 7, 7, 5, 5, 6, 66, 4 ]

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

Re: [PATCH 0/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-01-18 Thread Eric Botcazou

> Similarly to ARM, I note that Ada is affected. Indeed, with a gcc 4.9 host
> compiler, I saw a bootstrap miscompare iff including Ada; however, I was
> able to bootstrap Ada successfully, if I first built a GCC including this
> patch with --disable-bootstrap, and then used that as host compiler. The
> best explanation I can see for this is mismatched host vs built libraries
> and compiler being used together, something like Jakub's suggestion
> http://gcc.gnu.org/ml/gcc-patches/2015-11/msg00338.html. I don't feel I have
> the expertise for this, and am CCing the Ada maintainers in the hope they
> can help.

That's a bit weird though because this should have also occurred for ARM when 
the ABI was broken the same way if the Ada bootstrap is not entirely correct.
Now, as far as I know, this didn't occur for ARM during bootstrap but only during 
testing with make -k check.  Or else could this be a parallel compilation bug?

Could you post the list of files that differ?  How do they differ exactly?

-- 
Eric Botcazou

[PATCH] Add fopt-info-oacc

2016-01-18 Thread Tom de Vries


Hi,

This patch introduces an option fopt-info-oacc.

When using the option like this with a kernels region in kernels-loop.c 
that parloops does not manage to parallelize:

...
$ gcc kernels-loop.c -S -O2 -fopenacc -fopt-info-oacc-all
...

we get a message:
...
kernels-loop.c:23:9: note: kernels region executed sequentially. 
Consider mapping it to host execution, to avoid data copy penalty.

...

Any comments?

Thanks,
- Tom
Add fopt-info-oacc

---
 gcc/dumpfile.c |  1 +
 gcc/dumpfile.h |  5 +++--
 gcc/omp-low.c  | 30 +-
 3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 144e371..e8aa0e1 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -137,6 +137,7 @@ static const struct dump_option_value_info optgroup_options[] =
   {"loop", OPTGROUP_LOOP},
   {"inline", OPTGROUP_INLINE},
   {"vec", OPTGROUP_VEC},
+  {"oacc", OPTGROUP_OACC},
   {"optall", OPTGROUP_ALL},
   {NULL, 0}
 };
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index c168cbf..6e1c657 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -97,9 +97,10 @@ enum tree_dump_index
 #define OPTGROUP_LOOP    (1 << 2)   /* Loop optimization passes */
 #define OPTGROUP_INLINE  (1 << 3)   /* Inlining passes */
 #define OPTGROUP_VEC     (1 << 4)   /* Vectorization passes */
-#define OPTGROUP_OTHER   (1 << 5)   /* All other passes */
+#define OPTGROUP_OACC    (1 << 5)   /* Openacc passes */
+#define OPTGROUP_OTHER   (1 << 6)   /* All other passes */
 #define OPTGROUP_ALL	 (OPTGROUP_IPA | OPTGROUP_LOOP | OPTGROUP_INLINE \
-  | OPTGROUP_VEC | OPTGROUP_OTHER)
+  | OPTGROUP_VEC | OPTGROUP_OACC | OPTGROUP_OTHER)
 
 /* Define a tree dump switch.  */
 struct dump_file_info
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index a6e3fe3..d5c3484 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -20139,6 +20139,34 @@ execute_oacc_device_lower ()
 	 : fn_level < 0 ? "Function is parallel offload\n"
 	 : "Function is routine level %d\n", fn_level);
 
+#if defined ACCEL_COMPILER
+  bool is_kernels = oacc_fn_attrib_kernels_p (attrs);
+  if (is_kernels)
+{
+  bool all_one = true;
+  tree pos = TREE_VALUE (attrs);
+  for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
+	{
+	  tree tree_val = TREE_VALUE (pos);
+	  unsigned HOST_WIDE_INT val = (tree_val
+	? TREE_INT_CST_LOW (tree_val)
+	: 1);
+	  if (val != 1)
+	{
+	  all_one = false;
+	  break;
+	}
+	  pos = TREE_CHAIN (pos);
+	}
+
+  if (all_one)
+	dump_printf_loc (MSG_MISSED_OPTIMIZATION, cfun->function_start_locus,
+			 "Kernels region executed sequentially.  Consider"
+			 " mapping it to host execution, to avoid data copy"
+			 " penalty.\n");
+}
+#endif
+
   unsigned outer_mask = fn_level >= 0 ? GOMP_DIM_MASK (fn_level) - 1 : 0;
   unsigned used_mask = oacc_loop_partition (loops, outer_mask);
   int dims[GOMP_DIM_MAX];
@@ -20312,7 +20340,7 @@ const pass_data pass_data_oacc_device_lower =
 {
   GIMPLE_PASS, /* type */
   "oaccdevlow", /* name */
-  OPTGROUP_NONE, /* optinfo_flags */
+  OPTGROUP_OACC, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_cfg, /* properties_required */
   0 /* Possibly PROP_gimple_eomp.  */, /* properties_provided */
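
The test that decides whether the note is emitted can be illustrated on its own. This is a sketch under stated assumptions, not the omp-low.c code: kDimMax stands in for GOMP_DIM_MAX (assumed here to be 3, for gang, worker and vector), Dims and runs_sequentially are hypothetical names, and an unset entry models a null TREE_VALUE, which the patch treats as 1. It needs C++17 for std::optional.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Assumption: mirrors GOMP_DIM_MAX (gang, worker, vector).
constexpr std::size_t kDimMax = 3;

// One entry per partitioning dimension; an empty optional models a
// dimension with no recorded value.
using Dims = std::array<std::optional<unsigned long>, kDimMax>;

// Returns true when every dimension is 1 (or unset), i.e. the kernels
// region would run on a single gang, worker and vector lane.
bool runs_sequentially(const Dims &dims) {
  for (const auto &dim : dims)
    if (dim.value_or(1) != 1)
      return false;
  return true;
}
```

A region for which runs_sequentially returns true corresponds to the case where the patch prints the "executed sequentially" note.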

Re: genattrab.c generate switch

2016-01-18 Thread Manuel López-Ibáñez


On 18/01/16 14:39, Jesper Broge Jørgensen wrote:

No, I have not gone through copyright assignment.
This is my first time trying to contribute to a GNU project, so I have tried
following "Contributing to GCC" at
https://gcc.gnu.org/contribute.html
There I followed the advice to run the patch through contrib/check_GNU_style.sh
and it came out clean. Maybe contrib/check_GNU_style.sh does not check for
indentation rules and/or my editor is set up wrongly, so it looked to me like I
was following the coding standard.


Hi Jesper,

Unfortunately, https://gcc.gnu.org/contribute.html is quite hard to follow and 
outdated. I would suggest starting here: 
https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps


From there, you'll get to https://gcc.gnu.org/wiki/FormattingCodeForGCC

If you know how to improve those pages, for example extending them to other 
editors, I can give you write access.


Cheers,

Manuel.

Re: C++ PATCH for c++/68586 (rejects-valid with enum in C++11)

2016-01-18 Thread Jason Merrill


On 01/18/2016 11:57 AM, Marek Polacek wrote:

On Mon, Jan 18, 2016 at 10:04:12AM -0500, Jason Merrill wrote:

This wouldn't cover cases where this change affects the type or value of
more complicated expressions, so my preference would be to clear the caches
when we finish_enum_value_list.


So like this?


Yes, and let's also clear the fold_cache at the same time.

Jason

RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2016-01-18 Thread Ajit Kumar Agarwal

-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com] 
Sent: Saturday, January 16, 2016 4:33 AM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
representation

On 01/14/2016 01:55 AM, Jeff Law wrote:
[ Replying to myself again, mostly to make sure we've got these thoughts in the 
archives. ]
>
> Anyway, going back to adpcm_decode, we do end up splitting this path:
>
>   # vpdiff_12 = PHI 
>if (sign_41 != 0)
>  goto ;
>else
>  goto ;
> ;;succ:   15
> ;;16
>
> ;;   basic block 15, loop depth 1
> ;;pred:   14
>valpred_51 = valpred_76 - vpdiff_12;
>goto ;
> ;;succ:   17
>
> ;;   basic block 16, loop depth 1
> ;;pred:   14
>valpred_52 = vpdiff_12 + valpred_76;
> ;;succ:   17
>
> ;;   basic block 17, loop depth 1
> ;;pred:   15
> ;;16
># valpred_7 = PHI 
>_85 = MAX_EXPR ;
>valpred_13 = MIN_EXPR <_85, 32767>;
>step_53 = stepsizeTable[index_62];
>outp_54 = outp_69 + 2;
>_55 = (short int) valpred_13;
>MEM[base: outp_54, offset: -2B] = _55;
>if (outp_54 != _74)
>  goto ;
>else
>  goto ;
>
> This doesn't result in anything particularly interesting/good AFAICT. 
> We propagate valpred_51/52 into the use in the MAX_EXPR in the 
> duplicate paths, but that doesn't allow any further simplification.
>> So with the heuristic I'm poking at, this gets rejected.  Essentially it
>> doesn't think it's likely to expose CSE/DCE opportunities (and it's correct).
>> The number of statements in predecessor blocks that feed operands in the
>> to-be-copied-block is too small relative to the size of the
>> to-be-copied-block.

>
> Ajit, can you confirm which of adpcm_code or adpcm_decode is where path 
> splitting is showing a gain?  I suspect it's the former but would like 
> to make sure so that I can adjust the heuristics properly.
>> I'd still like to have this answered when you can Ajit, just to be 100%
>> sure that it's the path splitting in adpcm_code that's responsible for the
>> improvements you're seeing in adpcm.

adpcm_coder is optimized by path splitting, whereas adpcm_decoder is not 
optimized any further. In adpcm_decoder the join node is duplicated into its 
predecessors, but the duplication does not enable any further optimization.

In adpcm_coder, path splitting triggers the following optimization.

1. /* Output last step, if needed */
if ( !bufferstep )
  *outp++ = outputbuffer;

 The IF-THEN inside the loop is taken when bufferstep is 1.  Then the flip 
happens and bufferstep becomes 0. On the exit branch, if bufferstep is 1 the 
flip converts it to 0, and the IF-THEN above generates the store that assigns 
outputbuffer to outp.

With path splitting, if bufferstep is 1 the exit branch of the loop branches 
directly to the above store. This does not require flipping bufferstep with an 
xor with immediate 1. With this optimization there is only one level of exit 
branch on the bufferstep == 1 path, so the exit branch to the store can be 
scheduled with a meaningful instruction instead of the xor with immediate 1.

Without path splitting, if bufferstep is 1 the exit branch of the loop branches 
to a block that flips it to zero, and the IF-THEN above, outside the loop, then 
does the store that assigns outputbuffer to outp. Thus without path splitting 
there are two levels of branching on the exit path where bufferstep is 1 inside 
the loop, which generates non-optimized code. Also, without path splitting, the 
two-level exit branch of the loop is scheduled with the xor with immediate 1.
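
For readers without the benchmark source at hand, the output stage under discussion can be sketched as below. The shape of the loop body is an assumption based on the names used in this thread (bufferstep, outputbuffer, outp); pack_nibbles is a hypothetical stand-alone function, and the delta values are simply passed in rather than computed by the coder.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Packs 4-bit codes two per byte. bufferstep toggles on every iteration:
// when set, the code is parked in the high nibble of outputbuffer; when
// clear, the low nibble is merged in and the byte is stored.
std::vector<std::uint8_t> pack_nibbles(const std::vector<int> &deltas) {
  std::vector<std::uint8_t> out;
  int outputbuffer = 0;
  bool bufferstep = true;

  for (int delta : deltas) {
    assert(delta >= 0 && delta <= 15);  // each code must fit in a nibble
    if (bufferstep)
      outputbuffer = (delta << 4) & 0xf0;
    else
      out.push_back(static_cast<std::uint8_t>((delta & 0x0f) | outputbuffer));
    bufferstep = !bufferstep;  // the flip discussed above
  }

  // Output last step, if needed: a pending high nibble is flushed.
  if (!bufferstep)
    out.push_back(static_cast<std::uint8_t>(outputbuffer));
  return out;
}
```

With an odd number of codes the final store after the loop is taken, which is the exit path that path splitting lets the loop branch to directly.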

 Thanks & Regards
Ajit

jeff

Re: [PATCH] Add fopt-info-oacc

2016-01-18 Thread Sandra Loosemore


On 01/18/2016 10:26 AM, Tom de Vries wrote:

Hi,

This patch introduces an option fopt-info-oacc.

When using the option like this with a kernels region in kernels-loop.c
that parloops does not manage to parallelize:
...
$ gcc kernels-loop.c -S -O2 -fopenacc -fopt-info-oacc-all
...

we get a message:
...
kernels-loop.c:23:9: note: kernels region executed sequentially.
Consider mapping it to host execution, to avoid data copy penalty.
...

Any comments?


Needs documentation?

-Sandra

Re: [PING] genattrab.c generate switch

2016-01-18 Thread Jeff Law


On 01/18/2016 07:09 AM, Jesper Broge Jørgensen wrote:

Ping patch:

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00784.html

I'd put it in my gcc-7 queue.  But if Richard, Bernd, Richi or someone 
else wants to work through the changes as a bugfix for bootstrapping on 
platforms with crippled compilers, I won't object.


jeff

Re: [PATCH] PR testsuite/69181: ensure expected multiline outputs is cleared per-test (v2)

2016-01-18 Thread Mike Stump

On Jan 18, 2016, at 8:14 AM, David Malcolm  wrote:
> I assumed that these differences were unintentional, so the patch
> consolidates things to make the cleanup identical between (A) and (B).

I also think this is the right path forward.

Re: [PATCH PR66796]Obvious, revise check condition in test case

2016-01-18 Thread Bin.Cheng

On Mon, Jan 18, 2016 at 9:28 AM, Bin Cheng  wrote:
> Hi,
> Turns out the check on number of iv_uses is still too large on target hppa.  
> It only supports small offset in REG+offset addressing mode for floating 
> point load/store.  Even with this restriction, the grouped version is better 
> than before, so I am going to further relax the check condition for it.
> Test run on HPPA.  Applied as an obvious change.
>
> Thanks,
> bin
>
> gcc/testsuite/ChangeLog
> 2016-01-18  Bin Cheng  
>
> PR tree-optimization/66797
> * gcc.c-torture/execute/pr65447.c: Relax check condition.
>
Hmm, the patch is at revision 232497.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr65447.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr65447.c(revision 232496)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr65447.c(revision 232497)
@@ -50,4 +50,4 @@
 }

 /* We should groups address type IV uses.  */
-/* { dg-final { scan-tree-dump-not "\\nuse 5\\n" "ivopts" } }  */
+/* { dg-final { scan-tree-dump-not "\\nuse 21\\n" "ivopts" } }  */
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog(revision 232496)
+++ gcc/testsuite/ChangeLog(revision 232497)
@@ -1,3 +1,8 @@
+2016-01-18  Bin Cheng  
+
+PR tree-optimization/66797
+* gcc.c-torture/execute/pr65447.c: Relax check condition.
+
 2016-01-18  Richard Biener  

 PR tree-optimization/69170


Thanks,
bin

Re: [Patch, fortran] (4/5-regression) PR61831 side-effect deallocation of variable components)

2016-01-18 Thread Dominique d'Humières

Dear Paul,

Sorry for the late feedback. There is a missing right brace in 
gfortran.dg/derived_constructor_comps_6.f90. This is fixed by the obvious patch:

--- ../5_clean/gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90	2016-01-17 19:27:04.0 +0100
+++ gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90	2016-01-18 03:02:17.0 +0100
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-additional-options "-fdump-tree-original"
+! { dg-additional-options "-fdump-tree-original" }
 !
 ! PR fortran/61831
 ! The deallocation of components of array constructor elements

However with this patch the test fails with -m32 and -O0/O1.

TIA

Dominique

> On 17 Jan 2016, at 18:37, Paul Richard Thomas wrote:
> 
> Dear Andre,
> 
> Thanks for the very useful discussion. It cleared away one or two cobwebs!
> 
> Committed as revision 232482.
> 
> Now to see if it will apply to the 4.9 branch. This might be a step too
> far, since all sorts of other prerequisites are not there. If it
> doesn't go well, I will close the PR as a WONTFIX on 4.9.
> 
> Cheers
> 
> Paul

[patch] libstdc++/69293 Fix construction of std::function from null pointer-to-member

2016-01-18 Thread Jonathan Wakely


The wrong overload of _M_not_empty_function gets chosen and we treat a
null pointer-to-member as a valid target.

Tested powerpc64le-linux, committed to trunk.


commit c0f055172fb4ceda0257a1a4ccd5f244609a0f37
Author: Jonathan Wakely 
Date:   Mon Jan 18 11:25:43 2016 +

Fix construction of std::function from null pointer-to-member

	PR libstdc++/69293
	* include/std/functional (_Function_base::_M_not_empty_function):
	Change overloads for pointers to take arguments by value.
	* testsuite/20_util/function/cons/57465.cc: Add tests for
	pointer-to-member cases.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 557156a..9799410 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1633,13 +1633,13 @@ _GLIBCXX_MEM_FN_TRAITS(&&, false_type, true_type)
 
 	template<typename _Tp>
 	  static bool
-	  _M_not_empty_function(_Tp* const& __fp)
-	  { return __fp; }
+	  _M_not_empty_function(_Tp* __fp)
+	  { return __fp != nullptr; }
 
 	template
 	  static bool
-	  _M_not_empty_function(_Tp _Class::* const& __mp)
-	  { return __mp; }
+	  _M_not_empty_function(_Tp _Class::* __mp)
+	  { return __mp != nullptr; }
 
 	template
 	  static bool
diff --git a/libstdc++-v3/testsuite/20_util/function/cons/57465.cc b/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
index be2d132..7b13d4b 100644
--- a/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
+++ b/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
@@ -15,17 +15,33 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// libstdc++/57465
-
 // { dg-options "-std=gnu++11" }
 
 #include <functional>
 #include <testsuite_hooks.h>
 
-int main()
+void test01()
 {
   using F = void();
   F* f = nullptr;
   std::function<F> x(f);
-  VERIFY( !x );
+  VERIFY( !x ); // libstdc++/57465
+}
+
+void test02()
+{
+  struct X { };
+  int (X::*mf)() = nullptr;
+  std::function<int(X&)> f = mf;
+  VERIFY( !f ); // libstdc++/69243
+
+  int X::*mp = nullptr;
+  f = mp;
+  VERIFY( !f );
+}
+
+int main()
+{
+  test01();
+  test02();
 }
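
The user-visible behaviour that the fix restores can be checked with a small stand-alone program. This is a sketch rather than the testsuite file: the struct X and the three helper functions are made up for illustration, and the signature int(X&) is an assumption chosen so that both kinds of member pointer are callable through the same std::function type.

```cpp
#include <functional>

struct X {
  int field = 7;
  int value() { return 42; }
};

// A std::function built from a null pointer to member function must be empty.
bool empty_from_null_member_function() {
  int (X::*mf)() = nullptr;
  std::function<int(X &)> f = mf;
  return !f;
}

// The same holds for a null pointer to data member.
bool empty_from_null_data_member() {
  int X::*mp = nullptr;
  std::function<int(X &)> f = mp;
  return !f;
}

// A real member pointer still produces a callable target.
bool callable_from_real_member() {
  std::function<int(X &)> f = &X::value;
  X x;
  return f && f(x) == 42;
}
```

All three helpers should return true with a library that contains the fix.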

Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 11:38:37AM +, Kyrill Tkachov wrote:
> On 18/01/16 11:31, Andreas Schwab wrote:
> >Jeff Law  writes:
> >
> >>commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
> >>Author: Jeff Law 
> >>Date:   Fri Jan 15 17:15:24 2016 -0500
> >>
> >>PR tree-optimization/69270
> >>* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
> >>tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
> >>* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
> >>* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
> >>* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
> >>ssa_name_has_boolean_range and constant_boolean_node.
> >>PR tree-optimization/69270
> >>* gcc.dg/tree-ssa/pr69270-2.c: New test.
> >>* gcc.dg/tree-ssa/pr69270-3.c: New test.
> >This breaks gcc.target/aarch64/tst_3.c.
> >
> > //.tune generic
> > .type   f1, %function
> >  f1:
> >-tst x0, 1
> >-csinc   w0, w0, wzr, eq
> >+ands    w1, w0, 1
> >+csel    w0, w1, w0, ne
> > ret
> > .size   f1, .-f1
> 
> The two sequences look equally valid to me.
> Instead of doing an and-compare followed by a conditional increment
> we do an and-compare followed by a conditional select (without discarding
> the result of the and).
> So the testcase should be adjusted.
> I'll do it.

IMHO please wait for the resolution of PR69320 here.

Jakub

Re: [PATCH PR68542]

2016-01-18 Thread Richard Biener

On Mon, Jan 11, 2016 at 11:06 AM, Yuri Rumyantsev  wrote:
> Hi Richard,
>
> Did you have any chance to look at updated patch?

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index acbb70b..208a752 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -5771,6 +5771,10 @@ register_edge_assert_for (tree name, edge e,
gimple_stmt_iterator si,
_code, ))
 return;

+  /* VRP doesn't track ranges for vector types.  */
+  if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE)
+return;
+

please instead fix extract_code_and_val_from_cond_with_ops with

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 232506)
+++ gcc/tree-vrp.c  (working copy)
@@ -5067,8 +5067,9 @@ extract_code_and_val_from_cond_with_ops
   if (invert)
 comp_code = invert_tree_comparison (comp_code, 0);

-  /* VRP does not handle float types.  */
-  if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (val)))
+  /* VRP only handles integral and pointer types.  */
+  if (! INTEGRAL_TYPE_P (TREE_TYPE (val))
+  && ! POINTER_TYPE_P (TREE_TYPE (val)))
 return false;

   /* Do not register always-false predicates.

Ok with that change.

Thanks,
Richard.

> Thanks.
> Yuri.
>
> 2015-12-18 13:20 GMT+03:00 Yuri Rumyantsev :
>> Hi Richard,
>>
>> Here is updated patch for middle-end part of the whole patch which
>> fixes all your remarks I hope.
>>
>> Regression testing and bootstrapping did not show any new failures.
>> Is it OK for trunk?
>>
>> Yuri.
>>
>> ChangeLog:
>> 2015-12-18  Yuri Rumyantsev  
>>
>> PR middle-end/68542
>> * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case
>> of mixing vector and scalar types.
>> (fold_relational_const): Add handling of vector
>> comparison with boolean result.
>> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
>> comparison of vector operands with boolean result for EQ/NE only.
>> (verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
>> (verify_gimple_cond): Likewise.
>> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform

Re: [PATCH 1/4] Make SRA scalarize constant-pool loads

2016-01-18 Thread Richard Biener

On Fri, Jan 15, 2016 at 11:27 AM, Alan Lawrence
 wrote:
> On 24/12/15 11:53, Alan Lawrence wrote:
>>
>> Here's a new version that fixes the gcc.dg/guality/pr54970.c failures seen
>> on
>> aarch64 and powerpc64. Prior to SRA handling constant pool decls,
>> -fdump-tree-esra-details (at -O1 -g) had shown:
>>:
>>a = *.LC0;
>># DEBUG a$0 => MEM[(int[3] *)&*.LC0]
>>a$4_3 = MEM[(int[3] *)&*.LC0 + 4B];
>># DEBUG a$4 => a$4_3
>>a$8_4 = MEM[(int[3] *)&*.LC0 + 8B];
>>
>> The previous patch changed this to:
>>:
>>SR.5_3 = *.LC0[0];
>>SR.6_4 = *.LC0[1];
>>SR.7_19 = *.LC0[2];
>>SR.8_21 = *.LC1[0];
>>SR.9_22 = *.LC1[1];
>>SR.10_23 = *.LC1[2];
>># DEBUG a$0 => NULL   // Note here
>>a$4_24 = SR.6_4;
>># DEBUG a$4 => a$4_24
>>a$8_25 = SR.7_19;
>>
>> Turns out the DEBUG a$0 => NULL was produced in
>> load_assign_lhs_subreplacements:
>>
>>   if (lacc && lacc->grp_to_be_debug_replaced)
>> {
>>   gdebug *ds;
>>   tree drhs;
>>   struct access *racc = find_access_in_subtree (sad->top_racc,
>> offset,
>> lacc->size);
>>
>>   if (racc && racc->grp_to_be_replaced)
>> {
>>   if (racc->grp_write)
>> drhs = get_access_replacement (racc);
>>   else
>> drhs = NULL;  // <=== HERE
>> }
>> ...
>>   ds = gimple_build_debug_bind (get_access_replacement (lacc),
>> drhs, gsi_stmt
>> (sad->old_gsi));
>>
>> Prior to the patch, we'd skipped around load_assign_lhs_subreplacements,
>> because
>> access_has_children_p(racc) (for racc = *.LC0) didn't hold in
>> sra_modify_assign.
>>
>> I also added a constant_decl_p function, combining the two checks, plus
>> some
>> testcase fixes.
>>
>> Bootstrapped + check-gcc,g++ on x86_64, ARM, AArch64,
>>also on powerpc64{,le}-none-linux-gnu *in combination with the other
>> patches
>>in the series* (I haven't tested the individual patches on PPC),
>>plus Ada on ARM and x86_64.
>>
>> gcc/ChangeLog:
>>
>> PR target/63679
>> * tree-sra.c (disqualified_constants, constant_decl_p): New.
>> (sra_initialize): Allocate disqualified_constants.
>> (sra_deinitialize): Free disqualified_constants.
>> (disqualify_candidate): Update disqualified_constants when
>> appropriate.
>> (create_access): Scan for constant-pool entries as we go along.
>> (scalarizable_type_p): Add check against
>> type_contains_placeholder_p.
>> (maybe_add_sra_candidate): Allow constant-pool entries.
>> (load_assign_lhs_subreplacements): Bind debug for constant pool
>> vars.
>> (initialize_constant_pool_replacements): New.
>> (sra_modify_assign): Avoid mangling assignments created by
>> previous,
>> and don't generate writes into constant pool.
>> (sra_modify_function_body): Call
>> initialize_constant_pool_replacements.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.dg/tree-ssa/sra-17.c: New.
>> * gcc.dg/tree-ssa/sra-18.c: New.
>
>
> Ping.

Ok.

Thanks,
Richard.

> (The next bit is false, unless you force SRA to happen more widely, but all
> the above stands)
>
>> This also fixes a bunch of other guality tests on AArch64 that were
>> failing
>> prior to the patch series, and another bunch on PowerPC64 (bigendian
>> -m32), listed below.
>
>>
>>
>> Tests fixed on aarch64-none-linux-gnu:
>>
>> gcc.dg/guality/pr54970.c   -O1  line 15 *p == 3
>> gcc.dg/guality/pr54970.c   -O1  line 15 *q == 2
>> gcc.dg/guality/pr54970.c   -O1  line 20 *p == 13
>> gcc.dg/guality/pr54970.c   -O1  line 20 *q == 2
>> gcc.dg/guality/pr54970.c   -O1  line 25 *p == 13
>> gcc.dg/guality/pr54970.c   -O1  line 25 *q == 12
>> gcc.dg/guality/pr54970.c   -O1  line 31 *p == 6
>> gcc.dg/guality/pr54970.c   -O1  line 31 *q == 5
>> gcc.dg/guality/pr54970.c   -O1  line 36 *p == 26
>> gcc.dg/guality/pr54970.c   -O1  line 36 *q == 5
>> gcc.dg/guality/pr54970.c   -O1  line 45 *p == 26
>> gcc.dg/guality/pr54970.c   -O1  line 45 *q == 25
>> gcc.dg/guality/pr54970.c   -O1  line 45 p[-1] == 25
>> gcc.dg/guality/pr54970.c   -O1  line 45 p[-2] == 4
>> gcc.dg/guality/pr54970.c   -O1  line 45 q[-1] == 4
>> gcc.dg/guality/pr54970.c   -O1  line 45 q[1] == 26
>> gcc.dg/guality/pr56154-1.c   -O1  line pr56154-1.c:20 x.a == 6
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:17 s1.f == 5.0
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:17 s1.g == 6.0
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:17 s2.f == 0.0
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:17 s2.g == 6.0
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:20 s1.f == 5.0
>> gcc.dg/guality/pr59776.c   -O1  line pr59776.c:20 s1.g == 6.0
>> gcc.dg/guality/pr59776.c   -O1

[PATCH][GCC-5] Fix "#pragma GCC pop_options" warning.

2016-01-18 Thread Andre Vieira (lists)


Hi there,

Can we have the "#pragma GCC pop_options" fix backported to GCC-5?

The patch can be found at https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01261.html 
and was committed as r228794.


The same patch applies cleanly to gcc-5, which would otherwise not be 
able to use this pragma even though the support is there.


Cheers,
Andre

[Ada] Housekeeping work in gigi (bis)

2016-01-18 Thread Eric Botcazou

In preparation for the implementation of the 'char' compatibility fix.

Tested on x86_64-suse-linux, applied on the mainline.


2016-01-18  Eric Botcazou  

* gcc-interface/gigi.h (build_call_raise_column): Adjust prototype.
(build_call_raise_range): Likewise.
(gnat_unsigned_type): Delete.
(gnat_signed_type): Likewise.
(gnat_signed_or_unsigned_type_for): New prototype.
(gnat_unsigned_type_for): New inline function.
(gnat_signed_type_for): Likewise.
* gcc-interface/cuintp.c (build_cst_from_int): Call build_int_cst.
* gcc-interface/decl.c (gnat_to_gnu_entity): Likewise.
(gnat_to_gnu_entity) : Always translate the index types
and compute their base type from that.
: Remove duplicate declaration.
* gcc-interface/misc.c (get_array_bit_stride): Call build_int_cst.
* gcc-interface/trans.c (get_type_length): Likewise.
(Attribute_to_gnu): Likewise.
(Loop_Statement_to_gnu): Likewise.
(Call_to_gnu): Likewise.
(gnat_to_gnu): Call build_real, build_int_cst, gnat_unsigned_type_for
and gnat_signed_type_for.  Minor tweaks.
(build_binary_op_trapv): Likewise.
(emit_check): Likewise.
(convert_with_check): Likewise.
(Raise_Error_to_gnu): Adjust calls to the build_call_raise family of
functions.  Minor tweaks.
(Case_Statement_to_gnu): Remove dead code.
(gnat_to_gnu): Call gnat_unsigned_type_for and gnat_signed_type_for.
(init_code_table): Minor reordering.
* gcc-interface/utils.c (gnat_unsigned_type): Delete.
(gnat_signed_type): Likewise.
(gnat_signed_or_unsigned_type_for): New function.
(unchecked_convert): Use directly the size in the test for precision
vs size adjustments.
(install_builtin_elementary_types): Call gnat_signed_type_for.
* gcc-interface/utils2.c (nonbinary_modular_operation): Call
build_int_cst.
(build_goto_raise): New function taken from...
(build_call_raise): ...here.  Call it.
(build_call_raise_column): Add KIND parameter and call it.
(build_call_raise_range): Likewise.


-- 
Eric Botcazou

Index: gcc-interface/cuintp.c
===
--- gcc-interface/cuintp.c	(revision 232465)
+++ gcc-interface/cuintp.c	(working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Implementation File   *
  *  *
- *  Copyright (C) 1992-2015, Free Software Foundation, Inc. *
+ *  Copyright (C) 1992-2016, Free Software Foundation, Inc. *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -52,8 +52,8 @@
the integer value itself.  The origin of the Uints_Ptr table is adjusted so
that a Uint value of Uint_Bias indexes the first element.
 
-   First define a utility function that operates like build_int_cst_type for
-   integral types and does a conversion for floating-point types.  */
+   First define a utility function that is build_int_cst for integral types and
+   does a conversion for floating-point types.  */
 
 static tree
 build_cst_from_int (tree type, HOST_WIDE_INT low)
@@ -61,7 +61,7 @@ build_cst_from_int (tree type, HOST_WIDE
   if (SCALAR_FLOAT_TYPE_P (type))
 return convert (type, build_int_cst (gnat_type_for_size (32, 0), low));
   else
-return build_int_cst_type (type, low);
+return build_int_cst (type, low);
 }
 
 /* Similar to UI_To_Int, but return a GCC INTEGER_CST or REAL_CST node,
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 232501)
+++ gcc-interface/decl.c	(working copy)
@@ -1716,7 +1716,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	TYPE_MODULAR_P (gnu_type) = 1;
 	SET_TYPE_MODULUS (gnu_type, gnu_modulus);
 	gnu_high = fold_build2 (MINUS_EXPR, gnu_type, gnu_modulus,
-convert (gnu_type, integer_one_node));
+build_int_cst (gnu_type, 1));
 	  }
 
 	/* If the upper bound is not maximal, make an extra subtype.  */
@@ -2113,8 +2113,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	 gnat_index = Next_Index (gnat_index))
 	  {
 	char field_name[16];
-	tree gnu_index_base_type
-	  = get_unpadded_type (Base_Type (Etype (gnat_index)));
+	tree gnu_index_type = get_unpadded_type (Etype (gnat_index));
+	tree gnu_index_base_type = get_base_type (gnu_index_type);
 	tree gnu_lb_field, gnu_hb_field, gnu_orig_min, gnu_orig_max;
 	tree gnu_min, gnu_max, gnu_high;
 
@@ -2173,7 +2173,6 @@

Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM

2016-01-18 Thread Kyrill Tkachov



On 18/01/16 11:49, Jakub Jelinek wrote:

On Mon, Jan 18, 2016 at 11:38:37AM +, Kyrill Tkachov wrote:

On 18/01/16 11:31, Andreas Schwab wrote:

Jeff Law  writes:


commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
Author: Jeff Law 
Date:   Fri Jan 15 17:15:24 2016 -0500

PR tree-optimization/69270
* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
tree-ssa-dom.c.  Improve test for [0..1] range from VRP.
* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
ssa_name_has_boolean_range and constant_boolean_node.
PR tree-optimization/69270
* gcc.dg/tree-ssa/pr69270-2.c: New test.
* gcc.dg/tree-ssa/pr69270-3.c: New test.

This breaks gcc.target/aarch64/tst_3.c.

//.tune generic
.type   f1, %function
  f1:
-   tst x0, 1
-   csinc   w0, w0, wzr, eq
+   ands    w1, w0, 1
+   csel    w0, w1, w0, ne
ret
.size   f1, .-f1

The two sequences look equally valid to me.
Instead of doing an and-compare followed by a conditional increment
we do an and-compare followed by a conditional select (without discarding
the result of the and).
So the testcase should be adjusted.
I'll do it.

IMHO please wait for the resolution of PR69320 here.


Ok, but both sequences should be ok for aarch64 here.
The function in the test is:
int
f1 (int x)
{
  if (x & 1)
return 1;
  return x;
}

and the "return 1" should just be changed to "return 2" to avoid whatever
optimiser decided to perform that transformation.

Kyrill
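
To illustrate why the two sequences above compute the same value, here is a
minimal C sketch.  f1 is the function quoted from the test; f1_select and
f1_return2 are hypothetical names introduced only for this illustration.
f1_select spells out by hand what the ands/csel pair does (keep the result of
the AND and select it when it is nonzero), and f1_return2 is the suggested
adjustment of the testcase.

```c
/* The function from the test, as quoted in the thread.  */
int
f1 (int x)
{
  if (x & 1)
    return 1;
  return x;
}

/* Hypothetical hand-written equivalent of the ands/csel sequence:
   the AND result is kept and returned when it is nonzero.  Since
   x & 1 can only be 0 or 1, this matches f1 for every input.  */
int
f1_select (int x)
{
  int masked = x & 1;
  return masked != 0 ? masked : x;
}

/* The adjustment suggested in the thread: returning 2 means the
   AND result can no longer stand in for the return value.  */
int
f1_return2 (int x)
{
  if (x & 1)
    return 2;
  return x;
}
```

With "return 2" the value of x & 1 is no longer equal to the returned constant,
so the reuse shown in f1_select does not apply to f1_return2.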

[PATCH][ARM] Remove neon_reinterpret, use casts

2016-01-18 Thread Alan Lawrence

This cleans up the neon_reinterpret code on ARM in a similar way to AArch64.
Rather than a builtin backing onto an expander that emits a mov insn, we can
just use a cast, because GCC defines casts of vector types as keeping the same
bit pattern.
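
As a small illustration of the cast semantics relied on here, the following
sketch uses the generic GCC vector extension rather than the arm_neon.h types,
so it is not the code from the patch.  The type names u32x4 and f32x4 and the
function cast_preserves_bits are assumptions made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Two 16-byte vector types with different element types
   (GCC vector extension; names are illustrative only).  */
typedef uint32_t u32x4 __attribute__ ((vector_size (16)));
typedef float f32x4 __attribute__ ((vector_size (16)));

/* Return 1 if casting a float vector to an integer vector yields
   exactly the bytes that a plain memcpy of the object would.  */
int
cast_preserves_bits (void)
{
  f32x4 f = { 1.0f, -2.0f, 0.5f, 0.0f };
  u32x4 via_cast = (u32x4) f;	/* reinterpretation, not a value conversion */
  u32x4 via_memcpy;

  memcpy (&via_memcpy, &f, sizeof f);
  return memcmp (&via_cast, &via_memcpy, sizeof via_cast) == 0;
}
```

The cast is a bitwise reinterpretation between same-sized vector types, which
is why a vreinterpret intrinsic can be written as a cast without a backing
builtin.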

On armeb, this fixes previously-failing test:
gcc.target/arm/crypto-vldrq_p128.c scan-assembler vld1.64\t{d[0-9]+-d[0-9]+}.*

Bootstrap + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.

OK for trunk?

gcc/ChangeLog:

* config/arm/arm-protos.h (neon_reinterpret): Remove.
* config/arm/arm.c (neon_reinterpret): Remove.
* config/arm/arm_neon_builtins.def (vreinterpretv8qi, vreinterpretv4hi,
vreinterpretv2si, vreinterpretv2sf, vreinterpretdi, vreinterpretv16qi,
vreinterpretv8hi, vreinterpretv4si, vreinterpretv4sf, vreinterpretv2di,
vreinterpretti): Remove.
* config/arm/neon.md (neon_vreinterpretv8qi,
neon_vreinterpretv4hi, neon_vreinterpretv2si,
neon_vreinterpretv2sf, neon_vreinterpretdi,
neon_vreinterpretti, neon_vreinterpretv16qi,
neon_vreinterpretv8hi, neon_vreinterpretv4si,
neon_vreinterpretv4sf, neon_vreinterpretv2di): Remove.
* config/arm/arm_neon.h (vreinterpret_p8_p16, vreinterpret_p8_f32,
vreinterpret_p8_p64, vreinterpret_p8_s64, vreinterpret_p8_u64,
vreinterpret_p8_s8, vreinterpret_p8_s16, vreinterpret_p8_s32,
vreinterpret_p8_u8, vreinterpret_p8_u16, vreinterpret_p8_u32,
vreinterpret_p16_p8, vreinterpret_p16_f32, vreinterpret_p16_p64,
vreinterpret_p16_s64, vreinterpret_p16_u64, vreinterpret_p16_s8,
vreinterpret_p16_s16, vreinterpret_p16_s32, vreinterpret_p16_u8,
vreinterpret_p16_u16, vreinterpret_p16_u32, vreinterpret_f32_p8,
vreinterpret_f32_p16, vreinterpret_f32_p64, vreinterpret_f32_s64,
vreinterpret_f32_u64, vreinterpret_f32_s8, vreinterpret_f32_s16,
vreinterpret_f32_s32, vreinterpret_f32_u8, vreinterpret_f32_u16,
vreinterpret_f32_u32, vreinterpret_p64_p8, vreinterpret_p64_p16,
vreinterpret_p64_f32, vreinterpret_p64_s64, vreinterpret_p64_u64,
vreinterpret_p64_s8, vreinterpret_p64_s16, vreinterpret_p64_s32,
vreinterpret_p64_u8, vreinterpret_p64_u16, vreinterpret_p64_u32,
vreinterpret_s64_p8, vreinterpret_s64_p16, vreinterpret_s64_f32,
vreinterpret_s64_p64, vreinterpret_s64_u64, vreinterpret_s64_s8,
vreinterpret_s64_s16, vreinterpret_s64_s32, vreinterpret_s64_u8,
vreinterpret_s64_u16, vreinterpret_s64_u32, vreinterpret_u64_p8,
vreinterpret_u64_p16, vreinterpret_u64_f32, vreinterpret_u64_p64,
vreinterpret_u64_s64, vreinterpret_u64_s8, vreinterpret_u64_s16,
vreinterpret_u64_s32, vreinterpret_u64_u8, vreinterpret_u64_u16,
vreinterpret_u64_u32, vreinterpret_s8_p8, vreinterpret_s8_p16,
vreinterpret_s8_f32, vreinterpret_s8_p64, vreinterpret_s8_s64,
vreinterpret_s8_u64, vreinterpret_s8_s16, vreinterpret_s8_s32,
vreinterpret_s8_u8, vreinterpret_s8_u16, vreinterpret_s8_u32,
vreinterpret_s16_p8, vreinterpret_s16_p16, vreinterpret_s16_f32,
vreinterpret_s16_p64, vreinterpret_s16_s64, vreinterpret_s16_u64,
vreinterpret_s16_s8, vreinterpret_s16_s32, vreinterpret_s16_u8,
vreinterpret_s16_u16, vreinterpret_s16_u32, vreinterpret_s32_p8,
vreinterpret_s32_p16, vreinterpret_s32_f32, vreinterpret_s32_p64,
vreinterpret_s32_s64, vreinterpret_s32_u64, vreinterpret_s32_s8,
vreinterpret_s32_s16, vreinterpret_s32_u8, vreinterpret_s32_u16,
vreinterpret_s32_u32, vreinterpret_u8_p8, vreinterpret_u8_p16,
vreinterpret_u8_f32, vreinterpret_u8_p64, vreinterpret_u8_s64,
vreinterpret_u8_u64, vreinterpret_u8_s8, vreinterpret_u8_s16,
vreinterpret_u8_s32, vreinterpret_u8_u16, vreinterpret_u8_u32,
vreinterpret_u16_p8, vreinterpret_u16_p16, vreinterpret_u16_f32,
vreinterpret_u16_p64, vreinterpret_u16_s64, vreinterpret_u16_u64,
vreinterpret_u16_s8, vreinterpret_u16_s16, vreinterpret_u16_s32,
vreinterpret_u16_u8, vreinterpret_u16_u32, vreinterpret_u32_p8,
vreinterpret_u32_p16, vreinterpret_u32_f32, vreinterpret_u32_p64,
vreinterpret_u32_s64, vreinterpret_u32_u64, vreinterpret_u32_s8,
vreinterpret_u32_s16, vreinterpret_u32_s32, vreinterpret_u32_u8,
vreinterpret_u32_u16, vreinterpretq_p8_p16, vreinterpretq_p8_f32,
vreinterpretq_p8_p64, vreinterpretq_p8_p128, vreinterpretq_p8_s64,
vreinterpretq_p8_u64, vreinterpretq_p8_s8, vreinterpretq_p8_s16,
vreinterpretq_p8_s32, vreinterpretq_p8_u8, vreinterpretq_p8_u16,
vreinterpretq_p8_u32, vreinterpretq_p16_p8, vreinterpretq_p16_f32,
vreinterpretq_p16_p64, vreinterpretq_p16_p128, vreinterpretq_p16_s64,
vreinterpretq_p16_u64, vreinterpretq_p16_s8, vreinterpretq_p16_s16,
vreinterpretq_p16_s32, vreinterpretq_p16_u8, vreinterpretq_p16_u16,

Re: [patch] libstdc++/69293 Fix construction of std::function from null pointer-to-member

2016-01-18 Thread Jonathan Wakely


On 18/01/16 11:43 +, Jonathan Wakely wrote:

   Fix construction of std::function from null pointer-to-member
   
   	PR libstdc++/69293


I've done it again, that was for 69243 not 69293. I have too many
similar PR numbers flying around :-(

[Ada] Housekeeping work in gigi

2016-01-18 Thread Eric Botcazou

In preparation for the implementation of the 'char' compatibility fix.

Tested on x86_64-suse-linux, applied on the mainline.


2016-01-18  Eric Botcazou  

* gcc-interface/ada-tree.h (TYPE_IMPLEMENTS_PACKED_ARRAY_P): Rename to
(TYPE_IMPL_PACKED_ARRAY_P): ...this.
(TYPE_CAN_HAVE_DEBUG_TYPE_P): Do not test TYPE_DEBUG_TYPE.
* gcc-interface/decl.c (gnat_to_gnu_entity): Simplify NULL_TREE tests
and tweak gnat_encodings tests throughout.
(initial_value_needs_conversion): Likewise.
(intrin_arglists_compatible_p): Likewise.
* gcc-interface/misc.c (gnat_print_type): Likewise.
(gnat_get_debug_type): Likewise.
(gnat_get_fixed_point_type_info): Likewise.
(gnat_get_array_descr_info): Likewise.
(get_array_bit_stride): Likewise.
(gnat_get_type_bias): Fix formatting.
(enumerate_modes): Likewise.
* gcc-interface/trans.c (gnat_to_gnu): Likewise.
(add_decl_expr): Simplify NULL_TREE test.
(end_stmt_group): Likewise.
(build_binary_op_trapv): Fix formatting.
(get_exception_label): Use switch statement.
(init_code_table): Move around.
* gcc-interface/utils.c (global_bindings_p): Simplify NULL_TREE test.
(gnat_poplevel): Likewise.
(gnat_set_type_context): Likewise.
(defer_or_set_type_context): Fix formatting.
(gnat_pushdecl): Simplify NULL_TREE test.
(maybe_pad_type): Likewise.
(add_parallel_type): Likewise.
(create_range_type): Likewise.
(process_deferred_decl_context): Likewise.
(convert): Likewise.
(def_builtin_1): Likewise.
* gcc-interface/utils2.c (find_common_type): Likewise.
(build_binary_op): Likewise.
(gnat_rewrite_reference): Likewise.
(get_inner_constant_reference): Likewise.


-- 
Eric Botcazou
Index: gcc-interface/ada-tree.h
===
--- gcc-interface/ada-tree.h	(revision 232465)
+++ gcc-interface/ada-tree.h	(working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Header File   *
  *  *
- *  Copyright (C) 1992-2015, Free Software Foundation, Inc. *
+ *  Copyright (C) 1992-2016, Free Software Foundation, Inc. *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -189,14 +189,12 @@ do {			 \
 
 /* True for types that implement a packed array and for original packed array
types.  */
-#define TYPE_IMPLEMENTS_PACKED_ARRAY_P(NODE) \
-  ((TREE_CODE (NODE) == ARRAY_TYPE && TYPE_PACKED (NODE))		  \
-|| (TREE_CODE (NODE) == INTEGER_TYPE && TYPE_PACKED_ARRAY_TYPE_P (NODE))) \
+#define TYPE_IMPL_PACKED_ARRAY_P(NODE) \
+  ((TREE_CODE (NODE) == ARRAY_TYPE && TYPE_PACKED (NODE)) \
+   || (TREE_CODE (NODE) == INTEGER_TYPE && TYPE_PACKED_ARRAY_TYPE_P (NODE)))
 
 /* True for types that can hold a debug type.  */
-#define TYPE_CAN_HAVE_DEBUG_TYPE_P(NODE)  \
- (!TYPE_IMPLEMENTS_PACKED_ARRAY_P (NODE)  \
-  && TYPE_DEBUG_TYPE (NODE) != NULL_TREE)
+#define TYPE_CAN_HAVE_DEBUG_TYPE_P(NODE) (!TYPE_IMPL_PACKED_ARRAY_P (NODE))
 
 /* For an UNCONSTRAINED_ARRAY_TYPE, this is the record containing both the
template and the object.
@@ -385,8 +383,8 @@ do {		   \
 #define SET_TYPE_DEBUG_TYPE(NODE, X) \
   SET_TYPE_LANG_SPECIFIC2 (NODE, X)
 
-/* For types with TYPE_IMPLEMENTS_PACKED_ARRAY_P, this is the original packed
-   array type.  Note that this predicate is trou for original packed array
+/* For types with TYPE_IMPL_PACKED_ARRAY_P, this is the original packed
+   array type.  Note that this predicate is true for original packed array
types, so these cannot have a debug type.  */
 #define TYPE_ORIGINAL_PACKED_ARRAY(NODE) \
   GET_TYPE_LANG_SPECIFIC2 (NODE)
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 232465)
+++ gcc-interface/decl.c	(working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Implementation File   *
  *  *
- *  Copyright (C) 1992-2015, Free Software Foundation, Inc. *
+ *  Copyright (C) 1992-2016, Free Software Foundation, Inc. *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free

Re: [PATCH 1/2] DWARF: process all TYPE_DECL nodes when iterating on scopes

2016-01-18 Thread Richard Biener

On Tue, Jan 5, 2016 at 3:32 PM, Pierre-Marie de Rodat
 wrote:
> Hello,
>
> In Ada, it is possible to have nested subprograms in the following
> configuration:
>
> procedure Parent is
>type T;
>[...]
>procedure Child (Value : T) is
>begin
>   [...]
>end Child;
> begin
>[...]
> end Parent;
>
> If we generate debugging information for Child first before Parent, the
> debug info for T will be generated at global scope since the DIE for
> Parent does not exist yet. It is when generating debug info for Parent
> that we are supposed to relocate it thanks to decls_for_scope and
> process_scope_var.
>
> However, process_scope_var currently works only on TYPE_DECL nodes that
> are stubs, for unknown reasons. This change adapts it to work on all
> TYPE_DECL nodes.
>
> It bootstrapped and regtested fine on x86_64-linux and triggered no
> regression in the GDB testsuite for Ada, C, C++ and Fortran. Ok to
> commit? Thank you in advance!

Looking for TYPE_DECL_IS_STUB uses I came across dwarf2out_ignore_block,
which you'd need to change as well I think.

> gcc/ChangeLog:
>
> * dwarf2out.c (process_scope_var): Relocate TYPE_DECL nodes that
> are not stubs just like stub ones.
> ---
>  gcc/dwarf2out.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 2c0bd63..da5524e 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -22829,8 +22829,7 @@ process_scope_var (tree stmt, tree decl, tree
> origin, dw_die_ref context_die)
> if (TREE_CODE (decl_or_origin) == FUNCTION_DECL)
>  die = lookup_decl_die (decl_or_origin);
> -  else if (TREE_CODE (decl_or_origin) == TYPE_DECL
> -   && TYPE_DECL_IS_STUB (decl_or_origin))
> +  else if (TREE_CODE (decl_or_origin) == TYPE_DECL)
>  die = lookup_type_die (TREE_TYPE (decl_or_origin));

But ... I think this change is wrong.  It is supposed to use the _type_ DIE
in case the FE didn't create a proper TYPE_DECL.  So I think what is
maybe missing is

  else if (TREE_CODE (decl_or_origin) == TYPE_DECL)
die = lookup_decl_die (decl_or_origin);

?  That is, why should we lookup the type if the type-decl isn't a stub?

Btw, not sure how you get at the "wrong" debug info gen order, I can't seem to
get at it with a C testcase.

As with the other patch this misses a testcase.

Richard.

>else
>  die = NULL;
> --
> 2.6.4
>

Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr

2016-01-18 Thread Richard Biener

On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel  wrote:
> When compiling code with attribute targets on arm or aarch64,
> vector_type_mode returns different results (eg Vmode or BLKmode) depending
> on the current simd flags that are not set between functions.
>
> for example the following code:
>
> #include <arm_neon.h>
>
> extern int8x8_t a;
> extern int8x8_t b;
>
> int16x8_t
> __attribute__ ((target("fpu=neon")))
> foo(void)
> {
>return vaddl_s8 (a, b);
> }
>
> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins,
> because of the mismatch between DECL_MODE and the current TYPE_MODE used in
> expand_builtin for global variables.
>
> but the best explanation is in the vector_type_mode:
> /* Vector types need to re-check the target flags each time we report
> the machine mode.  We need to do this because attribute target can
> change the result of vector_mode_supported_p and have_regs_of_mode
> on a per-function basis.  Thus the TYPE_MODE of a VECTOR_TYPE can
> change on a per-function basis.  */
>
> I first tried to hack the 2 machine descriptions to insert convert_to_mode
> or relayout_decls here and there, but I found this very fragile. Instead, a
> more central relayout of the type while expanding gave good results, as
> proposed here.
>
> Bootstrapped and tested with no regressions for arm, aarch64 and i586.
>
> Does this look to be the right approach ?
>
> nb: for testing this patch is complementary with
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
>
> thanks for your comments.

A x86 specific testcase that ICEs as well:

typedef int v8si __attribute__((vector_size(32)));
v8si a;
v8si __attribute__((target("avx"))) foo()
{
  return a;
}

in your patch not using the shared DECL_RTL of the global var
"fixes" this so I think a conceptually better fix would be to
"adjust" DECL_RTL from globals via a adjust_address (or so).

Also given that we do

  /* ... fall through ...  */

case FUNCTION_DECL:
case RESULT_DECL:
  decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
  gcc_assert (decl_rtl);
  decl_rtl = copy_rtx (decl_rtl);

thus always "unshare" DECL_RTL anyway it might be not so
bad to simply do

 decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);

instead of that to avoid one copy.

Index: expr.c
===
--- expr.c  (revision 232496)
+++ expr.c  (working copy)
@@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
   decl_rtl = DECL_RTL (exp);
 expand_decl_rtl:
   gcc_assert (decl_rtl);
-  decl_rtl = copy_rtx (decl_rtl);
+  if (MEM_P (decl_rtl))
+   decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
+  else
+   decl_rtl = copy_rtx (decl_rtl);
   /* Record writes to register variables.  */
   if (modifier == EXPAND_WRITE
  && REG_P (decl_rtl)

untested apart from on the x86_64 testcase (which it fixes).  One could guard
this further to only apply on vector typed decls with mismatched mode of course.

I think that re-layouting globals is not very good design.

Richard.

>
>
>
>
>
>

[PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Alan Lawrence

This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
check_effective_target_arm_neon_fp_16_ok.

At present, without the expander, moving v4hf/v8hf values around is done
via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On arm-*,
moving via two subregs is less efficient than one native insn!)

However, adding the expanders reveals a latent bug in the V4HF variant of
*neon_mov, that vector constants are not handled properly in the
neon_valid_immediate code. Hence, for now I've used a separate expander that
disallows immediates, and disabled VnHF vectors as immediates in
neon_valid_immediate_for_move; I'll file a PR for this.

Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector modes to
the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
identical HF elements is now done by loading the scalar + vdup rather than
forcing the vector out to the constant pool.

On armeb, one of the ICEs this fixes is in the test program for
check_effective_target_arm_neon_fp_16_ok. This means the advsimd-intrinsics
vcvt_f16 test now runs (and passes), and also that the other tests now run
with neon-fp16, rather than only neon as previously (on armeb).
This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
failing. Since those tests would previously have failed *if fp16 had been
passed in*, I think this is still a step forward; one can still run the tests
with an explicit non-fp16 multilib if the old behaviour is desired.

Note the previous patch removes other uses of VQXMOV (not strictly a dependency,
generating V4HF/V8HF reinterpret patterns is harmless, they just aren't used).

Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.

gcc/ChangeLog:

* config/arm/arm.c (neon_valid_immediate): Disallow vectors of HFmode.
* config/arm/iterators.md (V_HF): New.
(VQXMOV): Add V8HF.
(VX): Add V4HF, V8HF.
(V_reg): Add cases for V4HF, V8HF.
* config/arm/vec-common.md (mov V_HF): New.
---
 gcc/config/arm/arm.c |  2 ++
 gcc/config/arm/iterators.md  |  8 ++--
 gcc/config/arm/vec-common.md | 20 
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3276b03..4fdba38 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode, int 
inverse,
   /* Vectors of float constants.  */
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
 {
+  if (GET_MODE_INNER (mode) == HFmode)
+   return -1;
   rtx el0 = CONST_VECTOR_ELT (op, 0);
   const REAL_VALUE_TYPE *r0;
 
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 974cf51..c5db868 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -59,6 +59,9 @@
 ;; Integer and float modes supported by Neon and IWMMXT.
 (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
+;; Vectors of half-precision floats.
+(define_mode_iterator V_HF [V4HF V8HF])
+
 ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
 (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
@@ -99,7 +102,7 @@
 (define_mode_iterator VQI [V16QI V8HI V4SI])
 
 ;; Quad-width vector modes, with TImode added, for moves.
-(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
+(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
 
 ;; Opaque structure types wider than TImode.
 (define_mode_iterator VSTRUCT [EI OI CI XI])
@@ -160,7 +163,7 @@
 (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
 
 ;; Modes with 8-bit and 16-bit elements.
-(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
+(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
 
 ;; Modes with 8-bit elements.
 (define_mode_iterator VE [V8QI V16QI])
@@ -428,6 +431,7 @@
 ;; Register width from element mode
 (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
  (V4HI "P") (V8HI  "q")
+(V4HF "P") (V8HF  "q")
  (V2SI "P") (V4SI  "q")
  (V2SF "P") (V4SF  "q")
  (DI   "P") (V2DI  "q")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ce98f71..c27578a 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -38,6 +38,26 @@
 }
 })
 
+;; This exists separately from the above pattern to exclude an immediate RHS.
+
+(define_expand "mov<mode>"
+  [(set (match_operand:V_HF 0 "nonimmediate_operand" "")
+   (match_operand:V_HF 1 "nonimmediate_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+  if (can_create_pseudo_p ())
+{
+  if (!REG_P (operands[0]))
+   operands[1] = force_reg (<MODE>mode, operands[1]);
+  else if (TARGET_NEON && CONSTANT_P (operands[1]))
+   {
+ operands[1] =

Re: Thoughts on memcmp expansion (PR43052)

2016-01-18 Thread Nick Clifton


Hi Bernd,

+  rtx op0 = force_reg (direct_mode, arg1_rtx);
+  rtx op1 = force_reg (direct_mode, arg2_rtx);
+  rtx tem = emit_store_flag (target, NE, op0, op1,
+direct_mode, true, false);

This is me being ignorant here... wouldn't it be easier to have a new 
cmpmem_eq pattern (and resulting optab) than to generate this code 
sequence directly ?  That way backends can choose to support this 
optimization, and if they do, they can also choose to support longer 
lengths of comparison.



 DEF_LIB_BUILTIN(BUILT_IN_MEMCMP, "memcmp", 
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
+DEF_GCC_BUILTIN(BUILT_IN_MEMCMP_EQ, "__memcmp_eq", 
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN_CHKP   (BUILT_IN_MEMCPY, "memcpy", 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)


Presumably you would also document this new builtin in doc/extend.texi ? 
 Plus maybe add a testcase for it as well ?



+  /* If the return value is used, don't do the transformation.  */

This comment struck me as wrong.  Surely if the return value is not used 
then the entire memcmp can be transformed into nothing.  Plus if the 
return value is used, but only for an equality comparison with zero then 
the transformation can take place.
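
The equality-only case can be sketched in plain C as follows.  This is only an
illustration of the idea under discussion, not the expansion from the patch;
the function name memcmp_ne_4 and the fixed 4-byte length are assumptions
chosen for the example.

```c
#include <stdint.h>
#include <string.h>

/* Return nonzero iff the two 4-byte blocks differ.  Unlike memcmp,
   no ordering information is produced, which is all a caller needs
   when the result is only ever compared against zero.  */
int
memcmp_ne_4 (const void *a, const void *b)
{
  uint32_t wa, wb;

  /* memcpy avoids alignment and aliasing problems; compilers
     typically turn each call into a single load.  */
  memcpy (&wa, a, sizeof wa);
  memcpy (&wb, b, sizeof wb);
  return wa != wb;
}
```

A use such as "if (memcmp (p, q, 4) == 0)" only needs the zero/nonzero
distinction, so it could be served by one word-sized compare as above.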



Cheers
  Nick

Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM

2016-01-18 Thread Andreas Schwab

Jeff Law  writes:

> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
> Author: Jeff Law 
> Date:   Fri Jan 15 17:15:24 2016 -0500
>
>   PR tree-optimization/69270
>   * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
>   tree-ssa-dom.c.  Improve test for [0..1] range from VRP.
>   * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
>   * tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
>   * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
>   ssa_name_has_boolean_range and constant_boolean_node.
> 
>   PR tree-optimization/69270
>   * gcc.dg/tree-ssa/pr69270-2.c: New test.
>   * gcc.dg/tree-ssa/pr69270-3.c: New test.

This breaks gcc.target/aarch64/tst_3.c.

//.tune generic
.type   f1, %function
 f1:
-   tst x0, 1
-   csinc   w0, w0, wzr, eq
+   ands    w1, w0, 1
+   csel    w0, w1, w0, ne
ret
.size   f1, .-f1

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM

2016-01-18 Thread Kyrill Tkachov



On 18/01/16 11:31, Andreas Schwab wrote:

Jeff Law  writes:


commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
Author: Jeff Law 
Date:   Fri Jan 15 17:15:24 2016 -0500

PR tree-optimization/69270
* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
tree-ssa-dom.c.  Improve test for [0..1] range from VRP.
* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
ssa_name_has_boolean_range and constant_boolean_node.
 
 	PR tree-optimization/69270

* gcc.dg/tree-ssa/pr69270-2.c: New test.
* gcc.dg/tree-ssa/pr69270-3.c: New test.

This breaks gcc.target/aarch64/tst_3.c.

//.tune generic
.type   f1, %function
  f1:
-   tst x0, 1
-   csinc   w0, w0, wzr, eq
+   ands    w1, w0, 1
+   csel    w0, w1, w0, ne
ret
.size   f1, .-f1


The two sequences look equally valid to me.
Instead of doing an and-compare followed by a conditional increment
we do an and-compare followed by a conditional select (without discarding
the result of the and).
So the testcase should be adjusted.
I'll do it.

Thanks,
Kyrill


Andreas.

[PATCH] Fix PR69337

2016-01-18 Thread Richard Biener


This fixes ICEs when trying to merge functions and variables.  As this is
a fatal error in the end, there is no need to try doing something fancy.

Applied as obvious.

Richard.

2016-01-18  Richard Biener  

PR lto/69337
* lto-symtab.c (lto_symtab_merge): Return early for mismatched
function vs. variable.

Index: gcc/lto/lto-symtab.c
===
*** gcc/lto/lto-symtab.c(revision 232496)
--- gcc/lto/lto-symtab.c(working copy)
*** lto_symtab_merge (symtab_node *prevailin
*** 303,308 
--- 303,311 
if (prevailing_decl == decl)
  return true;
  
+   if (TREE_CODE (decl) != TREE_CODE (prevailing_decl))
+ return false;
+ 
/* Merge decl state in both directions, we may still end up using
   the new decl.  */
TREE_ADDRESSABLE (prevailing_decl) |= TREE_ADDRESSABLE (decl);

Re: [Patch, fortran] (4/5-regression) PR61831 side-effect deallocation of variable components)

2016-01-18 Thread Paul Richard Thomas

Hi Dominique,

Late or not, thanks for the feedback. I'll fix the right brace. More
worrying is the failure with -m32. I presume that the failure with
-O0/O1 is at runtime? If not, the correction of the missing right
brace is a mysterious trigger for a fault that is optimization
dependent.

Cheers

Paul

On 18 January 2016 at 12:05, Dominique d'Humières  wrote:
> Dear Paul,
>
> Sorry for the late feedback. There is a missing right brace in 
> gfortran.dg/derived_constructor_comps_6.f90. This is fixed by the obvious 
> patch:
>
> --- ../5_clean/gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90  
>   2016-01-17 19:27:04.0 +0100
> +++ gcc/testsuite/gfortran.dg/derived_constructor_comps_6.f90   2016-01-18 
> 03:02:17.0 +0100
> @@ -1,5 +1,5 @@
>  ! { dg-do run }
> -! { dg-additional-options "-fdump-tree-original"
> +! { dg-additional-options "-fdump-tree-original" }
>  !
>  ! PR fortran/61831
>  ! The deallocation of components of array constructor elements
>
> However with this patch the test fails with -m32 and -O0/O1.
>
> TIA
>
> Dominique
>
>> Le 17 janv. 2016 à 18:37, Paul Richard Thomas 
>>  a écrit :
>>
>> Dear Andre,
>>
>> Thanks for the very useful discussion. It cleared away one or two cobwebs!
>>
>> Committed as revision 232482.
>>
>> Now to see if it will apply to the 4.9 branch. This might be a step too
>> far, since all sorts of other prerequisites are not there. If it
>> doesn't go well, I will close the PR as a WONTFIX on 4.9.
>>
>> Cheers
>>
>> Paul
>



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: [PATCH] c++/58109 - alignas() fails to compile with constant expression

2016-01-18 Thread Jason Merrill


On 01/12/2016 01:11 PM, Martin Sebor wrote:

On 01/11/2016 10:20 PM, Jason Merrill wrote:

On 12/22/2015 09:32 PM, Martin Sebor wrote:

+  if (is_attribute_p ("aligned", name)
+  || is_attribute_p ("vector_size", name))
+{
+  /* Attribute argument may be a dependent identifier.  */
+  if (tree t = args ? TREE_VALUE (args) : NULL_TREE)
+if (value_dependent_expression_p (t)
+|| type_dependent_expression_p (t))
+  return true;
+}


Instead of this, is_late_template_attribute should be fixed to check
attribute_takes_identifier_p.


attribute_takes_identifier_p() returns false for the aligned
attribute and for vector_size (it returns true only for
attributes cleanup, format, and mode, and none others).


Right.  The problem is this code in is_late_template_attribute:


  /* If the first attribute argument is an identifier, only consider
 second and following arguments.  Attributes like mode, format,
 cleanup and several target specific attributes aren't late
 just because they have an IDENTIFIER_NODE as first argument.  */
  if (arg == args && identifier_p (t))
continue;


It shouldn't skip an initial identifier if !attribute_takes_identifier_p.

Jason

Re: C++ PATCH to suppress bogus -Wunused warning for parameter packs (PR c++/68965)

2016-01-18 Thread Jason Merrill


On 01/13/2016 05:52 PM, Marek Polacek wrote:

On Wed, Jan 13, 2016 at 03:33:27PM -0500, Jason Merrill wrote:

On 01/13/2016 11:50 AM, Marek Polacek wrote:

So to quash that -Wunused-parameter warning, I decided to set TREE_USED at the
place where we create those #xs parameters.


Won't that cause false negatives when the parameter pack is never mentioned
in the function?


You mean that e.g. for

auto fn = [](auto&&... xs)
{
};

int
main ()
{
   fn (1, 2, 3);
}

we won't print
z.cc:1:24: warning: unused parameter 'xs' [-Wunused-parameter]
anymore?  Unfortunately, we don't print that even without the patch :(.


But we do currently print

wa.C:1:24: warning: unused parameter ‘xs#0’ [-Wunused-parameter]
wa.C:1:24: warning: unused parameter ‘xs#1’ [-Wunused-parameter]
wa.C:1:24: warning: unused parameter ‘xs#2’ [-Wunused-parameter]

for that testcase, and I think your patch would remove this warning as well.

Jason

[patch] libstdc++/60637 Fix C++98 std::signbit

2016-01-18 Thread Jonathan Wakely


This fixes PR60637 by using the appropriate built-in for the size of
the argument type. In Bugzilla Marc asked why we don't just use the
same code as for C++11, but I want to make this less intrusive change
on the branches (trunk is already OK anyway).
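
The size-based selection can be sketched in C like this.  The macro name
SIGNBIT_BY_SIZE is made up for the illustration; the three builtins are the
GCC ones named in the ChangeLog below.  Only the branch matching the
argument's size is evaluated, so a long double argument is not narrowed to
double before its sign is read.

```c
/* Pick the sign-bit builtin that matches the size of the argument,
   mirroring the approach of the patch (illustrative macro name).
   The result is nonzero when the sign bit is set; it is not
   guaranteed to be exactly 1.  */
#define SIGNBIT_BY_SIZE(x)				\
  (sizeof (x) == sizeof (float)				\
   ? __builtin_signbitf (x)				\
   : sizeof (x) == sizeof (double)			\
   ? __builtin_signbit (x)				\
   : __builtin_signbitl (x))
```

For example, SIGNBIT_BY_SIZE (-1.0f) goes through __builtin_signbitf, while a
long double argument goes through __builtin_signbitl on targets where long
double is wider than double.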

The new test is only run on x86 because it fails with a -Woverflow
warning on (at least) powerpc.

Tested x86_64-linux and powerpc64le-linux. Committed to gcc-5-branch
and gcc-4_9-branch.



commit e30df471f94c383d742bf4506a794b6505526f77
Author: Jonathan Wakely 
Date:   Mon Jan 18 14:39:57 2016 +

Fix C++98 std::signbit

	PR libstdc++/60637
	* include/c_global/cmath (signbit) [__cplusplus < 201103L]: Use
	__builtin_signbitf or __builtin_signbitl as appropriate.
	* testsuite/26_numerics/headers/cmath/60637.cc: New.

diff --git a/libstdc++-v3/include/c_global/cmath b/libstdc++-v3/include/c_global/cmath
index 4cafe5f..d3fc8b7 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -880,7 +880,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 signbit(_Tp __f)
 {
   typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
-  return __builtin_signbit(__type(__f));
+  return sizeof(__type) == sizeof(float)
+	? __builtin_signbitf(__type(__f))
+	: sizeof(__type) == sizeof(double)
+	? __builtin_signbit(__type(__f))
+	: __builtin_signbitl(__type(__f));
 }
 
   template
diff --git a/libstdc++-v3/testsuite/26_numerics/headers/cmath/60637.cc b/libstdc++-v3/testsuite/26_numerics/headers/cmath/60637.cc
new file mode 100644
index 000..16a7896
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/headers/cmath/60637.cc
@@ -0,0 +1,35 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++98 -ffast-math" }
+// { dg-do run { target i?86-*-* x86_64-*-* } }
+
+#include <cmath>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  long double ld = -5.3165867831218916301793863361917824e-2467L;
+  VERIFY( std::signbit(ld) == 1 );
+}
+
+int
+main()
+{
+  test01();
+}
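
For readers skimming the diff, the size-based dispatch it introduces can be sketched in isolation. This is only an illustration under assumptions: `sign_bit_set` is a hypothetical name, not part of libstdc++, and the sketch skips the `__promote` step the real header applies first. It assumes a compiler that provides the GCC `__builtin_signbit*` family.

```cpp
// Hypothetical helper for illustration; this is not the libstdc++ code.
// It selects the GCC builtin whose parameter type has the same size as T,
// so that a long double argument reaches the long double builtin when
// long double is wider than double.
template <typename T>
bool sign_bit_set(T value)
{
  return sizeof(T) == sizeof(float)
    ? __builtin_signbitf(value) != 0
    : sizeof(T) == sizeof(double)
      ? __builtin_signbit(value) != 0
      : __builtin_signbitl(value) != 0;
}
```

All three branches must compile for every T, which is why the patch (and this sketch) relies on implicit conversions in the branches that are never taken.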

Re: Fix pasto in lto-partition.c

2016-01-18 Thread Richard Biener

On January 18, 2016 4:59:30 PM GMT+01:00, Jan Hubicka  wrote:
>Hi,
>this patch fixes a pasto that leads to an undefined symbol in the testcase
>in PR69003.
>Unfortunately I am not sure how to do incremental linking tests in the
>testsuite.

Only with custom .exp files I guess.

>The patch should work for the release branches, too,

Can you back port it then please?

Richard.

>Honza
>
>Index: ChangeLog
>===
>--- ChangeLog  (revision 232466)
>+++ ChangeLog  (working copy)
>@@ -1,3 +1,8 @@
>+2016-01-12  Jan Hubicka  
>+
>+  PR lto/69003
>+  * lto-partition.c (rename_statics): Fix pasto.
>+
> 2016-01-12  Richard Biener  
> 
>   PR lto/69077
>Index: lto-partition.c
>===
>--- lto-partition.c(revision 232466)
>+++ lto-partition.c(working copy)
>@@ -1077,8 +1077,8 @@ rename_statics (lto_symtab_encoder_t enc
> IDENTIFIER_POINTER
>   (DECL_ASSEMBLER_NAME (s->get_alias_target()->decl
>   && ((s->real_symbol_p ()
>- && !DECL_EXTERNAL (node->decl)
>-   && !TREE_PUBLIC (node->decl))
>+ && !DECL_EXTERNAL (s->decl)
>+   && !TREE_PUBLIC (s->decl))
>   || may_need_named_section_p (encoder, s))
>   && (!encoder
>   || lto_symtab_encoder_lookup (encoder, s) != LCC_NOT_FOUND))

[PATCH] libcc1: rerun configure when gcc/BASE-VER changes

2016-01-18 Thread Andreas Schwab

* configure.ac (CONFIG_STATUS_DEPENDENCIES): Substitute.
* configure: Regenerate.
* Makefile.in: Regenerate.

diff --git a/libcc1/configure.ac b/libcc1/configure.ac
index 6c97afd..e2e3fda 100644
--- a/libcc1/configure.ac
+++ b/libcc1/configure.ac
@@ -50,6 +50,7 @@ AC_CHECK_DECLS([basename])
 
 gcc_version=`cat $srcdir/../gcc/BASE-VER`
 AC_SUBST(gcc_version)
+AC_SUBST([CONFIG_STATUS_DEPENDENCIES], ['$(top_srcdir)/../gcc/BASE-VER'])
 
 ACX_PROG_CC_WARNING_OPTS([-W -Wall], [WARN_FLAGS])
 AC_SUBST(WARN_FLAGS)


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

C++/c-common PATCH for c++/68767 (warning regression with ?:)

2016-01-18 Thread Jason Merrill

In this testcase, we weren't getting the benefits of fold's cleverness 
in handling COND_EXPR because we were only calling fold_for_warn on the 
condition itself.  This patch changes check_function_arguments_recurse 
to fold the entire COND_EXPR, and also fixes cp_fold to actually fold 
COND_EXPR.


Along with this, I've cleaned up some other bits I noticed in cp_fold: 
there was various unnecessary special-casing for unary and binary ops as 
well, and we were clobbering an input CONSTRUCTOR rather than returning 
a new folded one.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 0a3081a56bcd9b98d66ddc5eadf8c3fece9152dd
Author: Jason Merrill 
Date:   Fri Jan 15 12:39:28 2016 -0500

	PR c++/68767
gcc/c-family/
	* c-common.c (check_function_arguments_recurse): Fold the whole
	COND_EXPR, not just the condition.
gcc/cp/
	* cp-gimplify.c (cp_fold) [COND_EXPR]: Simplify.  Do fold COND_EXPR.
	(contains_label_1, contains_label_p): Remove.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 0bfa1f6..1a2c21b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -9765,15 +9765,19 @@ check_function_arguments_recurse (void (*callback)
 
   if (TREE_CODE (param) == COND_EXPR)
 {
-  tree cond = fold_for_warn (TREE_OPERAND (param, 0));
-  /* Check both halves of the conditional expression.  */
-  if (!integer_zerop (cond))
-	check_function_arguments_recurse (callback, ctx,
-	  TREE_OPERAND (param, 1), param_num);
-  if (!integer_nonzerop (cond))
-	check_function_arguments_recurse (callback, ctx,
-	  TREE_OPERAND (param, 2), param_num);
-  return;
+  /* Simplify to avoid warning for an impossible case.  */
+  param = fold_for_warn (param);
+  if (TREE_CODE (param) == COND_EXPR)
+	{
+	  /* Check both halves of the conditional expression.  */
+	  check_function_arguments_recurse (callback, ctx,
+	TREE_OPERAND (param, 1),
+	param_num);
+	  check_function_arguments_recurse (callback, ctx,
+	TREE_OPERAND (param, 2),
+	param_num);
+	  return;
+	}
 }
 
   (*callback) (ctx, param, param_num);
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index c0ee8e4..e151753 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1851,38 +1851,6 @@ cxx_omp_disregard_value_expr (tree decl, bool shared)
 	 && DECL_OMP_PRIVATIZED_MEMBER (decl);
 }
 
-/* Callback for walk_tree, looking for LABEL_EXPR.  Return *TP if it is
-   a LABEL_EXPR; otherwise return NULL_TREE.  Do not check the subtrees
-   of GOTO_EXPR.  */
-
-static tree
-contains_label_1 (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
-{
-  switch (TREE_CODE (*tp))
-{
-case LABEL_EXPR:
-  return *tp;
-
-case GOTO_EXPR:
-  *walk_subtrees = 0;
-
-  /* ... fall through ...  */
-
-default:
-  return NULL_TREE;
-}
-}
-
-/* Return whether the sub-tree ST contains a label which is accessible from
-   outside the sub-tree.  */
-
-static bool
-contains_label_p (tree st)
-{
-  return
-   walk_tree_without_duplicates (&st, contains_label_1 , NULL) != NULL_TREE;
-}
-
 /* Perform folding on expression X.  */
 
 tree
@@ -2110,54 +2078,22 @@ cp_fold (tree x)
 case VEC_COND_EXPR:
 case COND_EXPR:
 
+  /* Don't bother folding a void condition, since it can't produce a
+	 constant value.  Also, some statement-level uses of COND_EXPR leave
+	 one of the branches NULL, so folding would crash.  */
+  if (VOID_TYPE_P (TREE_TYPE (x)))
+	return x;
+
   loc = EXPR_LOCATION (x);
   op0 = cp_fold_rvalue (TREE_OPERAND (x, 0));
-
-  if (TREE_SIDE_EFFECTS (op0))
-	break;
-
   op1 = cp_fold (TREE_OPERAND (x, 1));
   op2 = cp_fold (TREE_OPERAND (x, 2));
 
-  if (TREE_CODE (op0) == INTEGER_CST)
-	{
-	  tree un;
-
-	  if (integer_zerop (op0))
-	{
-	  un = op1;
-	  r = op2;
-	}
-	  else
-	{
-	  un = op2;
-	  r = op1;
-	}
-
-  if ((!TREE_SIDE_EFFECTS (un) || !contains_label_p (un))
-  && (! VOID_TYPE_P (TREE_TYPE (r)) || VOID_TYPE_P (x)))
-{
-	  if (CAN_HAVE_LOCATION_P (r)
-		  && EXPR_LOCATION (r) != loc
-		  && !(TREE_CODE (r) == SAVE_EXPR
-		   || TREE_CODE (r) == TARGET_EXPR
-		   || TREE_CODE (r) == BIND_EXPR))
-	{
-		  r = copy_node (r);
-		  SET_EXPR_LOCATION (r, loc);
-	}
-	  x = r;
-	}
-
-	  break;
-	}
-
-  if (VOID_TYPE_P (TREE_TYPE (x)))
-	break;
-
-  x = build3_loc (loc, code, TREE_TYPE (x), op0, op1, op2);
-
-  if (code != COND_EXPR)
+  if (op0 != TREE_OPERAND (x, 0)
+	  || op1 != TREE_OPERAND (x, 1)
+	  || op2 != TREE_OPERAND (x, 2))
+	x = fold_build3_loc (loc, code, TREE_TYPE (x), op0, op1, op2);
+  else
 	x = fold (x);
 
   break;
diff --git a/gcc/testsuite/g++.dg/warn/Wnonnull2.C b/gcc/testsuite/g++.dg/warn/Wnonnull2.C
new file mode 100644
index 000..6757437
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wnonnull2.C

[PATCH] PR testsuite/69181: ensure expected multiline outputs is cleared per-test (v2)

2016-01-18 Thread David Malcolm

On Tue, 2016-01-12 at 23:21 -0700, Jeff Law wrote:
> On 01/12/2016 12:34 PM, David Malcolm wrote:
> >> I looked at this code, and there are two near-identical blocks which
> >> reset all these variables. You are modifying only one of them, leaving
> >> the one inside the if { catch } thing unchanged - is this intentional?
> >
> > I'm not particularly strong at Tcl, but am I right in thinking that
> > given that we have this:
> >
> > if { [ catch { eval saved-dg-test $args } errmsg ] } {
> > (A) set and unset various things
> > error $errmsg $saved_info
> > }
> > (B) set and unset the same various things as (A)
> >
> > that (B) will always be reached, and that the duplicates in (A) are
> > redundant? (unless they affect "error")
> Seems like it would, but, well it's TCL, so who in the hell knows.

I was wrong: "error" in Tcl is roughly equivalent to throwing an
exception.

Hence the above is actually akin to:

  try:
    eval saved-dg-test $args
  catch:
    do cleanup A
    re-raise current error
  do cleanup B

which is a workaround for the lack of a try-finally construct:

  try:
    eval saved-dg-test $args
  finally:
    do cleanup

So we do need error cleanup for both blocks (A) and (B).
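
The same catch, clean up, re-raise shape can be sketched in C++, which also lacks a "finally" clause. This is an analogy only, not the Tcl code from gcc-dg.exp; `expected_outputs`, `cleanup_after_test` and `run_test` are hypothetical names chosen for the illustration.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for per-test global state that must not leak
// from one test into the next.
static std::vector<std::string> expected_outputs;

// A single cleanup routine shared by both exit paths.
static void cleanup_after_test()
{
  expected_outputs.clear();
}

// Emulates try/finally: clean up on the error path (A), re-raise,
// and clean up again on the normal path (B).
void run_test(const std::function<void()>& test_body)
{
  try {
    test_body();
  } catch (...) {
    cleanup_after_test();  // path (A): an error was raised
    throw;                 // re-raise the current error
  }
  cleanup_after_test();    // path (B): normal completion
}
```

Only one of the two calls runs for any given test, so neither is redundant; sharing one routine keeps the two paths from drifting apart.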

> > I see that this pattern was introduced back in r67696 aka
> > 91a385a522a94154f9e0cd940c5937177737af02:
> Strangely, I can't find the patch in the archives nor any discussion for
> the patch.  It seems to have appeared from nowhere.   My search-fu must
> be weak tonight.  It may not have helped understand why this code is the
> way it is anyway.
>
> This duplication screams that it ought to be its own procedure if we're
> going to keep the apparently duplicated behaviour.

The following patch implements this, moving the existing cleanup into
a new "cleanup-after-saved-dg-test" proc, and calling it from both (A)
and (B).

I noticed in doing so that we were missing a:

  global additional_sources_used

and hence (if I understand Tcl correctly), the code was merely uselessly
setting a local with that name, rather than clearing the global.

Also, the cleanups weren't quite identical between (A) and (B); these
clauses:

    if [info exists set_target_env_var] {
        unset set_target_env_var
    }
    if [info exists keep_saved_temps_suffixes] {
        unset keep_saved_temps_suffixes
    }

were present in (B) but missing from (A); also, some of the ordering of
the cleanups varied between (A) and (B).

I assumed that these differences were unintentional, so the patch
consolidates things to make the cleanup identical between (A) and (B).

Successfully bootstrapped on x86_64-pc-linux-gnu; as before,
adds 1 UNSUPPORTED (by design: gcc.dg/pr69181-1.c) and 1 PASS to gcc.sum.

OK for trunk?

Alternatively, would you prefer the simpler patch to add the cleanup
of multiline_expected_outputs to both (A) and (B), and leave the
consolidation idea for next stage 1?  (keeping the new test cases
and the renaming to lose the leading underscore).

gcc/testsuite/ChangeLog:
PR testsuite/69181
* gcc.dg/pr69181-1.c: New test file.
* gcc.dg/pr69181-2.c: New test file.
* lib/gcc-dg.exp (dg-test): Consolidate post-test cleanup of
globals by moving it to...
(cleanup-after-saved-dg-test): ...this new function.  Add
"global additional_sources_used".  Add reset of global
multiline_expected_outputs to the empty list.
* lib/multiline.exp (_multiline_expected_outputs): Rename this
global to...
(multiline_expected_outputs): ...this, and updated comments to
note that it is modified from gcc-dg.exp.
(dg-end-multiline-output): Update for the above renaming.
(handle-multiline-outputs): Likewise.  Remove the clearing
of the expected outputs to the empty list.
---
 gcc/testsuite/gcc.dg/pr69181-1.c |  7 +++
 gcc/testsuite/gcc.dg/pr69181-2.c |  4 
 gcc/testsuite/lib/gcc-dg.exp | 36 ++--
 gcc/testsuite/lib/multiline.exp  | 22 +-
 4 files changed, 38 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr69181-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr69181-2.c

diff --git a/gcc/testsuite/gcc.dg/pr69181-1.c b/gcc/testsuite/gcc.dg/pr69181-1.c
new file mode 100644
index 000..e851f0c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69181-1.c
@@ -0,0 +1,7 @@
+/* { dg-do compile { target this_will_not_be_matched-*-* } } */
+
+/* { dg-begin-multiline-output "" }
+   This message should never be checked for.
+   In particular, it shouldn't be checked for in the *next*
+   test case.
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/pr69181-2.c b/gcc/testsuite/gcc.dg/pr69181-2.c
new file mode 100644
index 000..dca90dc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69181-2.c
@@ -0,0 +1,4 @@
+/* Dummy test case, to verify that the dg-begin-multiline-output directive
+   from pr69181-1.c isn't

Re: C++/c-common PATCH for c++/68767 (warning regression with ?:)

2016-01-18 Thread Jason Merrill


On 01/18/2016 11:06 AM, Jakub Jelinek wrote:

On Mon, Jan 18, 2016 at 10:53:41AM -0500, Jason Merrill wrote:

In this testcase, we weren't getting the benefits of fold's cleverness in
handling COND_EXPR because we were only calling fold_for_warn on the
condition itself.  This patch changes check_function_arguments_recurse to
fold the entire COND_EXPR, and also fixes cp_fold to actually fold
COND_EXPR.

Along with this, I've cleaned up some other bits I noticed in cp_fold: there
was various unnecessary special-casing for unary and binary ops as well, and
we were clobbering an input CONSTRUCTOR rather than returning a new folded
one.

Tested x86_64-pc-linux-gnu, applying to trunk.


Guess too late for GCC 6, but I'm slightly worried about fold_for_warn
compile time complexity, if we have very large trees of expressions and try
to fold_for_warn it all, then on the original fold_for_warn 2-3 operands
thereof and so on, with something foldable somewhere very deep in the trees.
For that perhaps best would be if we cache cp_fold results on
expressions somewhere (custom tree that holds original and corresponding
folded expression, or hash table, whatever) and if we try to cp_fold it
again, we just reuse what we've folded it to last time.


cp_fold already caches its results.
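
As a rough picture of the kind of caching being discussed, here is a minimal memoizing folder over a toy expression DAG. It is a sketch under assumptions, not GCC's cp_fold: `Expr` and `CachingFolder` are hypothetical types, and the cache is a plain hash table keyed by node address.

```cpp
#include <unordered_map>

// Hypothetical miniature expression node; GCC's tree type is far richer.
struct Expr
{
  char op;          // 'k' constant, '+' addition, '*' multiplication
  int value;        // used only when op == 'k'
  const Expr *lhs;  // operands, null for constants
  const Expr *rhs;
};

class CachingFolder
{
public:
  CachingFolder () : evaluations_ (0) {}

  // Return the folded value of E, computing it at most once per node.
  int fold (const Expr *e)
  {
    std::unordered_map<const Expr *, int>::const_iterator hit
      = cache_.find (e);
    if (hit != cache_.end ())
      return hit->second;

    ++evaluations_;
    int result;
    if (e->op == 'k')
      result = e->value;
    else
      {
        int l = fold (e->lhs);
        int r = fold (e->rhs);
        result = e->op == '+' ? l + r : l * r;
      }
    cache_[e] = result;
    return result;
  }

  // Number of nodes actually evaluated (cache misses).
  int evaluations () const { return evaluations_; }

private:
  std::unordered_map<const Expr *, int> cache_;
  int evaluations_;
};
```

With such a cache, folding a large expression and then folding its operands again costs one lookup per repeated node instead of a fresh walk of the subtree.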

Jason

Re: [PATCH v2] libstdc++: Make certain exceptions transaction_safe.

2016-01-18 Thread Torvald Riegel

On Mon, 2016-01-18 at 14:54 +0100, Torvald Riegel wrote:
> On Sun, 2016-01-17 at 18:30 -0500, David Edelsohn wrote:
> > On Sun, Jan 17, 2016 at 3:21 PM, Torvald Riegel  wrote:
> > > On Sat, 2016-01-16 at 15:38 -0500, David Edelsohn wrote:
> > >> On Sat, Jan 16, 2016 at 8:35 AM, Jakub Jelinek  wrote:
> > >> > On Sat, Jan 16, 2016 at 07:47:33AM -0500, David Edelsohn wrote:
> > >> >> stage1 libstdc++ builds just fine.  the problem is stage2 configure
> > >> >> fails due to missing ITM_xxx symbols when configure tries to compile
> > >> >> and run conftest programs.
> > >> >
> > >> > On x86_64-linux, the _ITM_xxx symbols are undef weak ones and thus it 
> > >> > is
> > >> > fine to load libstdc++ without libitm and libstdc++ doesn't depend on
> > >> > libitm.
> > >> >
> > >> > So, is AIX defining __GXX_WEAK__ or not?  Perhaps some other macro or
> > >> > configure check needs to be used to determine if undefined weak symbols
> > >> > work the way libstdc++ needs them to.
> > >>
> > >> __GXX_WEAK__ appears to be defined by gcc/c-family/c-cppbuiltin.c
> > >> based on  SUPPORTS_ONE_ONLY.  gcc/defaults.h defines SUPPORTS_ONE_ONLY
> > >> if the target supports MAKE_DECL_ONE_ONLY and link-once semantics.
> > >> AIX weak correctly supports link-once semantics.  AIX also supports
> > >> the definition of __GXX_WEAK__ in gcc/doc/cpp.texi, namely collapsing
> > >> symbols with vague linkage in multiple translation units.
> > >>
> > >> libstdc++/src/c++11/cow-stdexcept.cc appears to be using __GXX_WEAK__
> > >> and __attribute__ ((weak)) for references to symbols that may not be
> > >> defined at link time or run time.  AIX does not allow undefined symbol
> > >> errors by default.  And the libstdc++ inference about the semantics of
> > >> __GXX_WEAK__ are different than the documentation.
> > >>
> > >> AIX supports MAKE_DECL_ONE_ONLY and the documented meaning of
> > >> __GXX_WEAK__.  AIX does not support extension of the meaning to
> > >> additional SVR4 semantics not specified in the documentation.
> > >
> > > I see, so we might be assuming that __GXX_WEAK__ means more than it
> > > actually does (I'm saying "might" because personally, I don't know; your
> > > information supports this is the case, but the initial info I got was
> > > that __GXX_WEAK__ would mean we could have weak decls without
> > > definitions).
> > 
> > I believe that libstdc++ must continue with the weak undefined
> > references to the symbols as designed, but protect them with a
> > different macro.  For example, __GXX_WEAK_REF__ or __GXX_WEAK_UNDEF__
> > defined in defaults.h based on configure test or simply overridden in
> > config/rs6000/aix.h.  Or the macro could be local to libstdc++ and
> > overridden in config/os/aix/os_defines.h.
> 
> OK.  I'm currently testing the attached patch on x86_64-linux.

No regressions in the libstdc++ and libitm tests on x86_64-linux.

Fix pasto in lto-partition.c

2016-01-18 Thread Jan Hubicka

Hi,
this patch fixes a pasto that leads to an undefined symbol in the testcase in
PR69003.
Unfortunately I am not sure how to do incremental linking tests in the
testsuite.
The patch should work for the release branches, too,

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 232466)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2016-01-12  Jan Hubicka  
+
+   PR lto/69003
+   * lto-partition.c (rename_statics): Fix pasto.
+
 2016-01-12  Richard Biener  
 
PR lto/69077
Index: lto-partition.c
===
--- lto-partition.c (revision 232466)
+++ lto-partition.c (working copy)
@@ -1077,8 +1077,8 @@ rename_statics (lto_symtab_encoder_t enc
  IDENTIFIER_POINTER
(DECL_ASSEMBLER_NAME (s->get_alias_target()->decl
&& ((s->real_symbol_p ()
- && !DECL_EXTERNAL (node->decl)
-&& !TREE_PUBLIC (node->decl))
+ && !DECL_EXTERNAL (s->decl)
+&& !TREE_PUBLIC (s->decl))
|| may_need_named_section_p (encoder, s))
&& (!encoder
|| lto_symtab_encoder_lookup (encoder, s) != LCC_NOT_FOUND))

Re: C++/c-common PATCH for c++/68767 (warning regression with ?:)

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 10:53:41AM -0500, Jason Merrill wrote:
> In this testcase, we weren't getting the benefits of fold's cleverness in
> handling COND_EXPR because we were only calling fold_for_warn on the
> condition itself.  This patch changes check_function_arguments_recurse to
> fold the entire COND_EXPR, and also fixes cp_fold to actually fold
> COND_EXPR.
> 
> Along with this, I've cleaned up some other bits I noticed in cp_fold: there
> was various unnecessary special-casing for unary and binary ops as well, and
> we were clobbering an input CONSTRUCTOR rather than returning a new folded
> one.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

Guess too late for GCC 6, but I'm slightly worried about fold_for_warn
compile time complexity, if we have very large trees of expressions and try
to fold_for_warn it all, then on the original fold_for_warn 2-3 operands
thereof and so on, with something foldable somewhere very deep in the trees.
For that perhaps best would be if we cache cp_fold results on
expressions somewhere (custom tree that holds original and corresponding
folded expression, or hash table, whatever) and if we try to cp_fold it
again, we just reuse what we've folded it to last time.

Jakub

Re: [PATCH] PR testsuite/69181: ensure expected multiline outputs is cleared per-test (v2)

2016-01-18 Thread Bernd Schmidt


So we do need error cleanup for both blocks (A) and (B).



gcc/testsuite/ChangeLog:
PR testsuite/69181
* gcc.dg/pr69181-1.c: New test file.
* gcc.dg/pr69181-2.c: New test file.
* lib/gcc-dg.exp (dg-test): Consolidate post-test cleanup of
globals by moving it to...
(cleanup-after-saved-dg-test): ...this new function.  Add
"global additional_sources_used".  Add reset of global
multiline_expected_outputs to the empty list.
* lib/multiline.exp (_multiline_expected_outputs): Rename this
global to...
(multiline_expected_outputs): ...this, and updated comments to
note that it is modified from gcc-dg.exp.
(dg-end-multiline-output): Update for the above renaming.
(handle-multiline-outputs): Likewise.  Remove the clearing
of the expected outputs to the empty list.


Ok.


bernd

Re: C++ PATCH for c++/68586 (rejects-valid with enum in C++11)

2016-01-18 Thread Marek Polacek

On Mon, Jan 18, 2016 at 10:04:12AM -0500, Jason Merrill wrote:
> This wouldn't cover cases where this change affects the type or value of
> more complicated expressions, so my preference would be to clear the caches
> when we finish_enum_value_list.

So like this?

Bootstrapped/regtested on x86_64-linux.

2016-01-18  Marek Polacek  

PR c++/68586
* constexpr.c (clear_cv_cache): New.
* cp-tree.h (clear_cv_cache): Declare.
* decl.c (finish_enum_value_list): Call it.

* g++.dg/cpp0x/enum30.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 6ab4696..6b0e5a8 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -4027,6 +4027,14 @@ maybe_constant_value (tree t, tree decl)
   return ret;
 }
 
+/* Dispose of the whole CV_CACHE.  */
+
+void
+clear_cv_cache (void)
+{
+  gt_cleare_cache (cv_cache);
+}
+
 /* Like maybe_constant_value but first fully instantiate the argument.
 
Note: this is equivalent to instantiate_non_dependent_expr_sfinae
diff --git gcc/cp/cp-tree.h gcc/cp/cp-tree.h
index fc9507e..1ae0d5a 100644
--- gcc/cp/cp-tree.h
+++ gcc/cp/cp-tree.h
@@ -6919,6 +6919,7 @@ extern bool var_in_constexpr_fn (tree);
 extern void explain_invalid_constexpr_fn(tree);
 extern vec cx_error_context   (void);
 extern tree fold_sizeof_expr   (tree);
+extern void clear_cv_cache (void);
 
 /* In c-family/cilk.c */
 extern bool cilk_valid_spawn(tree);
diff --git gcc/cp/decl.c gcc/cp/decl.c
index df95133..d489b87 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -13386,6 +13386,10 @@ finish_enum_value_list (tree enumtype)
 
   /* Finish debugging output for this type.  */
   rest_of_type_compilation (enumtype, namespace_bindings_p ());
+
+  /* Each enumerator now has the type of its enumeration.  Clear the cache
+ so that this change in types doesn't confuse us later on.  */
+  clear_cv_cache ();
 }
 
 /* Finishes the enum type. This is called only the first time an
diff --git gcc/testsuite/g++.dg/cpp0x/enum30.C gcc/testsuite/g++.dg/cpp0x/enum30.C
index e69de29..cf0c1b5 100644
--- gcc/testsuite/g++.dg/cpp0x/enum30.C
+++ gcc/testsuite/g++.dg/cpp0x/enum30.C
@@ -0,0 +1,14 @@
+// PR c++/68586
+// { dg-do compile { target c++11 } }
+
+enum E { x , y = 1 + (x << 1) };
+template struct A {};
+A a;
+
+enum E2 : int { x2 , y2 = x2 << 1 };
+template struct A2 {};
+A2 a2;
+
+enum class E3 { x3 , y3 = x3 << 1 };
+template struct A3 {};
+A3 a3;

Marek

[patch, fortran] PR65996 [5/6 Regression] gfortran ICE with -dH

2016-01-18 Thread Jerry DeLisle

This patch follows the suggestion Jakub made in the PR and is very 
straightforward.

With the patch, an abort is given on actual errors, in agreement with the
documentation for -dH.
(Yes, not very useful, but we can clear this PR)

Buffered errors bypass this abort by saving and restoring the state.
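
The save-and-restore idea can be sketched outside gfortran as follows. Note that this uses a scope guard (RAII), which is a different technique from the explicit save and restore in the patch below; `DiagnosticContext`, `ScopedFlagOverride` and `report_error` are hypothetical names for the illustration and do not exist in GCC.

```cpp
// Hypothetical stand-in for the one field of the diagnostic context
// that matters here.
struct DiagnosticContext
{
  bool abort_on_error;
};

// Saves a flag on construction, overrides it, and restores the saved
// value on destruction.
class ScopedFlagOverride
{
public:
  ScopedFlagOverride (bool &flag, bool temporary_value)
    : flag_ (flag), saved_ (flag)
  {
    flag_ = temporary_value;
  }
  ~ScopedFlagOverride () { flag_ = saved_; }

private:
  ScopedFlagOverride (const ScopedFlagOverride &);             // non-copyable
  ScopedFlagOverride &operator= (const ScopedFlagOverride &);  // non-copyable
  bool &flag_;
  bool saved_;
};

// Returns whether reporting this error would trigger an abort.
// Buffered errors never do; the caller's setting is left untouched.
bool report_error (DiagnosticContext &dc, bool buffered)
{
  if (buffered)
    {
      ScopedFlagOverride guard (dc.abort_on_error, false);
      return dc.abort_on_error;  // false while the guard is alive
    }
  return dc.abort_on_error;
}
```

Either way the invariant is the same: whatever value abort_on_error had before a buffered error is reported, it has that value again afterwards.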

Regression tested on x86-64-linux.

OK for trunk and then back port to 5 in about a week?

A test case will be included, similar to that given by Dominique in the PR.

Regards,

Jerry

2016-01-18  Jerry DeLisle  

PR fortran/65996
* error.c (gfc_error): Save the state of abort_on_error and set
it to false for buffered errors to allow normal processing.
Restore the state before leaving.

Index: error.c
===
--- error.c	(revision 232535)
+++ error.c	(working copy)
@@ -1226,6 +1226,7 @@ gfc_error (const char *gmsgid, va_list ap)
 {
   va_list argp;
   va_copy (argp, ap);
+  bool saved_abort_on_error = false;
 
   if (warnings_not_errors)
 {
@@ -1250,10 +1251,14 @@ gfc_error (const char *gmsgid, va_list ap)
 
   if (buffered_p)
 {
+  /* To prevent -dH from triggering an abort on a buffered error,
+	 save abort_on_error and restore it below.  */
+  saved_abort_on_error = global_dc->abort_on_error;
+  global_dc->abort_on_error = false;
   pp->buffer = pp_error_buffer;
   global_dc->fatal_errors = false;
   /* To prevent -fmax-errors= triggering, we decrease it before
- report_diagnostic increases it.  */
+	 report_diagnostic increases it.  */
   --errorcount;
 }
 
@@ -1264,6 +1269,8 @@ gfc_error (const char *gmsgid, va_list ap)
 {
   pp->buffer = tmp_buffer;
   global_dc->fatal_errors = fatal_errors;
+  global_dc->abort_on_error = saved_abort_on_error;
+
 }
 
   va_end (argp);

Re: [PATCH] ARM PR68620 (ICE with FP16 on armeb)

2016-01-18 Thread Alan Lawrence

Thanks for working on this, Christophe, and sorry I missed the PR. You got
further in fixing more things than I did though :). A couple of comments:

> For the vec_set_internal and neon_vld1_dup patterns, I
> switched to an existing iterator which already had the needed
> V4HF/V8HF (so I switched to VD_LANE and VQ2).

It's a separate issue, and I hadn't done this either, but looking again - I
don't see any reason why we shouldn't apply VD->VD_LANE to the vec_extract
standard name pattern too. (At present looks like we have vec_extractv8hf but no
vec_extractv4hf ?)

> For neon_vdupn, I chose to implement neon_vdup_nv4hf and
> neon_vdup_nv8hf instead of updating the VX iterator because I thought
> it was not desirable to impact neon_vrev32.

Well, the same instruction will suffice for vrev32'ing vectors of HF just as
well as vectors of HI, so I think I'd argue that's harmless enough. To gain the
benefit, we'd need to update arm_evpc_neon_vrev with a few new cases, though.

> @@ -5252,12 +5252,22 @@ vget_lane_s32 (int32x2_t __a, const int __b)
> were marked always-inline so there were no call sites, the declaration
> would nonetheless raise an error.  Hence, we must use a macro instead.  */
>
> +  /* For big-endian, GCC's vector indices are the opposite way around
> + to the architectural lane indices used by Neon intrinsics.  */

Not quite the opposite way around, as you take into account yourself! 'Reversed
within each 64 bits', perhaps?

> +#ifdef __ARM_BIG_ENDIAN
> +  /* Here, 3 is (4-1) where 4 is the number of lanes. This is also the
> + right value for vectors with 8 lanes.  */
> +#define __arm_lane(__vec, __idx) (__idx ^ 3)
> +#else
> +#define __arm_lane(__vec, __idx) __idx
> +#endif
> +

Looks right, but sounds... my concern here is that I'm hoping at some point we
will move the *other* vget/set_lane intrinsics to use GCC vector extensions
too. At which time (unlike __aarch64_lane which can be used everywhere) this
will be the wrong formula. Can we name (and/or comment) it to avoid misleading
anyone? The key characteristic seems to be that it is for vectors of 16-bit
elements only.
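
To make the point about element size concrete, here is the mapping the quoted macro computes, written as a plain function. `lane_for_16bit_elements` is a hypothetical name used only for this sketch; it is not something in arm_neon.h.

```cpp
// Hypothetical helper mirroring the quoted __arm_lane macro; meaningful
// only for 64-bit and 128-bit vectors of 16-bit elements (4 or 8 lanes).
inline int lane_for_16bit_elements (int idx, bool big_endian)
{
  // XOR with 3 reverses the index within each group of four 16-bit
  // lanes, i.e. within each 64-bit half, and leaves the choice of half
  // unchanged.
  return big_endian ? (idx ^ 3) : idx;
}
```

For eight lanes on big-endian this gives 0->3, 1->2, 2->1, 3->0, 4->7, 5->6, 6->5, 7->4, which is "reversed within each 64 bits" rather than a full reversal. The constant 3 comes from there being four 16-bit lanes per 64 bits, so it would not carry over to other element sizes.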

> @@ -5334,7 +5344,7 @@ vgetq_lane_s32 (int32x4_t __a, const int __b)
>  ({   \
>float16x8_t __vec = (__v); \
>__builtin_arm_lane_check (8, __idx);   \
> -  float16_t __res = __vec[__idx];\
> +  float16_t __res = __vec[__arm_lane(__vec, __idx)]; \

In passing - the function name in the @@ header is of course misleading, this 
is #define vgetq_lane_f16 (and the later hunks)

Thanks, Alan

Re: [PATCH] Fix the remaining PR c++/24666 blockers (arrays decay to pointers too early)

2016-01-18 Thread Patrick Palka

On Mon, Jan 18, 2016 at 10:34 AM, Jason Merrill  wrote:
> On 12/25/2015 12:37 PM, Patrick Palka wrote:
>>
>> That alone would not be sufficient because more_specialized_fn()
>> doesn't call maybe_adjust_types_for_deduction() beforehand, yet we
>> have to do the decaying there too (and on both types, not just one of
>> them).
>>
>> And maybe_adjust_types_for_deduction() seems to operate on the
>> presumption that one type is the parameter type and one is the
>> argument type. But in more_specialized_fn() and in get_bindings() we
>> are really working with two parameter types and have to decay them
>> both. So sometimes we have to decay one of the types that are
>> eventually going to get passed to unify(), and other times we want to
>> decay both types that are going to get passed to unify().
>> maybe_adjust_types_for_deduction() seems to only expect the former
>> case.
>>
>> Finally, maybe_adjust_types_for_deduction() is not called when
>> unifying a nested function declarator (because it is guarded by the
>> subr flag in unify_one_argument), so doing it there we would also
>> regress in the following test case:
>
>
> Ah, that makes sense.
>
> How about keeping the un-decayed type in the PARM_DECLs, so that we get the
> substitution failure in instantiate_template, but having the decayed type in
> the TYPE_ARG_TYPES, probably by doing the decay in grokparms, so it's
> already decayed when we're doing unification?

I just tried this, and it works well! With this approach, all but one
of the test cases pass.  The failing test case is unify17.C:

-- 8< --

void foo (int *);

template 
void bar (void (T[5])); // { dg-error "array of 'void'" }

void
baz (void)
{
  bar (0); // { dg-error "no matching function" }
}

-- 8< --

Here, we don't get a substitution failure because we don't have a
corresponding FUNCTION_DECL for the nested function specifier, only a
FUNCTION_TYPE. So there is no PARM_DECL to recurse into during
substitution, that retains the un-decayed argument type "T[5]" of the
nested function specifier.

I'm not yet sure how to work around this...
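
As background for the decay under discussion, the language-level behaviour (as opposed to GCC's internal handling of it) can be checked with a few type traits. The names below are hypothetical and chosen for this sketch; only the valid case T = int is shown, not the failing unify17.C case.

```cpp
#include <type_traits>

// An array parameter is adjusted to a pointer when the function type is
// formed, so these two declarations have the same type.
void takes_array (int values[5]);
void takes_pointer (int *values);

constexpr bool same_signature
  = std::is_same<decltype (takes_array), decltype (takes_pointer)>::value;

// Outside a parameter list the array type is distinct from the pointer
// type, until it decays.
constexpr bool array_is_distinct_type = !std::is_same<int[5], int *>::value;
constexpr bool decays_to_pointer
  = std::is_same<std::decay<int[5]>::type, int *>::value;

// Deduction through a nested function declarator: T[5] is adjusted to
// T* before being matched against the argument's parameter type.
void foo (int *) {}

template <typename T>
bool deduces_int (void (*) (T[5]))
{
  return std::is_same<T, int>::value;
}
```

Calling `deduces_int (foo)` deduces T as int because the parameter has already been adjusted from T[5] to T* by the time deduction compares the two function types.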

[PATCH] gcc/configure test for AIX DWARF

2016-01-18 Thread David Edelsohn

AIX7 has added support for DWARF to XCOFF, but complete and correct
support did not occur with a single update and the initial release of
AIX7.  The initial support defined a subset of common DWARF debug
sections.  A later update added most of the remaining sections for
location lists and frames, but the AIX Assembler did not correctly
handle references to labels generated by GCC.

This patch updates the gcc/configure test for the extended DWARF
support to ensure that the AIX toolchain correctly handles the label
reference.

Bootstrapped on powerpc-ibm-aix7.1.2.0 with and without the corrected assembler.

Okay?

Thanks, David

* configure.ac (gcc_cv_as_dwloc): Test support for debug frame section
label reference.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac(revision 232532)
+++ configure.ac(working copy)
@@ -4384,7 +4384,7 @@

 case $target in
   *-*-aix*)
-   gcc_GAS_CHECK_FEATURE([.ref support],
+   gcc_GAS_CHECK_FEATURE([AIX .ref support],
  gcc_cv_as_aix_ref, [2,21,0],,
  [ .csect stuff[[rw]]
 stuff:
@@ -4395,19 +4395,17 @@
  [AC_DEFINE(HAVE_AS_REF, 1,
[Define if your assembler supports .ref])])
;;
-esac

-case $target in
-  *-*-aix*)
-   gcc_GAS_CHECK_FEATURE([dwarf location lists section support],
+   gcc_GAS_CHECK_FEATURE([AIX DWARF location lists section support],
  gcc_cv_as_aix_dwloc, [2,21,0],,
- [ .dwsect 0xB
+ [ .dwsect 0xA
+   Lframe..0:
+   .vbyte 4,Lframe..0:
  ],,
  [AC_DEFINE(HAVE_XCOFF_DWARF_EXTRAS, 1,
-   [Define if your assembler supports .dwsect 0xB])])
+   [Define if your assembler supports AIX debug frame section label reference.])])
;;
 esac
-;;

   mips*-*-*)
 gcc_GAS_CHECK_FEATURE([explicit relocation support],

[PATCH] fix #69253 - [6 Regression] ICE in cxx_incomplete_type_diagnostic initializing a flexible array member with empty string

2016-01-18 Thread Martin Sebor


The attached patch fixes the ICE reported for the test case below:

  struct str {
int a;
char s[];
  };
  void fn1() { (struct str){1, ""}; }

While I don't think the patch is incorrect as far as it goes, it's
not the last word on the subject of initializing flexible array
members.  I uncovered a number of other problems in this area while
testing the patch.

First, while the patch rejects the submitted test case, it doesn't
reject it when the struct is defined like so:

  struct str {
char a, s[];
  };

The other problems are variations on the case above that I came
across while testing the patch.  They are summarized in c++69338
- incorrect ctor initialization of a flexible array member.

Since these outstanding problems are not strictly speaking
a regression (the code is accepted in 5.1 with -fpermissive),
rather than working up a more robust fix addressing all of these
issues I think it's probably better to fix just the ICE for now
(because that is a regression) and tackle the remaining problems
at some later point.

Martin
gcc/testsuite/ChangeLog:
2016-01-18  Martin Sebor  

	PR c++/69253
	* g++.dg/ext/flexary11.C: New test.

gcc/cp/ChangeLog:
2016-01-18  Martin Sebor  

	PR c++/69253
	* typeck2.c (cxx_incomplete_type_diagnostic): Handle flexible
	array members.

Index: gcc/cp/typeck2.c
===
--- gcc/cp/typeck2.c	(revision 232526)
+++ gcc/cp/typeck2.c	(working copy)
@@ -498,8 +498,15 @@ cxx_incomplete_type_diagnostic (const_tr
 case ARRAY_TYPE:
   if (TYPE_DOMAIN (type))
 	{
-	  type = TREE_TYPE (type);
-	  goto retry;
+	  if (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
+	{
+	  type = TREE_TYPE (type);
+	  goto retry;
+	}
+	  /* Flexible array members have no upper bound.  */
+	  emit_diagnostic (diag_kind, input_location, 0,
+			   "invalid use of a flexible array member");
+	  break;
 	}
   emit_diagnostic (diag_kind, loc, 0,
 		   "invalid use of array with unspecified bounds");
Index: gcc/testsuite/g++.dg/ext/flexary11.C
===
--- gcc/testsuite/g++.dg/ext/flexary11.C	(revision 0)
+++ gcc/testsuite/g++.dg/ext/flexary11.C	(working copy)
@@ -0,0 +1,19 @@
+// PR c++/69253 - [6 Regression] g++ ICE at -O0 on x86_64-linux-gnu
+//in "cxx_incomplete_type_diagnostic"
+// { dg-do compile }
+
+struct A {
+  int n;
+  char a [];
+};
+
+void f ()
+{
+  // Compound literals and flexible array members are G++ extensions
+  // accepted for compatibility with C and GCC.
+
+  // The following use of a flexible array member in a compound literal
+  // is invalid in C and rejected by GCC in C mode and so it's also
+  // rejected in C++ mode.
+  (struct A){ 1, "" };   // { dg-error "forbids compound-literals|initialization of a flexible array member|invalid use of a flexible array member" }
+}

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-18 Thread Toon Moene


On 01/17/2016 01:44 PM, Thomas Koenig wrote:


So... comments?  Toon, would this help you?  Could you maybe give this
a spin?


Thanks, the nightly test at my home computer will build with your patch.


2016-01-17  Thomas Koenig  

 PR fortran/66094
 * frontend-passes.c (enum matrix_case):  Add case A2B2T for
 MATMUL(A,TRANSPOSE(B)) where A and B are rank 2.
 (inline_limit_check):  Also add A2B2T.
 (matmul_lhs_realloc):  Handle A2B2T.
 (check_conjg_variable):  Rename to
 (check_conjg_transpose_variable):  and also count TRANSPOSE.
 (inline_matmul_assign):  Handle A2B2T.


It will also perform the following tests (minus the 
"inline_matmul_13.f90" one, which wasn't included in the attachments :-)



2016-01-17  Thomas Koenig  

 PR fortran/66094
 * gfortran.dg/inline_matmul_13.f90:  New test.
 * gfortran.dg/matmul_bounds_8.f90:  New test.
 * gfortran.dg/matmul_bounds_9.f90:  New test.
 * gfortran.dg/matmul_bounds_10.f90:  New test.


Unfortunately, running the whole of our weather forecasting system with 
gcc-6 will be *a lot of work*, because I have to build all kinds of 
support libraries (for which I now depend on Debian Testing) by hand.


But I hope just testing your examples will at least give you an idea (on 
-march=haswell).


Thanks, and kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

[PATCH] Fix debug info handling in prepare_shrink_wrap (PR debug/65779)

2016-01-18 Thread Jakub Jelinek

Hi!

On the following testcase with -mrelocatable on ppc32 we get assembly that
contains an undefined reference to a local .LC* symbol.
The problem is that prepare_shrink_wrap attempts to move some
instructions from the entry block to later basic blocks if they set a
register that is only used in one of the paths.  The debug insns are kept
where they used to appear (which is the correct thing to do), but nothing
adjusts or resets them.  On most targets we get away with just wrong debug
info, but on ppc32 with -mrelocatable something in the prologue sets the hard
registers used by some debug insns to the difference of two magic labels,
which are later removed as unneeded but still referenced in the debug info.

Fixed by using the infrastructure we have for this in valtrack.c, which is
also used by DCE and the DF note problem computation.

E.g. on ppc32, we used to have:
(insn 34 38 35 2 (set (reg/v:SI 6 6 [orig:238 s1 ] [238])
(and:SI (reg/v:SI 3 3 [orig:264 adler ] [264])
(const_int 65535 [0xffff]))) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootloader/../../../powerpc/sha
 (nil))
(debug_insn 35 34 36 2 (var_location:SI s1 (reg/v:SI 6 6 [orig:238 s1 ] [238])) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_
 (nil))
(insn 36 35 37 2 (set (reg/v:SI 0 0 [orig:239 s2 ] [239])
(lshiftrt:SI (reg/v:SI 3 3 [orig:264 adler ] [264])
(const_int 16 [0x10]))) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootloader/../../../powerpc/shared/b
 (nil))
(debug_insn 37 36 39 2 (var_location:SI s2 (reg/v:SI 0 0 [orig:239 s2 ] [239])) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_
 (nil))
in bb2 and decide to move insns 34 and 36 to the start of bb3.
Before the patch we'd keep the debug insns untouched, while with the
patch we get:
(debug_insn 365 38 35 2 (var_location:SI D#38 (and:SI (reg/v:SI 3 3 [orig:264 
adler ] [264])
(const_int 65535 [0xffff]))) -1
 (nil))
(debug_insn 35 365 363 2 (var_location:SI s1 (debug_expr:SI D#38)) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootl
 (nil))
(debug_insn 363 35 37 2 (var_location:SI D#37 (lshiftrt:SI (reg/v:SI 3 3 
[orig:264 adler ] [264])
(const_int 16 [0x10]))) -1
 (nil))
(debug_insn 37 363 39 2 (var_location:SI s2 (debug_expr:SI D#37)) 
../../../../../../../../../rtems/c/src/lib/libbsp/powerpc/motorola_powerpc/bootlo
 (nil))
in bb2.  Even on x86_64-linux I saw an improvement in debug info
coverage on this testcase (the debug info didn't end up being wrong
there, but one of the variables used to be <optimized out> in a
certain range without the patch).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-01-19  Jakub Jelinek  

PR debug/65779
* shrink-wrap.c: Include valtrack.h.
(move_insn_for_shrink_wrap): Add DEBUG argument.  If
MAY_HAVE_DEBUG_INSNS, call dead_debug_add on DEBUG_INSNs
in between insn and where it will be moved to.  Call
dead_debug_insert_temp.
(prepare_shrink_wrap): Adjust caller.  Call dead_debug_local_init
first and dead_debug_local_finish at the end.
For uses and defs bitmap, handle all regs in between REGNO and
END_REGNO, not just the first one.

* gcc.dg/pr65779.c: New test.
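The last prepare_shrink_wrap item in the ChangeLog above (handle every register between REGNO and END_REGNO) can be illustrated with a toy model. This is only a sketch: toy_regset and toy_mark_range are hypothetical names, and GCC's real HARD_REG_SET and bitmap types work differently.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a hard-register set: bit N is set when hard
   register N is in the set.  */
typedef uint64_t toy_regset;

/* Mark every register in the half-open range [regno, end_regno).
   A value wider than one register occupies several consecutive hard
   registers, so marking only REGNO would miss the rest.  */
toy_regset
toy_mark_range (toy_regset set, unsigned regno, unsigned end_regno)
{
  assert (regno <= end_regno && end_regno <= 64);

  for (unsigned r = regno; r < end_regno; r++)
    set |= (toy_regset) 1 << r;
  return set;
}
```

With a single-register value the range is [regno, regno + 1) and the loop degenerates to the old behaviour; with a two-register value both bits are set.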

--- gcc/shrink-wrap.c.jj   2016-01-08 07:31:10.0 +0100
+++ gcc/shrink-wrap.c   2016-01-18 20:06:53.029487254 +0100
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.
 #include "shrink-wrap.h"
 #include "regcprop.h"
 #include "rtl-iter.h"
+#include "valtrack.h"
 
 
 /* Return true if INSN requires the stack frame to be set up.
@@ -149,7 +150,8 @@ static bool
 move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn,
   const HARD_REG_SET uses,
   const HARD_REG_SET defs,
-  bool *split_p)
+  bool *split_p,
+  struct dead_debug_local *debug)
 {
   rtx set, src, dest;
   bitmap live_out, live_in, bb_uses, bb_defs;
@@ -158,6 +160,8 @@ move_insn_for_shrink_wrap (basic_block b
   unsigned int end_sregno = FIRST_PSEUDO_REGISTER;
   basic_block next_block;
   edge live_edge;
+  rtx_insn *dinsn;
+  df_ref def;
 
   /* Look for a simple register assignment.  We don't use single_set here
  because we can't deal with any CLOBBERs, USEs, or REG_UNUSED secondary
@@ -298,6 +302,19 @@ move_insn_for_shrink_wrap (basic_block b
   *split_p = true;
 }
 
+  if (MAY_HAVE_DEBUG_INSNS)
+{
+  for (dinsn = BB_END (bb); dinsn != insn; dinsn = PREV_INSN (dinsn))
+   if (DEBUG_INSN_P (dinsn))
+ {
+   df_ref use;
+   FOR_EACH_INSN_USE (use, dinsn)
+ if (refers_to_regno_p (dregno, end_dregno,
+DF_REF_REG (use), (rtx *) NULL))
+   dead_debug_add (debug, use, DF_REF_REGNO

[PATCH] Fix RTL DSE (PR rtl-optimization/68955, take 2)

2016-01-18 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 11:40:23AM +0100, Eric Botcazou wrote:
> > So, do you suggest to tweak get_addr like the patch below, and remove the
> >   mem_addr = get_addr (mem_addr);
> > line above and the comment?
> 
> Yes, exactly.  And if that doesn't easily work, then go for your solution and 
> add a blurb to the comment explaining why get_addr cannot be easily changed.

Bootstrap/regtest passed on x86_64-linux and i686-linux, ok for trunk?

I saw no changes in the DSE locally_deleted/globally_deleted statistics
from the x86_64-linux and i686-linux bootstraps/regtests (unpatched vs.
patched compiler), other than in the testcase from this PR on i686-linux
at -O3 -g, and a varying number of compilations of
struct-layout-1_generate.c (but always the same number of DSE-deleted
insns in them) due to make -jN -k check (the number of
compilations/invocations of the generator depends on the scheduling of
the make check tasks, unlike the number of actual tests).

2016-01-19  Jakub Jelinek  

PR rtl-optimization/68955
PR rtl-optimization/64557
* dse.c (record_store, check_mem_read_rtx): Don't call get_addr
here.  Fix up formatting.
* alias.c (get_addr): Handle VALUE + CONST_INT.

* gcc.dg/torture/pr68955.c: New test.
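The alias.c change can be pictured with a toy model of address resolution. This is a sketch under stated assumptions, not GCC code: toy_rtx, toy_addr and toy_get_addr are hypothetical, and the real get_addr walks cselib location lists and builds new RTL with plus_constant instead of returning a base/offset pair.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_VALUE, TOY_PLUS_CONST };

/* Toy expression node.
   TOY_VALUE:      op0 is the known address for the value, or NULL.
   TOY_PLUS_CONST: op0 is the base expression, offset the constant.  */
struct toy_rtx {
  enum toy_code code;
  const struct toy_rtx *op0;
  long offset;
  int regno;
};

/* A resolved address, meaning BASE + OFFSET.  */
struct toy_addr {
  const struct toy_rtx *base;
  long offset;
};

/* Resolve X.  A bare VALUE is replaced by its known address.  A
   VALUE plus a constant is resolved through the VALUE and the
   constant is re-applied, which is the case the patch adds.
   Anything else is returned unchanged.  */
struct toy_addr
toy_get_addr (const struct toy_rtx *x)
{
  assert (x != NULL);
  struct toy_addr unchanged = { x, 0 };

  if (x->code == TOY_PLUS_CONST && x->op0->code == TOY_VALUE)
    {
      struct toy_addr inner = toy_get_addr (x->op0);
      if (inner.base != x->op0)
        {
          /* The VALUE was resolved: fold the constant back in.  */
          inner.offset += x->offset;
          return inner;
        }
      return unchanged;
    }

  if (x->code == TOY_VALUE && x->op0 != NULL)
    {
      struct toy_addr resolved = { x->op0, 0 };
      return resolved;
    }

  return unchanged;
}
```

Before the patch, the equivalent of the TOY_PLUS_CONST branch did not exist, so callers such as record_store had to call get_addr on the bare VALUE before adding the offset.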

--- gcc/dse.c.jj   2016-01-15 20:37:24.0 +0100
+++ gcc/dse.c   2016-01-18 12:18:04.115988214 +0100
@@ -1515,14 +1515,9 @@ record_store (rtx body, bb_info_t bb_inf
mem_addr = base->val_rtx;
   else
{
- group_info *group
-   = rtx_group_vec[group_id];
+ group_info *group = rtx_group_vec[group_id];
  mem_addr = group->canon_base_addr;
}
-  /* get_addr can only handle VALUE but cannot handle expr like:
-VALUE + OFFSET, so call get_addr to get original addr for
-mem_addr before plus_constant.  */
-  mem_addr = get_addr (mem_addr);
   if (offset)
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
 }
@@ -2128,14 +2123,9 @@ check_mem_read_rtx (rtx *loc, bb_info_t
mem_addr = base->val_rtx;
   else
{
- group_info *group
-   = rtx_group_vec[group_id];
+ group_info *group = rtx_group_vec[group_id];
  mem_addr = group->canon_base_addr;
}
-  /* get_addr can only handle VALUE but cannot handle expr like:
-VALUE + OFFSET, so call get_addr to get original addr for
-mem_addr before plus_constant.  */
-  mem_addr = get_addr (mem_addr);
   if (offset)
mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
 }
--- gcc/alias.c.jj  2016-01-14 17:01:09.0 +0100
+++ gcc/alias.c 2016-01-18 10:30:46.780994699 +0100
@@ -2203,7 +2203,23 @@ get_addr (rtx x)
   struct elt_loc_list *l;
 
   if (GET_CODE (x) != VALUE)
-return x;
+{
+  if ((GET_CODE (x) == PLUS || GET_CODE (x) == MINUS)
+ && GET_CODE (XEXP (x, 0)) == VALUE
+ && CONST_SCALAR_INT_P (XEXP (x, 1)))
+   {
+ rtx op0 = get_addr (XEXP (x, 0));
+ if (op0 != XEXP (x, 0))
+   {
+ if (GET_CODE (x) == PLUS
+ && GET_CODE (XEXP (x, 1)) == CONST_INT)
+   return plus_constant (GET_MODE (x), op0, INTVAL (XEXP (x, 1)));
+ return simplify_gen_binary (GET_CODE (x), GET_MODE (x),
+ op0, XEXP (x, 1));
+   }
+   }
+  return x;
+}
   v = CSELIB_VAL_PTR (x);
   if (v)
 {
--- gcc/testsuite/gcc.dg/torture/pr68955.c.jj   2016-01-18 12:18:31.254612960 +0100
+++ gcc/testsuite/gcc.dg/torture/pr68955.c  2016-01-18 12:18:31.254612960 +0100
@@ -0,0 +1,41 @@
+/* PR rtl-optimization/68955 */
+/* { dg-do run } */
+/* { dg-output "ONE1ONE" } */
+
+int a, b, c, d, g, m;
+int i[7][7][5] = { { { 5 } }, { { 5 } },
+  { { 5 }, { 5 }, { 5 }, { 5 }, { 5 }, { -1 } } };
+static int j = 11;
+short e, f, h, k, l;
+
+static void
+foo ()
+{
+  for (; e < 5; e++)
+for (h = 3; h; h--)
+  {
+   for (g = 1; g < 6; g++)
+ {
+   m = c == 0 ? b : b / c;
+   i[e][1][e] = i[1][1][1] | (m & l) && f;
+ }
+   for (k = 0; k < 6; k++)
+ {
+   for (d = 0; d < 6; d++)
+ i[1][e][h] = i[h][k][e] >= l;
+   i[e + 2][h + 3][e] = 6 & l;
+   i[2][1][2] = a;
+   for (; j < 5;)
+ for (;;)
+   ;
+ }
+  }
+}
+
+int
+main ()
+{
+  foo ();
+  __builtin_printf ("ONE%dONE\n", i[1][0][2]);
+  return 0;
+}


Jakub

[PATCH] fix #69251 - [6 Regression] ICE in unify_array_domain on a flexible array member

2016-01-18 Thread Martin Sebor


The attached is a minimal patch to avoid the ICE.  The patch doesn't
fix the type substitution of flexible array members as that seems
more involved and is, strictly speaking, outside the scope of this
bug.

Type substitution of flexible array members is wrong in 5.3.0 (which
treats flexible array members the same as zero-length arrays).  It's
also wrong in 6.0 but for a different reason: their domain has no
upper bound, unlike arrays of unspecified bound, which have no domain
at all.  Fixing that will require more time and surgery than just
fixing the ICE and might also be more intrusive than is appropriate
at this stage.
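The distinction between the two kinds of array can be modelled in a few lines. This is a toy sketch, not GCC code: toy_domain and the two functions are hypothetical stand-ins for TYPE_DOMAIN, TYPE_MAX_VALUE and the mismatch test in unify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an array type's domain.  An array of unspecified
   bound has no domain at all (a null pointer here).  A flexible
   array member has a domain, but has_max is false because the
   domain has no upper bound.  */
struct toy_domain {
  bool has_max;
  long max;
};

/* True if the array has no usable upper bound, for either reason.  */
bool
toy_no_bound (const struct toy_domain *d)
{
  return d == NULL || !d->has_max;
}

/* Models the patched mismatch test: a parameter of unspecified
   bound matches an argument that either has no domain or has a
   domain without an upper bound.  */
bool
toy_domains_mismatch (const struct toy_domain *parm,
                      const struct toy_domain *arg)
{
  return (parm == NULL) != toy_no_bound (arg);
}
```

In this model, a flexible array argument no longer reaches the domain comparison with a missing upper bound, which is where the ICE came from; it says nothing about whether the resulting substitution is right.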

Jason, please let me know whether or not you would like to see
the substitution failure fixed before the upcoming release as well.

Martin
gcc/testsuite/ChangeLog:
2016-01-18  Martin Sebor  

	PR c++/69251
	* g++.dg/ext/flexarray-subst.C: New test.

gcc/cp/ChangeLog:
2016-01-18  Martin Sebor  

	PR c++/69251
	* pt.c (unify): Handle flexible array members somewhat more gracefully.

Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c	(revision 232526)
+++ gcc/cp/pt.c	(working copy)
@@ -19657,12 +19657,17 @@ unify (tree tparms, tree targs, tree par
 case ARRAY_TYPE:
   if (TREE_CODE (arg) != ARRAY_TYPE)
 	return unify_type_mismatch (explain_p, parm, arg);
+
+  /* Flexible array members have no upper bound.  Have them match
+ template parameters of unspecified bounds (which have a null
+ domain).  */
   if ((TYPE_DOMAIN (parm) == NULL_TREE)
-	  != (TYPE_DOMAIN (arg) == NULL_TREE))
-	return unify_type_mismatch (explain_p, parm, arg);
+  != (TYPE_DOMAIN (arg) == NULL_TREE
+  || TYPE_MAX_VALUE (TYPE_DOMAIN (arg)) == NULL_TREE))
+return unify_type_mismatch (explain_p, parm, arg);
   RECUR_AND_CHECK_FAILURE (tparms, targs, TREE_TYPE (parm), TREE_TYPE (arg),
 			   strict & UNIFY_ALLOW_MORE_CV_QUAL, explain_p);
-  if (TYPE_DOMAIN (parm) != NULL_TREE)
+  if (TYPE_DOMAIN (parm))
 	return unify_array_domain (tparms, targs, TYPE_DOMAIN (parm),
    TYPE_DOMAIN (arg), explain_p);
   return unify_success (explain_p);
Index: gcc/testsuite/g++.dg/ext/flexarray-subst.C
===
--- gcc/testsuite/g++.dg/ext/flexarray-subst.C	(revision 0)
+++ gcc/testsuite/g++.dg/ext/flexarray-subst.C	(working copy)
@@ -0,0 +1,33 @@
+// PR c++/69251 - [6 Regression] ICE (segmentation fault) in unify_array_domain
+// on i686-linux-gnu
+// { dg-do compile }
+
+struct A { int n; char a[]; };
+
+template 
+struct B;
+
+// The following definition shouldn't be needed but is provided to prevent
+// the test from failing with an error due to PR c++/69349 - template
+// substitution error for flexible array members.  (This doesn't compromise
+// the validity of this test since all it tests for is the absence of
+// the ICE.)
+template 
+struct B { typedef int X; };
+
+template 
+struct B { typedef int X; };
+
+template 
+struct C { typedef typename B::X X; };
+
+template 
+int foo (T&, typename C::X = 0)
+{
+  return 0;
+}
+
+void bar (A *a)
+{
+  foo (a->a);
+}

Re: [PATCH 1/2] fix memory chunk corruption for opts_obstack (PR jit/68446)

2016-01-18 Thread Jakub Jelinek

On Fri, Jan 15, 2016 at 03:04:33PM -0500, David Malcolm wrote:
> OK for trunk?  (assuming it bootstraps by itself)
> 
> gcc/ChangeLog:
>   PR jit/68446
>   * gcc.c (driver::decode_argv): Add call to
>   init_opts_obstack before init_options_struct.
>   * opts.c (init_opts_obstack): Remove idempotency.
>   (init_options_struct): Replace call to init_opts_obstack
>   with a gcc_assert to verify that it has already been called.
>   * toplev.c (toplev::main): Add call to init_opts_obstack before
>   calls to init_options_struct.
>   (toplev::finalize): Move cleanup of opts_obstack next to
>   cleanup of save_decoded_options, clearing the latter, and
>   save_decoded_options_count.

Ok.

Jakub

-z bndplt documentation in GCC manual

2016-01-18 Thread Sandra Loosemore

I think the documentation relating to '-z bndplt' in the GCC manual 
description of -fcheck-pointer-bounds is incorrect.  It looks like, as 
of r225862, the GCC driver is supposed to emit an error message if GCC 
was configured with a linker that doesn't support this option and you 
pass -mmpx without -static.  Is that right?  I'll fix the documentation 
once I'm clear on what the actual behavior is.


-Sandra

Re: reject decl with incomplete struct/union type in check_global_declaration()

2016-01-18 Thread Joseph Myers

On Sat, 16 Jan 2016, Prathamesh Kulkarni wrote:

> > There's a GNU C extension allowing forward declarations of enums, and it
> > seems that
> >
> > static enum e x;
> >
> > doesn't get diagnosed either with -fsyntax-only.  Thus I think you should
> > cover that case as well.
> Done in the attached patch. It regresses pr63549.c because the error
> showed up there, adjusted
> the test-case to accept the error.
> OK to commit ?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-18 Thread H.J. Lu

On Thu, Dec 24, 2015 at 3:55 AM, Alan Lawrence  wrote:
> This version changes the test cases to fix failures on some platforms, by
> rewriting the initializers so that they aren't pushed out to the constant 
> pool.
>
> gcc/ChangeLog:
>
> * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and 
> ARRAY_REF
> using get_ref_base_and_extent.
> (equal_mem_array_ref_p): New.
> (hashable_expr_equal_p): Add call to previous.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69352

-- 
H.J.

[PR66726] Fix regression caused by Factor conversion out of COND_EXPR

2016-01-18 Thread Kugan

Hi,

This is an updated version of
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02196.html.

The patch that fixed the PR66726 missed optimization by factoring
conversions out of COND_EXPR caused a regression on targets with a
higher branch cost: the testcase gcc.dg/pr46309.c failed on these
targets.  I posted a patch for this which had some issues.  Please find
an updated version of that patch, which now passes regression testing.

This patch makes optimize_range_tests understand the factored-out
COND_EXPR: final_range_test_p now looks for the new pattern, and
maybe_optimize_range_tests (which does the inter-basic-block range
test optimization) is changed accordingly.
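As a reminder of what a range test optimization buys, here is a small sketch of the source-level equivalence involved. The function names are illustrative only, and the exact form the reassoc pass emits for gcc.dg/pr46309.c is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* A chain of equality tests against consecutive constants.  Each
   comparison produces a boolean that feeds the final result.  */
bool
in_set_chain (int a)
{
  return a == 1 || a == 2 || a == 3 || a == 4;
}

/* The single range test the chain is equivalent to.  The
   subtraction is done in unsigned arithmetic so that values below 1
   wrap around to large numbers and fail the comparison.  */
bool
in_set_range (int a)
{
  return (unsigned) a - 1u <= 3u;
}
```

The second form needs one subtraction and one comparison instead of up to four compare-and-branch sequences, which matters most on the targets where branches are expensive.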

Bootstrapped and regression tested on x86_64-none-linux-gnu with no new
regressions.  Also regression tested on arm-none-linux-gnu and
aarch64-none-linux-gnu with no new regressions.
Is this OK for trunk?

Thanks,
Kugan


gcc/ChangeLog:

2016-01-19  Kugan Vivekanandarajah  

PR middle-end/66726
* tree-ssa-reassoc.c (optimize_range_tests): Handle tcc_compare stmt
whose result is used in PHI.
(maybe_optimize_range_tests): Likewise.
(final_range_test_p): Likewise.

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index e53cc56..d0a5cee 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -2687,18 +2687,33 @@ optimize_range_tests (enum tree_code opcode,
# _345 = PHI <_123(N), 1(...), 1(...)>
where _234 has bool type, _123 has single use and
bb N has a single successor M.  This is commonly used in
+   the last block of a range test.
+
+   Also Return true if STMT is tcc_compare like:
+   <bb N>:
+   ...
+   _234 = a_2(D) == 2;
+
+   <bb M>:
+   # _345 = PHI <_234(N), 1(...), 1(...)>
+   _346 = (int) _345;
+   where _234 has booltype, single use and
+   bb N has a single successor M.  This is commonly used in
the last block of a range test.  */
 
 static bool
 final_range_test_p (gimple *stmt)
 {
-  basic_block bb, rhs_bb;
+  basic_block bb, rhs_bb, lhs_bb;
   edge e;
   tree lhs, rhs;
   use_operand_p use_p;
   gimple *use_stmt;
 
-  if (!gimple_assign_cast_p (stmt))
+  if (!gimple_assign_cast_p (stmt)
+  && (!is_gimple_assign (stmt)
+ || (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+ != tcc_comparison)))
 return false;
   bb = gimple_bb (stmt);
   if (!single_succ_p (bb))
@@ -2709,11 +2724,16 @@ final_range_test_p (gimple *stmt)
 
   lhs = gimple_assign_lhs (stmt);
   rhs = gimple_assign_rhs1 (stmt);
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-  || TREE_CODE (rhs) != SSA_NAME
-  || TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE)
+  if (gimple_assign_cast_p (stmt)
+  && (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+ || TREE_CODE (rhs) != SSA_NAME
+ || TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE))
 return false;
 
+  if (!gimple_assign_cast_p (stmt)
+  && (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE))
+  return false;
+
   /* Test whether lhs is consumed only by a PHI in the only successor bb.  */
   if (!single_imm_use (lhs, &use_p, &use_stmt))
 return false;
@@ -2723,10 +2743,20 @@ final_range_test_p (gimple *stmt)
 return false;
 
   /* And that the rhs is defined in the same loop.  */
-  rhs_bb = gimple_bb (SSA_NAME_DEF_STMT (rhs));
-  if (rhs_bb == NULL
-  || !flow_bb_inside_loop_p (loop_containing_stmt (stmt), rhs_bb))
-return false;
+  if (gimple_assign_cast_p (stmt))
+{
+  if (TREE_CODE (rhs) != SSA_NAME
+ || !(rhs_bb = gimple_bb (SSA_NAME_DEF_STMT (rhs)))
+ || !flow_bb_inside_loop_p (loop_containing_stmt (stmt), rhs_bb))
+   return false;
+}
+  else
+{
+  if (TREE_CODE (lhs) != SSA_NAME
+ || !(lhs_bb = gimple_bb (SSA_NAME_DEF_STMT (lhs)))
+ || !flow_bb_inside_loop_p (loop_containing_stmt (stmt), lhs_bb))
+   return false;
+}
 
   return true;
 }
@@ -3119,6 +3149,8 @@ maybe_optimize_range_tests (gimple *stmt)
 
  /* stmt is
 _123 = (int) _234;
+OR
+_234 = a_2(D) == 2;
 
 followed by:
 <bb M>:
@@ -3148,6 +3180,8 @@ maybe_optimize_range_tests (gimple *stmt)
 of the bitwise or resp. and, recursively.  */
  if (!get_ops (rhs, code, &ops,
loop_containing_stmt (stmt))
+ && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+ != tcc_comparison)
  && has_single_use (rhs))
{
  /* Otherwise, push the _234 range test itself.  */
@@ -3160,6 +3194,23 @@ maybe_optimize_range_tests (gimple *stmt)
  ops.safe_push (oe);
  bb_ent.last_idx++;
}
+ else if (!get_ops (lhs, code, &ops,
+loop_containing_stmt (stmt))
+  && TREE_CODE (lhs) == SSA_NAME
+  && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+  && is_gimple_assign (stmt)
+  && (TREE_CODE_CLASS
