Re: [Patch] [x86_64] libgcc changes to add znver1

2015-10-30 Thread Uros Bizjak
On Thu, Oct 29, 2015 at 2:16 PM, Kumar, Venkataramanan
 wrote:
> Hi Uros,
>
> As per your comments in 
> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg02326.html  please find the 
> patch that also adds changes to libgcc.
>
> It was bootstrapped and regression tested on x86_64.
>
> Ok for trunk?
>
> Change logs
> gcc/ChangeLog
>
> 2015-10-29  Venkataramanan Kumar  
>
>* config/i386/i386.c (get_builtin_code_for_version): Set priority for
>PROCESSOR_ZNVER1.
>(enum processor_model): Add M_AMDFAM17H_znver1.
>(struct arch_names_table): Likewise.
>* doc/extend.texi: Add znver1.
>
> libgcc/ChangeLog
> 2015-10-12  Venkataramanan Kumar
>
>* config/i386/cpuinfo.c (enum processor_types): Add AMDFAM17H.
>(processor_subtypes): Add znver1.
>(get_amd_cpu): Detect znver1.

OK.

Thanks,
Uros.


Re: Robustify REAL_MODE_FORMAT

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 5:32 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On October 29, 2015 4:33:17 PM GMT+01:00, Bernd Schmidt
>>  wrote:
>>>On 10/29/2015 04:30 PM, Richard Sandiford wrote:
>>>> Make sure that REAL_MODE_FORMAT aborts if it is passed an invalid mode,
>>>> rather than stepping beyond the bounds of an array.  It turned out that
>>>> some code was passing non-float modes to the real.h routines.
>>>
>>>> gcc/
>>>> * real.h (REAL_MODE_FORMAT): Abort if the mode isn't a
>>>> SCALAR_FLOAT_MODE_P.
>>>
>>>I'm assuming that the code you mention has already been fixed so that we
>>>don't trigger the abort. Ok.
>>
>> Rather than the weird macro can't we turn real_mode_format to an inline
>> function?
>
> It needs to be an lvalue for things like:
>
> REAL_MODE_FORMAT (TFmode) = _extended_format;
>
> I suppose we could return a non-const reference, but I'd rather stay
> clear of returning those :-)

Yes please.  But SET_REAL_MODE_FORMAT (TFmode, _extended_format)
would work as well.

Richard.

> Thanks,
> Richard
>
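
The accessor-plus-setter shape suggested above can be sketched roughly as
follows.  This is only an illustration, not the real.h code: toy_mode,
toy_format, format_table, is_scalar_float_mode, real_mode_format and
set_real_mode_format are hypothetical stand-ins for the GCC machine modes,
struct real_format and the REAL_MODE_FORMAT / SET_REAL_MODE_FORMAT pair.

```cpp
#include <cassert>   // only needed by callers that want to check results
#include <cstdlib>

// Hypothetical stand-ins for GCC's machine modes and real_format; these are
// not the definitions from machmode.h or real.h.
enum toy_mode { SImode, SFmode, DFmode, TFmode, NUM_TOY_MODES };

struct toy_format { int precision_bits; };

static const toy_format extended_format = { 64 };

// The table has one slot per float mode only, which is why an unchecked
// non-float mode would index outside the array.
static const int FIRST_FLOAT_MODE = SFmode;
static const toy_format *format_table[NUM_TOY_MODES - FIRST_FLOAT_MODE];

inline bool is_scalar_float_mode (toy_mode m)
{
  return m >= SFmode && m <= TFmode;
}

// Read accessor: returns the pointer by value, so no non-const reference
// escapes to the caller.
inline const toy_format *real_mode_format (toy_mode m)
{
  if (!is_scalar_float_mode (m))
    std::abort ();   // abort rather than step beyond the array bounds
  return format_table[m - FIRST_FLOAT_MODE];
}

// Separate setter, in the spirit of SET_REAL_MODE_FORMAT (mode, format).
inline void set_real_mode_format (toy_mode m, const toy_format *fmt)
{
  if (!is_scalar_float_mode (m))
    std::abort ();
  format_table[m - FIRST_FLOAT_MODE] = fmt;
}
```

With this split, a back end would write
set_real_mode_format (TFmode, &extended_format) instead of assigning
through the accessor, and passing SImode to either function aborts.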


[PATCH] Allow more pointer-plus folding

2015-10-30 Thread Tom de Vries
[ was: Re: [PATCH] Don't handle CAST_RESTRICT (PR 
tree-optimization/49279)  ]


On 29/10/15 12:38, Richard Biener wrote:

On Thu, Oct 29, 2015 at 11:38 AM, Tom de Vries  wrote:

[ quote-pasted from https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00464.html
]


CAST_RESTRICT based disambiguation unfortunately isn't reliable,
e.g. to store a non-restrict pointer into a restricted field,
we add a non-useless cast to restricted pointer in the gimplifier,
and while we don't consider that field to have a special restrict tag
because it is unsafe to do so, we unfortunately create it for the
CAST_RESTRICT before that and end up with different restrict tags
for the same thing.  See the PR for more details.

This patch turns off CAST_RESTRICT handling for now, in the future
we might try to replace it by explicit CAST_RESTRICT stmts in some form,
but need to solve problems with multiple inlined copies of the same
function
with restrict arguments or restrict variables in it and intermixed code
from
them (or similarly code from different non-overlapping source blocks).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
4.6 too?

2011-10-06  Jakub Jelinek  

 PR tree-optimization/49279
 * tree-ssa-structalias.c (find_func_aliases): Don't handle
 CAST_RESTRICT.
 * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Allow
 restrict propagation.
 * tree-ssa.c (useless_type_conversion_p): Don't return false
 if TYPE_RESTRICT differs.

 * gcc.dg/tree-ssa/restrict-4.c: XFAIL.
 * gcc.c-torture/execute/pr49279.c: New test.



Hi,

In the patch adding support for CAST_RESTRICT (
https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00176.html ) there was also a
bit:
...
 * fold-const.c (fold_unary_loc): Don't optimize
 POINTER_PLUS_EXPR casted to TYPE_RESTRICT pointer by
 casting the inner pointer if it isn't TYPE_RESTRICT.
...
which is still around. I suppose we can remove this bit as well.

OK for trunk if bootstrap and reg-test succeeds?


Ok.


Committed.


I think the checks on TREE_OPERAND (arg0, 1) are bogus though
and either we should unconditionally sink the conversion or only
if a conversion on TREE_OPERAND (arg0, 0) vanishes (I prefer the
latter).



Like this? OK for trunk if bootstrap/reg-test succeeds?

Thanks,
- Tom

Allow more pointer-plus folding

2015-10-30  Tom de Vries  

	* fold-const.c (fold_unary_loc): Allow more POINTER_PLUS_EXPR folding.
---
 gcc/fold-const.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 47ed609..6763e80 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -7770,9 +7770,7 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
 	 that this happens when X or Y is NOP_EXPR or Y is INTEGER_CST. */
   if (POINTER_TYPE_P (type)
 	  && TREE_CODE (arg0) == POINTER_PLUS_EXPR
-	  && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
-	  || TREE_CODE (TREE_OPERAND (arg0, 0)) == NOP_EXPR
-	  || TREE_CODE (TREE_OPERAND (arg0, 1)) == NOP_EXPR))
+	  && TREE_CODE (TREE_OPERAND (arg0, 0)) == NOP_EXPR)
 	{
 	  tree arg00 = TREE_OPERAND (arg0, 0);
 	  tree arg01 = TREE_OPERAND (arg0, 1);
-- 
1.9.1
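
To see which trees the changed condition accepts, here is a small
self-contained model of just the operand checks from the hunk above.  It is
a sketch under stated assumptions: toy_code, toy_expr, old_condition and
new_condition are hypothetical names, the POINTER_TYPE_P (type) test is
left out, and nothing here is GCC's tree API.

```cpp
#include <cassert>   // only needed by callers that want to check results

// Hypothetical miniature of the tree codes involved; not GCC's enum tree_code.
enum toy_code { VAR, INT_CST, NOP, PTR_PLUS };

struct toy_expr
{
  toy_code code;
  const toy_expr *op0;   // like TREE_OPERAND (t, 0)
  const toy_expr *op1;   // like TREE_OPERAND (t, 1)
};

// Operand checks removed by the patch: fold if the offset is a constant or
// either operand is a conversion.
inline bool old_condition (const toy_expr &arg0)
{
  return arg0.code == PTR_PLUS
	 && (arg0.op1->code == INT_CST
	     || arg0.op0->code == NOP
	     || arg0.op1->code == NOP);
}

// Operand check added by the patch: fold only if the pointer operand is
// itself a conversion.
inline bool new_condition (const toy_expr &arg0)
{
  return arg0.code == PTR_PLUS && arg0.op0->code == NOP;
}
```

For p + 4 with a plain variable p, old_condition is true and new_condition
is false; for ((T) p) + 4 both are true.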



Re: [PATCH 08/10] Wire things up so that libcpp users get token underlines

2015-10-30 Thread Jeff Law

On 10/23/2015 02:41 PM, David Malcolm wrote:

A previous patch introduced the ability to print one or more ranges
for a diagnostic via a rich_location class.

Another patch generalized source_location (aka location_t) to be both
a caret and a range, and generated range information for all tokens
coming out of libcpp's lexer.

The attached patch combines these efforts by updating the
rich_location constructor for a single source_location so that it
makes use of the range within the source_location.  Doing so requires
passing the line_table to the ctor, so that it can extract the range
from there.

The effect of this is that all of the various "warning", "warning_at"
"error", "error_at" diagnostics now emit underlines showing the range
of the token associated with the location_t (or input_location), for
those frontends using libcpp.  Similar things should happen for
expressions in the C FE for diagnostics using EXPR_LOCATION.

A test case is added showing various token-based warnings that now
have underlines (without having to go through and add range information
to them).  For example:

diagnostic-token-ranges.c: In function ‘wide_string_literal_in_asm’:
diagnostic-token-ranges.c:68:8: error: wide string literal in ‘asm’
  asm (L"nop");
       ^~

gcc/c-family/ChangeLog:
* c-opts.c (c_common_init_options): Set
global_dc->colorize_source_p.

gcc/c/ChangeLog:
* c-decl.c (warn_defaults_to): Pass line_table to
rich_location ctor.
* c-errors.c (pedwarn_c99): Likewise.
(pedwarn_c90): Likewise.

gcc/cp/ChangeLog:
* error.c (pedwarn_cxx98): Pass line_table to
rich_location ctor.

gcc/ChangeLog:
* diagnostic.c (diagnostic_append_note): Pass line_table to
rich_location ctor.
(emit_diagnostic): Likewise.
(inform): Likewise.
(inform_n): Likewise.
(warning): Likewise.
(warning_at): Likewise.
(warning_n): Likewise.
(pedwarn): Likewise.
(permerror): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(sorry): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
(real_abort): Likewise.
* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
Likewise.
* genmatch.c (fatal_at): Likewise.
(warning_at): Likewise.
* rtl-error.c (diagnostic_for_asm): Likewise.

gcc/fortran/ChangeLog:
* error.c (gfc_warning): Pass line_table to rich_location ctor.
(gfc_warning_now_at): Likewise.
(gfc_warning_now): Likewise.
(gfc_error_now): Likewise.
(gfc_fatal_error): Likewise.
(gfc_error): Likewise.
(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
* gcc.dg/diagnostic-token-ranges.c: New file.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(test_show_locus): Pass line_table to rich_location ctors.
(plugin_init): Remove setting of global_dc->colorize_source_p.
* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
Remove include of gcc-rich-location.h.
(get_range_for_expr): Delete.
(gcc_rich_location::add_expr): Delete.
(emit_warning): Change param from rich_location * to location_t.
Require an ad-hoc location, and extract range from it.
Use warning_at directly, without using a rich_location.
(cb_walk_tree_fn): Pass EXPR_LOCATION (arg) directly to
emit_warning, without creating a rich_location.

libcpp/ChangeLog:
* errors.c (cpp_diagnostic): Pass pfile->line_table to
rich_location ctor.
(cpp_diagnostic_with_line): Likewise.
* include/line-map.h (rich_location::rich_location): Add
line_maps * param.
* line-map.c (rich_location::rich_location): Likewise; use
it to extract the range from the source_location.

OK.  Commit with prereqs.

jeff




Re: [PATCH 06/10] Track expression ranges in C frontend

2015-10-30 Thread Jeff Law

On 10/23/2015 02:41 PM, David Malcolm wrote:

As in the previous version of this patch
  "Implement tree expression tracking in C FE (v2)"
the patch now captures ranges for all C expressions during parsing within
a new field of c_expr, and for all tree nodes with a location_t, it stores
them in ad-hoc locations for later use.

Hence compound expressions get ranges; see:
   
https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html

and for this example:

   int test (int foo)
   {
     return foo * 100;
            ^^^   ^^^
   }

we have access to the ranges of "foo" and "100" during C parsing via
the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
INTEGER_CST (the former's location is at the top of the
function, and the latter has no location).
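
The way a compound expression gets its range from its operands can be
sketched like this.  It is a minimal model, not the patch itself:
toy_location, toy_source_range, toy_c_expr, set_toy_expr_source_range and
build_toy_binary are hypothetical names standing in for location_t,
source_range, c_expr, set_c_expr_source_range and the parser routines, and
locations are plain column numbers here.

```cpp
#include <cassert>   // only needed by callers that want to check results

// Hypothetical stand-in for location_t: just a column number.
typedef unsigned int toy_location;

struct toy_source_range
{
  toy_location start;
  toy_location finish;
};

// Models the new "src_range" field of c_expr and its two accessors.
struct toy_c_expr
{
  toy_source_range src_range;

  toy_location get_start () const { return src_range.start; }
  toy_location get_finish () const { return src_range.finish; }
};

inline void set_toy_expr_source_range (toy_c_expr *expr,
				       toy_location start,
				       toy_location finish)
{
  expr->src_range.start = start;
  expr->src_range.finish = finish;
}

// A binary expression runs from the start of its first operand through to
// the end of its second operand.
inline toy_c_expr build_toy_binary (const toy_c_expr &arg1,
				    const toy_c_expr &arg2)
{
  toy_c_expr ret = {};
  set_toy_expr_source_range (&ret, arg1.get_start (), arg2.get_finish ());
  return ret;
}
```

If "foo" occupies columns 12-14 and "100" columns 18-20, the product built
from them covers columns 12-20.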

gcc/ChangeLog:
* Makefile.in (OBJS): Add gcc-rich-location.o.
* gcc-rich-location.c: New file.
* gcc-rich-location.h: New file.
* print-tree.c (print_node): Print any source range information.
* tree.c (set_source_range): New functions.
* tree.h (CAN_HAVE_RANGE_P): New.
(EXPR_LOCATION_RANGE): New.
(EXPR_HAS_RANGE): New.
(get_expr_source_range): New inline function.
(DECL_LOCATION_RANGE): New.
(set_source_range): New decls.
(get_decl_source_range): New inline function.

gcc/c-family/ChangeLog:
* c-common.c (c_fully_fold_internal): Capture existing source_range,
and store it on the result.

gcc/c/ChangeLog:
* c-parser.c (set_c_expr_source_range): New functions.
(c_token::get_range): New method.
(c_token::get_finish): New method.
(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
based on the range from the start of the LHS to the end of the
RHS.
(c_parser_conditional_expression): Likewise, based on the range
from the start of the cond.value to the end of exp2.value.
(c_parser_binary_expression): Call set_c_expr_source_range on
the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
(c_parser_cast_expression): Call set_c_expr_source_range on ret
based on the cast_loc through to the end of the expr.
(c_parser_unary_expression): Likewise, based on the
op_loc through to the end of op.
(c_parser_sizeof_expression): Likewise, based on the start of the
sizeof token through to either the closing paren or the end of
expr.
(c_parser_postfix_expression): Likewise, using the token range,
or from the open paren through to the close paren for
parenthesized expressions.
(c_parser_postfix_expression_after_primary): Likewise, for
various kinds of expression.
* c-tree.h (struct c_expr): Add field "src_range".
(c_expr::get_start): New method.
(c_expr::get_finish): New method.
(set_c_expr_source_range): New decls.
* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
on ret for prefix unary ops.
(parser_build_binary_op): Likewise, running from the start of
arg1.value through to the end of arg2.value.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
diagnostic_plugin_test_tree_expression_range.c and
diagnostic-test-expressions-1.c.



  /* Initialization routine for this file.  */

@@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr 
*after,
ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
 code, exp_location, rhs.value,
 rhs.original_type);
+  set_c_expr_source_range (&ret,
+  lhs.get_start (),
+  rhs.get_finish ());

One line if it fits.



@@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct 
c_expr *after,
   ? t1
   : NULL);
  }
+  set_c_expr_source_range (&ret,
+  start,
+  exp2.get_finish ());

Here too.


@@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct 
c_expr *after)
expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
}
ret.value = c_cast_expr (cast_loc, type_name, expr.value);
+  if (ret.value && expr.value)
+   set_c_expr_source_range (&ret,
+cast_loc,
+expr.get_finish ());

And here?

With the nits fixed, this is OK.

I think that covers this iteration of the rich location work and that 
you'll continue working with Jason on extending this into the C++ front-end.


jeff


Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 3:50 PM, Martin Liška  wrote:
> On 10/29/2015 02:15 PM, Richard Biener wrote:
>> On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška  wrote:
>>> On 10/28/2015 04:23 PM, Richard Biener wrote:
 On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
> On 10/27/2015 03:49 PM, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
>>> On 10/26/2015 02:48 PM, Richard Biener wrote:
 On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
> On 10/21/2015 04:06 PM, Richard Biener wrote:
>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
>>> On 10/21/2015 11:59 AM, Richard Biener wrote:
 On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  
 wrote:
> On 10/20/2015 03:39 PM, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
>> wrote:
>>> Hello.
>>>
>>> As part of upcoming merge of HSA branch, we would like to have 
>>> possibility to terminate
>>> pass manager after execution of the HSA generation pass. The 
>>> HSA back-end is implemented
>>> as a tree pass that directly emits HSAIL from gimple tree 
>>> representation. The pass operates
>>> on clones created by HSA IPA pass and the pass manager should 
>>> stop execution of further
>>> RTL passes.
>>>
>>> Suggested patch survives bootstrap and regression tests on 
>>> x86_64-linux-pc.
>>>
>>> What do you think about it?
>>
>> Are you sure it works this way?
>>
>> Btw, you will miss executing of all the cleanup passes that will
>> eventually free memory
>> associated with the function.  So I'd rather support a
>> TODO_discard_function which
>> should basically release the body from the cgraph.
>
> Hi.
>
> Agree with you that I should execute all TODOs, which can be 
> easily done.
> However, if I just try to introduce the suggested TODO and handle 
> it properly
> by calling cgraph_node::release_body, then for instance 
> fn->gimple_df, fn->cfg are
> released and I hit ICEs on many places.
>
> Stopping the pass manager looks necessary, or do I miss something?

 "Stopping the pass manager" is necessary after 
 TODO_discard_function, yes.
 But that may be simply done via a has_body () check then?
>>>
>>> Thanks, there's second version of the patch. I'm going to start 
>>> regression tests.
>>
>> As release_body () will free cfun you should pop_cfun () before
>> calling it (and thus
>
> Well, release_function_body calls both push & pop, so I think calling 
> pop
> before cgraph_node::release_body is not necessary.

 (ugh).

> If tried to call pop_cfun before cgraph_node::release_body, I have 
> cfun still
> pointing to the same (function *) (which is gcc_freed, but cfun != 
> NULL).

 Hmm, I meant to call pop_cfun then after it (unless you want to fix 
 the above,
 none of the freeing functions should technically need 'cfun', just add
 'fn' parameters ...).

 I expected pop_cfun to eventually set cfun to NULL if it popped the
 "last" cfun.  Why
 doesn't it do that?

>> drop its modification).  Also TODO_discard_function should be only 
>> set for
>> local passes thus you should probably add a gcc_assert (cfun).
>> I'd move its handling earlier, definitely before the ggc_collect, 
>> eventually
>> before the pass_fini_dump_file () (do we want a last dump of the
>> function or not?).
>
> Fully agree, moved here.
>
>>
>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>>  {
>>gcc_assert (pass->type == GIMPLE_PASS
>>   || pass->type == RTL_PASS);
>> +
>> +
>> +  if (!gimple_has_body_p (current_function_decl))
>> +   return;
>>
>> too much vertical space.  With popping cfun before releasing the 
>> body the check
>> might just become if (!cfun) and
>
> As mentioned above, as release body is symmetric (calling push & 
> pop), the suggested
> guard will not work.

 I suggest to fix it.  If it calls push/pop it should leave 

Re: [patch 2/6] scalar-storage-order merge: C front-end

2015-10-30 Thread Eric Botcazou
> It won’t.  Fixing the language line for the options and a make to ensure it
> still builds for you is enough testing.

I was talking about the feature itself though, not about the option per se.  
The feature is tested for C & C++ but not for ObjC & ObjC++ so there might be 
surprises.

> None is needed.  One merely copies the string as found on other options for
> your new options, and you’re done.  Indeed, the default should be to always
> include the objective languages unless one goes out of their way to exclude
> them.

OK, I can enable the option for ObjC & ObjC++ but this comes with no warranty.

-- 
Eric Botcazou


Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 4:52 PM, Jan Hubicka  wrote:
>> On Thu, Oct 29, 2015 at 4:02 PM, Jan Hubicka  wrote:
>> >>
>> >> IMHO it was always wrong/fragile for backends to look at the actual
>> >> arguments to decide on the calling convention.  The backends should
>> >> _solely_ rely on gimple_call_fntype and its TYPE_ARG_TYPES here.
>> >>
>> >> Of course then there are varargs ... (not sure if we hit this here).
>> >
>> > Yep, you have varargs and K&R prototypes, so it can't work this way.
>>
>> Well, then I suppose we need to compute the ABI upfront when we gimplify
>> from the original args (like we preserve fntype).  Having a separate fntype
>> was really meant to make us preserve the ABI throughout the GIMPLE phase...
>
> Hmm, the idea of doing some part of ABI explicitly is definitely nice (at least
> the implicit promotions and pass by reference bits), but storing the full
> low-level info on how to pass arguments seems a bit steep. You will need to
> preserve the RTL containers for parameters that may get non-trivial (PARALLEL)
> and precompute all the other information how to get data on stack.
>
> While playing with the ABI checker I was just looking into this after several
> years (when I was cleaning up calls.c) and calls.c basically works by computing
> arg_data that holds most of the info you would need (you need also return
> argument passing and the hidden argument for structure returns).  You can check
> it out - it is a fairly non-trivial beast plus it really holds two parallel sets
> of infos - tailcall and normal call (because these differ for targets with
> register windows). The info also depends on flags used to compile function body
> (such as -maccumulate-outgoing-args)
>
> To make something like this a permanent part of GIMPLE would probably need quite
> careful re-engineering of the APIs inventing more high-level intermediate
> representation to get out of the machine description.  There is not really
> immediate benefit from knowing how parameters are housed on stack for gimple
> optimizers, so perhaps just keeping the type information (after promotions) as
> the way to specify call conventions is a more practical way to go.

Yeah, I suppose we'd need to either build a new function type for each
variadic call then or somehow represent 'fntype' differently (note that
function attributes also need to be preserved).

Richard.

> Honza
>
>> >> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
>> >> what exactly we gain from it (when not done on registers).
>> >
>> > I guess gain is really limited to Ada - there are very few cases we do VCE
>> > otherwise.  (I think we could do more of them).  We can make
>> > useless_type_conversion NOP/CONVERT only. That in fact makes quite a sense
>> > because those are types with gimple operations on it.  Perhaps also VCE on
>> > vectors, but not VCE in general.
>> >
>> > Honza
>> >>
>> >> But I also don't see where we do the stripping mentioned on memory
>> >> references.  The match.pd pattern doesn't apply to memory, only in the
>> >> GENERIC path which is guarded with exact type equality.  So I can't see
>> >> where we end up stripping the V_C_E.
>> >>
>> >> There is one bogus case still in fold-const.c:
>> >>
>> >> >> case VIEW_CONVERT_EXPR:
>> >> >>   if (TREE_CODE (op0) == MEM_REF)
>> >> >>     /* ???  Bogus for aligned types.  */
>> >> >>     return fold_build2_loc (loc, MEM_REF, type,
>> >> >>                             TREE_OPERAND (op0, 0),
>> >> >>                             TREE_OPERAND (op0, 1));
>> >> >>
>> >> >>   return NULL_TREE;
>> >>
>> >> that comment is only in my local tree ... (we lose alignment info that is
>> >> on the original MEM_REF type which may be a smaller one).
>> >>
>> >> Richard.
>> >>
>> >> > Honza
>> >> >>
>> >> >>
>> >> >>   * gnat.dg/discr44.adb: New test.
>> >> >>
>> >> >> --
>> >> >> Eric Botcazou
>> >> >
>> >> >> -- { dg-do run }
>> >> >> -- { dg-options "-gnatws" }
>> >> >>
>> >> >> procedure Discr44 is
>> >> >>
>> >> >>   function Ident (I : Integer) return Integer is
>> >> >>   begin
>> >> >>     return I;
>> >> >>   end;
>> >> >>
>> >> >>   type Int is range 1 .. 10;
>> >> >>
>> >> >>   type Str is array (Int range <>) of Character;
>> >> >>
>> >> >>   type Parent (D1, D2 : Int; B : Boolean) is record
>> >> >>     S : Str (D1 .. D2);
>> >> >>   end record;
>> >> >>
>> >> >>   type Derived (D : Int) is new Parent (D1 => D, D2 => D, B => False);
>> >> >>
>> >> >>   X1 : Derived (D => Int (Ident (7)));
>> >> >>
>> >> >> begin
>> >> >>   if X1.D /= 7 then
>> >> >>     raise Program_Error;
>> >> >>   end if;
>> >> >> end;
>> >> >


Re: [PATCH 4/4] Add -Wmisleading-indentation to -Wall

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 6:38 PM, Jeff Law  wrote:
> On 10/29/2015 10:49 AM, David Malcolm wrote:
>>
>> Our documentation describes -Wall as enabling "all the warnings about
>> constructions that some users consider questionable, and that are easy to
>> avoid (or modify to prevent the warning), even in conjunction with macros."
>>
>> I believe that -Wmisleading-indentation meets these criteria, and is
>> likely to be of benefit to users who may not read release notes; it
>> warns for indentation that's misleading, but not for indentation
>> that's merely bad: the former are places where a user will likely
>> want to fix the code.
>>
>> The fix is usually easy and obvious: fix the misleadingly-indented
>> code.  If that isn't an option for some reason, pragmas can be used to
>> turn off the warning for a particular fragment of code:
>>
>>   #pragma GCC diagnostic push
>>   #pragma GCC diagnostic ignored "-Wmisleading-indentation"
>>     if (flag)
>>       x = 3;
>>       y = 2;
>>   #pragma GCC diagnostic pop
>>
>> -Wmisleading-indentation has been tested with a variety of indentation
>> styles (see gcc/testsuite/c-c++-common/Wmisleading-indentation.c)
>> and on a variety of real-world projects.  For example, in:
>>https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg119790.html
>> Patrick reports:
>> "Tested by building the linux, git, vim, sqlite and gdb-binutils sources
>>   with -Wmisleading-indentation."
>>
>> With the tweak earlier in this kit I believe we now have a good
>> enough signal:noise ratio for this warning to be widely used; hence this
>> patch adds the warning to -Wall.
>>
>> Bootstrapped with x86_64-pc-linux-gnu.
>>
>> OK for trunk?
>>
>> gcc/c-family/ChangeLog:
>> * c.opt (Wmisleading-indentation): Add to -Wall for C and C++.
>>
>> gcc/ChangeLog:
>> * doc/invoke.texi (-Wall): Add -Wmisleading-indentation to the
>> list.
>> (-Wmisleading-indentation): Update documentation to reflect
>> being enabled by -Wall in C/C++.
>
> I'm sure we'll get some grief for this :-)
>
> Approved once we're clean in GCC.  I'm going to explicitly say that we'll
> have to watch for fallout, particularly as we start getting feedback from
> Debian & Fedora mass-rebuilds as we approach release time.  If the fallout
> is too bad, we'll have to reconsider.
>
> I'll pre-approve patches which fix anything caught by this option in GCC as
> long as the fix just adjusts whitespace :-)

Please at least check also binutils and gdb and other packages that use -Werror
(well, just rebuild Fedora world).

I'd say this shouldn't be in -Wall ... (and I suppose I'll happily patch it
out of SUSE GCC ...).  Maybe put it into -Wextra?

Richard.

> jeff
>
>


Re: [gomp4] acc_on_device

2015-10-30 Thread Thomas Schwinge
Hi!

On Thu, 29 Oct 2015 13:15:13 -0700, Nathan Sidwell  wrote:
> I've committed this to gomp4 branch.  It resolves a problem with
> builtin_acc_on_device and C++.

> The test cases in the gcc testsuite were hiding the problem by providing part
> of openacc.h in the test directory, and this had diverged from the openacc.h
> we actually have.  I deleted those tests and inserted one in the libgomp
> testsuite, which correctly picks up the openacc.h of the tool under test,
> (rather than one in system includes).

The idea had been to test the compiler handling of the acc_on_device
builtin in the compiler testsuite, but yes, having to duplicate parts of
openacc.h was ugly.


> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c   (revision 0)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c   (working copy)

In r229568 committed to gomp-4_0-branch as obvious:

commit e2c1427d60ffcc9183fbd5a0996dfe98c7219dc5
Author: tschwinge 
Date:   Fri Oct 30 08:29:54 2015 +

De-duplicate testsuite file

libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
De-duplicate file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229568 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp  |  5 +
 libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c | 12 
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index ddbcdee..89f57ef 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2015-10-30  Thomas Schwinge  
+
+   * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
+   De-duplicate file.
+
 2015-10-29  Nathan Sidwell  
 
* openacc.h (enum acc_device_t): Ensure layout compatibility.
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
index 0f73aeb..c1eed0e 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
@@ -1,15 +1,3 @@
-/* { dg-do compile } */
-/* { dg-additional-options "-O2" } */
-
-#include <openacc.h>
-
-int Foo (acc_device_t x)
-{
-  return acc_on_device (x);
-}
-
-/* { dg-final { scan-assembler-not "acc_on_device" } } */
-/* { dg-do compile } */
 /* { dg-additional-options "-O2" } */
 
 #include <openacc.h>


Grüße
 Thomas




Re: [gomp4] acc_on_device

2015-10-30 Thread Thomas Schwinge
Hi!

On Fri, 30 Oct 2015 09:31:52 +0100, I wrote:
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c (revision 0)
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c (working copy)
> 
> In r229568 committed to gomp-4_0-branch as obvious:
> 
> commit e2c1427d60ffcc9183fbd5a0996dfe98c7219dc5
> Author: tschwinge 
> Date:   Fri Oct 30 08:29:54 2015 +
> 
> De-duplicate testsuite file

Chopped too much; in r229570 committed to gomp-4_0-branch as obvious:

commit 07e6f70f45dc4bbe343a972bce05aee8e0897e2e
Author: tschwinge 
Date:   Fri Oct 30 08:39:02 2015 +

De-duplicate testsuite file: restore dg-do compile directive

* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Restore
dg-do compile directive.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229570 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp  | 3 +++
 libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c | 1 +
 2 files changed, 4 insertions(+)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 89f57ef..ba33e02 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2015-10-30  Thomas Schwinge  
 
+   * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Restore
+   dg-do compile directive.
+
* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
De-duplicate file.
 
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
index c1eed0e..88c000e 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
@@ -1,3 +1,4 @@
+/* { dg-do compile } */
 /* { dg-additional-options "-O2" } */
 
 #include <openacc.h>


Grüße
 Thomas




Re: [gomp4, committed] Add goacc/kernels-acc-on-device.c

2015-10-30 Thread Thomas Schwinge
Hi Tom!

On Tue, 13 Oct 2015 17:49:21 +0200, Tom de Vries  wrote:
> On 12/10/15 14:52, Tom de Vries wrote:
> > On 12/10/15 12:49, Thomas Schwinge wrote:
> >> On Sat, 10 Oct 2015 12:49:01 +0200, Tom de
> >> Vries  wrote:
> >>> >--- /dev/null
> >>> >+++ b/gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device.c
> >>> >@@ -0,0 +1,39 @@
> >>> >+/* { dg-additional-options "-O2" } */
> >>> >+
> >>> >+#include <openacc.h>
> >
> > Hi Thomas,
> >
> >> That doesn't work (at least in build-tree testing), as gcc/testsuite/ is
> >> not set up to look for header files in [target]/libgomp/:
> >>
> >> [...]/source-gcc/gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device.c:3:21:
> >> fatal error: openacc.h: No such file or directory
> >>  compilation terminated.
> >>  compiler exited with status 1

> As a follow-up patch, I've factored the code into a mockup openacc.h, 
> now shared by several test-cases.

> Factor out goacc/openacc.h
> 
> 2015-10-13  Tom de Vries  
> 
>   * c-c++-common/goacc/openacc.h: New header file, factored out of ...
>   * c-c++-common/goacc/kernels-acc-on-device.c: ... here.
>   * c-c++-common/goacc/acc_on_device-2-off.c: Use openacc.h.
>   * c-c++-common/goacc/acc_on_device-2.c: Same.

As a clean-up, and to move acc_on_device testing to where that OpenACC
Runtime Library routine is actually defined (libgomp's openacc.h), Nathan
has just removed this stub openacc.h header file, and also the test files
listed just above, plus the
gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device-2.c that you added
later, see:

(gomp-4_0-branch),

(trunk).  If the kernels tests are still important (they tested ICEs, as
far as I remember), you'll have to re-instantiate these in
libgomp/testsuite/, or using __builtin_acc_on_device in gcc/testsuite/.


Grüße
 Thomas




Re: [patch] New backend header reduction

2015-10-30 Thread Andrew MacLeod

On 10/30/2015 02:09 PM, Andrew MacLeod wrote:

On 10/30/2015 01:56 PM, Cesar Philippidis wrote:

On 10/23/2015 12:24 PM, Jeff Law wrote:

On 10/23/2015 10:53 AM, Andrew MacLeod wrote:


There's a little bit of fallout with this patch when building an
offloaded compiler for openacc. It looks like cgraph.c needs to include
context.h and varpool.c needs context.h and omp-low.h. There's a couple
of ifdef ENABLE_OFFLOADING which may have gone undetected with your
script.

If they are defined on the command line or some other way I couldn't
see with the targets I built, then that is the common case when that
happens.  I don't think I did any openacc builds. OR maybe I need
to add nvptx to my coverage builds. Perhaps that is best.

I've bootstrapped the attached patch for an nvptx/x86_64-linux target.
I'm still testing that toolchain. If the testing comes back clean, is
this patch OK for trunk?

Ah, I see.  there is no nvptx target in config-list.mk, so it never got
covered.


Andrew


Re: [patch] New backend header reduction

2015-10-30 Thread Jeff Law

On 10/30/2015 02:23 PM, Cesar Philippidis wrote:

On 10/30/2015 01:20 PM, Andrew MacLeod wrote:

On 10/30/2015 02:09 PM, Andrew MacLeod wrote:

On 10/30/2015 01:56 PM, Cesar Philippidis wrote:

On 10/23/2015 12:24 PM, Jeff Law wrote:

On 10/23/2015 10:53 AM, Andrew MacLeod wrote:


There's a little bit of fallout with this patch when building an
offloaded compiler for openacc. It looks like cgraph.c needs to include
context.h and varpool.c needs context.h and omp-low.h. There's a couple
of ifdef ENABLE_OFFLOADING which may have gone undetected with your
script.

If they are defined on the command line or some other way I couldn't
see with the targets I built, then that is the common case when that
happens.  I don't think I did any openacc builds. OR maybe I need
to add nvptx to my coverage builds. Perhaps that is best.

I've bootstrapped the attached patch for an nvptx/x86_64-linux target.
I'm still testing that toolchain. If the testing comes back clean, is
this patch OK for trunk?

Ah, I see.  there is no nvptx target in config-list.mk, so it never got
covered.


Yeah, you need to build two separate compilers. Thomas posted some
directions here . You could
probably reproduce it with openmp and Intel's MIC emulation target too.
For config-list.mk testing, we just need to be able to build cross 
compilers for the given target -- the offloading bits wouldn't be 
applicable here, unless the PTX backend inherently depends on them.


jeff


cgraph offloading error?

2015-10-30 Thread Nathan Sidwell
This bit of trunk code in cgraph_node::create at around line  500 of cgraph.c 
looks wrong.  Specifically the contents of the #ifdef -- it's uncompilable as 
there's no 'g'.


 if ((flag_openacc || flag_openmp)
  && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
{
  node->offloadable = 1;
#ifdef ENABLE_OFFLOADING
  g->have_offload = true;
#endif
}

nathan


Re: cgraph offloading error?

2015-10-30 Thread Nathan Sidwell

On 10/30/15 14:28, Jeff Law wrote:


So when we don't use src_reg or dst_mode, we'll get a warning about the unused
variable. I guess this is the first port where HARD_REGNO_NREGS is a constant.


Yeah, I noticed that when first looking at the port, but as it wasn't
(apparently) broken ...



Second, MOVE_MAX is 4.  That's causing out-of-bounds array access warnings in
various places.



There were a variety of other problems associated with MOVE_MAX being smaller
than a word.   If I change MOVE_MAX to 8, then everything is good.


Makes sense.


Testing attached ...

nathan
2015-10-30  Jeff Law 
	Nathan Sidwell  

	* config/nvptx/nvptx.h (HARD_REGNO_NREGS): Avoid warning on unused
	args.
	(MOVE_MAX): Set to 8.

Index: config/nvptx/nvptx.h
===
--- config/nvptx/nvptx.h	(revision 229595)
+++ config/nvptx/nvptx.h	(working copy)
@@ -88,7 +88,7 @@
 #define CALL_USED_REGISTERS\
   { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }
 
-#define HARD_REGNO_NREGS(regno, mode)	1
+#define HARD_REGNO_NREGS(regno, mode)	((void)regno, (void)mode, 1)
 #define CANNOT_CHANGE_MODE_CLASS(M1, M2, CLS) ((CLS) == RETURN_REG)
 #define HARD_REGNO_MODE_OK(REG, MODE) nvptx_hard_regno_mode_ok (REG, MODE)
 
@@ -356,7 +356,7 @@ struct GTY(()) machine_function
 #define FLOAT_STORE_FLAG_VALUE(MODE) REAL_VALUE_ATOF("1.0", (MODE))
 
 #define CASE_VECTOR_MODE SImode
-#define MOVE_MAX 4
+#define MOVE_MAX 8
 #define MOVE_RATIO(SPEED) 4
 #define TRULY_NOOP_TRUNCATION(outprec, inprec) 1
 #define FUNCTION_MODE QImode


Re: cgraph offloading error?

2015-10-30 Thread Nathan Sidwell

On 10/30/15 13:54, Jeff Law wrote:

On 10/30/2015 02:52 PM, Nathan Sidwell wrote:

This bit of trunk code in cgraph_node::create at around line  500 of
cgraph.c looks wrong.  Specifically the contents of the #ifdef -- it's
uncompilable as there's no 'g'.

  if ((flag_openacc || flag_openmp)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
 {
   node->offloadable = 1;
#ifdef ENABLE_OFFLOADING
   g->have_offload = true;
#endif
 }

Missing #include of context.h.  This was missed because the ptx backend doesn't
appear in config-list.mk.  I'm looking to see what it would take to add the ptx
backend right now :-0


Thanks Jeff!

nathan


Re: [OpenACC] num_gangs, num_workers and vector_length in c++

2015-10-30 Thread Cesar Philippidis
On 10/30/2015 10:05 AM, Jakub Jelinek wrote:
> On Fri, Oct 30, 2015 at 07:42:39AM -0700, Cesar Philippidis wrote:

>>> Another thing is what Jason as C++ maintainer wants, it is nice to get rid
>>> of some code redundancies, on the other side the fact that there is one
>>> function per non-terminal in the grammar is also quite nice property.
>>> I know I've violated this a few times too.
> 
>> That name had some legacy from the c FE in gomp-4_0-branch which the
>> function was inherited from. On one hand, it doesn't make sense to allow
>> negative integer values for those clauses, but at the same time, those
>> values aren't checked during scanning. Maybe it should be renamed
>> cp_parser_oacc_single_int_clause instead?
> 
> That is better.
> 
>> If you like, I could make a more general
>> cp_parser_omp_generic_expression that has a scan_list argument so that
>> it can be used for both general expressions and assignment-expressions.
>> That way it can be used for both omp and oacc clauses of the form 'foo (
>> expression )'.
> 
> No, that will only confuse readers of the parser.  After all, the code to
> parse an expression argument of a clause is not that large.
> So, either cp_parser_oacc_single_int_clause or just keeping the old separate
> parsing functions, just remove the cruft from those (testing the type,
> using cp_parser_condition instead of cp_parser_assignment_expression) is ok
> with me.  Please ping Jason on what he prefers from those two.

Jason, what's your preference here? Should I create a single function to
parse num_gangs, num_workers and vector_length since they all accept
the same type of argument or should I just correct the existing
functions as I did in the attached patch? Either one would be specific
to openacc.

This patch has been bootstrapped and regression tested on trunk.

Cesar
2015-10-30  Cesar Philippidis  

	gcc/cp/
	* parser.c (cp_parser_oacc_clause_vector_length): Parse the clause
	argument as an assignment expression. Bail out early on error.
	(cp_parser_omp_clause_num_gangs): Likewise.
	(cp_parser_omp_clause_num_workers): Likewise.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c8f8b3d..a0d3f3b 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29732,37 +29732,29 @@ cp_parser_oacc_shape_clause (cp_parser *parser, omp_clause_code kind,
 static tree
 cp_parser_oacc_clause_vector_length (cp_parser *parser, tree list)
 {
-  tree t, c;
-  location_t location = cp_lexer_peek_token (parser->lexer)->location;
-  bool error = false;
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
 
   if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
 return list;
 
-  t = cp_parser_condition (parser);
-  if (t == error_mark_node || !INTEGRAL_TYPE_P (TREE_TYPE (t)))
-{
-  error_at (location, "expected positive integer expression");
-  error = true;
-}
+  tree t = cp_parser_assignment_expression (parser, NULL, false, false);
 
-  if (error || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+  if (t == error_mark_node
+  || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
 {
   cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
-	   /*or_comma=*/false,
-	   /*consume_paren=*/true);
+	 /*or_comma=*/false,
+	 /*consume_paren=*/true);
   return list;
 }
 
   check_no_duplicate_clause (list, OMP_CLAUSE_VECTOR_LENGTH, "vector_length",
-			 location);
+			 loc);
 
-  c = build_omp_clause (location, OMP_CLAUSE_VECTOR_LENGTH);
-  OMP_CLAUSE_VECTOR_LENGTH_EXPR (c) = t;
+  tree c = build_omp_clause (loc, OMP_CLAUSE_VECTOR_LENGTH);
+  OMP_CLAUSE_OPERAND (c, 0) = t;
   OMP_CLAUSE_CHAIN (c) = list;
-  list = c;
-
-  return list;
+  return c;
 }
 
 /* OpenACC 2.0
@@ -30149,34 +30141,28 @@ cp_parser_omp_clause_nowait (cp_parser * /*parser*/,
 static tree
 cp_parser_omp_clause_num_gangs (cp_parser *parser, tree list)
 {
-  tree t, c;
-  location_t location = cp_lexer_peek_token (parser->lexer)->location;
+  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
 
   if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
 return list;
 
-  t = cp_parser_condition (parser);
+  tree t = cp_parser_assignment_expression (parser, NULL, false, false);
 
   if (t == error_mark_node
   || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
-cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
-	   /*or_comma=*/false,
-	   /*consume_paren=*/true);
-
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
 {
-  error_at (location, "expected positive integer expression");
+  cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
+	 /*or_comma=*/false,
+	 /*consume_paren=*/true);
   return list;
 }
 
-  check_no_duplicate_clause (list, OMP_CLAUSE_NUM_GANGS, "num_gangs", location);
+  check_no_duplicate_clause (list, OMP_CLAUSE_NUM_GANGS, 

Re: cgraph offloading error?

2015-10-30 Thread Nathan Sidwell

On 10/30/15 15:16, Nathan Sidwell wrote:



Testing attached ...


... with parens on void cast args ...

nathan
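
For readers following the HARD_REGNO_NREGS change, here is a minimal sketch of the idiom under discussion. The macro and function names (NREGS_PLAIN, NREGS_VOID, nregs_for) are hypothetical stand-ins, not the nvptx definitions. It shows why a constant-valued macro can leave its arguments unused, and why the void casts want parentheses around the macro arguments.

```c
/* Hypothetical stand-ins for a target macro whose value is a constant.  */

/* Arguments are dropped entirely, so a caller's variables may end up
   unused and trigger unused-variable style warnings.  */
#define NREGS_PLAIN(regno, mode) 1

/* Arguments are evaluated and discarded, which counts as a use.  The
   parentheses matter: with "(void)regno" an argument such as "r + 1"
   would expand to "(void)r + 1", which does not compile.  */
#define NREGS_VOID(regno, mode) ((void) (regno), (void) (mode), 1)

static int
nregs_for (int regno, int mode)
{
  /* Both parameters are now referenced, whatever the macro's value.  */
  return NREGS_VOID (regno + 1, mode);
}
```

The comma expression still yields the constant 1, so callers see the same value as before.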


Re: [patch] New backend header reduction

2015-10-30 Thread Jeff Law

On 10/30/2015 02:23 PM, Cesar Philippidis wrote:

On 10/30/2015 01:20 PM, Andrew MacLeod wrote:

On 10/30/2015 02:09 PM, Andrew MacLeod wrote:

On 10/30/2015 01:56 PM, Cesar Philippidis wrote:

On 10/23/2015 12:24 PM, Jeff Law wrote:

On 10/23/2015 10:53 AM, Andrew MacLeod wrote:


There's a little bit of fallout with this patch when building an
offloaded compiler for openacc. It looks like cgraph.c needs to include
context.h and varpool.c needs context.h and omp-low.h. There's a couple
of ifdef ENABLE_OFFLOADING which may have gone undetected with your
script.

If they are defined on the command line or some other way I couldn't
see with the targets I built, then that is the common case when that
happens.  I don't think I did any openacc builds. OR maybe I need
to add nvptx to my coverage builds. Perhaps that is best.

I've bootstrapped the attached patch for an nvptx/x86_64-linux target.
I'm still testing that toolchain. If the testing comes back clean, is
this patch OK for trunk?

Ah, I see.  there is no nvptx target in config-list.mk, so it never got
covered.


Yeah, you need to build two separate compilers. Thomas posted some
directions here . You could
probably reproduce it with openmp and Intel's MIC emulation target too.
Oh, there's something specific to the offloading support that needs 
context.h, it's not the ptx port.  Duh.  Anyway, it'd still be good to 
get standard builds of the nvptx backend into config-list.mk.


Not sure how painful it'd be to add the offloading path.

This is a great example of why we're trying to minimize/eliminate 
conditional compilation :-)



jeff
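
The point about conditional compilation can be illustrated with a small sketch. Everything here is hypothetical (ENABLE_OFFLOADING_DEMO, struct context_demo, g_demo, mark_offloadable); it is not the cgraph.c code. The idea is that a plain `if` on a constant keeps the guarded statement visible to the compiler in every configuration, so a missing declaration shows up in all builds rather than only in offloading builds.

```c
/* Hypothetical configuration switch; 0 models a non-offloading build.  */
#ifndef ENABLE_OFFLOADING_DEMO
#define ENABLE_OFFLOADING_DEMO 0
#endif

/* Hypothetical stand-in for the global context object.  */
struct context_demo
{
  int have_offload;
};

static struct context_demo g_demo;

static void
mark_offloadable (int *offloadable)
{
  *offloadable = 1;

  /* Unlike an #ifdef block, this statement is parsed and type-checked
     even when the switch is 0; the optimizer then removes it.  If the
     declaration of g_demo were missing, every build would fail.  */
  if (ENABLE_OFFLOADING_DEMO)
    g_demo.have_offload = 1;
}
```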


Re: cgraph offloading error?

2015-10-30 Thread Jeff Law

On 10/30/2015 04:16 PM, Nathan Sidwell wrote:

On 10/30/15 14:28, Jeff Law wrote:


So when we don't use src_reg or dst_mode, we'll get a warning about
the unused
variable. I guess this is the first port where HARD_REGNO_NREGS is a
constant.


Yeah, I noticed that when first looking at the port, but as it wasn't
(apparently) broken ...
It's only fatal with -Wjumble-mumble that gets set by config-list.mk, so 
it could have been easily sneaking by.  Certainly wouldn't affect the 
correctness of the compiler though.





Second, MOVE_MAX is 4.  That's causing out-of-bounds array access
warnings in
various places.



There were a variety of other problems associated with MOVE_MAX being
smaller
than a word.   If I change MOVE_MAX to 8, then everything is good.


Makes sense.
I find myself wondering if that ought to be explicitly sanity-checked 
somewhere.  I'm sure there's a significant number of invariants for 
ports that we could check at build time and hopefully prevent problems 
of this nature in the future.  Though it may not be worth it..



Testing attached ...
Cool.  I'll go ahead and add nvptx-elf to the list shortly on the 
assumption this stuff will get fixed one way or another.


Jeff
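
One way to picture the build-time sanity check Jeff is wondering about is a static assertion. This is only a sketch: the DEMO_ macros and their values are hypothetical, and the invariant is the one stated in the thread (MOVE_MAX should not be smaller than a word), not a rule taken from the GCC sources.

```c
/* Hypothetical values standing in for a port's target macros.  */
#define DEMO_UNITS_PER_WORD 8
#define DEMO_MOVE_MAX 8

/* Fails at compile time if the port's settings violate the invariant,
   rather than surfacing later as out-of-bounds warnings.  */
_Static_assert (DEMO_MOVE_MAX >= DEMO_UNITS_PER_WORD,
                "MOVE_MAX should not be smaller than a word");

static int
demo_move_max (void)
{
  return DEMO_MOVE_MAX;
}
```

Setting DEMO_MOVE_MAX to 4 makes the translation unit fail to compile with the message above.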



Re: [patch] New backend header reduction

2015-10-30 Thread Cesar Philippidis
On 10/30/2015 01:20 PM, Andrew MacLeod wrote:
> On 10/30/2015 02:09 PM, Andrew MacLeod wrote:
>> On 10/30/2015 01:56 PM, Cesar Philippidis wrote:
>>> On 10/23/2015 12:24 PM, Jeff Law wrote:
>>>> On 10/23/2015 10:53 AM, Andrew MacLeod wrote:

>>> There's a little bit of fallout with this patch when building an
>>> offloaded compiler for openacc. It looks like cgraph.c needs to include
>>> context.h and varpool.c needs context.h and omp-low.h. There's a couple
>>> of ifdef ENABLE_OFFLOADING which may have gone undetected with your
>>> script.
>> If they are defined on the command line or some other way I couldn't
>> see with the targets I built, then that is the common case when that
>> happens.  I don't think I did any openacc builds. OR maybe I need
>> to add nvptx to my coverage builds. Perhaps that is best.
>>> I've bootstrapped the attached patch for an nvptx/x86_64-linux target.
>>> I'm still testing that toolchain. If the testing comes back clean, is
>>> this patch OK for trunk?
> Ah, I see.  there is no nvptx target in config-list.mk, so it never got
> covered.

Yeah, you need to build two separate compilers. Thomas posted some
directions here . You could
probably reproduce it with openmp and Intel's MIC emulation target too.

Cesar



Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 8:18 PM, Alan Lawrence  wrote:
> This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where
> Richi points out that CONSTRUCTOR elements are not necessarily ordered.
>
> I wasn't sure of a good naming convention for the new 
> get_ctor_element_at_index,
> other suggestions welcome.

get_array_ctor_element_at_index

(ok it also handles vectors).

Ok with that change.

Richard.
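
As a reading aid for the quoted patch, here is a simplified sketch of a linear search over constructor elements that may carry an explicit index, a range, or no index at all. All names (index_kind, ctor_elt, element_at) are hypothetical, plain long stands in for offset_int, and the extension to the index type's precision is left out. The sketch also makes its own choice to continue an implicit index from the previous element's upper bound.

```c
#include <stddef.h>

/* How an element's position is specified (hypothetical model).  */
enum index_kind { INDEX_NONE, INDEX_SINGLE, INDEX_RANGE };

struct ctor_elt
{
  enum index_kind kind;
  long lo, hi;   /* lo is used for INDEX_SINGLE, lo..hi for INDEX_RANGE.  */
  int value;
};

/* Return a pointer to the value covering ACCESS_INDEX, or NULL.  The
   elements need not be sorted, so every element is visited in turn
   instead of using a binary search.  */
static const int *
element_at (const struct ctor_elt *elts, size_t n, long low_bound,
            long access_index)
{
  long index = low_bound - 1;
  long max_index = index;

  for (size_t i = 0; i < n; i++)
    {
      switch (elts[i].kind)
        {
        case INDEX_SINGLE:
          index = max_index = elts[i].lo;
          break;
        case INDEX_RANGE:
          index = elts[i].lo;
          max_index = elts[i].hi;
          break;
        case INDEX_NONE:
          /* No explicit index: take the next position.  */
          index = max_index + 1;
          max_index = index;
          break;
        }

      if (access_index >= index && access_index <= max_index)
        return &elts[i].value;
    }
  return NULL;
}
```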

> gcc/ChangeLog:
>
> * gimple-fold.c (fold_array_ctor_reference): Move searching code to:
> * fold-const.c (get_ctor_element_at_index): New.
> (fold): Remove binary-search through CONSTRUCTOR, call previous.
>
> * fold-const.h (get_ctor_element_at_index): New.
> ---
>  gcc/fold-const.c  | 93 
> ---
>  gcc/fold-const.h  |  1 +
>  gcc/gimple-fold.c | 47 ++--
>  3 files changed, 72 insertions(+), 69 deletions(-)
>
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index de45a2c..5d27b91 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -12018,6 +12018,72 @@ fold_ternary_loc (location_t loc, enum tree_code 
> code, tree type,
>  } /* switch (code) */
>  }
>
> +/* Gets the element ACCESS_INDEX from CTOR, which must be a CONSTRUCTOR.  */
> +
> +tree
> +get_ctor_element_at_index (tree ctor, offset_int access_index)
> +{
> +  tree index_type = NULL_TREE;
> +  offset_int low_bound = 0;
> +
> +  if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE)
> +  {
> +tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor));
> +if (domain_type && TYPE_MIN_VALUE (domain_type))
> +{
> +  /* Static constructors for variably sized objects makes no sense.  */
> +  gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST);
> +  index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type));
> +  low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type));
> +}
> +  }
> +
> +  if (index_type)
> +access_index = wi::ext (access_index, TYPE_PRECISION (index_type),
> +   TYPE_SIGN (index_type));
> +
> +  offset_int index = low_bound - 1;
> +  if (index_type)
> +index = wi::ext (index, TYPE_PRECISION (index_type),
> +TYPE_SIGN (index_type));
> +
> +  offset_int max_index;
> +  unsigned HOST_WIDE_INT cnt;
> +  tree cfield, cval;
> +
> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval)
> +  {
> +/* Array constructor might explicitely set index, or specify range
> + * or leave index NULL meaning that it is next index after previous
> + * one.  */
> +if (cfield)
> +{
> +  if (TREE_CODE (cfield) == INTEGER_CST)
> +   max_index = index = wi::to_offset (cfield);
> +  else
> +  {
> +   gcc_assert (TREE_CODE (cfield) == RANGE_EXPR);
> +   index = wi::to_offset (TREE_OPERAND (cfield, 0));
> +   max_index = wi::to_offset (TREE_OPERAND (cfield, 1));
> +  }
> +}
> +else
> +{
> +  index += 1;
> +  if (index_type)
> +   index = wi::ext (index, TYPE_PRECISION (index_type),
> +TYPE_SIGN (index_type));
> +   max_index = index;
> +}
> +
> +/* Do we have match?  */
> +if (wi::cmpu (access_index, index) >= 0
> +   && wi::cmpu (access_index, max_index) <= 0)
> +  return cval;
> +  }
> +  return NULL_TREE;
> +}
> +
>  /* Perform constant folding and related simplification of EXPR.
> The related simplifications include x*1 => x, x*0 => 0, etc.,
> and application of the associative law.
> @@ -12094,31 +12160,8 @@ fold (tree expr)
> && TREE_CODE (op0) == CONSTRUCTOR
> && ! type_contains_placeholder_p (TREE_TYPE (op0)))
>   {
> -   vec<constructor_elt, va_gc> *elts = CONSTRUCTOR_ELTS (op0);
> -   unsigned HOST_WIDE_INT end = vec_safe_length (elts);
> -   unsigned HOST_WIDE_INT begin = 0;
> -
> -   /* Find a matching index by means of a binary search.  */
> -   while (begin != end)
> - {
> -   unsigned HOST_WIDE_INT middle = (begin + end) / 2;
> -   tree index = (*elts)[middle].index;
> -
> -   if (TREE_CODE (index) == INTEGER_CST
> -   && tree_int_cst_lt (index, op1))
> - begin = middle + 1;
> -   else if (TREE_CODE (index) == INTEGER_CST
> -&& tree_int_cst_lt (op1, index))
> - end = middle;
> -   else if (TREE_CODE (index) == RANGE_EXPR
> -&& tree_int_cst_lt (TREE_OPERAND (index, 1), op1))
> - begin = middle + 1;
> -   else if (TREE_CODE (index) == RANGE_EXPR
> -&& tree_int_cst_lt (op1, TREE_OPERAND (index, 0)))
> - end = middle;
> -   else
> - return (*elts)[middle].value;
> - }
> +   if (tree val = get_ctor_element_at_index (op0, 

Re: Multiply Optimization in match and Simplify

2015-10-30 Thread Marc Glisse

On Fri, 30 Oct 2015, Richard Biener wrote:


+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult:c (convert? (plus @0 @0)) (convert? @1))
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)
+(simplify
+ (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)

fold-const.c only handles constant C, so we only need the 2nd pattern.
Also the :c on the mult in that is not needed due to canonicalization rules.
Please build the result of the inner multiplication directly.
I think the fold-const.c code has an overflow issue when the outer
multiplication is signed and the inner addition unsigned.
(signed)((unsigned)INT_MAX + (unsigned)INT_MAX)*2 is valid but
INT_MAX * 4 is not as it overflows.  So I think we should _not_ allow
nop conversions here (it's fine if all ops are signed or unsigned).


Is there a reason why the simple transformation A+A->2*A is not
generally a desirable canonicalization? We currently restrict it to
SCALAR_FLOAT_TYPE_P.

--
Marc Glisse
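
The overflow concern quoted above can be checked with a little arithmetic. The sketch below uses hypothetical helper names and assumes the GCC behaviour of converting an out-of-range unsigned value to int by wrapping (the C standard leaves that conversion implementation-defined). It evaluates the original expression as written and tests, in a wider type, whether the rewritten A * 2 * C form would fit in int.

```c
#include <limits.h>
#include <stdint.h>

/* (signed)((unsigned)A + (unsigned)A) * 2: the addition wraps in
   unsigned arithmetic, which is well defined.  */
static int
original_form (int a)
{
  unsigned sum = (unsigned) a + (unsigned) a;
  /* Assumes wrapping conversion to int, as documented by GCC.  */
  return (int) sum * 2;
}

/* Would the rewritten form A * 2 * C overflow a signed int?  The
   product is computed in 64 bits so the check itself is safe.  */
static int
rewritten_form_overflows (int a, int c)
{
  int64_t wide = (int64_t) a * 2 * c;
  return wide > INT_MAX || wide < INT_MIN;
}
```

With A = INT_MAX and C = 2, the original form yields -4 without any signed overflow, while the rewritten product INT_MAX * 4 does not fit in int.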


[PATCH] replace BITS_PER_UNIT with __CHAR_BIT__ in target libs

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

$subject as far as I am aware these are the same on all supported targets.

Trev
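
A short sketch of the substitution, for readers unfamiliar with the macro: __CHAR_BIT__ is predefined by GCC and compatible compilers, so target library code can use it without pulling in compiler-internal headers. The helper names below (DEMO_TYPE_BITS, bits_in_int) are hypothetical.

```c
#include <limits.h>

/* Width of a type in bits, written the way the patch writes it.  */
#define DEMO_TYPE_BITS(type) (sizeof (type) * __CHAR_BIT__)

static unsigned long
bits_in_int (void)
{
  return (unsigned long) DEMO_TYPE_BITS (int);
}
```

On a hosted implementation the value agrees with CHAR_BIT from <limits.h>.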

libgcc/ChangeLog:

2015-10-30  Trevor Saunders  

* config/visium/lib2funcs.c (__set_trampoline_parity): Use
__CHAR_BIT__ instead of BITS_PER_UNIT.
* fixed-bit.h: Likewise.
* fp-bit.h: Likewise.
* libgcc2.c (__popcountSI2): Likewise.
(__popcountDI2): Likewise.
* libgcc2.h: Likewise.
* libgcov.h: Likewise.

libobjc/ChangeLog:

2015-10-30  Trevor Saunders  

PR libobjc/24775
* encoding.c (_darwin_rs6000_special_round_type_align): Use
__CHAR_BIT__ instead of BITS_PER_UNIT.
(objc_sizeof_type): Likewise.
(objc_layout_structure): Likewise.
(objc_layout_structure_next_member): Likewise.
(objc_layout_finish_structure): Likewise.
(objc_layout_structure_get_info): Likewise.
---
 libgcc/config/visium/lib2funcs.c |  2 +-
 libgcc/fixed-bit.h   | 10 +-
 libgcc/fp-bit.h  |  4 ++--
 libgcc/libgcc2.c | 24 
 libgcc/libgcc2.h |  8 
 libgcc/libgcov.h |  4 ++--
 libobjc/encoding.c   | 35 ---
 7 files changed, 42 insertions(+), 45 deletions(-)

diff --git a/libgcc/config/visium/lib2funcs.c b/libgcc/config/visium/lib2funcs.c
index ba720a3..ed9561f 100644
--- a/libgcc/config/visium/lib2funcs.c
+++ b/libgcc/config/visium/lib2funcs.c
@@ -315,7 +315,7 @@ __set_trampoline_parity (UWtype *addr)
 {
   int i;
 
-  for (i = 0; i < (TRAMPOLINE_SIZE * BITS_PER_UNIT) / W_TYPE_SIZE; i++)
+  for (i = 0; i < (TRAMPOLINE_SIZE * __CHAR_BIT__) / W_TYPE_SIZE; i++)
 addr[i] |= parity_bit (addr[i]);
 }
 #endif
diff --git a/libgcc/fixed-bit.h b/libgcc/fixed-bit.h
index 2efe01d..7f51f7b 100644
--- a/libgcc/fixed-bit.h
+++ b/libgcc/fixed-bit.h
@@ -434,7 +434,7 @@ typedef union
 } INTunion;
 #endif
 
-#define FIXED_WIDTH(FIXED_SIZE * BITS_PER_UNIT) /* in bits.  */
+#define FIXED_WIDTH(FIXED_SIZE * __CHAR_BIT__) /* in bits.  */
 #define FIXED_C_TYPE1(NAME)NAME ## type
 #define FIXED_C_TYPE2(NAME)FIXED_C_TYPE1(NAME)
 #define FIXED_C_TYPE   FIXED_C_TYPE2(MODE_NAME)
@@ -1108,17 +1108,17 @@ extern FIXED_C_TYPE FIXED_USASHL (FIXED_C_TYPE, 
word_type);
 #if defined (FROM_MODE_NAME_S) && defined (TO_MODE_NAME_S)
 
 #if FROM_TYPE == 1 /* Signed integer.  */
-#define FROM_INT_WIDTH (FROM_INT_SIZE * BITS_PER_UNIT)
+#define FROM_INT_WIDTH (FROM_INT_SIZE * __CHAR_BIT__)
 #endif
 
 #if FROM_TYPE == 2 /* Unsigned integer.  */
-#define FROM_INT_WIDTH (FROM_INT_SIZE * BITS_PER_UNIT)
+#define FROM_INT_WIDTH (FROM_INT_SIZE * __CHAR_BIT__)
 #endif
 
 #if FROM_TYPE == 4 /* Fixed-point.  */
 #define FROM_FIXED_C_TYPE  FIXED_C_TYPE2(FROM_MODE_NAME)
 #define FROM_FBITS FBITS2(FROM_MODE_NAME)
-#define FROM_FIXED_WIDTH   (FROM_FIXED_SIZE * BITS_PER_UNIT)
+#define FROM_FIXED_WIDTH   (FROM_FIXED_SIZE * __CHAR_BIT__)
 #define FROM_FBITS FBITS2(FROM_MODE_NAME)
 #define FROM_IBITS IBITS2(FROM_MODE_NAME)
 #define FROM_I_F_BITS  (FROM_FBITS + FROM_IBITS)
@@ -1136,7 +1136,7 @@ extern FIXED_C_TYPE FIXED_USASHL (FIXED_C_TYPE, 
word_type);
 #if TO_TYPE == 4   /* Fixed-point.  */
 #define TO_FIXED_C_TYPEFIXED_C_TYPE2(TO_MODE_NAME)
 #define TO_FBITS   FBITS2(TO_MODE_NAME)
-#define TO_FIXED_WIDTH (TO_FIXED_SIZE * BITS_PER_UNIT)
+#define TO_FIXED_WIDTH (TO_FIXED_SIZE * __CHAR_BIT__)
 #define TO_FBITS   FBITS2(TO_MODE_NAME)
 #define TO_IBITS   IBITS2(TO_MODE_NAME)
 #define TO_I_F_BITS(TO_FBITS + TO_IBITS)
diff --git a/libgcc/fp-bit.h b/libgcc/fp-bit.h
index d844f42..29661be 100644
--- a/libgcc/fp-bit.h
+++ b/libgcc/fp-bit.h
@@ -117,11 +117,11 @@ typedef unsigned int UTItype __attribute__ ((mode (TI)));
 
 #define MAX_USI_INT  (~(USItype)0)
 #define MAX_SI_INT   ((SItype) (MAX_USI_INT >> 1))
-#define BITS_PER_SI  (4 * BITS_PER_UNIT)
+#define BITS_PER_SI  (4 * __CHAR_BIT__)
 #ifdef TMODES
 #define MAX_UDI_INT  (~(UDItype)0)
 #define MAX_DI_INT   ((DItype) (MAX_UDI_INT >> 1))
-#define BITS_PER_DI  (8 * BITS_PER_UNIT)
+#define BITS_PER_DI  (8 * __CHAR_BIT__)
 #endif
 
 #ifdef FLOAT_ONLY
diff --git a/libgcc/libgcc2.c b/libgcc/libgcc2.c
index c737620..90dba06 100644
--- a/libgcc/libgcc2.c
+++ b/libgcc/libgcc2.c
@@ -160,7 +160,7 @@ __mulvSI3 (Wtype a, Wtype b)
 }
 #ifdef COMPAT_SIMODE_TRAPPING_ARITHMETIC
 #undef WORD_SIZE
-#define WORD_SIZE (sizeof (SItype) * BITS_PER_UNIT)
+#define WORD_SIZE (sizeof (SItype) * __CHAR_BIT__)
 SItype
 __mulvsi3 (SItype a, SItype b)
 {
@@ -820,16 +820,16 @@ const UQItype __popcount_tab[256] =
 #endif
 
 #if defined(L_popcountsi2) || defined(L_popcountdi2)
-#define POPCOUNTCST2(x) (((UWtype) x << 

RE: [PATCH 2/2][ARC] Add support for ARCv2 CPUs

2015-10-30 Thread Claudiu Zissulescu
Hi,

Please find the updated patch.  Both ARC patches were tested using dg.exp. The 
ChangeLog entry is unchanged. 

Thank you,
Claudiu


02-arcv2Updated.patch
Description: 02-arcv2Updated.patch


Re: Don't free dominators after sincos

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 12:16 PM, Richard Sandiford
 wrote:
> sincos has always freed dominators at the end, but AFAICT they should
> still be up-to-date.  (In particular, gimple_purge_dead_eh_edges
> updates the information.)
>
> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
> OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree-ssa-math-opts.c (pass_cse_sincos::execute): Don't free
> CDI_DOMINATORS.
>
> diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> index 2080328..1802754 100644
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -1857,7 +1857,6 @@ pass_cse_sincos::execute (function *fun)
>statistics_counter_event (fun, "sincos statements inserted",
> sincos_stats.inserted);
>
> -  free_dominance_info (CDI_DOMINATORS);
>return cfg_changed ? TODO_cleanup_cfg : 0;
>  }
>
>


Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 6:35 AM, Jeff Law  wrote:
> On 10/29/2015 01:18 PM, Alan Lawrence wrote:
>>
>> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs
>> (with
>> pointer type to the array element type).
>>
>> gcc/ChangeLog:
>>
>> * tree-ssa-dom.c (dom_normalize_single_rhs): New.
>> (dom_normalize_gimple_stmt): New.
>> (lookup_avail_expr): Call dom_normalize_gimple_stmt.
>
> Has this been tested?  Do you have tests where it can be shown to make a
> difference independent of the changes to tree-sra.c?
>
> The implementation looks fine, I just want to have some basic tests in the
> testsuite that show the benefit of this normalization.

Err, I think the implementation is extremely wasteful ...

> Similarly for patch #2. Interestingly enough we had code that made that kind
> of transformation during gimplification eons ago.  Presumably it was ripped
> out at some point because of the differences in aliasing properties.

Please have a look at how SCCVN handles variants of memory references.
You might even want to re-use its copy_reference_ops_from_ref implementation
(and reference hashing).  Note that SCCVN needs to be able to reconstruct a
tree expression from its representation (for PRE) which DOM needs not, so
DOM might use a simpler form.  The basic idea is to accumulate constant
offset handled_components into a "offset component".  Thus

  a[2].i ->  ()->offset(8 + offsetof(i))
  MEM_REF[, 8].i ->  ()->offset(8 + offsetof(i))

for DOM you can probably do the copy_reference_ops_from_ref work
on-the-fly for hashing and comparing.  The main point will be to forgo
with the DOM way of hashing/comparing for memory references.

Richard.
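
The "offset component" idea can be illustrated outside the compiler. In this sketch (the struct layout and function names are hypothetical) an array-plus-field reference and a sum of constant offsets are shown to name the same byte position, which is why both spellings can share one canonical form for hashing and comparison.

```c
#include <stddef.h>

struct elem
{
  int pad;
  int i;
};

static struct elem a[4];

/* Byte offset of a[2].i from the start of a, taken from the address of
   the array/component reference itself.  */
static size_t
offset_via_array_ref (void)
{
  return (size_t) ((const char *) &a[2].i - (const char *) a);
}

/* The same position built by accumulating constant offsets: the array
   index contributes 2 * sizeof (element), the field contributes
   offsetof (element, i).  */
static size_t
offset_via_components (void)
{
  return 2 * sizeof (struct elem) + offsetof (struct elem, i);
}
```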

>
>
> Jeff


[AARCH64][PATCH 2/3] Implementing vmulx_lane NEON intrinsic variants

2015-10-30 Thread Bilyan Borisov

Implementing vmulx_* and vmulx_lane* NEON intrinsics

Hi all,

This series of patches focuses on the different vmulx_ and vmulx_lane NEON
intrinsics variants. All of the existing inlined assembly block implementations
are replaced with newly defined __builtin functions, and the missing intrinsics
are implemented with __builtins as well.

The rationale for the change from assembly to __builtin is that the compiler
would be able to do more optimisations like instruction scheduling. A new named
md pattern was added for the new fmulx __builtin.

Most vmulx_lane variants have been implemented as a combination of a vdup
followed by a vmulx_, rather than as separate __builtins.  The remaining
vmulx_lane intrinsics (vmulx(s|d)_lane*) were implemented using
__aarch64_vget_lane_any () and an appropriate vmulx. Four new nameless md
patterns were added to replace all the different types of RTL generated from the
combination of these intrinsics during the combine pass.

The rationale for this change is that in this way we would be able to optimise
away all uses of a dup followed by a fmulx to the appropriate fmulx lane variant
instruction.
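
To make the "dup followed by mulx" decomposition concrete, here is a scalar model in portable C. It does not use arm_neon.h; the type f32x2 and the functions fmulx_model, dup_lane, mulx and mulx_lane are hypothetical stand-ins. The model assumes the architectural behaviour of FMULX, which differs from an ordinary multiply only in returning 2.0 (with the sign of the operands combined) for zero times infinity.

```c
#include <assert.h>
#include <math.h>

/* Scalar model of FMULX: like a * b, except 0 * inf gives +/-2.0.  */
static float
fmulx_model (float a, float b)
{
  if ((a == 0.0f && isinf (b)) || (isinf (a) && b == 0.0f))
    return (!signbit (a) != !signbit (b)) ? -2.0f : 2.0f;
  return a * b;
}

/* Two-lane vector of floats.  */
typedef struct
{
  float lane[2];
} f32x2;

/* Broadcast one lane of V to every lane.  */
static f32x2
dup_lane (f32x2 v, int lane)
{
  assert (lane >= 0 && lane < 2);
  f32x2 r = { { v.lane[lane], v.lane[lane] } };
  return r;
}

/* Lane-wise FMULX.  */
static f32x2
mulx (f32x2 a, f32x2 b)
{
  f32x2 r = { { fmulx_model (a.lane[0], b.lane[0]),
                fmulx_model (a.lane[1], b.lane[1]) } };
  return r;
}

/* The _lane variant expressed as a dup followed by a plain mulx.  */
static f32x2
mulx_lane (f32x2 a, f32x2 v, int lane)
{
  return mulx (a, dup_lane (v, lane));
}
```

In the real intrinsics the combine patterns described above are what let the compiler turn this two-step form back into a single by-lane instruction.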

New test cases were added for all the implemented intrinsics. Also new tests
were added for the proper error reporting of out-of-bounds accesses to _lane
intrinsics.

Tested on targets aarch64-none-elf and aarch64_be-none-elf.

Dependencies: patch 2/3 depends on patch 1/3, and patch 3/3 depends on patch
2/3.

---

In this patch from the series, all vmulx_lane variants have been implemented as
a vdup followed by a vmulx. Existing implementations of intrinsics were
refactored to use this new approach.

Several new nameless md patterns are added that will enable the combine pass to
pick up the dup/fmulx combination and replace it with a proper fmulx[lane]
instruction.

In addition, test cases for all new intrinsics were added. Tested on targets
aarch64-none-elf and aarch64_be-none-elf.

gcc/

2015-XX-XX  Bilyan Borisov  

* config/aarch64/arm_neon.h (vmulx_lane_f32): New.
(vmulx_lane_f64): New.
(vmulxq_lane_f32): Refactored & moved.
(vmulxq_lane_f64): Refactored & moved.
(vmulx_laneq_f32): New.
(vmulx_laneq_f64): New.
(vmulxq_laneq_f32): New.
(vmulxq_laneq_f64): New.
(vmulxs_lane_f32): New.
(vmulxs_laneq_f32): New.
(vmulxd_lane_f64): New.
(vmulxd_laneq_f64): New.
* config/aarch64/aarch64-simd.md (*aarch64_combine_dupfmulx1,
VDQSF): New pattern.
(*aarch64_combine_dupfmulx2, VDQF): New pattern.
(*aarch64_combine_dupfmulx3): New pattern.
(*aarch64_combine_vgetfmulx1, VDQF_DF): New pattern.

gcc/testsuite/

2015-XX-XX  Bilyan Borisov  

* gcc/testsuite/gcc.target/aarch64/simd/vmulx_lane_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulx_lane_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulx_laneq_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulx_laneq_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_lane_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_lane_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_laneq_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_laneq_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxs_lane_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxs_laneq_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxd_lane_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxd_laneq_f64_1.c: New.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index e7ebbd158d21691791a8d7db8a2616062e50..8d6873a45ad0cdef42f7c632bca38096b9de1787 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2822,6 +2822,79 @@
  [(set_attr "type" "neon_fp_mul_")]
 )
 
+;; fmulxq_lane_f32, and fmulx_laneq_f32
+
+(define_insn "*aarch64_combine_dupfmulx1"
+  [(set (match_operand:VDQSF 0 "register_operand" "=w")
+	(unspec:VDQSF
+	 [(match_operand:VDQSF 1 "register_operand" "w")
+	  (vec_duplicate:VDQSF
+	   (vec_select:
+	(match_operand: 2 "register_operand" "w")
+	(parallel [(match_operand:SI 3 "immediate_operand" "i")])))]
+	 UNSPEC_FMULX))]
+  "TARGET_SIMD"
+  {
+operands[3] = GEN_INT (ENDIAN_LANE_N (mode,
+	  INTVAL (operands[3])));
+return "fmulx\t%0, %1, %2.[%3]";
+  }
+  [(set_attr "type" "neon_fp_mul__scalar")]
+)
+
+;; fmulxq_laneq_f32, fmulxq_laneq_f64, fmulx_lane_f32
+
+(define_insn "*aarch64_combine_dupfmulx2"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+	(unspec:VDQF
+	 [(match_operand:VDQF 1 "register_operand" "w")
+	  (vec_duplicate:VDQF
+	   (vec_select:
+	(match_operand:VDQF 2 "register_operand" "w")
+	(parallel [(match_operand:SI 3 "immediate_operand" "i")])))]
+	 UNSPEC_FMULX))]
+  

Re: [PATCH] PR fortran/68154 -- repair damage done by fix for PR fortran/65429

2015-10-30 Thread Paul Richard Thomas
Dear Steve,

OK to commit.

Thanks for the fix.

Paul

On 30 October 2015 at 01:03, Steve Kargl
 wrote:
> The attached patch restores 3 lines of code removed in my
> fix for PR fortran/65429.  The code now checks for a NULL
> character length in the typespec.  If it is indeed NULL,
> gfortran will look for a valid constructor to use (ie.,
> the 3 lines of code).  It is somewhat surprising that
> it took 6 months for this bug to appear.  Patch tested
> on x86_64-*-freebsd.  OK to commit?
>
> 2015-10-29  Steven G. Kargl  
>
> PR fortran/68154
> * decl.c (add_init_expr_to_sym): if the char length in the typespec
> is NULL, check for and use the constructor.
>
> 2015-10-29  Steven G. Kargl  
>
> PR fortran/68154
> *gfortran.dg/pr68154.f90
>
> --
> Steve



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


RE: [PATCH 1/2][ARC] Add support for ARCv2 CPUs

2015-10-30 Thread Claudiu Zissulescu
Hi,

Please find the updated patch. I will defer the secondary reload optimization 
which will use the ld  instructions with LIMM, for the time being.

Thank you,
Claudiu

gcc/ChangeLog:

2015-08-27  Claudiu Zissulescu

* common/config/arc/arc-common.c (arc_handle_option): Handle ARCv2
options.
* config/arc/arc-opts.h: Add ARCv2 CPUs.
* config/arc/arc-protos.h (arc_secondary_reload_conv): Prototype.
* config/arc/arc.c (arc_secondary_reload): Handle subreg (reg)
situation, and store instructions with large offsets.
(arc_secondary_reload_conv): New function.
(arc_init): Add ARCv2 options.
(arc_conditional_register_usage): Select the proper register usage
for ARCv2 processors.
(arc_handle_interrupt_attribute): ILINK2 is only valid for ARCv1
architecture.
(arc_compute_function_type): Likewise.
(arc_print_operand): Handle new ARCv2 punctuation characters.
(arc_return_in_memory): ARCv2 ABI returns in registers up to 16
bytes.
(workaround_arc_anomaly, arc_asm_insn_p, arc_loop_hazard): New
function.
(arc_reorg, arc_hazard): Use it.
(gen_compare_reg): Eliminate false assert situations.
* config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): Define __HS__ and
__EM__.
(ASM_SPEC): Add ARCv2 options.
(TARGET_NORM): ARC HS has norm instructions by default.
(TARGET_OPTFPE): Use optimized floating point emulation for ARC
HS.
(TARGET_AT_DBR_CONDEXEC): Only for ARC600 family.
(TARGET_EM, TARGET_HS, TARGET_V2, TARGET_MPYW, TARGET_MULTI):
Define.
(SIGNED_INT16, TARGET_MPY, TARGET_ARC700_MPY, TARGET_ANY_MPY):
Likewise.
(TARGET_ARC600_FAMILY, TARGET_ARCOMPACT_FAMILY): Likewise.

[AARCH64][PATCH 1/3] Implementing the variants of the vmulx_ NEON intrinsic

2015-10-30 Thread Bilyan Borisov

Implementing vmulx_* and vmulx_lane* NEON intrinsics

Hi all,

This series of patches focuses on the different vmulx_ and vmulx_lane NEON
intrinsics variants. All of the existing inlined assembly block implementations
are replaced with newly defined __builtin functions, and the missing intrinsics
are implemented with __builtins as well.

The rationale for the change from assembly to __builtin is that the compiler
would be able to do more optimisations like instruction scheduling. A new named
md pattern was added for the new fmulx __builtin.

Most vmulx_lane variants have been implemented as a combination of a vdup
followed by a vmulx_, rather than as separate __builtins.  The remaining
vmulx_lane intrinsics (vmulx(s|d)_lane*) were implemented using
__aarch64_vget_lane_any () and an appropriate vmulx. Four new nameless md
patterns were added to replace all the different types of RTL generated from the
combination of these intrinsics during the combine pass.

The rationale for this change is that in this way we would be able to optimise
away all uses of a dup followed by a fmulx to the appropriate fmulx lane variant
instruction.
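
For readers without an AArch64 target at hand, the "dup followed by fmulx" shape described above can be modelled in portable C. This is only a sketch: fmulx_model and vmulx_lane_model are hypothetical names, not part of the patch or of arm_neon.h, and the model assumes the usual FMULX behaviour (a zero times an infinity gives 2.0 with the sign being the XOR of the operand signs, otherwise an ordinary multiply).

```c
#include <assert.h>
#include <math.h>

/* Scalar model of FMULX (hypothetical helper, not a GCC builtin):
   behaves like a * b, except that zero times infinity yields 2.0
   with the combined sign instead of NaN.  */
static float
fmulx_model (float a, float b)
{
  int zero_times_inf = (a == 0.0f && isinf (b)) || (isinf (a) && b == 0.0f);
  if (zero_times_inf)
    return (!signbit (a) != !signbit (b)) ? -2.0f : 2.0f;
  return a * b;
}

/* Model of a two-lane vmulx_lane: broadcast v[lane] (the "dup" step),
   then apply the fmulx model element-wise (the "vmulx" step).  */
static void
vmulx_lane_model (float out[2], const float a[2], const float v[2], int lane)
{
  /* The real intrinsics reject out-of-bounds lanes at compile time;
     the model just asserts.  */
  assert (lane >= 0 && lane < 2);
  float dup[2] = { v[lane], v[lane] };
  for (int i = 0; i < 2; i++)
    out[i] = fmulx_model (a[i], dup[i]);
}
```

The combine patterns in the series aim to collapse exactly these two steps into a single by-lane fmulx instruction, so the model is only meant to show the value being computed, not the generated code.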

New test cases were added for all the implemented intrinsics. Also new tests
were added for the proper error reporting of out-of-bounds accesses to _lane
intrinsics.

Tested on targets aarch64-none-elf and aarch64_be-none-elf.

Dependencies: patch 2/3 depends on patch 1/3, and patch 3/3 depends on patch
2/3.

---

In this patch from the series, a single new md pattern is added: the one for
fmulx, from which all necessary __builtin functions are derived.

Several intrinsics were refactored to use the new __builtin functions as some
of them already had an assembly block implementation. The rest, which had no
existing implementation, were also added. A single intrinsic was removed:
vmulx_lane_f32, since there was no test case that covered it and, moreover,
its implementation was wrong: it was in fact implementing vmulxq_lane_f32.

In addition, test cases for all new intrinsics were added. Tested on targets
aarch64-none-elf and aarch64_be-none-elf.

gcc/

2015-XX-XX  Bilyan Borisov  

* config/aarch64/aarch64-simd-builtins.def: BUILTIN declaration for
fmulx...
* config/aarch64/aarch64-simd.md: And its corresponding md pattern.
* config/aarch64/arm_neon.h (vmulx_f32): Refactored to call fmulx
__builtin, also moved.
(vmulxq_f32): Same.
(vmulx_f64): New, uses __builtin.
(vmulxq_f64): Refactored to call fmulx __builtin, also moved.
(vmulxs_f32): Same.
(vmulxd_f64): Same.
(vmulx_lane_f32): Removed, implementation was wrong.
* config/aarch64/iterators.md: UNSPEC enum for fmulx.

gcc/testsuite/

2015-XX-XX  Bilyan Borisov  

* gcc/testsuite/gcc.target/aarch64/simd/vmulx_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulx_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxq_f64_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxs_f32_1.c: New.
* gcc/testsuite/gcc.target/aarch64/simd/vmulxd_f64_1.c: New.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 2c13cfb0823640254f02c202b19ddae78484d537..eed5f2b21997d4ea439dea828a0888cb253ad041 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -41,6 +41,7 @@
 
   BUILTIN_VDC (COMBINE, combine, 0)
   BUILTIN_VB (BINOP, pmul, 0)
+  BUILTIN_VALLF (BINOP, fmulx, 0)
   BUILTIN_VDQF_DF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 541faf982effc7195a5f8d0d82738f76a7e04b4b..e7ebbd158d21691791a8d7db8a2616062e50 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2810,6 +2810,18 @@
   [(set_attr "type" "neon_mul_")]
 )
 
+;; fmulx.
+
+(define_insn "aarch64_fmulx"
+  [(set (match_operand:VALLF 0 "register_operand" "=w")
+	(unspec:VALLF [(match_operand:VALLF 1 "register_operand" "w")
+		   (match_operand:VALLF 2 "register_operand" "w")]
+		  UNSPEC_FMULX))]
+ "TARGET_SIMD"
+ "fmulx\t%0, %1, %2"
+ [(set_attr "type" "neon_fp_mul_")]
+)
+
 ;; q
 
 (define_insn "aarch64_"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 91ada618b79e038eb61e09ecd29af5129de81f51..4a3ef455b0945ed7e77fb3e78621d5010cd4c094 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -8509,63 +8509,6 @@ vmulq_n_u32 (uint32x4_t a, uint32_t b)
   return result;
 }
 
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vmulx_f32 (float32x2_t a, float32x2_t b)
-{
-  float32x2_t result;
-  __asm__ ("fmulx %0.2s,%1.2s,%2.2s"
-   : "=w"(result)
-   

Don't free dominators after sincos

2015-10-30 Thread Richard Sandiford
sincos has always freed dominators at the end, but AFAICT they should
still be up-to-date.  (In particular, gimple_purge_dead_eh_edges
updates the information.)

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* tree-ssa-math-opts.c (pass_cse_sincos::execute): Don't free
CDI_DOMINATORS.

diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 2080328..1802754 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1857,7 +1857,6 @@ pass_cse_sincos::execute (function *fun)
   statistics_counter_event (fun, "sincos statements inserted",
sincos_stats.inserted);
 
-  free_dominance_info (CDI_DOMINATORS);
   return cfg_changed ? TODO_cleanup_cfg : 0;
 }
 



Re: [PATCH, 1/6] Simplify constraint handling

2015-10-30 Thread Richard Biener
On Thu, 29 Oct 2015, Tom de Vries wrote:

> On 29/10/15 14:12, Richard Biener wrote:
> > On Thu, 29 Oct 2015, Tom de Vries wrote:
> > 
> > > On 29/10/15 12:13, Richard Biener wrote:
> > > > On Wed, 28 Oct 2015, Tom de Vries wrote:
> > > >
> > > > > On 28/10/15 16:35, Richard Biener wrote:
> > > > > > On Tue, 27 Oct 2015, Tom de Vries wrote:
> > > > > >
> > > > > > > On 27/10/15 13:24, Tom de Vries wrote:
> > > > > > > > Thinking it over a bit more, I realized the constraint handling
> > > > > > > > started to be messy. I've reworked the patch series to simplify
> > > > > > > > that first.
> > > > > > > >
> > > > > > > > 1  Simplify constraint handling
> > > > > > > > 2  Rename make_restrict_var_constraints to make_param_constraints
> > > > > > > > 3  Add recursion to make_param_constraints
> > > > > > > > 4  Add handle_param parameter to create_variable_info_for_1
> > > > > > > > 5  Handle recursive restrict pointer in create_variable_info_for_1
> > > > > > > > 6  Handle restrict struct fields recursively
> > > > > > > >
> > > > > > > > Currently doing bootstrap and regtest on x86_64.
> > > > > > > >
> > > > > > > > I'll repost the patch series in reply to this message.
> > > > > > >
> > > > > > > This patch gets rid of this bit of code in
> > > > > > > intra_create_variable_infos:
> > > > > > > ...
> > > > > > > if (restrict_pointer_p)
> > > > > > >   make_constraint_from_global_restrict (p, "PARM_RESTRICT");
> > > > > > > else
> > > > > > > ..
> > > > > > >
> > > > > > > I already proposed to remove it here (
> > > > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02426.html ) but
> > > > > > > there is a problem with that approach: It can happen that
> > > > > > > restrict_pointer_p is true, but p->only_restrict_pointers is
> > > > > > > false. This happens with fipa-pta, when create_function_info_for
> > > > > > > created a varinfo for the parameter before
> > > > > > > intra_create_variable_infos was called.
> > > > > > >
> > > > > > > The patch handles that case now by setting
> > > > > > > p->only_restrict_pointers.
> > > > > >
> > > > > > Hmm, but ... restrict only has an effect in non-IPA mode.
> > > > >
> > > > > Indeed, I also realized that by now.
> > > > >
> > > > > I wrote attached patch to make that explicit and simplify fipa-pta.
> > > > >
> > > > > OK for trunk if bootstrap and reg-test succeeds?
> > >
> > > First, there was an error in the patch, it tested for flag_ipa_pta (so it
> > > also affected ealias), but it was supposed to test for in_ipa mode. That is
> > > fixed in attached version.
> > >
> > > > I don't see the patch simplifies anything but only removes spurious
> > > > settings by adding IMHO redundant checks.
> > >
> > > Consider testcase:
> > > ...
> > > int __attribute__((noinline, noclone))
> > > foo (int *__restrict__ a, int *__restrict__ b)
> > > {
> > >   *a = 1;
> > >   *b = 2;
> > > }
> > >
> > > int __attribute__((noinline, noclone))
> > > bar (int *a, int *b)
> > > {
> > >   foo (a, b);
> > > }
> > > ...
> > >
> > > The impact of this patch in the pta dump (focusing on the constraints
> > > bit) is:
> > > ...
> > >   Generating constraints for foo (foo)
> > >
> > > -foo.arg0 = _NOALIAS(20)
> > > -PARM_NOALIAS(20) = NONLOCAL
> > > -foo.arg1 = _NOALIAS(21)
> > > -PARM_NOALIAS(21) = NONLOCAL
> > > +foo.arg0 = 
> > > +foo.arg1 = 
> > >

[RFC] [Patch] Relax tree-if-conv.c trap assumptions.

2015-10-30 Thread Kumar, Venkataramanan
Hi Richard,

I am trying to "if-convert the store" in the test case below and later help it
to get vectorized under -Ofast -ftree-loop-if-convert-stores -fno-common

#define LEN 4096
__attribute__((aligned(32))) float array[LEN];
void test()
{
  for (int i = 0; i < LEN; i++)
    {
      if (array[i] > (float)0.)
        array[i] = 3;
    }
}

Currently GCC 5.2 does not vectorize it.
https://goo.gl/9nS6Pd

However ICC seems to vectorize it 
https://goo.gl/y1yGHx

As discussed with you earlier, to allow the "if convert store" I am checking
the following:

(1) We already read the reference "array[i]" unconditionally once.
(2) I am now checking whether we are conditionally writing to memory which is
defined as read and write and is bound to the definition we are seeing.

With this change, I am able to if-convert and then vectorize the case as well.

/build/gcc/xgcc -B ./build/gcc/  ifconv.c -Ofast -fopt-info-vec  -S 
-ftree-loop-if-convert-stores -fno-common
ifconv.c:2:63: note: loop vectorized

Patch 
--
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index f201ab5..6475cc0 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -727,6 +727,34 @@ write_memrefs_written_at_least_once (gimple *stmt,
   return true;
}

+static bool
+write_memrefs_safe_to_access_unconditionally (gimple *stmt,
+                                              vec<data_reference_p> drs)
+{
+  int i;
+  data_reference_p a;
+  bool found = false;
+
+  for (i = 0; drs.iterate (i, &a); i++)
+    {
+  if (DR_STMT (a) == stmt
+   && DR_IS_WRITE (a)
+   && (DR_WRITTEN_AT_LEAST_ONCE (a) == 0)
+   && (DR_RW_UNCONDITIONALLY (a) == 1))
+ {
+   tree base = get_base_address (DR_REF (a));
+   found = false;
+   if (DECL_P (base)
+   && decl_binds_to_current_def_p (base)
+   && !TREE_READONLY (base))
+     {
+   found = true;
+     }
+ }
+    }
+  return found;
+}
+
/* Return true when the memory references of STMT won't trap in the
    if-converted code.  There are two things that we have to check for:

@@ -748,8 +776,20 @@ write_memrefs_written_at_least_once (gimple *stmt,
static bool
ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> refs)
{
-  return write_memrefs_written_at_least_once (stmt, refs)
-    && memrefs_read_or_written_unconditionally (stmt, refs);
+  bool memrefs_written_once, memrefs_read_written_unconditionally;
+  bool memrefs_safe_to_access;
+
+  memrefs_written_once
+ = write_memrefs_written_at_least_once (stmt, refs);
+
+  memrefs_read_written_unconditionally
+ =  memrefs_read_or_written_unconditionally (stmt, refs);
+
+  memrefs_safe_to_access
+ = write_memrefs_safe_to_access_unconditionally (stmt, refs);
+
+  return ((memrefs_written_once || memrefs_safe_to_access)
+    && memrefs_read_written_unconditionally);
}

 /* Wrapper around gimple_could_trap_p refined for the needs of the


Do I still need the function write_memrefs_written_at_least_once?
Please suggest if there is a better way to do this.

Bootstrapped and regression tested on x86_64.
I am not adding a change log and comments now, as I wanted to check the
approach first.

Regards,
Venkat.
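
As a reading aid, the effect of if-converting the store in the test case can be sketched at source level. This is an illustration under assumptions, not what the patch emits: store_conditional and store_if_converted are hypothetical names, and the real transformation happens on GIMPLE rather than on C source.

```c
#include <assert.h>
#include <stddef.h>

/* Original shape: the store is executed only when the condition holds.  */
static void
store_conditional (float *a, size_t n)
{
  assert (a != NULL);
  for (size_t i = 0; i < n; i++)
    if (a[i] > 0.0f)
      a[i] = 3.0f;
}

/* If-converted shape: every iteration loads a[i] and stores back either
   the new value or the old one, so the loop body is branch-free and
   easier to vectorize.  */
static void
store_if_converted (float *a, size_t n)
{
  assert (a != NULL);
  for (size_t i = 0; i < n; i++)
    {
      float old = a[i];
      a[i] = old > 0.0f ? 3.0f : old;
    }
}
```

In a single thread both loops leave the array in the same state. The second form, however, also writes elements that the first form never touches, which is presumably the reason the transformation needs the memory to be known writable and why it interacts with the thread-safety question raised in the replies.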




Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 8:18 PM, Alan Lawrence  wrote:
> The code I added to completely_scalarize for arrays isn't right in some cases
> of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada
> testsuite). On ARM, this prevents a failure bootstrapping Ada with the next
> patch, as well as a few ACATS tests (e.g. c64106a).
>
> Some discussion here: https://gcc.gnu.org/ml/gcc/2015-10/msg00096.html .
>
> gcc/ChangeLog:
>
> * tree-sra.c (completely_scalarize): Deal properly with negative array
> indices.
> ---
>  gcc/tree-sra.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index e15df1f..358db79 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -1010,18 +1010,25 @@ completely_scalarize (tree base, tree decl_type, 
> HOST_WIDE_INT offset, tree ref)
> if (maxidx)
>   {
> gcc_assert (TREE_CODE (maxidx) == INTEGER_CST);
> -   /* MINIDX and MAXIDX are inclusive.  Try to avoid overflow.  */
> -   unsigned HOST_WIDE_INT lenp1 = tree_to_shwi (maxidx)
> -   - tree_to_shwi (minidx);
> +   /* MINIDX and MAXIDX are inclusive, and must be interpreted in the
> +  TYPE_DOMAIN (e.g. signed int, whereas min/max may be size_int).
> +  Try also to avoid overflow.  */
> +   minidx = build_int_cst (TYPE_DOMAIN (decl_type),
> +   tree_to_shwi (minidx));
> +   maxidx = build_int_cst (TYPE_DOMAIN (decl_type),
> +   tree_to_shwi (maxidx));

I think you want to use wide-ints here and

   wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN (...)),
                            TYPE_SIGN (TYPE_DOMAIN (..)));
   wide_int maxidx = ...

you can then simply iterate minidx with ++ and do the final compare
against maxidx with while (++idx <= maxidx).  For the array ref index we
want to use TYPE_DOMAIN as type as well, not size_int.  Thus
wide_int_to_tree (TYPE_DOMAIN (...)..idx).

Richard.

> +   HOST_WIDE_INT min = tree_to_shwi (minidx);
> +   unsigned HOST_WIDE_INT lenlessone = tree_to_shwi (maxidx) - min;
> unsigned HOST_WIDE_INT idx = 0;
> do
>   {
> -   tree nref = build4 (ARRAY_REF, elemtype, ref, size_int (idx),
> +   tree nref = build4 (ARRAY_REF, elemtype,
> +   ref, size_int (idx + min),
> NULL_TREE, NULL_TREE);
> int el_off = offset + idx * el_size;
> scalarize_elem (base, el_off, el_size, nref, elemtype);
>   }
> -   while (++idx <= lenp1);
> +   while (++idx <= lenlessone);
>   }
>}
>break;
> --
> 1.9.1
>
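
The iteration being discussed can be sketched outside of GCC in plain C. The sketch below is only a model under assumptions: scalarize_model is a hypothetical name, int64_t stands in for the signed index domain, and the two output arrays stand in for the ARRAY_REF index and the byte offset handed to scalarize_elem in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Walk an array whose inclusive index domain is [min, max], where min
   may be negative.  For each element record the index as seen in the
   signed domain and the byte offset from the start of the array.
   Returns the number of elements visited.  */
static unsigned
scalarize_model (int64_t min, int64_t max, int64_t el_size,
                 int64_t idx_out[], int64_t off_out[], unsigned cap)
{
  assert (min <= max);
  /* Element count minus one, computed in unsigned arithmetic so that
     the subtraction itself cannot overflow.  */
  uint64_t lenlessone = (uint64_t) max - (uint64_t) min;
  assert (lenlessone < cap);

  uint64_t i = 0;
  do
    {
      idx_out[i] = min + (int64_t) i;      /* index in the signed domain */
      off_out[i] = (int64_t) i * el_size;  /* offset counts from zero */
    }
  while (++i <= lenlessone);
  return (unsigned) i;
}
```

For a domain of -1 to 1 with 4-byte elements this visits indices -1, 0, 1 at offsets 0, 4, 8. The point of the sketch is that the index and the offset are two different quantities: the former must be interpreted in the signed domain type, the latter always starts at zero.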


Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-30 Thread Martin Liška
On 10/30/2015 09:54 AM, Richard Biener wrote:
> On Thu, Oct 29, 2015 at 3:50 PM, Martin Liška  wrote:
>> On 10/29/2015 02:15 PM, Richard Biener wrote:
>>> On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška  wrote:
 On 10/28/2015 04:23 PM, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
>> On 10/27/2015 03:49 PM, Richard Biener wrote:
>>> On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
 On 10/26/2015 02:48 PM, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
>> On 10/21/2015 04:06 PM, Richard Biener wrote:
>>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  
>>> wrote:
 On 10/21/2015 11:59 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  
> wrote:
>> On 10/20/2015 03:39 PM, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
>>> wrote:
 Hello.

 As part of upcoming merge of HSA branch, we would like to have 
 possibility to terminate
 pass manager after execution of the HSA generation pass. The 
 HSA back-end is implemented
 as a tree pass that directly emits HSAIL from gimple tree 
 representation. The pass operates
 on clones created by HSA IPA pass and the pass manager should 
 stop execution of further
 RTL passes.

 Suggested patch survives bootstrap and regression tests on 
 x86_64-linux-pc.

 What do you think about it?
>>>
>>> Are you sure it works this way?
>>>
>>> Btw, you will miss executing of all the cleanup passes that will
>>> eventually free memory
>>> associated with the function.  So I'd rather support a
>>> TODO_discard_function which
>>> should basically release the body from the cgraph.
>>
>> Hi.
>>
>> Agree with you that I should execute all TODOs, which can be 
>> easily done.
>> However, if I just try to introduce the suggested TODO and 
>> handle it properly
>> by calling cgraph_node::release_body, then for instance 
>> fn->gimple_df, fn->cfg are
>> released and I hit ICEs on many places.
>>
>> Stopping the pass manager looks necessary, or do I miss 
>> something?
>
> "Stopping the pass manager" is necessary after 
> TODO_discard_function, yes.
> But that may be simply done via a has_body () check then?

 Thanks, there's second version of the patch. I'm going to start 
 regression tests.
>>>
>>> As release_body () will free cfun you should pop_cfun () before
>>> calling it (and thus
>>
>> Well, release_function_body calls both push & pop, so I think 
>> calling pop
>> before cgraph_node::release_body is not necessary.
>
> (ugh).
>
>> If tried to call pop_cfun before cgraph_node::release_body, I have 
>> cfun still
>> pointing to the same (function *) (which is gcc_freed, but cfun != 
>> NULL).
>
> Hmm, I meant to call pop_cfun then after it (unless you want to fix 
> the above,
> none of the freeing functions should technically need 'cfun', just add
> 'fn' parameters ...).
>
> I expected pop_cfun to eventually set cfun to NULL if it popped the
> "last" cfun.  Why
> doesn't it do that?
>
>>> drop its modification).  Also TODO_discard_function should be only 
>>> set for
>>> local passes thus you should probably add a gcc_assert (cfun).
>>> I'd move its handling earlier, definitely before the ggc_collect, 
>>> eventually
>>> before the pass_fini_dump_file () (do we want a last dump of the
>>> function or not?).
>>
>> Fully agree, moved here.
>>
>>>
>>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>>>  {
>>>gcc_assert (pass->type == GIMPLE_PASS
>>>   || pass->type == RTL_PASS);
>>> +
>>> +
>>> +  if (!gimple_has_body_p (current_function_decl))
>>> +   return;
>>>
>>> too much vertical space.  With popping cfun before releasing the 
>>> body the check
>>> might just become if (!cfun) and
>>
>> As mentioned 

Re: [PATCH 1/5] 2015-01-25 Paul Thomas <pa...@gcc.gnu.org>

2015-10-30 Thread Bernd Schmidt

On 10/30/2015 12:48 PM, tbsaunde+...@tbsaunde.org wrote:

From: pault 

PR fortran/67171
* trans-array.c (structure_alloc_comps): On deallocation of
class components, reset the vptr to the declared type vtable
and reset the _len field of unlimited polymorphic components.
* trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
allocatable component references to the right of part reference
with non-zero rank and return NULL.
(gfc_reset_vptr): Simplify this function by using the function
gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
(gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
NULL return.
* trans-stmt.c (gfc_trans_allocate): Rely on the use of
gfc_trans_assignment if expr3 is a variable expression since
this deals correctly with array sections.


There's no explanation of this patch or how it relates to the others in 
this series. Did you send the wrong patch?



Bernd



Re: [PATCH][AArch64] Fix insn types

2015-10-30 Thread Marcus Shawcroft
On 20 October 2015 at 17:14, Evandro Menezes  wrote:
> Kyrill,
>
> Indeed, the correct log would be:
>
> The type assigned to some insn definitions was not correct.
>
> gcc/
> * config/aarch64/aarch64.md
> (*movhf_aarch64): Change the type of "mov %0.h[0], %1.h[0]" to
> "neon_move".
> (*movtf_aarch64): Change the type of "fmov %s0, wzr" to "f_mcr".
> (*cmov_insn): Change the types of "mov %0, {-1,1}" to
> "mov_imm".
> (*cmovsi_insn_uxtw): Likewise.
>
> Thank you,
>

OK thanks,  committed as r229572.

/Marcus


Re: Multiply Optimization in match and Simplify

2015-10-30 Thread Richard Biener
On Thu, Oct 29, 2015 at 5:34 AM, Hurugalawadi, Naveen
 wrote:
> Hi,
>
> Please find attached the patch that moves some multiply optimizations
> from fold-const using simplify and match.
>
> Please review the patch and let me know if any modifications are required.
>
> Tested the patch on X86.
>
> Observing following failures:-
>
>>> FAIL: gcc.dg/fold-plusmult.c scan-tree-dump-times original " \\* 4" 2
>
> Should the testcase be changed to suit current pattern?
>
>>> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
>>> FAIL: gcc.dg/tree-ssa/vrp59.c scan-tree-dump-not vrp1 " & 3;"
>
> Its due to the following pattern. Pattern seems to be right.
> Fold X & (X ^ Y) as X & ~Y
> The test PASSes when we have the result as ~X & Y in some
> of the following combinations which is wrong.
> (bit_and (convert? (bit_xor:c @0 @1) (convert? @0) ))

Please do not drop A - B -> A + (-B) from fold-const as match.pd
doesn't implement all of fold-const.c negate_expr_p support.

+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert (bit_and @0 (bit_not @1)

Ok, so the VRP regression is because we convert

  _8 = x_2(D) ^ 1;
  _4 = (_Bool) _8;
  _5 = (int) _4;

to

  _7 = ~x_2(D);
  _9 = _7 & 1;

via the new pattern and

/* A truncation to an unsigned type (a zero-extension) should be
   canonicalized as bitwise and of a mask.  */
(if (final_int && inter_int && inside_int
 && final_prec == inside_prec
 && final_prec > inter_prec
 && inter_unsignedp)
 (convert (bit_and @0 { wide_int_to_tree
  (inside_type,
   wi::mask (inter_prec, false,
 TYPE_PRECISION (inside_type))); })))

Previously VRP ended up with

  :
  _8 = x_2(D) ^ 1;

and now we have

  _7 = ~x_2(D);
  _9 = _7 & 1;

which is more expensive.  This means that we miss a

(bit_and (bit_not @0) INTEGER_CST@1)

-> (bit_xor @0 @1)

pattern that applies when VRP knows the range of x_2(D)
(all masked bits are known to be zero).

+/* Convert X * -C into -X * C.  */
+(simplify
+ (mult:c (convert? negate_expr_p@0) INTEGER_CST@1)
+  (if (tree_int_cst_sgn (@1) == -1)
+   (with { tree tem = const_unop (NEGATE_EXPR, type, @1); }
+(if (!TREE_OVERFLOW (tem) && wi::ne_p (tem, @1)
+ && tree_nop_conversion_p (type, TREE_TYPE (@0)))
+ (mult (convert (negate @0)) @1)

as said above match.pd negate_expr_p doesn't capture everything
fold-const.c does so moving the above isn't a good idea.

+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult:c (convert? (plus @0 @0)) (convert? @1))
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)
+(simplify
+ (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)

fold-const.c only handles constant C, so we only need the 2nd pattern.
Also the :c on the mult in that is not needed due to canonicalization rules.
Please build the result of the inner multiplication directly.
I think the fold-const.c code has an overflow issue when the outer
multiplication is signed and the inner addition unsigned.
(signed)((unsigned)INT_MAX + (unsigned)INT_MAX)*2 is valid but INT_MAX * 4
is not as it overflows.  So I think we should _not_ allow nop conversions
here (it's fine if all ops are signed or unsigned).

Richard.



> Thanks,
> Naveen
>
> ChangeLog
>
> 2015-10-29  Naveen H.S  
>
> * fold-const.c (fold_binary_loc) : Remove A - B -> A + (-B) if B
> is easily negatable as its already present.
> Move x * -C into -x * C if x is easily negatable to match.pd.
> Move (A + A) * C -> A * 2 * C to match.pd.
> Move Fold (X ^ Y) & Y as ~X & Y to match.pd.
> Move Fold (X ^ Y) & X as ~Y & X to match.pd.
> Move Fold X & (X ^ Y) as X & ~Y to match.pd.
> Move Fold X & (Y ^ X) as ~Y & X to match.pd.
>
> * match.pd (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1))):
> New simplifier.
> (mult:c (convert? negate_expr_p@0) INTEGER_CST@1): New simplifier.
> (mult:c (convert? (plus @0 @0)) (convert? @1)): New simplifier.
> (mult:c (convert? (plus @0 @0)) INTEGER_CST@1): New simplifier.
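
The bit identities discussed in this review can be checked with a few lines of plain C. This is just a sanity-check sketch with hypothetical helper names; it says nothing about how match.pd applies the patterns, and the last helper relies on the unsigned-to-signed conversion wrapping modulo 2^N, which is implementation-defined in C but is what GCC documents.

```c
#include <limits.h>

/* X & (X ^ Y) and its proposed replacement X & ~Y.  */
static unsigned and_xor (unsigned x, unsigned y) { return x & (x ^ y); }
static unsigned and_not (unsigned x, unsigned y) { return x & ~y; }

/* ~X & 1 and X ^ 1: these agree only when X is known to be 0 or 1,
   which is why range information is needed for the missing pattern.  */
static unsigned not_and_1 (unsigned x) { return ~x & 1u; }
static unsigned xor_1 (unsigned x) { return x ^ 1u; }

/* Compare both forms of the first identity over all 8-bit operand
   pairs; returns 1 if they always agree.  */
static int
check_and_xor_identity (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      if (and_xor (x, y) != and_not (x, y))
        return 0;
  return 1;
}

/* (signed)((unsigned)a + (unsigned)a) * c: the addition wraps in
   unsigned arithmetic, so only the outer signed multiply can overflow.
   Rewriting this as a * 2 * c in signed arithmetic would overflow for
   a == INT_MAX even when this form does not.  */
static int
doubled_then_scaled (int a, int c)
{
  return (int) ((unsigned) a + (unsigned) a) * c;
}
```

With a == INT_MAX and c == 2 the unsigned sum converts to -2 and the product is -4, whereas INT_MAX * 4 has no representable result, which matches the argument for not allowing nop conversions in the (A + A) * C pattern.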


Re: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.

2015-10-30 Thread Andrew Pinski
On Fri, Oct 30, 2015 at 6:06 PM, Kumar, Venkataramanan
 wrote:
> Hi Richard,
>
> I am trying to "if covert the store" in the below test case and later help it 
> to get vectorized under -Ofast -ftree-loop-if-convert-stores -fno-common
>
> #define LEN 4096
> __attribute__((aligned(32))) float array[LEN];
> void test()
> {
>   for (int i = 0; i < LEN; i++)
>     {
>       if (array[i] > (float)0.)
>         array[i] = 3;
>     }
> }
>
> Currently in GCC 5.2  does not vectorize it.
> https://goo.gl/9nS6Pd
>
> However ICC seems to vectorize it
> https://goo.gl/y1yGHx
>
> As discussed with you  earlier, to allow "if convert store"  I am checking 
> the following:
>
> (1) We already  read the reference "array[i]"  unconditionally once .
> (2) I am now checking  if we are conditionally writing to memory which is 
> defined as read and write and is bound to the definition we are seeing.


I don't think this is thread safe 

Thanks,
Andrew

>
> With this change, I get able to if convert and the vectorize the case also.
>
> /build/gcc/xgcc -B ./build/gcc/  ifconv.c -Ofast -fopt-info-vec  -S 
> -ftree-loop-if-convert-stores -fno-common
> ifconv.c:2:63: note: loop vectorized
>
> Patch
> --
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index f201ab5..6475cc0 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -727,6 +727,34 @@ write_memrefs_written_at_least_once (gimple *stmt,
>return true;
> }
>
> +static bool
> +write_memrefs_safe_to_access_unconditionally (gimple *stmt,
> + 
>   vec drs)
> +{
> +  int i;
> +  data_reference_p a;
> +  bool found = false;
> +
> +  for (i = 0; drs.iterate (i, ); i++)
> +{
> +  if (DR_STMT (a) == stmt
> +   && DR_IS_WRITE (a)
> +   && (DR_WRITTEN_AT_LEAST_ONCE (a) == 0)
> +   && (DR_RW_UNCONDITIONALLY (a) == 1))
> + {
> +   tree base = get_base_address (DR_REF (a));
> +   found = false;
> +   if (DECL_P (base)
> +   && decl_binds_to_current_def_p (base)
> +   && !TREE_READONLY (base))
> + {
> +   found = true;
> + }
> + }
> +}
> +  return found;
> +}
> +
> /* Return true when the memory references of STMT won't trap in the
> if-converted code.  There are two things that we have to check for:
>
> @@ -748,8 +776,20 @@ write_memrefs_written_at_least_once (gimple *stmt,
> static bool
> ifcvt_memrefs_wont_trap (gimple *stmt, vec refs)
> {
> -  return write_memrefs_written_at_least_once (stmt, refs)
> -&& memrefs_read_or_written_unconditionally (stmt, refs);
> +  bool memrefs_written_once, memrefs_read_written_unconditionally;
> +  bool memrefs_safe_to_access;
> +
> +  memrefs_written_once
> + = write_memrefs_written_at_least_once (stmt, refs);
> +
> +  memrefs_read_written_unconditionally
> + =  memrefs_read_or_written_unconditionally (stmt, refs);
> +
> +  memrefs_safe_to_access
> + = write_memrefs_safe_to_access_unconditionally (stmt, refs);
> +
> +  return ((memrefs_written_once || memrefs_safe_to_access)
> +&& memrefs_read_written_unconditionally);
> }
>
>  /* Wrapper around gimple_could_trap_p refined for the needs of the
>
>
> Do I still need the function write_memrefs_written_at_least_once?
> Please suggest if there is a better way to do this.
>
> Bootstrapped and regression tested on x86_64.
> I have not added a change log or comments yet, as I wanted to check the
> approach first.
>
> Regards,
> Venkat.
>
>


RE: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.

2015-10-30 Thread Kumar, Venkataramanan
Hi Andrew, 

> -Original Message-
> From: Andrew Pinski [mailto:pins...@gmail.com]
> Sent: Friday, October 30, 2015 3:38 PM
> To: Kumar, Venkataramanan
> Cc: Richard Beiner (richard.guent...@gmail.com); gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.
> 
> On Fri, Oct 30, 2015 at 6:06 PM, Kumar, Venkataramanan
>  wrote:
> > Hi Richard,
> >
> > I am trying to "if-convert the store" in the below test case and later
> > help it to get vectorized under -Ofast -ftree-loop-if-convert-stores
> > -fno-common
> >
> > #define LEN 4096
> > __attribute__((aligned(32))) float array[LEN];
> > void test() {
> >   for (int i = 0; i < LEN; i++) {
> >     if (array[i] > (float)0.)
> >       array[i] = 3;
> >   }
> > }
> >
> > Currently GCC 5.2 does not vectorize it.
> > https://goo.gl/9nS6Pd
> >
> > However ICC seems to vectorize it
> > https://goo.gl/y1yGHx
> >
> > As discussed with you earlier, to allow "if-convert store" I am checking the
> > following:
> >
> > (1) We already read the reference "array[i]" unconditionally once.
> > (2) I am now checking whether we conditionally write to memory that is
> > defined as read-write and is bound to the definition we are seeing.
> 
> 
> I don't think this is thread safe 
> 

I thought under -ftree-loop-if-convert-stores it is ok to do this 
transformation.

Regards,
Venkat.
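
For readers following the thread-safety concern, here is a minimal C++ sketch of what if-converting the store amounts to at the source level. The function names are illustrative and not taken from GCC, and this models the effect by hand rather than showing the GIMPLE the pass produces:

```cpp
#include <cassert>

// Original shape: the store happens only on iterations where the
// condition holds, so untouched elements are never written.
void store_branchy (float *a, int n)
{
  assert (a != nullptr || n == 0);
  for (int i = 0; i < n; i++)
    if (a[i] > 0.0f)
      a[i] = 3.0f;
}

// If-converted shape (hypothetical hand-written equivalent): every
// iteration stores, writing the old value back when the condition is
// false.  This is branch-free and easy to vectorize, but it introduces
// writes that the original program never performed.
void store_if_converted (float *a, int n)
{
  assert (a != nullptr || n == 0);
  for (int i = 0; i < n; i++)
    {
      float old = a[i];
      a[i] = old > 0.0f ? 3.0f : old;
    }
}
```

In a single thread both functions leave the array in the same state. The difference is only visible to another thread that updates an element the first version would not have touched, since the second version may overwrite that update with the stale value.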

> Thanks,
> Andrew
> 
> >
> > With this change, I am able to if-convert and then vectorize the case as well.
> >
> > /build/gcc/xgcc -B ./build/gcc/  ifconv.c -Ofast -fopt-info-vec  -S
> > -ftree-loop-if-convert-stores -fno-common
> > ifconv.c:2:63: note: loop vectorized
> >
> > Patch
> > --
> > diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> > index f201ab5..6475cc0 100644
> > --- a/gcc/tree-if-conv.c
> > +++ b/gcc/tree-if-conv.c
> > @@ -727,6 +727,34 @@ write_memrefs_written_at_least_once (gimple *stmt,
> >return true;
> > }
> >
> > +static bool
> > +write_memrefs_safe_to_access_unconditionally (gimple *stmt,
> > +                                              vec<data_reference_p> drs)
> > +{
> > +  int i;
> > +  data_reference_p a;
> > +  bool found = false;
> > +
> > +  for (i = 0; drs.iterate (i, &a); i++)
> > +{
> > +  if (DR_STMT (a) == stmt
> > +   && DR_IS_WRITE (a)
> > +   && (DR_WRITTEN_AT_LEAST_ONCE (a) == 0)
> > +   && (DR_RW_UNCONDITIONALLY (a) == 1))
> > + {
> > +   tree base = get_base_address (DR_REF (a));
> > +   found = false;
> > +   if (DECL_P (base)
> > +   && decl_binds_to_current_def_p (base)
> > +   && !TREE_READONLY (base))
> > + {
> > +   found = true;
> > + }
> > + }
> > +}
> > +  return found;
> > +}
> > +
> > /* Return true when the memory references of STMT won't trap in the
> > if-converted code.  There are two things that we have to check for:
> >
> > @@ -748,8 +776,20 @@ write_memrefs_written_at_least_once (gimple *stmt,
> > static bool
> > ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> refs)
> > {
> > -  return write_memrefs_written_at_least_once (stmt, refs)
> > -&& memrefs_read_or_written_unconditionally (stmt, refs);
> > +  bool memrefs_written_once, memrefs_read_written_unconditionally;
> > +  bool memrefs_safe_to_access;
> > +
> > +  memrefs_written_once
> > + = write_memrefs_written_at_least_once (stmt, refs);
> > +
> > +  memrefs_read_written_unconditionally
> > + =  memrefs_read_or_written_unconditionally (stmt, refs);
> > +
> > +  memrefs_safe_to_access
> > + = write_memrefs_safe_to_access_unconditionally (stmt, refs);
> > +
> > +  return ((memrefs_written_once || memrefs_safe_to_access)
> > +&& memrefs_read_written_unconditionally);
> > }
> >
> >  /* Wrapper around gimple_could_trap_p refined for the needs of the
> >
> >
> > Do I still need the function write_memrefs_written_at_least_once?
> > Please suggest if there is a better way to do this.
> >
> > Bootstrapped and regression tested on x86_64.
> > I have not added a change log or comments yet, as I wanted to check the
> > approach first.
> >
> > Regards,
> > Venkat.
> >
> >


Re: [PATCH 00/16] Unit tests framework (v3)

2015-10-30 Thread Bernd Schmidt

On 10/29/2015 08:21 PM, Jeff Law wrote:

Excellent point.  I think this is worth some serious thought.  Given the
state of GCC's sources, tests of this nature are going to be inherently
tied to implementation details/sources rather than interfaces.  That's
obviously not ideal, but it is where we are.  Combined with the
cleanups/refactoring I think we ought to be doing, we've got a fairly
strong argument to set these along side the sources.

The counter is that when grepping, you should probably be using
find/xargs grep :-)


There's actually a tool called ack which automates that. But we've often 
seen cases where people fail to spot occurrences in config/ directories. 
I think tests for things like bitmap or wide-int could well live at the 
end of the respective source files - this would be the most convenient 
location so that if you add a new bitmap function, you can immediately 
add tests as well.


Do we even support plugins on every host?


The tests you have so far are focused mostly on high-level gimple/tree
tests where this limitation is probably not showing up very much, but I
think it would be better to have something that allows us to have more
in-depth tests.

Yes.  But I think this level of testing is on another point in the
testing continuum and will probably require some significant work beyond
the unit testing being proposed by David.


Not sure about significant - it shouldn't take long to set up an extra 
test-gcc build at the top-level if we decide to go with that and add a 
-ftest option. I think it's worth spending some time thinking long-term 
about what the best way to go about this would be.



Bernd



Re: Multiply Optimization in match and Simplify

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 11:26 AM, Marc Glisse  wrote:
> On Fri, 30 Oct 2015, Richard Biener wrote:
>
>> +/* Convert (A + A) * C -> A * 2 * C.  */
>> +(simplify
>> + (mult:c (convert? (plus @0 @0)) (convert? @1))
>> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
>> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)
>> +(simplify
>> + (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
>> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
>> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)
>>
>> fold-const.c only handles constant C, so we only need the 2nd pattern.
>> Also the :c on the mult in that is not needed due to canonicalization
>> rules.
>> Please build the result of the inner multiplication directly.
>> I think the fold-const.c code has an overflow issue when the outer
>> multiplication
>> is signed and the inner addition unsigned.  (signed)((unsigned)INT_MAX
>> + (unsigned)INT_MAX)*2
>> is valid but INT_MAX * 4 is not as it overflows.  So I think we should
>> _not_ allow
>> nop conversions here (it's fine if all ops are signed or unsigned).
>
>
> Is there a reason why the simple transformation A+A->2*A is not
> generally a desirable canonicalization? We currently restrict it to
> SCALAR_FLOAT_TYPE_P.

No special reason I know of.

>
> --
> Marc Glisse


Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-10-30 Thread Eric Botcazou
> I think you want to use wide-ints here and
> 
>wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN
> (...)), TYPE_SIGN (TYPE_DOMAIN (..)));
>wide_int maxidx = ...
> 
> you can then simply iterate minidx with ++ and do the final compare
> against maxidx
> with while (++idx <= maxidx).  For the array ref index we want to use
> TYPE_DOMAIN
> as type as well, not size_int.  Thus wide_int_to_tree (TYPE_DOMAIN
> (...)..idx).

Yes, you generally cannot use HOST_WIDE_INT to avoid overflow because this 
will break for 64-bit HOST_WIDE_INT and 32-bit sizetype in corner cases.

But using offset_int should be OK, see for example get_ref_base_and_extent.

-- 
Eric Botcazou
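
The signedness point above can be illustrated outside GCC with plain fixed-width integers. This is only a sketch under stated assumptions: the 32-bit index type stands in for the array's TYPE_DOMAIN, the 64-bit counter stands in for a wider type such as offset_int, and both function names are made up for the example:

```cpp
#include <cassert>
#include <cstdint>

// Bounds are widened while keeping their sign, so a domain such as
// [-2, 2] is walked correctly and idx cannot overflow at max_idx.
long long count_elements_signed (int32_t min_idx, int32_t max_idx)
{
  long long count = 0;
  for (int64_t idx = min_idx; idx <= max_idx; ++idx)
    ++count;
  return count;
}

// Same loop, but the bounds are first reinterpreted as unsigned, which
// is what happens when a signed domain is treated as a size.  A
// negative lower bound becomes a huge value and the loop never runs.
long long count_elements_unsigned (int32_t min_idx, int32_t max_idx)
{
  uint32_t lo = static_cast<uint32_t> (min_idx);
  uint32_t hi = static_cast<uint32_t> (max_idx);
  long long count = 0;
  for (uint64_t idx = lo; idx <= hi; ++idx)
    ++count;
  return count;
}
```

For non-negative bounds the two agree; for the domain [-2, 2] the signed version visits five elements and the unsigned one visits none.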


Try to update dominance info in tree-call-cdce.c

2015-10-30 Thread Richard Sandiford
The pass would free the dominance info after making a change, but it
should be pretty easy to keep the information up-to-date when the call
has no EH edges.  In a way the main hurdle was split_block, which seemed
to assume that the new block would postdominate the old one, and that
all blocks immediately dominated by the old block are now immediately
dominated by the new one.
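
As a rough illustration of the assumption described above, the following C++ sketch models immediate dominators as a plain array. The DomTree type and its layout are hypothetical and far simpler than GCC's dominance structures; only the order of the two updates is meant to mirror split_block_1:

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree: idom[b] is the immediate dominator of block b,
// with -1 for the entry block.
struct DomTree
{
  std::vector<int> idom;

  // Split block BB and return the index of the new block that takes
  // over BB's tail.  With ASSUME_POSTDOM, every block that BB used to
  // immediately dominate is handed over to the new block; otherwise
  // those blocks keep BB as their immediate dominator.
  int split_block (int bb, bool assume_postdom)
  {
    assert (bb >= 0 && bb < static_cast<int> (idom.size ()));
    int new_bb = static_cast<int> (idom.size ());
    if (assume_postdom)
      for (int &d : idom)
        if (d == bb)
          d = new_bb;
    idom.push_back (bb);  // the old block dominates the new one
    return new_bb;
  }
};
```

When the new block does not postdominate the old one, as with a guarded call that can be skipped, passing false keeps the existing children attached to the old block, which is the case the extra flag is there to express.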

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* cfghooks.h (split_block): Add a flag to say whether all blocks
dominated by the old block are now dominated by the new one.
* cfghooks.c (split_block_1, split_block): Likewise.
(split_block_after_labels): Update accordingly.
* tree-call-cdce.c (shrink_wrap_one_built_in_call): Try to update
the dominance info; free it if we can't.
(pass_call_cdce::execute): Don't free the dominance info here.

diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 2c5c96c..82c427f 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -483,12 +483,18 @@ redirect_edge_and_branch_force (edge e, basic_block dest)
   return ret;
 }
 
-/* Splits basic block BB after the specified instruction I (but at least after
-   the labels).  If I is NULL, splits just after labels.  The newly created edge
-   is returned.  The new basic block is created just after the old one.  */
+/* Splits basic block BB after the specified instruction I (but at least
+   after the labels).  If I is NULL, splits just after labels.
+
+   The newly created edge is returned.  The new basic block is created
+   just after the old one.  It is assumed that the old block will dominate
+   the new one; the caller can use set_immediate_dominator if this
+   assumption is wrong.  If ASSUME_POSTDOM, it is further assumed that
+   the new block will postdominate the old block, so that all blocks
+   dominated by the old block are now also dominated by the new one.  */
 
 static edge
-split_block_1 (basic_block bb, void *i)
+split_block_1 (basic_block bb, void *i, bool assume_postdom)
 {
   basic_block new_bb;
   edge res;
@@ -506,7 +512,8 @@ split_block_1 (basic_block bb, void *i)
 
   if (dom_info_available_p (CDI_DOMINATORS))
 {
-  redirect_immediate_dominators (CDI_DOMINATORS, bb, new_bb);
+  if (assume_postdom)
+   redirect_immediate_dominators (CDI_DOMINATORS, bb, new_bb);
   set_immediate_dominator (CDI_DOMINATORS, new_bb, bb);
 }
 
@@ -533,15 +540,15 @@ split_block_1 (basic_block bb, void *i)
 }
 
 edge
-split_block (basic_block bb, gimple *i)
+split_block (basic_block bb, gimple *i, bool assume_postdom)
 {
-  return split_block_1 (bb, i);
+  return split_block_1 (bb, i, assume_postdom);
 }
 
 edge
-split_block (basic_block bb, rtx i)
+split_block (basic_block bb, rtx i, bool assume_postdom)
 {
-  return split_block_1 (bb, i);
+  return split_block_1 (bb, i, assume_postdom);
 }
 
 /* Splits block BB just after labels.  The newly created edge is returned.  */
@@ -549,7 +556,7 @@ split_block (basic_block bb, rtx i)
 edge
 split_block_after_labels (basic_block bb)
 {
-  return split_block_1 (bb, NULL);
+  return split_block_1 (bb, NULL, true);
 }
 
 /* Moves block BB immediately after block AFTER.  Returns false if the
diff --git a/gcc/cfghooks.h b/gcc/cfghooks.h
index a0cb6fd..8f6b465 100644
--- a/gcc/cfghooks.h
+++ b/gcc/cfghooks.h
@@ -208,8 +208,8 @@ extern edge redirect_edge_succ_nodup (edge, basic_block);
 extern bool can_remove_branch_p (const_edge);
 extern void remove_branch (edge);
 extern void remove_edge (edge);
-extern edge split_block (basic_block, rtx);
-extern edge split_block (basic_block, gimple *);
+extern edge split_block (basic_block, rtx, bool = true);
+extern edge split_block (basic_block, gimple *, bool = true);
 extern edge split_block_after_labels (basic_block);
 extern bool move_block_after (basic_block, basic_block);
 extern void delete_basic_block (basic_block);
diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index 57be8a4..72828dd 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -735,6 +735,32 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   if (nconds == 0)
 return false;
 
+  /* The cfg we want to create looks like this:
+
+  [guard n-1] <- guard_bb (old block)
+|\
+| [guard n-2]   }
+|/ \}
+|   /  ...  } new blocks
+|  /  [guard 0] }
+| //   |}
+   [ call ]| <- bi_call_bb  }
+| \|
+|  \   |
+|   [ join ] <- join_tgt_bb (old iff call must end bb)
+|
+possible EH edges (only if [join] is old)
+
+ When [join] is new, the immediate dominators for these blocks are:
+
+ 1. [guard n-1]: unchanged
+ 2. [call]: [guard n-1]
+ 3. [guard m]: [guard m+1] for 0 <= m 

Try to avoid mark_virtual_operands_for_renaming in call-cdce

2015-10-30 Thread Richard Sandiford
It's fairly easy to update the virtual ops when the call has no EH edges,
which should be cheaper than mark_virtual_operands_for_renaming.

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* tree-call-cdce.c (join_vdefs): New function.
(shrink_wrap_one_built_in_call): Use it when the call does not
need to end a block.  Call mark_virtual_operands_for_renaming
otherwise.
(pass_call_cdce::execute): Don't call
mark_virtual_operands_for_renaming here.

diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index 72828dd..dcaa974 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -708,6 +708,23 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds,
 /* Probability of the branch (to the call) is taken.  */
 #define ERR_PROB 0.01
 
+/* Replace BI_CALL's vdef with a phi that uses BI_CALL's definition
+   when WITH_CALL is taken and the previous definition when WITHOUT_CALL
+   is taken.  JOIN_BB is the target of both edges.  */
+
+static void
+join_vdefs (gcall *bi_call, edge with_call, edge without_call,
+   basic_block join_bb)
+{
+  tree old_vuse = gimple_vuse (bi_call);
+  tree old_vdef = gimple_vdef (bi_call);
+  tree new_vdef = copy_ssa_name (old_vuse);
+  gphi *phi = create_phi_node (new_vdef, join_bb);
+  replace_uses_by (old_vdef, new_vdef);
+  add_phi_arg (phi, old_vuse, without_call, UNKNOWN_LOCATION);
+  add_phi_arg (phi, old_vdef, with_call, UNKNOWN_LOCATION);
+}
+
 /* The function to shrink wrap a partially dead builtin call
whose return value is not used anywhere, but has to be kept
live due to potential error condition.  Returns true if the
@@ -764,7 +781,8 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   bi_call_bb = gimple_bb (bi_call);
 
   /* Now find the join target bb -- split bi_call_bb if needed.  */
-  if (stmt_ends_bb_p (bi_call))
+  bool call_must_end_bb_p = stmt_ends_bb_p (bi_call);
+  if (call_must_end_bb_p)
 {
   /* If the call must be the last in the bb, don't split the block,
 it could e.g. have EH edges.  */
@@ -822,6 +840,12 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   join_tgt_in_edge_fall_thru->count =
   guard_bb->count - bi_call_in_edge0->count;
 
+  if (call_must_end_bb_p)
+mark_virtual_operands_for_renaming (cfun);
+  else
+join_vdefs (bi_call, join_tgt_in_edge_from_call,
+   join_tgt_in_edge_fall_thru, join_tgt_bb);
+
   /* Code generation for the rest of the conditions  */
   basic_block prev_guard_bb = NULL;
   while (nconds > 0)
@@ -978,9 +1002,6 @@ pass_call_cdce::execute (function *fun)
   if (something_changed)
 {
   free_dominance_info (CDI_POST_DOMINATORS);
-  /* As we introduced new control-flow we need to insert PHI-nodes
- for the call-clobbers of the remaining call.  */
-  mark_virtual_operands_for_renaming (fun);
   return TODO_update_ssa;
 }
 



Re: [PATCH] Allow more pointer-plus folding

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 9:58 AM, Tom de Vries  wrote:
> [ was: Re: [PATCH] Don't handle CAST_RESTRICT (PR tree-optimization/49279)
> ]
>
> On 29/10/15 12:38, Richard Biener wrote:
>>
>> On Thu, Oct 29, 2015 at 11:38 AM, Tom de Vries 
>> wrote:
>>>
>>> [ quote-pasted from
>>> https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00464.html
>>> ]
>>>
 CAST_RESTRICT based disambiguation unfortunately isn't reliable,
 e.g. to store a non-restrict pointer into a restricted field,
 we add a non-useless cast to restricted pointer in the gimplifier,
 and while we don't consider that field to have a special restrict tag
 because it is unsafe to do so, we unfortunately create it for the
 CAST_RESTRICT before that and end up with different restrict tags
 for the same thing.  See the PR for more details.

 This patch turns off CAST_RESTRICT handling for now, in the future
 we might try to replace it by explicit CAST_RESTRICT stmts in some form,
 but need to solve problems with multiple inlined copies of the same
 function
 with restrict arguments or restrict variables in it and intermixed code
 from
 them (or similarly code from different non-overlapping source blocks).

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 4.6 too?

 2011-10-06  Jakub Jelinek  

  PR tree-optimization/49279
  * tree-ssa-structalias.c (find_func_aliases): Don't handle
  CAST_RESTRICT.
  * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Allow
  restrict propagation.
  * tree-ssa.c (useless_type_conversion_p): Don't return false
  if TYPE_RESTRICT differs.

  * gcc.dg/tree-ssa/restrict-4.c: XFAIL.
  * gcc.c-torture/execute/pr49279.c: New test.
>>>
>>>
>>>
>>> Hi,
>>>
>>> In the patch adding support for CAST_RESTRICT (
>>> https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00176.html ) there was also
>>> a
>>> bit:
>>> ...
>>>  * fold-const.c (fold_unary_loc): Don't optimize
>>>  POINTER_PLUS_EXPR casted to TYPE_RESTRICT pointer by
>>>  casting the inner pointer if it isn't TYPE_RESTRICT.
>>> ...
>>> which is still around. I suppose we can remove this bit as well.
>>>
>>> OK for trunk if bootstrap and reg-test succeeds?
>>
>>
>> Ok.
>
>
> Committed.
>
>> I think the checks on TREE_OPERAND (arg0, 1) are bogus though
>> and either we should unconditionally sink the conversion or only
>> if a conversion on TREE_OPERAND (arg0, 0) vanishes (I prefer the
>> latter).
>>
>
> Like this? OK for trunk if bootstrap/reg-test succeeds?

Ok with using CONVERT_EXPR_P (TREE_OPERAND (arg0, 0)) instead of
an explicit NOP_EXPR check.

Thanks,
Richard.

> Thanks,
> - Tom
>


Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-30 Thread Eric Botcazou
> > But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't
> > remember what exactly we gain from it (when not done on registers).
> 
> I guess gain is really limited to Ada - there are very few cases we do VCE
> otherwise. (I think we could do more of them).  We can make
> useless_type_conversion NOP/CONVERT only. That in fact makes quite a sense
> because those are types with gimple operations on it.  Perhaps also VCE on
> vectors, but not VCE in general.

FWIW that's fine with me.  Yes, Ada tends to generate a lot of VCEs but I try 
to get rid of the useless ones as much as I can so assistance from the middle-
end is not really required.  I'll test Richard's patch and install it if the 
outcome is positive (unless you want to do the vector thing right away).

-- 
Eric Botcazou


Re: [PATCH] libitm: Support sized delete.

2015-10-30 Thread Torvald Riegel
On Thu, 2015-10-29 at 12:38 -0700, Richard Henderson wrote:
> On 10/29/2015 11:19 AM, Torvald Riegel wrote:
> > diff --git a/libitm/libitm.map b/libitm/libitm.map
> > index 21bcfdf..7fc9a41 100644
> > --- a/libitm/libitm.map
> > +++ b/libitm/libitm.map
> > @@ -168,10 +168,12 @@ LIBITM_1.0 {
> > _ZGTtnw?;
> > _ZGTtna?;
> > _ZGTtdlPv;
> > +   _ZGTtdlPv?;
> > _ZGTtdaPv;
> > _ZGTtnw?RKSt9nothrow_t;
> > _ZGTtna?RKSt9nothrow_t;
> > _ZGTtdlPvRKSt9nothrow_t;
> > +   _ZGTtdlPv?RKSt9nothrow_t;
> > _ZGTtdaPvRKSt9nothrow_t;
> >  
> > _ITM_cxa_allocate_exception;
> 
> Everything looks good except for this part.  The new symbols need to go into a
> new symbol version.  C.f. libatomic.map for the syntax.

Ah, thanks.  OK in the updated patch that's attached?

I've also looked at the sized delete paper again, and we should be able
to call the underlying unsized delete even if the compiler issued a
call to the sized delete (ie, to answer my own question).  However, it's
not difficult or too much overhead to call the right version, so I'll
just keep the patch the way it is.
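
A simplified model of the bookkeeping this patch adds, written as a C++ sketch. The alloc_action struct below is a stand-in with field names borrowed from the patch; it is an assumption for illustration and not the real gtm_alloc_action, and the free functions replace the gtm_thread members:

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a deferred allocation action.
struct alloc_action
{
  void (*free_fn) (void *) = nullptr;
  void (*free_fn_sz) (void *, std::size_t) = nullptr;
  std::size_t sz = 0;
  bool allocated = false;
  bool sized_delete = false;
};

// Record a deferred unsized delete.
void forget_allocation (alloc_action &a, void (*free_fn) (void *))
{
  a.free_fn = free_fn;
  a.allocated = false;
  a.sized_delete = false;
}

// Record a deferred sized delete, remembering the size to pass later.
void forget_allocation (alloc_action &a, std::size_t sz,
                        void (*free_fn_sz) (void *, std::size_t))
{
  a.free_fn_sz = free_fn_sz;
  a.allocated = false;
  a.sized_delete = true;
  a.sz = sz;
}

// Run the deferred action if it applies to this outcome, dispatching
// to whichever variant of delete was recorded.
void commit_action (alloc_action &a, void *ptr, bool revert)
{
  if (a.allocated == revert)
    {
      if (a.sized_delete)
        {
          assert (a.free_fn_sz != nullptr);
          a.free_fn_sz (ptr, a.sz);
        }
      else
        {
          assert (a.free_fn != nullptr);
          a.free_fn (ptr);
        }
    }
}
```

The point of carrying sized_delete and sz is that the deallocation function eventually called matches the one the compiler originally emitted, rather than always falling back to the unsized form.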
commit 1125e1b96cddf71b907ff382898239f09410d48e
Author: Torvald Riegel 
Date:   Thu Oct 29 18:52:20 2015 +0100

Support sized delete.

This adds transactional clones of the sized version of operator delete.

diff --git a/libitm/alloc.cc b/libitm/alloc.cc
index bb292da..a72848d 100644
--- a/libitm/alloc.cc
+++ b/libitm/alloc.cc
@@ -37,6 +37,7 @@ gtm_thread::record_allocation (void *ptr, void (*free_fn)(void *))
 
   a->free_fn = free_fn;
   a->allocated = true;
+  a->sized_delete = false;
 }
 
 void
@@ -50,6 +51,23 @@ gtm_thread::forget_allocation (void *ptr, void (*free_fn)(void *))
 
   a->free_fn = free_fn;
   a->allocated = false;
+  a->sized_delete = false;
+}
+
+void
+gtm_thread::forget_allocation (void *ptr, size_t sz,
+			   void (*free_fn_sz)(void *, size_t))
+{
+  uintptr_t iptr = (uintptr_t) ptr;
+
+  gtm_alloc_action *a = this->alloc_actions.find(iptr);
+  if (a == 0)
+a = this->alloc_actions.insert(iptr);
+
+  a->free_fn_sz = free_fn_sz;
+  a->allocated = false;
+  a->sized_delete = true;
+  a->sz = sz;
 }
 
 namespace {
@@ -102,7 +120,12 @@ commit_allocations_1 (uintptr_t key, gtm_alloc_action *a, void *cb_data)
   uintptr_t revert_p = (uintptr_t) cb_data;
 
   if (a->allocated == revert_p)
-a->free_fn (ptr);
+{
+  if (a->sized_delete)
+	a->free_fn_sz (ptr, a->sz);
+  else
+	a->free_fn (ptr);
+}
 }
 
 /* Permanently commit allocated memory during transaction.
diff --git a/libitm/alloc_cpp.cc b/libitm/alloc_cpp.cc
index 8514618..13185a7 100644
--- a/libitm/alloc_cpp.cc
+++ b/libitm/alloc_cpp.cc
@@ -35,41 +35,50 @@ using namespace GTM;
 
 #define _ZnwX			S(_Znw,MANGLE_SIZE_T)
 #define _ZnaX			S(_Zna,MANGLE_SIZE_T)
+#define _ZdlPvX			S(_ZdlPv,MANGLE_SIZE_T)
 #define _ZnwXRKSt9nothrow_t	S(S(_Znw,MANGLE_SIZE_T),RKSt9nothrow_t)
 #define _ZnaXRKSt9nothrow_t	S(S(_Zna,MANGLE_SIZE_T),RKSt9nothrow_t)
+#define _ZdlPvXRKSt9nothrow_t	S(S(_ZdlPv,MANGLE_SIZE_T),RKSt9nothrow_t)
 
 #define _ZGTtnwX		S(_ZGTtnw,MANGLE_SIZE_T)
 #define _ZGTtnaX		S(_ZGTtna,MANGLE_SIZE_T)
+#define _ZGTtdlPvX		S(_ZGTtdlPv,MANGLE_SIZE_T)
 #define _ZGTtnwXRKSt9nothrow_t	S(S(_ZGTtnw,MANGLE_SIZE_T),RKSt9nothrow_t)
 #define _ZGTtnaXRKSt9nothrow_t	S(S(_ZGTtna,MANGLE_SIZE_T),RKSt9nothrow_t)
+#define _ZGTtdlPvXRKSt9nothrow_t S(S(_ZGTtdlPv,MANGLE_SIZE_T),RKSt9nothrow_t)
 
 /* Everything from libstdc++ is weak, to avoid requiring that library
to be linked into plain C applications using libitm.so.  */
 
 extern "C" {
 
-extern void *_ZnwX (size_t) __attribute__((weak));
-extern void _ZdlPv (void *) __attribute__((weak));
-extern void *_ZnaX (size_t) __attribute__((weak));
-extern void _ZdaPv (void *) __attribute__((weak));
+extern void *_ZnwX  (size_t) __attribute__((weak));
+extern void _ZdlPv  (void *) __attribute__((weak));
+extern void _ZdlPvX (void *, size_t) __attribute__((weak));
+extern void *_ZnaX  (size_t) __attribute__((weak));
+extern void _ZdaPv  (void *) __attribute__((weak));
 
 typedef const struct nothrow_t { } *c_nothrow_p;
 
 extern void *_ZnwXRKSt9nothrow_t (size_t, c_nothrow_p) __attribute__((weak));
 extern void _ZdlPvRKSt9nothrow_t (void *, c_nothrow_p) __attribute__((weak));
+extern void _ZdlPvXRKSt9nothrow_t
+(void *, size_t, c_nothrow_p) __attribute__((weak));
 extern void *_ZnaXRKSt9nothrow_t (size_t, c_nothrow_p) __attribute__((weak));
 extern void _ZdaPvRKSt9nothrow_t (void *, c_nothrow_p) __attribute__((weak));
 
 #if !defined (HAVE_ELF_STYLE_WEAKREF) 
-void *_ZnwX (size_t) { return NULL; }
-void _ZdlPv (void *) { return; }
-void *_ZnaX (size_t) { return NULL; }
-void _ZdaPv (void *) { return; }
-
-void *_ZnwXRKSt9nothrow_t (size_t, c_nothrow_p) { return NULL; }
-void _ZdlPvRKSt9nothrow_t (void *, c_nothrow_p) { return; }
-void *_ZnaXRKSt9nothrow_t (size_t, c_nothrow_p) { return NULL; }
-void _ZdaPvRKSt9nothrow_t (void *, c_nothrow_p) { return; }
+void *_ZnwX  

Re: Try to avoid mark_virtual_operands_for_renaming in call-cdce

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
 wrote:
> It's fairly easy to update the virtual ops when the call has no EH edges,
> which should be cheaper than mark_virtual_operands_for_renaming.
>
> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
> OK to install?

Well.  I think this can be easily improved to handle the EH edge case
by not replacing the virtual uses in the EH region.

Btw, did you verify the pass does things correctly when facing an EH
throwing situation?  It seems all math builtins are marked as NOTHROW
regardless of -fnon-call-exceptions ...

Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree-call-cdce.c (join_vdefs): New function.
> (shrink_wrap_one_built_in_call): Use it when the call does not
> need to end a block.  Call mark_virtual_operands_for_renaming
> otherwise.
> (pass_call_cdce::execute): Don't call
> mark_virtual_operands_for_renaming here.
>
> diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
> index 72828dd..dcaa974 100644
> --- a/gcc/tree-call-cdce.c
> +++ b/gcc/tree-call-cdce.c
> @@ -708,6 +708,23 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds,
>  /* Probability of the branch (to the call) is taken.  */
>  #define ERR_PROB 0.01
>
> +/* Replace BI_CALL's vdef with a phi that uses BI_CALL's definition
> +   when WITH_CALL is taken and the previous definition when WITHOUT_CALL
> +   is taken.  JOIN_BB is the target of both edges.  */
> +
> +static void
> +join_vdefs (gcall *bi_call, edge with_call, edge without_call,
> +   basic_block join_bb)
> +{
> +  tree old_vuse = gimple_vuse (bi_call);
> +  tree old_vdef = gimple_vdef (bi_call);
> +  tree new_vdef = copy_ssa_name (old_vuse);
> +  gphi *phi = create_phi_node (new_vdef, join_bb);
> +  replace_uses_by (old_vdef, new_vdef);
> +  add_phi_arg (phi, old_vuse, without_call, UNKNOWN_LOCATION);
> +  add_phi_arg (phi, old_vdef, with_call, UNKNOWN_LOCATION);
> +}
> +
>  /* The function to shrink wrap a partially dead builtin call
> whose return value is not used anywhere, but has to be kept
> live due to potential error condition.  Returns true if the
> @@ -764,7 +781,8 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
>bi_call_bb = gimple_bb (bi_call);
>
>/* Now find the join target bb -- split bi_call_bb if needed.  */
> -  if (stmt_ends_bb_p (bi_call))
> +  bool call_must_end_bb_p = stmt_ends_bb_p (bi_call);
> +  if (call_must_end_bb_p)
>  {
>/* If the call must be the last in the bb, don't split the block,
>  it could e.g. have EH edges.  */
> @@ -822,6 +840,12 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
>join_tgt_in_edge_fall_thru->count =
>guard_bb->count - bi_call_in_edge0->count;
>
> +  if (call_must_end_bb_p)
> +mark_virtual_operands_for_renaming (cfun);
> +  else
> +join_vdefs (bi_call, join_tgt_in_edge_from_call,
> +   join_tgt_in_edge_fall_thru, join_tgt_bb);
> +
>/* Code generation for the rest of the conditions  */
>basic_block prev_guard_bb = NULL;
>while (nconds > 0)
> @@ -978,9 +1002,6 @@ pass_call_cdce::execute (function *fun)
>if (something_changed)
>  {
>free_dominance_info (CDI_POST_DOMINATORS);
> -  /* As we introduced new control-flow we need to insert PHI-nodes
> - for the call-clobbers of the remaining call.  */
> -  mark_virtual_operands_for_renaming (fun);
>return TODO_update_ssa;
>  }
>
>


Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Bernd Schmidt

On 10/30/2015 12:48 PM, tbsaunde+...@tbsaunde.org wrote:

-#ifdef ADJUST_FIELD_ALIGN
-  desired_align = ADJUST_FIELD_ALIGN (type, desired_align);
+#if defined __arc__ || defined _AIX
+  if (TYPE_MODE (strip_array_types (TREE_TYPE (type))) == DFmode)
+desired_align = MIN (desired_align, 32);
+#elif __POWERPC__ && __APPLE__
+  if (desired_align != 128)
+desired_align = MIN (desired_align, 32);
  #endif


No way. We never use this kind of test in target-independent code.


Bernd


Re: [PATCH][AArch64] Replace insn to zero up DF register

2015-10-30 Thread Marcus Shawcroft
On 20 October 2015 at 00:40, Evandro Menezes  wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".
>
> This patch modifies the respective pattern.


Hi Evandro,

This patch changes the generic, microarchitecture-independent instruction
selection. The ARM ARM (C3.5.3) makes a specific recommendation about
the choice of instruction in this situation and the current
implementation in GCC follows that recommendation.  Wilco has also
picked up on this issue; he has the same patch internal to ARM along
with an ongoing discussion with ARM architecture folk regarding this
recommendation.  I'm reluctant to take this patch right now on the
basis that it runs contrary to ARM ARM recommendation pending the
conclusion of Wilco's discussion with ARM architecture folk.

Cheers
/Marcus


[PATCH 2/5] remove usage of ROUND_TYPE_SIZE from encoding.c

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc got rid of this target macro in 2003, so it seems safe to assume the
alternate path works fine on all targets.
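
For reference, the path that remains rounds the record size up to a multiple of the record alignment. The arithmetic can be sketched in C++ as follows; round_up is an illustrative helper written for this note and is not libobjc's ROUND macro:

```cpp
#include <cassert>
#include <cstddef>

// Round SIZE up to the next multiple of ALIGN.  ALIGN must be nonzero;
// it does not have to be a power of two.
std::size_t round_up (std::size_t size, std::size_t align)
{
  assert (align > 0);
  return align * ((size + align - 1) / align);
}
```

For example, a 13-unit record with 8-unit alignment is padded to 16, while a size that is already a multiple of the alignment is left unchanged.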

libobjc/ChangeLog:

2015-10-30  Trevor Saunders  

PR libobjc/24775
* encoding.c (objc_layout_finish_structure): Remove usage of
ROUND_TYPE_SIZE.
---
 libobjc/encoding.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/libobjc/encoding.c b/libobjc/encoding.c
index abb6145..7de768f 100644
--- a/libobjc/encoding.c
+++ b/libobjc/encoding.c
@@ -1245,14 +1245,8 @@ void objc_layout_finish_structure (struct objc_struct_layout *layout,
   layout->record_align = MAX (1, layout->record_align);
 #endif
 
-#ifdef ROUND_TYPE_SIZE
-  layout->record_size = ROUND_TYPE_SIZE (layout->original_type,
- layout->record_size,
- layout->record_align);
-#else
   /* Round the size up to be a multiple of the required alignment */
   layout->record_size = ROUND (layout->record_size, layout->record_align);
-#endif
 
   layout->type = NULL;
 }
-- 
2.6.2



[PATCH 1/5] 2015-01-25 Paul Thomas <pa...@gcc.gnu.org>

2015-10-30 Thread tbsaunde+gcc
From: pault 

PR fortran/67171
* trans-array.c (structure_alloc_comps): On deallocation of
class components, reset the vptr to the declared type vtable
and reset the _len field of unlimited polymorphic components.
*trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
allocatable component references to the right of part reference
with non-zero rank and return NULL.
(gfc_reset_vptr): Simplify this function by using the function
gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
(gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
NULL return.
* trans-stmt.c (gfc_trans_allocate): Rely on the use of
gfc_trans_assignment if expr3 is a variable expression since
this deals correctly with array sections.

2015-01-25  Paul Thomas  

PR fortran/67171
* gfortran.dg/allocate_with_source_12.f03: New test

PR fortran/61819
* gfortran.dg/allocate_with_source_13.f03: New test

PR fortran/61830
* gfortran.dg/allocate_with_source_14.f03: New test


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229303 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog  |  21 +-
 gcc/fortran/trans-array.c  |  32 +++
 gcc/fortran/trans-expr.c   |  70 ---
 gcc/fortran/trans-stmt.c   |   9 +-
 gcc/testsuite/ChangeLog|  11 ++
 .../gfortran.dg/allocate_with_source_12.f03|  38 
 .../gfortran.dg/allocate_with_source_13.f03| 220 +
 .../gfortran.dg/allocate_with_source_14.f03| 214 
 8 files changed, 579 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_12.f03
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_13.f03
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_14.f03

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 1a351be..668013d 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,8 +1,25 @@
+2015-01-25  Paul Thomas  
+
+   PR fortran/67171
+   * trans-array.c (structure_alloc_comps): On deallocation of
+   class components, reset the vptr to the declared type vtable
+   and reset the _len field of unlimited polymorphic components.
+   *trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
+   allocatable component references to the right of part reference
+   with non-zero rank and return NULL.
+   (gfc_reset_vptr): Simplify this function by using the function
+   gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
+   (gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
+   NULL return.
+   * trans-stmt.c (gfc_trans_allocate): Rely on the use of
+   gfc_trans_assignment if expr3 is a variable expression since
+   this deals correctly with array sections.
+
 2015-10-25  Andre Vehreschild  
 
PR fortran/66927
-   PR fortran/67044
-   * trans-array.c (build_array_ref): Modified call to 
+   PR fortran/67044
+   * trans-array.c (build_array_ref): Modified call to
gfc_get_class_array_ref to adhere to new interface.
(gfc_conv_expr_descriptor): For one-based arrays that
are filled by a loop starting at one the start index of the
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 45c18a5..b726998 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -8024,6 +8024,38 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 build_int_cst (TREE_TYPE (comp), 0));
}
  gfc_add_expr_to_block (, tmp);
+
+ /* Finally, reset the vptr to the declared type vtable and, if
+necessary reset the _len field.
+
+First recover the reference to the component and obtain
+the vptr.  */
+ comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
+decl, cdecl, NULL_TREE);
+ tmp = gfc_class_vptr_get (comp);
+
+ if (UNLIMITED_POLY (c))
+   {
+ /* Both vptr and _len field should be nulled.  */
+ gfc_add_modify (, tmp,
+ build_int_cst (TREE_TYPE (tmp), 0));
+ tmp = gfc_class_len_get (comp);
+ gfc_add_modify (, tmp,
+ build_int_cst (TREE_TYPE (tmp), 0));
+   }
+ else
+   {
+ /* Build the vtable address and set the vptr with it.  */
+ tree vtab;
+ gfc_symbol *vtable;
+  

[PATCH 3/5] stop using ROUND_TYPE_ALIGN in libobjc/

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

Given the layering violation that using ROUND_TYPE_ALIGN in target libs
is, and the hacks needed to make it work, just copying the relevant code
into encoding.c seems to make sense as an incremental improvement.  The
epiphany version of this macro called a function that doesn't exist in
target libs, so libobjc must not build on that target and not copying
that macro definition doesn't make anything worse.  We already
explicitly preferred the default version to the macro for sparc so we
don't need to copy that version either.  On ppc linux64 and freebsd64,
after constant folding values of other target macros used for libobjc,
the macro turned out to be the same as the default version so we can
just use the default for them.  Which means the only versions of the
macro we actually need to copy are for ppc-darwin and AIX.

libobjc/ChangeLog:

2015-10-30  Trevor Saunders  

PR libobjc/24775
* encoding.c (objc_layout_finish_structure): Remove usage of
ROUND_TYPE_ALIGN.
---
 libobjc/encoding.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/libobjc/encoding.c b/libobjc/encoding.c
index 7de768f..867372d 100644
--- a/libobjc/encoding.c
+++ b/libobjc/encoding.c
@@ -1237,10 +1237,22 @@ void objc_layout_finish_structure (struct 
objc_struct_layout *layout,
   /* Work out the alignment of the record as one expression and store
  in the record type.  Round it up to a multiple of the record's
  alignment. */
-#if defined (ROUND_TYPE_ALIGN) && ! defined (__sparc__)
-  layout->record_align = ROUND_TYPE_ALIGN (layout->original_type-1,
-   1,
-   layout->record_align);
+#if _AIX
+  char type = layout->original_type[-1];
+  if (type == '{' || type == '(')
+   layout->record_align =
+ rs6000_special_round_type_align (layout->original_type-1, 1,
+  layout->record_align);
+  else
+   layout->record_align = MAX (1, layout->record_align);
+#elif __POWERPC__ && __APPLE__
+  char type = layout->original_type[-1];
+  if (type == '{' || type == '(')
+   layout->record_align
+ = darwin_rs6000_special_round_type_align (layout->original_type - 1,
+   1, layout->record_align);
+  else
+   layout->record_align = MAX (1, layout->record_align);
 #else
   layout->record_align = MAX (1, layout->record_align);
 #endif
-- 
2.6.2



[PATCH 4/5] remove usage of BIGGEST_FIELD_ALIGNMENT in encoding.c

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

Similar to ROUND_TYPE_ALIGN, it seems to make sense to copy the
information in the target macros to libobjc as an incremental step.  It's
worth noting a large portion of the definitions of this macro only exist
to work around ADJUST_FIELD_ALIGN being used in target libs, so once all
target macros are gone from target libs we should be able to remove most
of the definitions of BIGGEST_FIELD_ALIGNMENT in gcc/, at which point
there won't be a significant amount of duplication.

libobjc/ChangeLog:

2015-10-30  Trevor Saunders  

PR libobjc/24775
* encoding.c (objc_layout_structure_next_member): Remove usage
of BIGGEST_FIELD_ALIGNMENT.
---
 libobjc/encoding.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/libobjc/encoding.c b/libobjc/encoding.c
index 867372d..7438d64 100644
--- a/libobjc/encoding.c
+++ b/libobjc/encoding.c
@@ -1158,8 +1158,18 @@ objc_layout_structure_next_member (struct 
objc_struct_layout *layout)
 }
 
   /* The following won't work for vectors.  */
-#ifdef BIGGEST_FIELD_ALIGNMENT
-  desired_align = MIN (desired_align, BIGGEST_FIELD_ALIGNMENT);
+#if defined __x86_64__ || defined __i386__
+#if defined __CYGWIN__ || defined __MINGW32__
+  desired_align = MIN (desired_align, 64);
+#elif defined __x86_64__
+  desired_align = MIN (desired_align, 128);
+#else
+  desired_align = MIN (desired_align, 32);
+#endif
+#elif defined __tilepro__ || defined __frv__ || defined __arm__
+  desired_align = MIN (desired_align, 64);
+#elif defined __tilegx__
+  desired_align = MIN (desired_align, 128);
 #endif
 #ifdef ADJUST_FIELD_ALIGN
   desired_align = ADJUST_FIELD_ALIGN (type, desired_align);
-- 
2.6.2



[PATCH 0/5] remove tm.h from encoding.c

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

It's not the nicest code in the world, and there's definitely room for cleanups;
however, it seems like an improvement.  After this series the only usage of tm.h
in libobjc is thr.c, which only uses tm.h so it can include gthr.h, which uses
SUPPORTS_WEAK and possibly other target macros.

Bootstrapped + regtested patches individually on x86_64-linux-gnu, also tested
that the series as a whole doesn't regress check-objc on ppc64{,le}-linux-gnu
and AIX, ok?

Trev

Trevor Saunders (4):
  remove usage of ROUND_TYPE_SIZE from encoding.c
  stop using ROUND_TYPE_ALIGN in libobjc/
  remove usage of BIGGEST_FIELD_ALIGNMENT in encoding.c
  remove usage of ADJUST_FIELD_ALIGN in encoding.c

pault (1):
  2015-01-25  Paul Thomas  

 gcc/fortran/ChangeLog  |  21 +-
 gcc/fortran/trans-array.c  |  32 +++
 gcc/fortran/trans-expr.c   |  70 ---
 gcc/fortran/trans-stmt.c   |   9 +-
 gcc/testsuite/ChangeLog|  11 ++
 .../gfortran.dg/allocate_with_source_12.f03|  38 
 .../gfortran.dg/allocate_with_source_13.f03| 220 +
 .../gfortran.dg/allocate_with_source_14.f03| 214 
 libobjc/encoding.c |  48 +++--
 9 files changed, 613 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_12.f03
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_13.f03
 create mode 100644 gcc/testsuite/gfortran.dg/allocate_with_source_14.f03

-- 
2.6.2



[PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread tbsaunde+gcc
From: Trevor Saunders 

Not many targets define this macro in ways that do something in libobjc,
so it seems to make sense to just inline the few definitions that do
something.

libobjc/ChangeLog:

2015-10-30  Trevor Saunders  

PR libobjc/24775
* encoding.c (objc_layout_structure_next_member): Remove usage
of ADJUST_FIELD_ALIGN.
---
 libobjc/encoding.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/libobjc/encoding.c b/libobjc/encoding.c
index 7438d64..0fbfd39 100644
--- a/libobjc/encoding.c
+++ b/libobjc/encoding.c
@@ -1171,8 +1171,12 @@ objc_layout_structure_next_member (struct 
objc_struct_layout *layout)
 #elif defined __tilegx__
   desired_align = MIN (desired_align, 128);
 #endif
-#ifdef ADJUST_FIELD_ALIGN
-  desired_align = ADJUST_FIELD_ALIGN (type, desired_align);
+#if defined __arc__ || defined _AIX
+  if (TYPE_MODE (strip_array_types (TREE_TYPE (type))) == DFmode)
+desired_align = MIN (desired_align, 32);
+#elif __POWERPC__ && __APPLE__
+  if (desired_align != 128)
+desired_align = MIN (desired_align, 32);
 #endif
 
   /* Record must have at least as much alignment as any field.
-- 
2.6.2



Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 12:59 PM, Martin Liška  wrote:
> On 10/30/2015 09:54 AM, Richard Biener wrote:
>> On Thu, Oct 29, 2015 at 3:50 PM, Martin Liška  wrote:
>>> On 10/29/2015 02:15 PM, Richard Biener wrote:
 On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška  wrote:
> On 10/28/2015 04:23 PM, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
>>> On 10/27/2015 03:49 PM, Richard Biener wrote:
 On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
> On 10/26/2015 02:48 PM, Richard Biener wrote:
>> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
>>> On 10/21/2015 04:06 PM, Richard Biener wrote:
 On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  
 wrote:
> On 10/21/2015 11:59 AM, Richard Biener wrote:
>> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  
>> wrote:
>>> On 10/20/2015 03:39 PM, Richard Biener wrote:
 On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
 wrote:
> Hello.
>
> As part of upcoming merge of HSA branch, we would like to 
> have possibility to terminate
> pass manager after execution of the HSA generation pass. The 
> HSA back-end is implemented
> as a tree pass that directly emits HSAIL from gimple tree 
> representation. The pass operates
> on clones created by HSA IPA pass and the pass manager should 
> stop execution of further
> RTL passes.
>
> Suggested patch survives bootstrap and regression tests on 
> x86_64-linux-pc.
>
> What do you think about it?

 Are you sure it works this way?

 Btw, you will miss executing of all the cleanup passes that 
 will
 eventually free memory
 associated with the function.  So I'd rather support a
 TODO_discard_function which
 should basically release the body from the cgraph.
>>>
>>> Hi.
>>>
>>> Agree with you that I should execute all TODOs, which can be 
>>> easily done.
>>> However, if I just try to introduce the suggested TODO and 
>>> handle it properly
>>> by calling cgraph_node::release_body, then for instance 
>>> fn->gimple_df, fn->cfg are
>>> released and I hit ICEs on many places.
>>>
>>> Stopping the pass manager looks necessary, or do I miss 
>>> something?
>>
>> "Stopping the pass manager" is necessary after 
>> TODO_discard_function, yes.
>> But that may be simply done via a has_body () check then?
>
> Thanks, there's second version of the patch. I'm going to start 
> regression tests.

 As release_body () will free cfun you should pop_cfun () before
 calling it (and thus
>>>
>>> Well, release_function_body calls both push & pop, so I think 
>>> calling pop
>>> before cgraph_node::release_body is not necessary.
>>
>> (ugh).
>>
>>> If tried to call pop_cfun before cgraph_node::release_body, I have 
>>> cfun still
>>> pointing to the same (function *) (which is gcc_freed, but cfun != 
>>> NULL).
>>
>> Hmm, I meant to call pop_cfun then after it (unless you want to fix 
>> the above,
>> none of the freeing functions should technically need 'cfun', just 
>> add
>> 'fn' parameters ...).
>>
>> I expected pop_cfun to eventually set cfun to NULL if it popped the
>> "last" cfun.  Why
>> doesn't it do that?
>>
 drop its modification).  Also TODO_discard_function should be 
 only set for
 local passes thus you should probably add a gcc_assert (cfun).
 I'd move its handling earlier, definitely before the ggc_collect, 
 eventually
 before the pass_fini_dump_file () (do we want a last dump of the
 function or not?).
>>>
>>> Fully agree, moved here.
>>>

 @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
  {
gcc_assert (pass->type == GIMPLE_PASS
   || pass->type == RTL_PASS);
 +
 +
 +  if (!gimple_has_body_p (current_function_decl))
 +   
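
The idea under discussion can be sketched outside of GCC as follows. This is not the pass manager's code: `struct fn`, `struct pass`, `run_pass_list` and the `TODO_DISCARD` flag value are all hypothetical stand-ins, used only to show a pass list that stops once a pass has asked for the function body to be discarded.

```c
#include <stdbool.h>
#include <stddef.h>

#define TODO_DISCARD 1u   /* hypothetical stand-in for TODO_discard_function */

struct fn { bool has_body; };

struct pass
{
  unsigned (*execute) (struct fn *);   /* returns TODO flags */
  struct pass *next;
};

/* Run passes in order and stop as soon as the body is gone.
   Returns the number of passes that were executed.  */
static int
run_pass_list (struct pass *p, struct fn *f)
{
  int executed = 0;
  for (; p != NULL; p = p->next)
    {
      if (!f->has_body)
        break;
      unsigned todo = p->execute (f);
      executed++;
      if (todo & TODO_DISCARD)
        f->has_body = false;   /* stand-in for releasing the body */
    }
  return executed;
}

/* Two toy passes: one ordinary, one that requests the discard.  */
static unsigned keep_pass (struct fn *f) { (void) f; return 0; }
static unsigned discard_pass (struct fn *f) { (void) f; return TODO_DISCARD; }
```

With a list of keep, discard, keep, only the first two passes run; the third is skipped because the body check fails.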

Re: [PATCH 3/5] stop using ROUND_TYPE_ALIGN in libobjc/

2015-10-30 Thread Andrew Pinski
On Fri, Oct 30, 2015 at 7:48 PM,   wrote:
> From: Trevor Saunders 
>
> Given the layering violation that using ROUND_TYPE_ALIGN in target libs
> is, and the hacks needed to make it work, just copying the relevant code
> into encoding.c seems to make sense as an incremental improvement.  The
> epiphany version of this macro called a function that doesn't exist in
> target libs, so libobjc must not build on that target and not copying
> that macro definition doesn't make anything worse.  We already
> explicitly preferred the default version to the macro for sparc so we
> don't need to copy that version either.  On ppc linux64 and freebsd64,
> after constant folding values of other target macros used for libobjc,
> the macro turned out to be the same as the default version so we can
> just use the default for them.  Which means the only versions of the
> macro we actually need to copy are for ppc-darwin and AIX.
>
> libobjc/ChangeLog:
>
> 2015-10-30  Trevor Saunders  
>
> PR libobjc/24775
> * encoding.c (objc_layout_finish_structure): Remove usage of
> ROUND_TYPE_ALIGN.
> ---
>  libobjc/encoding.c | 20 
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/libobjc/encoding.c b/libobjc/encoding.c
> index 7de768f..867372d 100644
> --- a/libobjc/encoding.c
> +++ b/libobjc/encoding.c
> @@ -1237,10 +1237,22 @@ void objc_layout_finish_structure (struct 
> objc_struct_layout *layout,
>/* Work out the alignment of the record as one expression and store
>   in the record type.  Round it up to a multiple of the record's
>   alignment. */
> -#if defined (ROUND_TYPE_ALIGN) && ! defined (__sparc__)
> -  layout->record_align = ROUND_TYPE_ALIGN (layout->original_type-1,
> -   1,
> -   layout->record_align);
> +#if _AIX
> +  char type = layout->original_type[-1];
> +  if (type == '{' || type == '(')
> +   layout->record_align =
> + rs6000_special_round_type_align (layout->original_type-1, 1,
> +  layout->record_align);
> +  else
> +   layout->record_align = MAX (1, layout->record_align);
> +#elif __POWERPC__ && __APPLE__
> +  char type = layout->original_type[-1];
> +  if (type == '{' || type == '(')
> +   layout->record_align
> + = darwin_rs6000_special_round_type_align (layout->original_type - 1,
> +   1, layout->record_align);
> +  else
> +   layout->record_align = MAX (1, layout->record_align);
>  #else
>layout->record_align = MAX (1, layout->record_align);
>  #endif

I would rather not have special defines in the source, but rather a
header that gets included for these special cases instead.

Thanks,
Andrew


> --
> 2.6.2
>


Re: [PATCH 4/5] remove usage of BIGGEST_FIELD_ALIGNMENT in encoding.c

2015-10-30 Thread Andrew Pinski
On Fri, Oct 30, 2015 at 7:48 PM,   wrote:
> From: Trevor Saunders 
>
> Similar to ROUND_TYPE_ALIGN, it seems to make sense to copy the
> information in the target macros to libobjc as an incremental step.  It's
> worth noting a large portion of the definitions of this macro only exist
> to work around ADJUST_FIELD_ALIGN being used in target libs, so once all
> target macros are gone from target libs we should be able to remove most
> of the definitions of BIGGEST_FIELD_ALIGNMENT in gcc/, at which point
> there won't be a significant amount of duplication.
>
> libobjc/ChangeLog:
>
> 2015-10-30  Trevor Saunders  
>
> PR libobjc/24775
> * encoding.c (objc_layout_structure_next_member): Remove usage
> of BIGGEST_FIELD_ALIGNMENT.
> ---
>  libobjc/encoding.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/libobjc/encoding.c b/libobjc/encoding.c
> index 867372d..7438d64 100644
> --- a/libobjc/encoding.c
> +++ b/libobjc/encoding.c
> @@ -1158,8 +1158,18 @@ objc_layout_structure_next_member (struct 
> objc_struct_layout *layout)
>  }
>
>/* The following won't work for vectors.  */
> -#ifdef BIGGEST_FIELD_ALIGNMENT
> -  desired_align = MIN (desired_align, BIGGEST_FIELD_ALIGNMENT);
> +#if defined __x86_64__ || defined __i386__
> +#if defined __CYGWIN__ || defined __MINGW32__
> +  desired_align = MIN (desired_align, 64);
> +#elif defined __x86_64__
> +  desired_align = MIN (desired_align, 128);
> +#else
> +  desired_align = MIN (desired_align, 32);
> +#endif
> +#elif defined __tilepro__ || defined __frv__ || defined __arm__
> +  desired_align = MIN (desired_align, 64);
> +#elif defined __tilegx__
> +  desired_align = MIN (desired_align, 128);
>  #endif
>  #ifdef ADJUST_FIELD_ALIGN
>desired_align = ADJUST_FIELD_ALIGN (type, desired_align);

Just like the other patch, I would rather have a header file for each
of the targets instead, so we separate out the target-specific code from
the target-independent code.
That way, when porting to a new arch, it is easier to figure out all of
the macros that need to be defined rather than having to audit the
code.
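
As a rough illustration of that suggestion, the per-target values from the patch could live in one small header, leaving the layout code free of target tests. The macro name `OBJC_MAX_FIELD_ALIGN` and the helper `objc_cap_field_align` are hypothetical; the values are the ones in the quoted hunk, and 0 is assumed here to mean "no cap".

```c
/* Hypothetical header contents: one place for the per-target cap.  */
#if defined __x86_64__ || defined __i386__
# if defined __CYGWIN__ || defined __MINGW32__
#  define OBJC_MAX_FIELD_ALIGN 64
# elif defined __x86_64__
#  define OBJC_MAX_FIELD_ALIGN 128
# else
#  define OBJC_MAX_FIELD_ALIGN 32
# endif
#elif defined __tilepro__ || defined __frv__ || defined __arm__
# define OBJC_MAX_FIELD_ALIGN 64
#elif defined __tilegx__
# define OBJC_MAX_FIELD_ALIGN 128
#else
# define OBJC_MAX_FIELD_ALIGN 0   /* assumption: 0 means no cap */
#endif

/* The target-independent code then needs no #if at all.  */
static unsigned
objc_cap_field_align (unsigned desired_align)
{
  if (OBJC_MAX_FIELD_ALIGN != 0 && desired_align > OBJC_MAX_FIELD_ALIGN)
    return OBJC_MAX_FIELD_ALIGN;
  return desired_align;
}
```

A new port would then only have to add one branch to the header rather than audit encoding.c.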

Thanks,
Andrew

> --
> 2.6.2
>


libgo patch committed: Update to Go 1.5 release

2015-10-30 Thread Ian Lance Taylor
I have committed a patch to libgo to update it to the Go 1.5 release.

As usual for libgo updates, the actual patch is too large to attach to
this e-mail message.  I've attached the changes to the gccgo-specific
files.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

This may cause trouble on non-GNU/Linux operating systems.  Please let
me know about any problems you encounter.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 229612)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-16f69a4007a1903da4055a496882b514e05f45f3
+4b6b496579225cdd897130f6d6fd18ecb100bf99
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/MERGE
===
--- libgo/MERGE (revision 228306)
+++ libgo/MERGE (working copy)
@@ -1,4 +1,4 @@
-883bc6ed0ea815293fe6309d66f967ea60630e87
+bb03defe933c89fee44be675d7aa0fbd893ced30
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 228306)
+++ libgo/Makefile.am   (working copy)
@@ -233,12 +233,15 @@ toolexeclibgogodir = $(toolexeclibgodir)
 toolexeclibgogo_DATA = \
go/ast.gox \
go/build.gox \
+   go/constant.gox \
go/doc.gox \
go/format.gox \
+   go/importer.gox \
go/parser.gox \
go/printer.gox \
go/scanner.gox \
-   go/token.gox
+   go/token.gox \
+   go/types.gox
 
 toolexeclibgohashdir = $(toolexeclibgodir)/hash
 
@@ -292,7 +295,8 @@ toolexeclibgomath_DATA = \
 toolexeclibgomimedir = $(toolexeclibgodir)/mime
 
 toolexeclibgomime_DATA = \
-   mime/multipart.gox
+   mime/multipart.gox \
+   mime/quotedprintable.gox
 
 toolexeclibgonetdir = $(toolexeclibgodir)/net
 
@@ -676,46 +680,74 @@ go_math_files = \
go/math/tanh.go \
go/math/unsafe.go
 
+if LIBGO_IS_OPENBSD
+go_mime_type_file = go/mime/type_openbsd.go
+else
+if LIBGO_IS_FREEBSD
+go_mime_type_file = go/mime/type_freebsd.go
+else
+if LIBGO_IS_DRAGONFLY
+go_mime_type_file = go/mime/type_dragonfly.go
+else
+go_mime_type_file =
+endif
+endif
+endif
+
 go_mime_files = \
+   go/mime/encodedword.go \
go/mime/grammar.go \
go/mime/mediatype.go \
go/mime/type.go \
-   go/mime/type_unix.go
+   go/mime/type_unix.go \
+   $(go_mime_type_file)
 
 if LIBGO_IS_LINUX
 go_net_cgo_file = go/net/cgo_linux.go
 go_net_sock_file = go/net/sock_linux.go
 go_net_sockopt_file = go/net/sockopt_linux.go
 go_net_sockoptip_file = go/net/sockoptip_linux.go go/net/sockoptip_posix.go
+go_net_cgo_sock_file = go/net/cgo_socknew.go
+go_net_cgo_res_file = go/net/cgo_resnew.go
 else
 if LIBGO_IS_IRIX
 go_net_cgo_file = go/net/cgo_linux.go
 go_net_sock_file = go/net/sock_linux.go
 go_net_sockopt_file = go/net/sockopt_linux.go
 go_net_sockoptip_file = go/net/sockoptip_linux.go go/net/sockoptip_posix.go
+go_net_cgo_sock_file = go/net/cgo_socknew.go
+go_net_cgo_res_file = go/net/cgo_resnew.go
 else
 if LIBGO_IS_SOLARIS
-go_net_cgo_file = go/net/cgo_linux.go
+go_net_cgo_file = go/net/cgo_solaris.go
 go_net_sock_file = go/net/sock_stub.go
 go_net_sockopt_file = go/net/sockopt_solaris.go
 go_net_sockoptip_file = go/net/sockoptip_stub.go
+go_net_cgo_sock_file = go/net/cgo_socknew.go
+go_net_cgo_res_file = go/net/cgo_resnew.go
 else
 if LIBGO_IS_FREEBSD
 go_net_cgo_file = go/net/cgo_bsd.go
 go_net_sock_file = go/net/sock_bsd.go
 go_net_sockopt_file = go/net/sockopt_bsd.go
 go_net_sockoptip_file = go/net/sockoptip_bsd.go go/net/sockoptip_posix.go
+go_net_cgo_sock_file = go/net/cgo_sockold.go
+go_net_cgo_res_file = go/net/cgo_resold.go
 else
 if LIBGO_IS_NETBSD
 go_net_cgo_file = go/net/cgo_netbsd.go
 go_net_sock_file = go/net/sock_bsd.go
 go_net_sockopt_file = go/net/sockopt_bsd.go
 go_net_sockoptip_file = go/net/sockoptip_bsd.go go/net/sockoptip_posix.go
+go_net_cgo_sock_file = go/net/cgo_sockold.go
+go_net_cgo_res_file = go/net/cgo_resnew.go
 else
 go_net_cgo_file = go/net/cgo_bsd.go
 go_net_sock_file = go/net/sock_bsd.go
 go_net_sockopt_file = go/net/sockopt_bsd.go
 go_net_sockoptip_file = go/net/sockoptip_bsd.go go/net/sockoptip_posix.go
+go_net_cgo_sock_file = go/net/cgo_sockold.go
+go_net_cgo_res_file = go/net/cgo_resold.go
 endif
 endif
 endif
@@ -731,10 +763,14 @@ else
 if LIBGO_IS_DRAGONFLY
 go_net_sendfile_file = go/net/sendfile_dragonfly.go
 else
+if LIBGO_IS_SOLARIS
+go_net_sendfile_file = go/net/sendfile_solaris.go
+else
 go_net_sendfile_file = go/net/sendfile_stub.go
 endif
 endif
 endif
+endif
 
 if LIBGO_IS_LINUX
 go_net_interface_file = go/net/interface_linux.go
@@ -775,15 +811,22 @@ endif
 endif
 
 go_net_common_files = \
+   go/net/addrselect.go \

Re: [PATCH 2/5] remove usage of ROUND_TYPE_SIZE from encoding.c

2015-10-30 Thread Andrew Pinski
On Fri, Oct 30, 2015 at 7:48 PM,   wrote:
> From: Trevor Saunders 
>
> gcc got rid of this target macro in 2003, so it seems safe to assume the
> alternate path works fine on all targets.

This is ok.


>
> libobjc/ChangeLog:
>
> 2015-10-30  Trevor Saunders  
>
> PR libobjc/24775
> * encoding.c (objc_layout_finish_structure): Remove usage of
> ROUND_TYPE_SIZE.
> ---
>  libobjc/encoding.c | 6 --
>  1 file changed, 6 deletions(-)
>
> diff --git a/libobjc/encoding.c b/libobjc/encoding.c
> index abb6145..7de768f 100644
> --- a/libobjc/encoding.c
> +++ b/libobjc/encoding.c
> @@ -1245,14 +1245,8 @@ void objc_layout_finish_structure (struct 
> objc_struct_layout *layout,
>layout->record_align = MAX (1, layout->record_align);
>  #endif
>
> -#ifdef ROUND_TYPE_SIZE
> -  layout->record_size = ROUND_TYPE_SIZE (layout->original_type,
> - layout->record_size,
> - layout->record_align);
> -#else
>/* Round the size up to be a multiple of the required alignment */
>layout->record_size = ROUND (layout->record_size, 
> layout->record_align);
> -#endif
>
>layout->type = NULL;
>  }
> --
> 2.6.2
>


[PATCH] [ARM] PR61551 RFC: Improve costs for NEON addressing modes

2015-10-30 Thread Charles Baylis
Hi Ramana,

[revisiting https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01593.html]

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61551

This patch is an initial attempt to rework the ARM rtx costs to better
handle the costs of various addressing modes, in particular to remove
the incorrect large costs associated with post-indexed addressing in
NEON memory operations.

This patch introduces per-core tables for the costs of using different
addressing modes for different access modes. I have retained the
original code so that the calculated costs can be compared. Currently,
the tables replicate the costs calculated by the original code, and a
debug assert is left in place.
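
To make the shape of such a table concrete, here is a small standalone sketch. It is not the code in the attached patch: `addr_cost_table`, `lookup_addr_cost`, the reduced set of access modes and all of the numbers are hypothetical, and only the idea of indexing a per-core table by access mode and addressing mode is taken from the description above.

```c
/* Addressing modes, loosely mirroring the enum proposed in the patch.  */
enum access_type
{
  ACC_REG,
  ACC_POST_INCDEC,
  ACC_PRE_INCDEC,
  ACC_POST_MODIFY,
  ACC_PLUS,
  ACC_LAST = ACC_PLUS
};

/* Hypothetical, reduced set of access modes.  */
enum access_mode { MODE_SI, MODE_DI, MODE_VEC128, MODE_LAST = MODE_VEC128 };

struct addr_cost_table
{
  int cost[MODE_LAST + 1][ACC_LAST + 1];
};

/* Illustrative numbers only, not real tuning data.  */
static const struct addr_cost_table example_core_costs = {
  {
    { 0, 0, 0, 0, 0 },       /* SI */
    { 0, 4, 4, 4, 4 },       /* DI */
    { 0, 12, 12, 12, 12 }    /* VEC128 */
  }
};

/* Look up the extra cost of using addressing mode TYPE for MODE.  */
static int
lookup_addr_cost (const struct addr_cost_table *t,
                  enum access_mode mode, enum access_type type)
{
  return t->cost[mode][type];
}
```

Retuning a core for cheap post-indexed vector accesses would then be a matter of editing one row of its table rather than changing the cost function.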

Obviously, a fair amount of clean up is needed before this can be
applied, but I would like a quick comment on the general approach to
check that I haven't completely missed the point before continuing.

After that, I will clean up the coding style, check for impact on the
AArch64 backend, remove the debug code and in a separate patch improve
the tuning for the vector modes.

Thanks
Charles
From b10c6dd7af1f5b9821946783ba9d96b08c751f2b Mon Sep 17 00:00:00 2001
From: Charles Baylis 
Date: Wed, 28 Oct 2015 18:48:16 +
Subject: [PATCH] WIP

Change-Id: If349ffd7dbbe13a814be4a0d022382ddc8270973
---
 gcc/config/arm/aarch-common-protos.h |  28 ++
 gcc/config/arm/aarch-cost-tables.h   | 328 +
 gcc/config/arm/arm.c | 677 ++-
 3 files changed, 1023 insertions(+), 10 deletions(-)

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 348ae74..dae42d7 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -130,6 +130,33 @@ struct vector_cost_table
   const int alu;
 };
 
+struct cbmem_cost_table
+{
+  enum access_type
+  {
+REG,
+POST_INCDEC,
+PRE_INCDEC,
+/*PRE_MODIFY,*/
+POST_MODIFY,
+PLUS,
+ACCESS_TYPE_LAST = PLUS
+  };
+  const int si[ACCESS_TYPE_LAST + 1];
+  const int di[ACCESS_TYPE_LAST + 1];
+  const int cdi[ACCESS_TYPE_LAST + 1];
+  const int sf[ACCESS_TYPE_LAST + 1];
+  const int df[ACCESS_TYPE_LAST + 1];
+  const int cdf[ACCESS_TYPE_LAST + 1];
+  const int blk[ACCESS_TYPE_LAST + 1];
+  const int vec64[ACCESS_TYPE_LAST + 1];
+  const int vec128[ACCESS_TYPE_LAST + 1];
+  const int vec192[ACCESS_TYPE_LAST + 1];
+  const int vec256[ACCESS_TYPE_LAST + 1];
+  const int vec384[ACCESS_TYPE_LAST + 1];
+  const int vec512[ACCESS_TYPE_LAST + 1];
+};
+
 struct cpu_cost_table
 {
   const struct alu_cost_table alu;
@@ -137,6 +164,7 @@ struct cpu_cost_table
   const struct mem_cost_table ldst;
   const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
   const struct vector_cost_table vect;
+  const struct cbmem_cost_table addr;
 };
 
 
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 66e09a8..c5ecdcf 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -122,6 +122,88 @@ const struct cpu_cost_table generic_extra_costs =
   /* Vector */
   {
 COSTS_N_INSNS (1)	/* alu.  */
+  },
+  /* Memory */
+  {
+{ 0, 0, 0, 0, 0 },		/* si */
+{
+  0,
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1)
+},/* di */
+{
+  0,
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3)
+},/* cdi */
+{ 0, 0, 0, 0, 0 },		/* sf */
+{
+  0,
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1)
+},/* df */
+{
+  0,
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3)
+},/* cdf */
+{
+  0,
+  - COSTS_N_INSNS (1),
+  - COSTS_N_INSNS (1),
+  - COSTS_N_INSNS (1),
+  - COSTS_N_INSNS (1),
+},/* blk */
+{
+  0,
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1),
+  COSTS_N_INSNS (1)
+},/* vec64 */
+{
+  0,
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3),
+  COSTS_N_INSNS (3)
+},/* vec128 */
+{
+  0,
+  COSTS_N_INSNS (5),
+  COSTS_N_INSNS (5),
+  COSTS_N_INSNS (5),
+  COSTS_N_INSNS (5)
+},/* vec192 */
+{
+  0,
+  COSTS_N_INSNS (7),
+  COSTS_N_INSNS (7),
+  COSTS_N_INSNS (7),
+  COSTS_N_INSNS (7)
+},/* vec256 */
+{
+  0,
+  COSTS_N_INSNS (11),
+  COSTS_N_INSNS (11),
+  COSTS_N_INSNS (11),
+  COSTS_N_INSNS (11)
+},/* vec384 */
+{
+  0,
+  COSTS_N_INSNS (15),
+  COSTS_N_INSNS (15),
+  COSTS_N_INSNS (15),
+  COSTS_N_INSNS (15)
+}/* vec512 */
   }
 };
 
@@ -225,6 +307,88 @@ const struct cpu_cost_table cortexa53_extra_costs =
   /* Vector */
   {
 COSTS_N_INSNS (1)	/* alu.  */
+  },
+  /* 

Re: more accurate omp in fortran

2015-10-30 Thread Dominique d'Humières
> diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c

Revision r229609 breaks bootstrap:

../../work/gcc/fortran/openmp.c: In function 'void 
resolve_omp_clauses(gfc_code*, gfc_omp_clauses*, gfc_namespace*, bool)':
../../work/gcc/fortran/openmp.c:2925:27: error: format '%L' expects argument of 
type 'locus*', but argument 3 has type 'locus' [-Werror=format=]
 n->sym->name, n->where);
   ^
cc1plus: all warnings being treated as errors

TIA

Dominique



Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 1:06 PM, Bernd Schmidt  wrote:
> On 10/30/2015 12:48 PM, tbsaunde+...@tbsaunde.org wrote:
>>
>> -#ifdef ADJUST_FIELD_ALIGN
>> -  desired_align = ADJUST_FIELD_ALIGN (type, desired_align);
>> +#if defined __arc__ || defined _AIX
>> +  if (TYPE_MODE (strip_array_types (TREE_TYPE (type))) == DFmode)
>> +desired_align = MIN (desired_align, 32);
>> +#elif __POWERPC__ && __APPLE__
>> +  if (desired_align != 128)
>> +desired_align = MIN (desired_align, 32);
>>   #endif
>
>
> No way. We never use this kind of test in target-independent code.

It's not target-independent code.  Are you suggesting to add a config/
to libobjc?  IMHO for a not really maintained frontend / target lib that's
an excessive requirement.

For any such replacements as in the patch I suggest to at least keep a comment
before it indicating the origin of the inlined variants (in this case refer to
ADJUST_FIELD_ALIGN).
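
A sketch of what such a comment could look like, using the ppc-darwin branch from the patch. The wrapper function `darwin_ppc_adjust_field_align` and the local `MIN` definition are hypothetical and exist only so the fragment stands on its own; the comment wording is likewise just one possibility.

```c
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical wrapper around the ppc-darwin branch of the patch.  */
static unsigned
darwin_ppc_adjust_field_align (unsigned desired_align)
{
  /* Inlined from the ppc-darwin definition of ADJUST_FIELD_ALIGN;
     keep in sync with that target macro.  */
  if (desired_align != 128)
    desired_align = MIN (desired_align, 32);
  return desired_align;
}
```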

In general I'm happy with this kind of patch (maybe not the
BIGGEST_FIELD_ALIGNMENT one, which could be made a CPP macro when
-fbuilding-libgcc).

Richard.

>
> Bernd


Re: [PATCH] Remove fold () dispatch from fold_gimple_assign

2015-10-30 Thread Richard Biener
On Thu, 29 Oct 2015, Richard Biener wrote:

> 
> After
> 
> Index: gcc/gimple-fold.c
> ===
> --- gcc/gimple-fold.c   (revision 229518)
> +++ gcc/gimple-fold.c   (working copy)
> @@ -398,7 +398,10 @@ fold_gimple_assign (gimple_stmt_iterator
>  /* If we couldn't fold the RHS, hand over to the generic
> fold routines.  */
>  if (result == NULL_TREE)
> -  result = fold (rhs);
> + {
> +   result = fold (rhs);
> +   gcc_assert (result == rhs);
> + }
>  
>  /* Strip away useless type conversions.  Both the NON_LVALUE_EXPR
> that may have been added by fold, and "useless" type
> 
> passed bootstrap and regtest on x86_64-unknown-linux-gnu I am now
> testing the following.

The following is what I ended up applying.

Bootstrapped & tested on x86_64-unknown-linux-gnu.

Richard.

2015-10-30  Richard Biener  

* gimple-fold.c (fold_gimple_assign): Do not dispatch to
fold () on single RHSs.  Allow CONSTRUCTORS with trailing
zeros to be folded to VECTOR_CSTs.
* tree.c (build_vector_from_ctor): Handle VECTOR_CST elements.
* fold-const.c (fold): Use build_vector_from_ctor.

Index: gcc/gimple-fold.c
===
*** gcc/gimple-fold.c   (revision 229520)
--- gcc/gimple-fold.c   (working copy)
*** fold_gimple_assign (gimple_stmt_iterator
*** 355,362 
return val;
  }
  }
- 
  }
else if (TREE_CODE (rhs) == ADDR_EXPR)
  {
tree ref = TREE_OPERAND (rhs, 0);
--- 355,362 
return val;
  }
  }
  }
+ 
else if (TREE_CODE (rhs) == ADDR_EXPR)
  {
tree ref = TREE_OPERAND (rhs, 0);
*** fold_gimple_assign (gimple_stmt_iterator
*** 371,391 
else if (TREE_CODE (ref) == MEM_REF
 && integer_zerop (TREE_OPERAND (ref, 1)))
  result = fold_convert (TREE_TYPE (rhs), TREE_OPERAND (ref, 0));
  }
  
else if (TREE_CODE (rhs) == CONSTRUCTOR
!&& TREE_CODE (TREE_TYPE (rhs)) == VECTOR_TYPE
!&& (CONSTRUCTOR_NELTS (rhs)
!== TYPE_VECTOR_SUBPARTS (TREE_TYPE (rhs))))
  {
/* Fold a constant vector CONSTRUCTOR to VECTOR_CST.  */
unsigned i;
tree val;
  
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val)
! if (TREE_CODE (val) != INTEGER_CST
! && TREE_CODE (val) != REAL_CST
! && TREE_CODE (val) != FIXED_CST)
return NULL_TREE;
  
return build_vector_from_ctor (TREE_TYPE (rhs),
--- 371,399 
else if (TREE_CODE (ref) == MEM_REF
 && integer_zerop (TREE_OPERAND (ref, 1)))
  result = fold_convert (TREE_TYPE (rhs), TREE_OPERAND (ref, 0));
+ 
+   if (result)
+ {
+   /* Strip away useless type conversions.  Both the
+  NON_LVALUE_EXPR that may have been added by fold, and
+  "useless" type conversions that might now be apparent
+  due to propagation.  */
+   STRIP_USELESS_TYPE_CONVERSION (result);
+ 
+   if (result != rhs && valid_gimple_rhs_p (result))
+ return result;
+ }
  }
  
else if (TREE_CODE (rhs) == CONSTRUCTOR
!&& TREE_CODE (TREE_TYPE (rhs)) == VECTOR_TYPE)
  {
/* Fold a constant vector CONSTRUCTOR to VECTOR_CST.  */
unsigned i;
tree val;
  
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val)
! if (! CONSTANT_CLASS_P (val))
return NULL_TREE;
  
return build_vector_from_ctor (TREE_TYPE (rhs),
*** fold_gimple_assign (gimple_stmt_iterator
*** 394,414 
  
else if (DECL_P (rhs))
  return get_symbol_constant_value (rhs);
- 
- /* If we couldn't fold the RHS, hand over to the generic
-fold routines.  */
- if (result == NULL_TREE)
-   result = fold (rhs);
- 
- /* Strip away useless type conversions.  Both the NON_LVALUE_EXPR
-that may have been added by fold, and "useless" type
-conversions that might now be apparent due to propagation.  */
- STRIP_USELESS_TYPE_CONVERSION (result);
- 
- if (result != rhs && valid_gimple_rhs_p (result))
- return result;
- 
-   return NULL_TREE;
}
break;
  
--- 402,407 
Index: gcc/tree.c
===
*** gcc/tree.c  (revision 229520)
--- gcc/tree.c  (working copy)
*** tree
*** 1730,1742 
  build_vector_from_ctor 

[PATCH 1/2] Implement Levenshtein distance

2015-10-30 Thread David Malcolm
This patch adds an implementation of Levenshtein distance to gcc,
along with unit testing of the algorithm.

The unit testing is implemented via a plugin within gcc.dg/plugin.
(I'd prefer to do this via the unit testing patches I've been
proposing in a separate patch kit, but to avoid depending on that
this kit does it within a custom plugin.)

The plugin actually fails until followup patches are applied, with:

 cc1: error: cannot load plugin ./levenshtein_plugin.so
 ./levenshtein_plugin.so: undefined symbol: _Z20levenshtein_distancePKcS0_

due to nothing in the tree initially using the API, but I've broken
it out in the hope that it makes review easier.
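
For readers who want to experiment with the algorithm outside the tree, the
two-row Wagner-Fischer scheme described above can be sketched as follows.
This is a minimal stand-alone illustration, not the code from the patch:
the name edit_distance_sketch and the use of std::vector are assumptions
made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-alone helper, not the function added by the patch.
// Returns the Levenshtein distance between NUL-terminated strings S and T.
unsigned edit_distance_sketch (const char *s, const char *t)
{
  assert (s != nullptr && t != nullptr);
  const std::size_t m = std::strlen (s);
  const std::size_t n = std::strlen (t);

  // prev[j] is the distance between the first j characters of S and the
  // prefix of T handled so far; only two rows are ever kept alive.
  std::vector<unsigned> prev (m + 1), cur (m + 1);

  // Empty target prefix: delete every character of the source prefix.
  for (std::size_t j = 0; j <= m; j++)
    prev[j] = static_cast<unsigned> (j);

  for (std::size_t i = 0; i < n; i++)
    {
      // Empty source prefix: insert i + 1 characters of the target.
      cur[0] = static_cast<unsigned> (i + 1);
      for (std::size_t j = 0; j < m; j++)
        {
          unsigned west = cur[j] + 1;
          unsigned north = prev[j + 1] + 1;
          unsigned northwest = prev[j] + (s[j] == t[i] ? 0u : 1u);
          cur[j + 1] = std::min ({west, north, northwest});
        }
      std::swap (prev, cur);
    }
  return prev[m];
}
```

Calling edit_distance_sketch ("m_bar", "bar") counts the two removals needed
to turn the misspelled name into the real one.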

gcc/ChangeLog:
* Makefile.in (OBJS): Add spellcheck.o.
* spellcheck.c: New file.
* spellcheck.h: New file.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/levenshtein-test-1.c: New file.
* gcc.dg/plugin/levenshtein_plugin.c: New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
levenshtein_plugin.c.
---
 gcc/Makefile.in  |   1 +
 gcc/spellcheck.c | 136 +++
 gcc/spellcheck.h |  32 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c |   9 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c |  64 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp   |   1 +
 6 files changed, 243 insertions(+)
 create mode 100644 gcc/spellcheck.c
 create mode 100644 gcc/spellcheck.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2685b38..9fb643e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1394,6 +1394,7 @@ OBJS = \
shrink-wrap.o \
simplify-rtx.o \
sparseset.o \
+   spellcheck.o \
sreal.o \
stack-ptr-mod.o \
statistics.o \
diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c
new file mode 100644
index 000..532df58
--- /dev/null
+++ b/gcc/spellcheck.c
@@ -0,0 +1,136 @@
+/* Find near-matches for strings and identifiers.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "spellcheck.h"
+
+/* The Levenshtein distance is an "edit-distance": the minimal
+   number of one-character insertions, removals or substitutions
+   that are needed to change one string into another.
+
+   This implementation uses the Wagner-Fischer algorithm.  */
+
+static edit_distance_t
+levenshtein_distance (const char *s, int m,
+ const char *t, int n)
+{
+  const bool debug = false;
+
+  if (debug)
+{
+  printf ("s: \"%s\" (m=%i)\n", s, m);
+  printf ("t: \"%s\" (n=%i)\n", t, n);
+}
+
+  if (m == 0)
+return n;
+  if (n == 0)
+return m;
+
+  /* We effectively build a matrix where each (i, j) contains the
+ Levenshtein distance between the prefix strings s[0:i]
+ and t[0:j].
+ Rather than actually build an (m + 1) * (n + 1) matrix,
+ we simply keep track of the last row, v0 and a new row, v1,
+ which avoids an (m + 1) * (n + 1) allocation and memory accesses
+ in favor of two (m + 1) allocations.  These could potentially be
+ statically-allocated if we impose a maximum length on the
+ strings of interest.  */
+  edit_distance_t *v0 = new edit_distance_t[m + 1];
+  edit_distance_t *v1 = new edit_distance_t[m + 1];
+
+  /* The first row is for the case of an empty target string, which
+ we can reach by deleting every character in the source string.  */
+  for (int i = 0; i < m + 1; i++)
+v0[i] = i;
+
+  /* Build successive rows.  */
+  for (int i = 0; i < n; i++)
+{
+  if (debug)
+   {
+ printf ("i:%i v0 = ", i);
+ for (int j = 0; j < m + 1; j++)
+   printf ("%i ", v0[j]);
+ printf ("\n");
+   }
+
+  /* The initial column is for the case of an empty source string; we
+can reach prefixes of the target string of length i
+by inserting i characters.  */
+  v1[0] = i + 1;
+
+  /* Build the rest of the row by considering neighbours to
+the north, west and northwest.  */
+  for (int j = 0; j < m; j++)
+   {
+   

[PATCH 2/2] C FE: suggest corrections for misspelled field names

2015-10-30 Thread David Malcolm
This is similar to the field-name part of the v2 patch:
 https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01090.html
with the following changes:
  - don't call unit tests from lookup_field_fuzzy
(instead, see patch 1 in the kit)
  - use a cutoff: if more than half of the letters
were misspelled, the suggestion is likely to
be meaningless, so don't offer it.
  - more test coverage
  - deferral of the hints for type-name lookup (this can
wait to a later patch, since it seemed more
controversial)
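
The cutoff rule from the list above can be tried out in isolation with a
sketch like the following.  It is only an illustration under stated
assumptions: suggest_sketch and distance_sketch are hypothetical names, and
std::string stands in for the IDENTIFIER_NODEs that the patch works on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Plain two-row Levenshtein distance, present only to keep the sketch
// self-contained.
static unsigned distance_sketch (const std::string &s, const std::string &t)
{
  std::vector<unsigned> prev (s.size () + 1), cur (s.size () + 1);
  for (std::size_t j = 0; j <= s.size (); j++)
    prev[j] = static_cast<unsigned> (j);
  for (std::size_t i = 0; i < t.size (); i++)
    {
      cur[0] = static_cast<unsigned> (i + 1);
      for (std::size_t j = 0; j < s.size (); j++)
        cur[j + 1] = std::min ({cur[j] + 1u, prev[j + 1] + 1u,
                                prev[j] + (s[j] == t[i] ? 0u : 1u)});
      std::swap (prev, cur);
    }
  return prev[s.size ()];
}

// Hypothetical helper: return the closest candidate, or an empty string
// if more than half of the letters would have to change.
std::string suggest_sketch (const std::string &typo,
                            const std::vector<std::string> &candidates)
{
  const std::string *best = nullptr;
  unsigned best_distance = std::numeric_limits<unsigned>::max ();
  for (const std::string &candidate : candidates)
    {
      unsigned dist = distance_sketch (typo, candidate);
      if (dist < best_distance)
        {
          best_distance = dist;
          best = &candidate;
        }
    }
  if (best == nullptr)
    return "";

  // Same shape as the cutoff in the patch: half of the longer length.
  std::size_t cutoff = std::max (typo.size (), best->size ()) / 2;
  if (best_distance > cutoff)
    return "";
  return *best;
}
```

With the fields foo, bar and baz, the name m_bar is close enough to bar to
be suggested, while a name sharing almost no letters with any field yields
no suggestion.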

gcc/c/ChangeLog:
* c-typeck.c: Include spellcheck.h.
(lookup_field_fuzzy_find_candidates): New function.
(lookup_field_fuzzy): New function.
(build_component_ref): If the field was not found, try using
lookup_field_fuzzy and potentially offer a suggestion.

gcc/testsuite/ChangeLog:
* gcc.dg/spellcheck-fields.c: New file.
---
 gcc/c/c-typeck.c | 74 +++-
 gcc/testsuite/gcc.dg/spellcheck-fields.c | 63 +++
 2 files changed, 136 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-fields.c

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 61c5313..0660610 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-family/c-ubsan.h"
 #include "cilk.h"
 #include "gomp-constants.h"
+#include "spellcheck.h"
 
 /* Possible cases of implicit bad conversions.  Used to select
diagnostic messages in convert_for_assignment.  */
@@ -2249,6 +2250,72 @@ lookup_field (tree type, tree component)
   return tree_cons (NULL_TREE, field, NULL_TREE);
 }
 
+/* Recursively append candidate IDENTIFIER_NODEs to CANDIDATES.  */
+
+static void
+lookup_field_fuzzy_find_candidates (tree type, tree component,
+   vec<tree> *candidates)
+{
+  tree field;
+  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+{
+  if (DECL_NAME (field) == NULL_TREE
+ && (TREE_CODE (TREE_TYPE (field)) == RECORD_TYPE
+ || TREE_CODE (TREE_TYPE (field)) == UNION_TYPE))
+   {
+ lookup_field_fuzzy_find_candidates (TREE_TYPE (field),
+ component,
+ candidates);
+   }
+
+  if (DECL_NAME (field))
+   candidates->safe_push (DECL_NAME (field));
+}
+}
+
+/* Like "lookup_field", but find the closest matching IDENTIFIER_NODE,
+   rather than returning a TREE_LIST for an exact match.  */
+
+static tree
+lookup_field_fuzzy (tree type, tree component)
+{
+  gcc_assert (TREE_CODE (component) == IDENTIFIER_NODE);
+
+  /* First, gather a list of candidates.  */
+  auto_vec <tree> candidates;
+
+  lookup_field_fuzzy_find_candidates (type, component,
+ &candidates);
+
+  /* Now determine which is closest.  */
+  int i;
+  tree identifier;
+  tree best_identifier = NULL;
+  edit_distance_t best_distance = MAX_EDIT_DISTANCE;
+  FOR_EACH_VEC_ELT (candidates, i, identifier)
+{
+  gcc_assert (TREE_CODE (identifier) == IDENTIFIER_NODE);
+  edit_distance_t dist = levenshtein_distance (component, identifier);
+  if (dist < best_distance)
+   {
+ best_distance = dist;
+ best_identifier = identifier;
+   }
+}
+
+  /* If more than half of the letters were misspelled, the suggestion is
+ likely to be meaningless.  */
+  if (best_identifier)
+{
+  unsigned int cutoff = MAX (IDENTIFIER_LENGTH (component),
+IDENTIFIER_LENGTH (best_identifier)) / 2;
+  if (best_distance > cutoff)
+   return NULL;
+}
+
+  return best_identifier;
+}
+
 /* Make an expression to refer to the COMPONENT field of structure or
union value DATUM.  COMPONENT is an IDENTIFIER_NODE.  LOC is the
location of the COMPONENT_REF.  */
@@ -2284,7 +2351,12 @@ build_component_ref (location_t loc, tree datum, tree 
component)
 
   if (!field)
{
- error_at (loc, "%qT has no member named %qE", type, component);
+ tree guessed_id = lookup_field_fuzzy (type, component);
+ if (guessed_id)
+   error_at (loc, "%qT has no member named %qE; did you mean %qE?",
+ type, component, guessed_id);
+ else
+   error_at (loc, "%qT has no member named %qE", type, component);
  return error_mark_node;
}
 
diff --git a/gcc/testsuite/gcc.dg/spellcheck-fields.c 
b/gcc/testsuite/gcc.dg/spellcheck-fields.c
new file mode 100644
index 000..01be550
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/spellcheck-fields.c
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+
+struct foo
+{
+  int foo;
+  int bar;
+  int baz;
+};
+
+int test (struct foo *ptr)
+{
+  return ptr->m_bar; /* { dg-error "'struct foo' has no member named 'm_bar'; 
did you mean 'bar'?" } */
+}
+
+int test2 (void)
+{
+  struct foo instance = {0, 0, 0};
+  return 

Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-30 Thread Martin Liška
On 10/30/2015 01:13 PM, Richard Biener wrote:
> So I suggest to do the push/pop of cfun there.
> do_per_function_toporder can be made static btw.
> 
> Richard.

Right, I've done that and it works (a bootstrap is currently running);
it should be feasible for the HSA branch too.

tree-pass.h:

/* Declare for plugins.  */
extern void do_per_function_toporder (void (*) (function *, void *), void *);

Attaching the patch that I'm going to test.
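
To make the intended control flow easier to follow, here is a toy model of
it.  It is only a sketch: toy_function, toy_cfun, TOY_TODO_DISCARD and
run_toy_pass_list are made-up names, and the model folds the caller's
push/pop of the current function and the pass loop into one routine, which
the real patch keeps separate.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins; none of these names exist in GCC.
struct toy_function { int passes_run = 0; };
static toy_function *toy_cfun = nullptr;
const unsigned TOY_TODO_DISCARD = 1u << 23;

typedef unsigned (*toy_pass) (toy_function *);

// A pass that does ordinary work and requests nothing special.
unsigned toy_keep (toy_function *fn)
{
  fn->passes_run++;
  return 0;
}

// A pass that asks for the function to be discarded afterwards.
unsigned toy_discard (toy_function *fn)
{
  fn->passes_run++;
  return TOY_TODO_DISCARD;
}

// Run PASSES on FN and return how many of them actually executed.
int run_toy_pass_list (toy_function *fn, const std::vector<toy_pass> &passes)
{
  toy_cfun = fn;                 // the caller "pushes" the function
  int executed = 0;
  for (toy_pass pass : passes)
    {
      if (toy_cfun == nullptr)   // function was discarded: stop the list
        break;
      unsigned todo = pass (toy_cfun);
      executed++;
      if (todo & TOY_TODO_DISCARD)
        toy_cfun = nullptr;      // models release_body + set_cfun (NULL)
    }
  toy_cfun = nullptr;            // the caller "pops" the function
  return executed;
}
```

In this model a list of three passes whose second member returns
TOY_TODO_DISCARD runs only two passes.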

Martin

From 8438a0518b3b65162201d6181c2771e7a7203476 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 22 Oct 2015 12:46:16 +0200
Subject: [PATCH] Pass manager: add support for termination of pass list

gcc/ChangeLog:

2015-10-30  Martin Liska  

	* passes.c (do_per_function_toporder): Push to cfun before
	calling the pass manager.
	(execute_one_pass): Handle TODO_discard_function.
	(execute_pass_list_1): Terminate if current function is null.
	(execute_pass_list): Do not push and pop function.
	* tree-pass.h: Define new TODO_discard_function.
---
 gcc/passes.c| 32 
 gcc/tree-pass.h |  3 +++
 3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 8b3fb2f..f24fc57 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1728,7 +1728,12 @@ do_per_function_toporder (void (*callback) (function *, void *data), void *data)
 	  order[i] = NULL;
 	  node->process = 0;
 	  if (node->has_gimple_body_p ())
-	callback (DECL_STRUCT_FUNCTION (node->decl), data);
+	{
+	  struct function *fn = DECL_STRUCT_FUNCTION (node->decl);
+	  push_cfun (fn);
+	  callback (fn, data);
+	  pop_cfun ();
+	}
 	}
   symtab->remove_cgraph_removal_hook (hook);
 }
@@ -2378,6 +2383,23 @@ execute_one_pass (opt_pass *pass)
 
   current_pass = NULL;
 
+  if (todo_after & TODO_discard_function)
+{
+  gcc_assert (cfun);
+  /* As cgraph_node::release_body expects release dominators info,
+	 we have to release it.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+	free_dominance_info (CDI_DOMINATORS);
+
+  if (dom_info_available_p (CDI_POST_DOMINATORS))
+	free_dominance_info (CDI_POST_DOMINATORS);
+
+  cgraph_node::get (current_function_decl)->release_body ();
+
+  current_function_decl = NULL;
+  set_cfun (NULL);
+}
+
   /* Signal this is a suitable GC collection point.  */
   if (!((todo_after | pass->todo_flags_finish) & TODO_do_not_ggc_collect))
 ggc_collect ();
@@ -2392,6 +2414,9 @@ execute_pass_list_1 (opt_pass *pass)
 {
   gcc_assert (pass->type == GIMPLE_PASS
 		  || pass->type == RTL_PASS);
+
+  if (cfun == NULL)
+	return;
   if (execute_one_pass (pass) && pass->sub)
 execute_pass_list_1 (pass->sub);
 
@@ -2403,14 +2428,13 @@ execute_pass_list_1 (opt_pass *pass)
 void
 execute_pass_list (function *fn, opt_pass *pass)
 {
-  push_cfun (fn);
+  gcc_assert (fn == cfun);
   execute_pass_list_1 (pass);
-  if (fn->cfg)
+  if (cfun && fn->cfg)
 {
   free_dominance_info (CDI_DOMINATORS);
   free_dominance_info (CDI_POST_DOMINATORS);
 }
-  pop_cfun ();
 }
 
 /* Write out all LTO data.  */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 7a5f476..2627df3 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -296,6 +296,9 @@ protected:
 /* Rebuild the callgraph edges.  */
 #define TODO_rebuild_cgraph_edges   (1 << 22)
 
+/* Release function body and stop pass manager.  */
+#define TODO_discard_function		(1 << 23)
+
 /* Internally used in execute_function_todo().  */
 #define TODO_update_ssa_any		\
 (TODO_update_ssa			\
-- 
2.6.2



Re: Try to update dominance info in tree-call-cdce.c

2015-10-30 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Oct 30, 2015 at 1:14 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
>>>  wrote:
>>>> The pass would free the dominance info after making a change, but it
>>>> should be pretty easy to keep the information up-to-date when the call
>>>> has no EH edges.  In a way the main hurdle was split_block, which seemed
>>>> to assume that the new block would postdominate the old one, and that
>>>> all blocks immediately dominated by the old block are now immediately
>>>> dominated by the new one.
>>>>
>>>> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>>>> OK to install?
>>>
>>> Hmm, I don't understand why split_block needs to be touched.  The
>>> operation itself correctly updates dominator info.  It is up to the
>>> pass to fix things up if it does further CFG modifications that make
>>> the new block no longer post-dominate the old one.
>>>
>>> So why do you need the split_block change?
>>
>> The updates we'd need here would be:
>>
>> redirect_immediate_dominators (CDI_DOMINATORS, call, guard_bb);
>>
>> which undoes the earlier:
>>
>> redirect_immediate_dominators (CDI_DOMINATORS, guard_bb, call);
>>
>> that split_block did.  It just seemed wasteful to call
>> redirect_immediate_dominators twice to get a no-op.
>>
>> In other words, there are going to be callers to split_block that
>> know the second block isn't going to postdominate the first and
>> where calling:
>>
>> redirect_immediate_dominators (CDI_DOMINATORS, first_block,
>>second_block);
>>
>> is taking us further from where we want to be.
>
> That's true.  In an ideal world we'd have a CFG hook creating a
> (half) diamond directly.
>
> I wonder how other passes work around this issue?  I suppose
> they are splitting the block to form the conditional block and the
> joiner?  If you have those and then only split the fallthru edge
> between them the redundant work done is minimal.

Yeah, but that then makes the code more complicated because you
still need to split the call block from the guard block rather than
the joiner in the EH case.  It also means that you need to do more
set_immediate_dominators (the call is not the immediate dominator
of the join block), whereas with the current code that falls out
naturally.

Is the split_block change really so bad?
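
The "two redirects that cancel out" point can be seen on a toy
immediate-dominator table.  This is only an illustrative sketch:
redirect_idoms_sketch is a made-up stand-in for
redirect_immediate_dominators, and blocks are plain indices into a vector
rather than basic_block pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// idom[b] is the immediate dominator of block b, or -1 for the entry block.
// Toy stand-in: every block immediately dominated by FROM becomes
// immediately dominated by TO.
void redirect_idoms_sketch (std::vector<int> &idom, int from, int to)
{
  for (std::size_t b = 0; b < idom.size (); b++)
    if (idom[b] == from)
      idom[b] = to;
}

// Model what the thread describes: splitting GUARD creates a new block
// CALL, hands GUARD's dominated blocks to CALL, then makes GUARD the
// immediate dominator of CALL.  Returns the index of the new block.
int split_block_sketch (std::vector<int> &idom, int guard)
{
  int call = static_cast<int> (idom.size ());
  redirect_idoms_sketch (idom, guard, call);
  idom.push_back (guard);
  return call;
}
```

If a pass then knows that the new block will not dominate the old block's
children, it has to call redirect_idoms_sketch (idom, call, guard), which
restores exactly the entries that the split had just rewritten.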

Thanks,
Richard



Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Bernd Schmidt

On 10/30/2015 01:47 PM, Richard Biener wrote:
> On Fri, Oct 30, 2015 at 1:28 PM, Bernd Schmidt  wrote:
>>> it's not target independent code.  Are you suggesting to add a config/
>>> to libobjc?  IMHO for a not really maintained frontend / target lib that's
>>> an excessive requirement.
>>
>> If necessary, then yes that would be a better solution.
>>
>> Even just keeping the abstraction of the macro and putting definitions of it
>> inside #ifdef at the top of the file would be an improvement over the
>> submitted patch, but IMO still not really compatible with our standards.
>
> I agree, that would make the source of the copy more obvious.


If we go down that path, there's another minimum requirement - adjust 
the docs to say that the macro must be defined in two places. That's my 
main worry, someone creating a new port, completely oblivious that 
they'd have to modify library code for it to work.



Bernd


RE: [Patch] [x86_64] libgcc changes to add znver1

2015-10-30 Thread Kumar, Venkataramanan
Hi Uros,

> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Friday, October 30, 2015 2:33 PM
> To: Kumar, Venkataramanan
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [Patch] [x86_64] libgcc changes to add znver1
> 
> On Thu, Oct 29, 2015 at 2:16 PM, Kumar, Venkataramanan
>  wrote:
> > Hi Uros,
> >
> > As per your comments in
> > https://gcc.gnu.org/ml/gcc-patches/2015-09/msg02326.html please find the
> > patch that also adds changes to libgcc.
> >
> > It was bootstrapped and regression tested on x86_64.
> >
> > Ok for trunk?
> >
> > Change logs
> > gcc/ChangeLog
> >
> > 2015-10-29  Venkataramanan Kumar  
> >
> >* config/i386/i386.c (get_builtin_code_for_version): Set priority for
> >PROCESSOR_ZNVER1.
> >(enum processor_model): Add M_AMDFAM17H_znver1.
> >(struct arch_names_table): Likewise.
> >* doc/extend.texi: ADD znver1.
> >
> > libgcc/ChangeLog
> > 2015-10-12  Venkataramanan kumar  
> >
> >* config/i386/cpuinfo.c (enum processor_types): Add AMDFAM17H.
> >(processor_subtypes): Add znver1.
> >(get_amd_cpu): Detect znver1.
> 
> OK.

Thank you.  I committed the patch.
Ref: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=229575

Regards,
Venkat. 

> 
> Thanks,
> Uros.


Re: Try to update dominance info in tree-call-cdce.c

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 1:14 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
>>  wrote:
>>> The pass would free the dominance info after making a change, but it
>>> should be pretty easy to keep the information up-to-date when the call
>>> has no EH edges.  In a way the main hurdle was split_block, which seemed
>>> to assume that the new block would postdominate the old one, and that
>>> all blocks immediately dominated by the old block are now immediately
>>> dominated by the new one.
>>>
>>> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>>> OK to install?
>>
>> Hmm, I don't understand why split_block needs to be touched.  The
>> operation itself correctly updates dominator info.  It is up to the
>> pass to fix things up if it does further CFG modifications that make
>> the new block no longer post-dominate the old one.
>>
>> So why do you need the split_block change?
>
> The updates we'd need here would be:
>
> redirect_immediate_dominators (CDI_DOMINATORS, call, guard_bb);
>
> which undoes the earlier:
>
> redirect_immediate_dominators (CDI_DOMINATORS, guard_bb, call);
>
> that split_block did.  It just seemed wasteful to call
> redirect_immediate_dominators twice to get a no-op.
>
> In other words, there are going to be callers to split_block that
> know the second block isn't going to postdominate the first and
> where calling:
>
> redirect_immediate_dominators (CDI_DOMINATORS, first_block,
>second_block);
>
> is taking us further from where we want to be.

That's true.  In an ideal world we'd have a CFG hook creating a
(half) diamond directly.

I wonder how other passes work around this issue?  I suppose
they are splitting the block to form the conditional block and the
joiner?  If you have those and then only split the fallthru edge
between them the redundant work done is minimal.

Richard.

> Thanks,
> Richard
>


[PR 68064] Do not create jump functions with zero alignment

2015-10-30 Thread Martin Jambor
Hi,

in PR 68064, IPA-CP hits an assert upon encountering a jump function
claiming that a pointer has known alignment of zero.  That is actually
what get_pointer_alignment_1 returns when asked what is the alignment
of iftmp.0_1 in:

  :
  # iftmp.0_1 = PHI <0B(2), 2147483648B(3)>
  {anonymous}::fn1 (iftmp.0_1);

I suppose that, given the circumstances, it is a more-or-less reasonable
answer, even if a very weird one, so we should check for that possibility.
That is what the patch below does.  Bootstrapped and tested on
x86_64-linux.  OK for trunk?
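
As a side note on why an explicit test is needed: a zero alignment slips
through a pure "multiple of BITS_PER_UNIT" check, because 0 % 8 == 0.  The
following stand-alone sketch illustrates that; the function names and the
constant are made up for the example and assume 8-bit units.

```cpp
#include <cassert>

// Assumed value for the illustration; the real macro is target-defined.
const unsigned BITS_PER_UNIT_SKETCH = 8;

// The condition without the zero check: accepts align == 0.
bool alignment_usable_unchecked (unsigned align, unsigned bitpos)
{
  return align % BITS_PER_UNIT_SKETCH == 0
	 && bitpos % BITS_PER_UNIT_SKETCH == 0;
}

// The condition with the additional zero check: rejects align == 0.
bool alignment_usable_checked (unsigned align, unsigned bitpos)
{
  return align != 0
	 && align % BITS_PER_UNIT_SKETCH == 0
	 && bitpos % BITS_PER_UNIT_SKETCH == 0;
}
```

Both predicates agree on ordinary values such as an alignment of 64 bits;
they differ only for the zero alignment reported in the PR.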

Thanks,

Martin



2015-10-29  Martin Jambor  

PR ipa/68064
* ipa-prop.c (ipa_compute_jump_functions_for_edge): Check that
alignment is not zero.

testsuite/
* g++.dg/torture/pr68064.C: New.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 19846a8..f36e2fd 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -1651,6 +1651,7 @@ ipa_compute_jump_functions_for_edge (struct 
ipa_func_body_info *fbi,
  unsigned align;
 
  if (get_pointer_alignment_1 (arg, &align, &hwi_bitpos)
+ && align != 0
  && align % BITS_PER_UNIT == 0
  && hwi_bitpos % BITS_PER_UNIT == 0)
{
diff --git a/gcc/testsuite/g++.dg/torture/pr68064.C 
b/gcc/testsuite/g++.dg/torture/pr68064.C
new file mode 100644
index 000..59b6897
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr68064.C
@@ -0,0 +1,35 @@
+// { dg-do compile }
+
+template  class A {
+public:
+  class B;
+  typedef typename Config::template D::type TypeHandle;
+  static A *Tagged() { return B::New(B::kTagged); }
+  static TypeHandle Union(TypeHandle);
+  static TypeHandle Representation(TypeHandle, typename Config::Region *);
+  bool Is();
+};
+
+template  class A::B {
+  friend A;
+  enum { kTaggedPointer = 1 << 31, kTagged = kTaggedPointer };
+  static A *New(int p1) { return Config::from_bitset(p1); }
+};
+
+struct C {
+  typedef int Region;
+  template  struct D { typedef A *type; };
+  static A *from_bitset(unsigned);
+};
+A *C::from_bitset(unsigned p1) { return reinterpret_cast(p1); }
+
+namespace {
+int *a;
+void fn1(A *p1) { A::Union(A::Representation(p1, a)); }
+}
+
+void fn2() {
+  A b;
+  A *c = b.Is() ? 0 : A::Tagged();
+  fn1(c);
+}


Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Trevor Saunders
On Fri, Oct 30, 2015 at 01:16:16PM +0100, Richard Biener wrote:
> On Fri, Oct 30, 2015 at 1:06 PM, Bernd Schmidt  wrote:
> > On 10/30/2015 12:48 PM, tbsaunde+...@tbsaunde.org wrote:
> >>
> >> -#ifdef ADJUST_FIELD_ALIGN
> >> -  desired_align = ADJUST_FIELD_ALIGN (type, desired_align);
> >> +#if defined __arc__ || defined _AIX
> >> +  if (TYPE_MODE (strip_array_types (TREE_TYPE (type))) == DFmode)
> >> +desired_align = MIN (desired_align, 32);
> >> +#elif __POWERPC__ && __APPLE__
> >> +  if (desired_align != 128)
> >> +desired_align = MIN (desired_align, 32);
> >>   #endif
> >
> >
> > No way. We never use this kind of test in target-independent code.
> 
> it's not target independent code.  Are you suggesting to add a config/
> to libobjc?  IMHO for a not really maintained frontend / target lib that's
> an excessive requirement.

Given the amount of target-dependent stuff involved, adding a config/
actually seems worse to me.  You are accomplishing the exact same thing,
but you need a whole lot more machinery to do it, and it's hard to
understand what happens for any given platform.  Sure, if there was a
whole lot more target code, doing something else might make sense, but
there isn't and I'm certainly not planning on adding more.  So I think
it's best to leave it this way, and if someone wants to do substantial
work on libobjc in the future they can worry about that then.

btw the claim that it's never done just doesn't stand up either: look at the
__SPARC__ check this series removes, or the __MINGW__ check in gthr.h, or
even all the crap at the top of encoding.c that makes using these target
macros possible (it wouldn't actually surprise me if cleaning that up
meant doing this was a net reduction in target dependent code in
encoding.c).

If you want to be kind of sad, I discovered
https://github.com/gnustep/libobjc2 while looking at this, so it seems
like many of the possible users of libobjc may not even be using it.

> For any such replacements as in the patch I suggest to at least keep a comment
> before it indicating the origin of the inlined variants (in this case refer to
> ADJUST_FIELD_ALIGN).

That seems fairly reasonable.  I'd kind of worry about them getting out
of date, but I guess it at least gives a place to start looking for an
explanation.

> In general I'm happy with this kind of patches (maybe not the
> BIGGEST_FIELD_ALIGN
> one which could be made a CPP macro when -fbuilding-libgcc)

I considered that, but the only targets that define
BIGGEST_FIELD_ALIGNMENT for purposes other than IN_TARGET_LIBS hacks
were v850, vax, tilegx, and tilepro.  Since BIGGEST_FIELD_ALIGNMENT
kind of duplicates ADJUST_FIELD_ALIGN, my conclusion was that it would
make more sense not to do that.  I'm thinking it makes sense to instead
just merge BIGGEST_FIELD_ALIGNMENT into ADJUST_FIELD_ALIGN, but adding a
predefined macro would make that harder.

Trev

> 
> Richard.
> 
> >
> > Bernd


Re: [gomp4 04/14] nvptx: fix output of _Bool global variables

2015-10-30 Thread Alexander Monakov


On Thu, 29 Oct 2015, Bernd Schmidt wrote:

> On 10/28/2015 08:29 PM, Alexander Monakov wrote:
> 
> > Anything wrong with the simple fix: pick an integer type with the largest
> > size
> > dividing the original struct type size?
> 
> Try it and run it through the testsuite.

The following patch passes testing with

make -k check-c DEJAGNU=.../dejagnu.exp 
RUNTESTFLAGS=--target_board=nvptx-none-run

with no new regressions, and fixes 1 test: 
-FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o-c_compat_y_tst.o execute

OK?

Thanks.
Alexander

nvptx: fix chunk size selection for structure types

* config/nvptx/nvptx.c (nvptx_ptx_type_for_output): New.  Handle
COMPLEX_TYPE like ARRAY_TYPE.  Drop special handling of scalar types.
Fix handling of structure types by choosing integer type that divides
original size evenly.  Split out from and use it...
(init_output_initializer): ...here.
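
The selection rule (largest integer size that divides the object size
evenly, capped at pointer size) can be sketched on plain byte counts as
below.  This is an illustration only: chunk_size_sketch is a made-up name,
and the sizes 1, 2, 4 and 8 are assumed to correspond to char, short, int
and long on the 64-bit ABI.

```cpp
#include <cassert>

// Hypothetical helper working on byte counts instead of tree types.
// SZ is the object size in bytes (negative if unknown); ABI64 says whether
// 8-byte chunks are available.  Returns the chunk size in bytes.
unsigned chunk_size_sketch (long sz, bool abi64)
{
  if (sz < 0)
    return 1;			// unknown size: fall back to bytes
  if (sz % 8 == 0 && abi64)
    return 8;
  if (sz % 4 == 0)
    return 4;
  if (sz % 2 == 0)
    return 2;
  return 1;
}
```

For instance, a 12-byte structure is emitted in 4-byte chunks even on the
64-bit ABI, because 8 does not divide 12.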

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index b541666..3a0cac2 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1692,6 +1692,29 @@ nvptx_assemble_decl_end (void)
   fprintf (asm_out_file, ";\n");
 }
 
+/* Return a type suitable to output initializers for TYPE.  */
+static const_tree
+nvptx_ptx_type_for_output (const_tree type)
+{
+  /* Avoid picking a larger type than the underlying type.  */
+  if (TREE_CODE (type) == ARRAY_TYPE
+  || TREE_CODE (type) == COMPLEX_TYPE)
+type = TREE_TYPE (type);
+  int sz = int_size_in_bytes (type);
+  if (sz < 0)
+return char_type_node;
+  /* Size of the output type must divide that of original type.  Initializers
+ with pointers to objects need a pointer-sized type.  These requirements
+ may be contradictory for packed structs, but giving priority to first at
+ least allows to output some initializers correctly.  Here we pick largest
+ suitable integer type without deeper inspection.  */
+  return (sz % 8 || !TARGET_ABI64
+ ? (sz % 4
+? (sz % 2 ? char_type_node : short_integer_type_node)
+: integer_type_node)
+ : long_integer_type_node);
+}
+
 /* Start a declaration of a variable of TYPE with NAME to
FILE.  IS_PUBLIC says whether this will be externally visible.
Here we just write the linker hint and decide on the chunk size
@@ -1705,15 +1728,7 @@ init_output_initializer (FILE *file, const char *name, 
const_tree type,
   assemble_name_raw (file, name);
   fputc ('\n', file);
 
-  if (TREE_CODE (type) == ARRAY_TYPE)
-type = TREE_TYPE (type);
-  int sz = int_size_in_bytes (type);
-  if ((TREE_CODE (type) != INTEGER_TYPE
-   && TREE_CODE (type) != ENUMERAL_TYPE
-   && TREE_CODE (type) != REAL_TYPE)
-  || sz < 0
-  || sz > HOST_BITS_PER_WIDE_INT)
-type = ptr_type_node;
+  type = nvptx_ptx_type_for_output (type);
   decl_chunk_size = int_size_in_bytes (type);
   decl_chunk_mode = int_mode_for_mode (TYPE_MODE (type));
   decl_offset = 0;



Re: [gomp4] openacc reduction simplification

2015-10-30 Thread Thomas Schwinge
Hi Nathan!

On Thu, 29 Oct 2015 13:28:02 -0700, Nathan Sidwell  wrote:
> I've committed this to gomp4.   It removes a no-longer needed field from 
> omp_context &  simplifies the dummy head/tail generation needed for 
> reductions 
> at the outermost level.  Also incorporates the simplification I committed to 
> trunk  earlier today.

>   * omp-low.c (struct omp_context): Remove reductions field.
>   (scan_sharing_clauses): Don't increment it.
>   (lower_omp_target): Don't check it.  Move openacc dummy gang head
>   & tail generation later & simplify.  Merge ifs.

Thanks for clearing that up.  When working through merge conflicts of a
recent merge from trunk into gomp-4_0-branch, I had noticed the same
oddities (unreachable "if" condition, too complicated reductions
handling), but at that time couldn't allocate time for working on it.


Grüße
 Thomas




[gomp4, committed] Backport tree-ssa-structalias.c fixes from trunk

2015-10-30 Thread Tom de Vries

Hi,

this patch backports my commits to trunk of this week in 
tree-ssa-structalias.c.


Committed to gomp-4_0-branch.

Thanks,
- Tom
Backport tree-ssa-structalias.c fixes from trunk

2015-10-30  Tom de Vries  

	backport from trunk:
	2015-10-30  Tom de Vries  

	* tree-ssa-structalias.c (ipa_pta_execute): Declare variable from as
	unsigned, and initialize, and use initial value instead of hardcoded
	constant.  Add generic constraints dumping section.  Don't dump global
	initializers constraints dumping section if empty.  Don't update
	variable from if unused.

	2015-10-28  Tom de Vries  

	* tree-ssa-structalias.c (intra_create_variable_infos): Remove
	superfluous code.

	* tree-ssa-structalias.c (intra_create_variable_infos): Don't iterate
	into vi_next of a full_var.

	* tree-ssa-structalias.c (new_var_info, make_heapvar)
	(make_constraint_from_restrict, make_constraint_from_global_restrict)
	(create_function_info_for, create_variable_info_for_1)
	(create_variable_info_for): Add and handle add_id parameter.
	(get_call_vi, new_scalar_tmp_constraint_exp, handle_rhs_call)
	(init_base_vars): Add extra argument to calls to new_var_info.
	(get_vi_for_tree): Add extra argument to call to
	create_variable_info_for.
	(process_constraint, do_deref, process_all_all_constraints): Add extra
	argument to calls to new_scalar_tmp_constraint_exp.
	(handle_lhs_call, find_func_aliases_for_builtin_call): Add extra
	argument to calls to make_heapvar.
	(make_restrict_var_constraints): Add extra argument to call to
	make_constraint_from_global_restrict.
	(intra_create_variable_infos): Add extra argument to call to
	create_variable_info_for_1.
	(ipa_pta_execute): Add extra argument to call to
	create_function_info_for.

	* gcc.dg/tree-ssa/pta-callused.c: Update to scan for CALLUSED(id).

	2015-10-27  Tom de Vries  

	* tree-ssa-structalias.c (push_fields_onto_fieldstack): Add and use var
	field_type.

	2015-10-26  Tom de Vries  

	* tree-ssa-structalias.c (make_restrict_var_constraints): New function,
	factored out of ...
	(intra_create_variable_infos): ... here.

	* tree-ssa-structalias.c (intra_create_variable_infos): Add
	restrict_pointer_p and recursive_restrict_p variables.

	* tree-ssa-structalias.c (intra_create_variable_infos): Inline
	get_vi_for_tree call.

	2015-10-23  Tom de Vries  

	* tree-ssa-structalias.c (intra_create_variable_infos): Use
	make_constraint_from.

	* tree-ssa-structalias.c (create_variable_info_for_1): Add missing
	setting of is_full_var in case of a single field.
---
 gcc/ChangeLog.gomp   |  65 +
 gcc/testsuite/ChangeLog.gomp |   7 +
 gcc/testsuite/gcc.dg/tree-ssa/pta-callused.c |   2 +-
 gcc/tree-ssa-structalias.c   | 203 +++
 4 files changed, 189 insertions(+), 88 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pta-callused.c b/gcc/testsuite/gcc.dg/tree-ssa/pta-callused.c
index 59408fa..b9a57d8 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pta-callused.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pta-callused.c
@@ -22,5 +22,5 @@ int bar (int b)
   return *foo ();
 }
 
-/* { dg-final { scan-tree-dump "CALLUSED = { ESCAPED NONLOCAL f.* i q }" "alias" } } */
+/* { dg-final { scan-tree-dump "CALLUSED\\(\[0-9\]+\\) = { ESCAPED NONLOCAL f.* i q }" "alias" } } */
 
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 8d86dcb..f5e17a3 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -220,7 +220,7 @@ static bitmap_obstack oldpta_obstack;
 /* Used for per-solver-iteration bitmaps.  */
 static bitmap_obstack iteration_obstack;
 
-static unsigned int create_variable_info_for (tree, const char *);
+static unsigned int create_variable_info_for (tree, const char *, bool);
 typedef struct constraint_graph *constraint_graph_t;
 static void unify_nodes (constraint_graph_t, unsigned int, unsigned int, bool);
 
@@ -361,11 +361,18 @@ enum { nothing_id = 1, anything_id = 2, string_id = 3,
to the vector of variable info structures.  */
 
 static varinfo_t
-new_var_info (tree t, const char *name)
+new_var_info (tree t, const char *name, bool add_id)
 {
   unsigned index = varmap.length ();
   varinfo_t ret = variable_info_pool.allocate ();
 
+  if (dump_file && add_id)
+    {
+      char *tempname = xasprintf ("%s(%d)", name, index);
+      name = ggc_strdup (tempname);
+      free (tempname);
+    }
+
   ret->id = index;
   ret->name = name;
   ret->decl = t;
@@ -416,13 +423,13 @@ get_call_vi (gcall *call)
   if (existed)
 return *slot_p;
 
-  vi = new_var_info (NULL_TREE, "CALLUSED");
+  vi = new_var_info (NULL_TREE, "CALLUSED", true);
   vi->offset = 0;
   vi->size = 1;
   vi->fullsize = 2;
   vi->is_full_var = true;
 
-  vi2 = new_var_info (NULL_TREE, "CALLCLOBBERED");
+  vi2 = new_var_info (NULL_TREE, "CALLCLOBBERED", true);
   

Re: [PATCH] New attribute to create target clones

2015-10-30 Thread Evgeny Stupachenko
I've fixed the typo and the vertical spacing.
I'll ask for the patch to be committed once the x86 bootstrap and make check have finished.

Thanks,
Evgeny

Updated ChangeLog:

2015-10-30  Evgeny Stupachenko  

gcc/
* Makefile.in (OBJS): Add multiple_target.o.
* attrib.c (make_attribute): Moved from config/i386/i386.c
* config/i386/i386.c (make_attribute): Deleted.
* multiple_target.c (create_dispatcher_calls): New.
(get_attr_len): Ditto.
(get_attr_str): Ditto.
(separate_attrs): Ditto.
(is_valid_asm_symbol): Ditto.
(create_new_asm_name): Ditto.
(create_target_clone): Ditto.
(expand_target_clones): Ditto.
(ipa_target_clone): Ditto.
(ipa_dispatcher_calls): Ditto.
* passes.def (pass_target_clone): Two new ipa passes.
* tree-pass.h (make_pass_target_clone): Ditto.

gcc/c-family
* c-common.c (handle_target_clones_attribute): New.
* (c_common_attribute_table): Add handle_target_clones_attribute.
* (handle_always_inline_attribute): Add check on target_clones
attribute.
* (handle_target_attribute): Ditto.

gcc/testsuite
* gcc.dg/mvc1.c: New test for multiple targets cloning.
* gcc.dg/mvc2.c: Ditto.
* gcc.dg/mvc3.c: Ditto.
* gcc.dg/mvc4.c: Ditto.
* gcc.dg/mvc5.c: Ditto.
* gcc.dg/mvc6.c: Ditto.
* gcc.dg/mvc7.c: Ditto.
* g++.dg/ext/mvc1.C: Ditto.
* g++.dg/ext/mvc2.C: Ditto.
* g++.dg/ext/mvc3.C: Ditto.
* g++.dg/ext/mvc4.C: Ditto.

gcc/doc
* doc/extend.texi (target_clones): New attribute description.

On Fri, Oct 30, 2015 at 8:27 AM, Jeff Law  wrote:
> On 10/29/2015 12:13 PM, Evgeny Stupachenko wrote:
>>
>> On Thu, Oct 29, 2015 at 8:02 PM, Jan Hubicka  wrote:

 >>Yes. This is not necessary. However that way we'll have the following
 >>code in dispatcher:
 >> cmpl    $6, __cpu_model+4(%rip)
 >> sete    %al
 >> movzbl  %al, %eax
 >> testl   %eax, %eax
 >> jle     .L16
 >> movl    $foo.target_clone.1, %eax
 >>I think it is very hard to read and debug such...
 >>
 >>While now we have:
 >>
 >> cmpl    $6, __cpu_model+4(%rip)
 >> sete    %al
 >> movzbl  %al, %eax
 >> testl   %eax, %eax
 >> jle     .L16
 >> movl    $foo.arch_slm, %eax
 >>
 >>and it is clear that we are jumping to SLM code here.
 >>I'd like to keep target in names.
>>>
>>> >
>>> >I am not against more informative names, but why don't you pass the info
>>> > here:
>>> >
>>> >+create_target_clone (cgraph_node *node, bool definition)
>>> >+{
>>> >+  cgraph_node *new_node;
>>> >+  if (definition)
>>> >+{
>>> >+  new_node = node->create_version_clone_with_body (vNULL, NULL,
>>> >+  NULL, false,
>>> >+  NULL, NULL,
>>> >+  "target_clone");
>>> >+  new_node->force_output = true;
>>> >+}
>>> >+  else
>>> >+{
>>> >+  tree new_decl = copy_node (node->decl);
>>> >+  new_node = cgraph_node::get_create (new_decl);
>>> >+}
>>> >+  return new_node;
>>> >+}
>>> >
>>> >passing "arch_slm" instead of target_clone will get you the name you
>>> > want
>>> >(plus the extra index that may be needed anyway to disambiguate).
>>> >
>>> >Note that in general those .suffixes should be machine parseable, so
>>> > cp-demangle.c
>>> >can expand them correctly.  We may want to have some consistent grammar
>>> > for them here
>>> >and update cp-demangle.c to output nice info like "target clone for..."
>>
>> Ok. I've modified the patch correspondingly.
>
> You'll need updated ChangeLog entries.  Don't forget to drop the omp-low
> spurious whitespace change.
>
> You should also fix the formatting nits Jan pointed out.
>
> With those changes, this patch is OK for the trunk.  I'll run the header
> file reordering & cleanup tool after the patch is committed to the trunk.
>
> jeff
>
>


Re: [PATCH 1/5] 2015-01-25 Paul Thomas <pa...@gcc.gnu.org>

2015-10-30 Thread Trevor Saunders
On Fri, Oct 30, 2015 at 01:10:32PM +0100, Bernd Schmidt wrote:
> On 10/30/2015 12:48 PM, tbsaunde+...@tbsaunde.org wrote:
> >From: pault 
> >
> > PR fortran/67171
> > * trans-array.c (structure_alloc_comps): On deallocation of
> > class components, reset the vptr to the declared type vtable
> > and reset the _len field of unlimited polymorphic components.
> > * trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
> > allocatable component references to the right of part reference
> > with non-zero rank and return NULL.
> > (gfc_reset_vptr): Simplify this function by using the function
> > gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
> > (gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
> > NULL return.
> > * trans-stmt.c (gfc_trans_allocate): Rely on the use of
> > gfc_trans_assignment if expr3 is a variable expression since
> > this deals correctly with array sections.
> 
> There's no explanation of this patch or how it relates to the others in this
> series. Did you send the wrong patch?

yes, sorry about that.

Trev

> 
> 
> Bernd
> 


Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Bernd Schmidt


> it's not target independent code.  Are you suggesting to add a config/
> to libobjc?  IMHO for a not really maintained frontend / target lib that's
> an excessive requirement.


If necessary, then yes that would be a better solution.

Even just keeping the abstraction of the macro and putting definitions 
of it inside #ifdef at the top of the file would be an improvement over 
the submitted patch, but IMO still not really compatible with our standards.



Bernd



Re: [gomp4] declare directive [3/5]

2015-10-30 Thread Thomas Schwinge
Hi!

On Mon, 8 Jun 2015 10:04:11 -0500, James Norris  
wrote:
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1174,6 +1183,7 @@ enum
>OMP_LIST_FROM,
>OMP_LIST_REDUCTION,
>OMP_LIST_DEVICE_RESIDENT,
> +  OMP_LIST_LINK,
>OMP_LIST_USE_DEVICE,
>OMP_LIST_CACHE,
>OMP_LIST_NUM

I noticed (by means of hitting a segmentation fault) that this was
missing an update to the clause_names in
gcc/fortran/openmp.c:resolve_omp_clauses.  (Yes, I agree that it is a
strange, non-obvious dependency that this function needs to be updated
for OMP_LIST_* changes...)  Fixed on gomp-4_0-branch in r229576:

commit a5246d7b6c91e0800eeb6355bf5e4c63d27aafb2
Author: tschwinge 
Date:   Fri Oct 30 13:24:35 2015 +

Fix OMP_LIST_LINK handling

gcc/fortran/
* openmp.c (resolve_omp_clauses): Add "LINK" to clause_names.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229576 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog.gomp | 4 
 gcc/fortran/openmp.c   | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp
index 7fe3eac..592dd8d 100644
--- gcc/fortran/ChangeLog.gomp
+++ gcc/fortran/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2015-10-30  Thomas Schwinge  
+
+   * openmp.c (resolve_omp_clauses): Add "LINK" to clause_names.
+
 2015-10-29  Thomas Schwinge  
 
* openmp.c (gfc_match_omp_map_clause): Remove allow_sections
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index a2c5105..32779f7 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -3197,7 +3197,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
   static const char *clause_names[]
 = { "PRIVATE", "FIRSTPRIVATE", "LASTPRIVATE", "COPYPRIVATE", "SHARED",
"COPYIN", "UNIFORM", "ALIGNED", "LINEAR", "DEPEND", "MAP",
-   "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "USE_DEVICE",
+   "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE",
"CACHE" };
 
   if (omp_clauses == NULL)
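
The hidden dependency between the OMP_LIST_* enum and clause_names can be turned into a build failure instead of a segmentation fault.  What follows is only a sketch under assumptions: the enum is a shortened stand-in, and every demo_* name is hypothetical rather than taken from gfortran.  It shows a compile-time check that the name table has exactly one entry per enumerator.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened stand-in for the OMP_LIST_* enum.  */
enum demo_list
{
  DEMO_LIST_REDUCTION,
  DEMO_LIST_DEVICE_RESIDENT,
  DEMO_LIST_LINK,
  DEMO_LIST_USE_DEVICE,
  DEMO_LIST_CACHE,
  DEMO_LIST_NUM
};

/* Name table indexed by enum demo_list; the order must match the enum.  */
static const char *const demo_clause_names[] =
  { "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE", "CACHE" };

/* Refuse to compile if an enumerator is added without a matching name.  */
_Static_assert (sizeof demo_clause_names / sizeof demo_clause_names[0]
		== DEMO_LIST_NUM,
		"demo_clause_names needs one entry per demo_list value");

/* Look up the printable name for LIST.  */
static const char *
demo_clause_name (enum demo_list list)
{
  assert (list < DEMO_LIST_NUM);
  return demo_clause_names[list];
}
```

A size check like this only catches a missing or extra entry; it cannot detect two names listed in the wrong order.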


Regards
 Thomas




Re: [PATCH][PR tree-optimization/67892] Use FSM threader to handle backedges

2015-10-30 Thread Andreas Schwab
I'm getting this regression on m68k:

FAIL: gcc.dg/tree-ssa/ssa-thread-11.c scan-tree-dump vrp2 "FSM"

The generated code looks equivalent, though.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH 1/2] s/390: Implement "target" attribute.

2015-10-30 Thread Andreas Krebbel
Hi Dominik,

on-top of the discussions we had off-list I only have a few additional
comments/questions.

Apart from that the patch looks good to me. Thanks!

Bye,

-Andreas-

> diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
> index 43459c8..4cf0df7 100644
> --- a/gcc/common/config/s390/s390-common.c
> +++ b/gcc/common/config/s390/s390-common.c
> @@ -79,41 +79,27 @@ s390_option_init_struct (struct gcc_options *opts)
>
>  /* Implement TARGET_HANDLE_OPTION.  */
>
> -static bool
> -s390_handle_option (struct gcc_options *opts,
> +bool
> +s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED,
>   struct gcc_options *opts_set ATTRIBUTE_UNUSED,
>   const struct cl_decoded_option *decoded,
>   location_t loc)
>  {
>size_t code = decoded->opt_index;
> -  const char *arg = decoded->arg;
>int value = decoded->value;
>
>switch (code)
>  {
> -case OPT_march_:
> -  opts->x_s390_arch_flags = processor_flags_table[value];
> -  opts->x_s390_arch_string = arg;
> -  return true;
> -
>  case OPT_mstack_guard_:
> -  if (exact_log2 (value) == -1)
> +  if (value != 0 && exact_log2 (value) == -1)
>   error_at (loc, "stack guard value must be an exact power of 2");
>return true;
>
>  case OPT_mstack_size_:
> -  if (exact_log2 (value) == -1)
> +  if (value != 0 && exact_log2 (value) == -1)
>   error_at (loc, "stack size must be an exact power of 2");
>return true;

This is presumably meant to allow disabling the stack-guard and
stack-size options by setting them to 0. Would removing the
`RejectNegative' in s390.opt be an option? I'm not sure, but perhaps we
discussed this off-list already.
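
For reference, the patched condition accepts 0 in addition to exact powers of two.  The sketch below restates that check in isolation; demo_exact_log2 is a hypothetical stand-in written for this example and is not GCC's exact_log2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for exact_log2: return log2 of VALUE if VALUE is
   an exact power of two, otherwise -1.  */
static int
demo_exact_log2 (unsigned long value)
{
  if (value == 0 || (value & (value - 1)) != 0)
    return -1;

  int log = 0;
  while (value >>= 1)
    log++;
  return log;
}

/* Mirror the patched test: reject VALUE only if it is nonzero and not an
   exact power of two, so that 0 can act as "disabled".  */
static bool
demo_stack_value_ok (unsigned long value)
{
  return !(value != 0 && demo_exact_log2 (value) == -1);
}
```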

...
> +/* Helper function that defines or undefines macros.  If SET is true, the macro
> +   MACRO_DEF is defined.  If SET is false, the macro MACRO_UNDEF is undefined.
> +   Nothing is done if SET and WAS_SET have the same value.  */
> +static void
> +s390_def_or_undef_macro (cpp_reader *pfile, bool set, bool was_set,
> +  const char *macro_def, const char *macro_undef)
>  {
> -  cpp_assert (pfile, "cpu=s390");
> -  cpp_assert (pfile, "machine=s390");
> -  cpp_define (pfile, "__s390__");
> -  if (TARGET_ZARCH)
> -cpp_define (pfile, "__zarch__");
> -  if (TARGET_64BIT)
> -cpp_define (pfile, "__s390x__");
> -  if (TARGET_LONG_DOUBLE_128)
> -cpp_define (pfile, "__LONG_DOUBLE_128__");
> -  if (TARGET_HTM)
> -cpp_define (pfile, "__HTM__");
> -  if (TARGET_ZVECTOR)
> +  if (set == was_set)
> +return;
> +  if (set)
> +cpp_define (pfile, macro_def);
> +  else
> +cpp_undef (pfile, macro_undef);
> +}
> +
> +/* Internal function to either define or undef the appropriate system
> +   macros.  */
> +static void
> +s390_cpu_cpp_builtins_internal (cpp_reader *pfile,
> + struct cl_target_option *opts,
> + struct cl_target_option *old_opts)
> +{
> +  bool old;
> +  bool set;
> +
> +  old = (!old_opts) ? false : TARGET_HTM_P (old_opts);
> +  set = TARGET_HTM_P (opts);
> +  s390_def_or_undef_macro (pfile, set, old, "__HTM__", "__HTM__");
> +
> +  old = (!old_opts) ? false : TARGET_ZVECTOR_P (old_opts->x_target_flags);
> +  set = TARGET_ZVECTOR_P (opts->x_target_flags);
> +  s390_def_or_undef_macro (pfile, set, old, "__VEC__=10301", "__VEC__");
> +  s390_def_or_undef_macro (pfile, set, old,
> +"__vector=__attribute__((vector_size(16)))",
> +"__vector__");
> +  s390_def_or_undef_macro (pfile, set, old,
> +"__bool=__attribute__((s390_vector_bool)) unsigned",
> +"__bool");
> +  if (!flag_iso)
>  {
> -  cpp_define (pfile, "__VEC__=10301");
> -  cpp_define (pfile, "__vector=__attribute__((vector_size(16)))");
> -  cpp_define (pfile, "__bool=__attribute__((s390_vector_bool)) unsigned");
> +  s390_def_or_undef_macro (pfile, set, old, "__VECTOR_KEYWORD_SUPPORTED__",
> +"__VECTOR_KEYWORD_SUPPORTED__");
> +  s390_def_or_undef_macro (pfile, set, old, "vector=vector", "vector");
> +  s390_def_or_undef_macro (pfile, set, old, "bool=bool", "bool");
>
> -  if (!flag_iso)
> +  if (set && __vector_keyword == NULL)
>   {
> -   cpp_define (pfile, "__VECTOR_KEYWORD_SUPPORTED__");
> -   cpp_define (pfile, "vector=vector");
> -   cpp_define (pfile, "bool=bool");
> -
> __vector_keyword = get_identifier ("__vector");
> C_CPP_HASHNODE (__vector_keyword)->flags |= NODE_CONDITIONAL;
>

Slightly better:

static void
s390_def_or_undef_macro (cpp_reader *pfile,
 int mask,
 struct cl_target_option *newopts,
 struct cl_target_option *oldopts,
 const char *macro_def, const char *macro_undef)
{
  bool old;
  bool set;

  old = 

Re: [PR 68064] Do not create jump functions with zero alignment

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 1:44 PM, Martin Jambor  wrote:
> Hi,
>
> in PR 68064, IPA-CP hits an assert upon encountering a jump function
> claiming that a pointer has known alignment of zero.  That is actually
> what get_pointer_alignment_1 returns when asked what is the alignment
> of iftmp.0_1 in:
>
>   :
>   # iftmp.0_1 = PHI <0B(2), 2147483648B(3)>
>   {anonymous}::fn1 (iftmp.0_1);
>
> I suppose that given the circumstances, it is more-or-less reasonable
> answer, even if very weird, so we should check for that possibility.
> That is what the patch below does.  Bootstrapped and tested on
> x86_64-linux.  OK for trunk?

Hmm.  So it's overflowing to zero at

  *alignp = ptr_align * BITS_PER_UNIT;

and I'm not sure how many callers are affected.  Having alignment in
bits is of course odd in the first place.

A safe fix would be to have set_ptr_info_alignment cap the alignment
at a value that is safe to scale to bit alignment.  A very forward-looking
fix would be to make alignment be in BITS_PER_UNIT throughout
the whole compiler ...
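
To make the wrap-around concrete, here is a small sketch.  It assumes a 32-bit unsigned int, 8 bits per unit, and a byte alignment of 2^31 as the PHI constant above suggests; the demo_* names are hypothetical and the capping helper only illustrates the idea, it is not the proposed set_ptr_info_alignment change.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for this sketch: 8 bits per addressable unit.  */
#define DEMO_BITS_PER_UNIT 8u

/* Scale a byte alignment to bits the naive way; unsigned arithmetic
   silently wraps around on overflow.  */
static unsigned int
demo_align_in_bits (unsigned int byte_align)
{
  return byte_align * DEMO_BITS_PER_UNIT;
}

/* Cap the byte alignment at the largest power of two whose bit
   alignment still fits in unsigned int, then scale.  */
static unsigned int
demo_capped_align_in_bits (unsigned int byte_align)
{
  const unsigned int max_byte_align
    = (UINT_MAX / DEMO_BITS_PER_UNIT + 1) / 2;

  if (byte_align > max_byte_align)
    byte_align = max_byte_align;
  return byte_align * DEMO_BITS_PER_UNIT;
}
```

With those assumptions, demo_align_in_bits (2147483648u) yields 0, while the capped variant yields 2147483648 (2^28 bytes scaled to bits).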

Richard.

> Thanks,
>
> Martin
>
>
>
> 2015-10-29  Martin Jambor  
>
> PR ipa/68064
> * ipa-prop.c (ipa_compute_jump_functions_for_edge): Check that
> alignment is not zero.
>
> testsuite/
> * g++.dg/torture/pr68064.C: New.
>
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 19846a8..f36e2fd 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -1651,6 +1651,7 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
>   unsigned align;
>
>   if (get_pointer_alignment_1 (arg, &align, &hwi_bitpos)
> + && align != 0
>   && align % BITS_PER_UNIT == 0
>   && hwi_bitpos % BITS_PER_UNIT == 0)
> {
> diff --git a/gcc/testsuite/g++.dg/torture/pr68064.C b/gcc/testsuite/g++.dg/torture/pr68064.C
> new file mode 100644
> index 000..59b6897
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr68064.C
> @@ -0,0 +1,35 @@
> +// { dg-do compile }
> +
> +template  class A {
> +public:
> +  class B;
> +  typedef typename Config::template D::type TypeHandle;
> +  static A *Tagged() { return B::New(B::kTagged); }
> +  static TypeHandle Union(TypeHandle);
> +  static TypeHandle Representation(TypeHandle, typename Config::Region *);
> +  bool Is();
> +};
> +
> +template  class A::B {
> +  friend A;
> +  enum { kTaggedPointer = 1 << 31, kTagged = kTaggedPointer };
> +  static A *New(int p1) { return Config::from_bitset(p1); }
> +};
> +
> +struct C {
> +  typedef int Region;
> +  template  struct D { typedef A *type; };
> +  static A *from_bitset(unsigned);
> +};
> +A *C::from_bitset(unsigned p1) { return reinterpret_cast(p1); }
> +
> +namespace {
> +int *a;
> +void fn1(A *p1) { A::Union(A::Representation(p1, a)); }
> +}
> +
> +void fn2() {
> +  A b;
> +  A *c = b.Is() ? 0 : A::Tagged();
> +  fn1(c);
> +}


[PR68083] stop ifcombine from moving uninitialized uses before their guards

2015-10-30 Thread Alexandre Oliva
The ifcombine pass may move a conditional access to an uninitialized
value before the condition that ensures it is always well-defined,
thus introducing undefined behavior.  Stop it from doing so.
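
As a rough source-level illustration of the transformation (the pass works on GIMPLE, so this is only an analogy, and the demo_* functions are hypothetical): the nested form reads k only when the outer condition holds, while the combined form reads it unconditionally.  The two agree whenever k is defined, which is all the sketch below exercises.

```c
#include <assert.h>

/* Nested form: K is read only when C < 1 holds.  */
static int
demo_nested (int c, int k)
{
  if (c < 1)
    if (k)
      c = 0;
  return c;
}

/* Combined form, in the style ifcombine produces: both conditions are
   evaluated up front, so K is read even when C >= 1.  If K were
   uninitialized on that path, this read would be the new undefined
   use that the patch is meant to prevent.  */
static int
demo_combined (int c, int k)
{
  if ((c < 1) & (k != 0))
    c = 0;
  return c;
}
```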

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?


Incidentally, bb_no_side_effects_p (inner_cond_bb) is called in all four
tests in tree_ssa_ifcombine_bb_1, for each outer_cond_bb that
tree_ssa_ifcombine_bb might choose.  Is there any reason to not factor
it out to the test that checks whether the inner_cond_bb is indeed an
if_then_else block, early in tree_ssa_ifcombine_bb, so as to
short-circuit the whole thing when the inner block is not viable?


for  gcc/ChangeLog

PR tree-optimization/68083
* tree-ssa-ifcombine.c: Include tree-ssa.h.
(bb_no_side_effects_p): Test for undefined uses too.
* tree-ssa.c (gimple_uses_undefined_value_p): New.
* tree-ssa.h (gimple_uses_undefined_value_p): Declare.

for  gcc/testsuite/ChangeLog

PR tree-optimization/68083
* gcc.dg/torture/pr68083.c: New.  From Zhendong Su.
---
 gcc/testsuite/gcc.dg/torture/pr68083.c |   35 
 gcc/tree-ssa-ifcombine.c   |2 ++
 gcc/tree-ssa.c |   18 
 gcc/tree-ssa.h |1 +
 4 files changed, 56 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr68083.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr68083.c b/gcc/testsuite/gcc.dg/torture/pr68083.c
new file mode 100644
index 000..ae24781
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr68083.c
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+
+int a = 2, b = 1, c = 1;
+
+int
+fn1 ()
+{
+  int d;
+  for (; a; a--)
+{
+  for (d = 0; d < 4; d++)
+   {
+ int k;
+ if (c < 1)
+   if (k)
+ c = 0;
+ if (b)
+   continue;
+ return 0;
+   }
+  b = !1;
+}
+  return 0;
+}
+
+int
+main ()
+{
+  fn1 ();
+
+  if (a != 1)
+__builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/tree-ssa-ifcombine.c b/gcc/tree-ssa-ifcombine.c
index ca55b57..622dc6b 100644
--- a/gcc/tree-ssa-ifcombine.c
+++ b/gcc/tree-ssa-ifcombine.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimplify-me.h"
 #include "tree-cfg.h"
+#include "tree-ssa.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -125,6 +126,7 @@ bb_no_side_effects_p (basic_block bb)
continue;
 
   if (gimple_has_side_effects (stmt)
+ || gimple_uses_undefined_value_p (stmt)
  || gimple_could_trap_p (stmt)
  || gimple_vuse (stmt))
return false;
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index c7be442..8dc2d61 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1210,6 +1210,24 @@ ssa_undefined_value_p (tree t, bool partial)
 }
 
 
+/* Return TRUE iff STMT, a gimple statement, references an undefined
+   SSA name.  */
+
+bool
+gimple_uses_undefined_value_p (gimple *stmt)
+{
+  ssa_op_iter iter;
+  tree op;
+
+  FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
+if (ssa_undefined_value_p (op))
+  return true;
+
+  return false;
+}
+
+
+
 /* If necessary, rewrite the base of the reference tree *TP from
a MEM_REF to a plain or converted symbol.  */
 
diff --git a/gcc/tree-ssa.h b/gcc/tree-ssa.h
index 5a409e5..3b5bd70 100644
--- a/gcc/tree-ssa.h
+++ b/gcc/tree-ssa.h
@@ -51,6 +51,7 @@ extern bool tree_ssa_useless_type_conversion (tree);
 extern tree tree_ssa_strip_useless_type_conversions (tree);
 
 extern bool ssa_undefined_value_p (tree, bool = true);
+extern bool gimple_uses_undefined_value_p (gimple *);
 extern void execute_update_addresses_taken (void);
 
 /* Given an edge_var_map V, return the PHI arg definition.  */


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: Try to update dominance info in tree-call-cdce.c

2015-10-30 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
>  wrote:
>> The pass would free the dominance info after making a change, but it
>> should be pretty easy to keep the information up-to-date when the call
>> has no EH edges.  In a way the main hurdle was split_block, which seemed
>> to assume that the new block would postdominate the old one, and that
>> all blocks immediately dominated by the old block are now immediately
>> dominated by the new one.
>>
>> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>> OK to install?
>
> Hmm, I don't understand why split_block needs to be touched.  The
> operation itself correctly updates dominator info.  It is up to the
> pass to fix things up if it does further CFG modifications that make
> the new block no longer post-dominate the old one.
>
> So why do you need the split_block change?

The updates we'd need here would be:

redirect_immediate_dominators (CDI_DOMINATORS, call, guard_bb);

which undoes the earlier:

redirect_immediate_dominators (CDI_DOMINATORS, guard_bb, call);

that split_block did.  It just seemed wasteful to call
redirect_immediate_dominators twice to get a no-op.

In other words, there are going to be callers to split_block that
know the second block isn't going to postdominate the first and
where calling:

redirect_immediate_dominators (CDI_DOMINATORS, first_block,
   second_block);

is taking us further from where we want to be.

Thanks,
Richard



Re: Try to avoid mark_virtual_operands_for_renaming in call-cdce

2015-10-30 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
>  wrote:
>> It's fairly easy to update the virtual ops when the call has no EH edges,
>> which should be cheaper than mark_virtual_operands_for_renaming.
>>
>> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>> OK to install?
>
> Well.  I think this can be easily improved to handle the EH edge case
> by not replacing the virtual uses in the EH region.

OK.  I suppose I was just going for the low-hanging fruit. :-)  I'll drop
this in favour of getting the internal function stuff finished for stage 1.

> Btw, did you verify the pass does things correctly when facing an EH
> throwing situation?  It seems all math builtins are marked as NOTHROW
> regardless of -fnon-call-exceptions ...

The main test case for that seems to be g++.dg/opt/pr58165.C.  I'd tried
other variations of that locally.

Thanks,
Richard



Re: [PATCH 5/5] remove usage of ADJUST_FIELD_ALIGN in encoding.c

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 1:28 PM, Bernd Schmidt  wrote:
>>
>> it's not target independent code.  Are you suggesting to add a config/
>> to libobjc?  IMHO for a not really maintained frontend / target lib that's
>> an excessive requirement.
>
>
> If necessary, then yes that would be a better solution.
>
> Even just keeping the abstraction of the macro and putting definitions of it
> inside #ifdef at the top of the file would be an improvement over the
> submitted patch, but IMO still not really compatible with our standards.

I agree, that would make the source of the copy more obvious.

Still libobjc is beyond what I consider compatible with our standards (just
look at the hoops it jumps through to make the target macros work!)

Richard.

>
> Bernd
>


Re: Try to avoid mark_virtual_operands_for_renaming in call-cdce

2015-10-30 Thread Richard Biener
On Fri, Oct 30, 2015 at 1:21 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Fri, Oct 30, 2015 at 12:18 PM, Richard Sandiford
>>  wrote:
>>> It's fairly easy to update the virtual ops when the call has no EH edges,
>>> which should be cheaper than mark_virtual_operands_for_renaming.
>>>
>>> Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>>> OK to install?
>>
>> Well.  I think this can be easily improved to handle the EH edge case
>> by not replacing the virtual uses in the EH region.
>
> OK.  I suppose I was just going for the low-hanging fruit. :-)

Heh, true.  It's never that simple.

> I'll drop this in favour of getting the internal function stuff finished for 
> stage 1.

Fair enough (and yes, I'd definitely like to see at least internal
function support for genmatch/match.pd for GCC 6).

>> Btw, did you verify the pass does things correctly when facing an EH
>> throwing situation?  It seems all math builtins are marked as NOTHROW
>> regardless of -fnon-call-exceptions ...
>
> The main test case for that seems to be g++.dg/opt/pr58165.C.  I'd tried
> other variations of that locally.

Looking at this example I realize it would be nice to be in "(virtual) EH-closed
SSA form" so that all uses on an EH edge are reachable by walking the
EH edge destination PHI nodes ...

Otherwise of course the fix is to propagate only into uses which are
not dominated by the EH dest.  A replace_uses_by variant which takes an
edge to check for ignored uses would come in handy here.

Richard.

> Thanks,
> Richard
>


Re: [AArch64][PATCH 6/7] Add NEON intrinsics vqrdmlah and vqrdmlsh.

2015-10-30 Thread Christophe Lyon
On 23 October 2015 at 14:26, Matthew Wahab  wrote:
> The ARMv8.1 architecture extension adds two Adv.SIMD instructions,
> sqrdmlah and sqrdmlsh. This patch adds the NEON intrinsics vqrdmlah and
> vqrdmlsh for these instructions. The new intrinsics are of the form
> vqrdml{as}h[q]_.
>
> Tested the series for aarch64-none-linux-gnu with native bootstrap and
> make check on an ARMv8 architecture. Also tested aarch64-none-elf with
> cross-compiled check-gcc on an ARMv8.1 emulator.
>

Is there a publicly available simulator for v8.1? QEMU or Foundation Model?


> Ok for trunk?
> Matthew
>
> gcc/
> 2015-10-23  Matthew Wahab  
>
> * gcc/config/aarch64/arm_neon.h (vqrdmlah_s16, vqrdmlah_s32): New.
> (vqrdmlahq_s16, vqrdmlahq_s32): New.
> (vqrdmlsh_s16, vqrdmlsh_s32): New.
> (vqrdmlshq_s16, vqrdmlshq_s32): New.
>
> gcc/testsuite
> 2015-10-23  Matthew Wahab  
>
> * gcc.target/aarch64/advsimd-intrinsics/vqrdmlXh.inc: New file,
> support code for vqrdml{as}h tests.
> * gcc.target/aarch64/advsimd-intrinsics/vqrdmlah.c: New.
> * gcc.target/aarch64/advsimd-intrinsics/vqrdmlsh.c: New.
>


[patch] New 'all' front end header reduction

2015-10-30 Thread Andrew MacLeod
OK, here's the much delayed front end reduction patch based on the 
reordering already being checked in.


I discovered that my target builds were only building c/c++, so the 
other languages were being reduced based only on the host 
x86_64-pc-linux-gnu build.  That's *probably* OK, but I wanted to be 
sure.  That is when I discovered that the other languages have varying 
amounts of support amongst the targets.  Simply building all the targets 
to compile, say, ada doesn't actually work quite right.


So this patch covers all the languages which do have full support: the 
ones enabled by 'all' languages.

I am determining which targets build the other languages now, and will 
submit separate reduction patches for those languages.


This bootstraps on  x86_64-pc-linux-gnu with no new regressions, and is 
currently undergoing the full config-list target build.


OK for trunk?

Andrew


c
	* c-array-notation.c: Remove unused headers.
	* c-aux-info.c: Likewise.
	* c-convert.c: Likewise.
	* c-decl.c: Likewise.
	* c-errors.c: Likewise.
	* c-lang.c: Likewise.
	* c-objc-common.c: Likewise.
	* c-parser.c: Likewise.
	* c-typeck.c: Likewise.
	* gccspec.c: Likewise.

c-family
	* array-notation-common.c: Remove unused headers. 
	* c-ada-spec.c: Likewise.
	* c-cilkplus.c: Likewise.
	* c-common.c: Likewise.
	* c-cppbuiltin.c: Likewise.
	* c-dump.c: Likewise.
	* c-format.c: Likewise.
	* c-gimplify.c: Likewise.
	* c-indentation.c: Likewise.
	* c-lex.c: Likewise.
	* c-omp.c: Likewise.
	* c-opts.c: Likewise.
	* c-pch.c: Likewise.
	* c-ppoutput.c: Likewise.
	* c-pragma.c: Likewise.
	* c-pretty-print.c: Likewise.
	* c-semantics.c: Likewise.
	* c-ubsan.c: Likewise.
	* cilk.c: Likewise.
	* stub-objc.c: Likewise.

cp
	* call.c: Remove unused headers. 
	* class.c: Likewise.
	* constexpr.c: Likewise.
	* cp-array-notation.c: Likewise.
	* cp-cilkplus.c: Likewise.
	* cp-gimplify.c: Likewise.
	* cp-lang.c: Likewise.
	* cp-objcp-common.c: Likewise.
	* cp-ubsan.c: Likewise.
	* cvt.c: Likewise.
	* cxx-pretty-print.c: Likewise.
	* decl.c: Likewise.
	* decl2.c: Likewise.
	* dump.c: Likewise.
	* error.c: Likewise.
	* except.c: Likewise.
	* expr.c: Likewise.
	* friend.c: Likewise.
	* g++spec.c: Likewise.
	* init.c: Likewise.
	* lambda.c: Likewise.
	* lex.c: Likewise.
	* mangle.c: Likewise.
	* method.c: Likewise.
	* name-lookup.c: Likewise.
	* optimize.c: Likewise.
	* parser.c: Likewise.
	* pt.c: Likewise.
	* ptree.c: Likewise.
	* repo.c: Likewise.
	* rtti.c: Likewise.
	* search.c: Likewise.
	* semantics.c: Likewise.
	* tree.c: Likewise.
	* typeck.c: Likewise.
	* typeck2.c: Likewise.
	* vtable-class-hierarchy.c: Likewise.

fortran
	* array.c: Remove unused headers. 
	* convert.c: Likewise.
	* cpp.c: Likewise.
	* decl.c: Likewise.
	* f95-lang.c: Likewise.
	* frontend-passes.c: Likewise.
	* iresolve.c: Likewise.
	* match.c: Likewise.
	* module.c: Likewise.
	* options.c: Likewise.
	* parse.c: Likewise.
	* target-memory.c: Likewise.
	* trans-array.c: Likewise.
	* trans-common.c: Likewise.
	* trans-const.c: Likewise.
	* trans-decl.c: Likewise.
	* trans-expr.c: Likewise.
	* trans-intrinsic.c: Likewise.
	* trans-io.c: Likewise.
	* trans-openmp.c: Likewise.
	* trans-stmt.c: Likewise.
	* trans-types.c: Likewise.
	* trans.c: Likewise.

lto
	* lto-lang.c: Remove unused headers. 
	* lto-object.c: Likewise.
	* lto-partition.c: Likewise.
	* lto-symtab.c: Likewise.
	* lto.c: Likewise.

objc
	* objc-act.c: Remove unused headers. 
	* objc-encoding.c: Likewise.
	* objc-gnu-runtime-abi-01.c: Likewise.
	* objc-lang.c: Likewise.
	* objc-map.c: Likewise.
	* objc-next-runtime-abi-01.c: Likewise.
	* objc-next-runtime-abi-02.c: Likewise.
	* objc-runtime-shared-support.c: Likewise.

Index: c/c-array-notation.c
===
*** c/c-array-notation.c	(revision 229538)
--- c/c-array-notation.c	(working copy)
***
*** 68,79 
  #include "config.h"
  #include "system.h"
  #include "coretypes.h"
- #include "tree.h"
- #include "c-family/c-common.h"
  #include "c-tree.h"
  #include "gimple-expr.h"
  #include "tree-iterator.h"
- #include "opts.h"
  
  /* If *VALUE is not of type INTEGER_CST, PARM_DECL or VAR_DECL, then map it
 to a variable and then set *VALUE to the new variable.  */
--- 68,76 
Index: c/c-aux-info.c
===
*** c/c-aux-info.c	(revision 229538)
--- c/c-aux-info.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 24,33 
  #include "system.h"
  #include "coretypes.h"
  #include "tm.h"
- #include "tree.h"
  #include "c-tree.h"
- #include "flags.h"
- #include "alias.h"
  
  enum formals_style {
ansi,
--- 24,30 
Index: c/c-convert.c
===
*** c/c-convert.c	(revision 229538)
--- c/c-convert.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 27,36 
  #include "system.h"
  #include "coretypes.h"
  #include 

Re: [OpenACC] num_gangs, num_workers and vector_length in c++

2015-10-30 Thread Jakub Jelinek
On Thu, Oct 29, 2015 at 04:02:11PM -0700, Cesar Philippidis wrote:
> I noticed that num_gangs, num_workers and vector_length are parsed in
> somewhat inconsistent ways in the c++ FE. Both vector_length and num_gangs
> bail out as soon as they detect errors, whereas num_workers
> does not. Besides that, they are also checking for integral
> expressions as the arguments are scanned instead of deferring that check
> to finish_omp_clauses. That check will cause ICEs when template
> arguments are used when we add support for template arguments later on.
> 
> Rather than fix each function individually, I've consolidated them into
> a single cp_parser_oacc_positive_int_clause function. While this
> function could be extended to support openmp clauses which accept an
> integer expression argument, like num_threads, I've decided to leave
> those as-is since there are no known problems with those functions at
> this moment.

First question is what int-expr in OpenACC actually stands for (but I'll
have to raise similar question for OpenMP too).

Previously you were using cp_parser_condition, which is clearly undesirable
in this case, it allows e.g.
num_gangs (int a = 5)
but the question is if
num_gangs (5, 6)
is valid and stands for (5, 6) expression, then it should use
cp_parser_expression, or if you want to error on it, then you should use
cp_parser_assignment_expression.
From a quick skim of the (now removed) C/C++ Grammar Appendix in OpenMP,
I believe all the places where expression or scalar-expression is used
in the grammar are meant to be cp_parser_expression cases (except
expression-list used in UDRs which stands for normal C++ expression-list
non-terminal), so clearly I need to fix up omp_clause_{if,final} to call
cp_parser_expression instead of cp_parser_condition, and the various
OpenMP clauses that use cp_parser_assignment_expression to instead use
cp_parser_expression.  Say schedule(static, 3, 6) should be valid IMHO.
But in the OpenMP grammar, expression or scalar-expression is never
followed by a comma or an optional comma, while in the OpenACC grammar it
clearly is (e.g. for the gang clause).
If OpenACC wants something different, clearly you can't share the parsing
routines between say num_tasks and num_workers.
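
To make the distinction concrete: cp_parser_expression parses the C++
expression non-terminal, which admits a top-level comma operator, while
cp_parser_assignment_expression stops at the first top-level comma.  The
same split is visible in plain C, where a function-call argument list is
made of assignment-expressions.  The sketch below is only an illustration;
note, pick_first and pick_only are made-up names, not anything in the FE.

```c
/* Illustration only; every name here is hypothetical.  */

static int evaluated;   /* Counts how many operands were evaluated.  */

static int
note (int value)
{
  evaluated++;
  return value;
}

static int
pick_first (int a, int b)
{
  (void) b;
  return a;
}

static int
pick_only (int a)
{
  return a;
}

/* The comma separates two assignment-expressions: two arguments.  */
static int
as_argument_list (void)
{
  return pick_first (note (5), note (6));
}

/* The parenthesized operand is one expression containing a comma
   operator: both sides are evaluated, the value is the right-hand one.  */
static int
as_comma_expression (void)
{
  return pick_only ((note (5), note (6)));
}
```

So num_gangs (5, 6) parsed as an expression would mean 6, whereas parsed
as an assignment-expression the comma is left for the caller, which can
then report an error.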

Another question is what Jason, as C++ maintainer, wants.  It is nice to
get rid of some code redundancy; on the other hand, having one function
per non-terminal in the grammar is also quite a nice property.
I know I've violated this a few times too.

Next question is why you call it cp_parser_oacc_positive_int_clause
when the parsing function actually verifies neither the positive nor
the int property (and it should not); perhaps the name should just reflect
that it is a clause with assignment? expression.
Or, see the previous paragraph, have a helper that does that and then
have a separate function for each clause kind that calls that helper with
the right arguments.

Jakub


[PATCH 0/2] Levenshtein-based suggestions (v3)

2015-10-30 Thread David Malcolm
On Thu, 2015-09-17 at 13:31 -0600, Jeff Law wrote:
> On 09/16/2015 02:34 AM, Richard Biener wrote:
> >
> > Btw, this looks quite expensive - I'm sure we want to limit the effort
> > here a bit.
> A limiter is reasonable, though as it's been pointed out this only fires 
> during error processing, so we probably have more leeway to take time 
> and see if we can do better error recovery.
> 
> FWIW, I've used this algorithm in totally unrelated projects and while 
> it seems expensive, it's worked out quite nicely.
> 
> >
> > So while the idea might be an improvement to selected cases it can cause
> > confusion as well.  And if using the suggestion for further parsing it can
> > cause worse followup errors (unless we can limit such "fixup" use to the
> > cases where we can parse the result without errors).  Consider
> >
> > foo()
> > {
> >foz = 1;
> > }
> >
> > if we suggest 'foo' instead of foz then we'll get a more confusing followup
> > error if we actually use it.
> True.  This kind of problem is probably inherent in this kind of "I'm 
> going to assume you meant..." error recovery mechanism.
> 
> And just to be clear, even in a successful recovery scenario, we still 
> issue an error.  The error recovery is just meant to try and give the 
> user a hint what might have gone wrong and gracefully handle the case 
> where they just made a minor goof.  Obviously the idea here is to cut 
> down on the number of iterations of edit-compile cycle one has to do :-)
> 
> 
> Jeff

The typename suggestion seems to be at least somewhat controversial,
whereas (I hope) the misspelled field names suggestion is more
acceptable.

Hence I'm focusing on the field name lookup for now; other uses of the
algorithm (e.g. the typename lookup) could be done in followup patches,
but I'm deferring them for now in the hope of getting the simplest case
into trunk as a first step.  Similarly, for simplicity, I didn't
implement any attempt at error-recovery using the hint.

The following patch kit is in two parts (for ease of review; they would
be applied together):

  patch 1: Implement Levenshtein distance
  patch 2: C FE: suggest corrections for misspelled field names

I didn't implement a limiter, on the grounds that this only fires
once per "has no member named" error, and so is unlikely to slow
things down noticeably.
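
For readers who have not seen the algorithm, here is a minimal sketch of
Levenshtein distance plus a nearest-candidate lookup.  It is not the code
from the patch (spellcheck.c is not reproduced in this message);
edit_distance and closest_name are hypothetical names, and the sketch
works on plain C strings rather than on identifier nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the Levenshtein (edit) distance between
   S and T, or (size_t) -1 if memory cannot be allocated.  */
static size_t
edit_distance (const char *s, const char *t)
{
  assert (s != NULL && t != NULL);
  size_t len_s = strlen (s);
  size_t len_t = strlen (t);

  /* One row of the dynamic-programming table: row[j] is the distance
     between the first I characters of S and the first J of T.  */
  size_t *row = malloc ((len_t + 1) * sizeof *row);
  if (row == NULL)
    return (size_t) -1;

  for (size_t j = 0; j <= len_t; j++)
    row[j] = j;

  for (size_t i = 1; i <= len_s; i++)
    {
      size_t diag = row[0];	/* Previous row, column j - 1.  */
      row[0] = i;
      for (size_t j = 1; j <= len_t; j++)
	{
	  size_t above = row[j];	/* Previous row, column j.  */
	  size_t subst = diag + (s[i - 1] != t[j - 1]);
	  size_t del = above + 1;
	  size_t ins = row[j - 1] + 1;
	  size_t best = subst < del ? subst : del;
	  row[j] = best < ins ? best : ins;
	  diag = above;
	}
    }

  size_t result = row[len_t];
  free (row);
  return result;
}

/* Hypothetical helper: return the entry of CANDIDATES (N entries)
   closest to TARGET, or NULL if there is none.  */
static const char *
closest_name (const char *target, const char *const *candidates, size_t n)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;
  for (size_t i = 0; i < n; i++)
    {
      size_t d = edit_distance (target, candidates[i]);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = candidates[i];
	}
    }
  return best;
}
```

With candidates "foo", "bar" and "baz", looking up "foz" picks "foo" at
distance 1.  Each comparison costs time proportional to the product of
the two string lengths, which is why the cost per error is bounded by the
number and length of the field names in the struct.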

Successfully bootstrapped the combination of these two
on x86_64-pc-linux-gnu (adds 11 new PASS results to gcc.sum).

OK for trunk?

 gcc/Makefile.in  |   1 +
 gcc/c/c-typeck.c |  70 +++-
 gcc/spellcheck.c | 136 +++
 gcc/spellcheck.h |  32 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c |   9 ++
 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c |  64 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp   |   1 +
 gcc/testsuite/gcc.dg/spellcheck-fields.c |  63 +++
 8 files changed, 375 insertions(+), 1 deletion(-)
 create mode 100644 gcc/spellcheck.c
 create mode 100644 gcc/spellcheck.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein-test-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/levenshtein_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-fields.c

-- 
1.8.5.3


Re: [gomp4 04/14] nvptx: fix output of _Bool global variables

2015-10-30 Thread Bernd Schmidt

The following patch passes testing with

make -k check-c DEJAGNU=.../dejagnu.exp \
  RUNTESTFLAGS=--target_board=nvptx-none-run

with no new regressions, and fixes 1 test:
-FAIL: gcc.dg/compat/struct-align-1 c_compat_x_tst.o-c_compat_y_tst.o execute


Ok. Thanks!


Bernd



Re: using scratchpads to enhance RTL-level if-conversion: revised patch

2015-10-30 Thread Bernd Schmidt

(Jakub Cc'd because of code he added for PR23567).

On 10/27/2015 11:35 PM, Abe wrote:

> Thanks for all your feedback.  I have integrated as much of it as I
> could in the available time.


Unfortunately not all of it - I still think we need to have a better 
strategy of selecting a scratchpad than a newly allocated stack slot. 
There are sufficiently many options.



 * ifcvt.c (noce_mem_write_may_go_wrong_even_with_scratchpads):
New.


ChangeLog doesn't correspond to the patch. If the function actually 
existed I'd reject it for a way overlong identifier.



 * target.h: New enum named "RTL_ifcvt_when_to_use_scratchpads".


Please follow ChangeLog writing guidelines.


-/* Return true if a write into MEM may trap or fault.  */
+/* Return true if a write into MEM may trap or fault
+   even in the presence of scratchpad support.  */


I still think this comment is fairly useless and needs to better 
describe what it is actually doing.



  static bool
-noce_mem_write_may_trap_or_fault_p (const_rtx mem)
+noce_mem_write_may_trap_or_fault_p_1 (const_rtx mem)


For what this ends up doing, I think a name like "unsafe_address_p" 
would be better. Also, I think the code in there is really dubious - it 
tries to look for SYMBOL_REFs which are in decl_readonly_section, but 
shouldn't that be unnecessary given the test for MEM_READONLY_P? It's 
completely unreliable anyway since the address could be loaded into a 
register (on most targets it would be) and we'd never see the SYMBOL_REF 
in that function.



+/* Return true if a write into MEM may trap or fault
+   without scratchpad support.  */


Just keep the previous comment without mentioning scratchpads.


+static bool
+noce_mem_write_may_trap_or_fault_p (const_rtx mem)
+{
+  if (may_trap_or_fault_p (mem))
+return true;
+
+  return noce_mem_write_may_trap_or_fault_p_1 (mem);
+}
+  if (RTL_ifcvt_use_spads_as_per_profile
+== targetm.RTL_ifcvt_when_to_use_scratchpads
+  && (PROFILE_ABSENT == profile_status_for_fn (cfun)
+  || PROFILE_GUESSED == profile_status_for_fn (cfun)
+  || predictable_edge_p (then_edge)
+  || ! maybe_hot_bb_p (cfun, then_bb)))
+return FALSE;


I guess this is slightly better than no cost estimate at all.


+  if (noce_mem_write_may_trap_or_fault_p_1 (orig_x)


So why do you want to call this function anyway? Doesn't the scratchpad 
technique protect against storing to a bad address?
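
As a source-level picture of the scratchpad idea under discussion (the
real transformation is done on RTL in ifcvt.c; this C rendering and the
names store_if and store_if_scratchpad are purely illustrative
assumptions):

```c
#include <stddef.h>

/* Branchy form: the store happens only when COND is nonzero.  */
static void
store_if (int cond, int *p, int value)
{
  if (cond)
    *p = value;
}

/* If-converted form: the store always happens, but when COND is zero
   it is redirected to a dummy location, so P is never written and
   never dereferenced.  */
static void
store_if_scratchpad (int cond, int *p, int value)
{
  int scratchpad;
  int *dest = cond ? p : &scratchpad;
  *dest = value;
}
```

In the second form a bad P is harmless as long as COND is zero, which is
the property the question above relies on.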



+  const size_t MEM_size = MEM_SIZE (orig_x);


No uppercase letters for variables and such. Just use "sz" or "size" for 
brevity.



+  biggest_spad = assign_stack_local (GET_MODE (orig_x),


Still the same problems - this is the least attractive choice of a 
scratchpad location, and the code may end up allocating more scratchpads 
than you need.



+  emit_insn_before_setloc (seq, if_info->jump,
+   INSN_LOCATION (if_info->insn_a));


Formatting? Could be mail client damage.


+DEFHOOKPOD
+(RTL_ifcvt_when_to_use_scratchpads,
+"*",
+enum RTL_ifcvt_when_to_use_scratchpads,
RTL_ifcvt_use_spads_as_per_profile)
+


No caps, and maybe a less clumsy name.


+enum RTL_ifcvt_when_to_use_scratchpads {
+  RTL_ifcvt_never_use_scratchpads = 0,
+  RTL_ifcvt_always_use_scratchpads,
+  RTL_ifcvt_use_spads_as_per_profile
+};


Likewise. Maybe

enum ifcvt_scratchpads_strategy {
  scratchpad_never,
  scratchpad_always,
  scratchpad_unpredictable
};

So, still a NACK from my side.


Bernd


Re: [PATCH] [ARM] neon-testgen.ml typo

2015-10-30 Thread Ramana Radhakrishnan


On 29/10/15 17:23, Jim Wilson wrote:
> I noticed a comment typo in this file while using grep to look for
> other stuff.  The typo is easy to fix.
> 
> I tried running neon-testgen.ml to verify, but it is apparently no
> longer valid ocaml, as it doesn't work with the ocamlc 4.01.0 I have
> on Ubuntu 14.04.  I get a syntax error.  Someone who knows ocaml will
> have to fix this.  Meanwhile, the patch to fix the typo should still
> be OK, as this is a separate problem.
> 
> Jim
> 

This is OK.

I'd really like neon-testgen.ml and the tests in gcc.target/arm/neon to be 
removed if all the intrinsics are now tested from Christophe's work in getting 
his advsimd tests integrated. Where are we on that?

regards
Ramana


Re: more accurate omp in fortran

2015-10-30 Thread Jakub Jelinek
On Thu, Oct 22, 2015 at 08:21:35AM -0700, Cesar Philippidis wrote:
> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> index b2894cc..93adb7b 100644
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1123,6 +1123,7 @@ typedef struct gfc_omp_namelist
>  } u;
>struct gfc_omp_namelist_udr *udr;
>struct gfc_omp_namelist *next;
> +  locus where;
>  }
>  gfc_omp_namelist;
>  
> diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
> index 3c12d8e..56a95d4 100644
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -244,6 +244,7 @@ gfc_match_omp_variable_list (const char *str, 
> gfc_omp_namelist **list,
>   }
> tail->sym = sym;
> tail->expr = expr;
> +   tail->where = cur_loc;
> goto next_item;
>   case MATCH_NO:
> break;
> @@ -278,6 +279,7 @@ gfc_match_omp_variable_list (const char *str, 
> gfc_omp_namelist **list,
> tail = tail->next;
>   }
> tail->sym = sym;
> +   tail->where = cur_loc;
>   }
>  
>  next_item:

The above is fine.

> @@ -2832,36 +2834,47 @@ resolve_omp_udr_clause (gfc_omp_namelist *n, 
> gfc_namespace *ns,
>return copy;
>  }
>  
> -/* Returns true if clause in list 'list' is compatible with any of
> -   of the clauses in lists [0..list-1].  E.g., a reduction variable may
> -   appear in both reduction and private clauses, so this function
> -   will return true in this case.  */
> +/* Check if a variable appears in multiple clauses.  */
>  
> -static bool
> -oacc_compatible_clauses (gfc_omp_clauses *clauses, int list,
> -gfc_symbol *sym, bool openacc)
> +static void
> +resolve_omp_duplicate_list (gfc_omp_namelist *clause_list, bool openacc,
> + int list)
>  {
>gfc_omp_namelist *n;
> +  const char *error_msg = "Symbol %qs present on multiple clauses at %L";

Please don't do this, I'm afraid this breaks translations.
Also, can you explain why all the mess with OMP_LIST_REDUCTION && openacc?
That clearly looks misplaced to me.
If one list item may be in at most one reduction clause, but may be in
any other clause too, then it is the same case as e.g. OpenMP
OMP_LIST_ALIGNED case, so you should instead just add
  && (list != OMP_LIST_REDUCTION || !openacc)
to the for (list = 0; list < OMP_LIST_NUM; list++) loop, and handle
OMP_LIST_REDUCTION specially, similarly to how OMP_LIST_ALIGNED is handled,
just guarded with if (openacc).
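
A self-contained sketch of that shape, under stated assumptions: the
types below are simplified stand-ins, not the real gfc_symbol or
gfc_omp_namelist, the list kinds are reduced to three, and the special
case is assumed to mean "clear the marks, then check the reduction list
against itself only".  It counts violations instead of calling gfc_error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the gfortran structures.  */
enum list_kind { LIST_PRIVATE, LIST_SHARED, LIST_REDUCTION, LIST_NUM };

struct symbol { const char *name; int mark; };

struct namelist { struct symbol *sym; struct namelist *next; };

/* Mark each symbol in LIST; count those that were already marked.  */
static int
mark_and_count (struct namelist *list)
{
  int dups = 0;
  for (struct namelist *n = list; n; n = n->next)
    {
      if (n->sym->mark)
	dups++;
      else
	n->sym->mark = 1;
    }
  return dups;
}

/* Reset the mark of every symbol in LIST.  */
static void
clear_marks (struct namelist *list)
{
  for (struct namelist *n = list; n; n = n->next)
    n->sym->mark = 0;
}

/* Return the number of "present on multiple clauses" violations.  */
static int
count_duplicates (struct namelist *lists[LIST_NUM], bool openacc)
{
  int dups = 0;

  for (int list = 0; list < LIST_NUM; list++)
    clear_marks (lists[list]);

  /* General rule: a symbol may appear in at most one of these lists.
     For OpenACC the reduction list is excluded from this pass.  */
  for (int list = 0; list < LIST_NUM; list++)
    if (list != LIST_REDUCTION || !openacc)
      dups += mark_and_count (lists[list]);

  /* Special case: a symbol may appear in only one reduction clause,
     whatever other clauses it also appears in.  */
  if (openacc)
    {
      clear_marks (lists[LIST_REDUCTION]);
      dups += mark_and_count (lists[LIST_REDUCTION]);
    }

  return dups;
}
```

With one symbol in both a private and a reduction list this reports
nothing when openacc is set and one violation otherwise; the same symbol
twice in the reduction list is still reported for OpenACC.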

Jakub

