Re: [PATCH v6] add -fprolog-pad=N,M option

2017-02-17 Thread Sandra Loosemore

On 02/17/2017 09:57 AM, Torsten Duwe wrote:


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3d1546a..ef7e985 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,23 @@ that affect more than one function.
  This attribute should be used for debugging purposes only.  It is not
  suitable in production code.

+@item prolog_pad
+@cindex @code{prolog_pad} function attribute


I'm only a documentation maintainer so this is out of my area of 
responsibility, but I really wish we could rename the attribute and 
command-line option.  Per

https://gcc.gnu.org/codingconventions.html#Spelling

the correct spelling is "prologue".


+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time
+by any means, padding the function entry with a number of NOPs can
+be used to provide a universal tool for instrumentation.  Usually,
+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}


Definitely s/prolog/prologue/ in the running text here.


+command-line switch, and disabled with attribute @code{prolog_pad (0)}
+for functions that are part of the actual instrumentation framework.
+This conveniently avoids an endless recursion.
+The @code{prolog_pad} function attribute can be used to
+change the pad size to any desired value.  The two-value syntax is
+the same as for the command-line switch @option{-fprolog-pad=N,M},


Add a cross-reference here.


+generating a NOP pad of size @var{N}, with the function entry point


Sizes are usually expressed in bytes.  I think some other unit is 
intended here, though, so I'd avoid "size" and use some other way to 
describe it.  Maybe "generating a pad of @var{N} NOP instructions".



+@var{M} NOP instructions into the pad.  @var{M} defaults to 0
+if omitted e.g. function entry point is before the first NOP.
+
  @item pure
  @cindex @code{pure} function attribute
  @cindex functions that have no side effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56ca53f..75a7e2c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11370,6 +11370,31 @@ of the function name, it is considered to be a match.  For C99 and C++
  extended identifiers, the function name must be given in UTF-8, not
  using universal character names.

+@item -fprolog-pad=@var{N}[,@var{M}]
+@opindex fprolog-pad
+Generate a pad of @var{N} NOPs right at the beginning
+of each function, with the function entry point @var{M} NOPs into
+the pad.  If @var{M} is omitted, it defaults to @code{0} so the
+function entry points to the address just at the first NOP.
+The NOP instructions reserve extra space which can be used to patch in
+any desired instrumentation at run time, provided that the code segment
+is writable.  The amount of space is only controllable indirectly via
+the number of NOPs, so implementers are advised to use the smallest
+NOP instruction available for the current CPU mode should there be a
+choice, in order to achieve the finest granularity.


The audience of the GCC user manual is users, not implementers.  If this 
is really just "advice" on what the option should do in the presence of 
multiple instruction sizes, and not a firm requirement, then rewrite as 
something like:


The amount of space reserved is expressed as the number of NOP 
instructions to insert. On targets that have multiple instruction sizes, 
typically the smallest NOP instruction available for the current CPU 
mode is used to achieve the finest granularity.


...except that I don't think "CPU mode" is really what you intend here. 
 E.g. on nios2, support for 16-bit instructions is a code generation 
option (-mcdx) rather than a -mcpu= or -march= option, and there is 
certainly no runtime processor mode selection involved.


If this is really a firm requirement, I think the burden is on you to 
identify all backends that have multiple NOP sizes for which the default 
hook implementation won't give the required behavior, and either provide 
an appropriate hook or work with the backend maintainers to develop one.


I'd put a paragraph break here, before:


+For run-time identification, the starting addresses
+of these pads, which correspond to their respective function entries
+minus @var{M}, are additionally collected in the @code{__prolog_pads_loc}
+section of the resulting binary.
+
+Note that the value of @code{__attribute__ ((prolog_pad (N,M)))} takes
+precedence over command-line option @option{-fprolog-pad=N,M}.


@var{N} and @var{M} in both places, please.  And add a cross-reference.


+This can be used to increase the pad size or to remove it completely
+on a single function.  If @code{N=0}, no pad location is recorded.


That's sloppy markup.  How about

If @var{N} is zero, 


+The NOP instructions are inserted at (and maybe before) the function entry
+address, even before the prologue.
+
  @end table
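
To make the N,M arithmetic in the text above concrete, here is a minimal sketch; it is not part of the patch.  It models addresses as plain integers and assumes every NOP has the same size.  The names PadLayout, pad_layout and the nop_size parameter are hypothetical, and the real NOP size is target-specific.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the pad described for -fprolog-pad=N,M.
struct PadLayout {
  std::uintptr_t pad_start;  // address of the first NOP
  std::uintptr_t pad_end;    // address just past the last NOP
};

// entry:    function entry address
// n:        total number of NOPs in the pad
// m:        number of NOPs placed before the entry point (0 if omitted)
// nop_size: size of one NOP in bytes (an assumption; target-specific)
inline PadLayout pad_layout(std::uintptr_t entry, unsigned n, unsigned m,
                            unsigned nop_size) {
  assert(m <= n);  // the entry point must lie inside (or at the end of) the pad
  std::uintptr_t start = entry - static_cast<std::uintptr_t>(m) * nop_size;
  std::uintptr_t end = start + static_cast<std::uintptr_t>(n) * nop_size;
  return PadLayout{start, end};
}
```

With m equal to 0 the pad starts exactly at the entry point; with m greater than 0 the recorded pad start is the entry address minus m NOPs, which is what the section of pad locations is described as holding.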


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 348fd68..5155d10 

Re: [gomp4] adjust num_gangs and add a diagnostic for unsupported num_workers

2017-02-17 Thread Alexander Monakov
On Fri, 17 Feb 2017, Cesar Philippidis wrote:
> > And then, I don't specifically have a problem with discontinuing CUDA 5.5
> > support, and require 6.5, for example, but that should be a conscious
> > decision.
> 
> We should probably ditch CUDA 5.5. In fact, according to trunk's cuda.h,
> it requires version 8.0.

No, the define in the cuda.h substitute header does not imply a requirement.

> Alex, are you using CUDA 5.5 in your environment?

No.

Alexander


Re: [C++ RFC] Fix up attribute handling in templates (PR c++/79502)

2017-02-17 Thread Jason Merrill
On Thu, Feb 16, 2017 at 6:13 PM, Martin Sebor wrote:
> On 02/16/2017 12:49 PM, Jason Merrill wrote:
>>
>> On Thu, Feb 16, 2017 at 11:33 AM, Jakub Jelinek wrote:
>>>
>>> PR c++/79502
>>> * pt.c (apply_late_template_attributes): If there are
>>> no dependent attributes, set *p to attributes.  If there were
>>> some attributes in *p previously with or without dependent
>>> attributes, chain them after the new attributes.
>>
>>
>> Here's the variant of your patch that I'm applying.
>
>
> Sorry to butt in but I feel like I'm missing something basic.  Are
> these attributes (nodiscard, noreturn, maybe_unused, and deprecated)
> meant to apply to templates?  The text for nodiscard suggests
> they're not:
>
>   The attribute-token nodiscard may be applied to the declarator-id
>   in a function declaration or to the declaration of a class or
>   enumeration.
>
> Noreturn also doesn't mention templates:
>
>   The attribute may be applied to the declarator-id in a function
>   declaration.
>
> Deprecated explicitly mentions template specializations but not
> primary templates:
>
>   The attribute may be applied to the declaration of a class,
>   a typedef-name, a variable, a non-static data member, a function,
>   a namespace, an enumeration, an enumerator, or a template
>   specialization.
>
> I can certainly see how applying attributes to the primary template
> would be useful so it's puzzling to me that the standard seems to
> preclude it.

I don't think it's precluded; a /template-declaration/ syntactically
includes a /declaration/, so in general any statement about e.g. a
function declaration also applies to a function template declaration.

> I ask also because I was just looking at bug 79021 and scratching
> my head about what to think about it.   While trying to understand
> how GCC handles attributes for the primary template I came across
> what doesn't make sense to me.   Why would it apply the attribute
> from the primary to the explicit specialization when the two are
> distinct entities?  Is that a bug?

This seems like a Core issue; the standard says nothing about how
attributes on a template affect specializations.

I think that as a general rule, not applying attributes from the
template to specializations makes sense.  There will be some
exceptions, such as the abi_tag attribute which is a property of the
name rather than a particular declaration.

Jason
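
For readers following along, the situation under discussion can be sketched as below.  This is only an illustration, not taken from the PR; the names f and call_specialization are made up.  The primary template carries an attribute and the explicit specialization is declared without it.  Whether a diagnostic is issued for a call to the specialization is exactly the open question, so the sketch makes no claim about that.

```cpp
#include <cassert>

// Primary function template, declared with a standard attribute.
template <class T>
[[deprecated("primary template")]] int f(T) { return 1; }

// Explicit specialization: a distinct declaration, written without
// the attribute.
template <>
int f<int>(int) { return 2; }

// Overload resolution picks the specialization for an int argument.
inline int call_specialization() { return f(0); }
```

Under the general rule suggested above, the attribute on the primary template would not carry over to f<int>.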


Re: [PATCH][C++] Annotate more functions with MEM-STATs

2017-02-17 Thread Jason Merrill
On Fri, Feb 17, 2017 at 6:51 AM, Richard Biener wrote:
>
> The following annotates two key wrappers around copy_node in the C++ FE
> with MEM-STAT info (and with CXX_MEM_STAT_INFO this is surprisingly
> easy, without adding _stat variants and macros as we have for the classic
> way from the pre-C++ era).
>
> It also annotates more type building functions in tree.c (all in the
> attempt to get a better idea on where all the types are built for C++
> sources).
>
> Bootstrapped without --enable-gather-detailed-mem-stats, bootstrapped
> with --enable-gather-detailed-mem-stats and visually inspected the
> improved stats on some example C++ code.
>
> There are still some more functions worth annotating:
>
> tree.c:8239 (build_range_type_1)                    840:  0.0%   666120:  3.2%
> tree.c:8362 (build_array_type_1)                   3024:  0.0%   671496:  3.2%
> tree.c:4841 (build_type_attribute_qual_variant)   13776:  0.1%    67032:  0.3%
> tree.c:8681 (build_method_type_directly)          41832:  0.3%   202944:  1.0%
> hash-table.h:736 (expand)                         15136:  0.1%  5826600: 27.8%
> tree.c:8532 (build_function_type)                148344:  1.1%  3538080: 16.9%
> cp/lex.c:556 (retrofit_lang_decl)                 78628:  0.6%    43776:  0.2%
> cp/lex.c:526 (build_lang_decl_loc)                87968:  0.6%   260776:  1.2%  3902184:  7.5%   536840: 23.8% 15444
>
> is it ok if I go forward with this (at this stage, also for C++
> specifics above?)
>
> Would it be welcome to scrap _stat and the macro wrappings everywhere
> at this stage?

The patch looks fine to me; I don't have an opinion about its
appropriateness for this stage.

Jason


Re: [PATCH] PR c++/69523 make -Wliteral-suffix control warning

2017-02-17 Thread Jason Merrill
OK.

On Fri, Feb 17, 2017 at 7:59 AM, Jonathan Wakely wrote:
> Currently there's no way to disable the warning about literal suffix
> identifiers that use reserved names. This patch from Eric makes it
> depend on the existing -Wliteral-suffix option. Currently that
> controls warnings when encountering string literals concatenated with
> macros, such as "%"PRIu32, which isn't the same problem, but I think
> it's OK to reuse the warning option for this as well.
>
> Tested powerpc64le-linux. OK for trunk now, or should it wait for
> Stage 1?
>
>
>
> gcc:
>
> 2017-02-17  Jonathan Wakely  
>
> PR c++/69523
> * doc/invoke.texi (C++ Dialect Options) [-Wliteral-suffix]: Update
> description.
>
> gcc/cp:
>
> 2017-02-17  Eric Fiselier  
> Jonathan Wakely  
>
> PR c++/69523
> * parser.c (cp_parser_unqualified_id): Use OPT_Wliteral_suffix to
> control warning about literal suffix identifiers without a leading
> underscore.
>
> gcc/testsuite:
>
> 2017-02-17  Eric Fiselier  
> Jonathan Wakely  
>
> PR c++/69523
> * g++.dg/cpp0x/Wliteral-suffix2.C: New test.
>
>


Re: [C++ Patch] PR 79380

2017-02-17 Thread Jason Merrill
OK.

On Fri, Feb 17, 2017 at 8:32 AM, Paolo Carlini wrote:
> ... sorry, what I meant to propose uses
> INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P in the check, per the below. Both
> versions pass testing anyway, but the below seems more correct to me.
>
> Thanks again,
> Paolo.
>
> 


Re: [C++ PATCH] For -gdwarf-5 emit DW_TAG_variable instead of DW_TAG_member for C++ static data members

2017-02-17 Thread Jason Merrill
On Fri, Feb 17, 2017 at 1:52 PM, Jakub Jelinek wrote:
> -  && die->die_tag != DW_TAG_member)
> +  && die->die_tag != DW_TAG_member
> +  && (die->die_tag != DW_TAG_variable || !class_scope_p (die->die_parent)))

How about we only check class_scope_p (die->die_parent), and don't
consider the TAG at all?  DW_TAG_member should only appear at class
scope.

> - if (old_die->die_tag == DW_TAG_member)
> + if (old_die->die_tag == DW_TAG_member
> + || (dwarf_version >= 5 && class_scope_p (old_die->die_parent)))

Likewise here.

Jason


Re: [PATCH] Emit column info even in .debug_line section

2017-02-17 Thread Jason Merrill
Looks fine.

On Fri, Feb 17, 2017 at 1:59 PM, Jakub Jelinek wrote:
> Hi!
>
> And here is incremental patch to provide column information even in
> .debug_line (whether through .loc directives or custom .debug_line).
> The patch looks large, because I had to adjust the two hooks to pass
> through the column information, but beyond that it is actually very simple.
>
> If the earlier patch is ok, would this be ok too (not bootstrapped yet,
> going to regtest it soon)?
>
> 2017-02-17  Jakub Jelinek  
>
> * final.c (last_columnnum, override_columnnum): New variables.
> (final_start_function): Set last_columnnum, pass it to begin_prologue
> hook and pass 0 to dwarf2out_begin_prologue.
> (final_scan_insn): Update override_columnnum.  Pass last_columnnum
> to source_line debug hook.
> (notice_source_line): Compute last_columnnum and for debug_column_info
> return true on column changes.
> * debug.h (struct gcc_debug_hooks): Add column argument to
> source_line and begin_prologue hooks.
> (debug_nothing_int_charstar_int_bool): Remove prototype.
> (debug_nothing_int_int_charstar,
> debug_nothing_int_int_charstar_int_bool): New prototypes.
> (dwarf2out_begin_prologue): Add column argument.
> * debug.c (do_nothing_debug_hooks): Adjust source_line and
> begin_prologue hooks.
> (debug_nothing_int_charstar_int_bool): Remove.
> (debug_nothing_int_int_charstar,
> debug_nothing_int_int_charstar_int_bool): New functions.
> * dwarf2out.c (dwarf2out_begin_prologue): Add column argument, pass it
> through to dwarf2out_source_line.
> (dwarf2_lineno_debug_hooks): Adjust begin_prologue hook.
> (dwarf2out_source_line): Add column argument, emit it if requested.
> * sdbout.c (sdbout_source_line, sdbout_begin_prologue): Add column
> arguments.
> * xcoffout.h (xcoffout_begin_prologue, xcoffout_source_line): Likewise.
> * xcoffout.c (xcoffout_begin_prologue, xcoffout_source_line): Likewise.
> * vmsdbgout.c (vmsdbgout_begin_prologue): Add column argument, pass it
> through to dwarf2out_begin_prologue.
> (vmsdbgout_source_line): Add column argument, pass it through to
> dwarf2out_source_line.
> * dbxout.c (dbxout_begin_prologue): Add column argument, adjust
> dbxout_source_line caller.
> (dbxout_source_line): Add column argument.
>
> --- gcc/final.c.jj  2017-01-24 23:29:07.0 +0100
> +++ gcc/final.c 2017-02-17 19:13:39.504406149 +0100
> @@ -118,6 +118,9 @@ rtx_insn *current_output_insn;
>  /* Line number of last NOTE.  */
>  static int last_linenum;
>
> +/* Column number of last NOTE.  */
> +static int last_columnnum;
> +
>  /* Last discriminator written to assembly.  */
>  static int last_discriminator;
>
> @@ -133,9 +136,10 @@ static int high_function_linenum;
>  /* Filename of last NOTE.  */
>  static const char *last_filename;
>
> -/* Override filename and line number.  */
> +/* Override filename, line and column number.  */
>  static const char *override_filename;
>  static int override_linenum;
> +static int override_columnnum;
>
>  /* Whether to force emission of a line note before the next insn.  */
>  static bool force_source_line = false;
> @@ -1763,6 +1767,7 @@ final_start_function (rtx_insn *first, F
>
>last_filename = LOCATION_FILE (prologue_location);
>last_linenum = LOCATION_LINE (prologue_location);
> +  last_columnnum = LOCATION_COLUMN (prologue_location);
>last_discriminator = discriminator = 0;
>
>high_block_linenum = high_function_linenum = last_linenum;
> @@ -1771,10 +1776,10 @@ final_start_function (rtx_insn *first, F
>  asan_function_start ();
>
>if (!DECL_IGNORED_P (current_function_decl))
> -debug_hooks->begin_prologue (last_linenum, last_filename);
> +debug_hooks->begin_prologue (last_linenum, last_columnnum, last_filename);
>
>if (!dwarf2_debug_info_emitted_p (current_function_decl))
> -dwarf2out_begin_prologue (0, NULL);
> +dwarf2out_begin_prologue (0, 0, NULL);
>
>  #ifdef LEAF_REG_REMAP
>if (crtl->uses_only_leaf_regs)
> @@ -2335,6 +2340,7 @@ final_scan_insn (rtx_insn *insn, FILE *f
> {
>   override_filename = LOCATION_FILE (*locus_ptr);
>   override_linenum = LOCATION_LINE (*locus_ptr);
> + override_columnnum = LOCATION_COLUMN (*locus_ptr);
> }
> }
>   break;
> @@ -2370,11 +2376,13 @@ final_scan_insn (rtx_insn *insn, FILE *f
> {
>   override_filename = LOCATION_FILE (*locus_ptr);
>   override_linenum = LOCATION_LINE (*locus_ptr);
> + override_columnnum = LOCATION_COLUMN (*locus_ptr);
> }
>   else
> {
>   override_filename = 

Re: [PATCH] Emit column information in dwarf

2017-02-17 Thread Jason Merrill
OK.

On Fri, Feb 17, 2017 at 1:57 PM, Jakub Jelinek wrote:
> Hi!
>
> The GDB folks expressed interest in handling column information in
> debug info, apparently clang emits it, but gcc does not.
>
> I know we are late in the release cycle, so I'm not suggesting to
> turn this on by default, but the following patch at least allows
> users to request it through -gcolumn-info, so that GDB can be adjusted
> to consume it and later on (GCC 8 or 9) we could perhaps switch the
> default.
>
> This patch handles just emitting DW_AT_decl_column and DW_AT_call_column
> if requested (-gcolumn-info) and non-zero (i.e. the middle-end knows the
> columns).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-02-17  Jakub Jelinek  
>
> * common.opt (gno-column-info, gcolumn-info): New options.
> * dwarf2out.c (dwarf2_lineno_debug_hooks): Formatting fix.
> (check_die): Also test for multiple DW_AT_decl_column attributes.
> (add_src_coords_attributes, dwarf2out_imported_module_or_decl_1): Add
> DW_AT_decl_column if requested.
> (gen_subprogram_die): Compare and/or add also DW_AT_decl_column
> if requested.
> (gen_variable_die): Likewise.
> (add_call_src_coords_attributes): Add DW_AT_call_column if requested.
> * doc/invoke.texi (-gcolumn-info, -gno-column-info): Document.
>
> --- gcc/common.opt.jj   2017-02-01 16:41:45.0 +0100
> +++ gcc/common.opt  2017-02-17 11:47:14.233098170 +0100
> @@ -2805,6 +2805,14 @@ gcoff
>  Common Driver JoinedOrMissing Negative(gdwarf)
>  Generate debug information in COFF format.
>
> +gno-column-info
> +Common Driver RejectNegative Var(debug_column_info,0) Init(0)
> +Don't record DW_AT_decl_column and DW_AT_call_column in DWARF.
> +
> +gcolumn-info
> +Common Driver RejectNegative Var(debug_column_info,1)
> +Record DW_AT_decl_column and DW_AT_call_column in DWARF.
> +
>  gdwarf
>  Common Driver JoinedOrMissing Negative(gdwarf-)
>  Generate debug information in default version of DWARF format.
> --- gcc/dwarf2out.c.jj  2017-02-09 23:01:46.0 +0100
> +++ gcc/dwarf2out.c 2017-02-17 11:56:13.471834354 +0100
> @@ -2732,7 +2732,7 @@ const struct gcc_debug_hooks dwarf2_line
>debug_nothing_int_int,/* begin_block */
>debug_nothing_int_int,/* end_block */
>debug_true_const_tree,/* ignore_block */
> -  dwarf2out_source_line,/* source_line */
> +  dwarf2out_source_line,/* source_line */
>debug_nothing_int_charstar,   /* begin_prologue */
>debug_nothing_int_charstar,   /* end_prologue */
>debug_nothing_int_charstar,   /* begin_epilogue */
> @@ -6109,7 +6109,7 @@ check_die (dw_die_ref die)
>dw_attr_node *a;
>bool inline_found = false;
>int n_location = 0, n_low_pc = 0, n_high_pc = 0, n_artificial = 0;
> -  int n_decl_line = 0, n_decl_file = 0;
> +  int n_decl_line = 0, n_decl_column = 0, n_decl_file = 0;
>FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
>  {
>switch (a->dw_attr)
> @@ -6130,6 +6130,9 @@ check_die (dw_die_ref die)
> case DW_AT_artificial:
>   ++n_artificial;
>   break;
> +case DW_AT_decl_column:
> + ++n_decl_column;
> + break;
> case DW_AT_decl_line:
>   ++n_decl_line;
>   break;
> @@ -6141,7 +6144,7 @@ check_die (dw_die_ref die)
> }
>  }
>if (n_location > 1 || n_low_pc > 1 || n_high_pc > 1 || n_artificial > 1
> -  || n_decl_line > 1 || n_decl_file > 1)
> +  || n_decl_column > 1 || n_decl_line > 1 || n_decl_file > 1)
>  {
>fprintf (stderr, "Duplicate attributes in DIE:\n");
>debug_dwarf_die (die);
> @@ -20190,6 +20193,8 @@ add_src_coords_attributes (dw_die_ref di
>s = expand_location (DECL_SOURCE_LOCATION (decl));
>add_AT_file (die, DW_AT_decl_file, lookup_filename (s.file));
>add_AT_unsigned (die, DW_AT_decl_line, s.line);
> +  if (debug_column_info && s.column)
> +add_AT_unsigned (die, DW_AT_decl_column, s.column);
>  }
>
>  /* Add DW_AT_{,MIPS_}linkage_name attribute for the given decl.  */
> @@ -21936,7 +21941,11 @@ gen_subprogram_die (tree decl, dw_die_re
>&& (DECL_ARTIFICIAL (decl)
>|| (get_AT_file (old_die, DW_AT_decl_file) == file_index
>&& (get_AT_unsigned (old_die, DW_AT_decl_line)
> -  == (unsigned) s.line
> +  == (unsigned) s.line)
> +  && (!debug_column_info
> +  || s.column == 0
> +  || (get_AT_unsigned (old_die, DW_AT_decl_column)
> +  == (unsigned) s.column)
> {
>   subr_die = old_die;
>
> @@ -21963,10 +21972,15 @@ gen_subprogram_die (tree decl, dw_die_re
> add_AT_file (subr_die, DW_AT_decl_file, file_index);
>   if (get_AT_unsigned 

[PATCH] restore -Wunused-variable on a typedef'd variable in a function template (PR 79548)

2017-02-17 Thread Martin Sebor

The attached patch fixes bug 79548 - [5/6/7 Regression] missing
-Wunused-variable on a typedef'd variable in a function template,
most likely broken by the introduction of -Wunused-local-typedefs.

While testing the patch I came across a couple of other bugs:

  79585 - spurious -Wunused-variable on a pointer with attribute
  unused in function template
and

  79586 - missing -Wdeprecated depending on position of attribute

The test I added for 79548 fails two assertions due to the first
of these two so I xfailed them.  The second doesn't have an impact
on it.  Neither of these is a regression so I didn't try to fix them.

Martin
PR c++/79548 - [5/6/7 Regression] missing -Wunused-variable on a typedef'd variable in a function template

gcc/cp/ChangeLog:

	PR c++/79548
	* decl.c (poplevel): Avoid diagnosing entities declared with
	attribute unused.
	(initialize_local_var): Do not consider the type of a variable
	when determining whether or not it's used.
	
gcc/testsuite/ChangeLog:

	PR c++/79548
	* g++.dg/warn/Wunused-var-26.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 70c44fb..e315ad0 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -664,7 +664,8 @@ poplevel (int keep, int reverse, int functionbody)
 	&& (!CLASS_TYPE_P (type)
 		|| !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)
 		|| lookup_attribute ("warn_unused",
- TYPE_ATTRIBUTES (TREE_TYPE (decl)))))
+ TYPE_ATTRIBUTES (TREE_TYPE (decl))))
+	&& !lookup_attribute ("unused", TYPE_ATTRIBUTES (TREE_TYPE (decl))))
 	  {
 	if (! TREE_USED (decl))
 	  warning_at (DECL_SOURCE_LOCATION (decl),
@@ -6546,7 +6547,6 @@ initialize_local_var (tree decl, tree init)
 {
   tree type = TREE_TYPE (decl);
   tree cleanup;
-  int already_used;
 
   gcc_assert (VAR_P (decl)
 	  || TREE_CODE (decl) == RESULT_DECL);
@@ -6564,7 +6564,7 @@ initialize_local_var (tree decl, tree init)
 return;
 
   /* Compute and store the initial value.  */
-  already_used = TREE_USED (decl) || TREE_USED (type);
+  bool already_used = TREE_USED (decl);
   if (TREE_USED (type))
 DECL_READ_P (decl) = 1;
 
diff --git a/gcc/testsuite/g++.dg/warn/Wunused-var-26.C b/gcc/testsuite/g++.dg/warn/Wunused-var-26.C
new file mode 100644
index 000..562f25b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wunused-var-26.C
@@ -0,0 +1,127 @@
+// PR c++/79548 - missing -Wunused-variable on a typedef'd variable
+// in a function template
+// { dg-do compile }
+// { dg-options "-Wunused" }
+
+
+#define UNUSED __attribute__ ((unused))
+
+template 
+void f_int ()
+{
+  T t;// { dg-warning "unused variable" }
+
+  typedef T U;
+  U u;// { dg-warning "unused variable" }
+}
+
+template void f_int();
+
+
+template 
+void f_intptr ()
+{
+  T *t = 0;   // { dg-warning "unused variable" }
+
+  typedef T U;
+  U *u = 0;   // { dg-warning "unused variable" }
+}
+
+template void f_intptr();
+
+
+template 
+void f_var_unused ()
+{
+  // The variable is marked unused.
+  T t UNUSED;
+
+  typedef T U;
+  U u UNUSED;
+}
+
+template void f_var_unused();
+
+
+template 
+void f_var_type_unused ()
+{
+  // The variable's type is marked unused.
+  T* UNUSED t = new T;   // { dg-bogus "unused variable" "bug 79585" { xfail *-*-* } }
+
+  typedef T U;
+  U* UNUSED u = new U;   // { dg-bogus "unused variable" "bug 79585" { xfail *-*-* } }
+
+  typedef T UNUSED U;
+  U v = U ();   // { dg-bogus "unused variable" "bug 79585" { xfail *-*-* } }
+}
+
+template void f_var_type_unused();
+
+
+struct A { int i; };
+
+template 
+void f_A ()
+{
+  T t;// { dg-warning "unused variable" }
+
+  typedef T U;
+  U u;// { dg-warning "unused variable" }
+}
+
+template void f_A();
+
+
+template 
+void f_A_unused ()
+{
+  T t UNUSED;
+
+  typedef T U;
+  U u UNUSED;
+}
+
+template void f_A_unused();
+
+
+struct B { B (); };
+
+template 
+void f_B ()
+{
+  T t;
+
+  typedef T U;
+  U u;
+}
+
+template void f_B();
+
+
+struct C { ~C (); };
+
+template 
+void f_C ()
+{
+  T t;
+
+  typedef T U;
+  U u;
+}
+
+template void f_C();
+
+
+struct D { B b; };
+
+template 
+void f_D ()
+{
+  T t;
+
+  typedef T U;
+  U u;
+}
+
+template void f_D();


Re: [PATCH] Fix fixincludes for canadian cross builds

2017-02-17 Thread Bruce Korb
On 02/06/17 10:44, Bernd Edlinger wrote:
> I tested this change with different arm-linux-gnueabihf cross
> compilers, and verified that mkheaders still works on the host system.
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

As long as you certify that this is correct for all systems we care about:

+BUILD_SYSTEM_HEADER_DIR = `
+echo $(CROSS_SYSTEM_HEADER_DIR) | \
+sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta`

That is pretty obtuse sed-speak to me.  I suggest a comment
explaining what sed is supposed to be doing.  What should
"$(CROSS_SYSTEM_HEADER_DIR)" look like?

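For reference, the sed script loops (label "a", then "ta" to branch back after a successful substitution), each time deleting the first "component/../" it finds, until nothing more matches.  A rough C++ equivalent is sketched below under that reading.  The function name collapse_dotdot is made up for illustration, and the sample paths in the usage note are invented, not actual values of CROSS_SYSTEM_HEADER_DIR.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Repeatedly remove the first "<component>/../" from the path, mirroring
//   sed -e :a -e 's,[^/]*/\.\.\/,,' -e ta
// The loop ends when a pass makes no substitution.
inline std::string collapse_dotdot(std::string path) {
  static const std::regex parent_step("[^/]*/\\.\\./");
  for (;;) {
    std::string next = std::regex_replace(
        path, parent_step, "", std::regex_constants::format_first_only);
    if (next == path)
      return path;  // nothing left to collapse
    path = next;
  }
}
```

With this sketch, the invented input "/opt/x/lib/../../inc" becomes "/opt/inc", and a path without any "../" component is returned unchanged.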

Restore DECIMAL_DIG macro to C99/C11 value

2017-02-17 Thread Joseph Myers
By extending the set of floating types, TS 18661-3 thereby affected
the definition of DECIMAL_DIG, which is defined in terms of the
"widest supported floating type".  This is not conditional on whether
__STDC_WANT_IEC_60559_TYPES_EXT__ is defined when <float.h> is
included.

I raised this possible incompatibility with C11 (an implementation
should be able to conform simultaneously with C11, and with C11 + TS
18661) in DR#501.  This is not yet resolved, but the latest proposal would
obsolete DECIMAL_DIG with the intention of limiting it to the C11
types (so making it equivalent to LDBL_DECIMAL_DIG).  (This proposal
is intended to go along with a corresponding change to TS 18661-3 to
avoid the new types and non-arithmetic interchange encodings affecting
the value of DECIMAL_DIG.)

To avoid releasing GCC 7 with a wider-than-C11 value of DECIMAL_DIG
and possibly reverting back to a C11 value in a future release, this
patch reverts back to the C11 value now.  If the proposed resolution
to DR#501 changes again so that DECIMAL_DIG *should* have a
wider-than-C11 value, we can move back to a wider-than-C11 value in
GCC 8.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to 
mainline.
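
As a rough user-side illustration of the restored behaviour (separate from the test in the patch below), the following sketch compares DECIMAL_DIG with the round-trip digit count for long double.  It assumes a C++11 compiler whose <cfloat> follows the C99/C11 definition and whose std::numeric_limits<long double>::max_digits10 agrees with LDBL_DECIMAL_DIG; the function names are made up.

```cpp
#include <cassert>
#include <cfloat>
#include <limits>

// True when DECIMAL_DIG is limited to the widest standard C11 type,
// long double, rather than to a wider extension type such as _Float128.
// Assumption: max_digits10 for long double equals LDBL_DECIMAL_DIG.
constexpr bool decimal_dig_matches_long_double() {
  return DECIMAL_DIG == std::numeric_limits<long double>::max_digits10;
}

inline void check_decimal_dig() {
  assert(decimal_dig_matches_long_double());
}
```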

gcc/c-family:
2017-02-17  Joseph Myers  

* c-cppbuiltin.c (builtin_define_float_constants): Define
__DECIMAL_DIG__ to the value for long double.

gcc/testsuite:
2017-02-17  Joseph Myers  

* gcc.dg/c11-float-2.c: New test.
* gcc.dg/torture/float128-floath.c,
gcc.dg/torture/float128x-floath.c,
gcc.dg/torture/float16-floath.c, gcc.dg/torture/float32-floath.c,
gcc.dg/torture/float32x-floath.c, gcc.dg/torture/float64-floath.c,
gcc.dg/torture/float64x-floath.c: Do not test comparison of
*_DECIMAL_DIG macros with DECIMAL_DIG.

Index: gcc/c-family/c-cppbuiltin.c
===
--- gcc/c-family/c-cppbuiltin.c (revision 245538)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -245,11 +245,10 @@ builtin_define_float_constants (const char *name_p
 if (type_decimal_dig < type_d_decimal_dig)
   type_decimal_dig++;
   }
-  /* Arbitrarily, define __DECIMAL_DIG__ when defining macros for long
- double, although it may be greater than the value for long
- double.  */
+  /* Define __DECIMAL_DIG__ to the value for long double to be
+ compatible with C99 and C11; see DR#501 and N2108.  */
   if (type == long_double_type_node)
-builtin_define_with_int_value ("__DECIMAL_DIG__", decimal_dig);
+builtin_define_with_int_value ("__DECIMAL_DIG__", type_decimal_dig);
   sprintf (name, "__%s_DECIMAL_DIG__", name_prefix);
   builtin_define_with_int_value (name, type_decimal_dig);
 
Index: gcc/testsuite/gcc.dg/c11-float-2.c
===
--- gcc/testsuite/gcc.dg/c11-float-2.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/c11-float-2.c  (working copy)
@@ -0,0 +1,9 @@
+/* Test DECIMAL_DIG equals LDBL_DECIMAL_DIG; see DR#501 and N2108.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+#include <float.h>
+
+#if DECIMAL_DIG != LDBL_DECIMAL_DIG
+# error "DECIMAL_DIG != LDBL_DECIMAL_DIG"
+#endif
Index: gcc/testsuite/gcc.dg/torture/float128-floath.c
===
--- gcc/testsuite/gcc.dg/torture/float128-floath.c  (revision 245538)
+++ gcc/testsuite/gcc.dg/torture/float128-floath.c  (working copy)
@@ -53,10 +53,6 @@
 # error "FLT128_TRUE_MIN undefined"
 #endif
 
-#if FLT128_DECIMAL_DIG > DECIMAL_DIG
-# error "FLT128_DECIMAL_DIG > DECIMAL_DIG"
-#endif
-
 #if FLT128_MANT_DIG != 113 || FLT128_MAX_EXP != 16384 || FLT128_MIN_EXP != -16381
 # error "_Float128 bad format"
 #endif
Index: gcc/testsuite/gcc.dg/torture/float128x-floath.c
===
--- gcc/testsuite/gcc.dg/torture/float128x-floath.c (revision 245538)
+++ gcc/testsuite/gcc.dg/torture/float128x-floath.c (working copy)
@@ -53,10 +53,6 @@
 # error "FLT128X_TRUE_MIN undefined"
 #endif
 
-#if FLT128X_DECIMAL_DIG > DECIMAL_DIG
-# error "FLT128X_DECIMAL_DIG > DECIMAL_DIG"
-#endif
-
 #if FLT128X_MANT_DIG < 128 || FLT128X_MAX_EXP < 65536 || FLT128X_MIN_EXP + FLT128X_MAX_EXP != 3
 # error "_Float128x bad format"
 #endif
Index: gcc/testsuite/gcc.dg/torture/float16-floath.c
===
--- gcc/testsuite/gcc.dg/torture/float16-floath.c   (revision 245538)
+++ gcc/testsuite/gcc.dg/torture/float16-floath.c   (working copy)
@@ -53,10 +53,6 @@
 # error "FLT16_TRUE_MIN undefined"
 #endif
 
-#if FLT16_DECIMAL_DIG > DECIMAL_DIG
-# error "FLT16_DECIMAL_DIG > DECIMAL_DIG"
-#endif
-
 #if FLT16_MANT_DIG != 11 || FLT16_MAX_EXP != 16 || FLT16_MIN_EXP != -13
 # error "_Float16 bad format"
 #endif
Index: 

Re: [PATCH, GCC/x86 mingw32] Add configure option to force wildcard behavior on Windows

2017-02-17 Thread JonY
On 02/17/2017 11:31 AM, Thomas Preudhomme wrote:
> Here you are:
> 
> 2017-01-24  Thomas Preud'homme  
> 
> * configure.ac (--enable-mingw-wildcard): Add new configurable
> feature.
> * configure: Regenerate.
> * config.in: Regenerate.
> * config/i386/driver-mingw32.c: new file.
> * config/i386/x-mingw32: Add rule to build driver-mingw32.o.
> * config.host: Link driver-mingw32.o on MinGW host.
> * doc/install.texi: Document new --enable-mingw-wildcard configure
> option.
> 
> Must have forgotten to paste it.

Thanks, I'll stage it locally until stage 1 opens.







Re: [PATCH,rs6000] PR78056: Remove unreliable test case

2017-02-17 Thread Segher Boessenkool
On Fri, Feb 17, 2017 at 02:54:07PM -0700, Kelvin Nilsen wrote:
> This patch amends a patch merged with the trunk on 2017-01-14.  One of
> the new test cases added at that time has proven to be unreliable so
> this patch removes it.
> 
> Is this patch ok for trunk?

Okay, so the testcase just does not work at all, does not do what it
was intended to do.  The patch is okay for trunk.  Thanks,


Segher


> 2017-02-17  Kelvin Nilsen  
> 
>   PR target/78056
>   * gcc.target/powerpc/pr78056-8.c: Remove.


Re: [PATCH,rs6000] PR78056: Remove unreliable test case

2017-02-17 Thread Segher Boessenkool
On Fri, Feb 17, 2017 at 02:54:07PM -0700, Kelvin Nilsen wrote:
> This patch amends a patch merged with the trunk on 2017-01-14.  One of
> the new test cases added at that time has proven to be unreliable so
> this patch removes it.

In what way is it unreliable?  What fails?


Segher


C++ PATCH for c++/79508 (lookup error with member template)

2017-02-17 Thread Jason Merrill
My patch to limit template lookup after -> and . to class templates
sets parser->context->object_type even for a dependent object
expression, so that we know that there was one.  This is normally
cleared in cp_parser_lookup_name, but cp_parser_template_name was
optimizing away the call to that function for set_default, so we lost
this important side-effect, and so when we went to look up
random_positive we thought we were still looking up the name in object
scope.  Fixed by preserving the side-effect even when we don't make
the call.
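
The bug pattern described above can be illustrated with a toy model. This is only a sketch: `toy_context`, `toy_lookup_name` and `toy_template_name` are hypothetical names invented for illustration and are not GCC's parser structures or functions. It shows a lookup routine that clears a field as a side effect, and a caller whose fast path has to repeat that side effect when it skips the lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the parser context; not GCC's real data structure.  */
struct toy_context
{
  const char *object_type;	/* Non-NULL while parsing after "." or "->".  */
};

/* Models a lookup routine whose side effect is to clear object_type.  */
static const char *
toy_lookup_name (struct toy_context *ctx, const char *name)
{
  ctx->object_type = NULL;
  return name;
}

/* Models a caller with a fast path that skips the lookup.  The fix is to
   repeat the side effect on that path.  */
static const char *
toy_template_name (struct toy_context *ctx, const char *name,
		   bool skip_lookup)
{
  if (skip_lookup)
    {
      /* We are not calling toy_lookup_name, but still need its effect.  */
      ctx->object_type = NULL;
      return name;
    }
  return toy_lookup_name (ctx, name);
}
```

Without the assignment on the fast path, the next name lookup in this toy model would still see a stale `object_type`, which mirrors the symptom reported in the PR.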

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a3f8ced9964242a69bacec5a94c14ce380380fec
Author: Jason Merrill 
Date:   Fri Feb 17 16:20:10 2017 -0500

PR c++/79508 - lookup error with member template

* parser.c (cp_parser_template_name): Clear
parser->context->object_type if we aren't doing lookup.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 060962d..92d8cce 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -15719,6 +15719,7 @@ cp_parser_template_name (cp_parser* parser,
cp_lexer_purge_tokens_after (parser->lexer, start);
  if (is_identifier)
*is_identifier = true;
+ parser->context->object_type = NULL_TREE;
  return identifier;
}
 
@@ -15730,7 +15731,12 @@ cp_parser_template_name (cp_parser* parser,
  && (!parser->scope
  || (TYPE_P (parser->scope)
  && dependent_type_p (parser->scope
-   return identifier;
+   {
+ /* We're optimizing away the call to cp_parser_lookup_name, but we
+still need to do this.  */
+ parser->context->object_type = NULL_TREE;
+ return identifier;
+   }
 }
 
   /* Look up the name.  */
diff --git a/gcc/testsuite/g++.dg/template/memtmpl5.C b/gcc/testsuite/g++.dg/template/memtmpl5.C
new file mode 100644
index 000..c5c3634
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/memtmpl5.C
@@ -0,0 +1,22 @@
+// PR c++/79508
+
+struct C
+{
+  template< void(*F)()> void set_default() {   }   
+};
+
+
+template  void random_positive()
+{
+}
+
+template void initialize(T& x)
+{
+  x.template set_default();
+}
+
+int main ()
+{
+  C x;
+  initialize(x);
+}


[PATCH,rs6000] PR78056: Remove unreliable test case

2017-02-17 Thread Kelvin Nilsen

This patch amends a patch merged with the trunk on 2017-01-14.  One of
the new test cases added at that time has proven to be unreliable so
this patch removes it.

Is this patch ok for trunk?

gcc/testsuite/ChangeLog:

2017-02-17  Kelvin Nilsen  

PR target/78056
* gcc.target/powerpc/pr78056-8.c: Remove.


Index: gcc/testsuite/gcc.target/powerpc/pr78056-8.c
===
--- gcc/testsuite/gcc.target/powerpc/pr78056-8.c(revision 245539)
+++ gcc/testsuite/gcc.target/powerpc/pr78056-8.c(working copy)
@@ -1,26 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power5" } } */
-
-/* powerpc_popcntb_ok represents support for power 5.  */
-/* { dg-require-effective-target powerpc_popcntb_ok } */
-/* dfp_hw represents support for power 6.  */
-/* { dg-skip-if "" { dfp_hw } } */
-/* { dg-skip-if "" { powerpc*-*-aix* } } */
-/* { dg-options "-mcpu=power5" } */
-
-/* This test follows the pattern of pr78056-2.c, which has been
- * exercised with binutils 2.25.  This test, however, has not
- * been exercised because the author of the test does not have access
- * to a development environment that succesfully bootstraps gcc
- * while at the same lacking assembler support for power 6.  */
-
-/* This test should succeed on both 32- and 64-bit configurations.  */
-/* Though the command line specifies power5 target, this function is
-   to support power6. Expect an error message here because this target
-   does not support power6.  */
-__attribute__((target("cpu=power6")))
-/* fabs/fnabs/fsel */
-double normal1 (double a, double b)
-{ /* { dg-warning "lacks power6 support" } */
-  return __builtin_copysign (a, b); /* { dg-warning "implicit declaration" } */
-}



Re: [PATCH, rs6000] gcc 6 back port of xvcvsxdsp and xvcvuxdsp RTL fixes

2017-02-17 Thread Segher Boessenkool
On Fri, Feb 17, 2017 at 08:30:26AM -0800, Carl E. Love wrote:
> The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
> with no regressions.
> 
> Is the patch OK for gcc 6 branch?  

Yes, thanks.  In the future, please do 6 before 5.


Segher


C++ PATCH for c++/78690 (ICE with using and global type with same name)

2017-02-17 Thread Jason Merrill
We entered type_dependent_object_expression_p considering an
IDENTIFIER_NODE, which is always dependent.  But because there's a
global type with the same name, the identifier had global_type_node as
its type, which isn't recognized as a dependent type.  Fixed by
handling IDENTIFIER_NODE directly.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit d2b4bcf0dc04ea3ab7909d55bcc8a7d4703b34ca
Author: Jason Merrill 
Date:   Fri Feb 17 14:00:10 2017 -0500

PR c++/78690 - ICE with using and global type with same name

* pt.c (type_dependent_object_expression_p): True for
IDENTIFIER_NODE.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 04479d4..9e6ce8d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -23932,6 +23932,10 @@ type_dependent_expression_p (tree expression)
 bool
 type_dependent_object_expression_p (tree object)
 {
+  /* An IDENTIFIER_NODE can sometimes have a TREE_TYPE, but it's still
+ dependent.  */
+  if (TREE_CODE (object) == IDENTIFIER_NODE)
+return true;
   tree scope = TREE_TYPE (object);
   return (!scope || dependent_scope_p (scope));
 }
diff --git a/gcc/testsuite/g++.dg/template/dependent-scope1.C b/gcc/testsuite/g++.dg/template/dependent-scope1.C
new file mode 100644
index 000..a5c18c4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/dependent-scope1.C
@@ -0,0 +1,17 @@
+// PR c++/78690
+
+struct C;
+
+template 
+struct A
+{
+  struct C { static void bar (); };
+};
+
+template 
+struct B
+{
+  using A::C;
+  void
+  foo () { C.bar (); }
+};


Re: [Patch, fortran] PR79402 - ICE with submodules: module procedure interface defined in parent module

2017-02-17 Thread Jerry DeLisle

On 02/08/2017 08:00 AM, Paul Richard Thomas wrote:

Dear All,

The attached rework of the patch functions in the same way as
yesterday's but is based in resolve.c rather than trans-decl.c. It
looks to me to be by far cleaner.

Bootstraps and regtests on FC23/x86_64 - OK for trunk?

Cheers

Paul

2017-02-08  Paul Thomas  

PR fortran/79344
* resolve.c (fixup_unique_dummy): New function.
(gfc_resolve_expr): Call it for dummy variables with a unique
symtree name.

2017-02-08  Paul Thomas  

PR fortran/79344
* gfortran.dg/submodule_23.f90: New test.



On 7 February 2017 at 16:06, Paul Richard Thomas
 wrote:

Dear All,

This bug generates an ICE because the symbol for dummy 'n' in the
specification expression for the result of 'fun1' is not the same as
the symbol in the formal arglist. For some reason that I have been
unable to uncover, this false dummy is associated with a unique
symtree. The odd thing is that the dump of the parse tree for the
failing module procedure case is identical to that where the interface
is explicitly reproduced in the submodule. The cause of the ICE is
that the false dummy has no backend_decl as it should.

This patch hits the problem directly on the head by using the
backend_decl from the symbol in the namespace of the formal arglist,
as described in the comment in the patch. If it is deemed to be more
hygienic, the chunk of code can be lifted out and deposited in a
separate function.

Bootstraps and regtests on FC23/x86_64 - OK for trunk?


Yes OK.


Jerry


Re: [gomp4] adjust num_gangs and add a diagnostic for unsupported num_workers

2017-02-17 Thread Cesar Philippidis
On 02/15/2017 01:29 PM, Thomas Schwinge wrote:
> On Mon, 13 Feb 2017 08:58:39 -0800, Cesar Philippidis 
>  wrote:

>> @@ -952,25 +958,30 @@ nvptx_exec (void (*fn), size_t mapnum, void 
>> **hostaddrs, void **devaddrs,
>>CUdevice dev = nvptx_thread()->ptx_dev->dev;
>>/* 32 is the default for known hardware.  */
>>int gang = 0, worker = 32, vector = 32;
>> -  CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm;
>> +  CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm, cu_rf, cu_sm;
>>  
>>cu_tpb = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK;
>>cu_ws = CU_DEVICE_ATTRIBUTE_WARP_SIZE;
>>cu_mpc = CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT;
>>cu_tpm  = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR;
>> +  cu_rf  = CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR;
>> +  cu_sm  = CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR;
>>  
>>if (cuDeviceGetAttribute (_size, cu_tpb, dev) == CUDA_SUCCESS
>>&& cuDeviceGetAttribute (_size, cu_ws, dev) == CUDA_SUCCESS
>>&& cuDeviceGetAttribute (_size, cu_mpc, dev) == CUDA_SUCCESS
>> -  && cuDeviceGetAttribute (_size, cu_tpm, dev)  == CUDA_SUCCESS)
>> +  && cuDeviceGetAttribute (_size, cu_tpm, dev) == CUDA_SUCCESS
>> +  && cuDeviceGetAttribute (_size, cu_rf, dev)  == CUDA_SUCCESS
>> +  && cuDeviceGetAttribute (_size, cu_sm, dev)  == CUDA_SUCCESS)
> 
> Trying to compile this on CUDA 5.5/331.113, I run into:
> 
> [...]/source-gcc/libgomp/plugin/plugin-nvptx.c: In function 'nvptx_exec':
> [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:970:16: error: 
> 'CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR' undeclared (first use 
> in this function)
>cu_rf  = CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR;
> ^~~~
> [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:970:16: note: each 
> undeclared identifier is reported only once for each function it appears in
> [...]/source-gcc/libgomp/plugin/plugin-nvptx.c:971:16: error: 
> 'CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR' undeclared (first 
> use in this function)
>cu_sm  = CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR;
> ^~~~
> 
> For reference, please see the code handling
> CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR in the trunk version
> of the nvptx_open_device function.

ACK. While this change is fairly innocuous, it might be too invasive for
GCC 7.1. Maybe we can backport it to 7.1?

> And then, I don't specifically have a problem with discontinuing CUDA 5.5
> support, and require 6.5, for example, but that should be a conscious
> decision.

We should probably ditch CUDA 5.5. In fact, according to trunk's cuda.h,
it requires version 8.0.

Alex, are you using CUDA 5.5 in your environment?

>> @@ -980,8 +991,6 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, 
>> void **devaddrs,
>>   matches the hardware configuration.  Logical gangs are
>>   scheduled onto physical hardware.  To maximize usage, we
>>   should guess a large number.  */
>> -  if (default_dims[GOMP_DIM_GANG] < 1)
>> -default_dims[GOMP_DIM_GANG] = gang ? gang : 1024;
> 
> That's "bad", because a non-zero "default_dims[GOMP_DIM_GANG]" (also
> known as "default_dims[0]") is used to decide whether to enter this whole
> code block, and with that assignment removed, every call of the
> nvptx_exec function will now re-do all this GOMP_OPENACC_DIM parsing,
> cuDeviceGetAttribute calls, computations, and so on.  (See "GOMP_DEBUG=1"
> output.)

Good point. I now represent neutral values (e.g. '-' arguments) as negative one.

> I think this whole code block should be moved into the nvptx_open_device
> function, to have it executed once when the device is opened -- after
> all, all these are per-device attributes.  (So, it's actually
> conceptually incorrect to have this done only once in the nvptx_exec
> function, given that this data then is used in the same process by/for
> potentially different hardware devices.)

Yeah, that's a better place. All of those hardware attributes are now
stored in ptx_device.

> And, one could argue that the GOMP_OPENACC_DIM parsing conceptually
> belongs into generic libgomp code, instead of the nvptx plugin.  (But
> that aspect can be cleaned up later: currently, the nvptx plugin is the
> only one supporting/using it.)
> 
>>/* The worker size must not exceed the hardware.  */
>>if (default_dims[GOMP_DIM_WORKER] < 1
>>|| (default_dims[GOMP_DIM_WORKER] > worker && gang))
>> @@ -998,9 +1007,56 @@ nvptx_exec (void (*fn), size_t mapnum, void 
>> **hostaddrs, void **devaddrs,
>>  }
>>pthread_mutex_unlock (_dev_lock);
>>  
>> +  int reg_used = -1;  /* Dummy value.  */
>> +  cuFuncGetAttribute (_used, CU_FUNC_ATTRIBUTE_NUM_REGS, 

Re: Re: Improving code generation in the nvptx back end

2017-02-17 Thread Cesar Philippidis
On 02/17/2017 05:09 AM, Thomas Schwinge wrote:

> On Fri, 17 Feb 2017 14:00:09 +0100, I wrote:
>> [...] for "normal" functions there is no reason to use the
>> ".param" space for passing arguments in and out of functions.  We can
>> then get rid of the boilerplate code to move ".param %in_ar*" into ".reg
>> %ar*", and the other way round for "%value_out"/"%value".  This will then
>> also simplify the call sites, where all that code "evaporates".  That's
>> actually something I started to look into, many months ago, and I now
>> just dug out those changes, and will post them later.
>>
>> (Very likely, the PTX "JIT" compiler will do the very same thing without
>> difficulty, but why not directly generate code that is less verbose to
>> read?)
> 
> Using my WIP patch, the generated PTX code changes/is simplified as
> follows:
> 
>  // BEGIN GLOBAL FUNCTION DECL: f
> -.visible .func (.param.f32 %value_out) f (.param.u32 %in_ar0, .param.u64 
> %in_ar1);
> +.visible .func (.reg.f32 %value_out) f (.reg.u32 %ar0, .reg.u64 %ar1);
> 
>  // BEGIN GLOBAL FUNCTION DEF: f
> -.visible .func (.param.f32 %value_out) f (.param.u32 %in_ar0, .param.u64 
> %in_ar1)
> +.visible .func (.reg.f32 %value_out) f (.reg.u32 %ar0, .reg.u64 %ar1)
>  {
> .reg.f32 %value;
> -   .reg.u32 %ar0;
> -   ld.param.u32 %ar0, [%in_ar0];
> -   .reg.u64 %ar1;
> -   ld.param.u64 %ar1, [%in_ar1];
> .reg.f64 %r23;
> .reg.f32 %r24;
> .reg.u32 %r25;
> @@ -34,15 +30,15 @@ $L3:
> mov.f32 %r24, 0f;
>  $L1:
> mov.f32 %value, %r24;
> -   st.param.f32[%value_out], %value;
> +   mov.f32 %value_out, %value;
> ret;
>  }
> 
>  // BEGIN GLOBAL FUNCTION DECL: main
> -.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, 
> .param.u64 %in_ar1);
> +.visible .func (.reg.u32 %value_out) main (.reg.u32 %ar0, .reg.u64 %ar1);
> 
>  // BEGIN GLOBAL FUNCTION DEF: main
> -.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, 
> .param.u64 %in_ar1)
> +.visible .func (.reg.u32 %value_out) main (.reg.u32 %ar0, .reg.u64 %ar1)
>  {
> .reg.u32 %value;
> .local .align 8 .b8 %frame_ar[32];
> @@ -70,13 +66,9 @@ $L1:
> st.u64  [%frame+24], %r29;
> add.u64 %r31, %frame, 16;
> {
> -   .param.f32 %value_in;
> -   .param.u32 %out_arg1;
> -   st.param.u32 [%out_arg1], %r26;
> -   .param.u64 %out_arg2;
> -   st.param.u64 [%out_arg2], %r31;
> -   call (%value_in), f, (%out_arg1, %out_arg2);
> -   ld.param.f32%r32, [%value_in];
> +   .reg.f32 %value_in;
> +   call (%value_in), f, (%r26, %r31);
> +   mov.f32 %r32, %value_in;
> }
> setp.eq.f32 %r33, %r32, 0f;
> @%r33   bra $L5;
> @@ -89,17 +81,13 @@ $L5:
> st.u64  [%frame+24], %r36;
> mov.u32 %r34, 1;
> {
> -   .param.f32 %value_in;
> -   .param.u32 %out_arg1;
> -   st.param.u32 [%out_arg1], %r34;
> -   .param.u64 %out_arg2;
> -   st.param.u64 [%out_arg2], %r31;
> -   call (%value_in), f, (%out_arg1, %out_arg2);
> -   ld.param.f32%r39, [%value_in];
> +   .reg.f32 %value_in;
> +   call (%value_in), f, (%r34, %r31);
> +   mov.f32 %r39, %value_in;
> }
> setp.neu.f32%r40, %r39, 0f3f80;
> @%r40   bra $L6;
> mov.u32 %value, 0;
> -   st.param.u32[%value_out], %value;
> +   mov.u32 %value_out, %value;
> ret;
>  }
> 
> (Not yet directly using "%value_out" instead of the intermediate "%value"
> register.)
> 
> Is such a patch something to pursue to completion?

Are you trying to optimize acc routines in general? I'm not sure how
frequently they are used at the moment.

Also, while .param values may be overkill for routines, they are
addressable. Looking at section 5.1.6.1 in the PTX reference manual, you
can have something like this:

.entry foo ( .param .b32 N, .param .align 8 .b8 buffer[64] )
{
  .reg .u32 %n;
  .reg .f64 %d;
  ld.param.u32 %n, [N];
  ld.param.f64
  ...

Granted, this is an entry function to be called from the host, but the
same usage is applicable inside routines.

This gives me an idea. While working on the firstprivate changes, I
noticed that GCC packs all of the offloaded function arguments into a
structure, which the nvptx run time plugin uploads to a special data
mapping prior to calling cuLaunchKernel. That's inefficient in
applications that launch 

[PATCH] Emit column info even in .debug_line section

2017-02-17 Thread Jakub Jelinek
Hi!

And here is an incremental patch to provide column information even in
.debug_line (whether through .loc directives or custom .debug_line).
The patch looks large, because I had to adjust the two hooks to pass
through the column information, but beyond that it is actually very simple.
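
As a rough illustration of the idea (not the code in final.c), the sketch below models how a "has the source position changed?" check might take the column into account only when column tracking is requested. The names `last_location` and `location_changed` are hypothetical, and `filename` is assumed to be non-NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of the last position for which a line note was
   emitted.  */
struct last_location
{
  const char *filename;
  int line;
  int column;
};

/* Return true if a new line note is needed.  The column only matters when
   TRACK_COLUMNS is set, mirroring the "return true on column changes"
   behavior described in the ChangeLog.  FILENAME must not be NULL.  */
static bool
location_changed (struct last_location *last, const char *filename,
		  int line, int column, bool track_columns)
{
  bool changed = (last->filename == NULL
		  || strcmp (filename, last->filename) != 0
		  || line != last->line
		  || (track_columns && column != last->column));

  /* Remember the most recent position unconditionally.  */
  last->filename = filename;
  last->line = line;
  last->column = column;
  return changed;
}
```

With `track_columns` false this degenerates to the pre-patch behavior of comparing file and line only.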

If the earlier patch is ok, would this be ok too (not bootstrapped yet,
going to regtest it soon)?

2017-02-17  Jakub Jelinek  

* final.c (last_columnnum, override_columnnum): New variables.
(final_start_function): Set last_columnnum, pass it to begin_prologue
hook and pass 0 to dwarf2out_begin_prologue.
(final_scan_insn): Update override_columnnum.  Pass last_columnnum
to source_line debug hook.
(notice_source_line): Compute last_columnnum and for debug_column_info
return true on column changes.
* debug.h (struct gcc_debug_hooks): Add column argument to
source_line and begin_prologue hooks.
(debug_nothing_int_charstar_int_bool): Remove prototype.
(debug_nothing_int_int_charstar,
debug_nothing_int_int_charstar_int_bool): New prototypes.
(dwarf2out_begin_prologue): Add column argument.
* debug.c (do_nothing_debug_hooks): Adjust source_line and
begin_prologue hooks.
(debug_nothing_int_charstar_int_bool): Remove.
(debug_nothing_int_int_charstar,
debug_nothing_int_int_charstar_int_bool): New functions.
* dwarf2out.c (dwarf2out_begin_prologue): Add column argument, pass it
through to dwarf2out_source_line.
(dwarf2_lineno_debug_hooks): Adjust begin_prologue hook.
(dwarf2out_source_line): Add column argument, emit it if requested.
* sdbout.c (sdbout_source_line, sdbout_begin_prologue): Add column
arguments.
* xcoffout.h (xcoffout_begin_prologue, xcoffout_source_line): Likewise.
* xcoffout.c (xcoffout_begin_prologue, xcoffout_source_line): Likewise.
* vmsdbgout.c (vmsdbgout_begin_prologue): Add column argument, pass it
through to dwarf2out_begin_prologue.
(vmsdbgout_source_line): Add column argument, pass it through to
dwarf2out_source_line.
* dbxout.c (dbxout_begin_prologue): Add column argument, adjust
dbxout_source_line caller.
(dbxout_source_line): Add column argument.

--- gcc/final.c.jj  2017-01-24 23:29:07.0 +0100
+++ gcc/final.c 2017-02-17 19:13:39.504406149 +0100
@@ -118,6 +118,9 @@ rtx_insn *current_output_insn;
 /* Line number of last NOTE.  */
 static int last_linenum;
 
+/* Column number of last NOTE.  */
+static int last_columnnum;
+
 /* Last discriminator written to assembly.  */
 static int last_discriminator;
 
@@ -133,9 +136,10 @@ static int high_function_linenum;
 /* Filename of last NOTE.  */
 static const char *last_filename;
 
-/* Override filename and line number.  */
+/* Override filename, line and column number.  */
 static const char *override_filename;
 static int override_linenum;
+static int override_columnnum;
 
 /* Whether to force emission of a line note before the next insn.  */
 static bool force_source_line = false;
@@ -1763,6 +1767,7 @@ final_start_function (rtx_insn *first, F
 
   last_filename = LOCATION_FILE (prologue_location);
   last_linenum = LOCATION_LINE (prologue_location);
+  last_columnnum = LOCATION_COLUMN (prologue_location);
   last_discriminator = discriminator = 0;
 
   high_block_linenum = high_function_linenum = last_linenum;
@@ -1771,10 +1776,10 @@ final_start_function (rtx_insn *first, F
 asan_function_start ();
 
   if (!DECL_IGNORED_P (current_function_decl))
-debug_hooks->begin_prologue (last_linenum, last_filename);
+debug_hooks->begin_prologue (last_linenum, last_columnnum, last_filename);
 
   if (!dwarf2_debug_info_emitted_p (current_function_decl))
-dwarf2out_begin_prologue (0, NULL);
+dwarf2out_begin_prologue (0, 0, NULL);
 
 #ifdef LEAF_REG_REMAP
   if (crtl->uses_only_leaf_regs)
@@ -2335,6 +2340,7 @@ final_scan_insn (rtx_insn *insn, FILE *f
{
  override_filename = LOCATION_FILE (*locus_ptr);
  override_linenum = LOCATION_LINE (*locus_ptr);
+ override_columnnum = LOCATION_COLUMN (*locus_ptr);
}
}
  break;
@@ -2370,11 +2376,13 @@ final_scan_insn (rtx_insn *insn, FILE *f
{
  override_filename = LOCATION_FILE (*locus_ptr);
  override_linenum = LOCATION_LINE (*locus_ptr);
+ override_columnnum = LOCATION_COLUMN (*locus_ptr);
}
  else
{
  override_filename = NULL;
  override_linenum = 0;
+ override_columnnum = 0;
}
}
  break;
@@ -2592,8 +2600,9 @@ final_scan_insn (rtx_insn *insn, FILE *f
  {
if (flag_verbose_asm)
  asm_show_source 

[PATCH] Emit column information in dwarf

2017-02-17 Thread Jakub Jelinek
Hi!

The GDB folks expressed interest in handling column information in
debug info, apparently clang emits it, but gcc does not.

I know we are late in the release cycle, so I'm not suggesting to
turn this on by default, but the following patch at least allows
users to request it through -gcolumn-info, so that GDB can be adjusted
to consume it and later on (GCC 8 or 9) we could perhaps switch the
default.

This patch handles just emitting DW_AT_decl_column and DW_AT_call_column
if requested (-gcolumn-info) and non-zero (i.e. the middle-end knows the
columns).
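
The two conditions used in the patch can be summarized in a small standalone sketch. The helper names `want_column_attribute` and `columns_compatible` are made up for illustration; the logic follows the `debug_column_info && s.column` test and the comparison added to gen_subprogram_die in the diff below.

```c
#include <stdbool.h>

/* A column attribute is emitted only if the user asked for it
   (-gcolumn-info) and the column is known, i.e. non-zero.  */
static bool
want_column_attribute (bool debug_column_info, unsigned int column)
{
  return debug_column_info && column != 0;
}

/* When matching against an existing DIE, the column is ignored unless
   columns are being recorded and the new column is known.  */
static bool
columns_compatible (bool debug_column_info, unsigned int old_column,
		    unsigned int new_column)
{
  return (!debug_column_info
	  || new_column == 0
	  || old_column == new_column);
}
```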

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-17  Jakub Jelinek  

* common.opt (gno-column-info, gcolumn-info): New options.
* dwarf2out.c (dwarf2_lineno_debug_hooks): Formatting fix.
(check_die): Also test for multiple DW_AT_decl_column attributes.
(add_src_coords_attributes, dwarf2out_imported_module_or_decl_1): Add
DW_AT_decl_column if requested.
(gen_subprogram_die): Compare and/or add also DW_AT_decl_column
if requested.
(gen_variable_die): Likewise.
(add_call_src_coords_attributes): Add DW_AT_call_column if requested.
* doc/invoke.texi (-gcolumn-info, -gno-column-info): Document.

--- gcc/common.opt.jj   2017-02-01 16:41:45.0 +0100
+++ gcc/common.opt  2017-02-17 11:47:14.233098170 +0100
@@ -2805,6 +2805,14 @@ gcoff
 Common Driver JoinedOrMissing Negative(gdwarf)
 Generate debug information in COFF format.
 
+gno-column-info
+Common Driver RejectNegative Var(debug_column_info,0) Init(0)
+Don't record DW_AT_decl_column and DW_AT_call_column in DWARF.
+
+gcolumn-info
+Common Driver RejectNegative Var(debug_column_info,1)
+Record DW_AT_decl_column and DW_AT_call_column in DWARF.
+
 gdwarf
 Common Driver JoinedOrMissing Negative(gdwarf-)
 Generate debug information in default version of DWARF format.
--- gcc/dwarf2out.c.jj  2017-02-09 23:01:46.0 +0100
+++ gcc/dwarf2out.c 2017-02-17 11:56:13.471834354 +0100
@@ -2732,7 +2732,7 @@ const struct gcc_debug_hooks dwarf2_line
   debug_nothing_int_int,/* begin_block */
   debug_nothing_int_int,/* end_block */
   debug_true_const_tree,/* ignore_block */
-  dwarf2out_source_line,/* source_line */
+  dwarf2out_source_line,/* source_line */
   debug_nothing_int_charstar,   /* begin_prologue */
   debug_nothing_int_charstar,   /* end_prologue */
   debug_nothing_int_charstar,   /* begin_epilogue */
@@ -6109,7 +6109,7 @@ check_die (dw_die_ref die)
   dw_attr_node *a;
   bool inline_found = false;
   int n_location = 0, n_low_pc = 0, n_high_pc = 0, n_artificial = 0;
-  int n_decl_line = 0, n_decl_file = 0;
+  int n_decl_line = 0, n_decl_column = 0, n_decl_file = 0;
   FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
 {
   switch (a->dw_attr)
@@ -6130,6 +6130,9 @@ check_die (dw_die_ref die)
case DW_AT_artificial:
  ++n_artificial;
  break;
+case DW_AT_decl_column:
+ ++n_decl_column;
+ break;
case DW_AT_decl_line:
  ++n_decl_line;
  break;
@@ -6141,7 +6144,7 @@ check_die (dw_die_ref die)
}
 }
   if (n_location > 1 || n_low_pc > 1 || n_high_pc > 1 || n_artificial > 1
-  || n_decl_line > 1 || n_decl_file > 1)
+  || n_decl_column > 1 || n_decl_line > 1 || n_decl_file > 1)
 {
   fprintf (stderr, "Duplicate attributes in DIE:\n");
   debug_dwarf_die (die);
@@ -20190,6 +20193,8 @@ add_src_coords_attributes (dw_die_ref di
   s = expand_location (DECL_SOURCE_LOCATION (decl));
   add_AT_file (die, DW_AT_decl_file, lookup_filename (s.file));
   add_AT_unsigned (die, DW_AT_decl_line, s.line);
+  if (debug_column_info && s.column)
+add_AT_unsigned (die, DW_AT_decl_column, s.column);
 }
 
 /* Add DW_AT_{,MIPS_}linkage_name attribute for the given decl.  */
@@ -21936,7 +21941,11 @@ gen_subprogram_die (tree decl, dw_die_re
   && (DECL_ARTIFICIAL (decl)
   || (get_AT_file (old_die, DW_AT_decl_file) == file_index
   && (get_AT_unsigned (old_die, DW_AT_decl_line)
-  == (unsigned) s.line
+  == (unsigned) s.line)
+  && (!debug_column_info
+  || s.column == 0
+  || (get_AT_unsigned (old_die, DW_AT_decl_column)
+  == (unsigned) s.column)
{
  subr_die = old_die;
 
@@ -21963,10 +21972,15 @@ gen_subprogram_die (tree decl, dw_die_re
add_AT_file (subr_die, DW_AT_decl_file, file_index);
  if (get_AT_unsigned (old_die, DW_AT_decl_line) != (unsigned) s.line)
add_AT_unsigned (subr_die, DW_AT_decl_line, s.line);
+ if (debug_column_info
+ && s.column
+ && (get_AT_unsigned (old_die, DW_AT_decl_column)
+ != (unsigned) s.column))
+  

[C++ PATCH] For -gdwarf-5 emit DW_TAG_variable instead of DW_TAG_member for C++ static data members

2017-02-17 Thread Jakub Jelinek
Hi!

DWARF5 that has been released recently had a fairly late change where
it says that C++ static data members should use DW_TAG_variable tag
instead of DW_TAG_member inside of the class DIE.

The following patch implements just that, not any further changes e.g.
to inline static data members (i.e. it still emits a declaration in
the class and specification right after it, just both will now be
DW_TAG_variable instead of DW_TAG_member followed by DW_TAG_variable).
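
The tag selection can be sketched in isolation as follows. This is a simplified model of the condition changed in gen_variable_die, not the function itself; `static_member_declaration_tag` is a hypothetical helper name, while the two tag values are the ones assigned by the DWARF standard.

```c
#include <stdbool.h>

/* Tag encodings as assigned by the DWARF standard.  */
enum
{
  DW_TAG_member = 0x0d,
  DW_TAG_variable = 0x34
};

/* Pick the tag for a variable DIE.  Before DWARF 5, the in-class
   declaration of a static data member is a DW_TAG_member; from DWARF 5 on
   it is a DW_TAG_variable, like every other variable DIE.  */
static int
static_member_declaration_tag (int dwarf_version, bool declaration,
			       bool in_class_scope)
{
  if (declaration && in_class_scope && dwarf_version < 5)
    return DW_TAG_member;
  return DW_TAG_variable;
}
```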

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-17  Jakub Jelinek  

* dwarf2out.c (add_linkage_name): Handle DW_TAG_variable with
class_scope_p parent like DW_TAG_member.
(gen_variable_die): For -gdwarf-5, use DW_TAG_variable instead of
DW_TAG_member for static data member declarations.
(gen_member_die): For -gdwarf-5 don't change DW_TAG_variable
to DW_TAG_member.

--- gcc/dwarf2out.c.jj  2017-02-17 11:56:13.0 +0100
+++ gcc/dwarf2out.c 2017-02-17 16:28:12.336632166 +0100
@@ -20226,7 +20226,8 @@ add_linkage_name (dw_die_ref die, tree d
   && VAR_OR_FUNCTION_DECL_P (decl)
   && TREE_PUBLIC (decl)
   && !(VAR_P (decl) && DECL_REGISTER (decl))
-  && die->die_tag != DW_TAG_member)
+  && die->die_tag != DW_TAG_member
+  && (die->die_tag != DW_TAG_variable || !class_scope_p (die->die_parent)))
 add_linkage_name_raw (die, decl);
 }
 
@@ -22818,9 +22819,10 @@ gen_variable_die (tree decl, tree origin
 }
 
   /* For static data members, the declaration in the class is supposed
- to have DW_TAG_member tag; the specification should still be
- DW_TAG_variable referencing the DW_TAG_member DIE.  */
-  if (declaration && class_scope_p (context_die))
+ to have DW_TAG_member tag in DWARF{3,4} and we emit it for compatibility
+ also in DWARF2; the specification should still be DW_TAG_variable
+ referencing the DW_TAG_member DIE.  */
+  if (declaration && class_scope_p (context_die) && dwarf_version < 5)
 var_die = new_die (DW_TAG_member, context_die, decl);
   else
 var_die = new_die (DW_TAG_variable, context_die, decl);
@@ -22858,7 +22860,8 @@ gen_variable_die (tree decl, tree origin
  != (unsigned) s.column))
add_AT_unsigned (var_die, DW_AT_decl_column, s.column);
 
- if (old_die->die_tag == DW_TAG_member)
+ if (old_die->die_tag == DW_TAG_member
+ || (dwarf_version >= 5 && class_scope_p (old_die->die_parent)))
add_linkage_name (var_die, decl);
}
 }
@@ -24089,7 +24092,8 @@ gen_member_die (tree type, dw_die_ref co
  && get_AT (child, DW_AT_specification) == NULL)
{
  reparent_child (child, context_die);
- child->die_tag = DW_TAG_member;
+ if (dwarf_version < 5)
+   child->die_tag = DW_TAG_member;
}
  else
splice_child_die (context_die, child);
@@ -24111,7 +24115,7 @@ gen_member_die (tree type, dw_die_ref co
}
 
   /* For C++ inline static data members emit immediately a DW_TAG_variable
-DIE that will refer to that DW_TAG_member through
+DIE that will refer to that DW_TAG_member/DW_TAG_variable through
 DW_AT_specification.  */
   if (TREE_STATIC (member)
  && (lang_hooks.decls.decl_dwarf_attribute (member, DW_AT_inline)

Jakub


[PATCH] Fix various issues with x86 builtins (PR target/79568)

2017-02-17 Thread Jakub Jelinek
Hi!

The masks for builtins seem to be quite messy.  Most of
the builtins have a single OPTION_MASK_ISA_* in there and that is
clear (i.e. that the builtin is only enabled/usable if that isa
bit is on).  Then we have 0 and that is meant for always enabled builtins.
Then there is OPTION_MASK_ISA_xxx | OPTION_MASK_ISA_64BIT and that
means (according to def_builtin code and comments) that we enable
only if TARGET_64BIT and the rest without the  | OPTION_MASK_ISA_64BIT
needs to be satisfied.  Then there is
OPTION_MASK_ISA_xxx | OPTION_MASK_ISA_AVX512VL and that is again
in def_builtin code and comments documented to be only enabled if
-mavx512vl and the rest without the | OPTION_MASK_ISA_AVX512VL
needs to be satisfied.  Then we have
OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4
OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_CRC32
OPTION_MASK_ISA_SSE | OPTION_MASK_ISA_3DNOW_A
def_builtin here as well as ix86_expand_builtin suggest here that it
is either SSE or 3dNOWa etc. (I believe at least the last one
is meant to be that, no idea about the other two).
And finally various builtins use ~OPTION_MASK_ISA_64BIT, which I have
no idea what people adding those really meant to express.  At least
for e.g. __builtin_ia32_pause I'd expect it wants to be enabled
everywhere (that is 0 though), while what both def_builtin and
ix86_expand_builtin actually implement for ~OPTION_MASK_ISA_64BIT
is that it is enabled whenever any ISA (other than OPTION_MASK_ISA_64BIT)
is set and disabled otherwise.  So, e.g. -m32 -march=i386 -mno-sahf -mno-mmx
-mno-sse disables __builtin_ia32_pause, __builtin_ia32_rdtsc,
__builtin_ia32_rolhi etc.

For OPTION_MASK_ISA_64BIT and OPTION_MASK_ISA_AVX512VL, while
def_builtin implements something documented (and what makes sense),
ix86_expand_builtin actually takes it as any of the ISAs enabled,
so if you manage to define the builtin earlier (e.g. through including
x86intrin.h), then one can use
OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL
builtin (meant to be enabled if both are on) in a function where just one
of them is on (-mavx512bw or -mavx512vl), if both aren't on, that will ICE.
Similarly, for OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, if the
builtin is defined earlier, it will be enabled even in -mno-lwp -m64
function and ICE (not for -m32 -mlwp, because -m64 is a global TU option
and so nothing will define the builtin).

This patch attempts to resolve it by changing all ~OPTION_MASK_ISA_64BIT
builtins to 0, handling xxx | OPTION_MASK_ISA_64BIT
as must satisfy TARGET_64BIT and xxx, handling yyy |
OPTION_MASK_ISA_AVX512VL as must satisfy -mavx512vl and yyy and for the
rest, if 0, enabling always, otherwise requiring at least one of the ISAs
enabled from the bitmask.
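
The proposed rule can be written down as a small predicate.  This is only a sketch of the rule as described in this mail, not the code in ix86_expand_builtin: the `ISA_*` bit values and the function name `builtin_usable` are placeholders chosen for illustration and do not match GCC's real OPTION_MASK_ISA_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments, for illustration only.  */
#define ISA_64BIT	(UINT64_C (1) << 0)
#define ISA_AVX512VL	(UINT64_C (1) << 1)
#define ISA_AVX512BW	(UINT64_C (1) << 2)
#define ISA_LWP		(UINT64_C (1) << 3)
#define ISA_FMA		(UINT64_C (1) << 4)
#define ISA_FMA4	(UINT64_C (1) << 5)

/* REQUIRED is the mask recorded for the builtin, ENABLED the set of ISAs
   active in the calling function.  */
static bool
builtin_usable (uint64_t required, uint64_t enabled)
{
  /* A zero mask means the builtin is always available.  */
  if (required == 0)
    return true;

  /* 64BIT and AVX512VL, when present, are hard requirements.  */
  uint64_t must = required & (ISA_64BIT | ISA_AVX512VL);
  if ((enabled & must) != must)
    return false;

  /* Of the remaining bits, at least one has to be enabled.  */
  uint64_t rest = required & ~must;
  return rest == 0 || (enabled & rest) != 0;
}
```

Under this rule a builtin with mask AVX512BW | AVX512VL is rejected when only one of the two is enabled, whereas FMA | FMA4 is accepted with either.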

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Or do we want a different behavior (then what and why)?

2017-02-17  Jakub Jelinek  

PR target/79568
* config/i386/i386.c (ix86_expand_builtin): Handle
OPTION_MASK_ISA_AVX512VL and OPTION_MASK_ISA_64BIT in
ix86_builtins_isa[fcode].isa as a requirement of those
flags and any other flag in the bitmask.
(ix86_init_mmx_sse_builtins): Use 0 instead of
~OPTION_MASK_ISA_64BIT as mask.
* config/i386/i386-builtin.def (__builtin_ia32_rdtsc,
__builtin_ia32_rdtscp, __builtin_ia32_pause, __builtin_ia32_bsrsi,
__builtin_ia32_rdpmc, __builtin_ia32_rolqi, __builtin_ia32_rolhi,
__builtin_ia32_rorqi, __builtin_ia32_rorhi): Likewise.

* gcc.target/i386/pr79568-1.c: New test.
* gcc.target/i386/pr79568-2.c: New test.
* gcc.target/i386/pr79568-3.c: New test.

--- gcc/config/i386/i386.c.jj   2017-02-17 11:11:27.0 +0100
+++ gcc/config/i386/i386.c  2017-02-17 17:10:07.899194674 +0100
@@ -32075,11 +32075,11 @@ ix86_init_mmx_sse_builtins (void)
   IX86_BUILTIN_SBB64);
 
   /* Read/write FLAGS.  */
-  def_builtin (~OPTION_MASK_ISA_64BIT, "__builtin_ia32_readeflags_u32",
+  def_builtin (0, "__builtin_ia32_readeflags_u32",
UNSIGNED_FTYPE_VOID, IX86_BUILTIN_READ_FLAGS);
   def_builtin (OPTION_MASK_ISA_64BIT, "__builtin_ia32_readeflags_u64",
UINT64_FTYPE_VOID, IX86_BUILTIN_READ_FLAGS);
-  def_builtin (~OPTION_MASK_ISA_64BIT, "__builtin_ia32_writeeflags_u32",
+  def_builtin (0, "__builtin_ia32_writeeflags_u32",
VOID_FTYPE_UNSIGNED, IX86_BUILTIN_WRITE_FLAGS);
   def_builtin (OPTION_MASK_ISA_64BIT, "__builtin_ia32_writeeflags_u64",
VOID_FTYPE_UINT64, IX86_BUILTIN_WRITE_FLAGS);
@@ -36723,9 +36723,18 @@ ix86_expand_builtin (tree exp, rtx targe
  Originally the builtin was not created if it wasn't applicable to the
  current ISA based on the command line switches.  With function specific
  options, we need to check in the context of the function making the call
- whether it is supported.  */
-  if ((ix86_builtins_isa[fcode].isa
-   && 

C++ PATCHes for c++/79549 and 79556 (ICE with auto parameter pack)

2017-02-17 Thread Jason Merrill
In 79556, we try to deduce an auto type from a dependent initializer
with null TREE_TYPE, which doesn't work; fixed by catching that case
in do_auto_deduction.

In 79549, we try to tsubst into the type of a NONTYPE_ARGUMENT_PACK,
which doesn't make sense for an auto parameter pack; in fact, it
doesn't make sense for the argument pack to have a type at all.  For
GCC 7 I'm fixing this by leaving the auto type in place; for GCC 8
we'll do away with TREE_TYPE on all NONTYPE_ARGUMENT_PACKs.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit d6e7709acaafa96813a2eed1ca7cb51f7f9847a8
Author: Jason Merrill 
Date:   Thu Feb 16 17:24:19 2017 -0500

PR c++/79556 - C++17 ICE with non-type auto

* pt.c (do_auto_deduction): Don't try to deduce from null type.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 73d6be3..093c0f9 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -25191,6 +25191,10 @@ do_auto_deduction (tree type, tree init, tree auto_node,
 /* C++17 class template argument deduction.  */
 return do_class_deduction (type, tmpl, init, flags, complain);
 
+  if (TREE_TYPE (init) == NULL_TREE)
+/* Nothing we can do with this, even in deduction context.  */
+return type;
+
   /* [dcl.spec.auto]: Obtain P from T by replacing the occurrences of auto
  with either a new invented type template parameter U or, if the
  initializer is a braced-init-list (8.5.4), with
diff --git a/gcc/testsuite/g++.dg/cpp1z/nontype-auto9.C b/gcc/testsuite/g++.dg/cpp1z/nontype-auto9.C
new file mode 100644
index 000..2daa346
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/nontype-auto9.C
@@ -0,0 +1,8 @@
+// PR c++/79556
+// { dg-options -std=c++1z }
+
+template  struct A;
+template  struct B;
+template  struct B {
+  static auto a = A::value;
+};
commit 1eeb6fca6149f32f711ab2b404ce442c4a40b550
Author: Jason Merrill 
Date:   Fri Feb 17 12:46:52 2017 -0500

PR c++/79549 - C++17 ICE with non-type auto template parameter pack

* pt.c (convert_template_argument): Just return an auto arg pack.
(tsubst_template_args): Don't tsubst an auto pack type.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 093c0f9..04479d4 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7612,6 +7612,10 @@ convert_template_argument (tree parm,
 
   if (tree a = type_uses_auto (t))
{
+ if (ARGUMENT_PACK_P (orig_arg))
+   /* There's nothing to check for an auto argument pack.  */
+   return orig_arg;
+
  t = do_auto_deduction (t, arg, a, complain, adc_unify, args);
  if (t == error_mark_node)
return error_mark_node;
@@ -11649,8 +11653,11 @@ tsubst_template_args (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 new_arg = error_mark_node;
 
   if (TREE_CODE (new_arg) == NONTYPE_ARGUMENT_PACK) {
-TREE_TYPE (new_arg) = tsubst (TREE_TYPE (orig_arg), args,
-  complain, in_decl);
+   if (type_uses_auto (TREE_TYPE (orig_arg)))
+ TREE_TYPE (new_arg) = TREE_TYPE (orig_arg);
+   else
+ TREE_TYPE (new_arg) = tsubst (TREE_TYPE (orig_arg), args,
+   complain, in_decl);
 TREE_CONSTANT (new_arg) = TREE_CONSTANT (orig_arg);
 
 if (TREE_TYPE (new_arg) == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/cpp1z/nontype-auto8.C b/gcc/testsuite/g++.dg/cpp1z/nontype-auto8.C
new file mode 100644
index 000..da4c88b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/nontype-auto8.C
@@ -0,0 +1,10 @@
+// PR c++/79549
+// { dg-options -std=c++1z }
+
+template 
+struct meow;
+
+template 
+struct meow { };
+
+meow<1> m;
commit 677e35c07a536344ce8bb74e97a72ab05cdb4da4
Author: Jason Merrill 
Date:   Thu Feb 16 16:31:26 2017 -0500

PR c++/79549 - C++17 ICE with non-type auto template parameter pack

* pt.c (convert_template_argument): Just return an argument pack.
(coerce_template_parameter_pack, template_parm_to_arg)
(extract_fnparm_pack, make_argument_pack, tsubst_template_args)
(tsubst_decl, tsubst, type_unification_real, unify_pack_expansion):
Don't set the type of a NONTYPE_ARGUMENT_PACK.
* parser.c (make_char_string_pack, make_string_pack): Likewise.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 060962d..7cba266 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4150,7 +4150,6 @@ make_char_string_pack (tree value)
 
   /* Build the argument packs.  */
   SET_ARGUMENT_PACK_ARGS (argpack, charvec);
-  TREE_TYPE (argpack) = char_type_node;
 
   TREE_VEC_ELT (argvec, 0) = argpack;
 
@@ -4186,7 +4185,6 @@ make_string_pack (tree value)
 
   /* Build the argument packs.  */
   SET_ARGUMENT_PACK_ARGS (argpack, charvec);
-  TREE_TYPE (argpack) = str_char_type_node;
 
   TREE_VEC_ELT (argvec, 1) = 

[PATCH] Fix -m3dnowa (PR target/79569)

2017-02-17 Thread Jakub Jelinek
Hi!

-m3dnowa is an undocumented option that always errors out.
The following patch fixes it and makes it do the obvious thing.
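
As a rough standalone sketch of what "the obvious thing" amounts to: enabling the option also enables the ISAs it builds on, while disabling it clears only its own bit.  The bit values, the struct and the function name are hypothetical, and the single-bit UNSET mask is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real masks come from i386.opt.  */
#define MASK_MMX      (UINT64_C (1) << 0)
#define MASK_3DNOW    (UINT64_C (1) << 1)
#define MASK_3DNOW_A  (UINT64_C (1) << 2)

/* Enabling an extension also enables what it builds on.  */
#define MASK_3DNOW_SET     (MASK_3DNOW | MASK_MMX)
#define MASK_3DNOW_A_SET   (MASK_3DNOW_A | MASK_3DNOW_SET)
/* Assumed here: disabling leaves plain 3DNow! and MMX untouched.  */
#define MASK_3DNOW_A_UNSET MASK_3DNOW_A

struct isa_state
{
  uint64_t flags;          /* ISAs currently enabled.  */
  uint64_t explicit_flags; /* ISAs the user mentioned explicitly.  */
};

/* Handle -m3dnowa (VALUE true) or -mno-3dnowa (VALUE false).  */
static void
handle_m3dnowa (struct isa_state *s, bool value)
{
  if (value)
    {
      s->flags |= MASK_3DNOW_A_SET;
      s->explicit_flags |= MASK_3DNOW_A_SET;
    }
  else
    {
      s->flags &= ~MASK_3DNOW_A_UNSET;
      s->explicit_flags |= MASK_3DNOW_A_UNSET;
    }
}
```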

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-17  Jakub Jelinek  

PR target/79569
* config/i386/i386.opt (m3dnowa): Replace Undocumented with Report.
* common/config/i386/i386-common.c (OPTION_MASK_ISA_3DNOW_A_SET): Define.
(ix86_handle_option): Handle OPT_m3dnowa.
* doc/invoke.texi (-m3dnowa): Document.
* doc/extend.texi (__builtin_ia32_pmulhuw, __builtin_ia32_pf2iw): Use
-m3dnowa instead of -m3dnow -march=athlon.

* gcc.target/i386/3dnowA-3.c: New test.

--- gcc/config/i386/i386.opt.jj 2017-01-16 12:28:35.0 +0100
+++ gcc/config/i386/i386.opt 2017-02-17 11:23:06.674671212 +0100
@@ -614,7 +614,7 @@ Target Report Mask(ISA_3DNOW) Var(ix86_i
 Support 3DNow! built-in functions.
 
 m3dnowa
-Target Undocumented Mask(ISA_3DNOW_A) Var(ix86_isa_flags) Save
+Target Report Mask(ISA_3DNOW_A) Var(ix86_isa_flags) Save
 Support Athlon 3Dnow! built-in functions.
 
 msse
--- gcc/common/config/i386/i386-common.c.jj 2017-01-12 22:29:00.0 +0100
+++ gcc/common/config/i386/i386-common.c 2017-02-17 10:55:20.023152107 +0100
@@ -35,6 +35,8 @@ along with GCC; see the file COPYING3.
 #define OPTION_MASK_ISA_MMX_SET OPTION_MASK_ISA_MMX
 #define OPTION_MASK_ISA_3DNOW_SET \
   (OPTION_MASK_ISA_3DNOW | OPTION_MASK_ISA_MMX_SET)
+#define OPTION_MASK_ISA_3DNOW_A_SET \
+  (OPTION_MASK_ISA_3DNOW_A | OPTION_MASK_ISA_3DNOW_SET)
 
 #define OPTION_MASK_ISA_SSE_SET OPTION_MASK_ISA_SSE
 #define OPTION_MASK_ISA_SSE2_SET \
@@ -291,7 +293,17 @@ ix86_handle_option (struct gcc_options *
   return true;
 
 case OPT_m3dnowa:
-  return false;
+  if (value)
+   {
+ opts->x_ix86_isa_flags |= OPTION_MASK_ISA_3DNOW_A_SET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_3DNOW_A_SET;
+   }
+  else
+   {
+ opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_3DNOW_A_UNSET;
+ opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_3DNOW_A_UNSET;
+   }
+  return true;
 
 case OPT_msse:
   if (value)
--- gcc/doc/invoke.texi.jj  2017-02-16 12:00:36.0 +0100
+++ gcc/doc/invoke.texi 2017-02-17 11:31:38.772731527 +0100
@@ -1188,9 +1188,9 @@ See RS/6000 and PowerPC Options.
 -mavx512bw  -mavx512dq  -mavx512ifma  -mavx512vbmi  -msha  -maes @gol
 -mpclmul  -mfsgsbase  -mrdrnd  -mf16c  -mfma @gol
 -mprefetchwt1  -mclflushopt  -mxsavec  -mxsaves @gol
--msse4a  -m3dnow  -mpopcnt  -mabm  -mbmi  -mtbm  -mfma4  -mxop  -mlzcnt @gol
--mbmi2  -mfxsr  -mxsave  -mxsaveopt  -mrtm  -mlwp  -mmpx  -mmwaitx @gol
--mclzero  -mpku  -mthreads @gol
+-msse4a  -m3dnow  -m3dnowa  -mpopcnt  -mabm  -mbmi  -mtbm  -mfma4  -mxop @gol
+-mlzcnt  -mbmi2  -mfxsr  -mxsave  -mxsaveopt  -mrtm  -mlwp  -mmpx  @gol
+-mmwaitx  -mclzero  -mpku  -mthreads @gol
 -mms-bitfields  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically  -mstringop-strategy=@var{alg} @gol
 -mmemcpy-strategy=@var{strategy}  -mmemset-strategy=@var{strategy} @gol
@@ -25004,6 +25004,9 @@ preferred alignment to @option{-mpreferr
 @itemx -m3dnow
 @opindex m3dnow
 @need 200
+@itemx -m3dnowa
+@opindex m3dnowa
+@need 200
 @itemx -mpopcnt
 @opindex mpopcnt
 @need 200
@@ -25053,7 +25056,7 @@ These switches enable the use of instruc
 SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, AVX512PF, AVX512ER, AVX512CD,
 SHA, AES, PCLMUL, FSGSBASE, RDRND, F16C, FMA, SSE4A, FMA4, XOP, LWP, ABM,
 AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA AVX512VBMI, BMI, BMI2, FXSR,
-XSAVE, XSAVEOPT, LZCNT, RTM, MPX, MWAITX, PKU or 3DNow!@:
+XSAVE, XSAVEOPT, LZCNT, RTM, MPX, MWAITX, PKU, 3DNow!@: or enhanced 3DNow!@:
 extended instruction sets.  Each has a corresponding @option{-mno-} option
 to disable use of these instructions.
 
--- gcc/doc/extend.texi.jj  2017-02-13 12:20:51.0 +0100
+++ gcc/doc/extend.texi 2017-02-17 12:02:22.195884349 +0100
@@ -19513,9 +19513,8 @@ v2si __builtin_ia32_psradi (v2si, int)
 @end smallexample
 
 The following built-in functions are made available either with
-@option{-msse}, or with a combination of @option{-m3dnow} and
-@option{-march=athlon}.  All of them generate the machine
-instruction that is part of the name.
+@option{-msse}, or with @option{-m3dnowa}.  All of them generate
+the machine instruction that is part of the name.
 
 @smallexample
 v4hi __builtin_ia32_pmulhuw (v4hi, v4hi)
@@ -20615,9 +20614,8 @@ v2sf __builtin_ia32_pi2fd (v2si)
 v4hi __builtin_ia32_pmulhrw (v4hi, v4hi)
 @end smallexample
 
-The following built-in functions are available when both @option{-m3dnow}
-and @option{-march=athlon} are used.  All of them generate the machine
-instruction that is part of the name.
+The following built-in functions are available when @option{-m3dnowa} is used.
+All of them generate the machine instruction that is part of the name.
 
 

[PATCH] Don't ICE on invalid operands in i?86 inline asm (PR target/79559)

2017-02-17 Thread Jakub Jelinek
Hi!

Asserts don't work really well on something we can't control in inline asm.
output_operand_lossage takes care to ICE outside of inline asm and error out
inside inline asm.
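
The general pattern, validating the operand and reporting a diagnostic rather than asserting, can be sketched outside of GCC as follows.  Everything here is a hypothetical stand-in: struct operand, report_lossage and print_operand_k only mimic the shape of the change and do not reproduce what the 'K' code actually prints.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an operand: a constant integer or
   something else (a register, say).  */
struct operand
{
  bool is_const_int;
  long value;
};

/* Stand-in for output_operand_lossage: record a diagnostic and keep
   going instead of aborting the compiler.  */
static int error_count;

static void
report_lossage (const char *msg)
{
  fprintf (stderr, "error: %s\n", msg);
  error_count++;
}

/* Print operand X for a code that needs a constant integer.  Returns
   false, after reporting, when the user supplied something else.  */
static bool
print_operand_k (FILE *file, const struct operand *x)
{
  if (!x->is_const_int)
    {
      report_lossage ("operand is not an integer, invalid operand code 'K'");
      return false;
    }
  fprintf (file, "%ld", x->value);
  return true;
}
```

The point of the design is that user-controlled input reaches this code, so a failed check must be a normal error path rather than an internal consistency failure.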

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-17  Jakub Jelinek  

PR target/79559
* config/i386/i386.c (ix86_print_operand): Use output_operand_lossage
instead of gcc_assert for K, r and R code checks.  Formatting fixes.

* gcc.target/i386/pr79559.c: New test.

--- gcc/config/i386/i386.c.jj   2017-02-14 20:34:49.0 +0100
+++ gcc/config/i386/i386.c  2017-02-17 11:11:27.636114439 +0100
@@ -17844,8 +17844,8 @@ ix86_print_operand (FILE *file, rtx x, i
  break;
 
default:
- output_operand_lossage
-   ("invalid operand size for operand code 'O'");
+ output_operand_lossage ("invalid operand size for operand "
+ "code 'O'");
  return;
}
 
@@ -17879,15 +17879,14 @@ ix86_print_operand (FILE *file, rtx x, i
  return;
 
default:
- output_operand_lossage
-   ("invalid operand size for operand code 'z'");
+ output_operand_lossage ("invalid operand size for operand "
+ "code 'z'");
  return;
}
}
 
  if (GET_MODE_CLASS (GET_MODE (x)) == MODE_FLOAT)
-   warning
- (0, "non-integer operand used with operand code 'z'");
+   warning (0, "non-integer operand used with operand code 'z'");
  /* FALLTHRU */
 
case 'Z':
@@ -17949,13 +17948,12 @@ ix86_print_operand (FILE *file, rtx x, i
}
  else
{
- output_operand_lossage
-   ("invalid operand type used with operand code 'Z'");
+ output_operand_lossage ("invalid operand type used with "
+ "operand code 'Z'");
  return;
}
 
- output_operand_lossage
-   ("invalid operand size for operand code 'Z'");
+ output_operand_lossage ("invalid operand size for operand code 'Z'");
  return;
 
case 'd':
@@ -18154,7 +18152,12 @@ ix86_print_operand (FILE *file, rtx x, i
  break;
 
case 'K':
- gcc_assert (CONST_INT_P (x));
+ if (!CONST_INT_P (x))
+   {
+ output_operand_lossage ("operand is not an integer, invalid "
+ "operand code 'K'");
+ return;
+   }
 
  if (INTVAL (x) & IX86_HLE_ACQUIRE)
 #ifdef HAVE_AS_IX86_HLE
@@ -18177,8 +18180,12 @@ ix86_print_operand (FILE *file, rtx x, i
  return;
 
case 'r':
- gcc_assert (CONST_INT_P (x));
- gcc_assert (INTVAL (x) == ROUND_SAE);
+ if (!CONST_INT_P (x) || INTVAL (x) != ROUND_SAE)
+   {
+ output_operand_lossage ("operand is not a specific integer, "
+ "invalid operand code 'r'");
+ return;
+   }
 
  if (ASSEMBLER_DIALECT == ASM_INTEL)
fputs (", ", file);
@@ -18191,7 +18198,12 @@ ix86_print_operand (FILE *file, rtx x, i
  return;
 
case 'R':
- gcc_assert (CONST_INT_P (x));
+ if (!CONST_INT_P (x))
+   {
+ output_operand_lossage ("operand is not an integer, invalid "
+ "operand code 'R'");
+ return;
+   }
 
  if (ASSEMBLER_DIALECT == ASM_INTEL)
fputs (", ", file);
@@ -18306,7 +18318,7 @@ ix86_print_operand (FILE *file, rtx x, i
  return;
 
default:
-   output_operand_lossage ("invalid operand code '%c'", code);
+ output_operand_lossage ("invalid operand code '%c'", code);
}
 }
 
--- gcc/testsuite/gcc.target/i386/pr79559.c.jj  2017-02-17 11:16:18.949176256 +0100
+++ gcc/testsuite/gcc.target/i386/pr79559.c 2017-02-17 11:17:10.514479159 +0100
@@ -0,0 +1,11 @@
+/* PR target/79559 */
+/* { dg-do compile } */
+
+void
+foo (int x)
+{
+  __asm__ volatile ("# %K0" : : "r" (x));  /* { dg-error "invalid operand code" } */
+  __asm__ volatile ("# %r0" : : "r" (x));  /* { dg-error "invalid operand code" } */
+  __asm__ volatile ("# %r0" : : "n" (129));/* { dg-error "invalid operand code" } */
+  __asm__ volatile ("# %R0" : : "r" (x));  /* { dg-error "invalid operand code" } */
+}

Jakub


Re: [PATCH v5] add -fprolog-pad=N,M option

2017-02-17 Thread Torsten Duwe
On Wed, Feb 08, 2017 at 12:48:56PM +0100, Jakub Jelinek wrote:
> 
> First of all, GCC is in stage4 and this isn't a fix for a regression, so it
> is not acceptable at this point.  Next stage1 will start in April or so.

I had gotten the impression there were sort of branches in the SCM; this
of course should go to HEAD / trunk / master or whatever it's called.

> The length of nop varies greatly across different architectures,
> some architectures have different spelling for it (e.g.
> ia64, s390{,x} use nop 0 and mmix uses swym 0), some architectures have
> tons of different nops with different sizes (see e.g. x86_64/i686), for some
> longer sizes it is often beneficial to emit a jump around the rest of nops
> instead of just tons of nops.
> 
> Even if it is counted always in nops and only nops are emitted (not really
> efficient, and I guess kernel cares about efficiency), the above is

Yes, efficiency is a goal, but not unchallenged. Besides, out of order micro-
architectures hardly suffer from NOPs inserted other than their occupation of
cache space and memory bandwidth. The worst impact I found on some in-order
aarch64 CPUs was a penalty of 1 clock cycle for 2 NOPs.

Note that this is a "you asked for it -- you get it" feature. The NOPs are
inserted only if you explicitly request them; normal operation is
unaffected.

> definitely not acceptable for a generic hook implementation, it contains too
> many ELF specific details as well as 64-bit target specific details etc.
> For the label, you should use something like:
>   static int ppad_no;
>   char tmp_label[...];
>   ASM_GENERATE_INTERNAL_LABEL (tmp_label, "LPPAD", ppad_no);
>   ppad_no++;
>   ASM_OUTPUT_LABEL (file, tmp_label);
> the "a",@progbits is not generic enough, you want
>   switch_to_section (get_section ("__prolog_pads_loc", 0, NULL));
> or so.  .previous doesn't work on many targets, you need to actually switch
> to the previously current section.  .quad is not supported by many
> assemblers, you want assemble_integer, with the appropriate size (say
> derived from size of pointer), decide whether in that section everything is
> aligned or unaligned, emit aligning directive if the former.

Thanks a lot for this blueprint! It saved me half of the work for the rewrite!

> There are many other ways in which the ABI can change, e.g.
> see how i386.c (ix86_function_regparm) can change ABI if
>   cgraph_local_info *i = &target->local;
>   if (i && i->local && i->can_change_signature)
> So you'd e.g. need to make sure that functions you emit the padding for
> can't change signature.  Bet you won't be able to handle e.g. IPA cp or many
> other IPA optimizations that change signature too, while the function from
> the source may still be around, perhaps nothing will call it because the
> callers have been changed to call its clone with different arguments.
> Or say IPA-ICF, if you want to live-patch only some function and not another
> function which happens to have identical body in the end.

Indeed. Live patching needs to take more care. But as I wrote, it is only
_my_ current use case; the feature itself should be usable more widely.
I wouldn't want to restrict future users on what they can optimise and what
they can't. Maybe someone wants to measure exactly these effects?

The point is: whatever instructions are to replace those NOPs, they will
very likely need a register or two. With IPA-RA, all bets are off. Without,
you can assume the caller-save regs, the scratch regs and some intra-procedure
regs to be available. Please let's leave this all to the framework
implementers. The same goes for nops before or after the entry point. This
is only a _mechanism_.

Torsten



Re: [PATCH v5] add -fprolog-pad=N,M option

2017-02-17 Thread Torsten Duwe
On Wed, Feb 15, 2017 at 11:01:16AM +, Richard Earnshaw (lists) wrote:
> On 13/01/17 12:19, Torsten Duwe wrote:
> 
> > +++ b/gcc/doc/invoke.texi
> > @@ -11341,6 +11341,27 @@ of the function name, it is considered to be a match.  For C99 and C++
> >  extended identifiers, the function name must be given in UTF-8, not
> >  using universal character names.
> >  
> > +@item -fprolog-pad=@var{N},@var{M}
> This needs to make it clear that M is optional.  Then below state that
> if omitted, M defaults to zero.

It was mentioned, further down in the paragraph. I moved it up.

> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -2157,6 +2157,26 @@ common_handle_option (struct gcc_options *opts,
> >  opts->x_flag_ipa_reference = false;
> >break;
> >  
> > +case OPT_fprolog_pad_:
> > +  {
> > +   const char *comma = strchr (arg, ',');
> > +   if (comma)
> > + {
> > +   prolog_nop_pad_size = atoi (arg);
> > +   prolog_nop_pad_entry = atoi (comma + 1);
> > + }
> > +   else
> > + {
> > +   prolog_nop_pad_size = atoi (arg);
> > +   prolog_nop_pad_entry = 0;
> > + }
> 
> Where's the error checking?  If I write gibberish after the option name
> then atoi will silently fail and return zero.  I'm not overly familiar
> with the option handling code, but I'm sure we have routines to do the
> heavy lifting here.

Yes, I had already found integral_argument, but that's unsuitable for a
comma-separated list, and arg is const so I couldn't punch a \0 there.
Using atoi was just lazy, admittedly.
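
For comparison, a checked parse of an "N" or "N,M" argument needs neither atoi nor a writable string.  The sketch below uses strtol and is not the v6 code; parse_count and parse_pad_arg are hypothetical names, and rejecting M greater than N is an assumption about what a sensible pad looks like.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse one non-negative decimal number starting at S.  On success,
   store it in *OUT and point *REST at the first unparsed character.  */
static bool
parse_count (const char *s, char **rest, long *out)
{
  /* Reject empty input, signs and leading whitespace up front.  */
  if (*s < '0' || *s > '9')
    return false;
  errno = 0;
  long v = strtol (s, rest, 10);
  if (errno == ERANGE)
    return false;
  *out = v;
  return true;
}

/* Parse "N" or "N,M" without modifying ARG.  M defaults to 0.
   Returns false on trailing garbage or (by assumption) if M > N.  */
static bool
parse_pad_arg (const char *arg, long *size, long *entry)
{
  char *rest;
  long n, m = 0;

  if (!parse_count (arg, &rest, &n))
    return false;
  if (*rest == ',' && !parse_count (rest + 1, &rest, &m))
    return false;
  if (*rest != '\0' || m > n)
    return false;

  *size = n;
  *entry = m;
  return true;
}
```

Unlike atoi, this distinguishes "0" from gibberish, so a malformed option can be diagnosed instead of silently becoming zero.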

> > +default_print_prolog_pad (FILE *file, unsigned HOST_WIDE_INT pad_size,
> > + bool record_p)
> > +{
> > +  if (record_p)
> > +fprintf (file, "1:");
> > +
> > +  unsigned i;
> > +  for (i = 0; i < pad_size; ++i)
> > +fprintf (file, "\tnop\n");
> > +
> > +  if (record_p)
> > +{
> > +  fprintf (file, "\t.section __prolog_pads_loc, \"a\",@progbits\n");
> > +  fprintf (file, "\t.quad 1b\n");
> > +  fprintf (file, "\t.previous\n");
> > +}
> > +}
> 
> NO!  Almost everything in this function is wrong, it needs to be done
> through suitable hooks that call into the machine back-ends that
> understand assembly flavours supported.

That was already mentioned in a previous version. That code assumes GAS+ELF.
It was the quick and dirty solution to get a working prototype.

Torsten



Re: [RFC PATCH, i386]: Use "lock orl $0, -4(%esp)" in mfence_nosse

2017-02-17 Thread Jakub Jelinek
On Fri, Feb 17, 2017 at 05:59:30PM +0100, Uros Bizjak wrote:
> > Unfortunately this makes valgrind unhappy about that:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1423434
> > I assume it will complain now on anything pre-SSE2 that contains the memory
> > barrier in 32-bit code.
> > Perhaps we should decrement and increment %esp around it or something
> > similar (or push/pop)?  Of course, that would mean we need to take care
> > of async unwind info.
> 
> Or, we can simply revert the patch? Not that the barrier performance
> of non-SSE 32bit targets matters...

Yeah.  People who care about performance should use -m64 anyway.

Jakub


Re: [RFC PATCH, i386]: Use "lock orl $0, -4(%esp)" in mfence_nosse

2017-02-17 Thread Uros Bizjak
On Fri, Feb 17, 2017 at 5:30 PM, Jakub Jelinek  wrote:
> On Sun, May 29, 2016 at 11:10:15PM +0200, Uros Bizjak wrote:
>> As explained in PR71245, comment #3 [1], it is better to use offset -4
>> to a %esp to implement a non-SSE memory fence instruction:
>>
>> -q-
>>
>> I guess it costs a code byte for a disp8 in the addressing mode, but
>> it avoids adding a lot of latency to a critical path involving a
>> spill/reload to (%esp), in functions where there is something at
>> (%esp).
>>
>> If it's an object larger than 4B, the lock orl could even cause a
>> store-forwarding stall when the object is reloaded.  (e.g. a double or
>> a vector).
>>
>> Ideally we could do the  lock orl  on some padding between two locals,
>> or on something in memory that wasn't going to be loaded soon, to
>> avoid touching more stack memory (which might be in the next page
>> down).  But we still want to do it on a cache line that's hot, so
>> going way up above our own stack frame isn't good either.
>
> Unfortunately this makes valgrind unhappy about that:
> https://bugzilla.redhat.com/show_bug.cgi?id=1423434
> I assume it will complain now on anything pre-SSE2 that contains the memory
> barrier in 32-bit code.
> Perhaps we should decrement and increment %esp around it or something
> similar (or push/pop)?  Of course, that would mean we need to take care
> of async unwind info.

Or, we can simply revert the patch? Not that the barrier performance
of non-SSE 32bit targets matters...

Uros.


[PATCH v6] add -fprolog-pad=N,M option

2017-02-17 Thread Torsten Duwe
Hi,

Thanks for all the feedback. Hopefully it's all incorporated now.
I will reply to you individually on the specific topics, but here is
the new v6 for you to rip apart ;-)

Changes since v5:

* ChangeLogs split, reshuffled, reformatted.

* cmdline option parsing again with integral_argument ()

* Documentation has less "pad"s

* completely reworked default_print_prolog_pad ()
  -- never liked the old version either.
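
For orientation, the N,M semantics described in the documentation part of the patch (a pad of N NOPs with the entry point M NOPs into it) can be sketched as a function that formats the layout into a buffer.  This is only an illustration and is unrelated to the reworked default_print_prolog_pad (); format_pad and append are hypothetical helpers, and the "nop" mnemonic and label syntax are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Append S to BUF (capacity LEN, *USED bytes in use).  Returns 0 on
   success, -1 if the result would not fit.  */
static int
append (char *buf, size_t len, size_t *used, const char *s)
{
  size_t n = strlen (s);
  if (*used + n >= len)
    return -1;
  memcpy (buf + *used, s, n + 1);
  *used += n;
  return 0;
}

/* Format a pad of N NOPs for function NAME, with the entry label
   placed after the first M NOPs.  Returns 0 on success, -1 if M > N
   or BUF is too small.  */
static int
format_pad (char *buf, size_t len, const char *name, unsigned n, unsigned m)
{
  size_t used = 0;

  if (m > n || len == 0)
    return -1;
  buf[0] = '\0';

  for (unsigned i = 0; i < m; i++)      /* NOPs before the entry point.  */
    if (append (buf, len, &used, "\tnop\n"))
      return -1;
  if (append (buf, len, &used, name) || append (buf, len, &used, ":\n"))
    return -1;
  for (unsigned i = m; i < n; i++)      /* NOPs after the entry point.  */
    if (append (buf, len, &used, "\tnop\n"))
      return -1;
  return 0;
}
```

With M omitted (0), the label comes first and all N NOPs follow it.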

Torsten



gcc/c-family/ChangeLog
2017-02-17  Torsten Duwe  

* c-attribs.c (c_common_attribute_table): Add entry for "prolog_pad".

gcc/lto/ChangeLog
2017-02-17  Torsten Duwe  

* lto-lang.c (lto_attribute_table): Add entry for "prolog_pad".

gcc/ChangeLog
2017-02-17  Torsten Duwe  

* common.opt: Introduce -fprolog-pad command line option,
and its variables prolog_nop_pad_size and prolog_nop_pad_entry.
* opts.c (common_handle_option): Add OPT_fprolog_pad_ case,
including a two-value parser.
* target.def (print_prolog_pad): New target hook.
* targhooks.h (default_print_prolog_pad): New function.
* targhooks.c (default_print_prolog_pad): Likewise.
* toplev.c (process_options): Switch off IPA-RA if
prolog pads are being generated.
* varasm.c (assemble_start_function): Look at the prolog-pad command
line switch and current function attributes and maybe generate NOP
instructions by calling the print_prolog_pad hook.
* doc/extend.texi: Document prolog_pad attribute.
* doc/invoke.texi: Document -fprolog-pad command line option.
* doc/tm.texi.in (TARGET_ASM_PRINT_PROLOG_PAD): New target hook.
* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-02-17  Torsten Duwe  

* c-c++-common/attribute-prolog_pad-1.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ce7fcaa..9f0f580 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -139,6 +139,7 @@ static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -345,6 +346,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "prolog_pad",1, 2, true, false, false,
+ handle_prolog_pad_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3173,3 +3176,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_prolog_pad_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index ad6baa3..02993b1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place before each function prologue by default
+Variable
+HOST_WIDE_INT prolog_nop_pad_size
+
+; And how far the asm entry point is into this pad
+Variable
+HOST_WIDE_INT prolog_nop_pad_entry
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2022,6 +2029,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fprolog-pad=
+Common Joined Optimization
+Insert NOP instructions before each function prologue.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3d1546a..ef7e985 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3076,6 +3076,23 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item prolog_pad
+@cindex @code{prolog_pad} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time
+by any means, padding the function entry with a number of NOPs can
+be used to provide a universal tool for instrumentation.  Usually,
+prolog padding is enabled globally using the @option{-fprolog-pad=N,M}
+command-line switch, and disabled with attribute @code{prolog_pad (0)}
+for functions that are part of the actual instrumentation framework.
+This conveniently avoids an endless recursion.
+The @code{prolog_pad} function attribute can be used 

[PATCH, rs6000] gcc 6 back port of xvcvsxdsp and xvcvuxdsp RTL fixes

2017-02-17 Thread Carl E. Love
GCC Maintainers:

Here is the GCC 6 branch backport of the mainline fixes for the RTL
definitions of the xvcvsxdsp and xvcvuxdsp instructions, commit r245460 on
2017-02-14.  The GCC 5 backport of these fixes has already been
approved and committed.

The RTL defined the instructions as taking a V2DF argument and returning
V4SI.  According to the Power ISA document, they should take a V2DI argument
and return V4SF.  Additionally, the define_insn for xvcvuxdsp
was fixed to generate the correct xvcvuxdsp instruction instead of the
xvcvuxwdp instruction.

The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
with no regressions.

Is the patch OK for gcc 6 branch?  

   Carl Love
---

gcc/ChangeLog:

2017-02-17  Carl Love  

   Backport from mainline commit r245460 on 2017-02-14

   PR 79545
   * config/rs6000/rs6000.c: Add case statement entry to make the xvcvuxdsp
   built-in argument unsigned.
   * config/rs6000/vsx.md: Fix the source and return operand types so they
   match the instruction definitions from the ISA document.  Fix typo
   in the instruction generation for the (define_insn "vsx_xvcvuxdsp"
   statement.

gcc/testsuite/ChangeLog:

2017-02-17  Carl Love  

   Backport from mainline commit r245460 on 2017-02-14

   PR 79545
   * gcc.target/powerpc/vsx-builtin-3.c: Add missing test case for the
   xvcvsxdsp and xvcvuxdsp instructions.
---
 gcc/config/rs6000/rs6000.c   |  1 +
 gcc/config/rs6000/vsx.md | 10 +-
 gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c | 23 +++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 7591e55..8661d4f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17080,6 +17080,7 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
   break;
 
   /* unsigned args, signed return.  */
+case VSX_BUILTIN_XVCVUXDSP:
 case VSX_BUILTIN_XVCVUXDDP_UNS:
 case ALTIVEC_BUILTIN_UNSFLOAT_V4SI_V4SF:
   h.uns_p[1] = 1;
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f9717f1..c7abb7b 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1827,19 +1827,19 @@
   [(set_attr "type" "vecdouble")])
 
 (define_insn "vsx_xvcvsxdsp"
-  [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa")
-   (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")]
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wd,?wa")
+   (unspec:V4SF [(match_operand:V2DI 1 "vsx_register_operand" "wf,wa")]
 UNSPEC_VSX_CVSXDSP))]
   "VECTOR_UNIT_VSX_P (V2DFmode)"
   "xvcvsxdsp %x0,%x1"
   [(set_attr "type" "vecfloat")])
 
 (define_insn "vsx_xvcvuxdsp"
-  [(set (match_operand:V4SI 0 "vsx_register_operand" "=wd,?wa")
-   (unspec:V4SI [(match_operand:V2DF 1 "vsx_register_operand" "wf,wa")]
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wd,?wa")
+   (unspec:V4SF [(match_operand:V2DI 1 "vsx_register_operand" "wf,wa")]
 UNSPEC_VSX_CVUXDSP))]
   "VECTOR_UNIT_VSX_P (V2DFmode)"
-  "xvcvuxwdp %x0,%x1"
+  "xvcvuxdsp %x0,%x1"
   [(set_attr "type" "vecdouble")])
 
 ;; Convert from 32-bit to 64-bit types
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index f337c1c..ff5296c 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -35,6 +35,8 @@
 /* { dg-final { scan-assembler "xvcmpgesp" } } */
 /* { dg-final { scan-assembler "xxsldwi" } } */
 /* { dg-final { scan-assembler-not "call" } } */
+/* { dg-final { scan-assembler "xvcvsxdsp" } } */
+/* { dg-final { scan-assembler "xvcvuxdsp" } } */
 
 extern __vector int si[][4];
 extern __vector short ss[][4];
@@ -50,7 +52,9 @@ extern __vector __pixel p[][4];
 #ifdef __VSX__
 extern __vector double d[][4];
 extern __vector long sl[][4];
+extern __vector long long sll[][4];
 extern __vector unsigned long ul[][4];
+extern __vector unsigned long long ull[][4];
 extern __vector __bool long bl[][4];
 #endif
 
@@ -211,3 +215,22 @@ int do_xxsldwi (void)
   d[i][0] = __builtin_vsx_xxsldwi (d[i][1], d[i][2], 3); i++;
   return i;
 }
+
+int do_xvcvsxdsp (void)
+{
+  int i = 0;
+
+  f[i][0] = __builtin_vsx_xvcvsxdsp (sll[i][1]); i++;
+
+  return i;
+}
+
+int do_xvcvuxdsp (void)
+{
+  int i = 0;
+
+  f[i][0] = __builtin_vsx_xvcvuxdsp (ull[i][1]); i++;
+
+  return i;
+}
+
-- 
1.9.1





Re: [RFC PATCH, i386]: Use "lock orl $0, -4(%esp)" in mfence_nosse

2017-02-17 Thread Jakub Jelinek
On Sun, May 29, 2016 at 11:10:15PM +0200, Uros Bizjak wrote:
> As explained in PR71245, comment #3 [1], it is better to use offset -4
> to a %esp to implement a non-SSE memory fence instruction:
> 
> -q-
> 
> I guess it costs a code byte for a disp8 in the addressing mode, but
> it avoids adding a lot of latency to a critical path involving a
> spill/reload to (%esp), in functions where there is something at
> (%esp).
> 
> If it's an object larger than 4B, the lock orl could even cause a
> store-forwarding stall when the object is reloaded.  (e.g. a double or
> a vector).
> 
> Ideally we could do the  lock orl  on some padding between two locals,
> or on something in memory that wasn't going to be loaded soon, to
> avoid touching more stack memory (which might be in the next page
> down).  But we still want to do it on a cache line that's hot, so
> going way up above our own stack frame isn't good either.

Unfortunately, this makes valgrind unhappy:
https://bugzilla.redhat.com/show_bug.cgi?id=1423434
I assume it will complain now on anything pre-SSE2 that contains the memory
barrier in 32-bit code.
Perhaps we should decrement and increment %esp around it or something
similar (or push/pop)?  Of course, that would mean we need to take care
of async unwind info.
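
For context, here is a minimal sketch of the kind of source that ends up in this pattern, assuming a plain C++11 full fence. On a 32-bit x86 target without SSE2, the fence below would be expected to expand through mfence_nosse, i.e. the lock orl under discussion. The names payload, ready, publish and consume are made up for illustration and do not come from the PR.

```cpp
#include <atomic>

// Hypothetical names, for illustration only.
static int payload = 0;
static std::atomic<bool> ready{false};

// Writer: store the data, then issue a full fence before raising the flag.
void publish(int value)
{
  payload = value;
  std::atomic_thread_fence(std::memory_order_seq_cst);
  ready.store(true, std::memory_order_relaxed);
}

// Reader: returns -1 until the flag is seen, otherwise the payload.
int consume()
{
  if (!ready.load(std::memory_order_relaxed))
    return -1;
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return payload;
}
```

Compiling something like this with -m32 and without SSE2 enabled should show which stack slot the lock orl touches; the exact flags needed depend on the configured defaults.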

> Attached RFC patch implements this proposal.
> 
> 2016-05-29  Uros Bizjak  
> 
> * config/i386/sync.md (mfence_nosse): Use "lock orl $0, -4(%esp)".
> 
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> 
> Any other opinion on this issue? The linux kernel also implements
> memory fence like the above proposal.
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71245#c3
> 
> Uros.

> Index: config/i386/sync.md
> ===
> --- config/i386/sync.md   (revision 236863)
> +++ config/i386/sync.md   (working copy)
> @@ -98,7 +98,7 @@
>   (unspec:BLK [(match_dup 0)] UNSPEC_MFENCE))
> (clobber (reg:CC FLAGS_REG))]
>"!(TARGET_64BIT || TARGET_SSE2)"
> -  "lock{%;} or{l}\t{$0, (%%esp)|DWORD PTR [esp], 0}"
> +  "lock{%;} or{l}\t{$0, -4(%%esp)|DWORD PTR [esp-4], 0}"
>[(set_attr "memory" "unknown")])
>  
>  (define_expand "mem_thread_fence"


Jakub


patch to fix PR79541

2017-02-17 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79541

The patch was successfully bootstrapped and tested on x86-64.

Committed as rev. 245536.


Index: ChangeLog
===
--- ChangeLog	(revision 245535)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2017-02-17  Vladimir Makarov  
+
+	PR rtl-optimization/79541
+	* lra-constraints.c (curr_insn_transform): Remove wrong asm insn
+	instead of transforming it into USE.
+
 2017-02-17  Segher Boessenkool  
 
 	* config/rs6000/rs6000.md (extendsfdf2): Remove default arguments.
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 245484)
+++ lra-constraints.c	(working copy)
@@ -3773,9 +3773,9 @@ curr_insn_transform (bool check_only_p)
 	fatal_insn ("unable to generate reloads for:", curr_insn);
   error_for_asm (curr_insn,
 		 "inconsistent operand constraints in an %<asm%>");
-  /* Avoid further trouble with this insn.	*/
-  PATTERN (curr_insn) = gen_rtx_USE (VOIDmode, const0_rtx);
-  lra_invalidate_insn_data (curr_insn);
+  /* Avoid further trouble with this insn.  Don't generate use
+	 pattern here as we could use the insn SP offset.  */
+  lra_set_insn_deleted (curr_insn);
   return true;
 }
 


libgo patch committed: Update to final Go 1.8 release

2017-02-17 Thread Ian Lance Taylor
This patch to libgo updates the sources to the final Go 1.8 release.

Along with the update this fixes a problem that was always present but
only showed up with a new test in the reflect package.  When a program
used a **unsafe.Pointer and stored the value in an interface type, the
generated type descriptor pointed to the GC data for *unsafe.Pointer.
It did that by name, but we were not generating a variable with the
right name.

Bootstrapped and ran Go tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 245397)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-c3935e1f20ad5b1d4c41150f11fb266913c04df7
+893f0e4a707c6f10eb14842b18954486042f0fb3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/alldocs.go
===
--- libgo/go/cmd/go/alldocs.go  (revision 245052)
+++ libgo/go/cmd/go/alldocs.go  (working copy)
@@ -17,7 +17,7 @@
 // clean   remove object files
 // doc show documentation for package or symbol
 // env print Go environment information
-// bug print information for bug reports
+// bug start a bug report
 // fix run go tool fix on packages
 // fmt run gofmt on package sources
 // generate    generate Go files by processing source
@@ -324,15 +324,14 @@
 // each named variable on its own line.
 //
 //
-// Print information for bug reports
+// Start a bug report
 //
 // Usage:
 //
 // go bug
 //
-// Bug prints information that helps file effective bug reports.
-//
-// Bugs may be reported at https://golang.org/issue/new.
+// Bug opens the default browser and starts a new bug report.
+// The report includes useful system information.
 //
 //
 // Run go tool fix on packages
Index: libgo/go/cmd/go/get.go
===
--- libgo/go/cmd/go/get.go  (revision 245052)
+++ libgo/go/cmd/go/get.go  (working copy)
@@ -428,7 +428,7 @@ func downloadPackage(p *Package) error {
return fmt.Errorf("cannot download, $GOPATH not set. For more details see: 'go help gopath'")
}
// Guard against people setting GOPATH=$GOROOT.
-   if list[0] == goroot {
+   if filepath.Clean(list[0]) == filepath.Clean(goroot) {
return fmt.Errorf("cannot download, $GOPATH must not be set to $GOROOT. For more details see: 'go help gopath'")
}
if _, err := os.Stat(filepath.Join(list[0], "src/cmd/go/alldocs.go")); err == nil {
Index: libgo/go/cmd/go/go_test.go
===
--- libgo/go/cmd/go/go_test.go  (revision 245052)
+++ libgo/go/cmd/go/go_test.go  (working copy)
@@ -1683,173 +1683,111 @@ func homeEnvName() string {
}
 }
 
-// Test go env missing GOPATH shows default.
-func TestMissingGOPATHEnvShowsDefault(t *testing.T) {
+func TestDefaultGOPATH(t *testing.T) {
tg := testgo(t)
defer tg.cleanup()
tg.parallel()
-   tg.setenv("GOPATH", "")
-   tg.run("env", "GOPATH")
-
-   want := filepath.Join(os.Getenv(homeEnvName()), "go")
-   got := strings.TrimSpace(tg.getStdout())
-   if got != want {
-   t.Errorf("got %q; want %q", got, want)
-   }
-}
-
-// Test go get missing GOPATH causes go get to warn if directory doesn't exist.
-func TestMissingGOPATHGetWarnsIfNotExists(t *testing.T) {
-   testenv.MustHaveExternalNetwork(t)
-
-   if _, err := exec.LookPath("git"); err != nil {
-   t.Skip("skipping because git binary not found")
-   }
-
-   tg := testgo(t)
-   defer tg.cleanup()
+   tg.tempDir("home/go")
+   tg.setenv(homeEnvName(), tg.path("home"))
 
-   // setenv variables for test and defer deleting temporary home directory.
-   tg.setenv("GOPATH", "")
-   tmp, err := ioutil.TempDir("", "")
-   if err != nil {
-   t.Fatalf("could not create tmp home: %v", err)
-   }
-   defer os.RemoveAll(tmp)
-   tg.setenv(homeEnvName(), tmp)
+   tg.run("env", "GOPATH")
+   tg.grepStdout(regexp.QuoteMeta(tg.path("home/go")), "want GOPATH=$HOME/go")
 
-   tg.run("get", "-v", "github.com/golang/example/hello")
+   tg.setenv("GOROOT", tg.path("home/go"))
+   tg.run("env", "GOPATH")
+   tg.grepStdoutNot(".", "want unset GOPATH because GOROOT=$HOME/go")
 
-   want := fmt.Sprintf("created GOPATH=%s; see 'go help gopath'", filepath.Join(tmp, "go"))
-   got := strings.TrimSpace(tg.getStderr())
-   if !strings.Contains(got, want) {
-   t.Errorf("got %q; want %q", got, want)
-   }
+   tg.setenv("GOROOT", 

Re: [PATCH PR79562] fix bootstrap on FreeBSD

2017-02-17 Thread Jakub Jelinek
On Fri, Feb 17, 2017 at 07:24:45AM +0100, Andreas Tobler wrote:
> > Yeah, I understand. This is due to the fact that on FreeBSD trunk (aka
> > *-*-freebsd12) this commit
> > (https://svnweb.freebsd.org/base?view=revision=313560) dropped
> > the _WANT_RTENTRY from net/route.h.
> > IOW, all versions of FreeBSD before svn commit r313560 will build without the patch.
> 
> 
> Is it ok to apply on gcc5 and gcc6 branch too? They are also affected.

Ok.

Jakub


Re: [PATCH PR79562] fix bootstrap on FreeBSD

2017-02-17 Thread Andreas Tobler

On 16.02.17 22:20, Andreas Tobler wrote:
> On 16.02.17 22:03, Jakub Jelinek wrote:
>> On Thu, Feb 16, 2017 at 09:57:48PM +0100, Andreas Tobler wrote:
>>> is this patch ok for trunk?
>>>
>>> Fixes bootstrap for x86_64-*-freebsd12 where the internal struct rtentry has
>>> gone from userland.
>>>
>>> TIA,
>>> Andreas
>>>
>>> 2017-02-16  Andreas Tobler  
>>>
>>> PR sanitizer/79562
>>> * sanitizer_common/sanitizer_platform_limits_posix.cc: Cherry-pick
>>> upstream r294806.
>>
>> Ok, thanks.
>>
>> I'm just surprised by the
>> "The problem was introduced within the last 8 days."
>> comment in the PR, because this file has been modified last time on
>> 2016-11-08.
>
> Yeah, I understand. This is due to the fact that on FreeBSD trunk (aka
> *-*-freebsd12) this commit
> (https://svnweb.freebsd.org/base?view=revision=313560) dropped
> the _WANT_RTENTRY from net/route.h.
> IOW, all versions of FreeBSD before svn commit r313560 will build without the patch.

Is it ok to apply on gcc5 and gcc6 branch too? They are also affected.

TIA,

Andreas






Re: [PATCH, rs6000] Fix PR79261 (vec_xxpermdi is not endian-sensitive)

2017-02-17 Thread Segher Boessenkool
Hi Bill,

On Thu, Feb 16, 2017 at 02:16:02PM -0600, Bill Schmidt wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79261 records that the interface
> vec_xxpermdi isn't implemented in a bi-endian fashion; instead, it produces
> results appropriate for big-endian vector element numbering even when run on
> a little endian machine.  This is not part of the "official vector API" from
> the ELFv2 ABI document, but should still have appropriate bi-endian behavior.

Maybe this needs adding (or updating) some documentation?

> +;; Special version of xxpermdi that retains big-endian semantics.
> +(define_expand "vsx_xxpermdi__be"
> +  [(match_operand:VSX_L 0 "vsx_register_operand" "")
> +   (match_operand:VSX_L 1 "vsx_register_operand" "")
> +   (match_operand:VSX_L 2 "vsx_register_operand" "")
> +   (match_operand:QI 3 "u5bit_cint_operand" "")]

Please remove the "".

Okay with that and perhaps some doc changes.  Thanks,


Segher


[PATCH] Change default of param not being smaller than min.

2017-02-17 Thread Martin Liška
Hello.

We should not have a default value (different from -1) that is smaller than 
minimum.
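
The rule stated above can be sketched as a small consistency check. This is only an illustration, not code from params.c: the struct and function names are hypothetical, and the reading that a maximum not above the minimum means "no maximum" is an assumption about how the DEFPARAM triples are interpreted.

```cpp
#include <cstddef>

// Hypothetical mirror of a DEFPARAM entry: default, minimum, maximum.
struct param_entry
{
  const char *option;
  int default_value;
  int min_value;
  int max_value;   // assumed: a maximum not above the minimum means "no maximum"
};

// True if the default is usable: either the -1 sentinel or within range.
bool default_is_consistent(const param_entry &p)
{
  if (p.default_value == -1)
    return true;
  if (p.default_value < p.min_value)
    return false;
  if (p.max_value > p.min_value && p.default_value > p.max_value)
    return false;
  return true;
}

// Counts the entries of a table whose default violates the rule.
std::size_t count_inconsistent(const param_entry *table, std::size_t n)
{
  std::size_t bad = 0;
  for (std::size_t i = 0; i < n; ++i)
    if (!default_is_consistent(table[i]))
      ++bad;
  return bad;
}
```

With the triple 0, 1, 0 for min-nondebug-insn-uid the check fails, and with 1, 1, 0 it passes, which is the change proposed here.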

Ready to be installed after tests?
Martin
From 4ea71d245bda258ebaed22cb3661fff0265c7088 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Feb 2017 16:00:30 +0100
Subject: [PATCH] Change default of param not being smaller than min.

gcc/ChangeLog:

2017-02-17  Martin Liska  

	* params.def (PARAM_MIN_NONDEBUG_INSN_UID): Change default to 1.
---
 gcc/params.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/params.def b/gcc/params.def
index 023ca727648..66929beeb2a 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -965,7 +965,7 @@ DEFPARAM (PARAM_MAX_VARTRACK_REVERSE_OP_SIZE,
 DEFPARAM (PARAM_MIN_NONDEBUG_INSN_UID,
 	  "min-nondebug-insn-uid",
 	  "The minimum UID to be used for a nondebug insn.",
-	  0, 1, 0)
+	  1, 1, 0)
 
 DEFPARAM (PARAM_IPA_SRA_PTR_GROWTH_FACTOR,
 	  "ipa-sra-ptr-growth-factor",
-- 
2.11.0



[PATCH] rs6000: Fix extendsfdf2 for signaling NaNs

2017-02-17 Thread Segher Boessenkool
A cast from float to double should turn a signaling NaN into a quiet
NaN, if using -fsignaling-nans.  On PowerPC single-precision floats are
stored as double precision in registers, and so, the cast normally does
nothing.  This causes gcc.dg/pr59833.c to fail (it does such a cast,
and expects a quiet NaN as output).

This patch adds a new pattern, used with -fsignaling-nans in effect,
that creates an frsp instruction (or xsrsp) in this case.  Since the
input already is SFmode, that instruction turns signaling NaNs into
quiet NaNs and does nothing more.
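
To make the signaling/quiet distinction concrete, here is a bit-level sketch for IEEE 754 binary32. It is not what frsp does internally; it only models the observable effect on NaN encodings, assuming the usual convention (used on PowerPC and x86, among others) that the most significant fraction bit set means "quiet". All names are invented for the example.

```cpp
#include <cstdint>

// IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
const std::uint32_t EXP_MASK  = 0x7f800000u;
const std::uint32_t FRAC_MASK = 0x007fffffu;
const std::uint32_t QUIET_BIT = 0x00400000u;  // top fraction bit (assumed quiet marker)

// A NaN has an all-ones exponent and a non-zero fraction.
bool is_nan_bits(std::uint32_t b)
{
  return (b & EXP_MASK) == EXP_MASK && (b & FRAC_MASK) != 0;
}

// A signaling NaN is a NaN with the quiet bit clear.
bool is_signaling_bits(std::uint32_t b)
{
  return is_nan_bits(b) && (b & QUIET_BIT) == 0;
}

// Quieting sets the quiet bit on NaNs and leaves every other value alone.
std::uint32_t quiet_bits(std::uint32_t b)
{
  return is_nan_bits(b) ? (b | QUIET_BIT) : b;
}
```

For instance, the pattern 0x7fa00000 is a signaling NaN and quiets to 0x7fe00000, while 0x3f800000 (1.0f) passes through unchanged.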

Tested on powerpc64-linux {-m32,-m64}, committing to trunk.


Segher


2017-02-17  Segher Boessenkool  

* config/rs6000/rs6000.md (extendsfdf2): Remove default arguments.
If HONOR_SNANS (SFmode) force the input to a register.
(*extendsfdf2_fpr): Add !HONOR_SNANS (SFmode) condition.
(*extendsfdf2_snan): New pattern, used when using SNaNs; it generates
an frsp or similar insn.

---
 gcc/config/rs6000/rs6000.md | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b784bca..ec93010 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4648,15 +4648,19 @@ (define_insn "*cmp_fpr"
 
 ;; Floating point conversions
 (define_expand "extendsfdf2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-   (float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand" "")))]
+  [(set (match_operand:DF 0 "gpc_reg_operand")
+   (float_extend:DF (match_operand:SF 1 "reg_or_none500mem_operand")))]
   "TARGET_HARD_FLOAT && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) || TARGET_E500_DOUBLE)"
-  "")
+{
+  if (HONOR_SNANS (SFmode))
+operands[1] = force_reg (SFmode, operands[1]);
+})
 
 (define_insn_and_split "*extendsfdf2_fpr"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d,d,ws,?ws,wu,wb")
(float_extend:DF (match_operand:SF 1 "reg_or_mem_operand" "0,f,m,0,wy,Z,wY")))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT"
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+   && !HONOR_SNANS (SFmode)"
   "@
#
fmr %0,%1
@@ -4673,6 +4677,16 @@ (define_insn_and_split "*extendsfdf2_fpr"
 }
   [(set_attr "type" "fp,fpsimple,fpload,fp,fpsimple,fpload,fpload")])
 
+(define_insn "*extendsfdf2_snan"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=d,ws")
+   (float_extend:DF (match_operand:SF 1 "gpc_reg_operand" "f,wy")))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+   && HONOR_SNANS (SFmode)"
+  "@
+   frsp %0,%1
+   xsrsp %x0,%x1"
+  [(set_attr "type" "fp")])
+
 (define_expand "truncdfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
(float_truncate:SF (match_operand:DF 1 "gpc_reg_operand" "")))]
-- 
1.9.3



Re: [PATCH] Use HOST_WIDE_INT for a param calculation (PR rtl-optimization/79574).

2017-02-17 Thread Richard Biener
On Fri, Feb 17, 2017 at 3:18 PM, Martin Liška  wrote:
> Hi.
>
> Following patch prevents integer overflow with a huge param.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

Ok.

Richard.

> Martin


Re: [PATCH] Increase minimum for a param (PR rtl-optimization/79577).

2017-02-17 Thread Richard Biener
On Fri, Feb 17, 2017 at 3:17 PM, Martin Liška  wrote:
> Hello.
>
> Increasing the minimum of the param fixes the issue. I guess 0 as a value does not
> make any real sense.
>
> Ready for trunk?

Ok.

Richard.

> Thanks,
> Martin


[PATCH] Use HOST_WIDE_INT for a param calculation (PR rtl-optimization/79574).

2017-02-17 Thread Martin Liška
Hi.

Following patch prevents integer overflow with a huge param.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
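
A minimal sketch of the arithmetic involved, assuming a 32-bit int and using std::int64_t as a stand-in for HOST_WIDE_INT. In the real code the narrow multiplication is signed and therefore undefined on overflow; the helper below only simulates a 32-bit wraparound so the effect is visible. The function names are invented, and costs_n_insns assumes the usual scaling by 4.

```cpp
#include <cstdint>

// Stand-in for COSTS_N_INSNS, assumed here to scale by 4.
inline int costs_n_insns(int n) { return n * 4; }

// Simulates the old `int` computation by wrapping to 32 bits explicitly
// (the conversion back to a signed type wraps on GCC and Clang).
std::int32_t distance_narrow(std::int32_t ratio, std::int32_t cost)
{
  std::uint32_t wrapped
    = static_cast<std::uint32_t>(ratio) * static_cast<std::uint32_t>(cost);
  return static_cast<std::int32_t>(wrapped) / 10;
}

// Widens before multiplying, as the patch does with HOST_WIDE_INT.
std::int64_t distance_wide(std::int32_t ratio, std::int32_t cost)
{
  return (static_cast<std::int64_t>(ratio) * cost) / 10;
}
```

With the ratio from the new test (2147483647) and the cost of a single insn, the wrapped product is -4, so the narrow result collapses to 0, whereas the wide computation yields 858993458.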

Ready to be installed?
Martin
From c92bbded326bb117d6c5bfcb1f505f2bbffc0b75 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Feb 2017 13:46:21 +0100
Subject: [PATCH] Use HOST_WIDE_INT for a param calculation (PR
 rtl-optimization/79574).

gcc/testsuite/ChangeLog:

2017-02-17  Martin Liska  

	PR rtl-optimization/79574
	* gcc.dg/pr79574.c: New test.

gcc/ChangeLog:

2017-02-17  Martin Liska  

	PR rtl-optimization/79574
	* gcse.c (want_to_gcse_p): Prevent integer overflow.
---
 gcc/gcse.c |  5 +++--
 gcc/testsuite/gcc.dg/pr79574.c | 11 +++
 2 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr79574.c

diff --git a/gcc/gcse.c b/gcc/gcse.c
index d28288df95c..5c6984c3240 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -790,7 +790,7 @@ want_to_gcse_p (rtx x, machine_mode mode, int *max_distance_ptr)
 	/* PRE doesn't implement max_distance restriction.  */
 	{
 	  int cost;
-	  int max_distance;
+	  HOST_WIDE_INT max_distance;
 
 	  gcc_assert (!optimize_function_for_speed_p (cfun)
 		  && optimize_function_for_size_p (cfun));
@@ -798,7 +798,8 @@ want_to_gcse_p (rtx x, machine_mode mode, int *max_distance_ptr)
 
 	  if (cost < COSTS_N_INSNS (GCSE_UNRESTRICTED_COST))
 	{
-	  max_distance = (GCSE_COST_DISTANCE_RATIO * cost) / 10;
+	  max_distance
+		= ((HOST_WIDE_INT)GCSE_COST_DISTANCE_RATIO * cost) / 10;
 	  if (max_distance == 0)
 		return 0;
 
diff --git a/gcc/testsuite/gcc.dg/pr79574.c b/gcc/testsuite/gcc.dg/pr79574.c
new file mode 100644
index 000..572f5230d5e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr79574.c
@@ -0,0 +1,11 @@
+/* PR rtl-optimization/79574 */
+/* { dg-do compile } */
+/* { dg-options "-Os --param gcse-cost-distance-ratio=2147483647" } */
+
+
+void a (void)
+{
+  volatile int b;
+  for (;; b)
+;
+}
-- 
2.11.0



[PATCH] Increase minimum for a param (PR rtl-optimization/79577).

2017-02-17 Thread Martin Liška
Hello.

Increasing the minimum of the param fixes the issue. I guess 0 as a value does not
make any real sense.

Ready for trunk?
Thanks,
Martin
From 84f39f731373a8b2703569ac23080b8278539ea9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Feb 2017 15:03:22 +0100
Subject: [PATCH] Increase minimum for a param (PR rtl-optimization/79577).

gcc/ChangeLog:

2017-02-17  Martin Liska  

	PR rtl-optimization/79577
	* params.def (selsched-max-sched-times): Increase minimum to 1.
---
 gcc/params.def | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/params.def b/gcc/params.def
index 123e6393ae0..023ca727648 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -657,7 +657,7 @@ DEFPARAM(PARAM_SELSCHED_MAX_LOOKAHEAD,
 DEFPARAM(PARAM_SELSCHED_MAX_SCHED_TIMES,
  "selsched-max-sched-times",
  "Maximum number of times that an insn could be scheduled.",
- 2, 0, 0)
+ 2, 1, 0)
 
 DEFPARAM(PARAM_SELSCHED_INSNS_TO_RENAME,
  "selsched-insns-to-rename",
-- 
2.11.0



C++ PATCH for c++/79533 (C++17 ICE with cast to reference)

2017-02-17 Thread Jason Merrill
The old copy elision code in build_over_call checks to make sure that
we don't hit it for a temporary object in C++17, where it should have
been handled at a higher level.  We were hitting it for this testcase
because the compiler was looking through the cast to reference type to
find the temporary inside, and trying to elide the copy.

I believe that this behavior is (and has been) wrong for all dialects;
in C++14 and below copy elision from a temporary talks about "a
temporary class object _that has not been bound to a reference_", and
the static_cast binds the temporary to a reference.  So this patch
looks through the outermost conversion to reference type, which comes
from binding to the parameter of the copy conversion, but stops at any
earlier conversion to reference type.
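
The distinction can be observed from user code. The sketch below is not the testcase from the patch; it is a hypothetical variant that counts copy-constructor calls instead of scanning the assembly. With the behavior described above, initializing from the cast expression performs exactly one copy, while direct initialization from the prvalue is normally elided (guaranteed in C++17, permitted earlier).

```cpp
// Counts copy-constructor calls so the effect of the cast is observable.
struct S
{
  static int copies;
  S() {}
  S(const S &) { ++copies; }
};
int S::copies = 0;

S make() { return S(); }

// Direct initialization from the temporary: the copy may be elided.
int copies_direct()
{
  S::copies = 0;
  S s(make());
  (void) s;
  return S::copies;
}

// The static_cast binds the temporary to a reference first,
// so the copy constructor has to run.
int copies_via_reference_cast()
{
  S::copies = 0;
  S s(static_cast<const S &>(make()));
  (void) s;
  return S::copies;
}
```

A compiler that still looks through the cast would report zero copies from copies_via_reference_cast, which is the behavior this patch removes.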

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 96ec73b986fe122ac306cdefb227499aa5b5221a
Author: Jason Merrill 
Date:   Thu Feb 16 16:17:33 2017 -0500

PR c++/79533 - C++17 ICE with temporary cast to reference

* call.c (build_over_call): Conversion to a reference prevents copy
elision.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 154509b..4ef444b 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7955,7 +7955,14 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
 
   /* Pull out the real argument, disregarding const-correctness.  */
   targ = arg;
-  while (CONVERT_EXPR_P (targ)
+  /* Strip the reference binding for the constructor parameter.  */
+  if (CONVERT_EXPR_P (targ)
+ && TREE_CODE (TREE_TYPE (targ)) == REFERENCE_TYPE)
+   targ = TREE_OPERAND (targ, 0);
+  /* But don't strip any other reference bindings; binding a temporary to a
+reference prevents copy elision.  */
+  while ((CONVERT_EXPR_P (targ)
+ && TREE_CODE (TREE_TYPE (targ)) != REFERENCE_TYPE)
 || TREE_CODE (targ) == NON_LVALUE_EXPR)
targ = TREE_OPERAND (targ, 0);
   if (TREE_CODE (targ) == ADDR_EXPR)
diff --git a/gcc/testsuite/g++.dg/init/elide6.C b/gcc/testsuite/g++.dg/init/elide6.C
new file mode 100644
index 000..d40bd9d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/elide6.C
@@ -0,0 +1,11 @@
+// PR c++/79533
+
+struct S {
+  S();
+  S(const S&);
+};
+S f();
+S s(static_cast<const S&>(f()));
+
+// The static_cast prevents copy elision.
+// { dg-final { scan-assembler "_ZN1SC1ERKS_" } }


[PATCH] Fix gimplify_expr ICE with Labels as Values (PR middle-end/79537)

2017-02-17 Thread Marek Polacek
We are crashing with a label as a value without an accompanying goto.
The problem is that TREE_SIDE_EFFECTS and FORCED_LABEL use the same
->base.side_effects_flag, so gimplify_expr is confused.  We don't
ICE with 'goto *&&L' because then we take this path:
    case GOTO_EXPR:
      /* If the target is not LABEL, then it is a computed jump
         and the target needs to be gimplified.  */
      if (TREE_CODE (GOTO_DESTINATION (*expr_p)) != LABEL_DECL)
        {
          ret = gimplify_expr (&GOTO_DESTINATION (*expr_p), pre_p,
                               NULL, is_gimple_val, fb_rvalue);
and because of that fb_rvalue we won't go to the switch where the ICE
occurred.  Because '*&&L' on its own is useless, I think we can simply
discard it, which is what the following does.

Bootstrapped/regtested on x86_64-linux, ok for trunk and 6?
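
For readers less familiar with the extension, here is a small sketch of labels as values used the ordinary way, with an accompanying computed goto. It relies on the GNU extension (accepted by GCC and Clang, not ISO C or C++), and the function and label names are invented for the example.

```cpp
// GNU extension (labels as values); not ISO C++.
int dispatch(int op)
{
  // Table of label addresses, indexed by opcode.
  static void *const targets[] = { &&do_add, &&do_neg };
  int acc = 10;

  if (op < 0 || op > 1)
    return acc;            // unknown opcode: leave the value alone

  goto *targets[op];       // computed jump through the table

do_add:
  return acc + 1;
do_neg:
  return -acc;
}
```

The case fixed by the patch is the degenerate one where the label address is dereferenced as a statement by itself and never used as a jump target.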

2017-02-17  Marek Polacek  

PR middle-end/79537
* gimplify.c (gimplify_expr): Handle unused *&&L.

* gcc.dg/comp-goto-4.c: New.

diff --git gcc/gimplify.c gcc/gimplify.c
index 1b9c8d2..5524357 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -12003,6 +12003,11 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 gimple_test_f, fallback);
  break;
 
+   case LABEL_DECL:
+ /* We can get here with code such as "*&&L;", where L is
+a LABEL_DECL that is marked as FORCED_LABEL.  Skip it.  */
+ break;
+
default:
   /* Anything else with side-effects must be converted to
  a valid statement before we get here.  */
diff --git gcc/testsuite/gcc.dg/comp-goto-4.c gcc/testsuite/gcc.dg/comp-goto-4.c
index e69de29..51a6a86 100644
--- gcc/testsuite/gcc.dg/comp-goto-4.c
+++ gcc/testsuite/gcc.dg/comp-goto-4.c
@@ -0,0 +1,21 @@
+/* PR middle-end/79537 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+/* { dg-require-effective-target indirect_jumps } */
+/* { dg-require-effective-target label_values } */
+
+void
+f (void)
+{
+L:
+  *&&L;
+}
+
+void
+f2 (void)
+{
+   void *p;
+L:
+   p = &&L;
+   *p; /* { dg-warning "dereferencing 'void \\*' pointer" } */
+}

Marek


[PATCH] Fix PR79576

2017-02-17 Thread Richard Biener

The following fixes too deep recursion with insane
--param max-ssa-name-query-depth by limiting the param appropriately.
Given the search is not width limited it can be highly exponential
in complexity thus I limited it fairly low (10).
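
To give a feel for the growth, here is a toy model rather than the actual SSA query code: it assumes every queried name has a fixed number of operands, each of which is queried recursively until the depth budget runs out. The function name and the fixed-width assumption are purely illustrative.

```cpp
// Hypothetical model: every node has `width` operands and the recursion
// stops once `depth` reaches 0.  Returns the number of nodes visited.
unsigned long visits(unsigned width, unsigned depth)
{
  unsigned long total = 1;          // the node itself
  if (depth == 0)
    return total;
  for (unsigned i = 0; i < width; ++i)
    total += visits(width, depth - 1);
  return total;
}
```

With two operands per node, a depth of 3 visits 15 nodes and a depth of 10 already visits 2047, so each extra level roughly doubles the work.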

Committed as obvious.

Richard.

2017-02-17  Richard Biener  

PR middle-end/79576
* params.def (max-ssa-name-query-depth): Limit to 10.

Index: gcc/params.def
===
--- gcc/params.def  (revision 245501)
+++ gcc/params.def  (working copy)
@@ -1237,7 +1237,7 @@ DEFPARAM (PARAM_MAX_SSA_NAME_QUERY_DEPTH
  "max-ssa-name-query-depth",
  "Maximum recursion depth allowed when querying a property of an"
  " SSA name.",
- 3, 1, 0)
+ 3, 1, 10)
 
 DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS,
  "max-rtl-if-conversion-insns",


Re: [C++ Patch] PR 79380

2017-02-17 Thread Paolo Carlini
... sorry, what I meant to propose uses 
INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P in the check, per the below. 
Both versions pass testing anyway, but the below seems more correct to me.
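
As a rough user-level analogue of the difference between the two predicates, the sketch below approximates "integral or unscoped enumeration type" with C++11 type traits. It is not how the front end implements the macro; it relies on the fact that unscoped enumerations convert implicitly to int while scoped ones do not. All names are invented.

```cpp
#include <type_traits>

// Rough C++11 approximation of "integral or unscoped enumeration type":
// unscoped enums convert implicitly to int, scoped ones do not.
template <typename T>
struct integral_or_unscoped_enum
  : std::integral_constant<bool,
      std::is_integral<T>::value
      || (std::is_enum<T>::value && std::is_convertible<T, int>::value)>
{};

enum plain_enum { A = 8 };
enum class scoped_enum { B = 8 };
typedef int (*fn_ptr)();
```

Under this approximation int and plain_enum qualify, while scoped_enum and a function pointer type do not.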


Thanks again,
Paolo.


Index: cp/typeck.c
===
--- cp/typeck.c (revision 245503)
+++ cp/typeck.c (working copy)
@@ -1795,6 +1795,12 @@ cxx_alignas_expr (tree e)
 
  the assignment-expression shall be an integral constant
 expression.  */
+
+  if (!INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (TREE_TYPE (e)))
+{
+  error ("%<alignas%> argument has non-integral type %qT", TREE_TYPE (e));
+  return error_mark_node;
+}
   
   return cxx_constant_value (e);
 }
Index: testsuite/g++.dg/cpp0x/alignas8.C
===
--- testsuite/g++.dg/cpp0x/alignas8.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/alignas8.C   (working copy)
@@ -0,0 +1,7 @@
+// PR c++/79380
+// { dg-do compile { target c++11 } }
+
+template < typename > constexpr int f () {  return 8;  }
+
+// should have been: struct alignas (f()) S {};
+struct alignas (f) S {};  // { dg-error "17:'alignas' argument has non-integral type" }


Re: [GIMPLE FE] Fix ICE's with default ssa-name parsing in c_parser_gimple_postfix_expression

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Prathamesh Kulkarni wrote:

> Hi,
> For the following test-case:
> 
> int __GIMPLE () foo (int a)
> {
>   return t0_1(D);
> }
> 
> The compiler emits the undeclared diagnostic and then ICE's in
> c_parser_gimple_postfix_expression because we don't check if
> c_parser_parse_ssa_name returned error_mark_node.
> ICE: https://gist.github.com/anonymous/45a01338d80faf4f21330bc9ee97ca5f
> 
> For this test-case:
> int __GIMPLE() foo(int a)
> {
>   int _1;
>   return _1(D);
> }
> 
> c_parser_gimple_postfix_expression calls set_ssa_default_def which
> crashes with segfault
> because SSA_VAR_P(_1) is NULL and set_ssa_default_def expects it to be 
> non-null.
> ICE: https://gist.github.com/anonymous/1df5a4e8a59eca289e214bbaf07a461a
> 
> The patch rejects the above test-case with:
> foo.c: In function ‘foo’:
> foo.c:4:10: error: anonymous SSA name cannot have default definition
>return _1(D);
> 
> OK to commit after bootstrap+test ?

Ok.

Thanks,
Richard.

[C++ Patch] PR 79380

2017-02-17 Thread Paolo Carlini

Hi,

this ICE on invalid seems quite similar to c++/69637 which I fixed 
recently: we again end up with an unhandled OVERLOAD in the main 
cxx_eval_constant_expression switch and in this case we don't diagnose 
at all the invalid input. Thus, I propose to add a check to 
cxx_alignas_expr completely similar to what we have in grokbitfield on 
the width. Tested x86_64-linux.
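
For comparison, a sketch of the valid form that the invalid input presumably meant to write: the alignas argument is a call to the constexpr function template, which is an integral constant expression, rather than the bare template name. The template argument int and the helper alignment_of_S are chosen arbitrarily for this illustration.

```cpp
template <typename> constexpr int f() { return 8; }

// Valid: the call f<int>() is an integral constant expression.
struct alignas(f<int>()) S { char c; };

// struct alignas(f) T {};   // invalid: `f` names an overload set, not a value

// Reports the alignment the compiler actually applied.
constexpr unsigned long alignment_of_S() { return alignof(S); }
```

Here alignof(S) evaluates to 8, while the commented-out form is the one the new check is meant to diagnose.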


Thanks, Paolo.

PS: I think we are moving away from pretty printing (%qE) expressions, 
thus I didn't try in the patchlet to print the argument, only its 
(wrong) type.


//

/cp
2017-02-17  Paolo Carlini  

PR c++/79380
* typeck.c (cxx_alignas_expr): Reject a non-integral alignas
argument.

/testsuite
2017-02-17  Paolo Carlini  

PR c++/79380
* g++.dg/cpp0x/alignas8.C: New.
Index: cp/typeck.c
===
--- cp/typeck.c (revision 245503)
+++ cp/typeck.c (working copy)
@@ -1795,6 +1795,12 @@ cxx_alignas_expr (tree e)
 
  the assignment-expression shall be an integral constant
 expression.  */
+
+  if (!INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (e)))
+{
+  error ("%<alignas%> argument has non-integral type %qT", TREE_TYPE (e));
+  return error_mark_node;
+}
   
   return cxx_constant_value (e);
 }
Index: testsuite/g++.dg/cpp0x/alignas8.C
===
--- testsuite/g++.dg/cpp0x/alignas8.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/alignas8.C   (working copy)
@@ -0,0 +1,7 @@
+// PR c++/79380
+// { dg-do compile { target c++11 } }
+
+template < typename > constexpr int f () {  return 8;  }
+
+// should have been: struct alignas (f()) S {};
+struct alignas (f) S {};  // { dg-error "17:'alignas' argument has non-integral type" }


RE: [PATCH] Enable RDPID instruction.

2017-02-17 Thread Koval, Julia
Hi,
Can you please commit it for me?

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Friday, February 17, 2017 1:30 PM
To: Koval, Julia 
Cc: GCC Patches 
Subject: Re: [PATCH] Enable RDPID instruction.

On Thu, Feb 16, 2017 at 11:56 PM, Koval, Julia  wrote:
> Sorry, here is the right patch (the previous one had a typo). The ChangeLog is right.
>
>
> -Original Message-
> From: Koval, Julia
> Sent: Thursday, February 16, 2017 11:31 PM
> To: 'Uros Bizjak' 
> Cc: GCC Patches 
> Subject: RE: [PATCH] Enable RDPID instruction.
>
> Sorry, fixed it.
>
> gcc/
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_RDPID_SET): New.
> (OPTION_MASK_ISA_PKU_UNSET): New.
> (ix86_handle_option): Handle -mrdpid.
> * config/i386/cpuid.h
> (bit_RDPID): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect RDPID 
> feature.
> * config/i386/i386-builtin.def (__builtin_ia32_rdpid): New.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle RDPID 
> flag.
> * config/i386/i386.c (ix86_target_string): Add -mrdpid to isa2_opts.
> (ix86_valid_target_attribute_inner_p): Add "rdpid".
> (ix86_expand_builtin): Handle IX86_BUILTIN_RDPID.
> * config/i386/i386.h (TARGET_RDPID, TARGET_RDPID_P): New.
> * config/i386/i386.md (define_insn "rdpid"): New.
> * config/i386/i386.opt: Add -mrdpid.
> * config/i386/immintrin.h (_rdpid_u32): New.
>
> gcc/testsuite/
> * gcc.target/i386/rdpid.c: New test.
> * gcc.target/i386/sse-12.c: Add -mrdpid.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-14.c: Ditto.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * g++.dg/other/i386-2.C: Ditto.
> * g++.dg/other/i386-3.C: Ditto.

OK for mainline.

Thanks,
Uros.


Re: Improving code generation in the nvptx back end

2017-02-17 Thread Thomas Schwinge
Hi!

On Fri, 17 Feb 2017 14:00:09 +0100, I wrote:
> [...] for "normal" functions there is no reason to use the
> ".param" space for passing arguments in and out of functions.  We can
> then get rid of the boilerplate code to move ".param %in_ar*" into ".reg
> %ar*", and the other way round for "%value_out"/"%value".  This will then
> also simplify the call sites, where all that code "evaporates".  That's
> actually something I started to look into, many months ago, and I now
> just dug out those changes, and will post them later.
> 
> (Very likely, the PTX "JIT" compiler will do the very same thing without
> difficulty, but why not directly generate code that is less verbose to
> read?)

Using my WIP patch, the generated PTX code changes/is simplified as
follows:

 // BEGIN GLOBAL FUNCTION DECL: f
-.visible .func (.param.f32 %value_out) f (.param.u32 %in_ar0, .param.u64 %in_ar1);
+.visible .func (.reg.f32 %value_out) f (.reg.u32 %ar0, .reg.u64 %ar1);

 // BEGIN GLOBAL FUNCTION DEF: f
-.visible .func (.param.f32 %value_out) f (.param.u32 %in_ar0, .param.u64 %in_ar1)
+.visible .func (.reg.f32 %value_out) f (.reg.u32 %ar0, .reg.u64 %ar1)
 {
.reg.f32 %value;
-   .reg.u32 %ar0;
-   ld.param.u32 %ar0, [%in_ar0];
-   .reg.u64 %ar1;
-   ld.param.u64 %ar1, [%in_ar1];
.reg.f64 %r23;
.reg.f32 %r24;
.reg.u32 %r25;
@@ -34,15 +30,15 @@ $L3:
mov.f32 %r24, 0f;
 $L1:
mov.f32 %value, %r24;
-   st.param.f32[%value_out], %value;
+   mov.f32 %value_out, %value;
ret;
 }

 // BEGIN GLOBAL FUNCTION DECL: main
-.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, .param.u64 %in_ar1);
+.visible .func (.reg.u32 %value_out) main (.reg.u32 %ar0, .reg.u64 %ar1);

 // BEGIN GLOBAL FUNCTION DEF: main
-.visible .func (.param.u32 %value_out) main (.param.u32 %in_ar0, .param.u64 %in_ar1)
+.visible .func (.reg.u32 %value_out) main (.reg.u32 %ar0, .reg.u64 %ar1)
 {
.reg.u32 %value;
.local .align 8 .b8 %frame_ar[32];
@@ -70,13 +66,9 @@ $L1:
st.u64  [%frame+24], %r29;
add.u64 %r31, %frame, 16;
{
-   .param.f32 %value_in;
-   .param.u32 %out_arg1;
-   st.param.u32 [%out_arg1], %r26;
-   .param.u64 %out_arg2;
-   st.param.u64 [%out_arg2], %r31;
-   call (%value_in), f, (%out_arg1, %out_arg2);
-   ld.param.f32%r32, [%value_in];
+   .reg.f32 %value_in;
+   call (%value_in), f, (%r26, %r31);
+   mov.f32 %r32, %value_in;
}
setp.eq.f32 %r33, %r32, 0f;
@%r33   bra $L5;
@@ -89,17 +81,13 @@ $L5:
st.u64  [%frame+24], %r36;
mov.u32 %r34, 1;
{
-   .param.f32 %value_in;
-   .param.u32 %out_arg1;
-   st.param.u32 [%out_arg1], %r34;
-   .param.u64 %out_arg2;
-   st.param.u64 [%out_arg2], %r31;
-   call (%value_in), f, (%out_arg1, %out_arg2);
-   ld.param.f32%r39, [%value_in];
+   .reg.f32 %value_in;
+   call (%value_in), f, (%r34, %r31);
+   mov.f32 %r39, %value_in;
}
setp.neu.f32%r40, %r39, 0f3f80;
@%r40   bra $L6;
mov.u32 %value, 0;
-   st.param.u32[%value_out], %value;
+   mov.u32 %value_out, %value;
ret;
 }

(Not yet directly using "%value_out" instead of the intermediate "%value"
register.)

Is such a patch something to pursue to completion?

--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -603,19 +603,32 @@ nvptx_promote_function_mode (const_tree type, machine_mode mode,
to an argument register and it is greater than zero if we're
copying to a specific hard register.  */
 
+static bool write_as_kernel (tree attrs);
 static int
 write_arg_mode (std::stringstream &s, int for_reg, int argno,
-   machine_mode mode)
+   machine_mode mode, const_tree decl)
 {
+  bool kernel = (decl != NULL_TREE) && write_as_kernel (DECL_ATTRIBUTES (decl));
   const char *ptx_type = nvptx_ptx_type_from_mode (mode, false);
 
   if (for_reg < 0)
 {
   /* Writing PTX prototype.  */
   s << (argno ? ", " : " (");
-  s << ".param" << ptx_type << " %in_ar" << argno;
+  if (kernel)
+   s << ".param" << ptx_type << " %in_ar" << argno;
+  else
+#if 0
+   s << ".reg" << ptx_type << " %in_ar" << argno;
+#else
+   s << ".reg" << ptx_type << " %ar" << argno;
+#endif
 }
+#if 0
   else
+#else
+  else if (kernel || for_reg)
+#endif
   

[GIMPLE FE] Fix ICE's with default ssa-name parsing in c_parser_gimple_postfix_expression

2017-02-17 Thread Prathamesh Kulkarni
Hi,
For the following test-case:

int __GIMPLE () foo (int a)
{
  return t0_1(D);
}

The compiler emits the undeclared-identifier diagnostic and then ICEs in
c_parser_gimple_postfix_expression because we don't check whether
c_parser_parse_ssa_name returned error_mark_node.
ICE: https://gist.github.com/anonymous/45a01338d80faf4f21330bc9ee97ca5f

For this test-case:
int __GIMPLE() foo(int a)
{
  int _1;
  return _1(D);
}

c_parser_gimple_postfix_expression calls set_ssa_default_def, which
segfaults because SSA_NAME_VAR (_1) is NULL and set_ssa_default_def
expects it to be non-null.
ICE: https://gist.github.com/anonymous/1df5a4e8a59eca289e214bbaf07a461a

The patch rejects the above test-case with:
foo.c: In function ‘foo’:
foo.c:4:10: error: anonymous SSA name cannot have default definition
   return _1(D);

OK to commit after bootstrap+test ?

Thanks,
Prathamesh
diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index d959877..e1535a3 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -864,6 +864,8 @@ c_parser_gimple_postfix_expression (c_parser *parser)
  c_parser_consume_token (parser);
  expr.value = c_parser_parse_ssa_name (parser, id, NULL_TREE,
version, ver_offset);
+ if (expr.value == error_mark_node)
+   return expr;
  set_c_expr_source_range (&expr, tok_range);
  /* For default definition SSA names.  */
  if (c_parser_next_token_is (parser, CPP_OPEN_PAREN)
@@ -878,6 +880,13 @@ c_parser_gimple_postfix_expression (c_parser *parser)
  c_parser_consume_token (parser);
  if (! SSA_NAME_IS_DEFAULT_DEF (expr.value))
{
+ if (!SSA_NAME_VAR (expr.value))
+   {
+ error_at (loc, "anonymous SSA name cannot have"
+   " default definition");
+ expr.value = error_mark_node;
+ return expr;
+   }
  set_ssa_default_def (cfun, SSA_NAME_VAR (expr.value),
   expr.value);
  SSA_NAME_DEF_STMT (expr.value) = gimple_build_nop ();
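
For illustration only, the control flow of the two added checks can be
modelled outside GCC.  This is a minimal sketch, not the parser itself:
`ssa_name`, `status`, `parse_default_def` and `check_examples` are
hypothetical stand-ins and do not exist in GCC.

```cpp
#include <cassert>

// Hypothetical outcome codes; the real parser reports errors via error_at.
enum class status { ok, undeclared, anonymous_default_def };

// Hypothetical stand-in for the parsed SSA name.
struct ssa_name {
  bool is_error;      // plays the role of error_mark_node
  bool has_variable;  // false for an anonymous name such as _1
};

// Models parsing `name(D)`: bail out at the first failed check, so the
// default definition is only registered for a valid, named SSA name.
status parse_default_def(const ssa_name &name) {
  if (name.is_error)
    return status::undeclared;             // first added check
  if (!name.has_variable)
    return status::anonymous_default_def;  // second added check
  return status::ok;
}

// Usage: the two test cases from the mail, plus a valid name.
void check_examples() {
  assert(parse_default_def(ssa_name{true, false}) == status::undeclared);
  assert(parse_default_def(ssa_name{false, false})
         == status::anonymous_default_def);
  assert(parse_default_def(ssa_name{false, true}) == status::ok);
}
```

The point of the sketch is only the ordering: both failure cases return
before any code that assumes a valid underlying variable is reached.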


[PATCH] PR c++/69523 make -Wliteral-suffix control warning

2017-02-17 Thread Jonathan Wakely

Currently there's no way to disable the warning about literal suffix
identifiers that use reserved names. This patch from Eric makes it
depend on the existing -Wliteral-suffix option. Currently that
controls warnings when encountering string literals concatenated with
macros, such as "%"PRIu32, which isn't the same problem, but I think
it's OK to reuse the warning option for this as well.

Tested powerpc64le-linux. OK for trunk now, or should it wait for
Stage 1?



gcc:

2017-02-17  Jonathan Wakely  

PR c++/69523
* doc/invoke.texi (C++ Dialect Options) [-Wliteral-suffix]: Update
description.

gcc/cp:

2017-02-17  Eric Fiselier  
Jonathan Wakely  

PR c++/69523
* parser.c (cp_parser_unqualified_id): Use OPT_Wliteral_suffix to
control warning about literal suffix identifiers without a leading
underscore.

gcc/testsuite:

2017-02-17  Eric Fiselier  
Jonathan Wakely  

PR c++/69523
* g++.dg/cpp0x/Wliteral-suffix2.C: New test.


commit 4d04ba1c1d19973d2b7e845a2f92cdd292054756
Author: Jonathan Wakely 
Date:   Fri Feb 17 10:50:20 2017 +

PR c++/69523 make -Wliteral-suffix control warning

gcc:

2017-02-17  Jonathan Wakely  

PR c++/69523
* doc/invoke.texi (C++ Dialect Options) [-Wliteral-suffix]: Update
description.

gcc/cp:

2017-02-17  Eric Fiselier  
Jonathan Wakely  

PR c++/69523
* parser.c (cp_parser_unqualified_id): Use OPT_Wliteral_suffix to
control warning about literal suffix identifiers without a leading
underscore.

gcc/testsuite:

2017-02-17  Eric Fiselier  
Jonathan Wakely  

PR c++/69523
* g++.dg/cpp0x/Wliteral-suffix2.C: New test.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 060962d..06f2beb 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -5812,8 +5812,9 @@ cp_parser_unqualified_id (cp_parser* parser,
  const char *name = UDLIT_OP_SUFFIX (id);
  if (name[0] != '_' && !in_system_header_at (input_location)
  && declarator_p)
-   warning (0, "literal operator suffixes not preceded by %<_%>"
-   " are reserved for future standardization");
+   warning (OPT_Wliteral_suffix,
+"literal operator suffixes not preceded by %<_%>"
+" are reserved for future standardization");
}
 
  return id;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 72038a1..0154e3d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2851,6 +2851,11 @@ int main() @{
 
 In this case, @code{PRId64} is treated as a separate preprocessing token.
 
+Additionally, warn when a user-defined literal operator is declared with
+a literal suffix identifier that doesn't begin with an underscore. Literal
+suffix identifiers that don't begin with an underscore are reserved for
+future standardization.
+
 This warning is enabled by default.
 
 @item -Wlto-type-mismatch
diff --git a/gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix2.C b/gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix2.C
new file mode 100644
index 000..129947d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/Wliteral-suffix2.C
@@ -0,0 +1,11 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-literal-suffix" }
+
+// Test user-defined literals.
+// Test "-Wno-literal-suffix" suppresses warnings on declaration without
+// leading underscore.
+
+long double operator"" nounder(long double); // { dg-bogus "" }
+
+template<char...>
+  int operator"" nounder(); // { dg-bogus "" }
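
For contrast with the test above, here is a minimal sketch of a literal
operator whose suffix does begin with an underscore, which is the form
available to user code and which should not trigger the warning under
discussion.  The suffix `_km` is a hypothetical name chosen for this
example.

```cpp
#include <cassert>

// A user-defined literal suffix with a leading underscore.
// Converts kilometres to metres.
constexpr long double operator""_km(long double kilometres) {
  return kilometres * 1000.0L;
}

// Checked at compile time: 2.0 km is 2000 m.
static_assert(2.0_km == 2000.0L, "2 km should be 2000 m");
```

Declaring the same operator with a suffix such as `nounder` is what the
patch lets users silence with -Wno-literal-suffix.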


Re: [PATCH][C++] Annotate more functions with MEM-STATs

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Jakub Jelinek wrote:

> On Fri, Feb 17, 2017 at 01:22:57PM +0100, Richard Biener wrote:
> > And even unify CXX_MEM_STAT_INFO and MEM_STAT_INFO, also dropping support
> > for host compilers < GCC 4.8, GCC 4.8 introduced __builtin_FILE and
> > friends (you'd have to bootstrap with older host compilers or clang
> > which doesn't seem to support those either and still claims to be
> > GCC 4.2.1 ...).
> 
> Do you mean drop support for host < GCC 4.8 if detailed mem stats gathering
> is requested, or dropping support for such host compilers altogether?
> I have no problem with the former, big problem with the latter.

The former.  Stage 1 would then behave more or less as if detailed mem
stats gathering were disabled (with a simple-minded approach
GATHER_STATISTICS would still be 1, but MEM_STAT_DECL etc. would expand
to nothing; useful statistics would thus require __builtin_FILE () support).

Richard.
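
As a rough illustration of why __builtin_FILE () and friends matter here:
used as default arguments, they are evaluated at the call site, so a
callee can record where it was called from without a `_stat` variant or
wrapper macro.  The sketch below assumes a host compiler that provides
these builtins; `call_site` and `tracked_alloc` are hypothetical names,
not GCC's actual MEM_STAT machinery.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical record of where an allocation was requested.
struct call_site {
  const char *file;
  int line;
};

// The default arguments are filled in at each call site, so FILE and LINE
// describe the caller, not this function.
void *tracked_alloc(std::size_t size, call_site *where,
                    const char *file = __builtin_FILE(),
                    int line = __builtin_LINE()) {
  assert(where != nullptr);
  where->file = file;
  where->line = line;
  return std::malloc(size);
}
```

A caller simply writes `tracked_alloc (16, &site)`; on a compiler without
the builtins, the extra parameters would have to be dropped, which
matches the "expand to nothing" fallback described above.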


Re: [PATCH] Enable RDPID instruction.

2017-02-17 Thread Uros Bizjak
On Thu, Feb 16, 2017 at 11:56 PM, Koval, Julia  wrote:
> Sorry, here is the right patch(previous one had a typo). Changelog is right.
>
>
> -Original Message-
> From: Koval, Julia
> Sent: Thursday, February 16, 2017 11:31 PM
> To: 'Uros Bizjak' 
> Cc: GCC Patches 
> Subject: RE: [PATCH] Enable RDPID instruction.
>
> Sorry, fixed it.
>
> gcc/
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_RDPID_SET): New.
> (OPTION_MASK_ISA_PKU_UNSET): New.
> (ix86_handle_option): Handle -mrdpid.
> * config/i386/cpuid.h
> (bit_RDPID): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect RDPID 
> feature.
> * config/i386/i386-builtin.def (__builtin_ia32_rdpid): New.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle RDPID 
> flag.
> * config/i386/i386.c (ix86_target_string): Add -mrdpid to isa2_opts.
> (ix86_valid_target_attribute_inner_p): Add "rdpid".
> (ix86_expand_builtin): Handle IX86_BUILTIN_RDPID.
> * config/i386/i386.h (TARGET_RDPID, TARGET_RDPID_P): New.
> * config/i386/i386.md (define_insn "rdpid"): New.
> * config/i386/i386.opt Add -mrdpid.
> * config/i386/immintrin.h (_rdpid_u32): New.
>
> gcc/testsuite/
> * gcc.target/i386/rdpid.c New test.
> * gcc.target/i386/sse-12.c: Add -mrdpid.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-14.c: Ditto.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * g++.dg/other/i386-2.C: Ditto.
> * g++.dg/other/i386-3.C: Ditto.

OK for mainline.

Thanks,
Uros.


Re: [PATCH][C++] Annotate more functions with MEM-STATs

2017-02-17 Thread Jakub Jelinek
On Fri, Feb 17, 2017 at 01:22:57PM +0100, Richard Biener wrote:
> And even unify CXX_MEM_STAT_INFO and MEM_STAT_INFO, also dropping support
> for host compilers < GCC 4.8, GCC 4.8 introduced __builtin_FILE and
> friends (you'd have to bootstrap with older host compilers or clang
> which doesn't seem to support those either and still claims to be
> GCC 4.2.1 ...).

Do you mean drop support for host < GCC 4.8 if detailed mem stats gathering
is requested, or dropping support for such host compilers altogether?
I have no problem with the former, big problem with the latter.

Jakub


Re: [PATCH][C++] Annotate more functions with MEM-STATs

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Richard Biener wrote:

> 
> The following annotates two key wrappers around copy_node in the C++ FE
> with MEM-STAT info (and with CXX_MEM_STAT_INFO this is surprisingly
> easy, without adding _stat variants and macros as we have for the classic
> way from the pre-C++ era).
> 
> It also annotates more type building functions in tree.c (all in the 
> attempt to get a better idea on where all the types are built for C++
> sources).
> 
> Bootstrapped without --enable-gather-detailed-mem-stats, bootstrapped
> with --enable-gather-detailed-mem-stats and visually inspected the
> improved stats on some example C++ code.
> 
> There are still some more functions worth annotating:
> 
> tree.c:8239 (build_range_type_1)                     840:  0.0%     666120:  3.2%
> tree.c:8362 (build_array_type_1)                    3024:  0.0%     671496:  3.2%
> tree.c:4841 (build_type_attribute_qual_variant)    13776:  0.1%      67032:  0.3%
> tree.c:8681 (build_method_type_directly)           41832:  0.3%     202944:  1.0%
> hash-table.h:736 (expand)                          15136:  0.1%    5826600: 27.8%
> tree.c:8532 (build_function_type)                 148344:  1.1%    3538080: 16.9%
> cp/lex.c:556 (retrofit_lang_decl)                  78628:  0.6%      43776:  0.2%
> cp/lex.c:526 (build_lang_decl_loc)                 87968:  0.6%     260776:  1.2%   3902184:  7.5%   536840: 23.8%   15444
> 
> is it ok if I go forward with this (at this stage, also for C++
> specifics above?)
> 
> Would it be welcome to scrap _stat and the macro wrappings everywhere
> at this stage?

And even unify CXX_MEM_STAT_INFO and MEM_STAT_INFO, also dropping support
for host compilers < GCC 4.8, GCC 4.8 introduced __builtin_FILE and
friends (you'd have to bootstrap with older host compilers or clang
which doesn't seem to support those either and still claims to be
GCC 4.2.1 ...).

Richard.

> Thanks,
> Richard.
> 
> 2017-02-17  Richard Biener  
> 
>   * tree.h (build_qualified_type): Annotate with CXX_MEM_STAT_INFO.
>   (build_distinct_type_copy): Likewise.
>   (build_variant_type_copy): Likewise.
>   * tree.c (build_qualified_type): Pass down mem-stat info.
>   (build_distinct_type_copy): Likewise.
>   (build_variant_type_copy): Likewise.
> 
>   cp/
>   * cp-tree.h (copy_decl): Annotate with CXX_MEM_STAT_INFO.
>   (copy_type): Likewise.
>   * lex.c (copy_decl): Pass down mem-stat info.
>   (copy_type): Likewise.
> 
> Index: gcc/tree.h
> ===
> --- gcc/tree.h(revision 245526)
> +++ gcc/tree.h(working copy)
> @@ -4258,7 +4258,7 @@ extern tree get_qualified_type (tree, in
>  /* Like get_qualified_type, but creates the type if it does not
> exist.  This function never returns NULL_TREE.  */
>  
> -extern tree build_qualified_type (tree, int);
> +extern tree build_qualified_type (tree, int CXX_MEM_STAT_INFO);
>  
>  /* Create a variant of type T with alignment ALIGN.  */
>  
> @@ -4276,8 +4276,8 @@ extern tree build_aligned_type (tree, un
>  
>  /* Make a copy of a type node.  */
>  
> -extern tree build_distinct_type_copy (tree);
> -extern tree build_variant_type_copy (tree);
> +extern tree build_distinct_type_copy (tree CXX_MEM_STAT_INFO);
> +extern tree build_variant_type_copy (tree CXX_MEM_STAT_INFO);
>  
>  /* Given a hashcode and a ..._TYPE node (for which the hashcode was made),
> return a canonicalized ..._TYPE node, so that duplicates are not made.
> Index: gcc/tree.c
> ===
> --- gcc/tree.c(revision 245526)
> +++ gcc/tree.c(working copy)
> @@ -6622,7 +6622,7 @@ get_qualified_type (tree type, int type_
> exist.  This function never returns NULL_TREE.  */
>  
>  tree
> -build_qualified_type (tree type, int type_quals)
> +build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
>  {
>tree t;
>  
> @@ -6632,7 +6632,7 @@ build_qualified_type (tree type, int typ
>/* If not, build it.  */
>if (!t)
>  {
> -  t = build_variant_type_copy (type);
> +  t = build_variant_type_copy (type PASS_MEM_STAT);
>set_type_quals (t, type_quals);
>  
>if (((type_quals & TYPE_QUAL_ATOMIC) == TYPE_QUAL_ATOMIC))
> @@ -6695,9 +6695,9 @@ build_aligned_type (tree type, unsigned
> TYPE_CANONICAL points to itself. */
>  
>  tree
> -build_distinct_type_copy (tree type)
> +build_distinct_type_copy (tree type MEM_STAT_DECL)
>  {
> -  tree t = copy_node (type);
> +  tree t = copy_node_stat (type PASS_MEM_STAT);
>  
>TYPE_POINTER_TO (t) = 0;
>TYPE_REFERENCE_TO (t) = 0;
> @@ -6733,11 +6733,11 @@ build_distinct_type_copy (tree type)
> require structural equality checks). */
>  
>  tree
> -build_variant_type_copy (tree type)
> +build_variant_type_copy (tree type MEM_STAT_DECL)
>  {
>tree t, m = TYPE_MAIN_VARIANT (type);
>  
> 

[PATCH][C++] Annotate more functions with MEM-STATs

2017-02-17 Thread Richard Biener

The following annotates two key wrappers around copy_node in the C++ FE
with MEM-STAT info (and with CXX_MEM_STAT_INFO this is surprisingly
easy, without adding _stat variants and macros as we have for the classic
way from the pre-C++ era).

It also annotates more type building functions in tree.c (all in the 
attempt to get a better idea on where all the types are built for C++
sources).

Bootstrapped without --enable-gather-detailed-mem-stats, bootstrapped
with --enable-gather-detailed-mem-stats and visually inspected the
improved stats on some example C++ code.

There are still some more functions worth annotating:

tree.c:8239 (build_range_type_1)                     840:  0.0%     666120:  3.2%
tree.c:8362 (build_array_type_1)                    3024:  0.0%     671496:  3.2%
tree.c:4841 (build_type_attribute_qual_variant)    13776:  0.1%      67032:  0.3%
tree.c:8681 (build_method_type_directly)           41832:  0.3%     202944:  1.0%
hash-table.h:736 (expand)                          15136:  0.1%    5826600: 27.8%
tree.c:8532 (build_function_type)                 148344:  1.1%    3538080: 16.9%
cp/lex.c:556 (retrofit_lang_decl)                  78628:  0.6%      43776:  0.2%
cp/lex.c:526 (build_lang_decl_loc)                 87968:  0.6%     260776:  1.2%   3902184:  7.5%   536840: 23.8%   15444

Is it OK if I go forward with this (at this stage, also for the C++
specifics above)?

Would it be welcome to scrap _stat and the macro wrappings everywhere
at this stage?

Thanks,
Richard.

2017-02-17  Richard Biener  

* tree.h (build_qualified_type): Annotate with CXX_MEM_STAT_INFO.
(build_distinct_type_copy): Likewise.
(build_variant_type_copy): Likewise.
* tree.c (build_qualified_type): Pass down mem-stat info.
(build_distinct_type_copy): Likewise.
(build_variant_type_copy): Likewise.

cp/
* cp-tree.h (copy_decl): Annotate with CXX_MEM_STAT_INFO.
(copy_type): Likewise.
* lex.c (copy_decl): Pass down mem-stat info.
(copy_type): Likewise.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 245526)
+++ gcc/tree.h  (working copy)
@@ -4258,7 +4258,7 @@ extern tree get_qualified_type (tree, in
 /* Like get_qualified_type, but creates the type if it does not
exist.  This function never returns NULL_TREE.  */
 
-extern tree build_qualified_type (tree, int);
+extern tree build_qualified_type (tree, int CXX_MEM_STAT_INFO);
 
 /* Create a variant of type T with alignment ALIGN.  */
 
@@ -4276,8 +4276,8 @@ extern tree build_aligned_type (tree, un
 
 /* Make a copy of a type node.  */
 
-extern tree build_distinct_type_copy (tree);
-extern tree build_variant_type_copy (tree);
+extern tree build_distinct_type_copy (tree CXX_MEM_STAT_INFO);
+extern tree build_variant_type_copy (tree CXX_MEM_STAT_INFO);
 
 /* Given a hashcode and a ..._TYPE node (for which the hashcode was made),
return a canonicalized ..._TYPE node, so that duplicates are not made.
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 245526)
+++ gcc/tree.c  (working copy)
@@ -6622,7 +6622,7 @@ get_qualified_type (tree type, int type_
exist.  This function never returns NULL_TREE.  */
 
 tree
-build_qualified_type (tree type, int type_quals)
+build_qualified_type (tree type, int type_quals MEM_STAT_DECL)
 {
   tree t;
 
@@ -6632,7 +6632,7 @@ build_qualified_type (tree type, int typ
   /* If not, build it.  */
   if (!t)
 {
-  t = build_variant_type_copy (type);
+  t = build_variant_type_copy (type PASS_MEM_STAT);
   set_type_quals (t, type_quals);
 
   if (((type_quals & TYPE_QUAL_ATOMIC) == TYPE_QUAL_ATOMIC))
@@ -6695,9 +6695,9 @@ build_aligned_type (tree type, unsigned
TYPE_CANONICAL points to itself. */
 
 tree
-build_distinct_type_copy (tree type)
+build_distinct_type_copy (tree type MEM_STAT_DECL)
 {
-  tree t = copy_node (type);
+  tree t = copy_node_stat (type PASS_MEM_STAT);
 
   TYPE_POINTER_TO (t) = 0;
   TYPE_REFERENCE_TO (t) = 0;
@@ -6733,11 +6733,11 @@ build_distinct_type_copy (tree type)
require structural equality checks). */
 
 tree
-build_variant_type_copy (tree type)
+build_variant_type_copy (tree type MEM_STAT_DECL)
 {
   tree t, m = TYPE_MAIN_VARIANT (type);
 
-  t = build_distinct_type_copy (type);
+  t = build_distinct_type_copy (type PASS_MEM_STAT);
 
   /* Since we're building a variant, assume that it is a non-semantic
  variant. This also propagates TYPE_STRUCTURAL_EQUALITY_P. */
Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h(revision 245526)
+++ gcc/cp/cp-tree.h(working copy)
@@ -6080,8 +6080,8 @@ extern tree unqualified_fn_lookup_error
 extern tree build_lang_decl(enum tree_code, tree, tree);
 extern tree build_lang_decl_loc  

Re: [PATCH, GCC/x86 mingw32] Add configure option to force wildcard behavior on Windows

2017-02-17 Thread Thomas Preudhomme

Here you are:

2017-01-24  Thomas Preud'homme  

* configure.ac (--enable-mingw-wildcard): Add new configurable feature.
* configure: Regenerate.
* config.in: Regenerate.
* config/i386/driver-mingw32.c: New file.
* config/i386/x-mingw32: Add rule to build driver-mingw32.o.
* config.host: Link driver-mingw32.o on MinGW host.
* doc/install.texi: Document new --enable-mingw-wildcard configure
option.

Must have forgotten to paste it.

On 17/02/17 10:52, JonY wrote:
> On 02/14/2017 10:42 AM, JonY wrote:
>> On 02/14/2017 09:32 AM, Thomas Preudhomme wrote:
>>>>
>>>> Looks good, be sure to emphasize this option affects mingw hosted GCC
>>>> only, not the compiler output.
>>>
>>> I think that should be pretty clear in the latest version of the patch,
>>> doc/install.texi contains:
>>>
>>> "Note that this option only affects wildcard expansion for GCC itself.
>>> It does not affect wildcard expansion of executables built by the
>>> resulting GCC."
>>>
>>> If you think a part of that sentence is still confusing please let me
>>> know and I'll improve it.
>>>
>>> Best regards,
>>>
>>> Thomas
>>
>> Yes, that should be good, no more objections.
>
> Before I forget, please also provide a changelog, thanks.




Re: [PATCH] Fix PR79547

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Marc Glisse wrote:

> On Fri, 17 Feb 2017, Richard Biener wrote:
> 
> > On Thu, 16 Feb 2017, Richard Biener wrote:
> > 
> > > 
> > > I am testing the following patch for PR79547.  Those builtins do not
> > > return anything that can be used to re-construct the pointer(s) passed
> > > to them.
> > > 
> > > Queued for GCC 8.
> > 
> > Actually we need calluse constraints.  Thus adjusted as follows.
> > 
> > Richard.
> > 
> > 2017-02-17  Richard Biener  
> > 
> > PR tree-optimization/79547
> > * tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
> > Handle strlen, strcmp, strncmp, strcasecmp, strncasecmp, memcmp,
> > bcmp, strspn, strcspn, __builtin_object_size and __builtin_constant_p
> > without any constraints.
> 
> We have EAF_NOESCAPE that we are using for non-builtins, though it probably
> gets little use there. Would it make sense to use it here as well, or would
> that be pointless?

EAF_NOESCAPE doesn't capture what we want here.  For pure and const
functions, arguments already don't escape in EAF_NOESCAPE's sense; it's
just that EAF_NOESCAPE doesn't cover "escaping" through the return value.
We do not have something like ERF_RETURNS_NO_ARG.

Generally, explicitly handled builtins do not need any further
fn-spec attributes added in builtins.def.

Richard.
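
To make the "escaping through the return value" distinction concrete,
here is a small sketch in plain C++ (no GCC internals).  The two wrapper
names are hypothetical; they only contrast a builtin from the PR79547
list with one whose result is derived from its argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// strlen returns only a count: nothing in the result can be used to get
// back to S, so the return value carries no pointer.
std::size_t returns_only_a_count(const char *s) {
  assert(s != nullptr);
  return std::strlen(s);
}

// strchr returns a pointer into S (or null): the result is derived from
// the argument, so the pointer does flow out through the return value.
const char *returns_derived_pointer(const char *s, int c) {
  assert(s != nullptr);
  return std::strchr(s, c);
}
```

Functions of the first kind are the ones the patch handles without
return-value constraints; the second kind illustrates what a flag such as
the hypothetical ERF_RETURNS_NO_ARG would have to rule out.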


Re: [PATCH, GCC/x86 mingw32] Add configure option to force wildcard behavior on Windows

2017-02-17 Thread JonY
On 02/14/2017 10:42 AM, JonY wrote:
> On 02/14/2017 09:32 AM, Thomas Preudhomme wrote:
>>>
>>> Looks good, be sure to emphasize this option affects mingw hosted GCC
>>> only, not the compiler output.
>>
>> I think that should be pretty clear in the latest version of the patch,
>> doc/install.texi contains:
>>
>> "Note that this option only affects wildcard expansion for GCC itself. 
>> It does
>> not affect wildcard expansion of executables built by the resulting GCC."
>>
>> If you think a part of that sentence is still confusing please let me
>> know and I'll improve it.
>>
>> Best regards,
>>
>> Thomas
>>
> 
> Yes, that should be good, no more objections.
> 
> 

Before I forget, please also provide a changelog, thanks.






Re: [PATCH] Fix PR79547

2017-02-17 Thread Marc Glisse

On Fri, 17 Feb 2017, Richard Biener wrote:

> On Thu, 16 Feb 2017, Richard Biener wrote:
>
>> I am testing the following patch for PR79547.  Those builtins do not
>> return anything that can be used to re-construct the pointer(s) passed
>> to them.
>>
>> Queued for GCC 8.
>
> Actually we need calluse constraints.  Thus adjusted as follows.
>
> Richard.
>
> 2017-02-17  Richard Biener
>
> PR tree-optimization/79547
> * tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
> Handle strlen, strcmp, strncmp, strcasecmp, strncasecmp, memcmp,
> bcmp, strspn, strcspn, __builtin_object_size and __builtin_constant_p
> without any constraints.

We have EAF_NOESCAPE that we are using for non-builtins, though it
probably gets little use there. Would it make sense to use it here as
well, or would that be pointless?


--
Marc Glisse


Re: Handle GIMPLE NOPs in is_maybe_undefined (PR, tree-optimization/79529).

2017-02-17 Thread Richard Biener
On Fri, Feb 17, 2017 at 10:55 AM, Martin Liška  wrote:
> On 02/16/2017 12:34 PM, Richard Biener wrote:
>> Yes, we should handle all of the "hidden initialized" cases at
>>
>>   /* A PARM_DECL will not have an SSA_NAME_DEF_STMT.  Parameters
>>  get their initial value from function entry.  */
>>   if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL)
>> continue;
>>
>> maybe add a predicate for those, like
>>
>>  ssa_defined_default_def_p ()
>>
>> right next to ssa_undefined_value_p and use it from there as well.
>
> Hi.
>
> Done in second version of patch.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. 
> Firefox w/ -flto and -O3
> works fine.
>
> Ready to be installed?

Ok.

Thanks,
Richard.

> Martin


Re: [PATCH PR71437/V2]Simplify cond with assertions in threading

2017-02-17 Thread Bin.Cheng
On Fri, Feb 17, 2017 at 3:40 AM, Jeff Law  wrote:
> On 02/14/2017 03:05 AM, Bin Cheng wrote:
>>
>> Hi,
>> This is the second attempt at fixing PR71437.  The previous version of the
>> patch tried to fix the issue in VRP, but that requires further non-trivial
>> changes in VRP, specifically to better support variable value ranges.  That
>> is not appropriate at stage 4.  Alternatively, this patch tries to fix the
>> issue by improving threading: it additionally simplifies the condition
>> using assertion conditions.
>>
>> Bootstrapped and tested on x86_64 and AArch64.  Is it OK?
>>
>> Thanks,
>> bin
>>
>> 2017-02-13  Bin Cheng  
>>
>> PR tree-optimization/71437
>> * tree-ssa-loop-niter.c (tree_simplify_using_condition): Only
>> expand condition if new parameter says so.  Also change it to
>> global.
>> * tree-ssa-loop-niter.h (tree_simplify_using_condition): New
>> declaration.
>> * tree-ssa-threadedge.c (tree-ssa-loop-niter.h): New include file.
>> (simplify_control_stmt_condition_1): Simplify condition using
>> assert conditions.
>>
>> gcc/testsuite/ChangeLog
>> 2017-02-13  Bin Cheng  
>>
>> PR tree-optimization/71437
>> * gcc.dg/tree-ssa/pr71437.c: New test.
>>
> So following up.  We're not going to get anywhere using the ranges in VRP.
> As Bin noted in the V1 patch, VRP prefers a useless range with constant
> bounds when a symbolic range would be better.  Thus the callbacks into VRP
> are doomed to failure.
>
> Bin's patch works around this by using the ASSERT_EXPRs to recover the
> symbolic range.  So it's a bit of a hack, but not a terrible one.  If we
> want to continue this path, we might still look for ways to avoid
> simplify_using_condition.
>
> One idea would be to go ahead and record the equivalence from the
> ASSERT_EXPR into the expression hash table and use the expression hash table
> to simplify the condition.  We didn't have that ability in the past, but
> should now after the refactorings from last year.
>
> It's slightly related to some ideas I've got around tackling 78496.
>
> I'm in and out of the office semi-randomly until the 27th.  I'll try to
> poke at this while on the road.
Thanks for helping, I will hold this patch and let you work out a
generic fix in threading.

Thanks,
bin
>
> Jeff
>
>


Re: Handle GIMPLE NOPs in is_maybe_undefined (PR, tree-optimization/79529).

2017-02-17 Thread Martin Liška
On 02/16/2017 12:34 PM, Richard Biener wrote:
> Yes, we should handle all of the "hidden initialized" cases at
> 
>   /* A PARM_DECL will not have an SSA_NAME_DEF_STMT.  Parameters
>  get their initial value from function entry.  */
>   if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL)
> continue;
> 
> maybe add a predicate for those, like
> 
>  ssa_defined_default_def_p ()
> 
> right next to ssa_undefined_value_p and use it from there as well.

Hi.

Done in the second version of the patch.
The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
Firefox built with -flto and -O3 works fine.

Ready to be installed?
Martin
From d98ab8fef6d634b73eeca74d11161e3cb7b59776 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 16 Feb 2017 17:07:51 +0100
Subject: [PATCH] Introduce ssa_defined_default_def_p function (PR
 tree-optimization/79529).

gcc/ChangeLog:

2017-02-16  Martin Liska  

	* tree-ssa-loop-unswitch.c (is_maybe_undefined): Use
	ssa_defined_default_def_p to handle cases which are implicitly
	defined.
	* tree-ssa.c (ssa_defined_default_def_p): New function.
	(ssa_undefined_value_p): Use ssa_defined_default_def_p to handle cases
	which are implicitly defined.
	* tree-ssa.h (ssa_defined_default_def_p): Declare.
---
 gcc/tree-ssa-loop-unswitch.c |  4 +---
 gcc/tree-ssa.c   | 26 +++---
 gcc/tree-ssa.h   |  2 ++
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index afa04e9d110..1845148666d 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -134,9 +134,7 @@ is_maybe_undefined (const tree name, gimple *stmt, struct loop *loop)
   if (ssa_undefined_value_p (t, true))
 	return true;
 
-  /* A PARM_DECL will not have an SSA_NAME_DEF_STMT.  Parameters
-	 get their initial value from function entry.  */
-  if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL)
+  if (ssa_defined_default_def_p (t))
 	continue;
 
   gimple *def = SSA_NAME_DEF_STMT (t);
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 28020b003f8..831fd61e15f 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1251,27 +1251,39 @@ tree_ssa_strip_useless_type_conversions (tree exp)
   return exp;
 }
 
-
-/* Return true if T, an SSA_NAME, has an undefined value.  PARTIAL is what
-   should be returned if the value is only partially undefined.  */
+/* Return true if T, as SSA_NAME, has an implicit default defined value.  */
 
 bool
-ssa_undefined_value_p (tree t, bool partial)
+ssa_defined_default_def_p (tree t)
 {
-  gimple *def_stmt;
   tree var = SSA_NAME_VAR (t);
 
   if (!var)
 ;
   /* Parameters get their initial value from the function entry.  */
   else if (TREE_CODE (var) == PARM_DECL)
-return false;
+return true;
   /* When returning by reference the return address is actually a hidden
  parameter.  */
   else if (TREE_CODE (var) == RESULT_DECL && DECL_BY_REFERENCE (var))
-return false;
+return true;
   /* Hard register variables get their initial value from the ether.  */
   else if (VAR_P (var) && DECL_HARD_REGISTER (var))
+return true;
+
+  return false;
+}
+
+
+/* Return true if T, an SSA_NAME, has an undefined value.  PARTIAL is what
+   should be returned if the value is only partially undefined.  */
+
+bool
+ssa_undefined_value_p (tree t, bool partial)
+{
+  gimple *def_stmt;
+
+  if (ssa_defined_default_def_p (t))
 return false;
 
   /* The value is undefined iff its definition statement is empty.  */
diff --git a/gcc/tree-ssa.h b/gcc/tree-ssa.h
index 6d16ba9f6a0..c99b5eaee82 100644
--- a/gcc/tree-ssa.h
+++ b/gcc/tree-ssa.h
@@ -50,6 +50,8 @@ extern void delete_tree_ssa (function *);
 extern bool tree_ssa_useless_type_conversion (tree);
 extern tree tree_ssa_strip_useless_type_conversions (tree);
 
+
+extern bool ssa_defined_default_def_p (tree t);
 extern bool ssa_undefined_value_p (tree, bool = true);
 extern bool gimple_uses_undefined_value_p (gimple *);
 extern void execute_update_addresses_taken (void);
-- 
2.11.0
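
The shape of this refactoring can be sketched independently of GCC: one
predicate answers "does this name have an implicit initial value?", and
the "is it undefined?" query reuses it.  The types and names below are
simplified, hypothetical stand-ins for the tree API, and the sketch
ignores the PARTIAL handling of the real ssa_undefined_value_p.

```cpp
#include <cassert>

// Hypothetical classification of the variable underlying an SSA name.
enum class var_kind { none, parm, result_by_reference, hard_register, local };

// Hypothetical, much simplified SSA name.
struct ssa_name {
  var_kind var;         // none for an anonymous SSA name
  bool empty_def_stmt;  // true when the defining statement is a no-op
};

// Counterpart of the new predicate: parameters, by-reference results and
// hard register variables get their initial value implicitly.
bool defined_default_def_p(const ssa_name &t) {
  switch (t.var) {
    case var_kind::parm:
    case var_kind::result_by_reference:
    case var_kind::hard_register:
      return true;
    default:
      return false;
  }
}

// Counterpart of the undefined-value query, now reusing the predicate.
bool undefined_value_p(const ssa_name &t) {
  if (defined_default_def_p(t))
    return false;
  return t.empty_def_stmt;
}

// Usage: a parameter with an empty definition is not undefined, while an
// ordinary local with an empty definition is.
void check_examples() {
  assert(!undefined_value_p({var_kind::parm, true}));
  assert(undefined_value_p({var_kind::local, true}));
  assert(!undefined_value_p({var_kind::local, false}));
}
```

Sharing the predicate means a caller such as is_maybe_undefined cannot
drift out of sync with the list of implicitly defined cases.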



[PATCH, testsuite]: Use posix_memalign instead of aligned_alloc in gcc.dg/strncmp-2.c

2017-02-17 Thread Uros Bizjak
posix_memalign is portable to older, non-C11 runtimes.

2017-02-17  Uros Bizjak  

* gcc.dg/strncmp-2.c (test_driver_strncmp): Use posix_memalign
instead of aligned_alloc.

Tested on x86_64-linux-gnu, CentOS 5.11.

OK for mainline?

Uros.
diff --git a/gcc/testsuite/gcc.dg/strncmp-2.c b/gcc/testsuite/gcc.dg/strncmp-2.c
index 0c9a07a..8d799a1 100644
--- a/gcc/testsuite/gcc.dg/strncmp-2.c
+++ b/gcc/testsuite/gcc.dg/strncmp-2.c
@@ -19,10 +19,13 @@ static void test_driver_strncmp (void (test_strncmp)(const char *, const char *,
 {
   long pgsz = sysconf(_SC_PAGESIZE);
   char buf1[sz+1];
-  char *buf2 = aligned_alloc(pgsz,2*pgsz);
+  char *buf2;
   char *p2;
   int r,i,e;
 
+  r = posix_memalign ((void **)&buf2,pgsz,2*pgsz);
+  if (r < 0) abort ();
+
   r = mprotect (buf2+pgsz,pgsz,PROT_NONE);
   if (r < 0) abort();
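
As a standalone illustration of the allocation pattern used by the test
(a page-aligned buffer whose second page is made inaccessible), here is a
minimal sketch for a POSIX host.  The helper names are hypothetical.  Note
that posix_memalign reports failure by returning a nonzero error number
rather than a negative value, so the sketch tests for `!= 0`.  Like the
test, it relies on mprotect working on heap memory, which is an
assumption that holds on Linux.

```cpp
#include <cassert>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Allocates two page-aligned pages and makes the second one inaccessible.
// Returns nullptr on failure.
char *alloc_with_guard_page() {
  long page = sysconf(_SC_PAGESIZE);
  if (page <= 0)
    return nullptr;

  void *mem = nullptr;
  // A nonzero return value signals failure; MEM is then not usable.
  if (posix_memalign(&mem, page, 2 * page) != 0)
    return nullptr;

  char *buf = static_cast<char *>(mem);
  if (mprotect(buf + page, page, PROT_NONE) != 0) {
    free(mem);
    return nullptr;
  }
  return buf;
}

// Restores access to the guard page before handing the block back.
void free_with_guard_page(char *buf) {
  assert(buf != nullptr);
  long page = sysconf(_SC_PAGESIZE);
  mprotect(buf + page, page, PROT_READ | PROT_WRITE);
  free(buf);
}
```

Any access at or beyond `buf + page` then faults, which is what lets a
strncmp test detect reads past the end of the first page.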
   


[PATCH] Fix PR79552

2017-02-17 Thread Richard Biener

The following fixes PR79552 on trunk (for the branch it would also need
backporting of r235147 to not regress PR48885 - its fix caused this
regression).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-02-17  Richard Biener  

PR tree-optimization/79552
* tree-ssa-structalias.c (visit_loadstore): Properly verify
default defs.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 245501)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -7296,9 +7312,15 @@ visit_loadstore (gimple *, tree base, tr
   || TREE_CODE (base) == TARGET_MEM_REF)
 {
   tree ptr = TREE_OPERAND (base, 0);
-  if (TREE_CODE (ptr) == SSA_NAME
- && ! SSA_NAME_IS_DEFAULT_DEF (ptr))
+  if (TREE_CODE (ptr) == SSA_NAME)
{
+ /* For parameters, get at the points-to set for the actual parm
+decl.  */
+ if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ || TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+   ptr = SSA_NAME_VAR (ptr);
+
  /* We need to make sure 'ptr' doesn't include any of
 the restrict tags we added bases for in its points-to set.  */
  varinfo_t vi = lookup_vi_for_tree (ptr);


[PATCH] Fix PR79567

2017-02-17 Thread Richard Biener

I am bootstrapping the following to fix reported warnings about
bogus escape chars on mingw32 for genmatch output.  Reporter
tested this on mingw.

Will commit after bootstrap on x86_64-unknown-linux-gnu finishes
(well, somewhat pointless, DIR_SEPARATOR_2 is not defined there).

Richard.

2017-02-17  Richard Biener  

PR bootstrap/79567
* genmatch.c (output_line_directive): Handle DIR_SEPARATOR_2.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 245501)
+++ gcc/genmatch.c  (working copy)
@@ -192,6 +192,11 @@ output_line_directive (FILE *f, source_l
 {
   /* When writing to a dumpfile only dump the filename.  */
   const char *file = strrchr (loc.file, DIR_SEPARATOR);
+#if defined(DIR_SEPARATOR_2)
+  const char *pos2 = strrchr (loc.file, DIR_SEPARATOR_2);
+  if (pos2 && (!file || (pos2 > file)))
+   file = pos2;
+#endif
   if (!file)
file = loc.file;
   else
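
The logic of the hunk above can be tried outside of genmatch with a small sketch.  The function name file_name_only is hypothetical, and the two separator values are assumptions chosen for illustration; in GCC they come from the host configuration, and DIR_SEPARATOR_2 is only defined on hosts that accept a second separator.

```c
#include <string.h>

/* Assumed values for illustration only.  */
#define DIR_SEPARATOR '/'
#define DIR_SEPARATOR_2 '\\'

/* Hypothetical stand-alone version of the lookup: return the part of
   PATH after the last directory separator of either kind, or PATH
   itself if it contains no separator.  */
static const char *
file_name_only (const char *path)
{
  const char *file = strrchr (path, DIR_SEPARATOR);
#if defined(DIR_SEPARATOR_2)
  /* Prefer whichever separator occurs later in the string.  */
  const char *pos2 = strrchr (path, DIR_SEPARATOR_2);
  if (pos2 && (!file || pos2 > file))
    file = pos2;
#endif
  return file ? file + 1 : path;
}
```

Without the second lookup, a path that uses only backslashes would be returned unchanged, and the backslashes would then appear inside the quoted file name of the emitted line directive.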


Re: [PATCH] Fix PR79547

2017-02-17 Thread Richard Biener
On Thu, 16 Feb 2017, Richard Biener wrote:

> 
> I am testing the following patch for PR79547.  Those builtins do not
> return anything that can be used to re-construct the pointer(s) passed
> to them.
> 
> Queued for GCC 8.

Actually we need calluse constraints.  Thus adjusted as follows.

Richard.

2017-02-17  Richard Biener  

PR tree-optimization/79547
* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
Handle strlen, strcmp, strncmp, strcasecmp, strncasecmp, memcmp,
bcmp, strspn, strcspn, __builtin_object_size and __builtin_constant_p
without any constraints.

* gcc.dg/tree-ssa/strlen-2.c: New testcase.

Index: gcc/testsuite/gcc.dg/tree-ssa/strlen-2.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/strlen-2.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/strlen-2.c  (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+void f (unsigned);
+
+void f3 (void)
+{
+  char s[] = "1234";
+
+  f (__builtin_strlen (s));
+  f (__builtin_strlen (s));
+  f (__builtin_strlen (s));
+}
+
+/* { dg-final { scan-tree-dump-times "strlen" 1 "strlen" } } */
Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 245501)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -4474,6 +4474,40 @@ find_func_aliases_for_builtin_call (stru
process_all_all_constraints (lhsc, rhsc);
  }
return true;
+  /* Pure functions that return something not based on any object and
+ that use the memory pointed to by their arguments (but not
+transitively).  */
+  case BUILT_IN_STRCMP:
+  case BUILT_IN_STRNCMP:
+  case BUILT_IN_STRCASECMP:
+  case BUILT_IN_STRNCASECMP:
+  case BUILT_IN_MEMCMP:
+  case BUILT_IN_BCMP:
+  case BUILT_IN_STRSPN:
+  case BUILT_IN_STRCSPN:
+   {
+ varinfo_t uses = get_call_use_vi (t);
+ make_any_offset_constraints (uses);
+ make_constraint_to (uses->id, gimple_call_arg (t, 0));
+ make_constraint_to (uses->id, gimple_call_arg (t, 1));
+ /* No constraints are necessary for the return value.  */
+ return true;
+   }
+  case BUILT_IN_STRLEN:
+   {
+ varinfo_t uses = get_call_use_vi (t);
+ make_any_offset_constraints (uses);
+ make_constraint_to (uses->id, gimple_call_arg (t, 0));
+ /* No constraints are necessary for the return value.  */
+ return true;
+   }
+  case BUILT_IN_OBJECT_SIZE:
+  case BUILT_IN_CONSTANT_P:
+   {
+ /* No constraints are necessary for the return value or the
+arguments.  */
+ return true;
+   }
   /* Trampolines are special - they set up passing the static
 frame.  */
   case BUILT_IN_INIT_TRAMPOLINE:



Re: [PATCH] Properly deprecate -fipa-cp-alignment

2017-02-17 Thread Jakub Jelinek
On Fri, Feb 17, 2017 at 10:31:16AM +0100, Martin Jambor wrote:
> @@ -8066,12 +8065,8 @@ This flag is enabled by default at @option{-O3}.
>  
>  @item -fipa-cp-alignment
>  @opindex -fipa-cp-alignment
> -When enabled, this optimization propagates alignment of function
> -parameters to support better vectorization and string operations.
> -
> -This flag is enabled by default at @option{-O2} and @option{-Os}.  It
> -requires that @option{-fipa-cp} is enabled.
> -@option{-fipa-cp-alignment} is obsolete, use @option{-fipa-bit-cp} instead.
> +This option has been superseded by @option{-fipa-bit-cp} and is now
> +deprecated.  Use @option{-fipa-bit-cp} instead.

I'd just remove the whole documentation about -fipa-cp-alignment, I think
that is what we do for other removed options.
Or at least say "and is now ignored" rather than "deprecated", to make it clear
how we handle it (not at all).
Ok with either of those changes.

Jakub


Re: [PATCH] Properly deprecate -fipa-cp-alignment

2017-02-17 Thread Martin Jambor
Hi,

On Wed, Feb 08, 2017 at 01:49:17PM +0100, Jakub Jelinek wrote:
> On Wed, Feb 08, 2017 at 01:41:24PM +0100, Martin Jambor wrote:
> > 2017-02-08  Martin Jambor  
> > 
> > * common.opt (-finstrument-functions-exclude-file-list): Remove Var
> > and Optimization, Document as deprecated and superseded by
> > -fipa-bit-cp.
> > * doc/invoke.texi (Option Summary): Remove -fipa-cp-alignment.
> > (Optimize Options): Likewise.
> > (-fipa-cp-alignment): Document as deprecated.
> > ---
> >  gcc/common.opt  |  4 ++--
> >  gcc/doc/invoke.texi | 11 +++
> >  2 files changed, 5 insertions(+), 10 deletions(-)
> > 
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index ad6baa3db68..661235ee4a9 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -1612,8 +1612,8 @@ Common Report Var(flag_ipa_cp_clone) Optimization
> >  Perform cloning to make Interprocedural constant propagation stronger.
> >  
> >  fipa-cp-alignment
> > -Common Report Var(flag_ipa_cp_alignment) Optimization
> > -Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
> > +Common Report
> > +This switch is deprecated.  Use -fipa-bit-cp instead.
> 
> I think this should be
> Common Ignore
> Does nothing. Preserved for backward compatibility.
> instead, but Joseph is the option handling maintainer, so CCing him.
> 

Thanks for pointing me in the right direction.  I have added a second
space after the dot and changed the patch to the one below.

It has passed bootstrap and testing, and I have also tested with make
info.  OK for trunk?

Thanks,

Martin


2017-02-16  Martin Jambor  

* common.opt (-fipa-cp-alignment): Mark as ignored and preserved
for backward compatibility only.
* doc/invoke.texi (Option Summary): Remove -fipa-cp-alignment.
(Optimize Options): Likewise.
(-fipa-cp-alignment): Document as deprecated.
---
 gcc/common.opt  |  4 ++--
 gcc/doc/invoke.texi | 11 +++
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index ad6baa3db68..1f6aa8dd02a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1612,8 +1612,8 @@ Common Report Var(flag_ipa_cp_clone) Optimization
 Perform cloning to make Interprocedural constant propagation stronger.
 
 fipa-cp-alignment
-Common Report Var(flag_ipa_cp_alignment) Optimization
-Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 fipa-bit-cp
 Common Report Var(flag_ipa_bit_cp) Optimization
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 56ca53f490b..ad617c02c31 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -371,7 +371,7 @@ Objective-C and Objective-C++ Dialects}.
 -fif-conversion2  -findirect-inlining @gol
 -finline-functions  -finline-functions-called-once  -finline-limit=@var{n} @gol
 -finline-small-functions  -fipa-cp  -fipa-cp-clone @gol
--fipa-cp-alignment  -fipa-bit-cp @gol
+-fipa-bit-cp @gol
 -fipa-pta  -fipa-profile  -fipa-pure-const  -fipa-reference  -fipa-icf @gol
 -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region}  -fira-hoist-pressure @gol
@@ -7054,7 +7054,6 @@ also turns on the following optimization flags:
 -finline-small-functions @gol
 -findirect-inlining @gol
 -fipa-cp @gol
--fipa-cp-alignment @gol
 -fipa-bit-cp @gol
 -fipa-sra @gol
 -fipa-icf @gol
@@ -8066,12 +8065,8 @@ This flag is enabled by default at @option{-O3}.
 
 @item -fipa-cp-alignment
 @opindex -fipa-cp-alignment
-When enabled, this optimization propagates alignment of function
-parameters to support better vectorization and string operations.
-
-This flag is enabled by default at @option{-O2} and @option{-Os}.  It
-requires that @option{-fipa-cp} is enabled.
-@option{-fipa-cp-alignment} is obsolete, use @option{-fipa-bit-cp} instead.
+This option has been superseded by @option{-fipa-bit-cp} and is now
+deprecated.  Use @option{-fipa-bit-cp} instead.
 
 @item -fipa-bit-cp
 @opindex -fipa-bit-cp
-- 
2.11.0



Re: [PATCH] Fix PR78218

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Thomas Schwinge wrote:

> Hi Richard!
> 
> On Mon, 7 Nov 2016 13:19:21 +0100 (CET), Richard Biener  
> wrote:
> > PR tree-optimization/78218
> 
> > --- gcc/testsuite/gcc.dg/torture/pr78218.c  (revision 0)
> > +++ gcc/testsuite/gcc.dg/torture/pr78218.c  (working copy)
> > @@ -0,0 +1,24 @@
> > +/* { dg-do run } */
> > +
> > +struct 
> > +{
> > +  int v;
> > +} a[2];
> > +
> > +int b; 
> > +
> > +void __attribute__((noinline,noclone))
> > +check ()
> 
> Is it intentional that here, check doesn't specify any formal parameters,
> but...
> 
> > +{
> > +  if (a[0].v != 1)
> > +__builtin_abort ();
> > +}
> > +
> > +int main ()
> > +{
> > +  a[1].v = 1;
> > +  a[0] = a[1];
> > +  a[1].v = 0;
> > +  check (a);
> 
> ... here it is called, passing in "a"?  (PTX doesn't like such
> mismatches.)
> 
> > +  return 0;
> > +}
> 
> OK to commit the obvious patch to change the call site?  (Not yet
> tested.)
> 
> -  check (a);
> +  check ();

Yes.  Can you quickly check if the adjusted testcase still fails before
the patch?

Thanks,
Richard.


Re: [PATCH] Fix ICE with COMPLEX_EXPR in fold_negate_expr (PR middle-end/79536)

2017-02-17 Thread Richard Biener
On Fri, 17 Feb 2017, Marek Polacek wrote:

> For "(int) -x", which is a NOP_EXPR, negate_expr_p returns true, which means
> that fold_negate_expr cannot return NULL_TREE.  But that's what happens here,
> leading to a crash in fold_build2.  negate_expr_p has (as well as negate_expr)
> STRIP_SIGN_NOPS, so the "(int) -x" becomes "-x", but fold_negate_expr doesn't
> have any STRIP_SIGN_NOPS.  Richi suggested adding a wrapper for fold_negate_expr
> that strips/restores such NOP_EXPRs, much like negate_expr, which is what this
> patch does.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?  And 6 after a while?

Ok.

Thanks,
Richard.

> 2017-02-16  Marek Polacek  
> 
>   PR middle-end/79536
>   * fold-const.c (fold_negate_expr_1): Renamed from fold_negate_expr.
>   (fold_negate_expr): New wrapper.
> 
>   * gcc.dg/torture/pr79536.c: New test.
> 
> diff --git gcc/fold-const.c gcc/fold-const.c
> index a8bb8af..ad4770b 100644
> --- gcc/fold-const.c
> +++ gcc/fold-const.c
> @@ -139,6 +139,7 @@ static tree fold_relational_const (enum tree_code, tree, tree, tree);
>  static tree fold_convert_const (enum tree_code, tree, tree);
>  static tree fold_view_convert_expr (tree, tree);
>  static bool vec_cst_ctor_to_array (tree, tree *);
> +static tree fold_negate_expr (location_t, tree);
>  
>  
>  /* Return EXPR_LOCATION of T if it is not UNKNOWN_LOCATION.
> @@ -522,7 +523,7 @@ negate_expr_p (tree t)
> returned.  */
>  
>  static tree
> -fold_negate_expr (location_t loc, tree t)
> +fold_negate_expr_1 (location_t loc, tree t)
>  {
>tree type = TREE_TYPE (t);
>tree tem;
> @@ -533,7 +534,7 @@ fold_negate_expr (location_t loc, tree t)
>  case BIT_NOT_EXPR:
>if (INTEGRAL_TYPE_P (type))
>  return fold_build2_loc (loc, PLUS_EXPR, type, TREE_OPERAND (t, 0),
> -build_one_cst (type));
> + build_one_cst (type));
>break;
>  
>  case INTEGER_CST:
> @@ -581,14 +582,14 @@ fold_negate_expr (location_t loc, tree t)
>  case COMPLEX_EXPR:
>if (negate_expr_p (t))
>   return fold_build2_loc (loc, COMPLEX_EXPR, type,
> - fold_negate_expr (loc, TREE_OPERAND (t, 0)),
> - fold_negate_expr (loc, TREE_OPERAND (t, 1)));
> + fold_negate_expr (loc, TREE_OPERAND (t, 0)),
> + fold_negate_expr (loc, TREE_OPERAND (t, 1)));
>break;
>  
>  case CONJ_EXPR:
>if (negate_expr_p (t))
>   return fold_build1_loc (loc, CONJ_EXPR, type,
> - fold_negate_expr (loc, TREE_OPERAND (t, 0)));
> + fold_negate_expr (loc, TREE_OPERAND (t, 0)));
>break;
>  
>  case NEGATE_EXPR:
> @@ -605,7 +606,7 @@ fold_negate_expr (location_t loc, tree t)
>   {
> tem = negate_expr (TREE_OPERAND (t, 1));
> return fold_build2_loc (loc, MINUS_EXPR, type,
> -   tem, TREE_OPERAND (t, 0));
> +   tem, TREE_OPERAND (t, 0));
>   }
>  
> /* -(A + B) -> (-A) - B.  */
> @@ -613,7 +614,7 @@ fold_negate_expr (location_t loc, tree t)
>   {
> tem = negate_expr (TREE_OPERAND (t, 0));
> return fold_build2_loc (loc, MINUS_EXPR, type,
> -   tem, TREE_OPERAND (t, 1));
> +   tem, TREE_OPERAND (t, 1));
>   }
>   }
>break;
> @@ -623,7 +624,7 @@ fold_negate_expr (location_t loc, tree t)
>if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> && !HONOR_SIGNED_ZEROS (element_mode (type)))
>   return fold_build2_loc (loc, MINUS_EXPR, type,
> - TREE_OPERAND (t, 1), TREE_OPERAND (t, 0));
> + TREE_OPERAND (t, 1), TREE_OPERAND (t, 0));
>break;
>  
>  case MULT_EXPR:
> @@ -638,11 +639,11 @@ fold_negate_expr (location_t loc, tree t)
> tem = TREE_OPERAND (t, 1);
> if (negate_expr_p (tem))
>   return fold_build2_loc (loc, TREE_CODE (t), type,
> - TREE_OPERAND (t, 0), negate_expr (tem));
> + TREE_OPERAND (t, 0), negate_expr (tem));
> tem = TREE_OPERAND (t, 0);
> if (negate_expr_p (tem))
>   return fold_build2_loc (loc, TREE_CODE (t), type,
> - negate_expr (tem), TREE_OPERAND (t, 1));
> + negate_expr (tem), TREE_OPERAND (t, 1));
>   }
>break;
>  
> @@ -715,6 +716,19 @@ fold_negate_expr (location_t loc, tree t)
>return NULL_TREE;
>  }
>  
> +/* A wrapper for fold_negate_expr_1.  */
> +
> +static tree
> +fold_negate_expr (location_t loc, tree t)
> +{
> +  tree type = TREE_TYPE (t);
> +  STRIP_SIGN_NOPS (t);
> +  tree tem = fold_negate_expr_1 (loc, t);
> +  if (tem == NULL_TREE)
> +return 

Re: fwprop fix for PR79405

2017-02-17 Thread Richard Biener
On Fri, Feb 17, 2017 at 10:07 AM, Richard Biener
 wrote:
> On Thu, Feb 16, 2017 at 8:41 PM, Bernd Schmidt  wrote:
>> We have two registers being assigned to each other:
>>
>>  (set (reg 213) (reg 209))
>>  (set (reg 209) (reg 213))
>>
>> These being the only definitions, we are happy to forward propagate reg 209
>> for reg 213 into a third insn, making a new use for reg 209. We are then
>> happy to forward propagate reg 213 for it in the same insn... ending up in
>> an infinite loop.
>>
>> I don't really see an elegant way to prevent this, so the following just
>> tries to detect the situation (and more general ones) by brute force.
>> Bootstrapped and tested on x86_64-linux, verified that the test passes with
>> a ppc cross, ok?
>
> But isn't the issue that we are walking "all uses" (in random order) rather than
> only processing each stmt once?  That is,
>
>   /* Go through all the uses.  df_uses_create will create new ones at the
>  end, and we'll go through them as well.
>
>  Do not forward propagate addresses into loops until after unrolling.
>  CSE did so because it was able to fix its own mess, but we are not.  */
>
>   for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
> {
>   df_ref use = DF_USES_GET (i);
>   if (use)
> if (DF_REF_TYPE (use) == DF_REF_REG_USE
> || DF_REF_BB (use)->loop_father == NULL
> /* The outer most loop is not really a loop.  */
> || loop_outer (DF_REF_BB (use)->loop_father) == NULL)
>   forward_propagate_into (use);
> }
>
> if that were simply walking all instructions, doing forward_propagate_into on
> each use on an instruction we'd avoid the cycle (because we stop propagating).
>
> Because when propagating DF_USES_TABLE changes.

Which means either that we might even miss visiting some uses, or that a fix as simple as

Index: gcc/fwprop.c
===
--- gcc/fwprop.c  (revision 245501)
+++ gcc/fwprop.c  (working copy)
@@ -1478,7 +1478,8 @@ fwprop (void)
  Do not forward propagate addresses into loops until after unrolling.
  CSE did so because it was able to fix its own mess, but we are not.  */

-  for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
+  unsigned sz = DF_USES_TABLE_SIZE ();
+  for (i = 0; i < sz; i++)
 {
   df_ref use = DF_USES_GET (i);
   if (use)

might work?  (not knowing too much about this detail of the DF data
structures - can the table shrink?)
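
To illustrate the difference between the two loop bounds, here is a toy sketch that does not use the DF data structures at all.  Everything in it (struct use_table, process, TABLE_CAP, the two walk functions) is hypothetical; the fixed capacity exists only so that the "live bound" walk terminates in the demonstration, whereas the real cycle does not.

```c
#include <stddef.h>

#define TABLE_CAP 64  /* Arbitrary cap so the demonstration terminates.  */

/* Toy stand-in for a table that grows while it is being walked.  */
struct use_table
{
  int items[TABLE_CAP];
  size_t size;
};

/* Processing an entry appends a new entry, loosely mimicking a
   propagation step that creates a new use at the end of the table.  */
static void
process (struct use_table *t, size_t i)
{
  if (t->size < TABLE_CAP)
    t->items[t->size++] = t->items[i] + 1;
}

/* Re-reads the size on every iteration, so newly appended entries are
   visited too.  Returns the number of entries visited.  */
static size_t
walk_live_bound (struct use_table *t)
{
  size_t visited = 0;
  for (size_t i = 0; i < t->size; i++)
    {
      process (t, i);
      visited++;
    }
  return visited;
}

/* Reads the size once before the loop, so only the entries present at
   the start are visited.  */
static size_t
walk_snapshot_bound (struct use_table *t)
{
  size_t visited = 0;
  size_t sz = t->size;
  for (size_t i = 0; i < sz; i++)
    {
      process (t, i);
      visited++;
    }
  return visited;
}
```

Starting from two entries, the live-bound walk keeps going until the cap is reached, while the snapshot walk visits exactly two entries and leaves the appended ones unvisited.  That is the trade-off raised above: the snapshot stops the cycle but also skips uses created during the walk, and it is only safe if the table cannot shrink.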

Richard.

> Richard.
>
>
>>
>>
>> Bernd
>>


Re: fwprop fix for PR79405

2017-02-17 Thread Richard Biener
On Thu, Feb 16, 2017 at 8:41 PM, Bernd Schmidt  wrote:
> We have two registers being assigned to each other:
>
>  (set (reg 213) (reg 209))
>  (set (reg 209) (reg 213))
>
> These being the only definitions, we are happy to forward propagate reg 209
> for reg 213 into a third insn, making a new use for reg 209. We are then
> happy to forward propagate reg 213 for it in the same insn... ending up in
> an infinite loop.
>
> I don't really see an elegant way to prevent this, so the following just
> tries to detect the situation (and more general ones) by brute force.
> Bootstrapped and tested on x86_64-linux, verified that the test passes with
> a ppc cross, ok?

But isn't the issue that we are walking "all uses" (in random order) rather than
only processing each stmt once?  That is,

  /* Go through all the uses.  df_uses_create will create new ones at the
 end, and we'll go through them as well.

 Do not forward propagate addresses into loops until after unrolling.
 CSE did so because it was able to fix its own mess, but we are not.  */

  for (i = 0; i < DF_USES_TABLE_SIZE (); i++)
{
  df_ref use = DF_USES_GET (i);
  if (use)
if (DF_REF_TYPE (use) == DF_REF_REG_USE
|| DF_REF_BB (use)->loop_father == NULL
/* The outer most loop is not really a loop.  */
|| loop_outer (DF_REF_BB (use)->loop_father) == NULL)
  forward_propagate_into (use);
}

if that were simply walking all instructions, doing forward_propagate_into on
each use on an instruction we'd avoid the cycle (because we stop propagating).

Because when propagating DF_USES_TABLE changes.

Richard.


>
>
> Bernd
>


Re: [PATCH] Fix PR78218

2017-02-17 Thread Thomas Schwinge
Hi Richard!

On Mon, 7 Nov 2016 13:19:21 +0100 (CET), Richard Biener  
wrote:
>   PR tree-optimization/78218

> --- gcc/testsuite/gcc.dg/torture/pr78218.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/torture/pr78218.c  (working copy)
> @@ -0,0 +1,24 @@
> +/* { dg-do run } */
> +
> +struct 
> +{
> +  int v;
> +} a[2];
> +
> +int b; 
> +
> +void __attribute__((noinline,noclone))
> +check ()

Is it intentional that here, check doesn't specify any formal parameters,
but...

> +{
> +  if (a[0].v != 1)
> +__builtin_abort ();
> +}
> +
> +int main ()
> +{
> +  a[1].v = 1;
> +  a[0] = a[1];
> +  a[1].v = 0;
> +  check (a);

... here it is called, passing in "a"?  (PTX doesn't like such
mismatches.)

> +  return 0;
> +}

OK to commit the obvious patch to change the call site?  (Not yet
tested.)

-  check (a);
+  check ();


Grüße
 Thomas