Re: [SFN+LVU+IEPM v4 9/9] [IEPM] Introduce inline entry point markers

2018-02-27 Thread Alexandre Oliva
On Feb 21, 2018, Alexandre Oliva  wrote:

> On Feb 15, 2018, Szabolcs Nagy  wrote:
>> i see assembler slow downs with these location view patches
>> i opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84408


> [LVU] reset view at function entry, omit views at line zero

Ping?  https://gcc.gnu.org/ml/gcc-patches/2018-02/msg01224.html

> for  gcc/ChangeLog

>   PR debug/84404
>   PR debug/84408
>   * dwarf2out.c (struct dw_line_info_table): Update comments for
>   view == -1.
>   (FORCE_RESET_NEXT_VIEW): New.
>   (FORCE_RESETTING_VIEW_P): New.
>   (RESETTING_VIEW_P): Check for -1 too.
>   (ZERO_VIEW_P): Likewise.
>   (new_line_info_table): Force-reset next view.
>   (dwarf2out_begin_function): Likewise.
>   (dwarf2out_source_line): Simplify zero_view_p initialization.
>   Test FORCE_RESETTING_VIEW_P and RESETTING_VIEW_P instead of
>   view directly.  Omit view when omitting .loc at line 0.

> for  gcc/testsuite/ChangeLog

>   PR debug/84404
>   PR debug/84408
>   * gcc.dg/graphite/pr84404.c: New.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: gcc testsuite changes for new linker messages

2018-02-27 Thread Jeff Law
On 02/27/2018 03:15 PM, Alan Modra wrote:
> GNU ld error messages have changed to comply with the GNU coding
> standards.  The two fixes in this patch look to be the only required
> changes in the GCC testsuite.  I've written the prune_gcc_output patch
> the way I have to try to capture the fact that the lower case "in
> function" is correct for a message preceded by "ld: object(sec+off): "
> but "In function" is correct when the phrase starts a sentence.
> 
> Bootstrapped and regression tested x86_64-linux.  OK to apply all
> branches?
> 
>   * lib/prune.exp (prune_gcc_output): Match lower case "in function"
>   GNU ld message.
>   * g++.dg/other/anon5.C: Match lower case "bad value" GNU ld message.
OK.
jeff


Re: [PATCH] Fix debug for -mcall-ms2sysv-xlogues stubs fallout (PR target/83917)

2018-02-27 Thread Jeff Law
On 02/27/2018 01:29 AM, Jakub Jelinek wrote:
> On Mon, Feb 26, 2018 at 08:05:56PM -0600, Daniel Santos wrote:
> --- libgcc/config/i386/cygwin.S.jj   2018-01-03 10:42:56.309763515 +0100
> +++ libgcc/config/i386/cygwin.S   2018-02-22 15:30:34.597925496 +0100
> @@ -23,31 +23,13 @@
>   * .
>   */
>  
> -#include "auto-host.h"
>>>> The following include should be here.
>>>>
>>>> +#include "i386-asm.h"
>>> I don't understand this.  i386-asm.h needs (both before my patch and after
>>> it) both auto-host.h and auto-target.h, as it tests
>>> HAVE_GAS_SECTIONS_DIRECTIVE (this one newly, comes from cygwin.S)
>>
>> The problem is that HAVE_GAS_SECTIONS_DIRECTIVE gets defined (or not) in
>> ../../gcc/auto-host.h, but you are testing it before including
>> auto-host.h, either directly or via i386-asm.h.  So if i386-asm.h
>> depends upon HAVE_GAS_SECTIONS_DIRECTIVE first being defined then it is
>> a circular dependency.
>>
>> In its current form, cygwin.S would never define USE_GAS_CFI_DIRECTIVES
>> prior to including i386-asm.h and also never emit
>>     .cfi_sections    .debug_frame
>> and whether or not USE_GAS_CFI_DIRECTIVES ends up being defined to 1 or 0
>> depends upon the test of __GCC_HAVE_DWARF2_CFI_ASM in i386-asm.h.
> 
> Ugh, you're right.  I was trying to preserve existing behavior for cygwin.S,
> but failed to do so.  Unfortunately the patch which added this stuff from
> Kai T. and Richard H. from 2010 is not in gcc-patches archives; in any case,
> I think nothing seriously bad happens if with older gas versions which do
> support .cfi_* directives but not .cfi_sections .debug_frame we emit the CFI
> into .eh_frame section rather than .debug_frame.
> 
> So this patch simplifies it, with only one guard for the non-trivial
> vs. trivial cfi_* definitions (based on whether GCC itself would use it)
> and only guard the .cfi_sections directive on whether it is really
> available.
> 
> The __GCC_HAVE_DWARF2_CFI_ASM definition actually sometimes depends on the
> .cfi_sections presence too:
>   /* If we can't get the assembler to emit only .debug_frame, and we don't need
>      dwarf2 unwind info for exceptions, then emit .debug_frame by hand.  */
>   if (!HAVE_GAS_CFI_SECTIONS_DIRECTIVE && !dwarf2out_do_eh_frame ())
>     return false;
> but doesn't actually guarantee it always, as when doing .eh_frame it will
> not require .cfi_sections.
> 
> This spot brings in another, preexisting bug in cygwin.S though - 
> the HAVE_GAS_CFI_SECTIONS_DIRECTIVE macro is always defined, to 0 or 1,
> rather than sometimes #define and sometimes #undef.
> 
>> Ultimately, the proper cleanup will be moving these tests out of
>> {gcc,libgcc}/configure.ac and into .m4 files in the root config
>> directory so that we don't uglify them with massive copy & pastes. 
>> These tests are also fairly complex as there are a lot of dependencies. 
>> m4 isn't my strong suit, but I can look at this after we're out of code
>> freeze.
> 
> Not really sure about that, because we really want to do a different thing
> in gcc/configure.ac (need to test the assembler directly, use
> GCC_TARGET_TEMPLATE) while in libgcc it does usually something different.
> 
> The libgcc configure already has all the code for the .hidden directive,
> as it uses it too, just it is only a pair of AC_SUBSTs rather than
> AC_DEFINE_UNQUOTED.
> The test for the HAVE_GAS_CFI_SECTIONS_DIRECTIVE alternative can be to compile
> int foo (int, char *);
> int bar (int x) { char *y = __builtin_alloca (x); return foo (x + 1, y) + 1; }
> with -g -fno-asynchronous-unwind-tables -fno-unwind-tables -fno-exceptions
> and scan for .cfi_sections .debug_frame.
> 
> So here is a new (I've committed the previous patch since then), only lightly
> tested (only on x86_64-linux and don't have too old binutils around), patch:
> 
> 2018-02-27  Jakub Jelinek  
> 
>   PR debug/83917
>   * configure.ac (AS_HIDDEN_DIRECTIVE): AC_DEFINE_UNQUOTED this to
>   $asm_hidden_op if visibility ("hidden") attribute works.
>   (HAVE_AS_CFI_SECTIONS): New AC_DEFINE.
>   * config/i386/i386-asm.h: Don't include auto-host.h.
>   (PACKAGE_VERSION, PACKAGE_NAME, PACKAGE_STRING, PACKAGE_TARNAME,
>   PACKAGE_URL): Don't undefine.
>   (USE_GAS_CFI_DIRECTIVES): Don't use nor define this macro, instead
>   guard cfi_startproc only on ifdef __GCC_HAVE_DWARF2_CFI_ASM.
>   (FN_HIDDEN): Change guard from #ifdef HAVE_GAS_HIDDEN to
>   #ifdef AS_HIDDEN_DIRECTIVE, use AS_HIDDEN_DIRECTIVE macro in the
>   definition instead of hardcoded .hidden.
>   * config/i386/cygwin.S: Include i386-asm.h first before .cfi_sections
>   directive.  Use #ifdef HAVE_AS_CFI_SECTIONS rather than
>   #ifdef HAVE_GAS_CFI_SECTIONS_DIRECTIVE to guard .cfi_sections.
>   (USE_GAS_CFI_DIRECTIVES): Don't define.
>   * configure: Regenerated.
>   * config.in: Likewise.
OK.
jeff


Re: [PATCH] Fix debug for -mcall-ms2sysv-xlogues stubs fallout (PR target/83917)

2018-02-27 Thread Jakub Jelinek
On Tue, Feb 27, 2018 at 09:29:36AM +0100, Jakub Jelinek wrote:
> So here is a new (I've committed the previous patch since then), only lightly
> tested (only on x86_64-linux and don't have too old binutils around), patch:
> 
> 2018-02-27  Jakub Jelinek  
> 
>   PR debug/83917
>   * configure.ac (AS_HIDDEN_DIRECTIVE): AC_DEFINE_UNQUOTED this to
>   $asm_hidden_op if visibility ("hidden") attribute works.
>   (HAVE_AS_CFI_SECTIONS): New AC_DEFINE.
>   * config/i386/i386-asm.h: Don't include auto-host.h.
>   (PACKAGE_VERSION, PACKAGE_NAME, PACKAGE_STRING, PACKAGE_TARNAME,
>   PACKAGE_URL): Don't undefine.
>   (USE_GAS_CFI_DIRECTIVES): Don't use nor define this macro, instead
>   guard cfi_startproc only on ifdef __GCC_HAVE_DWARF2_CFI_ASM.
>   (FN_HIDDEN): Change guard from #ifdef HAVE_GAS_HIDDEN to
>   #ifdef AS_HIDDEN_DIRECTIVE, use AS_HIDDEN_DIRECTIVE macro in the
>   definition instead of hardcoded .hidden.
>   * config/i386/cygwin.S: Include i386-asm.h first before .cfi_sections
>   directive.  Use #ifdef HAVE_AS_CFI_SECTIONS rather than
>   #ifdef HAVE_GAS_CFI_SECTIONS_DIRECTIVE to guard .cfi_sections.
>   (USE_GAS_CFI_DIRECTIVES): Don't define.
>   * configure: Regenerated.
>   * config.in: Likewise.

Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.

Jakub


Re: [C++] [PR84231] overload on cond_expr in template

2018-02-27 Thread Alexandre Oliva
On Feb 27, 2018, Jason Merrill  wrote:

> Perhaps it would be easier to add the REFERENCE_TYPE in
> build_conditional_expr_1, adjusting result_type based on
> processing_template_decl and is_lvalue.

It is, indeed!

Here's the patch, regstrapped on i686- and x86_64-linux-gnu.  The only
unexpected glitch was the need for adjusting the fold expr parser to
deal with an indirect_ref, lest g++.dg/cpp1x/fold6.C would fail to
error at the line with the ternary operator.

Ok to install?


[C++] [PR84231] overload on cond_expr in template

A non-type-dependent COND_EXPR within a template is reconstructed with
the original operands, after one with non-dependent proxies is built to
determine its result type.  This is problematic because the operands of
a COND_EXPR determined to be an rvalue may have been converted to denote
their rvalue nature.  The reconstructed one, however, won't have such
conversions, so lvalue_kind may not recognize it as an rvalue, which may
lead to e.g. incorrect overload resolution decisions.

If we mistake such a COND_EXPR for an lvalue, overload resolution might
regard a conversion sequence that binds it to a non-const reference as
viable, and then select that over one that binds it to a const
reference.  Only after template substitution would we rebuild the
COND_EXPR, realize it is an rvalue, and conclude the reference binding
is ill-formed, but at that point we'd have long discarded any alternate
candidates we could have used.

This patch modifies the logic that determines whether a
(non-type-dependent) COND_EXPR in a template is an lvalue, to rely on
its type, more specifically, on the presence of a REFERENCE_TYPE
wrapper.  In order to avoid a type bootstrapping problem, the
REFERENCE_TYPE that wraps the type of some such COND_EXPRs is
introduced earlier, so that we don't have to test for lvalueness of
the expression using the very code that we wish to change.


for  gcc/cp/ChangeLog

PR c++/84231
* tree.c (lvalue_kind): Use presence/absence of REFERENCE_TYPE
only while processing template decls.
* typeck.c (build_x_conditional_expr): Move wrapping of
reference type around type...
* call.c (build_conditional_expr_1): ... here.
* parser.c (cp_parser_fold_expression): Catch REFERENCE_REF_P
INDIRECT_REF of COND_EXPR too.

for  gcc/testsuite/ChangeLog

PR c++/84231
* g++.dg/pr84231.C: New.
---
 gcc/cp/call.c                  |    3 +++
 gcc/cp/parser.c                |    4 +++-
 gcc/cp/tree.c                  |    8 ++++++++
 gcc/cp/typeck.c                |    4 ----
 gcc/testsuite/g++.dg/pr84231.C |   29 +++++++++++++++++++++++++++++
 5 files changed, 43 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr84231.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 11fe28292fb1..9d98a3d90d25 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5348,6 +5348,9 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3,
 return error_mark_node;
 
  valid_operands:
+  if (processing_template_decl)
+result_type = cp_build_reference_type (result_type, !is_lvalue);
+
   result = build3_loc (loc, COND_EXPR, result_type, arg1, arg2, arg3);
 
   /* If the ARG2 and ARG3 are the same and don't have side-effects,
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bcee1214c2f3..c483b6ce25ea 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4961,7 +4961,9 @@ cp_parser_fold_expression (cp_parser *parser, tree expr1)
   else if (is_binary_op (TREE_CODE (expr1)))
 error_at (location_of (expr1),
  "binary expression in operand of fold-expression");
-  else if (TREE_CODE (expr1) == COND_EXPR)
+  else if (TREE_CODE (expr1) == COND_EXPR
+  || (REFERENCE_REF_P (expr1)
+  && TREE_CODE (TREE_OPERAND (expr1, 0)) == COND_EXPR))
 error_at (location_of (expr1),
  "conditional expression in operand of fold-expression");
 
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 9b9e36a1173f..76148c876b71 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -194,6 +194,14 @@ lvalue_kind (const_tree ref)
   break;
 
 case COND_EXPR:
+  /* Except for type-dependent exprs, a REFERENCE_TYPE will
+indicate whether its result is an lvalue or so.
+REFERENCE_TYPEs are handled above, so if we reach this point,
+we know we got an rvalue, unless we have a type-dependent
+expr.  */
+  if (processing_template_decl
+ && !type_dependent_expression_p (CONST_CAST_TREE (ref)))
+   return clk_none;
   op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 1)
? TREE_OPERAND (ref, 1)
: TREE_OPERAND (ref, 0));
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 0e7c63dd1973..fba04c49ec2d 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -6565,10 +6565,6 @@ build_x_conditional_expr (location_t loc, tree ifexp, tree op1, 

Re: [PATCH PR other/77609] Let the assembler choose ELF section types for miscellaneous named sections

2018-02-27 Thread Ian Lance Taylor via gcc-patches
On Tue, Feb 27, 2018 at 6:01 PM, Roland McGrath  wrote:
> On Mon, Feb 26, 2018 at 8:11 PM, Ian Lance Taylor  wrote:
>> You are recreating the conditions used in
>> default_elf_asm_named_section, so I think you ought to have comments
>> referring back and forth between them.
>>
>> This is OK with the two additional comments.
>
> Thanks for the review.  I've added those comments.
>
> However, in testing on x86_64-linux-gnu it caused a regression in:
> gcc/testsuite/gcc.target/i386/pr25254.c
> which got the "section type conflict" error.
>
> This is because x86_64_elf_select_section for that case calls:
> get_section (".lrodata", SECTION_LARGE, NULL)
> but something else had previously instantiated the section via
> the section_type_flags logic that now adds in SECTION_NOTYPE.
>
> I addressed this by making get_section treat a SECTION_NOTYPE mismatch (one
> side has it, the other does not) as a non-conflict if none of SECTION_BSS et
> al is present.  That seemed
> like a better bet than finding every get_section caller and making sure
> they use SECTION_NOTYPE when appropriate.  But I'm not sure if there might
> be some downside to that logic or if there is a third way to resolve this
> that's better than either of those two.
>
> Here's the new patch I'd like to commit.  It has no regressions on
> x86_64-linux-gnu, but I'm not set up to test other configurations.
>
>
> gcc/
> 2018-02-27  Roland McGrath  
>
> PR other/77609
> * varasm.c (default_section_type_flags): Set SECTION_NOTYPE for
> any section for which we don't know a specific type it should have,
> regardless of name.  Previously this was done only for the exact
> names ".init_array", ".fini_array", and ".preinit_array".
> (default_elf_asm_named_section): Add comment about
> relationship with default_section_type_flags and SECTION_NOTYPE.
> (get_section): Don't consider it a type conflict if one side has
> SECTION_NOTYPE and the other doesn't, as long as neither has the
> SECTION_BSS et al used in the default_section_type_flags logic.

Still OK, but it should wait until after the tree is back in stage 1.

Ian


Re: [PR81611] improve auto-inc

2018-02-27 Thread Jeff Law
On 02/27/2018 04:18 PM, Alexandre Oliva wrote:
> On Feb 14, 2018, Jeff Law  wrote:
> 
>>> + regno = REGNO (inc_insn.reg0);
>>> + int luid = DF_INSN_LUID (mem_insn.insn);
>>> + mem_insn.insn = get_next_ref (regno, bb, reg_next_use);
>> So I think a comment is warranted  right as we enter the TRUE arm.
> 
>> At that point INC_INSN is an inc/dec.  But MEM_INSN is not necessarily a
>> memory reference.  It could be a memory reference, it could be a copy,
>> it could be something completely different (it's just the next insn that
>> references the result of the increment).  In the case we care about we
>> want it to be a copy of INC_INSN's REG_RES back to REG0.
> 
>> ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before
>> we call get_next_ref for reg0 is advisable and probably good from a
>> compile-time standpoint by avoiding calls into find_address.
> 
> But we don't need it to be a copy.  The transformation is just as
> legitimate if the regs go independent ways after that point.  We have
> reg_res set to reg0+reg1, and then a use of reg0 in a MEM before any
> other use of reg_res.  We turn that into a copy of reg0 to reg_res, and
> the MEM addr into a post_add of reg_res with reg1 (possibly a post_inc),
> so that the MEM dereferences reg_res while it's still equal to reg0, and
> after the MEM, reg_res becomes reg0+reg1, as it should for any
> subsequent uses, and reg0 is unmodified.  Whether or not a subsequent
> copy from reg_res to reg0 is to be found won't make the transformation
> any more or less legitimate.
> 
>> After we call get_next_ref to get the next reference of the source of
>> the increment, then we're hoping to find a memory reference that uses
>> REG0.  But it's not guaranteed it's a memory reference insn.
> 
> Yeah, find_address will determine if it contains any of the MEM patterns
> we might be interested in, but it could be anything whatsoever.  The MEM
> pattern might appear virtually anywhere in the insn.
> 
>> I was having an awful time understanding how this code could work from
>> the comments until I put it under a debugger and got a sense of the
>> state as we entered that IF block.  Then it was much clearer :-)
> 
> Sorry, I realize the comments were written based on a lot of context
> about the overall behavior of the pass, that I had learned while trying
> to figure it out.  At the risk of making it redundant, I've expanded the
> comments, and added further tests that won't affect current behavior in
> any significant way, but that might speed things up a bit and will save
> us trouble should find_address be extended to catch additional patterns.
> 
> 
>> I believe Georg had other testcases in subsequent comments in the BZ,
>> but I don't believe they were flagged as regressions.
> 
> However, with the testcases I realized the incremented register could
> still be live, even if we didn't find a subsequent use for it.
> Adjusting for that made those testcases use post_inc too.
> 
> Here's the improved patch, regstrapped on aarch64-, ppc64-, and
> ppc64el-linux-gnu.  Ok to install?
> 
> 
> [PR81611] turn inc-and-use-of-dead-orig into auto-inc
> 
> When the addressing modes available on the machine don't allow offsets
> in addresses, odds are that post-increments will be represented in
> trees and RTL as:
> 
>   y <= x + 1
>   ... *(x) ...
>   x <= y
> 
> so deal with it by turning such RTL as:
> 
>   (set y (plus x n))
>   ... (mem x) ...
> 
> without intervening uses of y into
> 
>   (set y x)
>   ... (mem (post_add y n)) ...
> 
> so as to create auto-inc addresses that we'd otherwise miss.
> 
> 
> for  gcc/ChangeLog
> 
>   PR rtl-optimization/81611
>   * auto-inc-dec.c (attempt_change): Move dead note from
>   mem_insn if it's the next use of regno
>   (find_address): Take address use of reg holding
>   non-incremented value.  Add parm to limit search to the named
>   reg only.
>   (merge_in_block): Attempt to use a mem insn that is the next
>   use of the original regno.
OK.  Thanks!

Jeff
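
As an illustration of the source-level shape behind the RTL pattern
discussed above, here is a minimal C sketch.  The function names are
hypothetical, and whether a given target actually ends up with a
post-increment address depends on its addressing modes and on the pass
itself; the sketch only shows that the two forms compute the same thing.

```c
#include <assert.h>
#include <stddef.h>

/* Shape before the transformation: y = x + 1; ... *x ...; x = y.  */
int sum_separate_inc (const int *x, size_t n)
{
  int sum = 0;
  assert (x != NULL || n == 0);
  while (n--)
    {
      const int *y = x + 1;   /* (set y (plus x n)) */
      sum += *x;              /* ... (mem x) ... */
      x = y;                  /* x <= y */
    }
  return sum;
}

/* Shape after the transformation: y = x; ... *y++ ...  */
int sum_post_inc (const int *x, size_t n)
{
  int sum = 0;
  assert (x != NULL || n == 0);
  while (n--)
    {
      const int *y = x;       /* (set y x) */
      sum += *y++;            /* (mem (post_add y n)) */
      x = y;                  /* y now holds the incremented value */
    }
  return sum;
}
```

Both functions walk the same n elements; only the placement of the
increment relative to the memory access differs.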


Re: [PATCH PR other/77609] Let the assembler choose ELF section types for miscellaneous named sections

2018-02-27 Thread Roland McGrath via gcc-patches
On Mon, Feb 26, 2018 at 8:11 PM, Ian Lance Taylor  wrote:
> You are recreating the conditions used in
> default_elf_asm_named_section, so I think you ought to have comments
> referring back and forth between them.
>
> This is OK with the two additional comments.

Thanks for the review.  I've added those comments.

However, in testing on x86_64-linux-gnu it caused a regression in:
gcc/testsuite/gcc.target/i386/pr25254.c
which got the "section type conflict" error.

This is because x86_64_elf_select_section for that case calls:
get_section (".lrodata", SECTION_LARGE, NULL)
but something else had previously instantiated the section via
the section_type_flags logic that now adds in SECTION_NOTYPE.

I addressed this by making get_section treat a SECTION_NOTYPE mismatch (one
side has it, the other does not) as a non-conflict if none of SECTION_BSS et
al is present.  That seemed
like a better bet than finding every get_section caller and making sure
they use SECTION_NOTYPE when appropriate.  But I'm not sure if there might
be some downside to that logic or if there is a third way to resolve this
that's better than either of those two.
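
The relaxed conflict check described above can be sketched roughly as
follows.  This is a simplified illustration, not the patch itself: the
SEC_* names and bit values are hypothetical stand-ins for GCC's
SECTION_* flags, and the sketch leaves out SECTION_DECLARED,
SECTION_OVERRIDE and the HAVE_COMDAT_GROUP condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration only; GCC's real SECTION_*
   constants differ from these.  */
enum
{
  SEC_CODE     = 1 << 0,
  SEC_BSS      = 1 << 1,
  SEC_TLS      = 1 << 2,
  SEC_ENTSIZE  = 1 << 3,
  SEC_LINKONCE = 1 << 4,
  SEC_NOTYPE   = 1 << 5,
  SEC_LARGE    = 1 << 6
};

/* Flags that imply a specific section type and so contradict NOTYPE.  */
#define SEC_TYPED_MASK \
  (SEC_CODE | SEC_BSS | SEC_TLS | SEC_ENTSIZE | SEC_LINKONCE)

/* Return true if asking for NEW_FLAGS on a section that already exists
   with OLD_FLAGS should count as a section type conflict.  */
bool section_flags_conflict (unsigned old_flags, unsigned new_flags)
{
  assert ((SEC_TYPED_MASK & SEC_NOTYPE) == 0);

  /* If only one side has NOTYPE and neither side has a typed flag,
     treat both as NOTYPE before comparing.  */
  if (((old_flags ^ new_flags) & SEC_NOTYPE)
      && !((old_flags | new_flags) & SEC_TYPED_MASK))
    {
      old_flags |= SEC_NOTYPE;
      new_flags |= SEC_NOTYPE;
    }
  return old_flags != new_flags;
}
```

With this shape, a request for (SEC_LARGE) against an existing
(SEC_LARGE | SEC_NOTYPE) section is accepted, while a NOTYPE mismatch
involving SEC_BSS is still reported.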

Here's the new patch I'd like to commit.  It has no regressions on
x86_64-linux-gnu, but I'm not set up to test other configurations.


gcc/
2018-02-27  Roland McGrath  

PR other/77609
* varasm.c (default_section_type_flags): Set SECTION_NOTYPE for
any section for which we don't know a specific type it should have,
regardless of name.  Previously this was done only for the exact
names ".init_array", ".fini_array", and ".preinit_array".
(default_elf_asm_named_section): Add comment about
relationship with default_section_type_flags and SECTION_NOTYPE.
(get_section): Don't consider it a type conflict if one side has
SECTION_NOTYPE and the other doesn't, as long as neither has the
SECTION_BSS et al used in the default_section_type_flags logic.

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 6e345d39d31..e488f866011 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -296,6 +296,17 @@ get_section (const char *name, unsigned int flags, tree decl)
   else
 {
   sect = *slot;
+  /* It is fine if one of the sections has SECTION_NOTYPE as long as
+ the other has none of the contrary flags (see the logic at the end
+ of default_section_type_flags, below).  */
+  if (((sect->common.flags ^ flags) & SECTION_NOTYPE)
+  && !((sect->common.flags | flags)
+   & (SECTION_CODE | SECTION_BSS | SECTION_TLS | SECTION_ENTSIZE
+  | (HAVE_COMDAT_GROUP ? SECTION_LINKONCE : 0))))
+{
+  sect->common.flags |= SECTION_NOTYPE;
+  flags |= SECTION_NOTYPE;
+}
   if ((sect->common.flags & ~SECTION_DECLARED) != flags
  && ((sect->common.flags | flags) & SECTION_OVERRIDE) == 0)
{
@@ -6361,15 +6372,23 @@ default_section_type_flags (tree decl, const char *name, int reloc)
   || strncmp (name, ".gnu.linkonce.tb.", 17) == 0)
 flags |= SECTION_TLS | SECTION_BSS;

-  /* These three sections have special ELF types.  They are neither
- SHT_PROGBITS nor SHT_NOBITS, so when changing sections we don't
- want to print a section type (@progbits or @nobits).  If someone
- is silly enough to emit code or TLS variables to one of these
- sections, then don't handle them specially.  */
-  if (!(flags & (SECTION_CODE | SECTION_BSS | SECTION_TLS))
-  && (strcmp (name, ".init_array") == 0
- || strcmp (name, ".fini_array") == 0
- || strcmp (name, ".preinit_array") == 0))
+  /* Various sections have special ELF types that the assembler will
+ assign by default based on the name.  They are neither SHT_PROGBITS
+ nor SHT_NOBITS, so when changing sections we don't want to print a
+ section type (@progbits or @nobits).  Rather than duplicating the
+ assembler's knowledge of what those special name patterns are, just
+ let the assembler choose the type if we don't know a specific
+ reason to set it to something other than the default.  SHT_PROGBITS
+ is the default for sections whose name is not specially known to
+ the assembler, so it does no harm to leave the choice to the
+ assembler when @progbits is the best thing we know to use.  If
+ someone is silly enough to emit code or TLS variables to one of
+ these sections, then don't handle them specially.
+
+ default_elf_asm_named_section (below) handles the BSS, TLS, ENTSIZE, and
+ LINKONCE cases when NOTYPE is not set, so leave those to its logic.  */
+  if (!(flags & (SECTION_CODE | SECTION_BSS | SECTION_TLS | SECTION_ENTSIZE))
+  && !(HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE)))
 flags |= SECTION_NOTYPE;

   return flags;
@@ -6455,6 +6474,10 @@ default_elf_asm_named_section (const char *name, unsigned int flags,

   fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, 

Re: [PING] [PATCH] consider successor blocks when avoiding -Wstringop-truncation (PR 84468)

2018-02-27 Thread Jeff Law
On 02/26/2018 05:47 PM, Martin Sebor wrote:
> On 02/26/2018 12:13 PM, Jeff Law wrote:
>> On 02/24/2018 05:11 PM, Martin Sebor wrote:
>>> Attached is an updated patch with a fix for a bad assumption
>>> exposed by building the linux kernel.
>>>
>>> On 02/19/2018 07:50 PM, Martin Sebor wrote:
>>>> PR 84468 points out a false positive in -Wstringop-truncation
>>>> in code like this:
>>>>
>>>>   struct A { char a[4]; };
>>>>
>>>>   void f (struct A *p, const struct A *q)
>>>>   {
>>>>     if (p->a)
>>>>       strncpy (p->a, q->a, sizeof p->a - 1);   // warning here
>>>>
>>>>     p->a[3] = '\0';
>>>>   }
>>>>
>>>> The warning is due to the code checking only the same basic block
>>>> as the one with the strncpy call for an assignment to the destination
>>>> to avoid it, but failing to check the successor basic block if there
>>>> is no subsequent statement in the current block.  (Eliminating
>>>> the conditional is being tracked in PR 21474.)
>>>>
>>>> The attached test case adds logic to avoid this false positive.
>>>> I don't know under what circumstances there could be more than
>>>> one successor block here so I don't handle that case.
>> So this is feeling more and more like we need to go back to the ideas
>> behind checking the virtual operand chains.
>>
>> The patch as-written does not properly handle the case where BB has
>> multiple outgoing edges.  For gcc-8 you could probably get away with
>> checking that you have precisely one outgoing edge without EDGE_ABNORMAL
>> set in its flags in addition to the checks you're already doing.
>>
>> But again, it's feeling more and more like the right fix is to go back
>> and walk the virtual operands.
> 
> I intentionally kept the patch as simple as possible to minimize
> risk at this late stage.
> 
> Attached is a more robust version that handles multiple outgoing
> edges and avoids those with the EDGE_ABNORMAL bit set.  Retested
> on x86_64 and with the Linux kernel.
> 
> Enhancing this code to handle more complex cases is on my to-do
> list for stage 1 (e.g., to handle bug 84561 where MEM_REF defeats
> the detection of the nul assignment).
I don't think handling multiple outgoing edges is advisable here.  To do
that you have to start thinking about post-dominator analysis at which
point you're better off walking the memory web via VUSE/VDEFs.

Just verify the block has a single outgoing edge and that the edge is
not marked with EDGE_ABNORMAL.  Don't bother with the recursive call.
Assuming you get a suitable block, then look inside.

I glanced over the tests and I didn't see any that would benefit from
handling multiple edges or the recursion (every one of the dg-bogus
markers should be immediately transferring control to the null
termination statement AFAICT).

jeff
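
The check suggested above (exactly one outgoing edge, and that edge not
abnormal) can be sketched on a toy control-flow graph as follows.  The
toy_* types and the TOY_EDGE_ABNORMAL flag are hypothetical and only
stand in for GCC's own CFG structures and EDGE_ABNORMAL; this is an
illustration of the condition, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EDGE_ABNORMAL 0x1   /* hypothetical stand-in for EDGE_ABNORMAL */

struct toy_block;

struct toy_edge
{
  struct toy_block *dest;   /* block this edge leads to */
  unsigned flags;           /* TOY_EDGE_* bits */
};

struct toy_block
{
  struct toy_edge *succs;   /* outgoing edges */
  size_t n_succs;           /* number of outgoing edges */
};

/* Return the only successor of BB if BB has exactly one outgoing edge
   and that edge is not abnormal; otherwise return NULL.  No recursion
   and no attempt to handle multiple successors.  */
struct toy_block *single_normal_successor (const struct toy_block *bb)
{
  assert (bb != NULL);
  if (bb->n_succs != 1)
    return NULL;
  if (bb->succs[0].flags & TOY_EDGE_ABNORMAL)
    return NULL;
  return bb->succs[0].dest;
}
```

A caller would look for the nul-terminating store only in the block
returned here, and keep the warning when the result is NULL.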



libgo patch committed: Update AIX memory allocation

2018-02-27 Thread Ian Lance Taylor
This patch by Tony Reix updates the AIX memory allocation in libgo for
new versions of AIX.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 258051)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-bd7fc3c85d874344b18bbb0a738ec94dfb43794b
+821960465883fbdd96568f2325f55ee4b05de1cb
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/runtime/malloc.go
===
--- libgo/go/runtime/malloc.go  (revision 257914)
+++ libgo/go/runtime/malloc.go  (working copy)
@@ -296,8 +296,8 @@ func mallocinit() {
// allocation at 0x40 << 32 because when using 4k pages with 3-level
// translation buffers, the user address space is limited to 39 bits
// On darwin/arm64, the address space is even smaller.
-   // On AIX, mmap adresses range start at 0x0700_ for 64 bits
-   // processes.
+   // On AIX, mmap adresses range starts at 0x0700 for 64-bit
+   // processes. The new address space allocator starts at 0x0A00.
arenaSize := round(_MaxMem, _PageSize)
pSize = bitmapSize + spansSize + arenaSize + _PageSize
for i := 0; i <= 0x7f; i++ {
@@ -307,13 +307,16 @@ func mallocinit() {
case GOARCH == "arm64":
p = uintptr(i)<<40 | uintptrMask&(0x0040<<32)
case GOOS == "aix":
-   i = 1
-   p = uintptr(i)<<32 | uintptrMask&(0x70<<52)
+   if i == 0 {
+   p = uintptrMask&(1<<32) | uintptrMask&(0xa0<<52)
+   } else {
+   p = uintptr(i)<<32 | uintptrMask&(0x70<<52)
+   }
default:
p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32)
}
p = uintptr(sysReserve(unsafe.Pointer(p), pSize, ))
-   if p != 0 || GOOS == "aix" { // Useless to loop on AIX, as i is forced to 1
+   if p != 0 {
break
}
}
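
To make the new AIX address hints concrete, here is a small C sketch of
the same arithmetic.  The function name aix_arena_hint is hypothetical,
and the sketch assumes a 64-bit address type so that the Go code's
uintptrMask leaves every bit intact.

```c
#include <assert.h>
#include <stdint.h>

/* Return the address hint tried on iteration I of the reservation
   loop: the first attempt uses the 0xa0<<52 base, later attempts fall
   back to the 0x70<<52 base with I selecting the 4 GiB slot.  */
uint64_t aix_arena_hint (unsigned i)
{
  assert (i <= 0x7f);   /* the loop in mallocinit runs i from 0 to 0x7f */
  if (i == 0)
    return (UINT64_C (1) << 32) | (UINT64_C (0xa0) << 52);
  return ((uint64_t) i << 32) | (UINT64_C (0x70) << 52);
}
```

Because the loop no longer forces i to 1 and no longer breaks
unconditionally on AIX, a failed reservation at the first hint now moves
on to the next one.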


Re: [PATCH/testsuite] avoid test failures with -fpic

2018-02-27 Thread Jeff Law
On 02/26/2018 04:47 PM, Martin Sebor wrote:
> Compiling a number of tests with -fpic results in failures
> because the tests make use of non-inline, extern helper
> functions defined within, and these helpers must be assumed
> to have been interposed elsewhere.
> 
> For example:
> https://gcc.gnu.org/ml/gcc-testresults/2018-02/msg01762.html
> 
> I took a quick pass through the failures and declared
> the helpers static to avoid them.  I only did this in failing
> tests I recognized because I introduced them myself, being unaware
> that building the tests with -fpic was expected to work.
> 
> This should make the -fpic test results a lot cleaner than
> they currently are, although I don't think it brings them
> up to par with non-fpic results.
> 
> Unless there are objections in the next day or so I'll commit
> the fixes as obvious.
OK.


jeff
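
A minimal sketch of the kind of change described above, with
hypothetical function names.  The assumption is the usual ELF one: under
-fpic an extern function with default visibility may be replaced by
another definition at link or load time, so the compiler may decline to
rely on its body, whereas a static function cannot be replaced.

```c
#include <assert.h>

/* Extern helper: under -fpic the compiler may have to assume a call
   here can reach a different definition, so a test that depends on
   the body being visible to the optimizers may fail.  */
int get_size_extern (void)
{
  return 4;
}

/* Static helper: the definition cannot be replaced from outside this
   translation unit, so the optimizers are free to use its body.  */
static int get_size_static (void)
{
  return 4;
}

int use_sizes (void)
{
  int total = get_size_extern () + get_size_static ();
  assert (total == 8);   /* holds as long as nothing interposes */
  return total;
}
```

Declaring test helpers static, as in get_size_static, is the fix applied
to the failing tests.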



libgo patch committed: Update to final Go 1.10 release

2018-02-27 Thread Ian Lance Taylor
This patch to libgo updates it to the final Go 1.10 release.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 257954)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-8b3d6091801d485c74a9c92740c69673e39160b0
+bd7fc3c85d874344b18bbb0a738ec94dfb43794b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/MERGE
===
--- libgo/MERGE (revision 257914)
+++ libgo/MERGE (working copy)
@@ -1,4 +1,4 @@
-20e228f2fdb44350c858de941dff4aea9f3127b8
+bf86aec25972f3a100c3aa58a6abcbcc35bdea49
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
Index: libgo/VERSION
===
--- libgo/VERSION   (revision 257914)
+++ libgo/VERSION   (working copy)
@@ -1 +1 @@
-go1.10rc2
+go1.10
Index: libgo/go/cmd/go/internal/load/pkg.go
===
--- libgo/go/cmd/go/internal/load/pkg.go(revision 257914)
+++ libgo/go/cmd/go/internal/load/pkg.go(working copy)
@@ -1224,6 +1224,7 @@ func (p *Package) load(stk *ImportStack,
 // GNU binutils flagfile specifiers, sometimes called "response files").
// To be conservative, we reject almost any arg beginning with non-alphanumeric ASCII.
 // We accept leading . _ and / as likely in file system paths.
+// There is a copy of this function in cmd/compile/internal/gc/noder.go.
 func SafeArg(name string) bool {
if name == "" {
return false
Index: libgo/go/cmd/go/internal/work/exec.go
===
--- libgo/go/cmd/go/internal/work/exec.go   (revision 257914)
+++ libgo/go/cmd/go/internal/work/exec.go   (working copy)
@@ -945,15 +945,20 @@ func splitPkgConfigOutput(out []byte) []
// Calls pkg-config if needed and returns the cflags/ldflags needed to build the package.
func (b *Builder) getPkgConfigFlags(p *load.Package) (cflags, ldflags []string, err error) {
if pkgs := p.CgoPkgConfig; len(pkgs) > 0 {
+   var pcflags []string
+   for len(pkgs) > 0 && strings.HasPrefix(pkgs[0], "--") {
+   pcflags = append(pcflags, pkgs[0])
+   pkgs = pkgs[1:]
+   }
for _, pkg := range pkgs {
if !load.SafeArg(pkg) {
return nil, nil, fmt.Errorf("invalid pkg-config package name: %s", pkg)
}
}
var out []byte
-   out, err = b.runOut(p.Dir, p.ImportPath, nil, b.PkgconfigCmd(), "--cflags", "--", pkgs)
+   out, err = b.runOut(p.Dir, p.ImportPath, nil, b.PkgconfigCmd(), "--cflags", pcflags, "--", pkgs)
if err != nil {
-   b.showOutput(nil, p.Dir, b.PkgconfigCmd()+" --cflags "+strings.Join(pkgs, " "), string(out))
+   b.showOutput(nil, p.Dir, b.PkgconfigCmd()+" --cflags "+strings.Join(pcflags, " ")+strings.Join(pkgs, " "), string(out))
b.Print(err.Error() + "\n")
return nil, nil, errPrintedOutput
}
@@ -963,15 +968,15 @@ func (b *Builder) getPkgConfigFlags(p *l
return nil, nil, err
}
}
-   out, err = b.runOut(p.Dir, p.ImportPath, nil, b.PkgconfigCmd(), "--libs", "--", pkgs)
+   out, err = b.runOut(p.Dir, p.ImportPath, nil, b.PkgconfigCmd(), "--libs", pcflags, "--", pkgs)
if err != nil {
-   b.showOutput(nil, p.Dir, b.PkgconfigCmd()+" --libs "+strings.Join(pkgs, " "), string(out))
+   b.showOutput(nil, p.Dir, b.PkgconfigCmd()+" --libs "+strings.Join(pcflags, " ")+strings.Join(pkgs, " "), string(out))
b.Print(err.Error() + "\n")
return nil, nil, errPrintedOutput
}
if len(out) > 0 {
ldflags = strings.Fields(string(out))
-   if err := checkLinkerFlags("CFLAGS", "pkg-config --cflags", ldflags); err != nil {
+   if err := checkLinkerFlags("LDFLAGS", "pkg-config --libs", ldflags); err != nil {
return nil, nil, err
}
}
Index: libgo/go/cmd/go/internal/work/security.go
===
--- libgo/go/cmd/go/internal/work/security.go   (revision 257914)
+++ libgo/go/cmd/go/internal/work/security.go   (working copy)
@@ -34,6 +34,7 @@ import (
"fmt"

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-27 Thread Martin Sebor

On 02/27/2018 04:44 PM, Jakub Jelinek wrote:

On Mon, Feb 26, 2018 at 09:19:56PM -0700, Martin Sebor wrote:

+  /* Put together a list of the black listed attributes that the primary
+ template is declared with that the specialization is not, in case
+ it's not apparent from the most recent declaration of the primary.  */
+  unsigned nattrs = 0;
+  std::string str;
+
+  for (unsigned i = 0; i != sizeof blacklist / sizeof *blacklist; ++i)
+{
+  for (unsigned j = 0; j != 2; ++j)
+   {
+ if (!lookup_attribute (blacklist[i], tmpl_attrs[j]))
+   continue;
+
+ for (unsigned k = 0; k != 1 + !!spec_attrs[1]; ++k)
+   {
+ if (lookup_attribute (blacklist[i], spec_attrs[k]))
+   break;
+
+ if (str.size ())
+   str += ", ";
+ str += "%<";
+ str += blacklist[i];
+ str += "%>";
+ ++nattrs;
+   }
+   }
+}
+
+  if (!nattrs)
+return;
+
+  if (warning_at (DECL_SOURCE_LOCATION (spec), OPT_Wmissing_attributes,
+ "explicit specialization %q#D may be missing attributes",
+ spec))
+{
+  if (nattrs > 1)
+   str = G_("missing primary template attributes ") + str;
+  else
+   str = G_("missing primary template attribute ") + str;
+
+  inform (DECL_SOURCE_LOCATION (tmpl), str.c_str ());


This is broken for multiple reasons:
1) it should be inform_n rather than inform
2) you really can't do what you're doing for translations;
   G_(...) marks the string for translations, but what actually is
   translated is not that string, but rather what is passed to inform,
   i.e. str.c_str (), so it will be likely never translated
3) as others have mentioned, the #include <string> you are doing is
   wrong
4) I don't see justification to use std::string here

What you IMHO should use instead is use
  pretty_printer str;
instead, and the pp_* APIs to add stuff in there, including
pp_begin_quote (&str, pp_show_color (global_dc->printer))
and
pp_end_quote (&str, pp_show_color (global_dc->printer))
when you want to add what %< or %> expand to,
and finally
  inform_n (DECL_SOURCE_LOCATION (tmpl), nattrs,
"missing primary template attribute %s",
"missing primary template attributes %s",
pp_formatted_text (&str));
That way it should be properly translatable.


Using inform_n() would not be correct here.  What's being
translated is one of exactly two forms: singular and plural.
It doesn't matter how many things the plural form refers to
because the number doesn't appear in the message.  Let's ask
Google to translate the message above to a language with more
than two plural forms, such as Czech:

there are missing attributes:
https://translate.google.com/?tl=cs#auto/cs/there%20are%20missing%20attributes

vs there are 5 missing attributes:
https://translate.google.com/?tl=cs#auto/cs/there%20are%205%20missing%20attributes

Only the first form is correct when the exact number isn't
mentioned.

There are many places in the C++ front-end where a string
enclosed in G_() is assigned to a pointer and later used
in a diagnostic call.  Is there something different about
the usage I introduced that makes it unsuitable for
translation?

std::string is used in a number of places in GCC.  Why does
using it here need any special justification?

Using the pretty printer as you suggest also sounds
complicated to me and so prone to error but I will defer
to Jason's opinion to decide if any changes are necessary.

Martin

PS What I do think would be helpful (and what I'd like to
look into adding in stage 1) is a directive to format
attribute lists.  But I'm not sure the directive will help
with cases like this one.



[PATCH] Fix ms_struct/-mms-bitfields structure layout (PR target/52991)

2018-02-27 Thread Jakub Jelinek
Hi!

The following patch fixes the reported ms_struct/-mms-bitfields structure
layout issues from PR52991.

There are multiple issues, two of them introduced by the
https://gcc.gnu.org/ml/gcc-patches/2006-04/msg01064.html -mms-bitfields
revamp from Eric and follow-up fix r114552, the rest has been introduced
later when the known_align < desired_align case has been enabled for the ms
bitfield layout.

The first 2 hunks fix alignment of packed non-bitfield fields: we can't
ignore all the alignment updates for them, we should just use only
desired_align, which takes DECL_PACKED into account, rather than
MAX (type_align, desired_align).  Similarly, the last hunk in stor-layout.c
makes sure that for DECL_PACKED fields we use BITS_PER_UNIT alignment rather
than the type alignment.
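
As a rough illustration of the alignment choice described above, the rule can be modelled outside GCC.  This is only a sketch under stated assumptions: ms_field_align and its plain-integer parameters are hypothetical stand-ins for TYPE_ALIGN, desired_align, DECL_PACKED and maximum_field_alignment, all measured in bits, and nothing else from the layout code is modelled.

```cpp
#include <algorithm>

// Hypothetical model (not GCC code) of the alignment picked for a field
// under the MS bitfield layout.  All values are in bits.
unsigned ms_field_align (unsigned type_align, unsigned desired_align,
                         bool is_bitfield, bool packed,
                         unsigned maximum_field_alignment)
{
  unsigned align;
  if (!is_bitfield && packed)
    align = desired_align;   // packed non-bitfield: desired_align only
  else
    align = std::max (type_align, desired_align);
  // A nonzero maximum field alignment caps the result.
  if (maximum_field_alignment != 0)
    align = std::min (align, maximum_field_alignment);
  return align;
}
```

With these stand-ins, a packed int (type alignment 32, desired alignment 8) contributes 8 bits of alignment to the record instead of 32.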

The rest attempts to unbreak r184409 which enabled known_align < desired_align
case; doing that if rli->prev_field and ms layout is wrong, we first need to
deal with the bitfield packing and if we are within a bitfield word, we
shouldn't do any realignment, only in between them.

The patch reverts changes to bf-ms-layout{,-2}.c tests done in 2012, which
were done just to match the r184409 changes, and adds 2 new tests.  All of
these 4 I've tested (slightly tweaked, so that it compiles with VC) with
the online VC compiler http://rextester.com/l/c_online_compiler_visual .

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-27  Jakub Jelinek  

PR target/52991
* stor-layout.c (update_alignment_for_field): For
targetm.ms_bitfield_layout_p (rli->t), if !is_bitfield
&& !DECL_PACKED (field), do the alignment update, just use
only desired_align instead of MAX (type_align, desired_align)
as the alignment.
(place_field): Don't do known_align < desired_align handling
early if targetm.ms_bitfield_layout_p (rli->t) and rli->prev_field
is non-NULL, instead do it after rli->prev_field handling and
only if not within a bitfield word.  For DECL_PACKED (field)
use type_align of BITS_PER_UNIT.

* gcc.dg/bf-ms-layout.c: Revert 2012-04-26 changes.
* gcc.dg/bf-ms-layout-2.c: Revert 2012-02-23 changes.
* gcc.dg/bf-ms-layout-4.c: New test.
* gcc.dg/bf-ms-layout-5.c: New test.

--- gcc/stor-layout.c.jj2018-02-22 14:35:33.135216198 +0100
+++ gcc/stor-layout.c   2018-02-27 18:56:26.906494801 +0100
@@ -1038,7 +1038,7 @@ update_alignment_for_field (record_layou
 the type, except that for zero-size bitfields this only
 applies if there was an immediately prior, nonzero-size
 bitfield.  (That's the way it is, experimentally.) */
-  if ((!is_bitfield && !DECL_PACKED (field))
+  if (!is_bitfield
  || ((DECL_SIZE (field) == NULL_TREE
   || !integer_zerop (DECL_SIZE (field)))
  ? !DECL_PACKED (field)
@@ -1047,7 +1047,10 @@ update_alignment_for_field (record_layou
 && ! integer_zerop (DECL_SIZE (rli->prev_field)
{
  unsigned int type_align = TYPE_ALIGN (type);
- type_align = MAX (type_align, desired_align);
+ if (!is_bitfield && DECL_PACKED (field))
+   type_align = desired_align;
+ else
+   type_align = MAX (type_align, desired_align);
  if (maximum_field_alignment != 0)
type_align = MIN (type_align, maximum_field_alignment);
  rli->record_align = MAX (rli->record_align, type_align);
@@ -1303,7 +1306,9 @@ place_field (record_layout_info rli, tre
 
   /* Does this field automatically have alignment it needs by virtue
  of the fields that precede it and the record's own alignment?  */
-  if (known_align < desired_align)
+  if (known_align < desired_align
+  && (! targetm.ms_bitfield_layout_p (rli->t)
+ || rli->prev_field == NULL))
 {
   /* No, we need to skip space before this field.
 Bump the cumulative size to multiple of field alignment.  */
@@ -1331,8 +1336,6 @@ place_field (record_layout_info rli, tre
 
   if (! TREE_CONSTANT (rli->offset))
rli->offset_align = desired_align;
-  if (targetm.ms_bitfield_layout_p (rli->t))
-   rli->prev_field = NULL;
 }
 
   /* Handle compatibility with PCC.  Note that if the record has any
@@ -1448,6 +1451,8 @@ place_field (record_layout_info rli, tre
   /* This is a bitfield if it exists.  */
   if (rli->prev_field)
{
+ bool realign_p = known_align < desired_align;
+
  /* If both are bitfields, nonzero, and the same size, this is
 the middle of a run.  Zero declared size fields are special
 and handled as "end of run". (Note: it's nonzero declared
@@ -1481,7 +1486,10 @@ place_field (record_layout_info rli, tre
rli->remaining_in_alignment = typesize - bitsize;
}
  else
-   rli->remaining_in_alignment -= bitsize;
+   {
+

[RFA][PATCH][PR middle-end/61118] Improve tree CFG accuracy for setjmp/longjmp

2018-02-27 Thread Jeff Law
Richi, you worked on 57147 which touches on the issues here.  Your
thoughts would be greatly appreciated.


So 61118 is one of several bugs related to the clobbered-by-longjmp warning.

In 61118 we are unable to coalesce all the objects in the key
partitions.  To remove the relevant PHIs we have to create two
assignments to the key pseudos.

Pseudos with more than one assignment are subject to the
clobbered-by-longjmp analysis:

/* True if register REGNO was alive at a place where `setjmp' was
   called and was set more than once or is an argument.  Such regs may
   be clobbered by `longjmp'.  */

static bool
regno_clobbered_at_setjmp (bitmap setjmp_crosses, int regno)
{
  /* There appear to be cases where some local vars never reach the
 backend but have bogus regnos.  */
  if (regno >= max_reg_num ())
return false;

  return ((REG_N_SETS (regno) > 1
   || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN
(cfun)),
   regno))
  && REGNO_REG_SET_P (setjmp_crosses, regno));
}


The fact that no path sets the pseudo more than once is not considered.
If there is more than one static set of the pseudo, then it is
considered for possible warning.
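
To make the point about static sets concrete, here is a minimal stand-alone model of that predicate.  It is a sketch, not GCC code: pseudo_info and may_be_clobbered are hypothetical names, and the three fields simply summarize what REG_N_SETS, the entry block's live-out set and setjmp_crosses would report for one pseudo.

```cpp
// Hypothetical summary of what the predicate looks at for one pseudo.
struct pseudo_info
{
  int static_sets;       // number of set insns in the function
  bool live_at_entry;    // live out of the entry block, e.g. an argument
  bool crosses_setjmp;   // live across a setjmp call
};

// Mirrors the shape of the test quoted above: only the static count of
// sets matters, not whether any single path executes more than one.
bool may_be_clobbered (const pseudo_info &p)
{
  return (p.static_sets > 1 || p.live_at_entry) && p.crosses_setjmp;
}
```

A pseudo with two sets on mutually exclusive paths has static_sets == 2, so it is flagged as soon as it is live across a setjmp.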

--


I looked at the propagations which led to the inability to coalesce.
They all seemed valid to me.  We have always allowed copy propagation to
replace one pseudo with another as long as neither has
SSA_NAME_USED_IN_ABNORMAL_PHI set.

We have a PHI like

x1(ab) = (x0, x3 (ab))

x0 is not marked as abnormal because the edge isn't abnormal and thus we
can propagate into the x0 argument of the PHI.  This is consistent with
behavior since, well, forever.   We propagate a value for x0 resulting
in something like

x1(ab) = (y0, x3 (ab))


Where y0 is still live across the PHI.  Thus the partition for x1/x3,
etc conflicts with the partition for y0 and they can not be coalesced.
This leads to the multiple assignments to the pseudo for the x1/x3
partition.  I briefly looked at marking all the PHI arguments as abnormal
when the destination is abnormal, but it just doesn't seem right.

Anyway, I'd already been looking at 21161 and was aware that the CFG's
we're building in presence of setjmp/longjmp were slightly inaccurate.

In particular, a longjmp returns to the point immediately after the
setjmp, not to the setjmp itself.  But our CFG building has the edge
from the abnormal dispatcher going to the block containing the setjmp call.

This creates unnecessary irreducible loops.  It turns out that if we fix
the tree CFG, then lifetimes become more accurate (and more
constrained).  The more constrained, more accurate lifetime information
is enough to allow things to coalesce the way we want and everything for
61118 just works.

It's actually pretty easy to fix the CFG.  We just need to recognize
that a "returns twice" function returns not to the call, but to the
point immediately after the call.  So if we have a call to a returns
twice function that ends a block with a single successor, when we wire
up the abnormal dispatcher, we target the single successor rather than
the block containing the returns-twice call.

This compromises the test gcc.dg/torture/57147-2.c


Prior to this change the CFG looks like

 2
/ \
   3<->4
   |
   R

Where block #3 contains the setjmp.  The edges 2->4, 3->4 and 4->3 are
abnormals.  Block #4 is the abnormal dispatcher.

Eventually we remove the edge from 2->3 because the last statement in
block #2 is to a non-returning function call.  But we leave the abnormal
edge 2->4 (on purpose) resulting in:


 2
 |
  +->4
  |  |
  +--3
 |
 R

The test then proceeds to verify there is a call to setjmp in the
resulting .optimized dump -- which there is because block #3 remains
reachable.


With this change the CFG looks like:



 2
/ \
   3-->4
   |  /
   | /
   |/
   R


Where the edges 2->4 and 3->4 and 4->R are abnormals.  Block #4 is still
the dispatcher and the setjmp is still in block #3.

We realize block #2 ends with a call to a noreturn function and again we
remove the 2->3 edge.  That makes block #3 unreachable and it gets
removed, resulting in:

2
|
4
|
R

Where 2->4 and 4->R are still abnormal edges.  With bb3 becoming
unreachable, the setjmp is unreachable and gets removed thus breaking
the scan part of the test.
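
The reachability argument can be checked with a small stand-alone sketch.  The block names follow the diagrams above; edge_list, reachable, old_cfg and new_cfg are hypothetical names used only for this illustration, and both graphs are written down after the 2->3 edge has already been removed.

```cpp
#include <set>
#include <utility>
#include <vector>

using edge_list = std::vector<std::pair<char, char>>;

// Blocks reachable from START by following EDGES (simple worklist walk).
std::set<char> reachable (const edge_list &edges, char start)
{
  std::set<char> seen = { start };
  std::vector<char> work = { start };
  while (!work.empty ())
    {
      char bb = work.back ();
      work.pop_back ();
      for (const auto &e : edges)
        if (e.first == bb && seen.insert (e.second).second)
          work.push_back (e.second);
    }
  return seen;
}

// Old CFG: the dispatcher (4) jumps back to the setjmp block (3).
const edge_list old_cfg = { {'2', '4'}, {'3', '4'}, {'4', '3'}, {'3', 'R'} };
// New CFG: the dispatcher targets the successor of the setjmp block (R).
const edge_list new_cfg = { {'2', '4'}, {'3', '4'}, {'4', 'R'}, {'3', 'R'} };
```

Walking from block 2, the old graph still reaches block 3 through the dispatcher, while the new graph reaches only 4 and R, which is why the setjmp call disappears from the dump.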




If we review the source of the test:


struct __jmp_buf_tag {};
typedef struct __jmp_buf_tag jmp_buf[1];
extern int _setjmp (struct __jmp_buf_tag __env[1]);

jmp_buf g_return_jmp_buf;

void SetNaClSwitchExpectations (void)
{
  __builtin_longjmp (g_return_jmp_buf, 1);
}
void TestSyscall(void)
{
  SetNaClSwitchExpectations();
  _setjmp (g_return_jmp_buf);
}


We can easily see that the call to _setjmp can never be reached given
that we consider the longjmp call as non-returning.  So AFAICT
everything is as should be expected.  I think the right thing is to just
remove this compromised test.

--



The regression tested from 

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-27 Thread Jakub Jelinek
On Mon, Feb 26, 2018 at 09:19:56PM -0700, Martin Sebor wrote:
> +  /* Put together a list of the black listed attributes that the primary
> + template is declared with that the specialization is not, in case
> + it's not apparent from the most recent declaration of the primary.  */
> +  unsigned nattrs = 0;
> +  std::string str;
> +
> +  for (unsigned i = 0; i != sizeof blacklist / sizeof *blacklist; ++i)
> +{
> +  for (unsigned j = 0; j != 2; ++j)
> + {
> +   if (!lookup_attribute (blacklist[i], tmpl_attrs[j]))
> + continue;
> +
> +   for (unsigned k = 0; k != 1 + !!spec_attrs[1]; ++k)
> + {
> +   if (lookup_attribute (blacklist[i], spec_attrs[k]))
> + break;
> +
> +   if (str.size ())
> + str += ", ";
> +   str += "%<";
> +   str += blacklist[i];
> +   str += "%>";
> +   ++nattrs;
> + }
> + }
> +}
> +
> +  if (!nattrs)
> +return;
> +
> +  if (warning_at (DECL_SOURCE_LOCATION (spec), OPT_Wmissing_attributes,
> +   "explicit specialization %q#D may be missing attributes",
> +   spec))
> +{
> +  if (nattrs > 1)
> + str = G_("missing primary template attributes ") + str;
> +  else
> + str = G_("missing primary template attribute ") + str;
> +
> +  inform (DECL_SOURCE_LOCATION (tmpl), str.c_str ());

This is broken for multiple reasons:
1) it should be inform_n rather than inform
2) you really can't do what you're doing for translations;
   G_(...) marks the string for translations, but what actually is
   translated is not that string, but rather what is passed to inform,
   i.e. str.c_str (), so it will be likely never translated
3) as others have mentioned, the #include <string> you are doing is
   wrong
4) I don't see justification to use std::string here

What you IMHO should use instead is use
  pretty_printer str;
instead, and the pp_* APIs to add stuff in there, including
pp_begin_quote (&str, pp_show_color (global_dc->printer))
and
pp_end_quote (&str, pp_show_color (global_dc->printer))
when you want to add what %< or %> expand to,
and finally
  inform_n (DECL_SOURCE_LOCATION (tmpl), nattrs,
"missing primary template attribute %s",
"missing primary template attributes %s",
pp_formatted_text (&str));
That way it should be properly translatable.
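
The shape of that suggestion can be sketched without any GCC internals.  Everything below is an assumption made for illustration: pick_form is a toy stand-in for the singular/plural choice that inform_n leaves to the translation machinery, quoted_list uses plain quotes where the real code would use the quote helpers, and missing_attrs_note is a hypothetical name.  The point is only that each complete format string stays intact and the attribute list is substituted afterwards.

```cpp
#include <string>
#include <vector>

// Toy stand-in for the singular/plural selection; it only knows the
// English two-form rule.
const char *pick_form (unsigned n, const char *singular, const char *plural)
{
  return n == 1 ? singular : plural;
}

// Build "'a', 'b'" from the attribute names.
std::string quoted_list (const std::vector<std::string> &names)
{
  std::string out;
  for (const std::string &name : names)
    {
      if (!out.empty ())
        out += ", ";
      out += "'" + name + "'";
    }
  return out;
}

// Choose the whole format string first, then fill in the list.
std::string missing_attrs_note (const std::vector<std::string> &names)
{
  std::string msg = pick_form (static_cast<unsigned> (names.size ()),
                               "missing primary template attribute %s",
                               "missing primary template attributes %s");
  msg.replace (msg.find ("%s"), 2, quoted_list (names));
  return msg;
}
```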

Jakub


Re: [PATCH] Fix gcc.target/i386/pr84309.c testcase (PR target/84575)

2018-02-27 Thread Jakub Jelinek
On Tue, Feb 27, 2018 at 08:53:15AM -0800, H.J. Lu wrote:
> On Tue, Feb 27, 2018 at 1:01 AM, Jakub Jelinek  wrote:
> >
> 
> NOPATCH.

Oops, sorry, here it is:

2018-02-27  Jakub Jelinek  

PR target/84575
* gcc.target/i386/pr84309.c: Add -mno-avx2 to dg-options.

--- gcc/testsuite/gcc.target/i386/pr84309.c.jj  2018-02-13 09:33:31.119560170 +0100
+++ gcc/testsuite/gcc.target/i386/pr84309.c 2018-02-27 09:42:01.197135520 +0100
@@ -1,6 +1,6 @@
 /* PR middle-end/84309 */
 /* { dg-do compile } */
-/* { dg-options "-Ofast -mavx" } */
+/* { dg-options "-Ofast -mavx -mno-avx2" } */
 
 double pow (double, double) __attribute__((simd));
 double exp (double) __attribute__((simd));

Jakub


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-27 Thread Martin Sebor

On 02/27/2018 04:21 PM, David Edelsohn wrote:

Martin,

This patch broke bootstrap.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 42fd872..9c2e5e6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
  all methods must be provided in header files; can't use a source
  file that contains only the method templates and "just win".  */

+#include <string>
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"

Nothing is allowed to be included before GCC config.h and system.h.
And you should not be including C++ header files directly.  If you
truly need <string>, the file should define INCLUDE_STRING (see
system.h).


Sorry, I didn't know that and my bootstrap worked.  I committed
r258046 (following what gcc/ipa-chkp.c does).

Martin


[PATCH] Fix pt.c bootstrap breakage

2018-02-27 Thread David Edelsohn
The recent change to pt.c broke bootstrap by including C++ header file
<string> directly and including it first.  This patch changes to the
necessary method which includes the header file via system.h.

Okay?

Thanks, David

* pt.c: Don't include string. Define INCLUDE_STRING before system.h.

Index: pt.c
===
--- pt.c(revision 258045)
+++ pt.c(working copy)
@@ -24,8 +24,8 @@
  all methods must be provided in header files; can't use a source
  file that contains only the method templates and "just win".  */

-#include <string>
 #include "config.h"
+#define INCLUDE_STRING
 #include "system.h"
 #include "coretypes.h"
 #include "cp-tree.h"


Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-27 Thread David Edelsohn
Martin,

This patch broke bootstrap.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 42fd872..9c2e5e6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
  all methods must be provided in header files; can't use a source
  file that contains only the method templates and "just win".  */

+#include <string>
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"

Nothing is allowed to be included before GCC config.h and system.h.
And you should not be including C++ header files directly.  If you
truly need <string>, the file should define INCLUDE_STRING (see
system.h).
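
For readers unfamiliar with the convention, the pattern looks roughly like the sketch below.  It is a self-contained mock-up rather than GCC source: the #ifdef block stands in for the relevant part of a system.h-style wrapper, and attr_list_prefix is a hypothetical function that exists only to show that std::string is usable afterwards.

```cpp
// What the source file does: request the header by macro, before the
// point where the wrapper header would be included.
#define INCLUDE_STRING

// Simplified stand-in for the wrapper header: it pulls in the C++ header
// itself, and only when asked to.
#ifdef INCLUDE_STRING
# include <string>
#endif

// std::string is now available without a direct #include in the file.
std::string attr_list_prefix ()
{
  return "missing primary template attributes ";
}
```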

Thanks, David


Re: [PR81611] improve auto-inc

2018-02-27 Thread Alexandre Oliva
On Feb 14, 2018, Jeff Law  wrote:

>> +  regno = REGNO (inc_insn.reg0);
>> +  int luid = DF_INSN_LUID (mem_insn.insn);
>> +  mem_insn.insn = get_next_ref (regno, bb, reg_next_use);
> So I think a comment is warranted  right as we enter the TRUE arm.

> At that point INC_INSN is an inc/dec.  But MEM_INSN is not necessarily a
> memory reference.  It could be a memory reference, it could be a copy,
> it could be something completely different (it's just the next insn that
> references the result of the increment).  In the case we care about we
> want it to be a copy of INC_INSN's REG_RES back to REG0.

> ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before
> we call get_next_ref for reg0 is advisable and probably good from a
> compile-time standpoint by avoiding calls into find_address.

But we don't need it to be a copy.  The transformation is just as
legitimate if the regs go independent ways after that point.  We have
reg_res set to reg0+reg1, and then a use of reg0 in a MEM before any
other use of reg_res.  We turn that into a copy of reg0 to reg_res, and
the MEM addr into a post_add of reg_res with reg1 (possibly a post_inc),
so that the MEM dereferences reg_res while it's still equal to reg0, and
after the MEM, reg_res becomes reg0+reg1, as it should for any
subsequent uses, and reg0 is unmodified.  Whether or not a subsequent
copy from reg_res to reg0 is to be found won't make the transformation
any more or less legitimate.

> After we call get_next_ref to get the next reference of the source of
> the increment, then we're hoping to find a memory reference that uses
> REG0.  But it's not guaranteed it's a memory reference insn.

Yeah, find_address will determine if it contains any of the MEM patterns
we might be interested in, but it could be anything whatsoever.  The MEM
pattern might appear virtually anywhere in the insn.

> I was having an awful time understanding how this code could work from
> the comments until I put it under a debugger and got a sense of the
> state as we entered that IF block.  Then it was much clearer :-)

Sorry, I realize the comments were written based on a lot of context
about the overall behavior of the pass, that I had learned while trying
to figure it out.  At the risk of making it redundant, I've expanded the
comments, and added further tests that won't affect current behavior in
any significant way, but that might speed things up a bit and will save
us trouble should find_address be extended to catch additional patterns.


> I believe Georg had other testcases in subsequent comments in the BZ,
> but I don't believe they were flagged as regressions.

However, with the testcases I realized the incremented register could
still be live, even if we didn't find a subsequent use for it.
Adjusting for that made those testcases use post_inc too.

Here's the improved patch, regstrapped on aarch64-, ppc64-, and
ppc64el-linux-gnu.  Ok to install?


[PR81611] turn inc-and-use-of-dead-orig into auto-inc

When the addressing modes available on the machine don't allow offsets
in addresses, odds are that post-increments will be represented in
trees and RTL as:

  y <= x + 1
  ... *(x) ...
  x <= y

so deal with it by turning such RTL as:

  (set y (plus x n))
  ... (mem x) ...

without intervening uses of y into

  (set y x)
  ... (mem (post_add y n)) ...

so as to create auto-inc addresses that we'd otherwise miss.
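
At the source level the two shapes compute the same things, which a small sketch can demonstrate.  This is an illustration only: load_then_add, load_post_add and sample are hypothetical names, pointers stand in for the address registers, and the explicit "y += n" models the side effect of the post_add address.

```cpp
#include <cstddef>
#include <utility>

const int sample[4] = { 10, 20, 30, 40 };

// Original shape: y <= x + n, then a load through x.
std::pair<int, const int *> load_then_add (const int *x, std::ptrdiff_t n)
{
  const int *y = x + n;   // (set y (plus x n))
  int v = *x;             // ... (mem x) ...
  return { v, y };
}

// Transformed shape: y <= x, then a load through y that also bumps y.
std::pair<int, const int *> load_post_add (const int *x, std::ptrdiff_t n)
{
  const int *y = x;       // (set y x)
  int v = *y;             // ... (mem (post_add y n)) ...
  y += n;                 // address register updated after the access
  return { v, y };
}
```

Both functions return the value loaded through the original address together with y equal to x + n, and neither modifies x.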


for  gcc/ChangeLog

PR rtl-optimization/81611
* auto-inc-dec.c (attempt_change): Move dead note from
mem_insn if it's the next use of regno
(find_address): Take address use of reg holding
non-incremented value.  Add parm to limit search to the named
reg only.
(merge_in_block): Attempt to use a mem insn that is the next
use of the original regno.
---
 gcc/auto-inc-dec.c |  140 
 1 file changed, 130 insertions(+), 10 deletions(-)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index d02fa9d081c7..e6dc1c30d716 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg)
 before the memory reference.  */
   gcc_assert (mov_insn);
   emit_insn_before (mov_insn, inc_insn.insn);
-  move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
+  regno = REGNO (inc_insn.reg0);
+  if (reg_next_use[regno] == mem_insn.insn)
+   move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0);
+  else
+   move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
   regno = REGNO (inc_insn.reg_res);
   reg_next_def[regno] = mov_insn;
@@ -825,13 +829,15 @@ parse_add_or_inc (rtx_insn *insn, bool before_mem)
 
 /* A recursive function that checks all of the mem uses in
ADDRESS_OF_X to see if any single one of them is compatible with
-   what has been found in inc_insn.
+   what has 

gcc testsuite changes for new linker messages

2018-02-27 Thread Alan Modra
GNU ld error messages have changed to comply with the GNU coding
standards.  The two fixes in this patch look to be the only required
changes in the GCC testsuite.  I've written the prune_gcc_output patch
the way I have to try to capture the fact that the lower case "in
function" is correct for a message preceded by "ld: object(sec+off): "
but "In function" is correct when the phrase starts a sentence.
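
The intent of the pattern can be tried out with std::regex.  This is a sketch, not the testsuite code: prune_in_function is a hypothetical helper, and the alternation is shortened to a few of the keywords for brevity.  A lower-case "in" is accepted only after a "something: " prefix, while a capital "In" may also start the line.

```cpp
#include <regex>
#include <string>

// Remove "In function ..." style context lines, accepting lower-case
// "in" only when it follows a "prefix: " on the same line.
std::string prune_in_function (const std::string &text)
{
  static const std::regex re
    (R"((^|\n)([^\n]*: [iI]|I)n ((static member |lambda )?function|member|method)[^\n]*)");
  return std::regex_replace (text, re, "");
}
```

A line such as "in function foo" with no prefix is left alone, which matches the distinction described above.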

Bootstrapped and regression tested x86_64-linux.  OK to apply all
branches?

* lib/prune.exp (prune_gcc_output): Match lower case "in function"
GNU ld message.
* g++.dg/other/anon5.C: Match lower case "bad value" GNU ld message.

diff --git a/gcc/testsuite/g++.dg/other/anon5.C b/gcc/testsuite/g++.dg/other/anon5.C
index 2a6f57f..ee4601e 100644
--- a/gcc/testsuite/g++.dg/other/anon5.C
+++ b/gcc/testsuite/g++.dg/other/anon5.C
@@ -4,7 +4,7 @@
 // Ignore additional message on powerpc-ibm-aix
 // { dg-prune-output "obtain more information" } */
 // Ignore additional messages on Linux/x86 with PIE
-// { dg-prune-output "Bad value" } */
+// { dg-prune-output "\[Bb\]ad value" } */
 
 namespace {
   struct c
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 2f26c6f..1e11dc9 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -31,7 +31,7 @@ proc prune_gcc_output { text } {
 # Handle any freeform regexps.
 set text [handle-dg-regexps $text]
 
-regsub -all "(^|\n)(\[^\n\]*: )?In ((static member |lambda )?function|member|method|(copy )?constructor|destructor|instantiation|substitution|program|subroutine|block-data)\[^\n\]*" $text "" text
+regsub -all "(^|\n)(\[^\n\]*: \[iI\]|I)n ((static member |lambda )?function|member|method|(copy )?constructor|destructor|instantiation|substitution|program|subroutine|block-data)\[^\n\]*" $text "" text
 regsub -all "(^|\n)\[^\n\]*(: )?At (top level|global scope):\[^\n\]*" $text "" text
 regsub -all "(^|\n)\[^\n\]*:   (recursively )?required \[^\n\]*" $text "" text
 regsub -all "(^|\n)\[^\n\]*:   . skipping \[0-9\]* instantiation contexts \[^\n\]*" $text "" text

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] adjust warning_n() to take uhwi (PR 84207)

2018-02-27 Thread Martin Sebor

On 02/22/2018 02:15 PM, Joseph Myers wrote:

On Thu, 22 Feb 2018, Martin Sebor wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00858.html

This is just a tweak to fix a translation bug introduced by
one of my warnings (calling warning() where warning_n() is
more appropriate), and to enhance warning_n() et al. to do
the n % 100 + 100 computation so callers don't have
to worry about it.


OK in the absence of diagnostic maintainer objections within 48 hours,
with the comment saying "ngettext()" changed to remove the "()" (see the
GNU Coding Standards: "Please do not write @samp{()} after a function name
just to indicate it is a function.  @code{foo ()} is not a function, it is
a function call with no arguments.").


Committed as r258044.

Martin


[PATCH] [RFC] rs6000: -mreadonly-in-sdata (PR82411)

2018-02-27 Thread Segher Boessenkool
This adds a new option -mreadonly-in-sdata (on by default) that
controls whether readonly data can be put in sdata.  (For EABI this
does nothing, readonly data is put in sdata2 as usual).
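
The effect of the option can be summarized as a small decision function.  This is a sketch under assumptions, not the backend code: the enum and rejected_by_readonly_check are local stand-ins defined here for illustration, and the size-threshold checks that follow in the real function are not modelled.

```cpp
// Local stand-ins for the small-data model selected by -msdata=.
enum sdata_model { SDATA_NONE, SDATA_DATA, SDATA_SYSV, SDATA_EABI };

// True if a decl is kept out of small data purely because it is
// readonly and readonly data in sdata has been disabled.
bool rejected_by_readonly_check (bool decl_readonly, sdata_model model,
                                 bool readonly_in_sdata)
{
  return decl_readonly && model != SDATA_EABI && !readonly_in_sdata;
}
```

With the default (readonly_in_sdata true) nothing is rejected by this check, and under the EABI model the option has no effect either way.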

Kees, could you try this out with your use case?  Add the flag
-mno-readonly-in-sdata in your build scripts.  The patch is against
GCC trunk.

[ I'll write documentation if this works; backports (to GCC 7 and
  GCC 6) can happen as well. ]


Segher


---
 gcc/config/rs6000/rs6000.c | 5 +
 gcc/config/rs6000/sysv4.opt| 4 
 gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c | 1 +
 3 files changed, 10 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index bd0a564..dbc1d79 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32591,6 +32591,11 @@ rs6000_elf_in_small_data_p (const_tree decl)
 }
   else
 {
+  /* If we are told not to put readonly data in sdata, then don't.  */
+  if (TREE_READONLY (decl) && rs6000_sdata != SDATA_EABI
+ && !rs6000_readonly_in_sdata)
+   return false;
+
   HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (decl));
 
   if (size > 0
diff --git a/gcc/config/rs6000/sysv4.opt b/gcc/config/rs6000/sysv4.opt
index 9534c1c..fb03c0a 100644
--- a/gcc/config/rs6000/sysv4.opt
+++ b/gcc/config/rs6000/sysv4.opt
@@ -27,6 +27,10 @@ msdata=
 Target RejectNegative Joined Var(rs6000_sdata_name)
 Select method for sdata handling.
 
+mreadonly-in-sdata
+Target Report Var(rs6000_readonly_in_sdata) Init(1) Save
+Allow readonly data in sdata.
+
 mtls-size=
 Target RejectNegative Joined Var(rs6000_tls_size) Enum(rs6000_tls_size)
 Specify bit size of immediate TLS offsets.
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
index 570c81f..ee77456 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-sdata-2.c
@@ -5,6 +5,7 @@
 /* { dg-final { scan-assembler-not "\\.section\[ \t\]\\.sdata2," } } */
 /* { dg-final { scan-assembler "sdat@sdarel\\(13\\)" } } */
 /* { dg-final { scan-assembler "sdat2@sdarel\\(13\\)" } } */
+/* { dg-skip-if "" { *-*-* } { "-mno-readonly-in-sdata" } { "" } } */
 
 
 int sdat = 2;
-- 
1.8.3.1



Re: C++ PATCH to fix static init with () in a template (PR c++/84582)

2018-02-27 Thread Jason Merrill

On 02/27/2018 02:13 PM, Marek Polacek wrote:

My recent change introducing cxx_constant_init caused this code

template  class A {
   static const long b = 0;
   static const unsigned c = (b);
};

to be rejected.  The reason is that force_paren_expr turns "b" into "*(const
long int &) ", where the former is not value-dependent but the latter is
value-dependent.  So when we get to maybe_constant_init_1:
5147   if (!is_nondependent_static_init_expression (t))
5148 /* Don't try to evaluate it.  */;
it's not evaluated and we get the non-constant initialization error.
(Before we'd always evaluated the expression.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-02-27  Marek Polacek  

PR c++/84582
* semantics.c (force_paren_expr): Avoid creating a static cast
when processing a template.

* g++.dg/cpp1z/static1.C: New test.
* g++.dg/template/static37.C: New test.

diff --git gcc/cp/semantics.c gcc/cp/semantics.c
index 35569d0cb0d..b48de2df4e2 100644
--- gcc/cp/semantics.c
+++ gcc/cp/semantics.c
@@ -1697,7 +1697,7 @@ force_paren_expr (tree expr)
  expr = build1 (PAREN_EXPR, TREE_TYPE (expr), expr);
else if (VAR_P (expr) && DECL_HARD_REGISTER (expr))
  /* We can't bind a hard register variable to a reference.  */;
-  else
+  else if (!processing_template_decl)


Hmm, this means that we forget about the parentheses in a template.  I'm 
surprised that this didn't break anything in the testsuite.  In 
particular, auto-fn15.C.  I've attached an addition to auto-fn15.C to 
catch this issue.


Can we use PAREN_EXPR instead of the static_cast in a template?
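
For readers following along, a minimal sketch of why the parentheses must
survive inside a template.  It mirrors the h5/h6 pair in auto-fn15.C; the
names S, plain and parenthesized are made up for illustration and are not
part of the testsuite.

```cpp
#include <type_traits>

struct S { int i; };

// Unparenthesized member access: decltype(auto) deduces the declared
// type of the member, int.
template <typename T>
decltype(auto) plain(T t) { return t.i; }

// Parenthesized: (t.i) is an lvalue expression, so decltype(auto)
// deduces int&.  The function is never called here (the reference would
// dangle); only its deduced return type is inspected.
template <typename T>
decltype(auto) parenthesized(T t) { return (t.i); }

static_assert(std::is_same<decltype(plain(S{})), int>::value,
              "no parentheses: plain int");
static_assert(std::is_same<decltype(parenthesized(S{})), int&>::value,
              "parentheses: reference");
```

If the parentheses were dropped while processing the template, the second
assertion would be expected to fail.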

Jason
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn15.C b/gcc/testsuite/g++.dg/cpp1y/auto-fn15.C
index ba9f3579f62..0db428f7270 100644
--- a/gcc/testsuite/g++.dg/cpp1y/auto-fn15.C
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn15.C
@@ -22,6 +22,8 @@ template 
 decltype(auto) h5(T t) { return t.i; }
 template 
 decltype(auto) h6(T t) { return (t.i); }
+template 
+decltype(auto) h7(T t) { return (i); }
 
 int main()
 {
@@ -48,4 +50,5 @@ int main()
   same_type();
   same_type();
   same_type();
+  same_type();
 }


Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Håkon Sandsmark
2018-02-27 22:02 GMT+01:00 Jason Merrill :
> On 02/27/2018 03:29 PM, Jason Merrill wrote:
>>
>> On 02/27/2018 01:51 PM, Håkon Sandsmark wrote:
>>>
>>> Thanks for the feedback. I chose to take the example from the bug
>>> report verbatim as the test case.
>>>
>>> However, I agree it makes sense to have the simplest possible test
>>> case that reproduces the issue. Here is an updated patch.
>>
>>
>> Thanks!
>>
>>> +  /* If there is any qualification still in effect, clear it
>>> +   * now; we will be starting fresh with the next capture.  */
>>
>>
>> For future reference, we don't add * at the beginning of subsequent lines
>> in a comment.  I'll correct that in this patch and check it in.
>
>
> Done.  FYI I also renamed the testcase to lambda-init17.C; I sometimes like
> to run e.g. the *lambda* tests as a smoke test, and "pr12345" isn't very
> useful for that.

Thanks for the quick turnaround! I was unsure about the naming of the
file myself.

I'll definitely look into the copyright assignment for the future.

> Jason

Håkon


Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Jason Merrill

On 02/27/2018 03:29 PM, Jason Merrill wrote:

On 02/27/2018 01:51 PM, Håkon Sandsmark wrote:

Thanks for the feedback. I chose to take the example from the bug
report verbatim as the test case.

However, I agree it makes sense to have the simplest possible test
case that reproduces the issue. Here is an updated patch.


Thanks!


+  /* If there is any qualification still in effect, clear it
+   * now; we will be starting fresh with the next capture.  */


For future reference, we don't add * at the beginning of subsequent 
lines in a comment.  I'll correct that in this patch and check it in.


Done.  FYI I also renamed the testcase to lambda-init17.C; I sometimes 
like to run e.g. the *lambda* tests as a smoke test, and "pr12345" isn't 
very useful for that.


Jason


[PR c++/84426] ICE after conflicting member decl

2018-02-27 Thread Nathan Sidwell
The crash was happening because add_method wasn't telling its caller 
something went wrong.  We then ended up with a vfunc and a regular field 
on the TYPE_MEMBERS list, but only the field in the 
CLASSTYPE_MEMBER_VEC.  So we knew there was a virtual func, but couldn't 
find it.


Fixed by having add_method return false.  But that leads to the 
possibility of having NULL slots during class definition time as 
get_member_slot would always create the slot.  I should have known that 
might bite somewhere else.  So fixed by breaking get_member_slot into a 
finder and an adder.
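
The finder/adder split can be sketched outside GCC roughly as below.  This
is only an illustration of the idea: MemberVec, find_slot, add_slot and
add_entry are hypothetical names, and the container is a plain std::vector
rather than the real member vector (the complete-class case, where the
finder inserts in sorted order, is left out).

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class's member vector: (name, payload).
using MemberVec = std::vector<std::pair<std::string, int>>;

// Finder: return the existing slot, or nullptr.  Never creates anything.
int *find_slot(MemberVec &members, const std::string &name)
{
  for (auto &m : members)
    if (m.first == name)
      return &m.second;
  return nullptr;
}

// Adder: append a fresh slot.  Call only once insertion is certain.
int *add_slot(MemberVec &members, const std::string &name)
{
  members.emplace_back(name, 0);
  return &members.back().second;
}

// Caller: look up, run the checks that may fail, and only then create
// the slot, so a failure never leaves an empty slot behind.
bool add_entry(MemberVec &members, const std::string &name, int value,
               bool valid)
{
  int *slot = find_slot(members, name);
  if (!valid)
    return false;
  if (!slot)
    slot = add_slot(members, name);
  *slot = value;
  return true;
}
```

The point of the ordering in add_entry is the same as in the patch: the
slot is created only after the step that can bail out.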


nathan
--
Nathan Sidwell
2018-02-27  Nathan Sidwell  

	PR c++/84426
	* name-lookup.h (get_member_slot): Rename ...
	(find_member_slot): ... here.
	(add_member_slot): New.
	* name-lookup.c (member_vec_linear_search): No need to check for
	NULL slot.
	(get_member_slot): Rename ...
	(find_member_slot): ... here.  Don't add slot for incomplete class.
	(add_member_slot): New.
	* class.c (add_method): Adjust get_member_slot rename.  Bail out
	if push_class_level_binding fails.  Create slot and grok
	properties once we're committed to insertion.

	PR c++/84426
	* g++.dg/lookup/pr84426.C: New.

Index: cp/class.c
===
--- cp/class.c	(revision 258041)
+++ cp/class.c	(working copy)
@@ -993,14 +993,11 @@ add_method (tree type, tree method, bool
   if (method == error_mark_node)
 return false;
 
-  /* Maintain TYPE_HAS_USER_CONSTRUCTOR, etc.  */
-  grok_special_member_properties (method);
-
-  tree *slot = get_member_slot (type, DECL_NAME (method));
-  tree current_fns = *slot;
-
   gcc_assert (!DECL_EXTERN_C_P (method));
 
+  tree *slot = find_member_slot (type, DECL_NAME (method));
+  tree current_fns = slot ? *slot : NULL_TREE;
+
   /* Check to see if we've already got this method.  */
   for (ovl_iterator iter (current_fns); iter; ++iter)
 {
@@ -1146,8 +1143,15 @@ add_method (tree type, tree method, bool
 
   current_fns = ovl_insert (method, current_fns, via_using);
 
-  if (!DECL_CONV_FN_P (method) && !COMPLETE_TYPE_P (type))
-push_class_level_binding (DECL_NAME (method), current_fns);
+  if (!COMPLETE_TYPE_P (type) && !DECL_CONV_FN_P (method)
+  && !push_class_level_binding (DECL_NAME (method), current_fns))
+return false;
+
+  if (!slot)
+slot = add_member_slot (type, DECL_NAME (method));
+
+  /* Maintain TYPE_HAS_USER_CONSTRUCTOR, etc.  */
+  grok_special_member_properties (method);
 
   *slot = current_fns;
 
Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 258041)
+++ cp/name-lookup.c	(working copy)
@@ -1146,17 +1146,9 @@ static tree
 member_vec_linear_search (vec *member_vec, tree name)
 {
   for (int ix = member_vec->length (); ix--;)
-/* We can get a NULL binding during insertion of a new method
-   name, because the identifier_binding machinery performs a
-   lookup.  If we find such a NULL slot, that's the thing we were
-   looking for, so we might as well bail out immediately.  */
 if (tree binding = (*member_vec)[ix])
-  {
-	if (OVL_NAME (binding) == name)
-	  return binding;
-  }
-else
-  break;
+  if (OVL_NAME (binding) == name)
+	return binding;
 
   return NULL_TREE;
 }
@@ -1334,15 +1326,15 @@ get_class_binding (tree klass, tree name
 }
 
 /* Find the slot containing overloads called 'NAME'.  If there is no
-   such slot, create an empty one.  KLASS might be complete at this
-   point, in which case we need to preserve ordering.  Deals with
-   conv_op marker handling.  */
+   such slot and the class is complete, create an empty one, at the
+   correct point in the sorted member vector.  Otherwise return NULL.
+   Deals with conv_op marker handling.  */
 
 tree *
-get_member_slot (tree klass, tree name)
+find_member_slot (tree klass, tree name)
 {
   bool complete_p = COMPLETE_TYPE_P (klass);
-  
+
   vec *member_vec = CLASSTYPE_MEMBER_VEC (klass);
   if (!member_vec)
 {
@@ -1389,24 +1381,34 @@ get_member_slot (tree klass, tree name)
 	break;
 }
 
-  /* No slot found.  Create one at IX.  We know in this case that our
- caller will succeed in adding the function.  */
+  /* No slot found, add one if the class is complete.  */
   if (complete_p)
 {
-  /* Do exact allocation when complete, as we don't expect to add
-	 many.  */
+  /* Do exact allocation, as we don't expect to add many.  */
+  gcc_assert (name != conv_op_identifier);
   vec_safe_reserve_exact (member_vec, 1);
+  CLASSTYPE_MEMBER_VEC (klass) = member_vec;
   member_vec->quick_insert (ix, NULL_TREE);
+  return &(*member_vec)[ix];
 }
-  else
-{
-  gcc_checking_assert (ix == length);
-  vec_safe_push (member_vec, NULL_TREE);
-}
+
+  return NULL;
+}
+
+/* KLASS is an incomplete class to which we're adding a method NAME.
+   Add a slot and deal with conv_op marker 

Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Jason Merrill

On 02/27/2018 01:51 PM, Håkon Sandsmark wrote:

Thanks for the feedback. I chose to take the example from the bug
report verbatim as the test case.

However, I agree it makes sense to have the simplest possible test
case that reproduces the issue. Here is an updated patch.


Thanks!


+  /* If there is any qualification still in effect, clear it
+   * now; we will be starting fresh with the next capture.  */


For future reference, we don't add * at the beginning of subsequent 
lines in a comment.  I'll correct that in this patch and check it in.


It looks like you don't have a copyright assignment on file with the FSF 
yet.  This patch is small enough not to need one, but if (as I hope) 
you're thinking to continue contributing to GCC, you might want to file 
an assignment for future changes.


Jason


[PATCH] x86: Force __x86_indirect_thunk_reg for function call via GOT

2018-02-27 Thread H.J. Lu
For x86 targets, when -fno-plt is used, external functions are called
via GOT slot, in 64-bit mode:

[bnd] call/jmp *foo@GOTPCREL(%rip)

and in 32-bit mode:

[bnd] call/jmp *foo@GOT[(%reg)]

With -mindirect-branch=, they are converted to, in 64-bit mode:

pushq  foo@GOTPCREL(%rip)
[bnd] jmp  __x86_indirect_thunk[_bnd]

and in 32-bit mode:

pushl  foo@GOT[(%reg)]
[bnd] jmp  __x86_indirect_thunk[_bnd]

which were incompatible with CFI.  In 64-bit mode, since R11 is a scratch
register, we generate:

movq   foo@GOTPCREL(%rip), %r11
[bnd] call/jmp __x86_indirect_thunk_[bnd_]r11

instead.  We do it in ix86_output_indirect_branch so that we can use
the newly proposed R_X86_64_THUNK_GOTPCRELX relocation:

https://groups.google.com/forum/#!topic/x86-64-abi/eED5lzn3_Mg

movq   foo@GOTPCREL_THUNK(%rip), %r11
[bnd] call/jmp __x86_indirect_thunk_[bnd_]r11

to load GOT slot into R11.  If foo is defined locally, linker can
convert

movq   foo@GOTPCREL_THUNK(%rip), %reg
call/jmp   __x86_indirect_thunk_reg

to

call/jmp   foo
nop        0L(%rax)

In 32-bit mode, since all caller-saved registers, EAX, EDX and ECX, may
be used for function parameters, there is no scratch register available.
For -fno-plt -fno-pic -mindirect-branch=, we expand external function call
to:

movl   foo@GOT, %reg
[bnd] call/jmp *%reg

so that it can be converted to

movl   foo@GOT, %reg
[bnd] call/jmp __x86_indirect_thunk_[bnd_]reg

in ix86_output_indirect_branch.  Since this is performed during RTL
expansion, other instructions may be inserted between movl and call/jmp.
Linker optimization isn't always possible.

Tested on i686 and x86-64.  OK for trunk?


H.J.
---
gcc/

PR target/83970
* config/i386/constraints.md (Bs): Allow GOT_memory_operand
for TARGET_LP64 with indirect branch conversion.
(Bw): Likewise.
* config/i386/i386.c (ix86_expand_call): Handle -fno-plt with
-mindirect-branch=.
(ix86_nopic_noplt_attribute_p): Likewise.
(ix86_output_indirect_branch): In 64-bit mode, convert function
call via GOT with R11 as a scratch register using
__x86_indirect_thunk_r11.
(ix86_output_call_insn): In 64-bit mode, set xasm to NULL when
calling ix86_output_indirect_branch with function call via GOT.
* config/i386/i386.md (*call_got_thunk): New call pattern for
TARGET_LP64 with indirect branch conversion.
(*call_value_got_thunk): Likewise.

gcc/testsuite/

PR target/83970
* gcc.target/i386/indirect-thunk-5.c: Updated.
* gcc.target/i386/indirect-thunk-6.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-3.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-4.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-5.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-6.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-5.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-6.c: Likewise.
* gcc.target/i386/indirect-thunk-13.c: New test.
* gcc.target/i386/indirect-thunk-14.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-5.c: Likewise.
* gcc.target/i386/indirect-thunk-bnd-6.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-11.c: Likewise.
* gcc.target/i386/indirect-thunk-extern-12.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-8.c: Likewise.
* gcc.target/i386/indirect-thunk-inline-9.c: Likewise.
---
 gcc/config/i386/constraints.md | 14 +++-
 gcc/config/i386/i386.c | 90 +++---
 gcc/config/i386/i386.md| 36 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-13.c  | 19 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-14.c  | 20 +
 gcc/testsuite/gcc.target/i386/indirect-thunk-5.c   |  6 +-
 gcc/testsuite/gcc.target/i386/indirect-thunk-6.c   | 12 +--
 .../gcc.target/i386/indirect-thunk-bnd-3.c |  2 +-
 .../gcc.target/i386/indirect-thunk-bnd-4.c |  2 +-
 .../gcc.target/i386/indirect-thunk-bnd-5.c | 21 +
 .../gcc.target/i386/indirect-thunk-bnd-6.c | 22 ++
 .../gcc.target/i386/indirect-thunk-extern-11.c | 18 +
 .../gcc.target/i386/indirect-thunk-extern-12.c | 19 +
 .../gcc.target/i386/indirect-thunk-extern-5.c  |  6 +-
 .../gcc.target/i386/indirect-thunk-extern-6.c  |  8 +-
 .../gcc.target/i386/indirect-thunk-inline-5.c  |  3 +-
 .../gcc.target/i386/indirect-thunk-inline-6.c  |  3 +-
 .../gcc.target/i386/indirect-thunk-inline-8.c  | 18 +
 .../gcc.target/i386/indirect-thunk-inline-9.c  | 19 +
 19 files changed, 300 insertions(+), 38 deletions(-)
 create mode 100644 

[Patch, fortran] PR83901 - [8 Regression] ICE in fold_convert_loc, at fold-const.c:2402

2018-02-27 Thread Paul Richard Thomas
Hi All,

I will commit this patch as obvious tomorrow night unless there are
objections in the meantime.

Bootstraps and regtests on FC27/x86_64 - OK?

Paul

2018-02-27  Paul Thomas  

PR fortran/83901
* trans-stmt.c (trans_associate_var): Make sure that the se
expression is a pointer type before converting it to the symbol
backend_decl type.

2018-02-27  Paul Thomas  

PR fortran/83901
* gfortran.dg/associate_37.f90: New test.
Index: gcc/fortran/trans-stmt.c
===
*** gcc/fortran/trans-stmt.c(revision 257969)
--- gcc/fortran/trans-stmt.c(working copy)
*** trans_associate_var (gfc_symbol *sym, gf
*** 1907,1913 
  
attr = gfc_expr_attr (e);
if (sym->ts.type == BT_CHARACTER && e->ts.type == BT_CHARACTER
! && (attr.allocatable || attr.pointer || attr.dummy))
{
  /* These are pointer types already.  */
  tmp = fold_convert (TREE_TYPE (sym->backend_decl), se.expr);
--- 1907,1914 
  
attr = gfc_expr_attr (e);
if (sym->ts.type == BT_CHARACTER && e->ts.type == BT_CHARACTER
! && (attr.allocatable || attr.pointer || attr.dummy)
! && POINTER_TYPE_P (TREE_TYPE (se.expr)))
{
  /* These are pointer types already.  */
  tmp = fold_convert (TREE_TYPE (sym->backend_decl), se.expr);
Index: gcc/testsuite/gfortran.dg/associate_36.f90
===
*** gcc/testsuite/gfortran.dg/associate_36.f90  (revision 257969)
--- gcc/testsuite/gfortran.dg/associate_36.f90  (working copy)
***
*** 2,8 
  !
  ! Test the fix for PR83344.
  !
! ! Contributed by 
  !
  program foo
 implicit none
--- 2,9 
  !
  ! Test the fix for PR83344.
  !
! ! Contributed by Janne Blomqvist  
! ! and Steve Kargl  
  !
  program foo
 implicit none
Index: gcc/testsuite/gfortran.dg/associate_37.f90
===
*** gcc/testsuite/gfortran.dg/associate_37.f90  (nonexistent)
--- gcc/testsuite/gfortran.dg/associate_37.f90  (working copy)
***
*** 0 
--- 1,15 
+ ! { dg-do run }
+ ! { dg-options "-fcoarray=single" }
+ !
+ ! Tests the fix for the regression PR83901.
+ !
+ ! Contributed by G Steinmetz  
+ !
+ program p
+character(8), allocatable :: x[:]
+allocate (x[*])
+x = 'abc'
+associate (y => x)
+  if (y .ne. 'abc') stop 1
+end associate
+ end


Re: [patch] better explain $target.h vs $target-protos.h in internals manual

2018-02-27 Thread Richard Sandiford
Sandra Loosemore  writes:
> Following up on discussion in gcc@, how does this documentation patch look?
>
> https://gcc.gnu.org/ml/gcc/2018-02/msg00139.html

Looks good to me FWIW.

Richard


C++ PATCH to fix static init with () in a template (PR c++/84582)

2018-02-27 Thread Marek Polacek
My recent change introducing cxx_constant_init caused this code

template  class A {
  static const long b = 0;
  static const unsigned c = (b);
};

to be rejected.  The reason is that force_paren_expr turns "b" into "*(const
long int &) ", where the former is not value-dependent but the latter is
value-dependent.  So when we get to maybe_constant_init_1:
5147   if (!is_nondependent_static_init_expression (t))
5148 /* Don't try to evaluate it.  */;
it's not evaluated and we get the non-constant initialization error.
(Before we'd always evaluated the expression.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-02-27  Marek Polacek  

PR c++/84582
* semantics.c (force_paren_expr): Avoid creating a static cast
when processing a template.

* g++.dg/cpp1z/static1.C: New test.
* g++.dg/template/static37.C: New test.

diff --git gcc/cp/semantics.c gcc/cp/semantics.c
index 35569d0cb0d..b48de2df4e2 100644
--- gcc/cp/semantics.c
+++ gcc/cp/semantics.c
@@ -1697,7 +1697,7 @@ force_paren_expr (tree expr)
 expr = build1 (PAREN_EXPR, TREE_TYPE (expr), expr);
   else if (VAR_P (expr) && DECL_HARD_REGISTER (expr))
 /* We can't bind a hard register variable to a reference.  */;
-  else
+  else if (!processing_template_decl)
 {
   cp_lvalue_kind kind = lvalue_kind (expr);
   if ((kind & ~clk_class) != clk_none)
diff --git gcc/testsuite/g++.dg/cpp1z/static1.C 
gcc/testsuite/g++.dg/cpp1z/static1.C
index e69de29bb2d..cb872997c5a 100644
--- gcc/testsuite/g++.dg/cpp1z/static1.C
+++ gcc/testsuite/g++.dg/cpp1z/static1.C
@@ -0,0 +1,19 @@
+// PR c++/84582
+// { dg-options -std=c++17 }
+
+class C {
+  static inline const long b = 0;
+  static inline const unsigned c = (b);
+};
+class D {
+  static inline const long b = 0;
+  static inline const unsigned c = b;
+};
+template  class A {
+  static inline const long b = 0;
+  static inline const unsigned c = (b);
+};
+template  class B {
+  static inline const long b = 0;
+  static inline const unsigned c = b;
+};
diff --git gcc/testsuite/g++.dg/template/static37.C 
gcc/testsuite/g++.dg/template/static37.C
index e69de29bb2d..90bc65d2fbc 100644
--- gcc/testsuite/g++.dg/template/static37.C
+++ gcc/testsuite/g++.dg/template/static37.C
@@ -0,0 +1,18 @@
+// PR c++/84582
+
+class C {
+  static const long b = 0;
+  static const unsigned c = (b);
+};
+class D {
+  static const long b = 0;
+  static const unsigned c = b;
+};
+template  class A {
+  static const long b = 0;
+  static const unsigned c = (b);
+};
+template  class B {
+  static const long b = 0;
+  static const unsigned c = b;
+};

Marek


Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Håkon Sandsmark
Hi,

Thanks for the feedback. I chose to take the example from the bug
report verbatim as the test case.

However, I agree it makes sense to have the simplest possible test
case that reproduces the issue. Here is an updated patch.

2018-02-27  Håkon Sandsmark  

PR c++/71546 - lambda capture fails with "was not declared in this scope"
* parser.c (cp_parser_lambda_introducer): Clear scope after
  each lambda capture.
* g++.dg/cpp1y/pr71546.C: New test.

2018-02-27 19:15 GMT+01:00 Paolo Carlini :
> .. or even:
>
> namespace n { struct make_shared { }; }
>
> int main()
> {
>   int x1;
>   [e = n::make_shared (), x1]() {};
> }
>
> I.e., I don't think the fact that std::make_shared is a template plays a
> specific role here.
>
> Paolo.
diff --git gcc/cp/parser.c gcc/cp/parser.c
index bcee1214c2f..fc11f9126d3 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -10440,6 +10440,12 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
 		   capture_init_expr,
 		   /*by_reference_p=*/capture_kind == BY_REFERENCE,
 		   explicit_init_p);
+
+  /* If there is any qualification still in effect, clear it
+   * now; we will be starting fresh with the next capture.  */
+  parser->scope = NULL_TREE;
+  parser->qualifying_scope = NULL_TREE;
+  parser->object_scope = NULL_TREE;
 }
 
   cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
diff --git gcc/testsuite/g++.dg/cpp1y/pr71546.C gcc/testsuite/g++.dg/cpp1y/pr71546.C
new file mode 100644
index 000..934a6b32364
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1y/pr71546.C
@@ -0,0 +1,11 @@
+// PR c++/71546
+// { dg-do compile { target c++14 } }
+// { dg-options "" }
+
+namespace n { struct make_shared { }; }
+
+int main()
+{
+  int x1;
+  [e = n::make_shared (), x1]() {};
+}


Re: [PATCH][AArch64] PR84114: Avoid reassociating FMA

2018-02-27 Thread Aaron Sawdey
On Tue, 2018-02-27 at 14:21 +, Wilco Dijkstra wrote:
> Richard Biener 
> 
> > It happens that on some targets doing two FMAs in parallel and one
> > non-FMA operation merging them is faster than chaining three
> > FMAs...
> 
> Like I mentioned in the PR, long chains should be broken, but for
> that we need a new parameter to state how long a chain may be before
> it is split. The issue today is that it splits even very short
> chains, removing beneficial FMAs.
> 
> > But yes, somewhere I suggested that FMA detection should/could be
> > integrated with reassociation.

I'd also like to see some work here. 

Doing two FMAs in parallel and then a non-FMA merge is faster on ppc,
but it would be nice if the target had some more control over exactly how
this happens.

Also doing parallel reassociation increases register pressure so it
would be nice to be able to avoid causing issues as a result of that.
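
To make the two shapes concrete, here is a small sketch at the source
level.  The function names are made up, and std::fma merely stands in for
whatever the target's FMA instruction would be; it says nothing about what
the reassociation pass actually emits.

```cpp
#include <cmath>

// Chained form: one multiply, then three FMAs, each depending on the
// result of the previous one.
inline double sum4_chained(double a, double b, double c, double d,
                           double e, double f, double g, double h)
{
  return std::fma(a, b, std::fma(c, d, std::fma(e, f, g * h)));
}

// Split form: two independent multiply+FMA pairs that can run in
// parallel, merged by a plain add.  Both halves are live at the same
// time, which is where the extra register pressure comes from.
inline double sum4_split(double a, double b, double c, double d,
                         double e, double f, double g, double h)
{
  double lhs = std::fma(a, b, c * d);
  double rhs = std::fma(e, f, g * h);
  return lhs + rhs;
}
```

Both compute a*b + c*d + e*f + g*h; they differ only in dependency depth
and in where rounding happens.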

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [C++] [PR84231] overload on cond_expr in template

2018-02-27 Thread Jason Merrill
On Tue, Feb 27, 2018 at 1:05 PM, Alexandre Oliva  wrote:
> On Feb 15, 2018, Jason Merrill  wrote:
>
>> On Thu, Feb 8, 2018 at 9:09 PM, Alexandre Oliva  wrote:
>>> + /* If it was supposed to be an rvalue but it's not, adjust
>>> +one of the operands so that any overload resolution
>>> +taking this COND_EXPR as an operand makes the correct
>>> +decisions.  See c++/84231.  */
>>> + TREE_OPERAND (min, 2) = build1_loc (loc, NON_LVALUE_EXPR,
>>> + TREE_TYPE (min),
>>> + TREE_OPERAND (min, 2));
>>> + EXPR_LOCATION_WRAPPER_P (TREE_OPERAND (min, 2)) = 1;
>
>> But that's not true, this isn't a location wrapper, it has semantic
>> effect.  And would be the first such use of NON_LVALUE_EXPR in a
>> template.
>
> Yeah.  At first I thought NON_LVALUE_EXPR was the way to go, as the
> traditional way to denote non-lvalues, but when that didn't work, I
> investigated and saw if I marked it as a location wrapper, it would have
> the intended effect of stopping the template-dependent cond_expr from
> being regarded as an lvalue, while being dropped when tsubsting the
> cond_expr, so it had no ill effects AFAICT.
>
>> Since we're already using the type of the COND_EXPR to indicate a
>> glvalue, maybe lvalue_kind should say that within a template, a
>> COND_EXPR which got past the early check for reference type is a
>> prvalue.
>
> I suppose you mean something like this:
>
> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
> index 9b9e36a1173f..76148c876b71 100644
> --- a/gcc/cp/tree.c
> +++ b/gcc/cp/tree.c
> @@ -194,6 +194,14 @@ lvalue_kind (const_tree ref)
>break;
>
>  case COND_EXPR:
> +  /* Except for type-dependent exprs, a REFERENCE_TYPE will
> +indicate whether its result is an lvalue or so.
> +REFERENCE_TYPEs are handled above, so if we reach this point,
> +we know we got an rvalue, unless we have a type-dependent
> +expr.  */
> +  if (processing_template_decl
> + && !type_dependent_expression_p (CONST_CAST_TREE (ref)))
> +   return clk_none;
>op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 1)
> ? TREE_OPERAND (ref, 1)
> : TREE_OPERAND (ref, 0));
>
> but there be dragons here.  build_x_conditional_expr tests
> glvalue_p on the proxy and the template expr, and glvalue_p uses
> lvalue_kind, so we have to disable this new piece of logic for the
> baseline so that we don't unintentionally change the lvalueness of the
> COND_EXPR.
>
> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> index 0e7c63dd1973..a34cb6ec175f 100644
> --- a/gcc/cp/typeck.c
> +++ b/gcc/cp/typeck.c
> @@ -6565,11 +6565,25 @@ build_x_conditional_expr (location_t loc, tree ifexp, 
> tree op1, tree op2,
>  {
>tree min = build_min_non_dep (COND_EXPR, expr,
> orig_ifexp, orig_op1, orig_op2);
> -  /* Remember that the result is an lvalue or xvalue.  */
> -  if (glvalue_p (expr) && !glvalue_p (min))
> -   TREE_TYPE (min) = cp_build_reference_type (TREE_TYPE (min),
> -  !lvalue_p (expr));
> +  /* Remember that the result is an lvalue or xvalue.  We have to
> +pretend EXPR is type-dependent, lest we short-circuit the
> +very logic we want to rely on.  */
> +  tree save_expr_type = TREE_TYPE (expr);
> +
> +  if (!type_dependent_expression_p (expr)
> + && TREE_CODE (save_expr_type) != REFERENCE_TYPE)
> +   TREE_TYPE (expr) = NULL_TREE;
> +
> +  bool glvalue = glvalue_p (expr);
> +  bool reftype = glvalue && !glvalue_p (min);
> +  bool lval = reftype ? lvalue_p (expr) : false;
> +
> +  TREE_TYPE (expr) = save_expr_type;
> +
> +  if (reftype)
> +   TREE_TYPE (min) = cp_build_reference_type (TREE_TYPE (min), !lval);
>expr = convert_from_reference (min);
> +  gcc_assert (glvalue_p (min) == glvalue);
>  }
>return expr;
>  }
>
>
> Even then, there are other surprises I'm trying to track down (libstdc++
> optimized headers won't build with the two patchlets above); my guess is
> that it's out of non-template-dependent cond_exprs' transitions from
> non-lvalue to lvalue as we finish template substitution and
> processing_template_decl becomes zero.
>
> This is getting hairy enough that I'm wondering if that's really what
> you had in mind, so I decided to touch base in case I had to be put back
> on the right track (or rather out of the wrong track again ;-)

Perhaps it would be easier to add the REFERENCE_TYPE in
build_conditional_expr_1, adjusting result_type based on
processing_template_decl and is_lvalue.

Jason


Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Paolo Carlini

.. or even:

namespace n { struct make_shared { }; }

int main()
{
  int x1;
  [e = n::make_shared (), x1]() {};
}

I.e., I don't think the fact that std::make_shared is a template plays a 
specific role here.


Paolo.


Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals

2018-02-27 Thread Jason Merrill
OK.

On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely  wrote:
> Since the fix for PR c++/80955 any suffix on a string literal that
> begins with an underscore is assumed to be a user-defined literal
> suffix, not a macro. This assumption is invalid for a suffix beginning
> with two underscores, because such names are reserved and can't be used
> for UDLs anyway. Checking for exactly one underscore restores support
> for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
> (which are formally ill-formed but accepted with a warning, as a
> conforming extension).
>
> gcc/testsuite:
>
> PR preprocessor/84517
> * g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.
>
> libcpp:
>
> PR preprocessor/84517
> * lex.c (is_macro_not_literal_suffix): New function.
> (lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
> decide when to issue -Wliteral-suffix warnings.
>
>
> Tested powerpc64le-linux, OK for trunk?
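
The "exactly one underscore" rule described above can be sketched as a
tiny predicate.  This is a hypothetical helper written for illustration
only; it is not the is_macro_not_literal_suffix function from the patch
and makes no claim about how libcpp implements the check.

```cpp
// True when the text following a string literal could be a user-defined
// literal suffix: it starts with one underscore, not two.  Names that
// start with two underscores are reserved, so they are left to macro
// expansion instead.
constexpr bool could_be_udl_suffix(const char *s)
{
  return s[0] == '_' && s[1] != '_';
}

static_assert(could_be_udl_suffix("_km"), "single underscore: UDL suffix");
static_assert(!could_be_udl_suffix("__FILE__"), "reserved: treat as macro");
static_assert(!could_be_udl_suffix("FILE"), "no underscore: not a UDL suffix");
```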


Re: [PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Paolo Carlini

Hi,

I only have a simple comment about the testcase:

On 27/02/2018 17:42, Håkon Sandsmark wrote:

+++ gcc/testsuite/g++.dg/cpp1y/pr71546.C
@@ -0,0 +1,11 @@
+// PR c++/71546
+// { dg-do compile { target c++14 } }
+// { dg-options "" }
+
+#include 
+
+int main()
+{
+  int x1;
+  [e = std::make_shared  (), x1]() {};
+}

Instead of including the whole <memory>, shall we use something like:

namespace std { template struct make_shared { }; }

int main()
{
  int x1;
  [e = std::make_shared  (), x1]() {};
}

???

Thanks,
Paolo


Re: [C++] [PR84231] overload on cond_expr in template

2018-02-27 Thread Alexandre Oliva
On Feb 15, 2018, Jason Merrill  wrote:

> On Thu, Feb 8, 2018 at 9:09 PM, Alexandre Oliva  wrote:
>> + /* If it was supposed to be an rvalue but it's not, adjust
>> +one of the operands so that any overload resolution
>> +taking this COND_EXPR as an operand makes the correct
>> +decisions.  See c++/84231.  */
>> + TREE_OPERAND (min, 2) = build1_loc (loc, NON_LVALUE_EXPR,
>> + TREE_TYPE (min),
>> + TREE_OPERAND (min, 2));
>> + EXPR_LOCATION_WRAPPER_P (TREE_OPERAND (min, 2)) = 1;

> But that's not true, this isn't a location wrapper, it has semantic
> effect.  And would be the first such use of NON_LVALUE_EXPR in a
> template.

Yeah.  At first I thought NON_LVALUE_EXPR was the way to go, as the
traditional way to denote non-lvalues, but when that didn't work, I
investigated and saw if I marked it as a location wrapper, it would have
the intended effect of stopping the template-dependent cond_expr from
being regarded as an lvalue, while being dropped when tsubsting the
cond_expr, so it had no ill effects AFAICT.

> Since we're already using the type of the COND_EXPR to indicate a
> glvalue, maybe lvalue_kind should say that within a template, a
> COND_EXPR which got past the early check for reference type is a
> prvalue.

I suppose you mean something like this:

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 9b9e36a1173f..76148c876b71 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -194,6 +194,14 @@ lvalue_kind (const_tree ref)
   break;
 
 case COND_EXPR:
+  /* Except for type-dependent exprs, a REFERENCE_TYPE will
+indicate whether its result is an lvalue or so.
+REFERENCE_TYPEs are handled above, so if we reach this point,
+we know we got an rvalue, unless we have a type-dependent
+expr.  */
+  if (processing_template_decl
+ && !type_dependent_expression_p (CONST_CAST_TREE (ref)))
+   return clk_none;
   op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 1)
? TREE_OPERAND (ref, 1)
: TREE_OPERAND (ref, 0));

but there be dragons here.  build_x_conditional_expr tests
glvalue_p on the proxy and the template expr, and glvalue_p uses
lvalue_kind, so we have to disable this new piece of logic for the
baseline so that we don't unintentionally change the lvalueness of the
COND_EXPR.
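
For reference, the user-visible behavior at stake is roughly the
following.  This is a sketch, not the testcase from the PR; which,
both_lvalues and mixed are names invented for the example.

```cpp
// Two overloads that report the value category of their argument.
constexpr int which(int &)  { return 1; }   // lvalue
constexpr int which(int &&) { return 2; }   // rvalue

// Non-dependent conditional inside a template, both arms lvalues of the
// same type: the whole expression is an lvalue.
template <typename>
constexpr int both_lvalues(bool b)
{
  int x = 1, y = 2;
  return which(b ? x : y);
}

// One arm is a prvalue: the whole expression is a prvalue.
template <typename>
constexpr int mixed(bool b)
{
  int x = 1;
  return which(b ? x : 2);
}

static_assert(both_lvalues<void>(true) == 1, "lvalue overload chosen");
static_assert(mixed<void>(true) == 2, "rvalue overload chosen");
```

Overload resolution inside the template has to agree with these answers
even though the COND_EXPR is only represented in its template form at that
point.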

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 0e7c63dd1973..a34cb6ec175f 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -6565,11 +6565,25 @@ build_x_conditional_expr (location_t loc, tree ifexp, 
tree op1, tree op2,
 {
   tree min = build_min_non_dep (COND_EXPR, expr,
orig_ifexp, orig_op1, orig_op2);
-  /* Remember that the result is an lvalue or xvalue.  */
-  if (glvalue_p (expr) && !glvalue_p (min))
-   TREE_TYPE (min) = cp_build_reference_type (TREE_TYPE (min),
-  !lvalue_p (expr));
+  /* Remember that the result is an lvalue or xvalue.  We have to
+pretend EXPR is type-dependent, lest we short-circuit the
+very logic we want to rely on.  */
+  tree save_expr_type = TREE_TYPE (expr);
+
+  if (!type_dependent_expression_p (expr)
+ && TREE_CODE (save_expr_type) != REFERENCE_TYPE)
+   TREE_TYPE (expr) = NULL_TREE;
+  
+  bool glvalue = glvalue_p (expr);
+  bool reftype = glvalue && !glvalue_p (min);
+  bool lval = reftype ? lvalue_p (expr) : false;
+
+  TREE_TYPE (expr) = save_expr_type;
+
+  if (reftype)
+   TREE_TYPE (min) = cp_build_reference_type (TREE_TYPE (min), !lval);
   expr = convert_from_reference (min);
+  gcc_assert (glvalue_p (min) == glvalue);
 }
   return expr;
 }


Even then, there are other surprises I'm trying to track down (libstdc++
optimized headers won't build with the two patchlets above); my guess is
that it's out of non-template-dependent cond_exprs' transitions from
non-lvalue to lvalue as we finish template substitution and
processing_template_decl becomes zero.

This is getting hairy enough that I'm wondering if that's really what
you had in mind, so I decided to touch base in case I had to be put back
on the right track (or rather out of the wrong track again ;-)

Thanks,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


C++ PATCH for c++/84489, dependent default template argument

2018-02-27 Thread Jason Merrill
The logic in type_unification_real for handling template parms that
depend on earlier template parms is a bit complicated.  It already
recognizes when the type of the parm depends on something not
available yet, and it dealt with the case where substituting partial
args left some template parm uses behind, but it didn't handle the
case where substituting partial args just failed.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit dade5ae09222e696b2aaa05d730e2414b3568bd4
Author: Jason Merrill 
Date:   Mon Feb 26 23:38:10 2018 -0500

PR c++/84489 - dependent default template argument

* pt.c (type_unification_real): Handle early substitution failure.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 40c897aadc9..2a64fa6d9ad 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19831,21 +19831,28 @@ type_unification_real (tree tparms,
continue;
  tree parm = TREE_VALUE (tparm);
 
- if (TREE_CODE (parm) == PARM_DECL
- && uses_template_parms (TREE_TYPE (parm))
- && saw_undeduced < 2)
-   continue;
+ tsubst_flags_t fcomplain = complain;
+ if (saw_undeduced == 1)
+   {
+ /* When saw_undeduced == 1, substitution into parm and arg might
+fail or not replace all template parameters, and that's
+fine.  */
+ fcomplain = tf_none;
+ if (TREE_CODE (parm) == PARM_DECL
+ && uses_template_parms (TREE_TYPE (parm)))
+   continue;
+   }
 
  tree arg = TREE_PURPOSE (tparm);
  reopen_deferring_access_checks (*checks);
  location_t save_loc = input_location;
  if (DECL_P (parm))
input_location = DECL_SOURCE_LOCATION (parm);
- arg = tsubst_template_arg (arg, full_targs, complain, NULL_TREE);
- if (!uses_template_parms (arg))
+ arg = tsubst_template_arg (arg, full_targs, fcomplain, NULL_TREE);
+ if (arg != error_mark_node && !uses_template_parms (arg))
arg = convert_template_argument (parm, arg, full_targs, complain,
 i, NULL_TREE);
- else if (saw_undeduced < 2)
+ else if (saw_undeduced == 1)
arg = NULL_TREE;
  else
arg = error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg7.C b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg7.C
new file mode 100644
index 000..636bf1afd88
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg7.C
@@ -0,0 +1,10 @@
+// PR c++/84489
+// { dg-do compile { target c++11 } }
+
+template > 1)>
+T f1() {return 0;}
+
+int main()
+{
+  f1(); // Bug here
+}


Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals

2018-02-27 Thread Jonathan Wakely
On 27 February 2018 at 16:59, Jonathan Wakely  wrote:
> On 27 February 2018 at 16:49, Tim Song  wrote:
>> On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely wrote:
>>> Since the fix for PR c++/80955 any suffix on a string literal that
>>> begins with an underscore is assumed to be a user-defined literal
>>> suffix, not a macro. This assumption is invalid for a suffix beginning
>>> with two underscores, because such names are reserved and can't be used
>>> for UDLs anyway.
>>
>> [lex.name]/3 reserves all identifiers containing double underscore,
>> but the spaceless one-token form does not actually use such an
>> identifier.
>>
>> See the (equally reserved) _Bq example in [over.literal]/8:
>>
>> double operator""_Bq(long double);  // OK: does not use the reserved identifier _Bq ([lex.name])
>> double operator"" _Bq(long double); // uses the reserved identifier _Bq ([lex.name])
>
> I know, but GCC doesn't implement the rule accurately. I reported PR
> 80955 because GCC's UDL parsing meant we reject valid programs. The
> fix for that bug was a bit of a hack, simply adding a special case so
> that a suffix starting with an underscore is never expanded as a
> macro, which makes the above examples do the right thing. But that
> introduces a regression where we no longer accept ""__FILE__ (because
> it starts with an underscore), where previously that was accepted as
> an extension, just like "%"PRIu64 is accepted (with a warning).
>
> After the hack for 80955 we still do the wrong thing for:
>
> #define foo
> int operator""foo();
>
> But that can't appear in a valid program, so we get away with it. But
> it's still a hack.
>
> My patch doesn't try to change how we parse operator""_Bq it just
> ensures we accept ""__FILE__ and similar cases. I think my patch means
> we reject:
>
> int operator""__X(unsigned long long);
>
> But that also can't appear in a valid program, so again we get away with it.

Sorry for being inaccurate, what I mean is the problem case where a
macro clashes with a UDL can't appear (because defining the macro
would be invalid):

#define __X
int operator""__X(unsigned long long);

However, that doesn't help for this case:

int operator""__FILE__(unsigned long long);

This is allowed as a UDL, and doesn't use a reserved identifier
(because it doesn't define __FILE__, the implementation does), but
would fail after my patch because __FILE__ gets incorrectly expanded
as a macro. I don't have much sympathy for anybody defining such a
UDL.


Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals

2018-02-27 Thread Jonathan Wakely
On 27 February 2018 at 16:49, Tim Song  wrote:
> On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely wrote:
>> Since the fix for PR c++/80955 any suffix on a string literal that
>> begins with an underscore is assumed to be a user-defined literal
>> suffix, not a macro. This assumption is invalid for a suffix beginning
>> with two underscores, because such names are reserved and can't be used
>> for UDLs anyway.
>
> [lex.name]/3 reserves all identifiers containing double underscore,
> but the spaceless one-token form does not actually use such an
> identifier.
>
> See the (equally reserved) _Bq example in [over.literal]/8:
>
> double operator""_Bq(long double);  // OK: does not use the reserved identifier _Bq ([lex.name])
> double operator"" _Bq(long double); // uses the reserved identifier _Bq ([lex.name])

I know, but GCC doesn't implement the rule accurately. I reported PR
80955 because GCC's UDL parsing meant we reject valid programs. The
fix for that bug was a bit of a hack, simply adding a special case so
that a suffix starting with an underscore is never expanded as a
macro, which makes the above examples do the right thing. But that
introduces a regression where we no longer accept ""__FILE__ (because
it starts with an underscore), where previously that was accepted as
an extension, just like "%"PRIu64 is accepted (with a warning).

After the hack for 80955 we still do the wrong thing for:

#define foo
int operator""foo();

But that can't appear in a valid program, so we get away with it. But
it's still a hack.

My patch doesn't try to change how we parse operator""_Bq it just
ensures we accept ""__FILE__ and similar cases. I think my patch means
we reject:

int operator""__X(unsigned long long);

But that also can't appear in a valid program, so again we get away with it.


Re: [PATCH] Fix gcc.target/i386/pr84309.c testcase (PR target/84575)

2018-02-27 Thread H.J. Lu
On Tue, Feb 27, 2018 at 1:01 AM, Jakub Jelinek  wrote:
>

NOPATCH.


-- 
H.J.


Re: [PATCH] PR preprocessor/84517 allow double-underscore macros after string literals

2018-02-27 Thread Tim Song
On Tue, Feb 27, 2018 at 9:41 AM, Jonathan Wakely  wrote:
> Since the fix for PR c++/80955 any suffix on a string literal that
> begins with an underscore is assumed to be a user-defined literal
> suffix, not a macro. This assumption is invalid for a suffix beginning
> with two underscores, because such names are reserved and can't be used
> for UDLs anyway.

[lex.name]/3 reserves all identifiers containing double underscore,
but the spaceless one-token form does not actually use such an
identifier.

See the (equally reserved) _Bq example in [over.literal]/8:

double operator""_Bq(long double);  // OK: does not use the reserved identifier _Bq ([lex.name])
double operator"" _Bq(long double); // uses the reserved identifier _Bq ([lex.name])


[PATCH] Fix gcc.target/i386/pr84309.c testcase (PR target/84575)

2018-02-27 Thread Jakub Jelinek




[PATCH] Fix PR c++/71546 - lambda capture fails with "was not declared in this scope"

2018-02-27 Thread Håkon Sandsmark
Hi GCC developers,

I have attached a proposed patch for fixing PR c++/71546 - lambda
capture fails with "was not declared in this scope".

The patch clears the parser scope after each lambda capture in
cp_parser_lambda_introducer in parser.c. This is based on the
following observations:

Comment about cp_parser::scope in parser.h:
"This value is not cleared automatically after a name is looked
up, so we must be careful to clear it before starting a new look
up sequence.  (If it is not cleared, then `X::Y' followed by `Z'
will look up `Z' in the scope of `X', rather than the current
scope.)"

C++14 standard draft N4140 § 5.1.2 paragraph 10:
"The identifier in a simple-capture is looked up using
the usual rules for unqualified name lookup (3.4.1);
each such lookup shall find an entity."

I have compared the test results from a pristine build (with test case
from PR added) with a bootstrapped build with my patch applied (using
x86_64-linux). This is the output I got from the compare_tests tool:

$ gcc/contrib/compare_tests gcc-pristine-build gcc-patched-build
# Comparing directories
## Dir1=gcc-pristine-build: 6 sum files
## Dir2=gcc-patched-build: 6 sum files

# Comparing 6 common sum files
## /bin/sh gcc/contrib/compare_tests  /tmp/gxx-sum1.95415 /tmp/gxx-sum2.95415
Tests that now work, but didn't before:

g++.dg/cpp1y/pr71546.C  -std=gnu++14 (test for excess errors)

# No differences found in 6 common sum files

2018-02-27  Håkon Sandsmark  

* parser.c (cp_parser_lambda_introducer): Clear scope after
  each lambda capture.

* g++.dg/cpp1y/pr71546.C: New test.
diff --git gcc/cp/parser.c gcc/cp/parser.c
index bcee1214c2f..fc11f9126d3 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -10440,6 +10440,12 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
 		   capture_init_expr,
 		   /*by_reference_p=*/capture_kind == BY_REFERENCE,
 		   explicit_init_p);
+
+  /* If there is any qualification still in effect, clear it
+   * now; we will be starting fresh with the next capture.  */
+  parser->scope = NULL_TREE;
+  parser->qualifying_scope = NULL_TREE;
+  parser->object_scope = NULL_TREE;
 }
 
   cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
diff --git gcc/testsuite/g++.dg/cpp1y/pr71546.C gcc/testsuite/g++.dg/cpp1y/pr71546.C
new file mode 100644
index 000..861563aacf9
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1y/pr71546.C
@@ -0,0 +1,11 @@
+// PR c++/71546
+// { dg-do compile { target c++14 } }
+// { dg-options "" }
+
+#include <memory>
+
+int main()
+{
+  int x1;
+  [e = std::make_shared  (), x1]() {};
+}


[arm-embedded] Allow -mcpu=cortex-m33+nodsp

2018-02-27 Thread Thomas Preudhomme

Hi, we decided to apply the following patch to ARM/embedded-7-branch to
support -mcpu=cortex-m33+nodsp.

DSP instructions are optional for Arm Cortex-M33, yet its -mcpu option
does not allow +nodsp. Users are thus left with using
-march=armv8-m.main -mtune=cortex-m33. This patch creates a new CPU,
cortex-m33+nodsp, since GCC 7 has no mechanism for CPU extensions.
Since GCC passes the -mcpu parameter down to GAS verbatim and GAS does
not support +nodsp for cortex-m33, this patch also special-cases
-mcpu=cortex-m33+nodsp in arm_file_start to output a .arch directive
instead of .cpu.

2018-02-26  Thomas Preud'homme  

* config/arm/arm-cpus.in (cortex-m33+nodsp): New CPU.
* config/arm/arm-cpu-cdata.h: Regenerate.
* config/arm/arm-cpu-data.h: Likewise.
* config/arm/arm-cpu.h: Likewise.
* config/arm/arm-tables.opt: Likewise.
* config/arm/arm-tune.md: Likewise.
* config/arm/arm.c (arm_file_start): Special case
-mcpu=cortex-m33+nodsp to emit .arch armv8-m.main instead.
* doc/invoke.texi: Document cortex-m33+nodsp as a valid value for -mcpu
and -mtune.

Testing: Compiled a hello world with -S -mcpu=cortex-m33 and with
-S -mcpu=cortex-m33+nodsp and compared both assembly files. The latter
correctly emits .arch armv8-m.main instead of .cpu cortex-m33.

Best regards,

Thomas
diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index a98ecb028f6800a516f6cd252390ceac1e08911b..e09bd132d224aee511591143d86efff8bb156d60 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,3 +1,9 @@
+2018-02-26  Thomas Preud'homme  
+
+	* config/arm/arm-cpus.in (cortex-m33+nodsp): Define.
+	* doc/invoke.texi: Document +nodsp as a valid extension for
+	-mcpu=cortex-m33.
+
 2017-11-23  Thomas Preud'homme  
 
 	Cherry-pick from GCC 7
diff --git a/gcc/config/arm/arm-cpu-cdata.h b/gcc/config/arm/arm-cpu-cdata.h
index 27571c841d928fe9c331006bfc9608c4e75b60d8..f5e34c830ca28196ded0912c230f719a6ff5681e 100644
--- a/gcc/config/arm/arm-cpu-cdata.h
+++ b/gcc/config/arm/arm-cpu-cdata.h
@@ -789,6 +789,13 @@ static const struct arm_arch_core_flag arm_arch_core_flags[] =
 },
   },
   {
+"cortex-m33+nodsp",
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+  },
+  {
 "cortex-r52",
 {
   ISA_ARMv8r,isa_bit_crc32,
diff --git a/gcc/config/arm/arm-cpu-data.h b/gcc/config/arm/arm-cpu-data.h
index e474efa02ed93a93ae00ac2057a9bc841c48b87f..30902ecabc6c72e46e6f6aa1d92b9980fd639dcd 100644
--- a/gcc/config/arm/arm-cpu-data.h
+++ b/gcc/config/arm/arm-cpu-data.h
@@ -1221,6 +1221,17 @@ static const struct processors all_cores[] =
 _v7m_tune
   },
   {
+"cortex-m33+nodsp",
+TARGET_CPU_cortexm33nodsp,
+(TF_LDSCHED),
+"8M_MAIN", BASE_ARCH_8M_MAIN,
+{
+  ISA_ARMv8m_main,
+  isa_nobit
+},
+_v7m_tune
+  },
+  {
 "cortex-r52",
 TARGET_CPU_cortexr52,
 (TF_LDSCHED),
diff --git a/gcc/config/arm/arm-cpu.h b/gcc/config/arm/arm-cpu.h
index 502965081faa625abc93d97559517baf50972e1b..22566495fdf0da0ad75b81a5956eecb898c38684 100644
--- a/gcc/config/arm/arm-cpu.h
+++ b/gcc/config/arm/arm-cpu.h
@@ -130,6 +130,7 @@ enum processor_type
   TARGET_CPU_cortexa73cortexa53,
   TARGET_CPU_cortexm23,
   TARGET_CPU_cortexm33,
+  TARGET_CPU_cortexm33nodsp,
   TARGET_CPU_cortexr52,
   TARGET_CPU_arm_none
 };
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 5f18dfb35687888bc7f642785693f75658a96733..7368a067db92b384f83fdb4a0af6cb77cff4e6f4 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1090,6 +1090,13 @@ begin cpu cortex-m33
  costs v7m
 end cpu cortex-m33
 
+begin cpu cortex-m33+nodsp
+ cname cortexm33nodsp
+ tune flags LDSCHED
+ architecture armv8-m.main
+ costs v7m
+end cpu cortex-m33+nodsp
+
 # V8 R-profile implementations.
 begin cpu cortex-r52
  cname cortexr52
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ede44f497edd69390bbbe6de5a913430b546c547..a46bc3c7f8ba6048969bae4d37a7be3c5242ce6a 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -349,6 +349,9 @@ EnumValue
 Enum(processor_type) String(cortex-m33) Value( TARGET_CPU_cortexm33)
 
 EnumValue
+Enum(processor_type) String(cortex-m33+nodsp) Value( TARGET_CPU_cortexm33nodsp)
+
+EnumValue
 Enum(processor_type) String(cortex-r52) Value( TARGET_CPU_cortexr52)
 
 Enum
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 519c0556fe76a5a391cd268bb50541c77a4596d4..542b7972d21cd3c9986229e91ce0841522e3b52f 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,5 +57,5 @@
 	cortexa73,exynosm1,xgene1,
 	cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
 	cortexa73cortexa53,cortexm23,cortexm33,
-	cortexr52"
+	cortexm33nodsp,cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/arm.c 

Re: [PATCH] RL78 one_cmplhi2 improvement

2018-02-27 Thread DJ Delorie
"Sebastian Perta"  writes:
> Is this similar to what you had in mind?

Yes.  Did it affect code size in any of the larger tests?  I was hoping
that it wouldn't force too much into 8-bit registers and cause more
moves to be needed elsewhere.

(and even if it didn't, I think this one feels "more correct" than the
other, as it retains more of the "I'm 16 bits"-ness of the operand)

>> If it doesn't work out, consider this patch approved, though.
> Can I checkin now?

Yes.  Thanks!

Make sure the indentation is correct, of course.  It wasn't in the
email, and that confused me at first.


RE: [PATCH] RL78 one_cmplhi2 improvement

2018-02-27 Thread Sebastian Perta
Hi DJ,

> One thing to try is to use (subreg:QI in a define_expand, so that
> there's a one_cmplhi2 pattern that expands to two QImode insns that
> operate on HImode input/outputs via SUBREGs.

Thank you for the suggestion! After several attempts the following is the
only successful one, however the code produced is identical with and without
the patch:

(define_expand "one_cmplhi2"
 [(set (subreg:QI (match_operand:HI 0 "nonimmediate_operand") 0)
  (xor:HI (subreg:QI (match_operand:HI 1 "general_operand") 0)
(const_int -1)))
  (set (subreg:QI (match_dup 0) 1)
  (xor:HI (subreg:QI (match_dup 1) 1)
(const_int -1)))
  ]
  ""
  "DONE;"
)

Is this similar to what you had in mind?

Output code (same as before the patch ... the patch makes no difference):
_test_one_cmplhi:
mov a, [sp+4]
xor a, #-1
mov r8, a
mov a, [sp+5]
xor a, #-1
mov r9, a
ret

I also explored other options including define_split without any success.

> If it doesn't work out, consider this patch approved, though.
Can I checkin now?

Best Regards,
Sebastian


> -----Original Message-----
> From: DJ Delorie [mailto:d...@redhat.com]
> Sent: 20 February 2018 19:39
> To: Sebastian Perta 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] RL78 one_cmplhi2 improvement
> 
> 
> Const type promotion is the bane of embedded developers...
> 
> One thing to try is to use (subreg:QI in a define_expand, so that
> there's a one_cmplhi2 pattern that expands to two QImode insns that
> operate on HImode input/outputs via SUBREGs.
> 
> I don't have high hopes of gcc optimizing this properly in all cases,
> but it's worth trying.
> 
> If it doesn't work out, consider this patch approved, though.
> 
> Thanks!



[PATCH, rs6000] (v2) Update altivec-7 testcase(s).

2018-02-27 Thread Will Schmidt
Hi, 
V2 update to incorporate suggested changes.

Move the vsx related content from the altivec-7-be test into
a new vsx-7-be test.  Split out the VSX specific bits into a vsx-7.h
header file, and include that when appropriate.

(v2 updates)  Adjust target stanza to allow 32-bit targets to run the test.
Updated scan-assembler stanzas to accommodate codegen variations involving
lxvd2x in BE versus LE versus P9.  Sniff tested across p6-p9 systems
with -m32 and -m64.  Test currently runs clean.

This fixes up results as seen on some power systems.

[testsuite]

2018-02-27  Will Schmidt  

* gcc.target/powerpc/altivec-7-be.c: Remove VSX content, allow
32-bit target.
* gcc.target/powerpc/altivec-7.h: Remove VSX content.
* gcc.target/powerpc/vsx-7-be.c: New test (VSX content).
* gcc.target/powerpc/vsx-7.h: New include (VSX content).
* gcc.target/powerpc/altivec-7-le.c: Add vsx-7.h include.

diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-7-be.c b/gcc/testsuite/gcc.target/powerpc/altivec-7-be.c
index cbc31e6..1e690be 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-7-be.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-7-be.c
@@ -1,12 +1,12 @@
-/* { dg-do compile { target powerpc64-*-* } } */
+/* { dg-do compile { target powerpc*-*-* } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
 /* { dg-options "-maltivec" } */
 
 /* Expected results for Big Endian:
  vec_packpx vpkpx
- vec_ld lxv2x
+ vec_ld lxvd2x
  vec_lde lvewx
  vec_ldl lxvl
  vec_lvewx  lvewx
  vec_unpackh vupklsh
  vec_unpackl vupkhsh
@@ -19,17 +19,12 @@
 */
 
 /* { dg-final { scan-assembler-times "vpkpx" 2 } } */
 /* { dg-final { scan-assembler-times "vmulesb" 1 } } */
 /* { dg-final { scan-assembler-times "vmulosb" 1 } } */
-/* { dg-final { scan-assembler-times "lxvd2x" 6 } } */
 /* { dg-final { scan-assembler-times "lvewx" 2 } } */
 /* { dg-final { scan-assembler-times "lvxl" 1 } } */
 /* { dg-final { scan-assembler-times "vupklsh" 1 } } */
 /* { dg-final { scan-assembler-times "vupkhsh" 1 } } */
-/* { dg-final { scan-assembler-times "xxlnor" 4 } } */
-/* { dg-final { scan-assembler-times "xxland" 4 } } */
-/* { dg-final { scan-assembler-times "xxlxor" 5 } } */
-/* { dg-final { scan-assembler-times "vupkhpx" 1 } } */
 
 /* Source code for the test in altivec-7.h */
 #include "altivec-7.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-7-le.c b/gcc/testsuite/gcc.target/powerpc/altivec-7-le.c
index 6f895336..38ce153 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-7-le.c
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-7-le.c
@@ -30,7 +30,8 @@
 /* { dg-final { scan-assembler-times "xxlnor" 4 } } */
 /* { dg-final { scan-assembler-times "xxland" 4 } } */
 /* { dg-final { scan-assembler-times "xxlxor" 5 } } */
 /* { dg-final { scan-assembler-times "vupkhpx" 1 } } */
 
-/* Source code for the test in altivec-7.h */
+/* Source code for the test in altivec-7.h and vsx-7.h. */
 #include "altivec-7.h"
+#include "vsx-7.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/altivec-7.h b/gcc/testsuite/gcc.target/powerpc/altivec-7.h
index ff87deb..4dedcd8 100644
--- a/gcc/testsuite/gcc.target/powerpc/altivec-7.h
+++ b/gcc/testsuite/gcc.target/powerpc/altivec-7.h
@@ -15,11 +15,10 @@ vector signed int *vecint;
 vector signed short *vecshort;
 vector unsigned char *vecuchar;
 vector unsigned int *vecuint;
 vector unsigned short *vecushort;
 vector float *vecfloat;
-vector double *vecdouble;
 
 int main ()
 {
   *vecfloat++ = vec_andc((vector bool int)vecint[0], vecfloat[1]);
   *vecfloat++ = vec_andc(vecfloat[0], (vector bool int)vecint[1]);
@@ -41,10 +40,8 @@ int main ()
   *vecushort++ = vec_vxor(vecushort[0], (vector bool short)vecshort[1]);
   *vecuint++ = vec_ld(var_int[0], uintp[1]);
   *vecuint++ = vec_lvx(var_int[0], uintp[1]);
   *vecuint++ = vec_vmsumubm(vecuchar[0], vecuchar[1], vecuint[2]);
   *vecuchar++ = vec_xor(vecuchar[0], (vector unsigned char)vecchar[1]);
-  *vecdouble++ = vec_unpackl(vecfloat[0]);
-  *vecdouble++ = vec_unpackh(vecfloat[0]);
 
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-7-be.c b/gcc/testsuite/gcc.target/powerpc/vsx-7-be.c
new file mode 100644
index 000..52bcc43
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-7-be.c
@@ -0,0 +1,46 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx" } */
+
+/* This is an extension of altivec-7-be.c, with vsx target features included. */
+
+/* Expected results for Big Endian:
+(from altivec-7.h)
+ vec_packpx vpkpx
+ vec_ld lxvd2x
+ vec_lde lvewx
+ vec_ldl lxvl
+ 

Re: [RFC PATCH] avoid applying attributes to explicit specializations (PR 83871)

2018-02-27 Thread Jason Merrill

On 02/26/2018 11:19 PM, Martin Sebor wrote:

While reviewing other related bugs I noticed 83502.  This patch
doesn't fix the first test case in the bug (attribute noinline
vs always_inline).  Somehow those are still copied from
the primary to the specialization and can cause conflicts.


Hmm, that's odd.  Why is that?


Because duplicate_decl calls diagnose_mismatched_attributes()
on the NEWDECL and OLDDECL.  (Attribute optimize would do the
same thing.)  I was trying to keep the fix small but it makes
sense to take care of this as well so I have in this revision.


It does fix the second test case but with the noreturn change
it would issue a bogus -Wmissing-attributes warning for the
explicit specialization below.  Adding the warn_unused_result
attribute to it would then make GCC complain about a conflict
between the added attribute and noreturn, while removing it
would lead to worse code.

   template 
   int __attribute__ ((warn_unused_result)) f (T) { return 0; }

   template <>
   int __attribute__ ((noreturn)) f (int) { throw 0; }

   void fi () { f (0); }


I continue to disagree with this use of attribute noreturn.

+  /* Merge the function-has-side-effects bit.  */
+  if (TREE_THIS_VOLATILE (newdecl))
+    TREE_THIS_VOLATILE (olddecl) = 1;
+
+  if (merge_attr)


TREE_THIS_VOLATILE means attribute noreturn, not whether the function
has side-effects; it should be handled in the blocks controlled by
merge_attr.


Whoops.  That was a silly goof.  I must have misread the comment
above the macro definition.  I also didn't have a test for it (or
some of the other changes I've made) so I didn't see the problem.

Attached is an enhanced version of the patch that handles (and
tests) more of the commonly used attributes.  I'm not sure why
in the merge_attr block I have to merge TREE_THIS_VOLATILE and
TREE_NOTHROW back and forth but not also READONLY, PURE, or
MALLOC, but without it tests fail.


Because the memcpy from newdecl to olddecl at the end of duplicate_decls 
explicitly excludes the tree_common section.  I don't know why that is, 
it certainly complicates the logic.  That choice seems to predate the 
C++ front end.



PS Would it be possible to add a new macro with "noreturn" in
the name to make it more intuitive?  (And ditto perhaps also
for TREE_READONLY for "const" functions, though for whatever
reason that seems easier to decipher.  I know you're all used
to it but it's far from intuitive.)


Sounds good.


PPS Duplicate_decls is over 1,400 lines long.  If there is more
work to do here in stage 1 (I suspect there might be), would you
mind if I broke it up into two or more, say one for functions,
another for types, or whatever grouping makes most sense to make
it easier to follow?


Sure, there's plenty of scope for cleaning up duplicate_decls. :)


+ else
{
  if (DECL_DECLARED_INLINE_P (newdecl))
DECL_DISREGARD_INLINE_LIMITS (newdecl) = true;


This looks like it will mean setting DECL_DISREGARD_INLINE_LIMITS on all 
inline template specializations.  I think you want to drop these two lines.


OK with that change.

Jason


[PATCH] Fix PR84466

2018-02-27 Thread Richard Biener

The following fixes PR84466.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-02-27  Richard Biener  

PR tree-optimization/84466
* graphite-scop-detection.c (scop_detection::stmt_simple_for_scop_p):
Adjust last change to less strictly validate use operands.

Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c   (revision 258030)
+++ gcc/graphite-scop-detection.c   (working copy)
@@ -1028,16 +1028,23 @@ scop_detection::stmt_simple_for_scop_p (
 case GIMPLE_ASSIGN:
 case GIMPLE_CALL:
   {
-   tree op;
+   tree op, lhs = gimple_get_lhs (stmt);
ssa_op_iter i;
+   /* If we are not going to instantiate the stmt do not require
+  its operands to be instantiatable at this point.  */
+   if (lhs
+   && TREE_CODE (lhs) == SSA_NAME
+   && scev_analyzable_p (lhs, scop))
+ return true;
/* Verify that if we can analyze operands at their def site we
   also can represent them when analyzed at their uses.  */
FOR_EACH_SSA_TREE_OPERAND (op, stmt, i, SSA_OP_USE)
  if (scev_analyzable_p (op, scop)
- && !graphite_can_represent_expr (scop, bb->loop_father, op))
+ && chrec_contains_undetermined
+  (scalar_evolution_in_region (scop, bb->loop_father, op)))
{
  DEBUG_PRINT (dp << "[scop-detection-fail] "
-  << "Graphite cannot represent stmt:\n";
+  << "Graphite cannot code-gen stmt:\n";
   print_gimple_stmt (dump_file, stmt, 0,
  TDF_VOPS | TDF_MEMSYMS));
  return false;


[PATCH] PR preprocessor/84517 allow double-underscore macros after string literals

2018-02-27 Thread Jonathan Wakely
Since the fix for PR c++/80955 any suffix on a string literal that
begins with an underscore is assumed to be a user-defined literal
suffix, not a macro. This assumption is invalid for a suffix beginning
with two underscores, because such names are reserved and can't be used
for UDLs anyway. Checking for exactly one underscore restores support
for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
(which are formally ill-formed but accepted with a warning, as a
conforming extension).

gcc/testsuite:

PR preprocessor/84517
* g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.

libcpp:

PR preprocessor/84517
* lex.c (is_macro_not_literal_suffix): New function.
(lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
decide when to issue -Wliteral-suffix warnings.


Tested powerpc64le-linux, OK for trunk?
commit 9781a25354e07503690259d7fa95e9dd459c51b2
Author: Jonathan Wakely 
Date:   Thu Feb 22 19:59:38 2018 +

PR preprocessor/84517 allow double-underscore macros after string literals

Since the fix for PR c++/80955 any suffix on a string literal that
begins with an underscore is assumed to be a user-defined literal
suffix, not a macro. This assumption is invalid for a suffix beginning
with two underscores, because such names are reserved and can't be used
for UDLs anyway. Checking for exactly one underscore restores support
for macro expansion in cases like "File: "__FILE__ or "Date: "__DATE__
(which are formally ill-formed but accepted with a warning, as a
conforming extension).

gcc/testsuite:

PR preprocessor/84517
* g++.dg/cpp0x/udlit-macros.C: Expect a warning for ""__FILE__.

libcpp:

PR preprocessor/84517
* lex.c (is_macro_not_literal_suffix): New function.
(lex_raw_string, lex_string): Use is_macro_not_literal_suffix to
decide when to issue -Wliteral-suffix warnings.

diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C b/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
index fb518281811..7ef324b7e04 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-macros.C
@@ -16,7 +16,7 @@ int operator""_ID(const char*, size_t) { return 0; }
 int main()
 {
   long i64 = 123;
-  char buf[100];
+  char buf[] = "xx"__FILE__;  // { dg-warning "invalid suffix on literal" }
   sprintf(buf, "%"PRId64"abc", i64);  // { dg-warning "invalid suffix on literal" }
   return strcmp(buf, "123abc")
 + ""_zero
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 92c62517a4d..37c365a3560 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -1630,6 +1630,21 @@ is_macro(cpp_reader *pfile, const uchar *base)
   return !result ? false : (result->type == NT_MACRO);
 }
 
+/* Returns true if a literal suffix does not have the expected form
+   and is defined as a macro.  */
+
+static bool
+is_macro_not_literal_suffix(cpp_reader *pfile, const uchar *base)
+{
+  /* User-defined literals outside of namespace std must start with a single
+ underscore, so assume anything of that form really is a UDL suffix.
+ We don't need to worry about UDLs defined inside namespace std because
+ their names are reserved, so cannot be used as macro names in valid
+ programs.  */
+  if (base[0] == '_' && base[1] != '_')
+return false;
+  return is_macro (pfile, base);
+}
 
 /* Lexes a raw string.  The stored string contains the spelling, including
double quotes, delimiter string, '(' and ')', any leading
@@ -1900,10 +1915,8 @@ lex_raw_string (cpp_reader *pfile, cpp_token *token, const uchar *base,
 {
   /* If a string format macro, say from inttypes.h, is placed touching
 a string literal it could be parsed as a C++11 user-defined string
-literal thus breaking the program.
-Try to identify macros with is_macro. A warning is issued.
-The macro name should not start with '_' for this warning. */
-  if ((*cur != '_') && is_macro (pfile, cur))
+literal thus breaking the program.  */
+  if (is_macro_not_literal_suffix (pfile, cur))
{
  /* Raise a warning, but do not consume subsequent tokens.  */
  if (CPP_OPTION (pfile, warn_literal_suffix) && !pfile->state.skipping)
@@ -2031,10 +2044,8 @@ lex_string (cpp_reader *pfile, cpp_token *token, const uchar *base)
 {
   /* If a string format macro, say from inttypes.h, is placed touching
 a string literal it could be parsed as a C++11 user-defined string
-literal thus breaking the program.
-Try to identify macros with is_macro. A warning is issued.
-The macro name should not start with '_' for this warning. */
-  if ((*cur != '_') && is_macro (pfile, cur))
+literal thus breaking the program.  */
+  if (is_macro_not_literal_suffix (pfile, cur))
{
  /* Raise a 
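
For illustration, the check added by is_macro_not_literal_suffix can be sketched outside libcpp. This is only a sketch under assumptions: the macro table and is_macro_name are hypothetical stand-ins for cpp_reader and is_macro, and the suffix is passed as a complete NUL-terminated string rather than lexed from the buffer.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the preprocessor's macro table.  */
static const char *const defined_macros[] = { "PRId64", "_fmt", "__VA" };

/* Hypothetical stand-in for is_macro: true if NAME is a defined macro.  */
static bool
is_macro_name (const char *name)
{
  for (size_t i = 0; i < sizeof defined_macros / sizeof defined_macros[0]; i++)
    if (strcmp (name, defined_macros[i]) == 0)
      return true;
  return false;
}

/* Same shape as the patch: a suffix that starts with exactly one
   underscore is assumed to be a real UDL suffix, so it is never
   reported as a macro; anything else falls back to the macro lookup.  */
static bool
macro_not_literal_suffix (const char *suffix)
{
  if (suffix[0] == '_' && suffix[1] != '_')
    return false;
  return is_macro_name (suffix);
}
```

With this table, "PRId64" and "__VA" are reported as macros, while "_fmt" is treated as a literal suffix even though a macro of that name exists.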

Re: [PATCH] Fix aarch64_simd_reg_or_zero predicate (PR fortran/84565)

2018-02-27 Thread Richard Sandiford
Jakub Jelinek  writes:
> Hi!
>
> The following testcase ICEs, because the aarch64_cmeqdf instruction
> starting with r256612 no longer accepts CONST0_RTX (E_DFmode) as
> valid argument, but the expander generates it anyway.
>
> The bug has been introduced during the addition of aarch64_simd_imm_zero
> and aarch64_simd_or_scalar_imm_zero predicates, before the predicate
> used to be:
> (define_predicate "aarch64_simd_reg_or_zero"
>   (and (match_code "reg,subreg,const_int,const_double,const_vector")
>(ior (match_operand 0 "register_operand")
>(ior (match_test "op == const0_rtx")
> (match_test "aarch64_simd_imm_zero_p (op, mode)")
> with
> bool
> aarch64_simd_imm_zero_p (rtx x, machine_mode mode)
> {
>   return x == CONST0_RTX (mode);
> }
> and so matched not just const,const_vector zeros, but also
> const_int zero (that is through op == const0_rtx) and const_double
> zero too.

Thanks for fixing this.

> Bootstrapped/regtested on aarch64-linux (scratch Fedora gcc 8 package build
> with the patch applied), ok for trunk?
>
> 2018-02-27  Jakub Jelinek  
>
>   PR fortran/84565
>   * config/aarch64/predicates.md (aarch64_simd_reg_or_zero): Use
>   aarch64_simd_or_scalar_imm_zero rather than aarch64_simd_imm_zero.
>
>   * gfortran.dg/pr84565.f90: New test.
>
> --- gcc/config/aarch64/predicates.md.jj   2018-02-06 13:13:06.305751221 
> +0100
> +++ gcc/config/aarch64/predicates.md  2018-02-26 16:30:01.902195208 +0100
> @@ -395,7 +395,7 @@ (define_predicate "aarch64_simd_reg_or_z
>(and (match_code "reg,subreg,const_int,const_double,const,const_vector")
> (ior (match_operand 0 "register_operand")
>   (match_test "op == const0_rtx")
> - (match_operand 0 "aarch64_simd_imm_zero"
> + (match_operand 0 "aarch64_simd_or_scalar_imm_zero"

I think this makes the match_test on the line above redundant.

LGTM otherwise (but I can't approve).  We should probably clean up
the predicates so that the zero checks are more consistent.  genrecog
could probably help by ignoring predicate codes that obviously don't
apply (CONST_INT for vectors, CONST_VECTOR for ints, etc.).  But that's
obviously all stage 1 stuff.

Richard


[PATCH, arm-embedded] Multilib mapping for Armv8-R

2018-02-27 Thread Thomas Preudhomme

Hi,

We have decided to apply the following patch to the
ARM/embedded-7-branch to provide better multilib for Armv8-R targets.

Because there is no multilib mapping for Armv8-R, the default multilib,
built for -march=armv4t with software floating-point arithmetic, is
used. This patch maps Armv8-R to the existing Armv7 multilibs instead.
The mapping for single-precision Armv8-R has been left out because
no Arm implementation of that architecture variant exists.

Changelog entry is as follows:

*** gcc/ChangeLog ***

2018-02-26  Thomas Preud'homme  

* config/arm/t-rmprofile: Map Armv8-R and Armv8-R with CRC extension to
Armv7 multilibs.

Testing:

Ran -print-multi-directory for all combinations of
-march=armv8-r/-march=armv8-r+crc with
-mfpu=neon-fp-armv8/crypto-neon-fp-armv8. All gave the expected result. Details
in appendix.

Is this ok for stage4?

Best regards,

Thomas

Appendix: output of -print-multi-directory for all supported Armv8-R
configurations, single-precision FPU excepted.

% for ext in "" +crc; do cmd="arm-none-eabi-gcc -march=armv8-r${ext} 
-mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done
arm-none-eabi-gcc -march=armv8-r -mfloat-abi=soft -print-multi-directory: 
thumb/v7-ar
arm-none-eabi-gcc -march=armv8-r+crc -mfloat-abi=soft -print-multi-directory: 
thumb/v7-ar


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=softfp 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -march=armv8-r${ext} -mfpu=${fpu} -mfloat-abi=hard 
-print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; done
arm-none-eabi-gcc -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard


% for ext in "" +crc; do cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} 
-mfpu=${fpu} -mfloat-abi=soft -print-multi-directory" ; echo -n "$cmd: " ; eval 
$cmd ; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=soft -print-multi-directory: .
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=soft -print-multi-directory: .


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} 
-mfloat-abi=softfp -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done 
; done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=softfp 
-print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=softfp -print-multi-directory: thumb/v7-ar/fpv3/softfp


% for ext in "" +crc; do for fpu in neon-fp-armv8 crypto-neon-fp-armv8; do 
cmd="arm-none-eabi-gcc -mthumb -march=armv8-r${ext} -mfpu=${fpu} 
-mfloat-abi=hard -print-multi-directory" ; echo -n "$cmd: " ; eval $cmd ; done ; 
done
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=neon-fp-armv8 -mfloat-abi=hard 
-print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
arm-none-eabi-gcc -mthumb -march=armv8-r+crc -mfpu=crypto-neon-fp-armv8 
-mfloat-abi=hard -print-multi-directory: thumb/v7-ar/fpv3/hard
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index d4bc9fde4c5544812bde4743ccc18d68c1c25132..a3a24d59fb29b42a36177bd2d2ebfae4e50e5a10 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -135,6 +135,8 @@ 

Re: [PATCH][AArch64] PR84114: Avoid reassociating FMA

2018-02-27 Thread Wilco Dijkstra
Richard Biener 

> It happens that on some targets doing two FMAs in parallel and one
> non-FMA operation merging them is faster than chaining three FMAs...

Like I mentioned in the PR, long chains should be broken, but for that we need 
a new parameter to state how long a chain may be before it is split. The issue 
today is that it splits even very short chains, removing beneficial FMAs.
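
As a rough illustration of the trade-off (a sketch, not code from the PR; the function names are made up), the same four-term sum can be written as one dependent chain or as two independent halves. When the compiler contracts multiply-adds into FMAs, the first form gives one multiply plus three chained FMAs, the second gives two multiplies, two FMAs that can issue in parallel, and one plain add.

```c
/* Chained form: every multiply-add depends on the previous sum.  */
static double
sum_chained (const double *a, const double *b)
{
  double acc = a[0] * b[0];
  acc = acc + a[1] * b[1];
  acc = acc + a[2] * b[2];
  acc = acc + a[3] * b[3];
  return acc;
}

/* Split form: two independent halves merged by one non-FMA add.  */
static double
sum_split (const double *a, const double *b)
{
  double lo = a[0] * b[0] + a[1] * b[1];
  double hi = a[2] * b[2] + a[3] * b[3];
  return lo + hi;
}
```

For a chain this short the split form has fewer FMAs and an extra multiply and add, which is the case where splitting is unlikely to pay off.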

> But yes, somewhere I suggested that FMA detection should/could be
> integrated with reassociation.

Absolutely.

Wilco


Re: [PATCH,PTX] Add support for CUDA 9

2018-02-27 Thread Richard Biener
On Tue, 27 Feb 2018, Thomas Schwinge wrote:

> Hi!
> 
> Given that several users have run into this, is this (trunk r256891) OK
> to commit to open release branches, too?

Sure.

> On Fri, 19 Jan 2018 09:42:08 +0100, Tom de Vries  
> wrote:
> > On 01/19/2018 01:59 AM, Cesar Philippidis wrote:
> > > Here's the updated patch with the changes that you requested. There are
> > > no new regressions in trunk. I tested it on my desktop running driver
> > > 387.34 on a Pascal GPU.
> > > 
> > > Is this OK for trunk?
> > 
> > OK with 'PR target/83790' added to the changelog entry.
> > 
> > Thanks,
> > - Tom
> > 
> > > 
> > > trunk-cuda9.diff
> > > 
> > > 
> > > 2018-01-18  Cesar Philippidis  
> > > 
> > >   gcc/
> > >   * config/nvptx/nvptx.c (output_init_frag): Don't use generic address
> > >   spaces for function labels.
> > > 
> > >   gcc/testsuite/
> > >   * gcc.target/nvptx/indirect_call.c: New test.
> > > 
> > > diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
> > > index 86fc13f4fc0..4cb87c8ad07 100644
> > > --- a/gcc/config/nvptx/nvptx.c
> > > +++ b/gcc/config/nvptx/nvptx.c
> > > @@ -1899,9 +1899,15 @@ output_init_frag (rtx sym)
> > > 
> > > if (sym)
> > >   {
> > > -  fprintf (asm_out_file, "generic(");
> > > +  bool function = (SYMBOL_REF_DECL (sym)
> > > +&& (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL));
> > > +  if (!function)
> > > + fprintf (asm_out_file, "generic(");
> > > output_address (VOIDmode, sym);
> > > -  fprintf (asm_out_file, val ? ") + " : ")");
> > > +  if (!function)
> > > + fprintf (asm_out_file, ")");
> > > +  if (val)
> > > + fprintf (asm_out_file, " + ");
> > >   }
> > >   
> > > if (!sym || val)
> > > diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c 
> > > b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
> > > new file mode 100644
> > > index 000..39992a7137b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
> > > @@ -0,0 +1,19 @@
> > > +/* { dg-options "-O2 -msoft-stack" } */
> > > +/* { dg-do run } */
> > > +
> > > +int
> > > +f1 (int a)
> > > +{
> > > +  return a + 1;
> > > +}
> > > +
> > > +int (*f2)(int) = f1;
> > > +
> > > +int
> > > +main ()
> > > +{
> > > +  if (f2 (100) != 101)
> > > +__builtin_abort();
> > > +
> > > +  return 0;
> > > +}
> 
> 
> Grüße
>  Thomas
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Make groups more generic (PR gcov-profile/84548).

2018-02-27 Thread Martin Liška
On 02/27/2018 02:01 PM, Nathan Sidwell wrote:
> On 02/27/2018 07:45 AM, Martin Liška wrote:
>> Hi.
>>
>> Considering following C++ code snippet:
> 
>> gcc/ChangeLog:
>>
>> 2018-02-26  Martin Liska  
>>
>> PR gcov-profile/84548
>> * gcov.c (process_file): Allow partial overlap and consider it
>> also as group functions.
>> (output_lines): Properly calculate range of lines for a group.
> 
> Ok, but fix comment:
> 
> +  /* It's possible to have function that partially overlap,
> pluralize 'function'
> + thus take the maximum end_line of got functions.  */
> 'got functions'?
> 

Thanks.

Fixed that and installed as r258033.

Martin


Re: [PATCH,PTX] Add support for CUDA 9

2018-02-27 Thread Thomas Schwinge
Hi!

Given that several users have run into this, is this (trunk r256891) OK
to commit to open release branches, too?

On Fri, 19 Jan 2018 09:42:08 +0100, Tom de Vries  wrote:
> On 01/19/2018 01:59 AM, Cesar Philippidis wrote:
> > Here's the updated patch with the changes that you requested. There are
> > no new regressions in trunk. I tested it on my desktop running driver
> > 387.34 on a Pascal GPU.
> > 
> > Is this OK for trunk?
> 
> OK with 'PR target/83790' added to the changelog entry.
> 
> Thanks,
> - Tom
> 
> > 
> > trunk-cuda9.diff
> > 
> > 
> > 2018-01-18  Cesar Philippidis  
> > 
> > gcc/
> > * config/nvptx/nvptx.c (output_init_frag): Don't use generic address
> > spaces for function labels.
> > 
> > gcc/testsuite/
> > * gcc.target/nvptx/indirect_call.c: New test.
> > 
> > diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
> > index 86fc13f4fc0..4cb87c8ad07 100644
> > --- a/gcc/config/nvptx/nvptx.c
> > +++ b/gcc/config/nvptx/nvptx.c
> > @@ -1899,9 +1899,15 @@ output_init_frag (rtx sym)
> > 
> > if (sym)
> >   {
> > -  fprintf (asm_out_file, "generic(");
> > +  bool function = (SYMBOL_REF_DECL (sym)
> > +  && (TREE_CODE (SYMBOL_REF_DECL (sym)) == FUNCTION_DECL));
> > +  if (!function)
> > +   fprintf (asm_out_file, "generic(");
> > output_address (VOIDmode, sym);
> > -  fprintf (asm_out_file, val ? ") + " : ")");
> > +  if (!function)
> > +   fprintf (asm_out_file, ")");
> > +  if (val)
> > +   fprintf (asm_out_file, " + ");
> >   }
> >   
> > if (!sym || val)
> > diff --git a/gcc/testsuite/gcc.target/nvptx/indirect_call.c 
> > b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
> > new file mode 100644
> > index 000..39992a7137b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/nvptx/indirect_call.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-options "-O2 -msoft-stack" } */
> > +/* { dg-do run } */
> > +
> > +int
> > +f1 (int a)
> > +{
> > +  return a + 1;
> > +}
> > +
> > +int (*f2)(int) = f1;
> > +
> > +int
> > +main ()
> > +{
> > +  if (f2 (100) != 101)
> > +__builtin_abort();
> > +
> > +  return 0;
> > +}
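
The effect of the output_init_frag change on the emitted text can be sketched in isolation. This is a simplified model under assumptions: format_init_frag_sym is a hypothetical helper that writes into a buffer instead of asm_out_file, and the symbol is a plain string rather than an rtx.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format the symbol part of an initializer
   fragment into BUF.  Function symbols are emitted bare; all other
   symbols are wrapped in generic().  A trailing " + " is added when
   an offset follows.  Returns the snprintf result.  */
static int
format_init_frag_sym (char *buf, size_t size, const char *name,
                      bool is_function, bool has_offset)
{
  return snprintf (buf, size, "%s%s%s%s",
                   is_function ? "" : "generic(",
                   name,
                   is_function ? "" : ")",
                   has_offset ? " + " : "");
}
```

A data symbol with an offset formats as "generic(var) + ", while a function symbol such as f1 from the new test is emitted without the generic() wrapper.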


Grüße
 Thomas


Re: [PATCH v6] aarch64: Add split-stack support

2018-02-27 Thread Adhemerval Zanella
Ping (with Szabolcs remarks fixed).

On 07/02/2018 16:07, Adhemerval Zanella wrote:
> Changes from previous version:
> 
>   - Changed the way __morestack is called to use a branch with link
> instead of a simple branch.  This allows the use of a call instruction and
> avoids possible issues with later optimization passes which might
> see a branch outside the instruction block (as noticed in previous
> iterations while building a more complex workload such as speccpu2006).
> 
>   - Change the return address to use the branch with link value and
> set x12 to save x30.  This simplifies the required instructions
> to setup/save the return address.
> 
> --
> 
> This patch adds the split-stack support on aarch64 (PR #67877).  As for
> other ports this patch should be used along with glibc and gold support.
> 
> The support is done similarly to other architectures: a split-stack field
> is allocated before the TCB by glibc, a target-specific __morestack implementation
> and helper functions are added in libgcc, and compiler support is adjusted
> (split-stack prologue, va_start for argument handling).  I also plan to
> send the gold support to adjust stack allocation across split-stack
> and default code calls.
> 
> The current approach is to set the final stack adjustment using at most 2
> instructions (mov/movk), which limits stack allocation to an upper limit of 4GB.
> The morestack call is non-standard, with x10 holding the requested stack
> pointer, x11 the argument pointer (if required), and x12 the return
> continuation address.  Unwinding is handled by a personality routine that
> knows how to find stack segments.
> 
> The split-stack prologue on function entry is as follows (this goes before the
> usual function prologue):
> 
> function:
>   mrs    x9, tpidr_el0
>   ldur   x9, [x9, -8]
>   mov    x10, 
>   movk   x10, #0x0, lsl #16
>   sub    x10, sp, x10
>   mov    x11, sp  # if function has stacked arguments
>   cmp    x9, x10
>   bcc    .LX
> main_fn_entry:
>   [function prologue]
> LX:
>   bl __morestack
>   b  main_fn_entry
> 
> Notes:
> 
> 1. Even if a function does not allocate a stack frame, a split-stack prologue
>    is created.  This avoids issues with tail calls to external symbols,
>    which might require linker adjustment (libgo/runtime/go-varargs.c).
> 
> 2. Basic-block reordering (enabled with -O2) will move the split-stack TCB ldur
>    to after the required stack calculation.
> 
> 3. Similar to powerpc, when the linker detects a call from split-stack to
>    non-split-stack code, it adds 16k (or more) to the value found in the
>    "allocate" instructions (so non-split-stack code gets a larger stack).  The
>    amount is tunable by a linker option.  This feature is only implemented in
>    the GNU gold linker.
> 
> 4. AArch64 does not handle >4G stack initially and, although it is possible
>    to implement it, limiting to 4G allows the allocation to be materialized with
>    only 2 instructions (mov + movk), thus simplifying the linker
>    adjustments required.  Supporting multiple threads each requiring more
>    than 4G of stack is probably not that important, and likely to OOM at
>    run time.
> 
> 5. The TCB support in glibc is meant to be included in version 2.28.
> 
> 6. Besides regression tests, I also checked with a SPECcpu2006 run with
>    the additional -fsplit-stack option.  I saw no regression besides 416.gamess,
>    which fails on trunk as well (not sure if it is some misconfiguration in my
>    environment).
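
To make the 4G limit in note 4 concrete, here is a small sketch of how a frame size splits across a mov/movk pair. The struct and function names are hypothetical and not taken from the patch; the point is only that two 16-bit immediates cover exactly the 32-bit range.

```c
#include <stdint.h>

/* The two 16-bit immediates carried by "mov x10, #lo" and
   "movk x10, #hi, lsl #16".  */
struct mov_movk_pair
{
  uint16_t mov_imm;   /* bits 0..15 of the frame size */
  uint16_t movk_imm;  /* bits 16..31 of the frame size */
};

/* Split a frame size into the two immediates.  The 32-bit parameter
   type is what caps the allocation below 4 GiB.  */
static struct mov_movk_pair
split_frame_size (uint32_t frame_size)
{
  struct mov_movk_pair p;
  p.mov_imm = (uint16_t) (frame_size & 0xffff);
  p.movk_imm = (uint16_t) (frame_size >> 16);
  return p;
}

/* Value the register holds after the mov/movk pair has executed.  */
static uint64_t
materialize (struct mov_movk_pair p)
{
  return (uint64_t) p.mov_imm | ((uint64_t) p.movk_imm << 16);
}
```

Keeping the allocation in a fixed two-instruction form also means a linker that wants to enlarge it only has to rewrite those two immediates.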
> 
> libgcc/ChangeLog:
> 
>   * libgcc/config.host: Use t-stack and t-stack-aarch64 for
>   aarch64*-*-linux.
>   * libgcc/config/aarch64/morestack-c.c: New file.
>   * libgcc/config/aarch64/morestack.S: Likewise.
>   * libgcc/config/aarch64/t-stack-aarch64: Likewise.
>   * libgcc/generic-morestack.c (__splitstack_find): Add aarch64-specific
>   code.
> 
> gcc/ChangeLog:
> 
>   * common/config/aarch64/aarch64-common.c
>   (aarch64_supports_split_stack): New function.
>   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
>   * gcc/config/aarch64/aarch64-linux.h (TARGET_ASM_FILE_END): Remove
>   macro.
>   * gcc/config/aarch64/aarch64-protos.h: Add
>   aarch64_expand_split_stack_prologue and
>   aarch64_split_stack_space_check.
>   * gcc/config/aarch64/aarch64.c (aarch64_expand_builtin_va_start): Use
>   internal argument pointer instead of virtual_incoming_args_rtx.
>   (morestack_ref): New symbol.
>   (aarch64_load_split_stack_value): New function.
>   (aarch64_expand_split_stack_prologue): Likewise.
>   (aarch64_internal_arg_pointer): Likewise.
>   (aarch64_file_end): Emit the split-stack note sections.
>   (aarch64_split_stack_space_check): Likewise.
>   (TARGET_ASM_FILE_END): New macro.
>   (TARGET_INTERNAL_ARG_POINTER): Likewise.
>   * gcc/config/aarch64/aarch64.h (aarch64_frame): Add
>   split_stack_arg_pointer to 

Re: [PATCH][AArch64] PR84114: Avoid reassociating FMA

2018-02-27 Thread Richard Biener
On Mon, Feb 26, 2018 at 11:25 PM, James Greenhalgh
 wrote:
> On Thu, Feb 22, 2018 at 11:38:03AM +, Wilco Dijkstra wrote:
>> As discussed in the PR, the reassociation phase runs before FMAs are formed
>> and so can significantly reduce FMA opportunities.  Although reassociation
>> could be switched off, it helps in many cases, so a better alternative is to
>> only avoid reassociation of floating point additions.  This fixes the
>> testcase and gives 1% speedup on SPECFP2017, fixing the performance regression.
>>
>> OK for commit?
>
> This is OK as a fairly safe fix for stage 4. We should fix reassociation
> properly in GCC 9.

It happens that on some targets doing two FMAs in parallel and one
non-FMA operation merging them is faster than chaining three FMAs...

But yes, somewhere I suggested that FMA detection should/could be
integrated with reassociation.

Richard.

> Thanks,
> James
>
>>
>> ChangeLog:
>> 2018-02-23  Wilco Dijkstra  
>>
>>   PR tree-optimization/84114
>>   * config/aarch64/aarch64.c (aarch64_reassociation_width)
>>   Avoid reassociation of FLOAT_MODE addition.
>> --
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 
>> b3d5fde171920e5759046a4bd61cfcf9eb78d7dd..5f9541cf700aaf18c1f1ac73054614e2932781e4
>>  100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -1109,15 +1109,16 @@ aarch64_min_divisions_for_recip_mul (machine_mode 
>> mode)
>>return aarch64_tune_params.min_div_recip_mul_df;
>>  }
>>
>> +/* Return the reassociation width of treeop OPC with mode MODE.  */
>>  static int
>> -aarch64_reassociation_width (unsigned opc ATTRIBUTE_UNUSED,
>> -  machine_mode mode)
>> +aarch64_reassociation_width (unsigned opc, machine_mode mode)
>>  {
>>if (VECTOR_MODE_P (mode))
>>  return aarch64_tune_params.vec_reassoc_width;
>>if (INTEGRAL_MODE_P (mode))
>>  return aarch64_tune_params.int_reassoc_width;
>> -  if (FLOAT_MODE_P (mode))
>> +  /* Avoid reassociating floating point addition so we emit more FMAs.  */
>> +  if (FLOAT_MODE_P (mode) && opc != PLUS_EXPR)
>>  return aarch64_tune_params.fp_reassoc_width;
>>return 1;
>>  }


Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER

2018-02-27 Thread Jan Hubicka
> On Mon, Feb 26, 2018 at 8:54 AM, Jan Hubicka  wrote:
> >> On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubicka  wrote:
> >> > Hi,
> >> > my main concern about the patch is that we now have
> >> > -mindirect-branch=thunk-extern, which is intended to work well and is
> >> > used by the kernel, but we also have other modes that are documented
> >> > and as such should work, yet they may lead to invalid unwind info
> >> > (or did I miss anything important here?).
> >> > Why can't we fix the others as well?
> >> >
> >>
> >> I took a closer look at my commit message.  It does leave the impression
> >> that only -mindirect-branch=thunk-extern is fixed.  But in fact, all
> >> -mindirect-branch= choices are fixed.
> >
> > I see, sorry for the confusion!
> >>
> >> 1.  -mindirect-branch=thunk generates:
> >>
> >> .cfi_startproc
> >> pushq   %rbx
> >> .cfi_def_cfa_offset 16
> >> .cfi_offset 3, -16
> >> movq    (%rdi), %rax
> >> movq    %rdi, %rbx
> >> movq    16(%rax), %rax
> >> call    __x86_indirect_thunk_rax
> >> movq    (%rbx), %rax
> >> movq    %rbx, %rdi
> >> popq    %rbx
> >> .cfi_def_cfa_offset 8
> >> movq    16(%rax), %rax
> >> jmp     __x86_indirect_thunk_rax
> >> .cfi_endproc
> >>
> >> 2.  -mindirect-branch=thunk-inline generates:
> >>
> >> .cfi_startproc
> >> pushq   %rbx
> >> .cfi_def_cfa_offset 16
> >> .cfi_offset 3, -16
> >> movq    (%rdi), %rax
> >> movq    %rdi, %rbx
> >> movq    16(%rax), %rax
> >> jmp     .LIND1
> >> .LIND0:
> >> call    .LIND3
> >> .LIND2:
> >> pause
> >> lfence
> >> jmp     .LIND2
> >> .LIND3:
> >> mov     %rax, (%rsp)
> >> ret
> >> .LIND1:
> >> call    .LIND0
> >> movq    (%rbx), %rax
> >> movq    %rbx, %rdi
> >> popq    %rbx
> >> .cfi_def_cfa_offset 8
> >> movq    16(%rax), %rax
> >> call    .LIND5
> >> .LIND4:
> >> pause
> >> lfence
> >> jmp     .LIND4
> >> .LIND5:
> >> mov     %rax, (%rsp)
> >> ret
> >> .cfi_endproc
> >>
> >> I updated the commit message with
> >>
> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect
> >> branch via register whenever -mindirect-branch= is used.
> >>
> >> OK for trunk?
> >>
> >> Thanks.
> >>
> >> --
> >> H.J.
> >
> >> From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001
> >> From: "H.J. Lu" 
> >> Date: Fri, 26 Jan 2018 15:54:25 -0800
> >> Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
> >>
> >> For
> >>
> >> ---
> >> struct C {
> >>   virtual ~C();
> >>   virtual void f();
> >> };
> >>
> >> void
> >> f (C *p)
> >> {
> >>   p->f();
> >>   p->f();
> >> }
> >> ---
> >>
> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
> >>
> >> _Z1fP1C:
> >> .LFB0:
> >> .cfi_startproc
> >> pushq   %rbx
> >> .cfi_def_cfa_offset 16
> >> .cfi_offset 3, -16
> >> movq    (%rdi), %rax
> >> movq    %rdi, %rbx
> >> jmp     .LIND1
> >> .LIND0:
> >> pushq   16(%rax)
> >> jmp     __x86_indirect_thunk
> >> .LIND1:
> >> call    .LIND0
> >> movq    (%rbx), %rax
> >> movq    %rbx, %rdi
> >> popq    %rbx
> >> .cfi_def_cfa_offset 8
> >> movq    16(%rax), %rax
> >> jmp     __x86_indirect_thunk_rax
> >> .cfi_endproc
> >>
> >> x86-64 is supposed to have asynchronous unwind tables by default, but
> >> there is nothing that reflects the change in the (relative) frame
> >> address after .LIND0.  That region really has to be moved outside of
> >> the .cfi_startproc/.cfi_endproc bracket.
> >>
> >> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect
> >> branch via register whenever -mindirect-branch= is used.  Now,
> >> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
> >>
> >> _Z1fP1C:
> >> .LFB0:
> >>   .cfi_startproc
> >>   pushq   %rbx
> >>   .cfi_def_cfa_offset 16
> >>   .cfi_offset 3, -16
> >>   movq    (%rdi), %rax
> >>   movq    %rdi, %rbx
> >>   movq    16(%rax), %rax
> >>   call    __x86_indirect_thunk_rax
> >>   movq    (%rbx), %rax
> >>   movq    %rbx, %rdi
> >>   popq    %rbx
> >>   .cfi_def_cfa_offset 8
> >>   movq    16(%rax), %rax
> >>   jmp     __x86_indirect_thunk_rax
> >>   .cfi_endproc
> >>
> >> so that "-mindirect-branch=thunk-extern" is equivalent to
> >> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is
> >> used by Linux kernel.
> >>
> >> gcc/
> >>
> >>   PR target/84039
> >>   * config/i386/constraints.md (Bs): Replace
> >>   ix86_indirect_branch_register with
> >>   TARGET_INDIRECT_BRANCH_REGISTER.
> >>   (Bw): Likewise.
> >>   * 

Re: [PATCH] Make groups more generic (PR gcov-profile/84548).

2018-02-27 Thread Nathan Sidwell

On 02/27/2018 07:45 AM, Martin Liška wrote:

Hi.

Considering following C++ code snippet:



gcc/ChangeLog:

2018-02-26  Martin Liska  

PR gcov-profile/84548
* gcov.c (process_file): Allow partial overlap and consider it
also as group functions.
(output_lines): Properly calculate range of lines for a group.


Ok, but fix comment:

+ /* It's possible to have function that partially overlap,
pluralize 'function'
+thus take the maximum end_line of got functions.  */
'got functions'?

--
Nathan Sidwell


Re: [Aarch64] Fix conditional branches with target far away.

2018-02-27 Thread Ramana Radhakrishnan
On Wed, Feb 14, 2018 at 8:30 AM, Sameera Deshpande
 wrote:
> Hi!
>
> Please find attached the patch to fix a bug in branches with offsets over 1MiB.
> There has been an attempt to fix this issue in commit
> 050af05b9761f1979f11c151519e7244d5becd7c
>
> However, the far_branch attribute defined in the above patch used
> insn_length, which computes an incorrect offset. Hence, I eliminated the
> attribute completely and computed the offset from insn_addresses
> instead.
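
As a sketch of the range test involved (not the code from the patch; the function and macro names are hypothetical), an AArch64 conditional branch encodes a signed 19-bit word offset, so it reaches from -1MiB up to 4 bytes short of +1MiB relative to the branch. Given the two addresses, the check is:

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of B.cond: a signed 19-bit immediate scaled by 4 bytes.  */
#define COND_BRANCH_MIN_OFFSET (-1048576)
#define COND_BRANCH_MAX_OFFSET (1048572)

/* True if a conditional branch at BRANCH_ADDR can reach TARGET_ADDR
   directly.  Both addresses are assumed to be byte addresses of the
   kind recorded per instruction.  */
static bool
cond_branch_in_range (int64_t branch_addr, int64_t target_addr)
{
  int64_t offset = target_addr - branch_addr;
  return offset >= COND_BRANCH_MIN_OFFSET && offset <= COND_BRANCH_MAX_OFFSET;
}
```

The test depends only on the distance between the two addresses, not on the length of the branch instruction itself.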
>
> Ok for trunk?
>
> gcc/Changelog
>
> 2018-02-13 Sameera Deshpande 
> * config/aarch64/aarch64.md (far_branch): Remove attribute. Eliminate
> all the dependencies on the attribute from RTL patterns.
>

I'm not a maintainer, but this looks good to me modulo notes about how
this was tested. A testcase for the testsuite would be nice, as would
confirmation that the patch has been bootstrapped and regression
tested. AFAIR, the original patch was put in because match.pd failed
during bootstrap in another context.


regards
Ramana

> --
> - Thanks and regards,
>   Sameera D.


Re: PING: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER

2018-02-27 Thread H.J. Lu
On Mon, Feb 26, 2018 at 8:54 AM, Jan Hubicka  wrote:
>> On Mon, Feb 26, 2018 at 7:47 AM, Jan Hubicka  wrote:
>> > Hi,
>> > my main concern about the patch is that we now have
>> > -mindirect-branch=thunk-extern, which is intended to work well and is
>> > used by the kernel, but we also have other modes that are documented
>> > and as such should work, yet they may lead to invalid unwind info
>> > (or did I miss anything important here?).
>> > Why can't we fix the others as well?
>> >
>>
>> I took a closer look at my commit message.  It does leave the impression
>> that only -mindirect-branch=thunk-extern is fixed.  But in fact, all
>> -mindirect-branch= choices are fixed.
>
> I see, sorry for the confusion!
>>
>> 1.  -mindirect-branch=thunk generates:
>>
>> .cfi_startproc
>> pushq   %rbx
>> .cfi_def_cfa_offset 16
>> .cfi_offset 3, -16
>> movq    (%rdi), %rax
>> movq    %rdi, %rbx
>> movq    16(%rax), %rax
>> call    __x86_indirect_thunk_rax
>> movq    (%rbx), %rax
>> movq    %rbx, %rdi
>> popq    %rbx
>> .cfi_def_cfa_offset 8
>> movq    16(%rax), %rax
>> jmp     __x86_indirect_thunk_rax
>> .cfi_endproc
>>
>> 2.  -mindirect-branch=thunk-inline generates:
>>
>> .cfi_startproc
>> pushq   %rbx
>> .cfi_def_cfa_offset 16
>> .cfi_offset 3, -16
>> movq    (%rdi), %rax
>> movq    %rdi, %rbx
>> movq    16(%rax), %rax
>> jmp     .LIND1
>> .LIND0:
>> call    .LIND3
>> .LIND2:
>> pause
>> lfence
>> jmp     .LIND2
>> .LIND3:
>> mov     %rax, (%rsp)
>> ret
>> .LIND1:
>> call    .LIND0
>> movq    (%rbx), %rax
>> movq    %rbx, %rdi
>> popq    %rbx
>> .cfi_def_cfa_offset 8
>> movq    16(%rax), %rax
>> call    .LIND5
>> .LIND4:
>> pause
>> lfence
>> jmp     .LIND4
>> .LIND5:
>> mov     %rax, (%rsp)
>> ret
>> .cfi_endproc
>>
>> I updated the commit message with
>>
>> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect
>> branch via register whenever -mindirect-branch= is used.
>>
>> OK for trunk?
>>
>> Thanks.
>>
>> --
>> H.J.
>
>> From bd0672bd070da6fa4ff630540c1d87df3e8fdd53 Mon Sep 17 00:00:00 2001
>> From: "H.J. Lu" 
>> Date: Fri, 26 Jan 2018 15:54:25 -0800
>> Subject: [PATCH] i386: Add TARGET_INDIRECT_BRANCH_REGISTER
>>
>> For
>>
>> ---
>> struct C {
>>   virtual ~C();
>>   virtual void f();
>> };
>>
>> void
>> f (C *p)
>> {
>>   p->f();
>>   p->f();
>> }
>> ---
>>
>> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
>>
>> _Z1fP1C:
>> .LFB0:
>> .cfi_startproc
>> pushq   %rbx
>> .cfi_def_cfa_offset 16
>> .cfi_offset 3, -16
>> movq    (%rdi), %rax
>> movq    %rdi, %rbx
>> jmp     .LIND1
>> .LIND0:
>> pushq   16(%rax)
>> jmp     __x86_indirect_thunk
>> .LIND1:
>> call    .LIND0
>> movq    (%rbx), %rax
>> movq    %rbx, %rdi
>> popq    %rbx
>> .cfi_def_cfa_offset 8
>> movq    16(%rax), %rax
>> jmp     __x86_indirect_thunk_rax
>> .cfi_endproc
>>
>> x86-64 is supposed to have asynchronous unwind tables by default, but
>> there is nothing that reflects the change in the (relative) frame
>> address after .LIND0.  That region really has to be moved outside of
>> the .cfi_startproc/.cfi_endproc bracket.
>>
>> This patch adds TARGET_INDIRECT_BRANCH_REGISTER to force indirect
>> branch via register whenever -mindirect-branch= is used.  Now,
>> -mindirect-branch=thunk-extern -O2 on x86-64 GNU/Linux generates:
>>
>> _Z1fP1C:
>> .LFB0:
>>   .cfi_startproc
>>   pushq   %rbx
>>   .cfi_def_cfa_offset 16
>>   .cfi_offset 3, -16
>>   movq    (%rdi), %rax
>>   movq    %rdi, %rbx
>>   movq    16(%rax), %rax
>>   call    __x86_indirect_thunk_rax
>>   movq    (%rbx), %rax
>>   movq    %rbx, %rdi
>>   popq    %rbx
>>   .cfi_def_cfa_offset 8
>>   movq    16(%rax), %rax
>>   jmp     __x86_indirect_thunk_rax
>>   .cfi_endproc
>>
>> so that "-mindirect-branch=thunk-extern" is equivalent to
>> "-mindirect-branch=thunk-extern -mindirect-branch-register", which is
>> used by Linux kernel.
>>
>> gcc/
>>
>>   PR target/84039
>>   * config/i386/constraints.md (Bs): Replace
>>   ix86_indirect_branch_register with
>>   TARGET_INDIRECT_BRANCH_REGISTER.
>>   (Bw): Likewise.
>>   * config/i386/i386.md (indirect_jump): Likewise.
>>   (tablejump): Likewise.
>>   (*sibcall_memory): Likewise.
>>   (*sibcall_value_memory): Likewise.
>>   Peepholes of indirect call and jump via memory: Likewise.
>>   (*sibcall_GOT_32): Disallowed for TARGET_INDIRECT_BRANCH_REGISTER.
>>   (*sibcall_value_GOT_32): Likewise.

[PATCH] Make groups more generic (PR gcov-profile/84548).

2018-02-27 Thread Martin Liška
Hi.

Consider the following C++ code snippet:

struct A { static int foo () { return 1; }; static int bar () {
  int x;
  return 2; } };

We should treat functions foo and bar as two functions within a single group.
Even though these functions are not 'clones', they still overlap, and it's
proper to group them.

Doing that, we'll have:

2:1:struct A { static int foo () { return 1; }; static int bar () {
-:2:  int x;
1:3:  return 2; } };
--
_ZN1A3fooEv:
1:1:struct A { static int foo () { return 1; }; static int bar () {
--
_ZN1A3barEv:
1:1:struct A { static int foo () { return 1; }; static int bar () {
-:2:  int x;
1:3:  return 2; } };
--

which is the proper fix in my opinion.

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
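
The range calculation for such a group can be sketched on its own. This is a simplified model, not the gcov.c code: struct fn_range and group_end_line are hypothetical names, and only the line range of each function is kept.

```c
#include <stddef.h>

/* Hypothetical reduced view of a function: just its line range.  */
struct fn_range
{
  unsigned start_line;
  unsigned end_line;
};

/* Return the last line covered by any function in the group.
   Functions that share a start line may end on different lines, so
   the group ends at the maximum end_line.  Returns 0 for an empty
   group.  */
static unsigned
group_end_line (const struct fn_range *fns, size_t count)
{
  unsigned end = 0;
  for (size_t i = 0; i < count; i++)
    if (fns[i].end_line > end)
      end = fns[i].end_line;
  return end;
}
```

For the snippet above, foo spans line 1 only and bar spans lines 1 to 3, so the group ends at line 3 regardless of the order in which the two functions are listed.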

Ready to be installed?
Martin

gcc/ChangeLog:

2018-02-26  Martin Liska  

PR gcov-profile/84548
* gcov.c (process_file): Allow partial overlap and consider it
also as group functions.
(output_lines): Properly calculate range of lines for a group.

gcc/testsuite/ChangeLog:

2018-02-26  Martin Liska  

PR gcov-profile/84548
* g++.dg/gcov/pr84548.C: New test.
---
 gcc/gcov.c  |  9 +++--
 gcc/testsuite/g++.dg/gcov/pr84548.C | 19 +++
 2 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gcov/pr84548.C


diff --git a/gcc/gcov.c b/gcc/gcov.c
index 37f431c0e91..3d802a10870 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -1151,7 +1151,6 @@ process_file (const char *file_name)
 	function_info **slot = fn_map.get (needle);
 	if (slot)
 	  {
-	gcc_assert ((*slot)->end_line == (*it)->end_line);
 	(*slot)->is_group = 1;
 	(*it)->is_group = 1;
 	  }
@@ -2957,7 +2956,13 @@ output_lines (FILE *gcov_file, const source_info *src)
 	{
 	  fns = src->get_functions_at_location (line_num);
 	  if (fns.size () > 1)
-	line_start_group = fns[0]->end_line;
+	{
+	  /* It's possible to have function that partially overlap,
+		 thus take the maximum end_line of got functions.  */
+	  for (unsigned i = 0; i < fns.size (); i++)
+		if (fns[i]->end_line > line_start_group)
+		  line_start_group = fns[i]->end_line;
+	}
 	  else if (fns.size () == 1)
 	{
 	  function_info *fn = fns[0];
diff --git a/gcc/testsuite/g++.dg/gcov/pr84548.C b/gcc/testsuite/g++.dg/gcov/pr84548.C
new file mode 100644
index 000..6c22c1902f2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/pr84548.C
@@ -0,0 +1,19 @@
+// PR gcov-profile/84548
+// { dg-options "-fprofile-arcs -ftest-coverage" }
+// { dg-do run { target native } }
+// TODO: add support for groups to gcov.exp script
+
+struct A { static int foo () { return 1; }; static int bar () {
+  int x;
+  return 2; } };
+
+int main()
+{
+  int a = A::foo () + A::bar ();
+  if (a != 3)
+return 1;
+
+  return 0;
+}
+
+// { dg-final { run-gcov remove-gcda pr84548.C } }



[PATCH] Fix PR84512

2018-02-27 Thread Richard Biener

With Honza's change to make the x86 backend consider the actual operation
for costing vector stmts it becomes apparent that 
vect_compute_single_scalar_iteration_cost uses the old-style target
cost hook which doesn't get enough information to distinguish different
operations.  This means instead of actual scalar multiplication cost
we cost a general scalar-stmt cost for the testcase for the scalar
iteration but cost a vector multiplication for the vectorized body
resulting in an apples-to-oranges comparison in the end.
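
As a toy illustration of the mismatch (not the vectorizer code), the sketch
below compares a vector statement cost against a scalar cost taken either from
a generic per-statement hook or from an operation-aware one.  All names and
cost numbers are hypothetical and are not taken from any target cost table.

```cpp
#include <cassert>

// Hypothetical operation kinds, for illustration only.
enum class op_kind { add, mult };

// Old-style hook: every scalar statement gets the same generic cost.
int generic_scalar_stmt_cost () { return 1; }

// Operation-aware hooks: the cost depends on what the statement does.
int scalar_op_cost (op_kind op) { return op == op_kind::mult ? 3 : 1; }
int vector_op_cost (op_kind op) { return op == op_kind::mult ? 4 : 1; }

// Vectorization looks profitable when one vector statement is cheaper than
// the VF scalar statements it replaces.
bool profitable (int scalar_cost, int vector_cost, int vf)
{
  assert (vf > 0);
  return vector_cost < scalar_cost * vf;
}
```

With these made-up numbers and a vectorization factor of 2, pairing the generic
scalar cost with the real vector multiplication cost rejects the loop, while
costing both sides per operation accepts it.  With other numbers the skew can
go the other way; the point is only that both sides should be costed by the
same rules.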

Fixed as follows.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-02-27  Richard Biener  

PR tree-optimization/84512
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost):
Do not use the estimate returned from record_stmt_cost for
the scalar iteration cost but sum properly using add_stmt_cost.

* gcc.dg/tree-ssa/pr84512.c: New testcase.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c (revision 258030)
+++ gcc/tree-vect-loop.c (working copy)
@@ -1384,16 +1384,10 @@ vect_compute_single_scalar_iteration_cos
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes, factor, scalar_single_iter_cost = 0;
+  int nbbs = loop->num_nodes, factor;
   int innerloop_iters, i;
 
-  /* Count statements in scalar loop.  Using this as scalar cost for a single
- iteration for now.
-
- TODO: Add outer loop support.
-
- TODO: Consider assigning different costs to different scalar
- statements.  */
+  /* Gather costs for statements in the scalar loop.  */
 
   /* FORNOW.  */
   innerloop_iters = 1;
@@ -1437,13 +1431,28 @@ vect_compute_single_scalar_iteration_cos
   else
 kind = scalar_stmt;
 
- scalar_single_iter_cost
-   += record_stmt_cost (&LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
-factor, kind, stmt_info, 0, vect_prologue);
+ record_stmt_cost (&LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
+   factor, kind, stmt_info, 0, vect_prologue);
 }
 }
-  LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST (loop_vinfo)
-= scalar_single_iter_cost;
+
+  /* Now accumulate cost.  */
+  void *target_cost_data = init_cost (loop);
+  stmt_info_for_cost *si;
+  int j;
+  FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
+   j, si)
+{
+  struct _stmt_vec_info *stmt_info
+   = si->stmt ? vinfo_for_stmt (si->stmt) : NULL;
+  (void) add_stmt_cost (target_cost_data, si->count,
+   si->kind, stmt_info, si->misalign,
+   vect_body);
+}
+  unsigned dummy, body_cost = 0;
+  finish_cost (target_cost_data, &dummy, &body_cost, &dummy);
+  destroy_cost_data (target_cost_data);
+  LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST (loop_vinfo) = body_cost;
 }
 
 
Index: gcc/testsuite/gcc.dg/tree-ssa/pr84512.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr84512.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr84512.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+int foo()
+{
+  int a[10];
+  for(int i = 0; i < 10; ++i)
+a[i] = i*i;
+  int res = 0;
+  for(int i = 0; i < 10; ++i)
+res += a[i];
+  return res;
+}
+
+/* { dg-final { scan-tree-dump "return 285;" "optimized" } } */


Re: [PATCH] fix ICE in generic_overlap (PR 84526)

2018-02-27 Thread Richard Sandiford
Martin Sebor  writes:
> +  /* Convert the poly_int64 offset to to offset_int.  The offset
> + should be constant but be prepared for it not to be just in
> + case.  */

This comment seems redundant and could easily get out of date once
ACLE support is added.

> +  offset_int cstoff;
> +  if (bytepos.is_constant ())
>  {

> -  base = get_base_address (TREE_OPERAND (expr, 0));
> -  return;
> +  offrange[0] += cstoff;
> +  offrange[1] += cstoff;

Although this looks right, there's no actual need for cstoff to be
offset_int here, and I think the original HOST_WIDE_INT was more
efficient.

I realise you reuse the cstoff variable later with wi::to_offset,
but it doesn't seem necessary to use the same variable, since it's
logically a separate value.

Either way's fine, just thought I'd mention it :-)

Thanks,
Richard


Re: [Aarch64] Fix conditional branches with target far away.

2018-02-27 Thread Sameera Deshpande
On 14 February 2018 at 14:00, Sameera Deshpande
 wrote:
> Hi!
>
> Please find attached the patch to fix bug in branches with offsets over 1MiB.
> There has been an attempt to fix this issue in commit
> 050af05b9761f1979f11c151519e7244d5becd7c
>
> However, the far_branch attribute defined in above patch used
> insn_length - which computes incorrect offset. Hence, eliminated the
> attribute completely, and computed the offset from insn_addresses
> instead.
>
> Ok for trunk?
>
> gcc/Changelog
>
> 2018-02-13 Sameera Deshpande 
> * config/aarch64/aarch64.md (far_branch): Remove attribute. Eliminate
> all the dependencies on the attribute from RTL patterns.
>
> --
> - Thanks and regards,
>   Sameera D.


Gentle reminder!

-- 
- Thanks and regards,
  Sameera D.


Re: [v3 PATCH] Implement the missing bits of LWG 2769

2018-02-27 Thread Jonathan Wakely

On 27/02/18 10:13 +0200, Ville Voutilainen wrote:
>On 26 February 2018 at 22:52, Jonathan Wakely  wrote:
>> On 25/02/18 23:22 +0200, Ville Voutilainen wrote:
>>>
>>> Tested partially on Linux-x64, will test with the full suite on
>>> Linux-PPC64.
>>> Ok for trunk and the gcc-7 branch? This is theoretically a breaking change
>> This template argument should be aligned with "_ValueType" on the
>> previous line, not with "is_constructible".
>>
>> Looking at that file, I'm also wondering if we want the alias _AnyCast
>> to be defined at namespace scope. It's only used in a few function
>> bodies, and its name is a bit misleading.
>>
>> Could you just do:
>>
>>  using _Up = remove_cv_t<remove_reference_t<_ValueType>>;
>>
>> in the four functions that use it?
>>
>> Then I think the is_constructible specializations would fit on one line
>> anyway.
>
>Done, new patch attached.

OK for trunk and gcc-7-branch, thanks.




[PATCH] Fix aarch64_simd_reg_or_zero predicate (PR fortran/84565)

2018-02-27 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because the aarch64_cmeqdf instruction
starting with r256612 no longer accepts CONST0_RTX (E_DFmode) as
valid argument, but the expander generates it anyway.

The bug was introduced with the addition of the aarch64_simd_imm_zero
and aarch64_simd_or_scalar_imm_zero predicates.  Before that, the predicate
used to be:
(define_predicate "aarch64_simd_reg_or_zero"
  (and (match_code "reg,subreg,const_int,const_double,const_vector")
   (ior (match_operand 0 "register_operand")
   (ior (match_test "op == const0_rtx")
(match_test "aarch64_simd_imm_zero_p (op, mode)")
with
bool
aarch64_simd_imm_zero_p (rtx x, machine_mode mode)
{
  return x == CONST0_RTX (mode);
}
and so matched not just const,const_vector zeros, but also
const_int zero (that is through op == const0_rtx) and const_double
zero too.

Bootstrapped/regtested on aarch64-linux (scratch Fedora gcc 8 package build
with the patch applied), ok for trunk?

2018-02-27  Jakub Jelinek  

PR fortran/84565
* config/aarch64/predicates.md (aarch64_simd_reg_or_zero): Use
aarch64_simd_or_scalar_imm_zero rather than aarch64_simd_imm_zero.

* gfortran.dg/pr84565.f90: New test.

--- gcc/config/aarch64/predicates.md.jj 2018-02-06 13:13:06.305751221 +0100
+++ gcc/config/aarch64/predicates.md 2018-02-26 16:30:01.902195208 +0100
@@ -395,7 +395,7 @@ (define_predicate "aarch64_simd_reg_or_z
   (and (match_code "reg,subreg,const_int,const_double,const,const_vector")
(ior (match_operand 0 "register_operand")
(match_test "op == const0_rtx")
-   (match_operand 0 "aarch64_simd_imm_zero"
+   (match_operand 0 "aarch64_simd_or_scalar_imm_zero"
 
 (define_predicate "aarch64_simd_struct_operand"
   (and (match_code "mem")
--- gcc/testsuite/gfortran.dg/pr84565.f90.jj 2018-02-26 16:32:49.912271950 +0100
+++ gcc/testsuite/gfortran.dg/pr84565.f90 2018-02-26 16:31:15.423223943 +0100
@@ -0,0 +1,7 @@
+! PR fortran/84565
+! { dg-do compile { target aarch64*-*-* } }
+! { dg-options "-mlow-precision-sqrt -funsafe-math-optimizations" }
+subroutine mysqrt(a)
+ real(KIND=KIND(0.0D0)) :: a
+ a=sqrt(a)
+end subroutine

Jakub


[C++ PATCH] Fix ICE in cxx_eval_vec_init_1 (PR c++/84558)

2018-02-27 Thread Jakub Jelinek
Hi!

The PR70001 optimization in cxx_eval_vec_init_1 uses
initializer_constant_valid_p (eltinit, TREE_TYPE (eltinit))
which doesn't work if eltinit is NULL.  This can happen if
*non_constant_p is true, but ctx->quiet is true as well (for
*non_constant_p && !ctx->quiet we break the cycle early).
No initializer can be treated like a valid initializer and will
have the advantage that we don't repeat the body for every array element
and just do it once.  The caller will ignore the return value anyway
when *non_constant_p.
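
The shape of the fix can be sketched in isolation as follows.  This is not the
constexpr.c code: node, is_valid_constant and can_reuse are hypothetical
stand-ins for the tree, for initializer_constant_valid_p and for the reuse
condition, and the checker is assumed to dereference its argument.

```cpp
#include <cassert>

// Hypothetical stand-in for a tree node.
struct node { bool constant; };

// Hypothetical checker that dereferences its argument, so it must never be
// called with a null pointer.
bool is_valid_constant (const node *n)
{
  assert (n != nullptr);
  return n->constant;
}

// Decide whether the first element's initializer may be reused for all
// remaining elements.  A missing initializer is treated as reusable, and
// the null test short-circuits before the checker can dereference it.
bool can_reuse (const node *eltinit, bool reuse, unsigned max)
{
  return reuse && max > 1
         && (eltinit == nullptr || is_valid_constant (eltinit));
}
```

The order of the operands in the || matters: the null test has to come first so
that the checker is never reached with a null initializer.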

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-02-26  Jakub Jelinek  

PR c++/84558
* constexpr.c (cxx_eval_vec_init_1): For reuse, treat NULL eltinit like
a valid constant initializer.  Formatting fixes.

* g++.dg/cpp1y/pr84558.C: New test.

--- gcc/cp/constexpr.c.jj   2018-02-26 10:46:02.162316172 +0100
+++ gcc/cp/constexpr.c  2018-02-26 14:07:38.705532369 +0100
@@ -2959,9 +2959,8 @@ cxx_eval_vec_init_1 (const constexpr_ctx
  if (!lvalue_p (init))
eltinit = move (eltinit);
  eltinit = force_rvalue (eltinit, tf_warning_or_error);
- eltinit = (cxx_eval_constant_expression
-(&new_ctx, eltinit, lval,
- non_constant_p, overflow_p));
+ eltinit = cxx_eval_constant_expression (&new_ctx, eltinit, lval,
+ non_constant_p, overflow_p);
}
   if (*non_constant_p && !ctx->quiet)
break;
@@ -2974,12 +2973,13 @@ cxx_eval_vec_init_1 (const constexpr_ctx
   else
CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
   /* Reuse the result of cxx_eval_constant_expression call
- from the first iteration to all others if it is a constant
- initializer that doesn't require relocations.  */
+from the first iteration to all others if it is a constant
+initializer that doesn't require relocations.  */
   if (reuse
  && max > 1
- && (initializer_constant_valid_p (eltinit, TREE_TYPE (eltinit))
- == null_pointer_node))
+ && (eltinit == NULL_TREE
+ || (initializer_constant_valid_p (eltinit, TREE_TYPE (eltinit))
+ == null_pointer_node)))
{
  if (new_ctx.ctor != ctx->ctor)
eltinit = new_ctx.ctor;
--- gcc/testsuite/g++.dg/cpp1y/pr84558.C.jj 2018-02-26 14:33:56.575318295 +0100
+++ gcc/testsuite/g++.dg/cpp1y/pr84558.C 2018-02-26 14:33:33.142318527 +0100
@@ -0,0 +1,6 @@
+// PR c++/84558
+// { dg-do compile { target c++14 } }
+
+struct A { static int i; constexpr A () { i = 0; } };
+struct B { A a[2][3][4]; };
+B b;

Jakub


[Patch, fortran] PR84538 - [8 Regression] Array of derived type elements incorrectly accessed in function

2018-02-27 Thread Paul Richard Thomas
Hi All,

The attached fixes this PR by dint of the change in class.c. The
changes to trans-array.c are largely cosmetic but the move of the call
to 'build_class_array_ref' ensures that all class array references go
by this route.

Bootstrapped and regtested on FC27/x86_64 - OK to commit?

Regards

Paul

2018-02-27  Paul Thomas  

PR fortran/84538
* class.c (class_array_ref_detected): Remove the condition that
there be no reference after the array reference.
(find_intrinsic_vtab): Remove excess whitespace.
* trans-array.c (gfc_conv_scalarized_array_ref): Rename 'tmp'
as 'base' and call build_class_array_ref earlier.

2018-02-27  Paul Thomas  

PR fortran/84538
* gfortran.dg/pr84523.f90: New test.
Index: gcc/fortran/class.c
===
*** gcc/fortran/class.c (revision 257969)
--- gcc/fortran/class.c (working copy)
*** class_array_ref_detected (gfc_ref *ref,
*** 308,314 
*full_array = true;
}
else if (ref->next && ref->next->type == REF_ARRAY
-   && !ref->next->next
&& ref->type == REF_COMPONENT
&& ref->next->u.ar.type != AR_ELEMENT)
{
--- 308,313 
*** find_intrinsic_vtab (gfc_typespec *ts)
*** 2630,2636 
  {
char tname[GFC_MAX_SYMBOL_LEN+1];
char *name;
!   
/* Encode all types as TYPENAME_KIND_ including especially character
 arrays, whose length is now consistently stored in the _len component
 of the class-variable.  */
--- 2629,2635 
  {
char tname[GFC_MAX_SYMBOL_LEN+1];
char *name;
! 
/* Encode all types as TYPENAME_KIND_ including especially character
 arrays, whose length is now consistently stored in the _len component
 of the class-variable.  */
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c   (revision 257969)
--- gcc/fortran/trans-array.c   (working copy)
*** gfc_conv_scalarized_array_ref (gfc_se *
*** 3376,3382 
gfc_array_info *info;
tree decl = NULL_TREE;
tree index;
!   tree tmp;
gfc_ss *ss;
gfc_expr *expr;
int n;
--- 3376,3382 
gfc_array_info *info;
tree decl = NULL_TREE;
tree index;
!   tree base;
gfc_ss *ss;
gfc_expr *expr;
int n;
*** gfc_conv_scalarized_array_ref (gfc_se *
*** 3396,3401 
--- 3396,3408 
  index = fold_build2_loc (input_location, PLUS_EXPR, gfc_array_index_type,
 index, info->offset);
  
+   base = build_fold_indirect_ref_loc (input_location, info->data);
+ 
+   /* Use the vptr 'size' field to access a class the element of a class
+  array.  */
+   if (build_class_array_ref (se, base, index))
+ return;
+ 
if (expr && ((is_subref_array (expr)
&& GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (info->descriptor)))
   || (expr->ts.deferred && (expr->expr_type == EXPR_VARIABLE
*** gfc_conv_scalarized_array_ref (gfc_se *
*** 3420,3433 
decl = info->descriptor;
  }
  
!   tmp = build_fold_indirect_ref_loc (input_location, info->data);
! 
!   /* Use the vptr 'size' field to access a class the element of a class
!  array.  */
!   if (build_class_array_ref (se, tmp, index))
! return;
! 
!   se->expr = gfc_build_array_ref (tmp, index, decl);
  }
  
  
--- 3427,3433 
decl = info->descriptor;
  }
  
!   se->expr = gfc_build_array_ref (base, index, decl);
  }
  
  
Index: gcc/testsuite/gfortran.dg/class_array_23.f03
===
*** gcc/testsuite/gfortran.dg/class_array_23.f03 (nonexistent)
--- gcc/testsuite/gfortran.dg/class_array_23.f03 (working copy)
***
*** 0 
--- 1,37 
+ ! { dg-do run }
+ !
+ ! Test the fix for PR84538 in which the scalarizer was taking the size
+ ! of 't', rather than 'te', to generate array references.
+ !
+ ! Contributed by Andrew Benson  
+ !
+ module bugMod
+   public
+   type :: t
+  integer :: i
+   end type t
+   type, extends(t) :: te
+  integer :: j
+   end type te
+ contains
+   subroutine check(n)
+ implicit none
+ class(t), intent(inout), dimension(:) :: n
+ integer :: i(2)
+ i = n%i ! Original testcase had this in a write statement. However,
+ ! it is the scalarizer that is getting the span wrong and so
+ ! this assignment failed too.
+ if (any (i .ne. [8,3])) stop 1
+ return
+   end subroutine check
+ end module bugMod
+ 
+ program bug
+   use bugMod
+   class(t), allocatable, dimension(:) :: n
+   allocate(te :: n(2))
+   n(1:2)%i=[8,3]
+   if (any (n%i .ne. [8,3])) stop 2
+   call check(n)
+   deallocate (n)
+ end program bug


[PATCH] Fix debug for -mcall-ms2sysv-xlogues stubs fallout (PR target/83917)

2018-02-27 Thread Jakub Jelinek
On Mon, Feb 26, 2018 at 08:05:56PM -0600, Daniel Santos wrote:
> >>> --- libgcc/config/i386/cygwin.S.jj 2018-01-03 10:42:56.309763515 +0100
> >>> +++ libgcc/config/i386/cygwin.S   2018-02-22 15:30:34.597925496 +0100
> >>> @@ -23,31 +23,13 @@
> >>>   * .
> >>>   */
> >>>  
> >>> -#include "auto-host.h"
> >> The following include should be here.
> >>
> >> +#include "i386-asm.h"
> > I don't understand this.  i386-asm.h needs (both before my patch and after
> > it) both auto-host.h and auto-target.h, as it tests
> > HAVE_GAS_SECTIONS_DIRECTIVE (this one newly, comes from cygwin.S)
> 
> The problem is that HAVE_GAS_SECTIONS_DIRECTIVE gets defined (or not) in
> ../../gcc/auto-host.h, but you are testing it before including
> auto-host.h, either directly or via i386-asm.h.  So if i386-asm.h
> depends upon HAVE_GAS_SECTIONS_DIRECTIVE first being defined then it is
> a circular dependency.
> 
> In its current form, cygwin.S would never define USE_GAS_CFI_DIRECTIVES
> prior to including i386-asm.h and also never emit
>     .cfi_sections    .debug_frame
> and whether or not USE_GAS_CFI_DIRECTIVES ends up being defined to 1 or 0
> depends upon the test of __GCC_HAVE_DWARF2_CFI_ASM in i386-asm.h.

Ugh, you're right.  I was trying to preserve existing behavior for cygwin.S,
but failed to do so.  Unfortunately the patch which added this stuff from
Kai T. and Richard H. from 2010 is not in gcc-patches archives; in any case,
I think nothing seriously bad happens if with older gas versions which do
support .cfi_* directives but not .cfi_sections .debug_frame we emit the CFI
into .eh_frame section rather than .debug_frame.

So this patch simplifies it, with only one guard for the non-trivial
vs. trivial cfi_* definitions (based on whether GCC itself would use it)
and only guard the .cfi_sections directive on whether it is really
available.

The __GCC_HAVE_DWARF2_CFI_ASM definition actually sometimes depends on the
.cfi_sections presence too:
  /* If we can't get the assembler to emit only .debug_frame, and we don't need
 dwarf2 unwind info for exceptions, then emit .debug_frame by hand.  */
  if (!HAVE_GAS_CFI_SECTIONS_DIRECTIVE && !dwarf2out_do_eh_frame ())
return false;
but doesn't actually guarantee it always, as when doing .eh_frame it will
not require .cfi_sections.

This spot brings in another, preexisting bug in cygwin.S though - 
the HAVE_GAS_CFI_SECTIONS_DIRECTIVE macro is always defined, to 0 or 1,
rather than sometimes #define and sometimes #undef.

> Ultimately, the proper cleanup will be moving these tests out of
> {gcc,libgcc}/configure.ac and into .m4 files in the root config
> directory so that we don't uglify them with massive copy & pastes. 
> These tests are also fairly complex as there are a lot of dependencies. 
> m4 isn't my strong suite, but I can look at this after we're out of code
> freeze.

Not really sure about that, because we really want to do a different thing
in gcc/configure.ac (need to test the assembler directly, use
GCC_TARGET_TEMPLATE) while in libgcc it does usually something different.

The libgcc configure already has all the code for the .hidden directive,
as it uses it too, just it is only a pair of AC_SUBSTs rather than
AC_DEFINE_UNQUOTED.
The alternative test for HAVE_GAS_CFI_SECTIONS_DIRECTIVE can be to compile
int foo (int, char *);
int bar (int x) { char *y = __builtin_alloca (x); return foo (x + 1, y) + 1; }
with -g -fno-asynchronous-unwind-tables -fno-unwind-tables -fno-exceptions
and scan for .cfi_sections .debug_frame.

So here is a new (I've committed the previous patch since then), only lightly
tested (only on x86_64-linux and don't have too old binutils around), patch:

2018-02-27  Jakub Jelinek  

PR debug/83917
* configure.ac (AS_HIDDEN_DIRECTIVE): AC_DEFINE_UNQUOTED this to
$asm_hidden_op if visibility ("hidden") attribute works.
(HAVE_AS_CFI_SECTIONS): New AC_DEFINE.
* config/i386/i386-asm.h: Don't include auto-host.h.
(PACKAGE_VERSION, PACKAGE_NAME, PACKAGE_STRING, PACKAGE_TARNAME,
PACKAGE_URL): Don't undefine.
(USE_GAS_CFI_DIRECTIVES): Don't use nor define this macro, instead
guard cfi_startproc only on ifdef __GCC_HAVE_DWARF2_CFI_ASM.
(FN_HIDDEN): Change guard from #ifdef HAVE_GAS_HIDDEN to
#ifdef AS_HIDDEN_DIRECTIVE, use AS_HIDDEN_DIRECTIVE macro in the
definition instead of hardcoded .hidden.
* config/i386/cygwin.S: Include i386-asm.h first before .cfi_sections
directive.  Use #ifdef HAVE_AS_CFI_SECTIONS rather than
#ifdef HAVE_GAS_CFI_SECTIONS_DIRECTIVE to guard .cfi_sections.
(USE_GAS_CFI_DIRECTIVES): Don't define.
* configure: Regenerated.
* config.in: Likewise.

--- libgcc/configure.ac.jj  2017-11-20 11:02:56.877786488 +0100
+++ libgcc/configure.ac 2018-02-27 09:20:45.939385138 +0100
@@ -486,11 +486,29 @@ AC_CACHE_CHECK([for 

Re: [PATCH] [Microblaze]: PIC Data Text Relative

2018-02-27 Thread Andrew Sadek
Thanks Michael for your response.
I shall re-submit patches separately after re-running the whole GCC Test
suite and re-checking code conventions.
As for sending to gdb-patches, that was a mix-up on my side, as I actually
thought that list is also for binutils.

On Tue, Feb 27, 2018 at 2:07 AM, Michael Eager  wrote:

> On 02/25/2018 11:44 PM, Andrew Guirguis wrote:
>
>> Dears,
>>
>> Kindly find attached the patch bundle for Microblaze
>> '-mpic-data-text-relative' feature.
>>
>> Description of the feature in the following link:
>> https://github.com/andrewsadek/microblaze-pic-data-text-rel/blob/pic_data_text_rel/README.md
>>
>> Bundle includes:
>> 1) Change logs for GCC, binutils
>> 2) GCC Test results and comparison with the original.
>> 3) New Test case (picdtr.c)
>> 4) The Patches (against current heads)
>>
>
> Hi Andrew --
>
> Thanks for the submission.  I have the following recommendations:
>
> Submit each patch to the appropriate project mailing list.  Only submit
> the patch for the specific project, without patches for other projects.
>
> Include a description of the changes with each patch as well as the
> changelog.  Include the patch in your email or as an attachment.
>
> It isn't clear why you sent your submission to the gdb-patches mailing
> list, since there don't appear to be any GDB changes.  Conversely, it is
> not clear why you did not include the binutils mailing list, since you
> include a patch to that project.
>
> Be sure to follow GNU coding conventions.  Check brace placement,
> indent, maximum line length, if statements, etc.  I noticed a number
> of places where these conventions are not followed in your patches.
>
> GCC regression tests should include all tests (e.g., gcc.dg), not just the
> limited number of MicroBlaze-specific tests.
>
> --
> Michael Eagerea...@eagerm.com
> 1960 Park Blvd., Palo Alto, CA 94306
>



-- 

Andrew


Re: [v3 PATCH] Implement the missing bits of LWG 2769

2018-02-27 Thread Ville Voutilainen
On 26 February 2018 at 22:52, Jonathan Wakely  wrote:
> On 25/02/18 23:22 +0200, Ville Voutilainen wrote:
>>
>> Tested partially on Linux-x64, will test with the full suite on
>> Linux-PPC64.
>> Ok for trunk and the gcc-7 branch? This is theoretically a breaking change
> This template argument should be aligned with "_ValueType" on the
> previous line, not with "is_constructible".
>
> Looking at that file, I'm also wondering if we want the alias _AnyCast
> to be defined at namespace scope. It's only used in a few function
> bodies, and its name is a bit misleading.
>
> Could you just do:
>
>  using _Up = remove_cv_t<remove_reference_t<_ValueType>>;
>
> in the four functions that use it?
>
> Then I think the is_constructible specializations would fit on one line
> anyway.


Done, new patch attached.
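
For readers who want to see the constraints in user terms, here is a small
sketch using only the public std::any interface (C++17).  stored_t, copy_out
and move_out are hypothetical names made up for this illustration; the three
static_asserts mirror the is_constructible checks for the const lvalue, lvalue
and rvalue overloads, instantiated for std::string.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// The type the casts look up inside the any: ValueType with reference and
// cv-qualifiers removed.
template<typename ValueType>
  using stored_t = std::remove_cv_t<std::remove_reference_t<ValueType>>;

// Constraints for any_cast(const any&), any_cast(any&) and any_cast(any&&).
static_assert (std::is_constructible_v<std::string, const stored_t<std::string>&>);
static_assert (std::is_constructible_v<std::string&, stored_t<std::string&>&>);
static_assert (std::is_constructible_v<std::string, stored_t<std::string>>);

// Copy the contained string out of a const any.
std::string copy_out (const std::any& a)
{
  assert (a.has_value ());
  return std::any_cast<std::string> (a);
}

// Move the contained string out of an rvalue any.
std::string move_out (std::any&& a)
{
  assert (a.has_value ());
  return std::any_cast<std::string> (std::move (a));
}
```

A ValueType that cannot be constructed from the corresponding value category
of the stored type is now rejected by the static_assert at compile time.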
diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 466b7ca..a37eb38 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -438,8 +438,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return any(in_place_type<_Tp>, __il, std::forward<_Args>(__args)...);
 }
 
-  template 
-using _AnyCast = remove_cv_t>;
   /**
* @brief Access the contained object.
*
@@ -453,9 +451,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 inline _ValueType any_cast(const any& __any)
 {
+  using _Up = remove_cv_t<remove_reference_t<_ValueType>>;
   static_assert(any::__is_valid_cast<_ValueType>(),
  "Template argument must be a reference or CopyConstructible type");
-  auto __p = any_cast<_AnyCast<_ValueType>>(&__any);
+  static_assert(is_constructible_v<_ValueType, const _Up&>,
+ "Template argument must be constructible from a const value.");
+  auto __p = any_cast<_Up>(&__any);
   if (__p)
return static_cast<_ValueType>(*__p);
   __throw_bad_any_cast();
@@ -476,37 +477,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 inline _ValueType any_cast(any& __any)
 {
+  using _Up = remove_cv_t<remove_reference_t<_ValueType>>;
   static_assert(any::__is_valid_cast<_ValueType>(),
  "Template argument must be a reference or CopyConstructible type");
-  auto __p = any_cast<_AnyCast<_ValueType>>(&__any);
+  static_assert(is_constructible_v<_ValueType, _Up&>,
+ "Template argument must be constructible from an lvalue.");
+  auto __p = any_cast<_Up>(&__any);
   if (__p)
return static_cast<_ValueType>(*__p);
   __throw_bad_any_cast();
 }
 
-  template::value
-  || is_lvalue_reference<_ValueType>::value,
-  bool>::type = true>
-inline _ValueType any_cast(any&& __any)
-{
-  static_assert(any::__is_valid_cast<_ValueType>(),
- "Template argument must be a reference or CopyConstructible type");
-  auto __p = any_cast<_AnyCast<_ValueType>>(&__any);
-  if (__p)
-   return static_cast<_ValueType>(*__p);
-  __throw_bad_any_cast();
-}
-
-  template::value
-  && !is_lvalue_reference<_ValueType>::value,
-  bool>::type = false>
+  template
 inline _ValueType any_cast(any&& __any)
 {
+  using _Up = remove_cv_t<remove_reference_t<_ValueType>>;
   static_assert(any::__is_valid_cast<_ValueType>(),
  "Template argument must be a reference or CopyConstructible type");
-  auto __p = any_cast<_AnyCast<_ValueType>>(&__any);
+  static_assert(is_constructible_v<_ValueType, _Up>,
+ "Template argument must be constructible from an rvalue.");
+  auto __p = any_cast<_Up>(&__any);
   if (__p)
return static_cast<_ValueType>(std::move(*__p));
   __throw_bad_any_cast();
diff --git a/libstdc++-v3/testsuite/20_util/any/misc/any_cast.cc 
b/libstdc++-v3/testsuite/20_util/any/misc/any_cast.cc
index 45d8b63..37a24d7 100644
--- a/libstdc++-v3/testsuite/20_util/any/misc/any_cast.cc
+++ b/libstdc++-v3/testsuite/20_util/any/misc/any_cast.cc
@@ -95,15 +95,6 @@ void test03()
   VERIFY(move_count == 1);
   MoveEnabled&& m3 = any_cast(any(m));
   VERIFY(move_count == 1);
-  struct MoveDeleted
-  {
-MoveDeleted(MoveDeleted&&) = delete;
-MoveDeleted() = default;
-MoveDeleted(const MoveDeleted&) = default;
-  };
-  MoveDeleted md;
-  MoveDeleted&& md2 = any_cast(any(std::move(md)));
-  MoveDeleted&& md3 = any_cast(any(std::move(md)));
 }
 
 void test04()
diff --git a/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc 
b/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc
index 50a9a67..62d7aaa 100644
--- a/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc
@@ -20,11 +20,26 @@
 
 #include 
 
+using std::any;
+using std::any_cast;
+
 void test01()
 {
-  using std::any;
-  using std::any_cast;
-
   const any y(1);
-  any_cast(y); // {