[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-08-15 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #5 from Ian Lance Taylor  ---
It's hard to see how this could be a bug in the Go frontend.  At first glance
this looks like a problem with the split-stack support on S/390.

RE: Add TIGERLAKE and COOPERLAKE to GCC

2019-08-15 Thread Cui, Lili


> -Original Message-
> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> Sent: Friday, August 16, 2019 6:02 AM
> To: Jeff Law 
> Cc: Cui, Lili ; Uros Bizjak ; GCC
> Patches ; Zhang, Annita
> ; Xiao, Wei3 ; Liu, Hongtao
> ; Wang, Hongyu ;
> Castillo, Jason M 
> Subject: Re: Add TIGERLAKE and COOPERLAKE to GCC
> 
> On Wed, Aug 14, 2019 at 11:04 AM Jeff Law  wrote:
> >
> > On 8/14/19 1:38 AM, Cui, Lili wrote:
> > > Resend this mail for GCC Patches rejected my message, thanks.
> > >
> > > -Original Message-
> > >
> > > Hi Uros and all:
> > >
> > > This patch is about to add TIGERLAKE and COOPERLAKE to GCC.
> > > TIGERLAKE is based on ICELAKE_CLIENT and plus new ISA
> MOVEDIRI/MOVDIR64B/AVX512VP2INTERSECT.
> > > COOPERLAKE is based on CASCADELAKE and plus new ISA AVX512BF16.
> > >
> > > Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> > >
> > > Changelog:
> > > gcc/
> > >   * common/config/i386/i386-common.c
> > >   (processor_names): Add tigerlake and cooperlake.
> > >   (processor_alias_table): Add tigerlake and cooperlake.
> > >   * config.gcc: Add -march=tigerlake and cooperlake.
> > >   * config/i386/driver-i386.c
> > >(host_detect_local_cpu): Detect tigerlake and cooperlake.
> > >   * config/i386/i386-builtins.c
> > >   (processor_model) : Add M_INTEL_COREI7_TIGERLAKE and
> M_INTEL_COREI7_COOPERLAKE.
> > >   (arch_names_table): Add tigerlake and cooperlake.
> > >   (get_builtin_code_for_version) : Handle PROCESSOR_TIGERLAKE and
> PROCESSOR_COOPERLAKE.
> > >   * config/i386/i386-c.c
> > >   (ix86_target_macros_internal): Handle tigerlake and cooperlake.
> > >   (ix86_target_macros_internal): Handle
> OPTION_MASK_ISA_AVX512VP2INTERSECT.
> > >   * config/i386/i386-options.c
> > >   (m_TIGERLAKE)  : Define.
> > >   (m_COOPERLAKE) : Ditto.
> > >   (m_CORE_AVX512): Ditto.
> > >   (processor_cost_table): Add cascadelake.
> > >   (ix86_target_string)  : Handle -mavx512vp2intersect.
> > >   (ix86_valid_target_attribute_inner_p) : Handle avx512vp2intersect.
> > >   (ix86_option_override_internal): Hadle PTA_SHSTK, PTA_MOVDIRI,
> > >PTA_MOVDIR64B, PTA_AVX512VP2INTERSECT.
> > >   * config/i386/i386.h
> > >   (ix86_size_cost) : Define TARGET_TIGERLAKE and
> TARGET_COOPERLAKE.
> > >   (processor_type) : Add PROCESSOR_TIGERLAKE and
> PROCESSOR_COOPERLAKE.
> > >   (PTA_SHSTK) : Define.
> > >   (PTA_MOVDIRI): Ditto.
> > >   (PTA_MOVDIR64B): Ditto.
> > >   (PTA_COOPERLAKE) : Ditto.
> > >   (PTA_TIGERLAKE)  : Ditto.
> > >   (TARGET_AVX512VP2INTERSECT) : Ditto.
> > >   (TARGET_AVX512VP2INTERSECT_P(x)) : Ditto.
> > >   (processor_type) : Add PROCESSOR_TIGERLAKE and
> PROCESSOR_COOPERLAKE.
> > >   * doc/extend.texi: Add tigerlake and cooperlake.
> > >
> > > gcc/testsuite/
> > >   * gcc.target/i386/funcspec-56.inc: Handle new march.
> > >   * g++.target/i386/mv16.C: Handle new march
> > >
> > > libgcc/
> > >   * config/i386/cpuinfo.h: Add INTEL_COREI7_TIGERLAKE and
> INTEL_COREI7_COOPERLAKE.
> > >
> > ENOPATCH
> >
> > Note that HJ's reworking of the cost tables may require this patch to
> > change for the trunk.
> >
> 
> Yes, I have checked in my patch.  Please rebase.

Done, there is no conflict , thanks.


Lili.


0001-add-tigerlake-and-cooperlake-to-gcc.patch
Description: 0001-add-tigerlake-and-cooperlake-to-gcc.patch


[Bug ipa/91468] New: Suspicious codes in ipa-prop.c and ipa-cp.c

2019-08-15 Thread fxue at os dot amperecomputing.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91468

Bug ID: 91468
   Summary: Suspicious codes in ipa-prop.c and ipa-cp.c
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fxue at os dot amperecomputing.com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Some might be a bug, and some might be redundant.

ipa-prop.c:

In function ipcp_modif_dom_walker::before_dom_children(),

  vce = false;
  t = rhs;
  while (handled_component_p (t))
{
  /* V_C_E can do things like convert an array of integers to one
 bigger integer and similar things we do not handle below.  */
  if (TREE_CODE (rhs) == VIEW_CONVERT_EXPR)
{
  vce = true;
  break;
}
  t = TREE_OPERAND (t, 0);
}
  if (vce)
continue;

Should "rhs" in "if (TREE_CODE (rhs) == VIEW_CONVERT_EXPR)" be "t"?


In function update_jump_functions_after_inlining(),

  if (dst->type == IPA_JF_ANCESTOR)
{
  ..

  if (src->type == IPA_JF_PASS_THROUGH
  && src->value.pass_through.operation == NOP_EXPR)
{
   ..
}
  else if (src->type == IPA_JF_PASS_THROUGH
   && TREE_CODE_CLASS (src->value.pass_through.operation) ==
tcc_unary)
{
  dst->value.ancestor.formal_id =
src->value.pass_through.formal_id;
  dst->value.ancestor.agg_preserved = false;
}
  ..   
}

If we suppose pass_through operation is "negate_expr" (while it is not a
reasonable operation on pointer type), the code might be incorrect. It's better
to specify expected unary operations here.


In function compute_complex_assign_jump_func(),

case GIMPLE_UNARY_RHS:
  if (is_gimple_assign (stmt)
  && gimple_assign_rhs_class (stmt) == GIMPLE_UNARY_RHS
  && ! CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt)))
ipa_set_jf_unary_pass_through (jfunc, index,
   gimple_assign_rhs_code (stmt));

The condition "is_gimple_assign (stmt)
  && gimple_assign_rhs_class (stmt) == GIMPLE_UNARY_RHS" seems to
be redundant, might be omit.


ipa-cp.c:

In function merge_agg_lats_step(), 

  if (**aglat && (**aglat)->offset == offset)
{
  if ((**aglat)->size != val_size
  || ((**aglat)->next 
  && (**aglat)->next->offset < offset + val_size))
{
  set_agg_lats_to_bottom (dest_plats);
  return false;
}
  gcc_checking_assert (!(**aglat)->next
   || (**aglat)->next->offset >= offset + val_size);
  return true;
}

The condition "|| ((**aglat)->next && (**aglat)->next->offset < offset +
val_size))" seems to be always false, because the next item should not be
overlapped with its prev, this is what merge_agg_lats_step() tries to ensure.

[PATCH] PR target/91441 - Turn off -fsanitize=kernel-address if TARGET_ASAN_SHADOW_OFFSET is not implemented.

2019-08-15 Thread Kito Cheng
 - -fsanitize=kernel-address will call targetm.asan_shadow_offset ()
   at asan_shadow_offset, so it will crash if TARGET_ASAN_SHADOW_OFFSET
   is not implemented, that's mean -fsanitize=kernel-address is not
   supported for target without TARGET_ASAN_SHADOW_OFFSET implementation.

gcc/ChangeLog:

PR target/91441
* toplev.c (process_options): Check TARGET_ASAN_SHADOW_OFFSET is
implemented for -fsanitize=kernel-address, and merge check logic
with -fsanitize=address.

testsuite/ChangeLog:

PR target/91441
* gcc.target/riscv/pr91441.c: New.
---
 gcc/testsuite/gcc.target/riscv/pr91441.c | 10 ++
 gcc/toplev.c | 10 +-
 2 files changed, 11 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr91441.c

diff --git a/gcc/testsuite/gcc.target/riscv/pr91441.c 
b/gcc/testsuite/gcc.target/riscv/pr91441.c
new file mode 100644
index 000..593a2972a0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr91441.c
@@ -0,0 +1,10 @@
+/* PR target/91441 */
+/* { dg-do compile  } */
+/* { dg-options "--param asan-stack=1 -fsanitize=kernel-address" } */
+
+int *bar(int *);
+int *f( int a)
+{
+  return bar();
+}
+/* { dg-warning ".'-fsanitize=address' and '-fsanitize=kernel-address' are not 
supported for this target" "" { target *-*-* } 0 } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 7e0b9216dea..ddbb8b49436 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1744,7 +1744,7 @@ process_options (void)
   /* Address Sanitizer needs porting to each target architecture.  */
 
   if ((flag_sanitize & SANITIZE_ADDRESS)
-  && !FRAME_GROWS_DOWNWARD)
+  && (!FRAME_GROWS_DOWNWARD || targetm.asan_shadow_offset == NULL))
 {
   warning_at (UNKNOWN_LOCATION, 0,
  "%<-fsanitize=address%> and %<-fsanitize=kernel-address%> "
@@ -1752,14 +1752,6 @@ process_options (void)
   flag_sanitize &= ~SANITIZE_ADDRESS;
 }
 
-  if ((flag_sanitize & SANITIZE_USER_ADDRESS)
-  && targetm.asan_shadow_offset == NULL)
-{
-  warning_at (UNKNOWN_LOCATION, 0,
- "%<-fsanitize=address%> not supported for this target");
-  flag_sanitize &= ~SANITIZE_ADDRESS;
-}
-
  /* Do not use IPA optimizations for register allocation if profiler is active
 or patchable function entries are inserted for run-time instrumentation
 or port does not emit prologue and epilogue as RTL.  */
-- 
2.17.1



Re: PC-relative TLS support

2019-08-15 Thread Alan Modra
On Thu, Aug 15, 2019 at 01:24:07PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Aug 15, 2019 at 01:35:10PM +0930, Alan Modra wrote:
> > Supporting TLS for -mpcrel turns out to be relatively simple, in part
> > due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
> > assembler that I know of supporting prefix insns lacks TLS marker
> > support.
> 
> Will this stay that way?  (Or do we not care, not now anyway?)

I'd say we leave that problem to someone who wants pcrel without tls
markers.  It's not hard to do, just extend rs6000_output_tlsargs and
adjust IS_NOMARK_TLSGETADDR length attribute expressions.

> > Also, at some point powerpc gcc ought to remove
> > !TARGET_TLS_MARKERS generally and simplify all the occurrences of
> > IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.
> 
> The last time this came up (a year ago) the conclusion was that we first
> would have to remove AIX support.

Hmm, I wonder has that changed?  A quick look at the source says the
AIX TLS support uses completely different patterns and shouldn't care.

> (Changelog has whitespace damage, I guess that is just from how you
> mailed this?  Please fix when applying it).

Fixed.  (It wasn't the mailer..)

-- 
Alan Modra
Australia Development Lab, IBM


[Bug ipa/91089] IPA-cp does not setup proper cost model for switch default case in function versioning

2019-08-15 Thread fxue at os dot amperecomputing.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91089

--- Comment #2 from Feng Xue  ---
I've already created a patch under review. Please give some comments. Here it
is: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00937.html

[Bug ipa/91088] IPA-cp cost evaluation is too conservative for "if (f(param) cmp const_val)" condition

2019-08-15 Thread fxue at os dot amperecomputing.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91088

--- Comment #3 from Feng Xue  ---
I've already created a patch under review. Please give some comments. Here it
is: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00959.html

Successful builds of GCC 9.2 on Darwin

2019-08-15 Thread Iain Sandoe
Successful builds have been made on
i686-darwin{9,10), 
powerpc-darwin9
x86_64-darwin{10,11,12,13,14,15,16,17,18} 
bootstrapped with GCC (including Ada)

test results are from
https://gcc.gnu.org/ml/gcc-testresults/2019-08/msg01662.html
to 
https://gcc.gnu.org/ml/gcc-testresults/2019-08/msg01673.html

and x86_64-darwin18 bootstrapped with clang (no Ada

test results:
https://gcc.gnu.org/ml/gcc-testresults/2019-08/msg01674.html

NOTES

1. *-darwin8 is currently not working with any live branch or trunk
  there’s a hope to address this before 9.3

2. anything earlier than darwin8 is going to require considerable
  work to build current branches, since the native installed tools do
  not meet the minimum requirements for GCC build.

3. some of the build/tests were performed on VMs;
   these are noted, and test timeouts should be regarded as
  “not significant” there.

Iain



Proposal to patch libiberty.a for controlling whether pathnames on Windows are converted to lower case

2019-08-15 Thread Carroll, Paul

This is a proposed patch to libiberty, with an accompanying patch to GCC.
The purpose of this patch is to make it possible for Windows-hosted 
toolchains to have the ability to control whether Canonicalized 
filenames are converted to all lower-case.

Most Windows users are not affected by this behavior.
However, we have at least one customer who does use Windows systems 
where their hard drives are case-sensitive.  Thus, they desire this ability.


The first implementation of Windows support in libiberty/lrealpath.c was 
added back in 2004.  The new code included a call to GetFullPathName(), 
to Canonicalize the filename.  Next,  if there was sufficient room in 
the buffer, the following code was run:


    /* The file system is case-preserving but case-insensitive,
   Canonicalize to lowercase, using the codepage associated
   with the process locale.  */
    CharLowerBuff (buf, len);
    return strdup (buf);

In effect, the assumption of the code is that all Windows file systems 
will be case-preserving but case-insensitive, so converting a filename 
to lowercase is not a problem.  And tools would always find the 
resulting file on the file system.  That turns out not to be true, but 
lrealpath() was mostly used just for system header files, so no one noticed.


However, in April 2014, libcpp was patched, to cause even non-system 
headers on Windows systems to be Canonicalized.  This patch has caused 
problems for users that have case-sensitive file systems on their 
Windows systems.  A patch to libcpp was proposed to do additional 
Canonicalize non-system headers on Windows systems.
The discussion on the patch starts at 
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg9.html

As is noted in the comments:
  For DOS based file system, we always try to shorten non-system 
headers, as DOS has a tighter constraint on max path length.
The okay to add the patch was given May 9, 2014 at 
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00557.html , with the note:
  I've spoke with Kai a bit about this and he thinks it's 
appropriate and desirable to shorten paths on these kinds of filesystems.


The libcpp change meant that lrealpath() began to get called for 
non-system header files.  There is other code in both bfd and libiberty 
that can reach the lrealpath() function, but apparently those functions 
are not as much of a problem, as in not really being used in most tools 
(obviously since our customer had no complaints before 2014).


What I am proposing is to modify lrealpath() to only call CharLowerBuff 
when the user has a filesystem where changing the case of the filename 
is not an issue.

That is actually most users, so the default should be to call CharLowerBuff.
But I want to give the user the ability to control that behavior.

My proposal is to change that section of code in libiberty/lrealpath.c 
as follows:


 else
   {
-   /* The file system is case-preserving but case-insensitive,
-  Canonicalize to lowercase, using the codepage associated
+   /* The file system is normally case-preserving but case-insensitive,
+  Canonicalize to lowercase if desired, using the codepage 
associated

   with the process locale.  */
-    CharLowerBuff (buf, len);
+   if (libiberty_lowerpath)
+  CharLowerBuff (buf, len);
 return strdup (buf);
   }

In effect, this just adds a control to let the user decide whether or 
not a pathname is converted to lowercase on Windows systems.


I also added a global definition for that control at the start of the 
libiberty/lrealpath.c source, setting it so the default behavior on 
Windows is to convert pathnames to lowercase:


+/* External option to control whether a pathname is converted
+   to all lower-case on Windows systems.  The default behavior
+   is to convert.
+*/
+#if defined (_WIN32)
+#ifdef __cplusplus
+extern "C" {
+#endif
+unsigned char libiberty_lowerpath = 1;
+#ifdef __cplusplus
+}
+#endif
+#endif

And, for use by tools that link to libiberty.a, I added an external 
reference to that control in include/libiberty.h:


+#if defined (_WIN32)
+/* Determine if filenames should be converted to lower case */
+extern unsigned char libiberty_lowerpath;
+#endif


Adding the above code to include/libiberty.h and libiberty/lrealpath.c 
results in a libiberty.a that behaves exactly the same as it does today 
for most users.
It also makes available a method of affecting whether or not a given 
tool affects canonicalized pathnames on Windows by also converting those 
pathnames to lowercase.


The questions to discuss:
1.    Is this a reasonable solution to this problem, by adding such a 
controlling symbol?
2.    Is it reasonable to use ‘libiberty_lowerpath’ as the symbol’s 
name?  There are other global symbols defined in libiberty that use a 
similar name, so this seems reasonable.
3.    Should the symbol be called ‘libiberty_lowerpath’ or something 
else, such as 

Re: [PATCH], Patch #3 of 10, Add prefixed addressing support

2019-08-15 Thread Bill Schmidt
On 8/14/19 5:06 PM, Michael Meissner wrote:
> This patch adds prefixed memory support to all offsettable instructions.
>
> Unlike previous versions of the patch, this patch combines all of the
> modifications for addressing to one patch.  Previously, I had 3 separate
> patches (one for PADDI, one for scalar types, and one for vector types).
>
> 2019-08-14   Michael Meissner  
>
>   * config/rs6000/predicates.md (add_operand): Add support for the
>   PADDI instruction.
>   (non_add_cint_operand): Add support for the PADDI instruction.
>   (lwa_operand): Add support for the prefixed PLWA instruction.
>   * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
>   Only treat modes < 16 bytes as scalars.
>   (rs6000_debug_print_mode): Print whether the mode supports
>   prefixed addressing.
>   (setup_insn_form): Enable prefixed addressing for all modes whose
>   default instruction form includes offset addressing.
>   (num_insns_constant_gpr): Add support for the PADDI instruction.
>   (quad_address_p): Add support for prefixed addressing.
>   (mem_operand_gpr): Add support for prefixed addressing.
>   (mem_operand_ds_form): Add support for prefixed addressing.
>   (rs6000_legitimate_offset_address_p): Add support for prefixed
>   addressing.
>   (rs6000_legitimate_address_p): Add support for prefixed
>   addressing.
>   (rs6000_mode_dependent_address): Add support for prefixed
>   addressing.
>   (rs6000_rtx_costs): Make PADDI cost the same as ADDI or ADDIS.
>   * config/rs6000/rs6000.md (add3): Add support for PADDI.
>   (movsi_internal1): Add support for prefixed addressing, and using
>   PADDI to load up large integers.
>   (movsi splitter): Do not split up a PADDI instruction.
>   (mov_64bit_dm): Add support for prefixed addressing.
>   (movtd_64bit_nodm): Add support for prefixed addressing.
>   (movdi_internal64): Add support for prefixed addressing, and using
>   PADDI to load up large integers.
>   (movdi splitter): Update comment about PADDI.
>   (stack_protect_setdi): Add support for prefixed addressing.
>   (stack_protect_testdi): Add support for prefixed addressing.
>   * config/rs6000/vsx.md (vsx_mov_64bit): Add support for
>   prefixed addressing.
>   (vsx_extract___load): Add support for prefixed
>   addressing.
>   (vsx_extract___load): Add support for prefixed
>   addressing.
>
> Index: gcc/config/rs6000/predicates.md
> ===
> --- gcc/config/rs6000/predicates.md   (revision 274174)
> +++ gcc/config/rs6000/predicates.md   (working copy)
> @@ -839,7 +839,8 @@
>  (define_predicate "add_operand"
>(if_then_else (match_code "const_int")
>  (match_test "satisfies_constraint_I (op)
> -  || satisfies_constraint_L (op)")
> +  || satisfies_constraint_L (op)
> +  || satisfies_constraint_eI (op)")
>  (match_operand 0 "gpc_reg_operand")))
>
>  ;; Return 1 if the operand is either a non-special register, or 0, or -1.
> @@ -852,7 +853,8 @@
>  (define_predicate "non_add_cint_operand"
>(and (match_code "const_int")
> (match_test "!satisfies_constraint_I (op)
> - && !satisfies_constraint_L (op)")))
> + && !satisfies_constraint_L (op)
> + && !satisfies_constraint_eI (op)")))
>
>  ;; Return 1 if the operand is a constant that can be used as the operand
>  ;; of an AND, OR or XOR.
> @@ -933,6 +935,13 @@
>  return false;
>
>addr = XEXP (inner, 0);
> +
> +  /* The LWA instruction uses the DS-form format where the bottom two bits of
> + the offset must be 0.  The prefixed PLWA does not have this
> + restriction.  */
> +  if (prefixed_local_addr_p (addr, mode, INSN_FORM_DS))
> +return true;
> +
>if (GET_CODE (addr) == PRE_INC
>|| GET_CODE (addr) == PRE_DEC
>|| (GET_CODE (addr) == PRE_MODIFY
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 274175)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -1828,7 +1828,7 @@ rs6000_hard_regno_mode_ok_uncached (int regno, mac
>
>if (ALTIVEC_REGNO_P (regno))
>   {
> -   if (GET_MODE_SIZE (mode) != 16 && !reg_addr[mode].scalar_in_vmx_p)
> +   if (GET_MODE_SIZE (mode) < 16 && !reg_addr[mode].scalar_in_vmx_p)
>   return 0;

Unrelated change?  I don't quite understand why it was changed, either. 
Is this to do with vector_pair support?  If so, maybe it belongs with a
different patch?
>
> return ALTIVEC_REGNO_P (last_regno);
> @@ -2146,6 +2146,11 @@ rs6000_debug_print_mode (ssize_t m)
>rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
>rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
>
> +  if 

[Bug rtl-optimization/91347] [7/8/9/10 Regression] pointer_string in linux vsprintf.c is miscompiled when sibling calls are optimized

2019-08-15 Thread dave.anglin at bell dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91347

--- Comment #4 from dave.anglin at bell dot net ---
If DSE bug can't be fixed, the attached patch appears to work around the issue
by disabling
tail calls on pa when an argument is passed on stack.

Re: [PATCH], Patch #1 replacement (fix issues with future TLS patches)

2019-08-15 Thread Bill Schmidt
Hi Mike, just a couple points from me...

On 8/15/19 4:19 PM, Michael Meissner wrote:


> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 274172)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -369,8 +369,11 @@ struct rs6000_reg_addr {
>enum insn_code reload_fpr_gpr; /* INSN to move from FPR to GPR.  */
>enum insn_code reload_gpr_vsx; /* INSN to move from GPR to VSX.  */
>enum insn_code reload_vsx_gpr; /* INSN to move from VSX to GPR.  */
> +  enum insn_form default_insn_form;  /* Default format for offsets.  */
> +  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
>addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */
>bool scalar_in_vmx_p;  /* Scalar value can go in VMX.  
> */
> +  bool prefixed_memory_p;/* We can use prefixed memory.  */
>  };
>
>  static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES];
> @@ -2053,6 +2056,28 @@ rs6000_debug_vector_unit (enum rs6000_ve
>return ret;
>  }
>
> +/* Return a character that can be printed out to describe an instruction
> +   format.  */
> +
> +DEBUG_FUNCTION char
> +rs6000_debug_insn_form (enum insn_form iform)
> +{
> +  char ret;
> +
> +  switch (iform)
> +{
> +case INSN_FORM_UNKNOWN:  ret = '-'; break;
> +case INSN_FORM_D:ret = 'd'; break;
> +case INSN_FORM_DS:   ret = 's'; break;
> +case INSN_FORM_DQ:   ret = 'q'; break;
> +case INSN_FORM_X:ret = 'x'; break;
> +case INSN_FORM_PREFIXED: ret = 'p'; break;
> +default: ret = '?'; break;
> +}
> +
> +  return ret;
> +}
> +
>  /* Inner function printing just the address mask for a particular reload
> register class.  */
>  DEBUG_FUNCTION char *
> @@ -2115,6 +2140,12 @@ rs6000_debug_print_mode (ssize_t m)
>  fprintf (stderr, " %s: %s", reload_reg_map[rc].name,
>rs6000_debug_addr_mask (reg_addr[m].addr_mask[rc], true));
>
> +  fprintf (stderr, "  Format: %c:%c%c%c",
> +  rs6000_debug_insn_form (reg_addr[m].default_insn_form),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));
> +
>if ((reg_addr[m].reload_store != CODE_FOR_nothing)
>|| (reg_addr[m].reload_load != CODE_FOR_nothing))
>  {
> @@ -2668,6 +2699,153 @@ rs6000_setup_reg_addr_masks (void)
>  }
>  }
>
> +/* Set up the instruction format for each mode and register type from the
> +   addr_mask.  */
> +
> +static void
> +setup_insn_form (void)
> +{
> +  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)
> +{
> +  machine_mode scalar_mode = (machine_mode) m;
> +
> +  /* Convert complex and IBM double double/_Decimal128 into their scalar
> +  parts that the registers will be split into for doing load or
> +  store.  */
> +  if (COMPLEX_MODE_P (scalar_mode))
> + scalar_mode = GET_MODE_INNER (scalar_mode);
> +
> +  if (FLOAT128_2REG_P (scalar_mode))
> + scalar_mode = DFmode;
> +
> +  for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; 
> rc++)
> + {
> +   machine_mode single_reg_mode = scalar_mode;
> +   size_t msize = GET_MODE_SIZE (scalar_mode);
> +   addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
> +   enum insn_form iform = INSN_FORM_UNKNOWN;
> +
> +   /* Is the mode permitted in the GPR/FPR/Altivec registers?  */
> +   if ((addr_mask & RELOAD_REG_VALID) != 0)

To help with readability and maintainability, may I suggest factoring
the following into a separate function...
> + {
> +   /* The addr_mask does not have the offsettable or indexed bits
> +  set for modes that are split into multiple registers (like
> +  IFmode).  It doesn't need this set, since typically by time it
> +  is used in secondary reload, the modes are split into
> +  component parts.
> +
> +  The instruction format however can be used earlier in the
> +  compilation, so we need to setup what kind of instruction can
> +  be generated for the modes that are split.  */
> +   if ((addr_mask & (RELOAD_REG_MULTIPLE
> + | RELOAD_REG_OFFSET
> + | RELOAD_REG_INDEXED)) == RELOAD_REG_MULTIPLE)
> + {
> +   /* Multiple register types in GPRs depend on whether we can
> +  use DImode in a single register or SImode.  */
> +   if (rc == RELOAD_REG_GPR)
> + {
> +   if (TARGET_POWERPC64)
> + {
> +   gcc_assert ((msize % 8) == 0);
> +   single_reg_mode = DImode;
> +

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

2019-08-15 Thread Jason Merrill

On 8/6/19 5:51 AM, Richard Earnshaw (lists) wrote:

On 18/07/2019 18:18, James Greenhalgh wrote:

On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote:

Greetings,

This patch adds the intrinsic functions for:
- vld1__x4
- vst1__x4
- vld1q__x4
- vst1q__x4

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk? If yes, I don't have any commit rights, so can someone
please commit it on my behalf.


Hi,

I'm concerned by this strategy for implementing the arm_neon.h builtins:


+__extension__ extern __inline int8x8x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vld1_s8_x4 (const int8_t *__a)
+{
+  union { int8x8x4_t __i; __builtin_aarch64_simd_xi __o; } __au;
+  __au.__o
+    = __builtin_aarch64_ld1x4v8qi ((const __builtin_aarch64_simd_qi 
*) __a);

+  return __au.__i;
+}


As far as I know this is undefined behaviour in C++11. This was the best
resource I could find pointing to the relevant standards paragraphs.

   
https://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior 



That said, GCC explicitly allows it, so maybe this is fine?

   
https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gcc/Optimize-Options.html#Type-punning 



Can anyone from the languages side chime in on whether we're exposing
undefined behaviour (in either C or C++) here?


Yes, this is a GNU extension.  My only question is whether or not this 
can be disabled within GCC if you're trying to check for strict 
standards conformance of your code?


It's undefined behavior: doing something reasonable is a conformant 
interpretation of undefined behavior.


I don't imagine that ubsan checks for this case, but it's possible.


And if so, is there a way of making sure that this header still works in that 
case?


The well-defined solution is memcpy.  Or, in C++20, bit_cast (not 
implemented yet).


Jason


Re: C++ PATCH for c++/91264 - detect modifying const objects in constexpr

2019-08-15 Thread Jason Merrill

On 8/15/19 5:34 PM, Marek Polacek wrote:

On Wed, Aug 14, 2019 at 02:50:13PM -0400, Jason Merrill wrote:

On Thu, Aug 8, 2019 at 3:25 PM Marek Polacek  wrote:


On Thu, Aug 08, 2019 at 11:06:17AM -0400, Jason Merrill wrote:

On 8/6/19 3:20 PM, Marek Polacek wrote:

On Mon, Aug 05, 2019 at 03:54:19PM -0400, Jason Merrill wrote:

On 7/31/19 3:26 PM, Marek Polacek wrote:

One of the features of constexpr is that it doesn't allow UB; and such UB must
be detected at compile-time.  So running your code in a context that requires
a constant expression should ensure that the code in question is free of UB.
In effect, constexpr can serve as a sanitizer.  E.g. this article describes in
in more detail:


[dcl.type.cv]p4 says "Any attempt to modify a const object during its lifetime
results in undefined behavior." However, as the article above points out, we
aren't detecting that case in constexpr evaluation.

This patch fixes that.  It's not that easy, though, because we have to keep in
mind [class.ctor]p5:
"A constructor can be invoked for a const, volatile or const volatile object.
const and volatile semantics are not applied on an object under construction.
They come into effect when the constructor for the most derived object ends."

I handled this by keeping a hash set which tracks objects under construction.
I considered other options, such as going up call_stack, but that wouldn't
work with trivial constructor/op=.  It was also interesting to find out that
the definition of TREE_HAS_CONSTRUCTOR says "When appearing in a FIELD_DECL,
it means that this field has been duly initialized in its constructor" though
nowhere in the codebase do we set TREE_HAS_CONSTRUCTOR on a FIELD_DECL as far
as I can see.  Unfortunately, using this bit proved useless for my needs here.



Also, be mindful of mutable subobjects.

Does this approach look like an appropriate strategy for tracking objects'
construction?


For scalar objects, we should be able to rely on INIT_EXPR vs. MODIFY_EXPR
to distinguish between initialization and modification; for class objects, I


This is already true: only class object go into the hash set.


wonder about setting a flag on the CONSTRUCTOR after initialization is
complete to indicate that the value is now constant.


But here we're not dealing with CONSTRUCTORs in the gcc sense (i.e. exprs with
TREE_CODE == CONSTRUCTOR).  We have a CALL_EXPR like Y::Y ((struct Y *) ),
which initializes the object "y".  Setting a flag on the CALL_EXPR or its 
underlying
function decl wouldn't help.

Am I missing something?


I was thinking that where in your current patch you call
remove_object_under_construction, we could instead mark the object's value
CONSTRUCTOR as immutable.


Ah, what you meant was to look at DECL_INITIAL of the object we're
constructing, which could be a CONSTRUCTOR.  Unfortunately, this
DECL_INITIAL is null (in all the new tests when doing
remove_object_under_construction), so there's nothing to mark as TREE_READONLY 
:/.


There's a value in ctx->values, isn't there?


Doesn't seem to be the case for e.g.

struct A {
   int n;
   constexpr A() : n(1) { n = 2; }
};

struct B {
   const A a;
   constexpr B(bool b) {
 if (b)
   const_cast(a).n = 3; // { dg-error "modifying a const object" }
 }
};

constexpr B b(false);
static_assert(b.a.n == 2, "");

Here we're constructing "b", its ctx->values->get(new_obj) is initially
"{}".  In the middle of constructing "b", we construct "b.a", but that
has nothing in ctx->values.


Right, subobjects aren't in ctx->values.  In cxx_eval_call_expression we 
have


  if (DECL_CONSTRUCTOR_P (fun))
/* This can be null for a subobject constructor call, in 

   which case what we care about is the initialization 

   side-effects rather than the value.  We could get at the 

   value by evaluating *this, but we don't bother; there's 


   no need to put such a call in the hash table.  */
result = lval ? ctx->object : ctx->ctor;

Your patch already finds *this (b.a) and puts it in new_obj; if it's 
const we can evaluate it to get the CONSTRUCTOR to set TREE_READONLY on.


Jason


[Bug c++/89179] compiler error: in ggc_set_mark, at ggc-page.c:1532

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89179

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #15 from Marek Polacek  ---
Those 0xa5 come from poison_pages in ggc-page.c, this call in particular:
 memset (object, 0xa5, size);

[Bug c++/91467] New: [concepts] ICE: in tsubst_copy, at cp/pt.c:15545

2019-08-15 Thread frederik.engels24 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91467

Bug ID: 91467
   Summary: [concepts] ICE: in tsubst_copy, at cp/pt.c:15545
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: frederik.engels24 at gmail dot com
  Target Milestone: ---

Run with "g++ (Compiler-Explorer-Build) 10.0.0 20190814 (experimental)"
and options "-std=c++2a -fconcepts"

source code: https://godbolt.org/z/tOIbPZ
~~~
template
struct foo {
T t;

template
void set(U&& u) requires requires { { static_cast(u)} -> T; }
{
t = static_cast(u);
}
};

int main()
{
auto w = foo{5};
w.set(5.0f);
}
~~
resulting in
~~
: In function 'int main()':

:15:15: internal compiler error: in tsubst_copy, at cp/pt.c:15545

   15 | w.set(5.0f);

  |   ^

Please submit a full bug report,

with preprocessed source if appropriate.

See  for instructions.

Compiler returned: 1
~

Re: [PATCH], Patch #1 replacement (fix issues with future TLS patches)

2019-08-15 Thread Segher Boessenkool
Hi Mike,

On Thu, Aug 15, 2019 at 05:19:16PM -0400, Michael Meissner wrote:
> -;; Return true if the operand is a pc-relative address.
> +;; Return true if the operand is a pc-relative address to a local symbol.

The pcrel_addr_p comment says it is *not* just for local symbols.  So
which is it?

Having both something called "pcrel_address" and something called
"pcrel_addr_p" is confusing.  And "pcrel_addr_p" isn't a predicate at
all, so it should not be called that.  Or you can remove that "info"
argument, which is probably a good idea.

>  (define_predicate "pcrel_address"
>(match_code "label_ref,symbol_ref,const")
>  {
> +  return pcrel_addr_p (op, true, false, PCREL_NULL);
>  })

Please avoid boolean arguments altogether; it isn't clear at all what
they mean here.

Ah, they say only locals are allowed here.  So this RTL predicate
shouldn't be called "pcrel_address"; it should have "local" in the name
somewhere.

>  ;; Return 1 if op is a prefixed memory operand.
>  (define_predicate "prefixed_mem_operand"
>(match_code "mem")
>  {
> -  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
> +  return prefixed_local_addr_p (XEXP (op, 0), mode, INSN_FORM_UNKNOWN);
>  })

Similar issues with "local" here.

>  (define_predicate "pcrel_external_mem_operand"
>(match_code "mem")
>  {
> -  return pcrel_external_address (XEXP (op, 0), Pmode);
> +  return pcrel_addr_p (XEXP (op, 0), false, true, PCREL_NULL);
>  })

Why this change?

> +/* Pc-relative address broken into component parts by pcrel_addr_p.  */
> +typedef struct {
> +  rtx base_addr; /* SYMBOL_REF or LABEL_REF.  */
> +  HOST_WIDE_INT offset;  /* Offset from the base address.  */
> +  bool external_p;   /* Is the symbol external?  */
> +} pcrel_info_type;

Don't use typedefs please.

Don't call booleans xxx_p; xxx_p is a name used for a predicate, that
is, a pure (or "const" in GCC terms) function returning a boolean.

Don't name types "*_type".

> +#define PCREL_NULL ((pcrel_info_type *)0)

Please just use NULL where you use this.  (Or 0 as far as I care, but
that's not the GCC coding style :-) ).

> --- gcc/config/rs6000/rs6000.c(revision 274172)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -369,8 +369,11 @@ struct rs6000_reg_addr {
>enum insn_code reload_fpr_gpr; /* INSN to move from FPR to GPR.  */
>enum insn_code reload_gpr_vsx; /* INSN to move from GPR to VSX.  */
>enum insn_code reload_vsx_gpr; /* INSN to move from VSX to GPR.  */
> +  enum insn_form default_insn_form;  /* Default format for offsets.  */
> +  enum insn_form insn_form[(int)N_RELOAD_REG]; /* Register insn format.  */
>addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks.  */

Why the casts here?  Not all places use this cast, so why is it needed
here and not in all cases?

> +/* Return a character that can be printed out to describe an instruction
> +   format.  */
> +
> +DEBUG_FUNCTION char
> +rs6000_debug_insn_form (enum insn_form iform)
> +{
> +  char ret;
> +
> +  switch (iform)
> +{
> +case INSN_FORM_UNKNOWN:  ret = '-'; break;
> +case INSN_FORM_D:ret = 'd'; break;
> +case INSN_FORM_DS:   ret = 's'; break;
> +case INSN_FORM_DQ:   ret = 'q'; break;
> +case INSN_FORM_X:ret = 'x'; break;
> +case INSN_FORM_PREFIXED: ret = 'p'; break;
> +default: ret = '?'; break;
> +}
> +
> +  return ret;
> +}

This doesn't follow the coding style.

> +  fprintf (stderr, "  Format: %c:%c%c%c",
> +  rs6000_debug_insn_form (reg_addr[m].default_insn_form),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_GPR]),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_FPR]),
> +  rs6000_debug_insn_form (reg_addr[m].insn_form[RELOAD_REG_VMX]));

Is this useful?  For others I mean, not just for you.

> +/* Set up the instruction format for each mode and register type from the
> +   addr_mask.  */
> +
> +static void
> +setup_insn_form (void)
> +{
> +  for (ssize_t m = 0; m < NUM_MACHINE_MODES; ++m)

Why ssize_t?  Most places just use int.

> +{
> +  machine_mode scalar_mode = (machine_mode) m;
> +
> +  /* Convert complex and IBM double double/_Decimal128 into their scalar
> +  parts that the registers will be split into for doing load or
> +  store.  */
> +  if (COMPLEX_MODE_P (scalar_mode))
> + scalar_mode = GET_MODE_INNER (scalar_mode);

Do you also need to handle some vector modes here?

> +  if (FLOAT128_2REG_P (scalar_mode))
> + scalar_mode = DFmode;
> +
> +  for (ssize_t rc = FIRST_RELOAD_REG_CLASS; rc <= LAST_RELOAD_REG_CLASS; 
> rc++)

(overwide line)

> + {
> +   machine_mode single_reg_mode = scalar_mode;
> +   size_t msize = GET_MODE_SIZE (scalar_mode);
> +   addr_mask_type addr_mask = reg_addr[scalar_mode].addr_mask[rc];
> +   enum insn_form iform = INSN_FORM_UNKNOWN;
> +
> +   /* 

[Bug c++/89179] compiler error: in ggc_set_mark, at ggc-page.c:1532

2019-08-15 Thread sje at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89179

Steve Ellcey  changed:

   What|Removed |Added

 CC||sje at gcc dot gnu.org

--- Comment #14 from Steve Ellcey  ---
I think I may be seeing this same bug on aarch64 when building the RAJA library
based on where I am dying in ggc_set_mark.  I have not been able to create a
preprocessed test case because when I compile the preprocessed sources the bug
does not happen.  Here is my segfault dump:

min.cpp:191:1: internal compiler error: Segmentation fault
  191 | } 
  | ^ 
0xf03b5f crash_signal
../../gcc/gcc/toplev.c:326
0x9cc86c lookup_page_table_entry
../../gcc/gcc/ggc-page.c:632
0x9cc86c ggc_set_mark(void const*)
../../gcc/gcc/ggc-page.c:1531
0xc6fe47 gt_ggc_mx_symtab_node(void*)
/home/sellcey/gcc-raja/obj-gcc/gcc/gtype-desc.c:1302
0xe17503 gt_ggc_ma_order
./gt-passes.h:31
0xbe44f3 ggc_mark_root_tab
../../gcc/gcc/ggc-common.c:77
0xbe4813 ggc_mark_roots()
../../gcc/gcc/ggc-common.c:94
0x9cd1fb ggc_collect()
../../gcc/gcc/ggc-page.c:2201
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

When I look at gt_ggc_mx_symtab_node, the initial x_p pointer
that comes in is reasonable (0xa020c210) but after

xlimit = ((*xlimit).next);

The value of xlimit becomes 0xa5a5a5a5a5a5a5a5.  That looks
like a bogus value something might have put into memory
to poison it but I didn't see that specific string in
the GCC source tree anywhere.

[Bug c++/91466] New: [concepts] indicates "used in its own initializer" when not, constraint order change passes compilation.

2019-08-15 Thread frederik.engels24 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91466

Bug ID: 91466
   Summary: [concepts] indicates "used in its own initializer"
when not, constraint order change passes compilation.
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: frederik.engels24 at gmail dot com
  Target Milestone: ---

Run using "g++ (Compiler-Explorer-Build) 10.0.0 20190814 (experimental)" with
options "-std=c++2a -fconcepts"
and source code: https://godbolt.org/z/aGw2d0

~`
#include 
#include 

template
concept integral_constant_ = std::is_empty_v && std::is_trivial_v &&
requires
{
typename T::value_type;
requires std::is_integral_v;
{ T::value } -> typename T::value_type;
};

struct sz_fn
{
template requires requires(R&& r) { {static_cast(r).size()} ->
integral_constant_; }
constexpr auto operator()(R&& r) {
return static_cast(r).size();
}
};

constexpr auto sz = sz_fn{};


int main()
{
auto arr = std::array{1, 2, 3, 4, 5};

return !std::is_invocable_v;
}
~~~

results in:
~~~~
In file included from :1:

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits: In
substitution of 'template static
std::__result_of_success()((declval<_Args>)()...)),
std::__invoke_other> std::__result_of_other_impl::_S_test(int) [with _Fn =
const sz_fn; _Args = {std::array}]':

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:2563:55:
  required from 'struct std::__result_of_impl >'

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:2976:12:
  recursively required by substitution of 'template
struct std::__is_invocable_impl<_Result, _Ret, std::__void_t > [with _Result = std::__invoke_result >; _Ret = void]'

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:2976:12:
  required from 'struct std::is_invocable >'

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:3021:27:
  required from 'constexpr const bool std::is_invocable_v >'

:28:18:   required from here

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:2552:26:
error: the value of 'std::is_empty_v' is not usable in a
constant expression

 2552 |   std::declval<_Fn>()(std::declval<_Args>()...)

  |   ~~~^~

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:3103:25:
note: 'std::is_empty_v' used in its own initializer

 3103 |   inline constexpr bool is_empty_v = is_empty<_Tp>::value;

  | ^~

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:2552:26:
error: the value of 'std::is_trivial_v' is not usable in a
constant expression

 2552 |   std::declval<_Fn>()(std::declval<_Args>()...)

  |   ~~~^~

/opt/compiler-explorer/gcc-trunk-20190815/include/c++/10.0.0/type_traits:3092:25:
note: 'std::is_trivial_v' used in its own initializer

 3092 |   inline constexpr bool is_trivial_v = is_trivial<_Tp>::value;

  | ^~~~

Compiler returned: 1


Changing

template
concept integral_constant_ = std::is_empty_v && std::is_trivial_v &&
requires
{
typename T::value_type;
requires std::is_integral_v;
{ T::value } -> typename T::value_type;
};
~`
to
~
template
concept integral_constant_ =
requires
{
typename T::value_type;
requires std::is_integral_v;
{ T::value } -> typename T::value_type;
} && std::is_empty_v && std::is_trivial_v;
~~~

Allows the program to compile successfully.

[Bug c++/91465] [9/10 Regression] unexpected expression of kind overload (ICE)

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91465

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Keywords||ice-on-valid-code
   Last reconfirmed||2019-08-15
 CC||mpolacek at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot 
gnu.org
 Ever confirmed|0   |1
Summary|unexpected expression of|[9/10 Regression]
   |kind overload  (ICE)|unexpected expression of
   ||kind overload  (ICE)
   Target Milestone|--- |9.3

--- Comment #1 from Marek Polacek  ---
Started with r268321 thus mine.

[Bug c++/91465] New: unexpected expression of kind overload (ICE)

2019-08-15 Thread cuzdav at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91465

Bug ID: 91465
   Summary: unexpected expression of kind overload  (ICE)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cuzdav at gmail dot com
  Target Milestone: ---

This code results in an ICE:

In member function 'Destination Bar::baz(Source)':

internal compiler error: unexpected expression 'foo' of kind overload
  return Destination{foo(aLowValue)};
^

// See it LIVE!  https://godbolt.org/z/42c_Ll
enum class Destination  {};
enum class Source  {};

Destination foo(Source value) { return Destination{}; }
Destination foo(double value) { return Destination{}; }

template 
struct Bar {
Destination baz(Source aLowValue) {
return Destination{foo(aLowValue)}; //  HERE
}
};


Bug appears in 9.1 in c++17 (and up) in all higher g++ versions as well.  OK if
c++14 or lower, and OK if older g++.

Re: [patch, fortran] Fix PR 91443

2019-08-15 Thread Thomas Koenig

Hi Janne,


The patch itself looks Ok. One worry, are you introducing an
O(N**2)(?) algorithm (looping over all symbols for every symbol?), and
does this cause performance issues when compiling some gigantic F77
project?


This is a single pass over the code, so O(N) for the code size.
The lookup is O(log M), where M is the number of global symbols
in the file, so altogether, we're at O(N*log M), which should be OK.

Thanks for the review.

Committed as r274551.

I will add the documentation about the change in behavior later.

Regards

Thomas


Re: [PATCH 0/8] eBPF support for GCC

2019-08-15 Thread Jose E. Marchesi


Hi Richard.

> . Dynamic stack allocation (alloca and VLAs) is achieved by using what
>   otherwise would be a perfectly normal general register, %r9, as a
>   pseudo stack pointer.  This has the disadvantage of making the
>   register "fixed" and therefore not available for general register
>   allocation.  Hopefully there is a way to conditionalize this, since
>   both alloca and VLAs are relatively uncommon; I haven't found it
>   yet.

In principle it's possible to define register eliminations for
target-specific registers as well as the usual
FRAME/ARG_POINTER_REGNUM crowd.

Yeah, before I started using %r9 as a stack pointer, I was indeed
"eliminating" a fake stack pointer hard register to the frame register,
i.e. the opposite of what is usually done.

That seemed to work well, but as soon as __builtin_alloca and/or VLAs
were used, lra-eliminations would enter into an infinite loop: it didn't
like the stack pointer being eliminated.

So you could have a fake fixed register to represent the pseudo
stack pointer, then allow that to be "eliminated" to %r9 in
functions that need it.  Functions that don't need it can continue
(not) using the fake register and leave %r9 free for general use.

Interesting idea...  but wouldn't that require to have %r9 declared as a
fixed register, in functions that cfun->calls_alloca?

After reading your reply I investigated a bit, and found out that
CONDITIONAL_REGISTER_USAGE can indeed be called at pleasure, via
reinit_regs(). The i386 port calls reinit_regs in set_current_function,
for example.

So it should be possible to declare %r9 as fixed or non-fixed, in
bpf_set_current_function, depending on the value of
cfun->calls_alloca...


[Bug fortran/91443] -Wargument-mismatch does not catch mismatch for global procedure

2019-08-15 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91443

--- Comment #2 from Thomas Koenig  ---
Author: tkoenig
Date: Thu Aug 15 22:52:40 2019
New Revision: 274551

URL: https://gcc.gnu.org/viewcvs?rev=274551=gcc=rev
Log:
2019-08-15  Thomas Koenig  

PR fortran/91443
* frontend-passes.c (check_externals_expr): New function.
(check_externals_code): New function.
(gfc_check_externals): New function.
* gfortran.h (debug): Add prototypes for gfc_symbol * and
gfc_expr *.
(gfc_check_externals): Add prototype.
* interface.c (compare_actual_formal): Do not complain about
alternate returns if the formal argument is optional.
(gfc_procedure_use): Handle cases when an error has been issued
previously.  Break long line.
* parse.c (gfc_parse_file): Call gfc_check_externals for all
external procedures.
* resolve.c (resolve_global_procedure): Remove checking of
argument list.

2019-08-15  Thomas Koenig  

PR fortran/91443
* gfortran.dg/argument_checking_19.f90: New test.
* gfortran.dg/altreturn_10.f90: Change dg-warning to dg-error.
* gfortran.dg/dec_union_11.f90: Add -std=legacy.
* gfortran.dg/hollerith8.f90: Likewise. Remove warning for
Hollerith constant.
* gfortran.dg/integer_exponentiation_2.f90: New subroutine gee_i8;
use it to avoid type mismatches.
* gfortran.dg/pr41011.f: Add -std=legacy.
* gfortran.dg/whole_file_1.f90: Change warnings to errors.
* gfortran.dg/whole_file_2.f90: Likewise.


Added:
trunk/gcc/testsuite/gfortran.dg/argument_checking_19.f90
Modified:
trunk/gcc/fortran/ChangeLog
trunk/gcc/fortran/frontend-passes.c
trunk/gcc/fortran/gfortran.h
trunk/gcc/fortran/interface.c
trunk/gcc/fortran/parse.c
trunk/gcc/fortran/resolve.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gfortran.dg/altreturn_10.f90
trunk/gcc/testsuite/gfortran.dg/dec_union_11.f90
trunk/gcc/testsuite/gfortran.dg/hollerith8.f90
trunk/gcc/testsuite/gfortran.dg/integer_exponentiation_2.f90
trunk/gcc/testsuite/gfortran.dg/pr41011.f
trunk/gcc/testsuite/gfortran.dg/whole_file_1.f90
trunk/gcc/testsuite/gfortran.dg/whole_file_2.f90

gcc-7-20190815 is now available

2019-08-15 Thread gccadmin
Snapshot gcc-7-20190815 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/7-20190815/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-7-branch 
revision 274550

You'll find:

 gcc-7-20190815.tar.xzComplete GCC

  SHA256=934c86c020911e9fa63d8131dc75076252712ff541e396d1d29aa401e78017ff
  SHA1=f149bc4f1a4b20244262e5a994ce9e8a561d7b89

Diffs from 7-20190808 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-7
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug c++/91464] C++ extern "C" namespace A {int value;}; is definition not declaration

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91464

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-15
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
The "2nd scenario" is:

extern "C" namespace A
{
int value;
}

int getValue ()
{
return A::value;
}

G++ only declares A::value, it doesn't define it.

Re: Add TIGERLAKE and COOPERLAKE to GCC

2019-08-15 Thread H.J. Lu
On Wed, Aug 14, 2019 at 11:04 AM Jeff Law  wrote:
>
> On 8/14/19 1:38 AM, Cui, Lili wrote:
> > Resend this mail for GCC Patches rejected my message, thanks.
> >
> > -Original Message-
> >
> > Hi Uros and all:
> >
> > This patch is about to add TIGERLAKE and COOPERLAKE to GCC.
> > TIGERLAKE is based on ICELAKE_CLIENT and plus new ISA 
> > MOVEDIRI/MOVDIR64B/AVX512VP2INTERSECT.
> > COOPERLAKE is based on CASCADELAKE and plus new ISA AVX512BF16.
> >
> > Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> >
> > Changelog:
> > gcc/
> >   * common/config/i386/i386-common.c
> >   (processor_names): Add tigerlake and cooperlake.
> >   (processor_alias_table): Add tigerlake and cooperlake.
> >   * config.gcc: Add -march=tigerlake and cooperlake.
> >   * config/i386/driver-i386.c
> >(host_detect_local_cpu): Detect tigerlake and cooperlake.
> >   * config/i386/i386-builtins.c
> >   (processor_model) : Add M_INTEL_COREI7_TIGERLAKE and 
> > M_INTEL_COREI7_COOPERLAKE.
> >   (arch_names_table): Add tigerlake and cooperlake.
> >   (get_builtin_code_for_version) : Handle PROCESSOR_TIGERLAKE and 
> > PROCESSOR_COOPERLAKE.
> >   * config/i386/i386-c.c
> >   (ix86_target_macros_internal): Handle tigerlake and cooperlake.
> >   (ix86_target_macros_internal): Handle 
> > OPTION_MASK_ISA_AVX512VP2INTERSECT.
> >   * config/i386/i386-options.c
> >   (m_TIGERLAKE)  : Define.
> >   (m_COOPERLAKE) : Ditto.
> >   (m_CORE_AVX512): Ditto.
> >   (processor_cost_table): Add cascadelake.
> >   (ix86_target_string)  : Handle -mavx512vp2intersect.
> >   (ix86_valid_target_attribute_inner_p) : Handle avx512vp2intersect.
> >   (ix86_option_override_internal): Hadle PTA_SHSTK, PTA_MOVDIRI,
> >PTA_MOVDIR64B, PTA_AVX512VP2INTERSECT.
> >   * config/i386/i386.h
> >   (ix86_size_cost) : Define TARGET_TIGERLAKE and TARGET_COOPERLAKE.
> >   (processor_type) : Add PROCESSOR_TIGERLAKE and PROCESSOR_COOPERLAKE.
> >   (PTA_SHSTK) : Define.
> >   (PTA_MOVDIRI): Ditto.
> >   (PTA_MOVDIR64B): Ditto.
> >   (PTA_COOPERLAKE) : Ditto.
> >   (PTA_TIGERLAKE)  : Ditto.
> >   (TARGET_AVX512VP2INTERSECT) : Ditto.
> >   (TARGET_AVX512VP2INTERSECT_P(x)) : Ditto.
> >   (processor_type) : Add PROCESSOR_TIGERLAKE and PROCESSOR_COOPERLAKE.
> >   * doc/extend.texi: Add tigerlake and cooperlake.
> >
> > gcc/testsuite/
> >   * gcc.target/i386/funcspec-56.inc: Handle new march.
> >   * g++.target/i386/mv16.C: Handle new march
> >
> > libgcc/
> >   * config/i386/cpuinfo.h: Add INTEL_COREI7_TIGERLAKE and 
> > INTEL_COREI7_COOPERLAKE.
> >
> ENOPATCH
>
> Note that HJ's reworking of the cost tables may require this patch to
> change for the trunk.
>

Yes, I have checked in my patch.  Please rebase.

Thanks.

-- 
H.J.


[Bug c++/64372] [DR1560] Gratuitous lvalue-to-rvalue conversion in conditional-expression with throw-expression operand

2019-08-15 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64372

--- Comment #14 from Jason Merrill  ---
Author: jason
Date: Thu Aug 15 21:55:19 2019
New Revision: 274550

URL: https://gcc.gnu.org/viewcvs?rev=274550=gcc=rev
Log:
PR c++/90393 - ICE with thow in ?:

My previous patch for 64372 was incomplete: it only stopped making the
non-throw argument into an rvalue, lvalue_kind still considered the ?:
expression to be an rvalue, leaving us worse than before.

PR c++/64372, DR 1560 - Gratuitous lvalue-to-rvalue conversion in ?:
* tree.c (lvalue_kind): Handle throw in one arm.
* typeck.c (rationalize_conditional_expr): Likewise.
(cp_build_modify_expr): Likewise.

Added:
trunk/gcc/testsuite/g++.dg/expr/cond15.C
trunk/gcc/testsuite/g++.dg/expr/cond16.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/tree.c
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/g++.dg/abi/mangle53.C
trunk/gcc/testsuite/g++.old-deja/g++.eh/cond1.C
trunk/gcc/testsuite/g++.old-deja/g++.other/cond5.C

[Bug c++/90393] [9/10 Regression] ICE in return statement with a conditional operator, one of the second and third arguments is throw, and the other is a const variable of a class with a nontrivial co

2019-08-15 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90393

--- Comment #9 from Jason Merrill  ---
Author: jason
Date: Thu Aug 15 21:55:19 2019
New Revision: 274550

URL: https://gcc.gnu.org/viewcvs?rev=274550=gcc=rev
Log:
PR c++/90393 - ICE with thow in ?:

My previous patch for 64372 was incomplete: it only stopped making the
non-throw argument into an rvalue, lvalue_kind still considered the ?:
expression to be an rvalue, leaving us worse than before.

PR c++/64372, DR 1560 - Gratuitous lvalue-to-rvalue conversion in ?:
* tree.c (lvalue_kind): Handle throw in one arm.
* typeck.c (rationalize_conditional_expr): Likewise.
(cp_build_modify_expr): Likewise.

Added:
trunk/gcc/testsuite/g++.dg/expr/cond15.C
trunk/gcc/testsuite/g++.dg/expr/cond16.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/tree.c
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/g++.dg/abi/mangle53.C
trunk/gcc/testsuite/g++.old-deja/g++.eh/cond1.C
trunk/gcc/testsuite/g++.old-deja/g++.other/cond5.C

[Bug c++/91464] New: C++ extern "C" namespace A {int value;}; is definition not declaration

2019-08-15 Thread dje at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91464

Bug ID: 91464
   Summary: C++ extern "C" namespace A {int value;}; is definition
not declaration
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dje at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, jason at gcc dot gnu.org
  Target Milestone: ---

From Twitter

extern "C" namespace A
{
int value;
}

int getValue ()
{
return A::value;
}

Clang compiles correctly this first scenario:
https://godbolt.org/z/FjUGov

while GCC doesn't compile:
https://godbolt.org/z/Ohfsgi

Clang compiles correctly this 2nd scenario:
https://godbolt.org/z/LhzeOo

while GCC compiles it wrongly:
https://godbolt.org/z/Nsiy5_

[Bug middle-end/91463] missing -Warray-bounds accessing past the end of a statically initialized flexible array member

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91463

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
 Blocks||56456

--- Comment #1 from Martin Sebor  ---
This bug was never detected by GCC or other compilers (Clang or ICC).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds

[C++ PATCH] PR c++/90393 - ICE with thow in ?:

2019-08-15 Thread Jason Merrill
My previous patch for 64372 was incomplete: it only stopped making the
non-throw argument into an rvalue, lvalue_kind still considered the ?:
expression to be an rvalue, leaving us worse than before.

For GCC 9 I lean toward reverting the earlier patch rather than applying this
one and thus changing the meaning of well-formed code in the middle of a
release series.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/64372, DR 1560 - Gratuitous lvalue-to-rvalue conversion in ?:
* tree.c (lvalue_kind): Handle throw in one arm.
* typeck.c (rationalize_conditional_expr): Likewise.
(cp_build_modify_expr): Likewise.
---
 gcc/cp/tree.c| 21 
 gcc/cp/typeck.c  | 24 +++
 gcc/testsuite/g++.dg/abi/mangle53.C  |  5 ++--
 gcc/testsuite/g++.dg/expr/cond15.C   | 13 ++
 gcc/testsuite/g++.dg/expr/cond16.C   | 25 
 gcc/testsuite/g++.old-deja/g++.eh/cond1.C|  4 ++--
 gcc/testsuite/g++.old-deja/g++.other/cond5.C |  4 ++--
 gcc/cp/ChangeLog |  9 +++
 8 files changed, 85 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/expr/cond15.C
 create mode 100644 gcc/testsuite/g++.dg/expr/cond16.C

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index bca92100621..17a4df380c1 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -236,10 +236,23 @@ lvalue_kind (const_tree ref)
  gcc_assert (!type_dependent_expression_p (CONST_CAST_TREE (ref)));
  goto default_;
}
-  op1_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 1)
-   ? TREE_OPERAND (ref, 1)
-   : TREE_OPERAND (ref, 0));
-  op2_lvalue_kind = lvalue_kind (TREE_OPERAND (ref, 2));
+  {
+   tree op1 = TREE_OPERAND (ref, 1);
+   if (!op1) op1 = TREE_OPERAND (ref, 0);
+   tree op2 = TREE_OPERAND (ref, 2);
+   op1_lvalue_kind = lvalue_kind (op1);
+   op2_lvalue_kind = lvalue_kind (op2);
+   if (!op1_lvalue_kind != !op2_lvalue_kind)
+ {
+   /* The second or the third operand (but not both) is a
+  throw-expression; the result is of the type
+  and value category of the other.  */
+   if (op1_lvalue_kind && TREE_CODE (op2) == THROW_EXPR)
+ op2_lvalue_kind = op1_lvalue_kind;
+   else if (op2_lvalue_kind && TREE_CODE (op1) == THROW_EXPR)
+ op1_lvalue_kind = op2_lvalue_kind;
+ }
+  }
   break;
 
 case MODOP_EXPR:
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 4cc0ee0128d..e2a4f285a72 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -2308,13 +2308,15 @@ rationalize_conditional_expr (enum tree_code code, tree 
t,
 complain);
 }
 
+  tree op1 = TREE_OPERAND (t, 1);
+  if (TREE_CODE (op1) != THROW_EXPR)
+op1 = cp_build_unary_op (code, op1, false, complain);
+  tree op2 = TREE_OPERAND (t, 2);
+  if (TREE_CODE (op2) != THROW_EXPR)
+op2 = cp_build_unary_op (code, op2, false, complain);
+
   return
-build_conditional_expr (loc, TREE_OPERAND (t, 0),
-   cp_build_unary_op (code, TREE_OPERAND (t, 1), false,
-   complain),
-   cp_build_unary_op (code, TREE_OPERAND (t, 2), false,
-   complain),
-complain);
+build_conditional_expr (loc, TREE_OPERAND (t, 0), op1, op2, complain);
 }
 
 /* Given the TYPE of an anonymous union field inside T, return the
@@ -8160,8 +8162,9 @@ cp_build_modify_expr (location_t loc, tree lhs, enum 
tree_code modifycode,
if (!lvalue_or_else (lhs, lv_assign, complain))
  return error_mark_node;
 
-   tree op1 = cp_build_modify_expr (loc, TREE_OPERAND (lhs, 1),
-modifycode, rhs, complain);
+   tree op1 = TREE_OPERAND (lhs, 1);
+   if (TREE_CODE (op1) != THROW_EXPR)
+ op1 = cp_build_modify_expr (loc, op1, modifycode, rhs, complain);
/* When sanitizing undefined behavior, even when rhs doesn't need
   stabilization at this point, the sanitization might add extra
   SAVE_EXPRs in there and so make sure there is no tree sharing
@@ -8170,8 +8173,9 @@ cp_build_modify_expr (location_t loc, tree lhs, enum 
tree_code modifycode,
if (sanitize_flags_p (SANITIZE_UNDEFINED
  | SANITIZE_UNDEFINED_NONDEFAULT))
  rhs = unshare_expr (rhs);
-   tree op2 = cp_build_modify_expr (loc, TREE_OPERAND (lhs, 2),
-modifycode, rhs, complain);
+   tree op2 = TREE_OPERAND (lhs, 2);
+   if (TREE_CODE (op2) != THROW_EXPR)
+ op2 = cp_build_modify_expr (loc, op2, modifycode, rhs, complain);
tree cond = build_conditional_expr (input_location,
 

[Bug middle-end/91463] New: missing -Warray-bounds accessing past the end of a statically initialized flexible array member

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91463

Bug ID: 91463
   Summary: missing -Warray-bounds accessing past the end of a
statically initialized flexible array member
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

The past the end write to the flexible array member in f() is not diagnosed. 
DECL_SIZE of the array is null but for variables with a non-empty DECL_INITIAL
the size can be determined from the CONSTRUCTOR, so it should be possible to
detect the out-of-bounds index that way.

$ cat x.c && gcc -O2 -S -Wall -fdump-tree-vrp=/dev/stdout x.c
struct S { int n, a[]; };

struct S s = { 2, { 1, 0 } };

void f (void)
{
  s.a[666] = 0;   // missing -Warray-bounds
}

;; Function f (f, funcdef_no=0, decl_uid=1912, cgraph_uid=1, symbol_order=1)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }

Value ranges after VRP:



f ()
{
   [local count: 1073741824]:
  s.a[666] = 0;
  return;

}



;; Function f (f, funcdef_no=0, decl_uid=1912, cgraph_uid=1, symbol_order=1)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }

Value ranges after VRP:



f ()
{
   [local count: 1073741824]:
  s.a[666] = 0;
  return;

}

Re: C++ PATCH for c++/91264 - detect modifying const objects in constexpr

2019-08-15 Thread Marek Polacek
On Wed, Aug 14, 2019 at 02:50:13PM -0400, Jason Merrill wrote:
> On Thu, Aug 8, 2019 at 3:25 PM Marek Polacek  wrote:
> >
> > On Thu, Aug 08, 2019 at 11:06:17AM -0400, Jason Merrill wrote:
> > > On 8/6/19 3:20 PM, Marek Polacek wrote:
> > > > On Mon, Aug 05, 2019 at 03:54:19PM -0400, Jason Merrill wrote:
> > > > > On 7/31/19 3:26 PM, Marek Polacek wrote:
> > > > > > One of the features of constexpr is that it doesn't allow UB; and 
> > > > > > such UB must
> > > > > > be detected at compile-time.  So running your code in a context 
> > > > > > that requires
> > > > > > a constant expression should ensure that the code in question is 
> > > > > > free of UB.
> > > > > > In effect, constexpr can serve as a sanitizer.  E.g. this article 
> > > > > > describes in
> > > > > > in more detail:
> > > > > > 
> > > > > >
> > > > > > [dcl.type.cv]p4 says "Any attempt to modify a const object during 
> > > > > > its lifetime
> > > > > > results in undefined behavior." However, as the article above 
> > > > > > points out, we
> > > > > > aren't detecting that case in constexpr evaluation.
> > > > > >
> > > > > > This patch fixes that.  It's not that easy, though, because we have 
> > > > > > to keep in
> > > > > > mind [class.ctor]p5:
> > > > > > "A constructor can be invoked for a const, volatile or const 
> > > > > > volatile object.
> > > > > > const and volatile semantics are not applied on an object under 
> > > > > > construction.
> > > > > > They come into effect when the constructor for the most derived 
> > > > > > object ends."
> > > > > >
> > > > > > I handled this by keeping a hash set which tracks objects under 
> > > > > > construction.
> > > > > > I considered other options, such as going up call_stack, but that 
> > > > > > wouldn't
> > > > > > work with trivial constructor/op=.  It was also interesting to find 
> > > > > > out that
> > > > > > the definition of TREE_HAS_CONSTRUCTOR says "When appearing in a 
> > > > > > FIELD_DECL,
> > > > > > it means that this field has been duly initialized in its 
> > > > > > constructor" though
> > > > > > nowhere in the codebase do we set TREE_HAS_CONSTRUCTOR on a 
> > > > > > FIELD_DECL as far
> > > > > > as I can see.  Unfortunately, using this bit proved useless for my 
> > > > > > needs here.
> > > > >
> > > > > > Also, be mindful of mutable subobjects.
> > > > > >
> > > > > > Does this approach look like an appropriate strategy for tracking 
> > > > > > objects'
> > > > > > construction?
> > > > >
> > > > > For scalar objects, we should be able to rely on INIT_EXPR vs. 
> > > > > MODIFY_EXPR
> > > > > to distinguish between initialization and modification; for class 
> > > > > objects, I
> > > >
> > > > This is already true: only class object go into the hash set.
> > > >
> > > > > wonder about setting a flag on the CONSTRUCTOR after initialization is
> > > > > complete to indicate that the value is now constant.
> > > >
> > > > But here we're not dealing with CONSTRUCTORs in the gcc sense (i.e. 
> > > > exprs with
> > > > TREE_CODE == CONSTRUCTOR).  We have a CALL_EXPR like Y::Y ((struct Y *) 
> > > > ),
> > > > which initializes the object "y".  Setting a flag on the CALL_EXPR or 
> > > > its underlying
> > > > function decl wouldn't help.
> > > >
> > > > Am I missing something?
> > >
> > > I was thinking that where in your current patch you call
> > > remove_object_under_construction, we could instead mark the object's value
> > > CONSTRUCTOR as immutable.
> >
> > Ah, what you meant was to look at DECL_INITIAL of the object we're
> > constructing, which could be a CONSTRUCTOR.  Unfortunately, this
> > DECL_INITIAL is null (in all the new tests when doing
> > remove_object_under_construction), so there's nothing to mark as 
> > TREE_READONLY :/.
> 
> There's a value in ctx->values, isn't there?

Doesn't seem to be the case for e.g.

struct A {
  int n;
  constexpr A() : n(1) { n = 2; }
};

struct B {
  const A a;
  constexpr B(bool b) {
if (b)
  const_cast(a).n = 3; // { dg-error "modifying a const object" }
}
};

constexpr B b(false);
static_assert(b.a.n == 2, "");

Here we're constructing "b", its ctx->values->get(new_obj) is initially
"{}".  In the middle of constructing "b", we construct "b.a", but that
has nothing in ctx->values.

Marek


[Bug fortran/82992] ICE in create_int_parameter_array, at fortran/module.c:6586

2019-08-15 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82992

kargl at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P4
 CC||kargl at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |kargl at gcc dot gnu.org

--- Comment #6 from kargl at gcc dot gnu.org ---
Patch submitted.

[PATCH] PR fortran/82992 -- Check for conflicting symbols

2019-08-15 Thread Steve Kargl
The attached patch has be regression tested on x86_64-*-freebsd.

The testcase in the PR explains what the patch does.

% cat z1.f90
subroutine sub (x)
   use iso_fortran_env, only: x => character_kinds
end
%  gfcx -c a.f90
a.f90:1:17:

1 | subroutine sub (x)
  | 1
2 |use iso_fortran_env, only: x => character_kinds
  |   2
Error: Symbol 'x' at (1) conflicts with the rename symbol at (2)

OK to commit?

2019-08-15  Steven G. Kargl  

 PR fortran/82992
 * module.c (gfc_match_use):  When renaming a module entity, search
 current namespace for conflicting symbol.

2019-08-15  Steven G. Kargl  

 PR fortran/82992
 * gfortran.dg/pr71649.f90: Adjust error messages.
 * gfortran.dg/use_15.f90: Ditto.
 * gfortran.dg/use_rename_8.f90: Ditto.

-- 
Steve
Index: gcc/fortran/module.c
===
--- gcc/fortran/module.c	(revision 274495)
+++ gcc/fortran/module.c	(working copy)
@@ -525,6 +525,8 @@ gfc_match_use (void)
   gfc_intrinsic_op op;
   match m;
   gfc_use_list *use_list;
+  gfc_symtree *st;
+  locus loc;
 
   use_list = gfc_get_use_list ();
 
@@ -632,6 +634,8 @@ gfc_match_use (void)
 	case INTERFACE_USER_OP:
 	case INTERFACE_GENERIC:
 	case INTERFACE_DTIO:
+	  loc = gfc_current_locus;
+
 	  m = gfc_match (" =>");
 
 	  if (type == INTERFACE_USER_OP && m == MATCH_YES
@@ -641,6 +645,14 @@ gfc_match_use (void)
 
 	  if (type == INTERFACE_USER_OP)
 	new_use->op = INTRINSIC_USER;
+
+	  st = gfc_find_symtree (gfc_current_ns->sym_root, name);
+	  if (st)
+	{
+	  gfc_error ("Symbol %qs at %L conflicts with the rename symbol "
+			 "at %L", name, >n.sym->declared_at, );
+	  goto cleanup;
+	}
 
 	  if (use_list->only_flag)
 	{
Index: gcc/testsuite/gfortran.dg/pr71649.f90
===
--- gcc/testsuite/gfortran.dg/pr71649.f90	(revision 274495)
+++ gcc/testsuite/gfortran.dg/pr71649.f90	(working copy)
@@ -1,13 +1,13 @@
 ! { dg-do compile }
 ! PR71649 Internal Compiler Error
-SUBROUTINE Compiler_Options ( Options, Version, WriteOpt )
-   USE ISO_FORTRAN_ENV, ONLY : Compiler_Version, Compiler_Options ! { dg-error "already declared" }
+SUBROUTINE Compiler_Options ( Options, Version, WriteOpt )! { dg-error "\(1\)" }
+   USE ISO_FORTRAN_ENV, ONLY : Compiler_Version, Compiler_Options ! { dg-error "conflicts with the rename" }
IMPLICIT NONE
CHARACTER (LEN=*), INTENT(OUT) :: Options
CHARACTER (LEN=*), INTENT(OUT) :: Version
LOGICAL, INTENT(IN), OPTIONAL  :: WriteOpt
-   Version = Compiler_Version()
-   Options = Compiler_Options() ! { dg-error "Unexpected use of subroutine name" }
+   Version = Compiler_Version()  ! { dg-error "has no IMPLICIT type" }
+   Options = Compiler_Options()  ! { dg-error "Unexpected use of subroutine name" }
RETURN
 END SUBROUTINE Compiler_Options
 
Index: gcc/testsuite/gfortran.dg/use_15.f90
===
--- gcc/testsuite/gfortran.dg/use_15.f90	(revision 274495)
+++ gcc/testsuite/gfortran.dg/use_15.f90	(working copy)
@@ -28,8 +28,8 @@ subroutine my_sub2 (a)
 end subroutine
 
 
-subroutine my_sub3 (a)
-  use test_mod2, my_sub3 => my_sub2  ! { dg-error "is also the name of the current program unit" }
+subroutine my_sub3 (a)  ! { dg-error "\(1\)" }
+  use test_mod2, my_sub3 => my_sub2 ! { dg-error "conflicts with the rename" }
   real a
   print *, a
 end subroutine
Index: gcc/testsuite/gfortran.dg/use_rename_8.f90
===
--- gcc/testsuite/gfortran.dg/use_rename_8.f90	(revision 274495)
+++ gcc/testsuite/gfortran.dg/use_rename_8.f90	(working copy)
@@ -19,8 +19,8 @@ SUBROUTINE T
 USE MOO, ONLY: X => B
 END SUBROUTINE T
 
-SUBROUTINE C
-USE MOO, ONLY: C  ! { dg-error "is also the name of the current program unit" }
+SUBROUTINE C  ! { dg-error "\(1\)" }
+USE MOO, ONLY: C  ! { dg-error "conflicts with the rename" }
 END SUBROUTINE C
 
 SUBROUTINE D
@@ -36,15 +36,15 @@ SUBROUTINE F
 USE MOO, ONLY: X => F
 END SUBROUTINE F
 
-SUBROUTINE X
-USE MOO, ONLY: X => G ! { dg-error "is also the name of the current program unit" }
+SUBROUTINE X  ! { dg-error "\(1\)" }
+USE MOO, ONLY: X => G ! { dg-error "conflicts with the rename" }
 END SUBROUTINE X
 
-SUBROUTINE Y
-USE MOO, ONLY: Y => H ! { dg-error "is also the name of the current program unit" }
+SUBROUTINE Y  ! { dg-error "\(1\)" }
+USE MOO, ONLY: Y => H ! { dg-error "conflicts with the rename" }
 END SUBROUTINE Y
 
-SUBROUTINE Z
-USE MOO, ONLY: Z => I, Z => I ! { dg-error "is also the name of the current program unit" }
+SUBROUTINE Z! { dg-error "\(1\)" }
+USE MOO, ONLY: Z => I, Z => I   ! { dg-error "conflicts with the rename" }
 END SUBROUTINE Z
 


Re: Indirect memory addresses vs. lra

2019-08-15 Thread Segher Boessenkool
On Thu, Aug 15, 2019 at 02:30:19PM -0400, Vladimir Makarov wrote:
> >Couldn't we spill the frame pointer? Basically we should be able to 
> >compute the first address into a reg, spill that, do the second (both 
> >could require the frame pointer), spill the frame pointer, reload the 
> >first computed address from the stack, execute the insn and then reload 
> >the frame pointer.
> >
> >Maybe the frame pointer can also be implemented 'virually' in an index 
> >register that you keep updated so that sp + reg
> >Is the FP. Or frame accesses can use a
> >Stack slot as FP and the indirect memory
> >Addressing... (is there an indirect lea?)
> >
> Yes, it could be a solution.  It just needs some target maintainer 
> creativity.  There are a lot of things (tricks) can be done in 
> machine-dependent code which would not require RA changes.

You can even go as far as not having the hard frame pointer be a machine
register at all.  In RTL it will still be a reg, but that doesn't mean
the machine code you emit should be like that; you can use a special
fixed memory location for it, for example.


Segher


[PATCH], Patch #1 replacement (fix issues with future TLS patches)

2019-08-15 Thread Michael Meissner
After I submitted the patches, Aaron Sawdey tested the branch that has
the patches on it, along with Alan's TLS patches.  Alan's patch causes
the functions that determine if the insn is prefixed or not to be run
earlier that before.  The compiler was dying because the virtual arg
pointer and frame pointer registers weren't eliminated at that point,
and because I was only checking if the regno was between 0 and 31 for
GPRs.

I rewrote the test in reg_to_insn_form to use INT_REGNO_P macro (which
includes tests for arg pointer and frame pointer virtual registers).  I
used the other two macros (FP_REGNO_P and ALTIVEC_REGNO_P) for
consistency.

In addition, I removed the gcc_unreachable call if the register class
is not a GPR, FPR, or VMX register, and used the GPR defaults.  This is
in case the function gets called in the middle of reload where the
final moves are not done.

This patch replaces patch #1.  I have bootstrapped the compiler with
these changes and verified it fixed the problem Aaron was seeing.  Can
I check this into the FSF trunk?

2019-08-15   Michael Meissner  

* config/rs6000/predicates.md (pcrel_address): Rewrite to use
pcrel_addr_p.
(pcrel_external_address): Rewrite to use pcrel_addr_p.
(prefixed_mem_operand): Rewrite to use prefixed_local_addr_p.
(pcrel_external_mem_operand): Rewrite to use pcrel_addr_p.
* config/rs6000/rs6000-protos.h (reg_to_insn_form): New
declaration.
(pcrel_info_type): New declaration.
(PCREL_NULL): New macro.
(pcrel_addr_p): New declaration.
(rs6000_prefixed_address_mode_p): Delete.
* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add fields for
instruction format and prefixed memory support.
(rs6000_debug_insn_form): New debug function.
(rs6000_debug_print_mode): Print instruction formats.
(setup_insn_form): New function.
(rs6000_init_hard_regno_mode_ok): Call setup_insn_form.
(print_operand_address): Call pcrel_addr_p instead of
pcrel_address.  Add support for external pc-relative labels.
(mode_supports_prefixed_address_p): Delete.
(rs6000_prefixed_address_mode_p): Delete, replace with
prefixed_local_addr_p.
(prefixed_local_addr_p): Replace rs6000_prefixed_address_mode_p.
Add argument to specify the instruction format.
(pcrel_addr_p): New function.
(reg_to_insn_form): New function.
* config/rs6000/rs6000.md (enum insn_form): New enumeration.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 274172)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1626,32 +1626,11 @@ (define_predicate "small_toc_ref"
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
 
-;; Return true if the operand is a pc-relative address.
+;; Return true if the operand is a pc-relative address to a local symbol.
 (define_predicate "pcrel_address"
   (match_code "label_ref,symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-return false;
-
-  if (GET_CODE (op) == CONST)
-op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-{
-  rtx op0 = XEXP (op, 0);
-  rtx op1 = XEXP (op, 1);
-
-  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-   return false;
-
-  op = op0;
-}
-
-  if (LABEL_REF_P (op))
-return true;
-
-  return (SYMBOL_REF_P (op) && SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, true, false, PCREL_NULL);
 })
 
 ;; Return true if the operand is an external symbol whose address can be loaded
@@ -1665,32 +1644,14 @@ (define_predicate "pcrel_address"
 (define_predicate "pcrel_external_address"
   (match_code "symbol_ref,const")
 {
-  if (!rs6000_pcrel_p (cfun))
-return false;
-
-  if (GET_CODE (op) == CONST)
-op = XEXP (op, 0);
-
-  /* Validate offset.  */
-  if (GET_CODE (op) == PLUS)
-{
-  rtx op0 = XEXP (op, 0);
-  rtx op1 = XEXP (op, 1);
-
-  if (!CONST_INT_P (op1) || !SIGNED_34BIT_OFFSET_P (INTVAL (op1)))
-   return false;
-
-  op = op0;
-}
-
-  return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
+  return pcrel_addr_p (op, false, true, PCREL_NULL);
 })
 
 ;; Return 1 if op is a prefixed memory operand.
 (define_predicate "prefixed_mem_operand"
   (match_code "mem")
 {
-  return rs6000_prefixed_address_mode_p (XEXP (op, 0), GET_MODE (op));
+  return prefixed_local_addr_p (XEXP (op, 0), mode, INSN_FORM_UNKNOWN);
 })
 
 ;; Return 1 if op is a memory operand to an external variable when we
@@ -1699,7 +1660,7 @@ (define_predicate "prefixed_mem_operand"
 (define_predicate "pcrel_external_mem_operand"
   (match_code "mem")
 {
-  return pcrel_external_address (XEXP (op, 0), Pmode);
+  return pcrel_addr_p (XEXP (op, 0), false, true, PCREL_NULL);
 })
 
 ;; Match the first insn (addis) in fusing the combination of addis 

[Bug middle-end/91462] [8/9/10 Regression] missing -Warray-bounds writing to an empty flexible array member in a ctor

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91462

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
 Blocks||56456
Summary|missing -Warray-bounds  |[8/9/10 Regression] missing
   |writing to an empty |-Warray-bounds writing to
   |flexible array member in a  |an empty flexible array
   |ctor|member in a ctor

--- Comment #1 from Martin Sebor  ---
GCC 5 diagnoses both: https://gcc.godbolt.org/z/dW2U6d.  Neither Clang nor GCC
diagnoses either.

The regression was introduced in r158058 committed to fix PR
tree-optimization/43270.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds

[Bug middle-end/91462] New: missing -Warray-bounds writing to an empty flexible array member in a ctor

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91462

Bug ID: 91462
   Summary: missing -Warray-bounds writing to an empty flexible
array member in a ctor
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: msebor at gcc dot gnu.org
  Target Milestone: ---

In the test case below, the write to the empty flexible array member in Ax
isn't diagnosed even though the equivalent write in A0 is.

$ cat t.C && gcc -O2 -S -Wall -fdump-tree-optimized=/dev/stdout t.C
struct A0
{
  char n, a[0];

  A0 () { a[0] = 0; }   // -Warray-bounds (good)
};

void f (void*);

void g ()
{
  struct A0 a;
  f ();
}

struct Ax
{ 
  char n, a[];

  Ax () { a[0] = 0; }   // missing -Warray-bounds
};

void h ()
{
  struct Ax a;
  f ();
}

t.C: In function ‘void g()’:
t.C:5:14: warning: array subscript 0 is above array bounds of ‘char [0]’
[-Warray-bounds]
5 |   A0 () { a[0] = 0; }
  |   ~~~^

;; Function g (_Z1gv, funcdef_no=3, decl_uid=2317, cgraph_uid=4,
symbol_order=3)

g ()
{
  struct A0 a;

   [local count: 1073741824]:
  a ={v} {CLOBBER};
  a.a[0] = 0;
  f ();
  a ={v} {CLOBBER};
  return;

}



;; Function h (_Z1hv, funcdef_no=7, decl_uid=2362, cgraph_uid=8,
symbol_order=7)

h ()
{
  struct Ax a;

   [local count: 1073741824]:
  a ={v} {CLOBBER};
  a.a[0] = 0;
  f ();
  a ={v} {CLOBBER};
  return;

}

[PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)

2019-08-15 Thread Bernd Edlinger
On 8/15/19 6:29 PM, Richard Biener wrote:
>>>
>>> Please split it into the parts for the PR and parts making the
>>> asserts not trigger.
>>>
>>
>> Yes, will do.
>>

Okay, here is the rest of the PR 89544 fix,
actually just an optimization, making the larger stack alignment
known to the middle-end, and the test cases.


Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
Is it OK for trunk?


Thanks
Bernd.
2019-08-15  Bernd Edlinger  

	PR middle-end/89544
	* function.c (assign_parm_find_stack_rtl): Use larger alignment
	when possible.

testsuite:
2019-08-15  Bernd Edlinger  

	PR middle-end/89544
	* gcc.target/arm/unaligned-argument-1.c: New test.
	* gcc.target/arm/unaligned-argument-2.c: New test.

Index: gcc/function.c
===
--- gcc/function.c	(Revision 274531)
+++ gcc/function.c	(Arbeitskopie)
@@ -2697,8 +2697,23 @@ assign_parm_find_stack_rtl (tree parm, struct assi
  intentionally forcing upward padding.  Otherwise we have to come
  up with a guess at the alignment based on OFFSET_RTX.  */
   poly_int64 offset;
-  if (data->locate.where_pad != PAD_DOWNWARD || data->entry_parm)
+  if (data->locate.where_pad == PAD_NONE || data->entry_parm)
 align = boundary;
+  else if (data->locate.where_pad == PAD_UPWARD)
+{
+  align = boundary;
+  /* If the argument offset is actually more aligned than the nominal
+	 stack slot boundary, take advantage of that excess alignment.
+	 Don't make any assumptions if STACK_POINTER_OFFSET is in use.  */
+  if (poly_int_rtx_p (offset_rtx, )
+	  && STACK_POINTER_OFFSET == 0)
+	{
+	  unsigned int offset_align = known_alignment (offset) * BITS_PER_UNIT;
+	  if (offset_align == 0 || offset_align > STACK_BOUNDARY)
+	offset_align = STACK_BOUNDARY;
+	  align = MAX (align, offset_align);
+	}
+}
   else if (poly_int_rtx_p (offset_rtx, ))
 {
   align = least_bit_hwi (boundary);
Index: gcc/testsuite/gcc.target/arm/unaligned-argument-1.c
===
--- gcc/testsuite/gcc.target/arm/unaligned-argument-1.c	(Revision 0)
+++ gcc/testsuite/gcc.target/arm/unaligned-argument-1.c	(Arbeitskopie)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_arm_ok } */
+/* { dg-require-effective-target arm_ldrd_strd_ok } */
+/* { dg-options "-marm -mno-unaligned-access -O3" } */
+
+struct s {
+  int a, b;
+} __attribute__((aligned(8)));
+
+struct s f0;
+
+void f(int a, int b, int c, int d, struct s f)
+{
+  f0 = f;
+}
+
+/* { dg-final { scan-assembler-times "ldrd" 1 } } */
+/* { dg-final { scan-assembler-times "strd" 1 } } */
+/* { dg-final { scan-assembler-times "stm" 0 } } */
Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c
===
--- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c	(Revision 0)
+++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c	(Arbeitskopie)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_arm_ok } */
+/* { dg-require-effective-target arm_ldrd_strd_ok } */
+/* { dg-options "-marm -mno-unaligned-access -O3" } */
+
+struct s {
+  int a, b;
+} __attribute__((aligned(8)));
+
+struct s f0;
+
+void f(int a, int b, int c, int d, int e, struct s f)
+{
+  f0 = f;
+}
+
+/* { dg-final { scan-assembler-times "ldrd" 0 } } */
+/* { dg-final { scan-assembler-times "strd" 0 } } */
+/* { dg-final { scan-assembler-times "stm" 1 } } */


[PATCH] Sanitizing the middle-end interface to the back-end for strict alignment

2019-08-15 Thread Bernd Edlinger
Hi,

this is the split out part from the "Fix not 8-byte aligned ldrd/strd on ARMv5 
(PR 89544)"
which is sanitizing the middle-end interface to the back-end for strict 
alignment,
and a couple of bug-fixes that are necessary to survive boot-strap.
It is intended to be applied after the PR 89544 fix.

I think it would be possible to change the default implementation of 
STACK_SLOT_ALIGNMENT
to make all stack variables always naturally aligned instead of doing that only
in assign_parm_setup_stack, but would still like to avoid changing too many 
things
that do not seem to have a problem.  Since this would affect many targets, and 
more
kinds of variables that may probably not have a strict alignment problem.
But I am ready to take your advice though.


Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf
Is it OK for trunk?


Thanks
Bernd.

2019-08-15  Bernd Edlinger  
	Richard Biener  

	* expr.c (expand_assignment): Handle misaligned DECLs.
	(expand_expr_real_1): Handle FUNCTION_DECL as unaligned.
	* function.c (assign_parm_adjust_stack_rtl): Check movmisalign optab
	too.
	(assign_parm_setup_stack): Allocate properly aligned stack slots.
	* varasm.c (build_constant_desc): Align constants of misaligned types.
	* config/arm/arm.md (movdi, movsi, movhi, movhf, movsf, movdf): Check
	strict alignment restrictions on memory addresses.
	* config/arm/neon.md (movti, mov, mov): Likewise.
	* config/arm/vec-common.md (mov): Likewise.

Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md	(Revision 274531)
+++ gcc/config/arm/arm.md	(Arbeitskopie)
@@ -5838,6 +5838,12 @@
 	(match_operand:DI 1 "general_operand"))]
   "TARGET_EITHER"
   "
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (DImode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (DImode));
   if (can_create_pseudo_p ())
 {
   if (!REG_P (operands[0]))
@@ -6014,6 +6020,12 @@
   {
   rtx base, offset, tmp;
 
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (SImode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (SImode));
   if (TARGET_32BIT || TARGET_HAVE_MOVT)
 {
   /* Everything except mem = const or mem = mem can be done easily.  */
@@ -6503,6 +6515,12 @@
 	(match_operand:HI 1 "general_operand"))]
   "TARGET_EITHER"
   "
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (HImode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (HImode));
   if (TARGET_ARM)
 {
   if (can_create_pseudo_p ())
@@ -6912,6 +6930,12 @@
 	(match_operand:HF 1 "general_operand"))]
   "TARGET_EITHER"
   "
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (HFmode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (HFmode));
   if (TARGET_32BIT)
 {
   if (MEM_P (operands[0]))
@@ -6976,6 +7000,12 @@
 	(match_operand:SF 1 "general_operand"))]
   "TARGET_EITHER"
   "
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (SFmode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (SFmode));
   if (TARGET_32BIT)
 {
   if (MEM_P (operands[0]))
@@ -7071,6 +7101,12 @@
 	(match_operand:DF 1 "general_operand"))]
   "TARGET_EITHER"
   "
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (DFmode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (DFmode));
   if (TARGET_32BIT)
 {
   if (MEM_P (operands[0]))
Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md	(Revision 274531)
+++ gcc/config/arm/neon.md	(Arbeitskopie)
@@ -127,6 +127,12 @@
 	(match_operand:TI 1 "general_operand"))]
   "TARGET_NEON"
 {
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (TImode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (TImode));
   if (can_create_pseudo_p ())
 {
   if (!REG_P (operands[0]))
@@ -139,6 +145,12 @@
 	(match_operand:VSTRUCT 1 "general_operand"))]
   "TARGET_NEON"
 {
+  gcc_checking_assert (!MEM_P (operands[0])
+		   || MEM_ALIGN (operands[0])
+			  >= GET_MODE_ALIGNMENT (mode));
+  gcc_checking_assert (!MEM_P (operands[1])
+		   || MEM_ALIGN (operands[1])
+			  >= GET_MODE_ALIGNMENT (mode));
   if (can_create_pseudo_p ())
 

[PATCH] [LRA] Fix wrong-code PR 91109 take 2

2019-08-15 Thread Bernd Edlinger
Hi,

as discussed in the PR 91109 audit trail,
my previous patch missed a case where no spilling is necessary,
but the re-materialized instruction has now scratch regs without
a hard register assignment.  And thus the LRA pass falls out of
the loop pre-maturely.

Fixed by checking for scratch regs with no assignment
and continuing the loop in that case.


Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
Is it OK for trunk?


Thanks
Bernd.
2019-08-12  Bernd Edlinger  

	PR tree-optimization/91109
	* lra-int.h (lra_need_for_scratch_reg_p): Declare.
	* lra.c (lra): Use lra_need_for_scratch_reg_p.
	* lra-spills.c (lra_need_for_scratch_reg_p): New function.

Index: gcc/lra-int.h
===
--- gcc/lra-int.h	(revision 274168)
+++ gcc/lra-int.h	(working copy)
@@ -396,6 +396,7 @@ extern bool lra_coalesce (void);
 
 /* lra-spills.c:  */
 
+extern bool lra_need_for_scratch_reg_p (void);
 extern bool lra_need_for_spills_p (void);
 extern void lra_spill (void);
 extern void lra_final_code_change (void);
Index: gcc/lra-spills.c
===
--- gcc/lra-spills.c	(revision 274168)
+++ gcc/lra-spills.c	(working copy)
@@ -549,6 +549,19 @@ spill_pseudos (void)
 }
 }
 
+/* Return true if we need scratch reg assignments.  */
+bool
+lra_need_for_scratch_reg_p (void)
+{
+  int i; max_regno = max_reg_num ();
+
+  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
+if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0
+	&& lra_former_scratch_p (i))
+  return true;
+  return false;
+}
+
 /* Return true if we need to change some pseudos into memory.  */
 bool
 lra_need_for_spills_p (void)
Index: gcc/lra.c
===
--- gcc/lra.c	(revision 274168)
+++ gcc/lra.c	(working copy)
@@ -2567,7 +2567,11 @@ lra (FILE *f)
 	  lra_create_live_ranges (lra_reg_spill_p, true);
 	  live_p = true;
 	  if (! lra_need_for_spills_p ())
-	break;
+	{
+	  if (lra_need_for_scratch_reg_p ())
+		continue;
+	  break;
+	}
 	}
   lra_spill ();
   /* Assignment of stack slots changes elimination offsets for


[Bug rtl-optimization/91460] gcc -mpreferred-vector-width=256 is slower than -mpreferred-vector-width=128 for some loops

2019-08-15 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91460

H.J. Lu  changed:

   What|Removed |Added

 Target||i386
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-15
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
When loop trip count is known, vectorizer won't select 256-bit vector when
266-bit vector can't be used.  When loop trip count is unknown, 256-bit
vector can be slower than 128-bit vector, depending on workloads.  In
case of SPEC CPU 2017, 128-bit vector is much faster than 256-bit vector
for a couple benchmarks.  For most of benchmarks, there are no performance
differences between  128-bit vector and 256-bit vector.

[Bug rtl-optimization/91460] gcc -mpreferred-vector-width=256 is slower than -mpreferred-vector-width=128 for some loops

2019-08-15 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91460

H.J. Lu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from H.J. Lu  ---
This testcase

---
int block[9][9][9];
void foo(int row, int k, int h)
{
  /* Variable nrow range from 4 to 9.  */
  int nrow = ((row - 1)/3 + 1)*3 + 1;

   for (int i = nrow; i < 9; i++)
 block[k][h][i] = block[k][h][i] - 10;
}
---

Since nrow range from 4 to 9, 256bit vector operation will never be
executed(vector elements always less than 8), so 256bit vector actually
equals no vectorization plus additional branch cost.  Even with epilogue
vectorization, 256bit vector still has more overhead.  When this is a hot
function, 256bit vector can reduce performance by 6%.

[Bug target/91461] New: Don't turn on X86_TUNE_SSE_TYPELESS_STORES for AVX/AVX512

2019-08-15 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91461

Bug ID: 91461
   Summary: Don't turn on X86_TUNE_SSE_TYPELESS_STORES for
AVX/AVX512
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, skpgkp2 at gmail dot com
  Target Milestone: ---
Target: i386

We have

X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores

since movaps/movups one byte shorter than movdaq/movdqu.  But it isn't
the case for AVX/AVX512.  There is no need to do it for AVX/AVX512.

Re: Expansion of narrowing math built-ins into power instructions

2019-08-15 Thread Segher Boessenkool
On Thu, Aug 15, 2019 at 03:29:03PM +0530, Tejas Joshi wrote:
> Also, in what manner should float_contract/narrow be different from
> float_truncate as both are trying to do similar things? (truncation
> from DF to SF)

It's just a different name, nothing more, nothing less.  Because it is
a different name it can not be accidentally generated from actual
truncations.


Segher


Re: [patch, fortran] Fix PR 91443

2019-08-15 Thread Janne Blomqvist
On Thu, Aug 15, 2019 at 2:35 PM Thomas Koenig  wrote:
>
> Hello world,
>
> this patch fixes PR 91443, in which we did not warn about a mismatched
> external procedure. The problem was that the module this was called in
> was resolved before parsing of the procedure ever started.
>
> The approach taken here is to move the checking of external procedures
> to a stage after normal resolution.  And, of course, fix the resulting
> fallout from regression-testing :-)
> There is also one policy change in the patch. Previously, we only warned
> about mismatched declarations.  Now, this is a hard error unless the
> user specifies -std=legacy.  The reason is that we have not yet solved
> our single declaration problem, but it cannot be solved unless all
> of a procedure's callers match.  People who have such broken code
> should at least be made aware that they have a problem. However, I would
> like to have some sort of agreement on this point before the patch
> is committed.  This can also be changed (see the code at the bottom
> of frontend-passes.c).

Personally, I'm fine with making this a hard error. As we've recently
seen with the varargs & missing charlengths in LAPACK saga, mismatches
can cause extremely subtle issues that can cause silent corruption and
take an expert to figure out.

> Once this is in, the next step is to issue errors for mismatching
> calls where the callee is not in the same file.  This can be done
> with the infrastructure of this patch.
>
> So, OK for trunk?

The patch itself looks Ok. One worry, are you introducing an
O(N**2)(?) algorithm (looping over all symbols for every symbol?), and
does this cause performance issues when compiling some gigantic F77
project?

If this worry is unfounded, Ok for trunk.

-- 
Janne Blomqvist


Re: Expansion of narrowing math built-ins into power instructions

2019-08-15 Thread Segher Boessenkool
On Thu, Aug 15, 2019 at 01:47:47PM +0100, Richard Sandiford wrote:
> Tejas Joshi  writes:
> > Hello.
> > I just wanted to make sure that I am looking at the correct code here.
> > Except for rtl.def where I should be introducing something like
> > float_contract (or float_narrow?) and also simplify-rtx.c, breakpoints

I like that "float_narrow" name :-)

> > set on functions around expr.c, cfgexpand.c where I grep for
> > float_truncate/FLOAT_TRUNCATE did not hit.
> > Also, in what manner should float_contract/narrow be different from
> > float_truncate as both are trying to do similar things? (truncation
> > from DF to SF)
> 
> I think the code should instead be a fused addition and truncation,
> a bit like FMA is a fused addition and multiplication.  Describing it as
> a DFmode addition followed by some conversion to SF would still involve
> double rounding.

How so?  It would *mean* there is only single rounding, even!  That's
the whole point of it.

> simplify-rtx.c is probably the most important place to handle it.
> It would be easiest to test using the selftests at the end of the file.


Segher


[Bug c++/87519] -Wsign-conversion -Wconversion explicit cast fails to silence warning

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87519

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Marek Polacek  ---
Fixed in 9.3+.

[Bug c++/90473] gcc does not call function in comma operator for default argument

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90473

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Marek Polacek  ---
Fixed in 9.3 too.

[Bug c++/81429] maybe_unused attribute triggers syntax error when used on first argument to a constructor

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81429

Marek Polacek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Marek Polacek  ---
Fixed for 9.3+.

[Bug c++/90884] Option `-Wctor-dtor-privacy' enables warning even for system headers

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90884

--- Comment #7 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 15 18:35:07 2019
New Revision: 274547

URL: https://gcc.gnu.org/viewcvs?rev=274547=gcc=rev
Log:
PR c++/90884 - stray note with -Wctor-dtor-privacy.
* class.c (maybe_warn_about_overly_private_class): Guard the call to
inform.

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/warn/ctor-dtor-privacy-4.C
branches/gcc-9-branch/gcc/testsuite/g++.dg/warn/ctor-dtor-privacy-4.h
Modified:
branches/gcc-9-branch/gcc/cp/ChangeLog
branches/gcc-9-branch/gcc/cp/class.c

[Bug c++/90473] gcc does not call function in comma operator for default argument

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90473

--- Comment #7 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 15 18:33:43 2019
New Revision: 274546

URL: https://gcc.gnu.org/viewcvs?rev=274546=gcc=rev
Log:
PR c++/90473 - wrong code with nullptr in default argument.
* call.c (null_ptr_cst_p): Update quote from the standard.
* decl.c (check_default_argument): Don't return nullptr when the arg
has side-effects.

* g++.dg/cpp0x/nullptr42.C: New test.

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp0x/nullptr42.C
Modified:
branches/gcc-9-branch/gcc/cp/ChangeLog
branches/gcc-9-branch/gcc/cp/call.c
branches/gcc-9-branch/gcc/cp/decl.c

[Bug c++/87519] -Wsign-conversion -Wconversion explicit cast fails to silence warning

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87519

--- Comment #13 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 15 18:32:33 2019
New Revision: 274545

URL: https://gcc.gnu.org/viewcvs?rev=274545=gcc=rev
Log:
PR c++/87519 - bogus warning with -Wsign-conversion.
* typeck.c (cp_build_binary_op): Use same_type_p instead of comparing
the types directly.

* g++.dg/warn/Wsign-conversion-5.C: New test.

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/warn/Wsign-conversion-5.C
Modified:
branches/gcc-9-branch/gcc/cp/ChangeLog
branches/gcc-9-branch/gcc/cp/typeck.c

[Bug c++/81429] maybe_unused attribute triggers syntax error when used on first argument to a constructor

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81429

--- Comment #11 from Marek Polacek  ---
Author: mpolacek
Date: Thu Aug 15 18:31:16 2019
New Revision: 274544

URL: https://gcc.gnu.org/viewcvs?rev=274544=gcc=rev
Log:
PR c++/81429 - wrong parsing of constructor with C++11 attribute.
* parser.c (cp_parser_constructor_declarator_p): Handle the scenario
when a parameter declaration begins with [[attribute]].

* g++.dg/cpp0x/gen-attrs-68.C: New test.
* g++.dg/cpp0x/gen-attrs-69.C: New test.

Added:
branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp0x/gen-attrs-68.C
branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp0x/gen-attrs-69.C
Modified:
branches/gcc-9-branch/gcc/cp/ChangeLog
branches/gcc-9-branch/gcc/cp/parser.c

Re: Indirect memory addresses vs. lra

2019-08-15 Thread Vladimir Makarov

On 8/15/19 12:38 PM, Richard Biener wrote:

On August 15, 2019 6:29:13 PM GMT+02:00, Vladimir Makarov  
wrote:

On 8/10/19 2:05 AM, John Darrington wrote:

On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
   
   If you provide LRA dump for such test (it is better to use

   -fira-verbose=15 to output full RA info into stderr), I

probably could

   say more.

I've attached such a dump (generated from

gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
   
   The less regs the architecture has, thoke easier to run into

such error

   message if something described wrong in the back-end.?? I see

your

   architecture is 16-bit micro-controller with only 8 regs, some

of them is

   specialized.?? So your architecture is really register

constrained.

That's not quite correct.  It is a 24-bit micro-controller (the

address

space is 24 bits wide).  There are 2 address registers (plus stack
pointer and program counter) and there are 8 general purpose data
registers (of differing sizes).
   


J'


Thank you for providing the sources.  It helped me to understand what
is
going on.  So the test crashes on

/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:
In function ‘f1’:
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
error: unable to find a register to spill
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
error: this is the insn:
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
 (const_int 32 [0x20])) [2  S4 A64])
(mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8]))
"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9
95 {*movsi}
  (expr_list:REG_DEAD (reg:PSI 41)
 (expr_list:REG_DEAD (reg/f:PSI 40 [34])
 (nil

Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is
defined as a hard reg pointer pointer. Honestly, I never saw a target
with such register constraints.

-O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one*
free addr reg for insn which requires *2* of them.  That is why the GCC
port crashes on this test.  If you add -fomit-frame-pointer, the test
succeeds.

But even if use -fomit-frame-pointer,  it is not guaranteed that hard
reg pointer will be substituted by stack pointer.  There are many cases
where it is not possible (e.g. in case of alloca usage).

So what can be done, imho.  The simplest solution would be preventing
insns with more one memory operand.  The more difficult solution would
be permitting two memory one with address pseudo and another one with
stack pointer.

Couldn't we spill the frame pointer? Basically we should be able to compute the 
first address into a reg, spill that, do the second (both could require the 
frame pointer), spill the frame pointer, reload the first computed address from 
the stack, execute the insn and then reload the frame pointer.

Maybe the frame pointer can also be implemented 'virually' in an index register 
that you keep updated so that sp + reg
Is the FP. Or frame accesses can use a
Stack slot as FP and the indirect memory
Addressing... (is there an indirect lea?)

Yes, it could be a solution.  It just needs some target maintainer 
creativity.  There are a lot of things (tricks) can be done in 
machine-dependent code which would not require RA changes.


Re: PC-relative TLS support

2019-08-15 Thread Segher Boessenkool
Hi!

On Thu, Aug 15, 2019 at 01:35:10PM +0930, Alan Modra wrote:
> Supporting TLS for -mpcrel turns out to be relatively simple, in part
> due to deciding that !TARGET_TLS_MARKERS with -mpcrel is silly.  No
> assembler that I know of supporting prefix insns lacks TLS marker
> support.

Will this stay that way?  (Or do we not care, not now anyway?)

> Also, at some point powerpc gcc ought to remove
> !TARGET_TLS_MARKERS generally and simplify all the occurrences of
> IS_NOMARK_TLSGETADDR in rs6000.md rather than complicating them.

The last time this came up (a year ago) the conclusion was that we first
would have to remove AIX support.

> * config/rs6000/predicates.md (unspec_tls): Allow const0_rtx for got
>   element of unspec vec.
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
>   -mpcrel if -mno-tls-markers.
>   (rs6000_legitimize_tls_address): Support PC-relative TLS.
> * config/rs6000/rs6000.md (UNSPEC_TLSTLS_PCREL): New unspec.
>   (tls_gd_pcrel, tls_ld_pcrel): New insns.
> (tls_dtprel, tls_tprel): Set attr prefixed when tls_size is not 16.
> (tls_got_tprel_pcrel, tls_tls_pcrel): New insns.

(Changelog has whitespace damage, I guess that is just from how you
mailed this?  Please fix when applying it).

The patch is fine when its prerequisites are in.  Thanks,


Segher


Re: Indirect memory addresses vs. lra

2019-08-15 Thread Vladimir Makarov

On 8/15/19 1:35 PM, John Darrington wrote:

On Thu, Aug 15, 2019 at 12:29:13PM -0400, Vladimir Makarov wrote:


  Thank you for providing the sources.?? It helped me to understand what is
  going on.?? So the test crashes on
  
  /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In function ???f1???:

  
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: unable to find a register to spill
  
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: this is the insn:
  (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
  (const_int 32 [0x20])) [2  S4 A64])
  (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) 
"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 
{*movsi}
   (expr_list:REG_DEAD (reg:PSI 41)
  (expr_list:REG_DEAD (reg/f:PSI 40 [34])
  (nil

Thanks for taking a look.
  
  Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is defined as a hard reg pointer pointer.


That is correct.

  Honestly, I never saw a target with such register constraints.

My recollection is that MC68HC11 was the same.
  
  So what can be done, imho.  The simplest solution would be preventing insns with more one memory operand.


I tried this solution earlier.  But unfortunately it makes things worse.  What 
happens is it libgcc cannot
even be built -- ICEs occur on a memory from  address reg insn such as:
  
(insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)

 (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
(reg:PSI 1310)) 
"/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}

I see.  Then for the insn, you could try to create a pattern 
"memory,special memory constraint".  The special memory constraint 
should satisfy only spilled pseudo (pseudo with reg_renumber == -1).  I 
believe lra-constraints.c can spill the pseudo and the end you will have 
mem[disp1 + r8|r9|sp] = mem[disp1+sp].


It might work.  If it is not, we could modify LRA to do this.

Another solution would be adding unexisting register Z and for mem:psi 
[psi:r] = Z you could emit an assembler insn : mem[psi:r] = a stack slot 
corresponding Z.




[Bug target/90878] [8/9/10 Regression] integer -> SSE register move isn't generated

2019-08-15 Thread hjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90878

--- Comment #7 from hjl at gcc dot gnu.org  ---
Author: hjl
Date: Thu Aug 15 18:15:33 2019
New Revision: 274543

URL: https://gcc.gnu.org/viewcvs?rev=274543=gcc=rev
Log:
i386: Separate costs of pseudo registers from hard registers

processor_costs has costs of RTL expressions with pseudo registers and
and costs of hard register moves:

1. Costs of RTL expressions are used to generate the most efficient RTL
operations with pseudo registers.

2. Costs of hard register moves are used by register allocator to
decide how to allocate and move hard registers.

Since relative costs of pseudo register load and store versus pseudo
register moves in RTL expressions can be different from relative costs
of hard registers, we should separate costs of RTL expressions with
pseudo registers from costs of hard registers so that register allocator
and RTL expressions can be improved independently.

This patch moves costs of hard register moves to the new hard_register
field and duplicates costs of moves which are also used for costs of RTL
expressions.

PR target/90878
* config/i386/i386.c (inline_memory_move_cost): Use hard_register
for costs of hard register moves.
(ix86_register_move_cost): Likewise.
* config/i386/i386.h (processor_costs): Move costs of hard
register moves to hard_register.  Add int_load, int_store,
xmm_move, ymm_move, zmm_move, sse_to_integer, integer_to_sse,
sse_load, sse_store, sse_unaligned_load and sse_unaligned_store
for costs of RTL expressions.
* config/i386/x86-tune-costs.h: Move costs of hard register
moves to hard_register.  Duplicate int_load, int_store,
xmm_move, ymm_move, zmm_move, sse_to_integer, integer_to_sse,
sse_load, sse_store for costs of RTL expressions.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/config/i386/x86-tune-costs.h

Re: [PATCH] make clear TYPE_SIZE may be non-constant or null

2019-08-15 Thread Jeff Law
On 8/15/19 10:56 AM, Martin Sebor wrote:
> The comment for DECL_SIZE makes it clear it may be non-constant
> but not that it may be null.  The comment for TYPE_SIZE mentions
> neither.
> 
> The attached update adds a few sentences to make these caveats
> clear.  If no one has any suggestions I'll commit it as obvious
> today or tomorrow.
OK
jeff



Re: match ld besides collect2 in gcov test

2019-08-15 Thread Jeff Law
On 8/15/19 2:13 AM, Alexandre Oliva wrote:
> The regexp that checks that -lgcov is linked in when --coverage is
> passed to the compiler driver requires the command line to match
> '/collect2'.  Some of our targets don't match that, but they match /ld
> or ${target_alias}-ld depending on the testing scenario, so I'd like
> to tweak the test to match those as well.
> 
> Tested on x86_64-linux-gnu, and on the affected test scenarios.
> Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.misc-tests/options.exp: Match /ld and -ld besides
>   /collect2.
OK
jeff


Re: use __builtin_alloca, drop non-standard alloca.h

2019-08-15 Thread Jeff Law
On 8/15/19 2:17 AM, Alexandre Oliva wrote:
> Since alloca.h is not ISO C, most of our alloca-using tests seem to
> rely on __builtin_alloca instead of including the header and calling
> alloca.  This patch extends this practice to some of the exceptions I
> found in gcc.target, marking them as requiring a functional alloca
> while at that.
> 
> Tested on x86_64-linux-gnu, and manually compile-tested the non-x86
> tests.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arc/interrupt-6.c: Use __builtin_alloca, require
>   effective target support for alloca, drop include of alloca.h.
>   * gcc.target/i386/pr80969-3.c: Likewise.
>   * gcc.target/sparc/setjmp-1.c: Likewise.
>   * gcc.target/x86_64/abi/ms-sysv/gen.cc: Likewise.
>   * gcc.target/x86_64/abi/ms-sysv/ms-sysv.c: Likewise.
OK
jeff


Re: require trampolines for pr85044

2019-08-15 Thread Jeff Law
On 8/15/19 9:46 AM, Alexandre Oliva wrote:
> Testcases that require support for trampolines should be marked as
> such; gcc.target/i386/pr85044.c was missing it.  Fixed.
> 
> Tested on x86_64-linux-gnu.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/i386/pr85044.c: Require support for trampolines.
OK
jeff


[PATCH 2/8] bpf: new GCC port

2019-08-15 Thread Jose E. Marchesi
This patch adds a port for the Linux kernel eBPF architecture to GCC.

ChangeLog:

  * configure.ac: Support for bpf-*-* targets.
  * configure: Regenerate.

contrib/ChangeLog:

  * config-list.mk (LIST): Disable go in bpf-*-* targets.

gcc/ChangeLog:

  * config.gcc: Support for bpf-*-* targets.
  * common/config/bpf/bpf-common.c: New file.
  * config/bpf/t-bpf: Likewise.
  * config/bpf/predicates.md: Likewise.
  * config/bpf/constraints.md: Likewise.
  * config/bpf/bpf.opt: Likewise.
  * config/bpf/bpf.md: Likewise.
  * config/bpf/bpf.h: Likewise.
  * config/bpf/bpf.c: Likewise.
  * config/bpf/bpf-protos.h: Likewise.
  * config/bpf/bpf-opts.h: Likewise.
  * config/bpf/bpf-helpers.h: Likewise.
  * config/bpf/bpf-helpers.def: Likewise.
---
 ChangeLog  |5 +
 configure  |   68 ++-
 configure.ac   |   54 +-
 contrib/ChangeLog  |4 +
 contrib/config-list.mk |2 +-
 gcc/ChangeLog  |   16 +
 gcc/common/config/bpf/bpf-common.c |   57 ++
 gcc/config.gcc |9 +
 gcc/config/bpf/bpf-helpers.def |  194 ++
 gcc/config/bpf/bpf-helpers.h   |  324 ++
 gcc/config/bpf/bpf-opts.h  |   56 ++
 gcc/config/bpf/bpf-protos.h|   33 ++
 gcc/config/bpf/bpf.c   | 1136 
 gcc/config/bpf/bpf.h   |  565 ++
 gcc/config/bpf/bpf.md  |  528 +
 gcc/config/bpf/bpf.opt |  119 
 gcc/config/bpf/constraints.md  |   29 +
 gcc/config/bpf/predicates.md   |  105 
 gcc/config/bpf/t-bpf   |0
 19 files changed, 3300 insertions(+), 4 deletions(-)
 create mode 100644 gcc/common/config/bpf/bpf-common.c
 create mode 100644 gcc/config/bpf/bpf-helpers.def
 create mode 100644 gcc/config/bpf/bpf-helpers.h
 create mode 100644 gcc/config/bpf/bpf-opts.h
 create mode 100644 gcc/config/bpf/bpf-protos.h
 create mode 100644 gcc/config/bpf/bpf.c
 create mode 100644 gcc/config/bpf/bpf.h
 create mode 100644 gcc/config/bpf/bpf.md
 create mode 100644 gcc/config/bpf/bpf.opt
 create mode 100644 gcc/config/bpf/constraints.md
 create mode 100644 gcc/config/bpf/predicates.md
 create mode 100644 gcc/config/bpf/t-bpf

diff --git a/configure b/configure
index 63b1e33f41c..4f8e68a4085 100755
--- a/configure
+++ b/configure
@@ -754,6 +754,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -919,6 +920,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE}'
@@ -1171,6 +1173,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1308,7 +1319,7 @@ fi
 for ac_var in  exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-   libdir localedir mandir
+   libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1468,6 +1479,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -3353,6 +3365,9 @@ case "${target}" in
 # No hosted I/O support.
 noconfigdirs="$noconfigdirs target-libssp"
 ;;
+  bpf-*-*)
+noconfigdirs="$noconfigdirs target-libssp"
+;;
   powerpc-*-aix* | rs6000-*-aix*)
 noconfigdirs="$noconfigdirs target-libssp"
 ;;
@@ -3387,12 +3402,43 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
 avr-*-*)
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   ;;
+bpf-*-*)
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
 ft32-*-*)
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   

Re: Indirect memory addresses vs. lra

2019-08-15 Thread John Darrington
On Thu, Aug 15, 2019 at 06:38:30PM +0200, Richard Biener wrote:

   Couldn't we spill the frame pointer? Basically we should be able to
   compute the first address into a reg, spill that, do the second
   (both could require the frame pointer), spill the frame pointer,
   reload the first computed address from the stack, execute the insn
   and then reload the frame pointer. 
 
 Maybe the frame pointer can also be implemented 'virually' in an index 
register that you keep updated so that sp + reg
 Is the FP. Or frame accesses can use a  Stack slot as FP and the indirect 
memory 
 Addressing... (is there an indirect lea?)

Yes.  lea x, [4,x] is a valid instruction.

J'

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.



Re: Indirect memory addresses vs. lra

2019-08-15 Thread John Darrington
On Thu, Aug 15, 2019 at 12:29:13PM -0400, Vladimir Makarov wrote:


 Thank you for providing the sources.?? It helped me to understand what is
 going on.?? So the test crashes on
 
 /home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In 
function ???f1???:
 
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: unable to find a register to spill
 
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: this is the insn:
 (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
 (const_int 32 [0x20])) [2  S4 A64])
 (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) 
"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 
{*movsi}
  (expr_list:REG_DEAD (reg:PSI 41)
 (expr_list:REG_DEAD (reg/f:PSI 40 [34])
 (nil

Thanks for taking a look.
 
 Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is 
defined as a hard reg pointer pointer.

That is correct.

 Honestly, I never saw a target with such register constraints.

My recollection is that MC68HC11 was the same.
 
 So what can be done, imho.  The simplest solution would be preventing 
insns with more one memory operand.

I tried this solution earlier.  But unfortunately it makes things worse.  What 
happens is it libgcc cannot
even be built -- ICEs occur on a memory from  address reg insn such as:
 
(insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
(const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
(reg:PSI 1310)) 
"/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}


J'
 

-- 
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.



[PATCH] make clear TYPE_SIZE may be non-constant or null

2019-08-15 Thread Martin Sebor

The comment for DECL_SIZE makes it clear it may be non-constant
but not that it may be null.  The comment for TYPE_SIZE mentions
neither.

The attached update adds a few sentences to make these caveats
clear.  If no one has any suggestions I'll commit it as obvious
today or tomorrow.

Thanks
Martin
gcc/ChangeLog:

	* tree.def (TYPE_SIZE): Clarify.
	* tree.h (TYPE_SIZE, TYPE_SIZE_UNIT, DECL_SIZE): Add comments.

Index: gcc/tree.def
===
--- gcc/tree.def	(revision 274541)
+++ gcc/tree.def	(working copy)
@@ -77,7 +77,10 @@ DEFTREECODE (BLOCK, "block", tcc_exceptional, 0)
 /* Each data type is represented by a tree node whose code is one of
the following:  */
 /* Each node that represents a data type has a component TYPE_SIZE
-   containing a tree that is an expression for the size in bits.
+   that evaluates either to a tree that is a (potentially non-constant)
+   expression representing the type size in bits, or to a null pointer
+   when the size of the type is unknown (for example, for incomplete
+   types such as arrays of unspecified bound).
The TYPE_MODE contains the machine mode for values of this type.
The TYPE_POINTER_TO field contains a type for a pointer to this type,
  or zero if no such has been created yet.
Index: gcc/tree.h
===
--- gcc/tree.h	(revision 274541)
+++ gcc/tree.h	(working copy)
@@ -1952,7 +1952,10 @@ class auto_suppress_location_wrappers
so they must be checked as well.  */
 
 #define TYPE_UID(NODE) (TYPE_CHECK (NODE)->type_common.uid)
+/* Type size in bits as a tree expression.  Need not be constant
+   and may be null.  */
 #define TYPE_SIZE(NODE) (TYPE_CHECK (NODE)->type_common.size)
+/* Likewise, type size in bytes.  */
 #define TYPE_SIZE_UNIT(NODE) (TYPE_CHECK (NODE)->type_common.size_unit)
 #define TYPE_POINTER_TO(NODE) (TYPE_CHECK (NODE)->type_common.pointer_to)
 #define TYPE_REFERENCE_TO(NODE) (TYPE_CHECK (NODE)->type_common.reference_to)
@@ -2480,7 +2483,7 @@ extern machine_mode vector_type_mode (const_tree);
 #define DECL_INITIAL(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.initial)
 
 /* Holds the size of the datum, in bits, as a tree expression.
-   Need not be constant.  */
+   Need not be constant and may be null.  */
 #define DECL_SIZE(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.size)
 /* Likewise for the size in bytes.  */
 #define DECL_SIZE_UNIT(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.size_unit)


[Bug rtl-optimization/91460] New: gcc -mpreferred-vector-width=256 is slower than -mpreferred-vector-width=128 for some loops

2019-08-15 Thread skpgkp2 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91460

Bug ID: 91460
   Summary: gcc -mpreferred-vector-width=256 is slower than
-mpreferred-vector-width=128 for some loops
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: skpgkp2 at gmail dot com
CC: hjl.tools at gmail dot com
  Target Milestone: ---

1 static inline void pixel_avg( uint8_t *dst,  int i_dst_stride,
2  uint8_t *src1, int i_src1_stride,
3  uint8_t *src2, int i_src2_stride,
4   int i_width, int i_height )
5 {
6 for( int y = 0; y < i_height; y++ )
7 {
8 for( int x = 0; x < i_width; x++ )
9 dst[x] = ( src1[x] + src2[x] + 1 ) >> 1;
10 dst  += i_dst_stride;
11 src1 += i_src1_stride;
12 src2 += i_src2_stride;
13 }
14 }

If above code is in hot loop.

if i_width value is between 16 and 32, -mprefer-vector-width=128 can provide
~6% performance improvement as compared to -mprefer-vector-width=256.

i_width value must be at least 16 to trigger 128 bit vectorization at line 8.

i_width value must be at least 32 to trigger 256 bit vectorization at line 8.

Re: Indirect memory addresses vs. lra

2019-08-15 Thread Richard Biener
On August 15, 2019 6:29:13 PM GMT+02:00, Vladimir Makarov  
wrote:
>On 8/10/19 2:05 AM, John Darrington wrote:
>> On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
>>   
>>   If you provide LRA dump for such test (it is better to use
>>   -fira-verbose=15 to output full RA info into stderr), I
>probably could
>>   say more.
>>
>> I've attached such a dump (generated from
>gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
>>   
>>   The less regs the architecture has, thoke easier to run into
>such error
>>   message if something described wrong in the back-end.?? I see
>your
>>   architecture is 16-bit micro-controller with only 8 regs, some
>of them is
>>   specialized.?? So your architecture is really register
>constrained.
>>
>> That's not quite correct.  It is a 24-bit micro-controller (the
>address
>> space is 24 bits wide).  There are 2 address registers (plus stack
>> pointer and program counter) and there are 8 general purpose data
>> registers (of differing sizes).
>>   
>>
>> J'
>>
>Thank you for providing the sources.  It helped me to understand what
>is 
>going on.  So the test crashes on
>
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:
>In function ‘f1’:
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>error: unable to find a register to spill
>/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1:
>error: this is the insn:
>(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
> (const_int 32 [0x20])) [2  S4 A64])
>(mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8]))
>"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9
>95 {*movsi}
>  (expr_list:REG_DEAD (reg:PSI 41)
> (expr_list:REG_DEAD (reg/f:PSI 40 [34])
> (nil
>
>Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is
>defined as a hard reg pointer pointer. Honestly, I never saw a target
>with such register constraints.
>
>-O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one*
>free addr reg for insn which requires *2* of them.  That is why the GCC
>port crashes on this test.  If you add -fomit-frame-pointer, the test
>succeeds.
>
>But even if use -fomit-frame-pointer,  it is not guaranteed that hard
>reg pointer will be substituted by stack pointer.  There are many cases
>where it is not possible (e.g. in case of alloca usage).
>
>So what can be done, imho.  The simplest solution would be preventing
>insns with more one memory operand.  The more difficult solution would
>be permitting two memory one with address pseudo and another one with
>stack pointer.

Couldn't we spill the frame pointer? Basically we should be able to compute the 
first address into a reg, spill that, do the second (both could require the 
frame pointer), spill the frame pointer, reload the first computed address from 
the stack, execute the insn and then reload the frame pointer.

Maybe the frame pointer can also be implemented 'virually' in an index register 
that you keep updated so that sp + reg
Is the FP. Or frame accesses can use a
Stack slot as FP and the indirect memory 
Addressing... (is there an indirect lea?) 

>I think only after solving this problem, you could think about
>implementing indirect memory addressing.
>
>  



Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)

2019-08-15 Thread Richard Biener
On August 15, 2019 4:52:24 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 8/15/19 2:54 PM, Richard Biener wrote:
>> On Thu, 15 Aug 2019, Bernd Edlinger wrote:
>> 
>>
>> Hmm.  So your patch overrides user-alignment here.  Woudln't it
>> be better to do that more conciously by
>>
>>   if (! DECL_USER_ALIGN (decl)
>>   || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
>>   && targetm.slow_unaligned_access (DECL_MODE (decl),
>align)))
>>
>>>
>>> ? I don't know why that would be better?
>>> If the value is underaligned no matter why, pretend it was declared
>as
>>> naturally aligned if that causes wrong code otherwise.
>>> That was the idea here.
>> 
>> It would be better because then we ignore it and use what we'd use
>> by default rather than inventing sth new.  And your patch suggests
>> it might be needed to up align even w/o DECL_USER_ALIGN.
>> 
>
>Hmmm, you mean the constant 1.0i should not have DECL_USER_ALIGN set?
>But it inherits the alignment from the destination variable,
>apparently. 

Yes. I think it shouldn't inherit the alignment unless we are assembling a 
static initializer. 

>
>did you mean
>if (! DECL_USER_ALIGN (decl)
>&& align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
>&& ...
>?
>
>I can give it a try.

No, I meant || thus ignore DECL_USER_ALIGN if it is sth we have to satisfy with 
unaligned loads. 
>
>> IMHO whatever code later fails to properly use unaligned loads
>> should be fixed instead rather than ignoring user requested
>alignment.
>>
>> Can you quote a short testcase that explains what exactly goes
>wrong?
>> The struct-layout ones are awkward to look at...
>>
>
> Sure,
>
> $ cat test.c
> _Complex float __attribute__((aligned(1))) cf;
>
> void foo (void)
> {
>   cf = 1.0i;
> }
>
> $ arm-linux-gnueabihf-gcc -S test.c 
> during RTL pass: expand
> test.c: In function 'foo':
> test.c:5:6: internal compiler error: in gen_movsf, at
>config/arm/arm.md:7003
> 5 |   cf = 1.0i;
>   |   ~~~^~
> 0x7ba475 gen_movsf(rtx_def*, rtx_def*)
>   ../../gcc-trunk/gcc/config/arm/arm.md:7003
> 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
>   ../../gcc-trunk/gcc/recog.h:318
> 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*)
>   ../../gcc-trunk/gcc/expr.c:3695
> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
>   ../../gcc-trunk/gcc/expr.c:3791
> 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*)
>   ../../gcc-trunk/gcc/expr.c:3490
> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
>   ../../gcc-trunk/gcc/expr.c:3791
> 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool)
>   ../../gcc-trunk/gcc/expr.c:5855
> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
>   ../../gcc-trunk/gcc/expr.c:5441

 Huh, so why didn't it trigger

   /* Handle misaligned stores.  */
   mode = TYPE_MODE (TREE_TYPE (to));
   if ((TREE_CODE (to) == MEM_REF
|| TREE_CODE (to) == TARGET_MEM_REF)
   && mode != BLKmode
   && !mem_ref_refers_to_non_mem_p (to)
   && ((align = get_object_alignment (to))
   < GET_MODE_ALIGNMENT (mode))
   && (((icode = optab_handler (movmisalign_optab, mode))
!= CODE_FOR_nothing)
   || targetm.slow_unaligned_access (mode, align)))
 {

 ?  (_Complex float is 32bit aligned it seems, the DECL_RTL for the
 var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] >>> 0x2aad1240 cf>) [1 cf+0 S8 A8]), SCmode is 32bit aligned.

 Ah, 'to' is a plain DECL here so the above handling is incomplete.
 IIRC component refs like __real cf = 0.f should be handled fine
 again(?).  So, does adding || DECL_P (to) fix the case as well?

>>>
>>> So I tried this instead of the varasm.c change:
>>>
>>> Index: expr.c
>>> ===
>>> --- expr.c  (revision 274487)
>>> +++ expr.c  (working copy)
>>> @@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool
>nontem
>>>/* Handle misaligned stores.  */
>>>mode = TYPE_MODE (TREE_TYPE (to));
>>>if ((TREE_CODE (to) == MEM_REF
>>> -   || TREE_CODE (to) == TARGET_MEM_REF)
>>> +   || TREE_CODE (to) == TARGET_MEM_REF
>>> +   || DECL_P (to))
>>>&& mode != BLKmode
>>> -  && !mem_ref_refers_to_non_mem_p (to)
>>> +  && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to))
>>>&& ((align = get_object_alignment (to))
>>>   < GET_MODE_ALIGNMENT (mode))
>>>&& (((icode = optab_handler (movmisalign_optab, mode))
>>>
>>> Result, yes, it fixes this test case
>>> but then I run all struct-layout-1.exp there are sill cases. where
>we have problems:
>>>
>>> In file included from
>/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_x.c:8:^M
>>>

Re: Indirect memory addresses vs. lra

2019-08-15 Thread Vladimir Makarov

On 8/10/19 2:05 AM, John Darrington wrote:

On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
  
  If you provide LRA dump for such test (it is better to use

  -fira-verbose=15 to output full RA info into stderr), I probably could
  say more.

I've attached such a dump (generated from 
gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
  
  The less regs the architecture has, thoke easier to run into such error

  message if something described wrong in the back-end.?? I see your
  architecture is 16-bit micro-controller with only 8 regs, some of them is
  specialized.?? So your architecture is really register constrained.

That's not quite correct.  It is a 24-bit micro-controller (the address
space is 24 bits wide).  There are 2 address registers (plus stack
pointer and program counter) and there are 8 general purpose data
registers (of differing sizes).
  


J'

Thank you for providing the sources.  It helped me to understand what is 
going on.  So the test crashes on


/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c: In 
function ‘f1’:
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: unable to find a register to spill
/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c:10:1: 
error: this is the insn:
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
(const_int 32 [0x20])) [2  S4 A64])
(mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) 
"/home/jmd/Source/GCC2/gcc/testsuite/gcc.c-torture/compile/pr53410-2.c":9:9 95 
{*movsi}
 (expr_list:REG_DEAD (reg:PSI 41)
(expr_list:REG_DEAD (reg/f:PSI 40 [34])
(nil

Your target has only 2 non-fixed addr registers (r8, r9).  One (r9) is defined 
as a hard reg pointer pointer. Honestly, I never saw a target with such 
register constraints.

-O0 assumes -fno-omit-frame-pointer.  So in -O0 mode we have only *one* free 
addr reg for insn which requires *2* of them.  That is why the GCC port crashes 
on this test.  If you add -fomit-frame-pointer, the test succeeds.

But even if use -fomit-frame-pointer,  it is not guaranteed that hard reg 
pointer will be substituted by stack pointer.  There are many cases where it is 
not possible (e.g. in case of alloca usage).

So what can be done, imho.  The simplest solution would be preventing insns 
with more one memory operand.  The more difficult solution would be permitting 
two memory one with address pseudo and another one with stack pointer.

I think only after solving this problem, you could think about implementing 
indirect memory addressing.

 



Re: i386/asm-4 test: use amd64's natural addressing mode on all OSs

2019-08-15 Thread Uros Bizjak
On Thu, Aug 15, 2019 at 3:47 PM Alexandre Oliva  wrote:
>
> On Aug 15, 2019, Uros Bizjak  wrote:
>
> > The immediate of lea is limited to +-2GB
>
> ... and we're talking about a code offset within a tiny translation
> unit, with both reference and referenced address within the same
> section.  It would be very surprising if the offset got up to 2KB, let
> alone 2GB ;-)
>
> The reason the testcase even mentions absolute addresses AFAICT is not
> that it wishes to use them, it's that there's no portable way to use
> PC-relative addresses on IA-32 (*), so the test gives up on platforms
> that mandate PIC and goes with absolute addressing there.  I'm pretty
> sure if it had a choice it would happily use PC-relative addressing,
> even with a reasonably short range, if that was readily available...
>
> (*) Something like 'call 1f; 1: popl %0; leal $foo-1b(%0), %0' is the
> closest to widely available I'm aware of, but the '1f/1b' notation is
> only available with GNU as AFAIK.

Well, OK ... let's go ahead with your patch then.

Thanks,
Uros.


[Bug libstdc++/91456] std::function and std::is_invocable_r do not understand guaranteed elision

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91456

--- Comment #2 from Jonathan Wakely  ---
Author: redi
Date: Thu Aug 15 16:07:27 2019
New Revision: 274542

URL: https://gcc.gnu.org/viewcvs?rev=274542=gcc=rev
Log:
PR libstdc++/91456 make INVOKE work with uncopyable prvalues

In C++17 a function can return a prvalue of a type that cannot be moved
or copied. The current implementation of std::is_invocable_r uses
std::is_convertible to test the conversion to R required by INVOKE.
That fails for non-copyable prvalues, because std::is_convertible is
defined in terms of std::declval which uses std::add_rvalue_reference.
In C++17 conversion from R to R involves no copies and so is not the
same as conversion from R&& to R.

This commit changes std::is_invocable_r to check the conversion without
using std::is_convertible.

std::function also contains a similar check using std::is_convertible,
which can be fixed by simply reusing std::is_invocable_r (but because
std::is_invocable_r is not defined for C++11 it uses the underlying
std::__is_invocable_impl trait directly).

PR libstdc++/91456
* include/bits/std_function.h (__check_func_return_type): Remove.
(function::_Callable): Use std::__is_invocable_impl instead of
__check_func_return_type.
* include/std/type_traits (__is_invocable_impl): Add another defaulted
template parameter. Define a separate partial specialization for
INVOKE and INVOKE. For INVOKE replace is_convertible check
with a check that models delayed temporary materialization.
* testsuite/20_util/function/91456.cc: New test.
* testsuite/20_util/is_invocable/91456.cc: New test.

Added:
trunk/libstdc++-v3/testsuite/20_util/function/91456.cc
trunk/libstdc++-v3/testsuite/20_util/is_invocable/91456.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/std_function.h
trunk/libstdc++-v3/include/std/type_traits

Re: [PATCH] PR libstdc++/91456 make INVOKE work with uncopyable prvalues

2019-08-15 Thread Jonathan Wakely

On 15/08/19 17:04 +0100, Jonathan Wakely wrote:

In C++17 a function can return a prvalue of a type that cannot be moved
or copied. The current implementation of std::is_invocable_r uses
std::is_convertible to test the conversion to R required by INVOKE.
That fails for non-copyable prvalues, because std::is_convertible is
defined in terms of std::declval which uses std::add_rvalue_reference.
In C++17 conversion from R to R involves no copies and so is not the
same as conversion from R&& to R.

This commit changes std::is_invocable_r to check the conversion without
using std::is_convertible.

std::function also contains a similar check using std::is_convertible,
which can be fixed by simply reusing std::is_invocable_r (but because
std::is_invocable_r is not defined for C++11 it uses the underlying
std::__is_invocable_impl trait directly).

PR libstdc++/91456
* include/bits/std_function.h (__check_func_return_type): Remove.
(function::_Callable): Use std::__is_invocable_impl instead of
__check_func_return_type.
* include/std/type_traits (__is_invocable_impl): Add another defaulted
template parameter. Define a separate partial specialization for
INVOKE and INVOKE. For INVOKE replace is_convertible check
with a check that models delayed temporary materialization.
* testsuite/20_util/function/91456.cc: New test.
* testsuite/20_util/is_invocable/91456.cc: New test.


With some minor changes to __is_convertible_helper we could make that
usable by both std::is_convertible and __is_invokable_impl.

I don't plan to commit this now but might do at a later date.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 44db2cade5d..4df3fee4c77 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1491,20 +1491,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
bool = __or_, is_function<_To>,
 is_array<_To>>::value>
 struct __is_convertible_helper
-{
-  typedef typename is_void<_To>::type type;
-};
+: public is_void<_To>::type
+{ };
 
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wctor-dtor-privacy"
   template
 class __is_convertible_helper<_From, _To, false>
 {
+  // Unlike declval, this doesn't add_rvalue_reference.
+  template
+	static _From1 __declval();
+
   template
 	static void __test_aux(_To1) noexcept;
 
   template(std::declval<_From1>()))>
+	   typename = decltype(__test_aux<_To1>(__declval<_From1>()))>
 	static true_type
 	__test(int);
 
@@ -1513,14 +1516,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__test(...);
 
 public:
-  typedef decltype(__test<_From, _To>(0)) type;
+  using type = decltype(__test<_From, _To>(0));
 };
 #pragma GCC diagnostic pop
 
+  template struct add_rvalue_reference;
+
   /// is_convertible
   template
 struct is_convertible
-: public __is_convertible_helper<_From, _To>::type
+: public __is_convertible_helper::type,
+ _To>::type
 { };
 
   template
 class __is_nt_convertible_helper<_From, _To, false>
 {
+  // Unlike declval, this doesn't add_rvalue_reference.
+  template
+	static _From1 __declval();
+
   template
 	static void __test_aux(_To1) noexcept;
 
   template
 	static
-	__bool_constant(std::declval<_From1>()))>
+	__bool_constant(__declval<_From1>()))>
 	__test(int);
 
   template
@@ -1555,14 +1565,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // is_nothrow_convertible for C++11
   template
 struct __is_nothrow_convertible
-: public __is_nt_convertible_helper<_From, _To>::type
+: __is_nt_convertible_helper::type,
+ _To>::type
 { };
 
 #if __cplusplus > 201703L
   /// is_nothrow_convertible
   template
 struct is_nothrow_convertible
-: public __is_nt_convertible_helper<_From, _To>::type
+: __is_nt_convertible_helper::type,
+ _To>::type
 { };
 
   /// is_nothrow_convertible_v
@@ -2896,35 +2908,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : true_type
 { };
 
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wctor-dtor-privacy"
   // Used for INVOKE expressions to check the implicit conversion to R.
   template
 struct __is_invocable_impl<_Result, _Ret,
 			   /* is_void<_Ret> = */ false,
 			   __void_t>
-{
-private:
-  // The type of the INVOKE expression.
-  // Unlike declval, this doesn't add_rvalue_reference.
-  static typename _Result::type _S_get();
-
-  template
-	static void _S_conv(_Tp);
-
-  // This overload is viable if INVOKE(f, args...) can convert to _Tp.
-  template(_S_get()))>
-	static true_type
-	_S_test(int);
-
-  template
-	static false_type
-	_S_test(...);
-
-public:
-  using type = decltype(_S_test<_Ret>(1));
-};
-#pragma GCC diagnostic pop
+: __is_convertible_helper
+{ };
 
   template
 struct __is_invocable


Re: types for VR_VARYING

2019-08-15 Thread Aldy Hernandez



On 8/15/19 7:23 AM, Richard Biener wrote:

On Thu, Aug 15, 2019 at 12:40 PM Aldy Hernandez  wrote:


On 8/14/19 1:37 PM, Jeff Law wrote:

On 8/13/19 6:39 PM, Aldy Hernandez wrote:



On 8/12/19 7:46 PM, Jeff Law wrote:

On 8/12/19 12:43 PM, Aldy Hernandez wrote:

This is a fresh re-post of:

https://gcc.gnu.org/ml/gcc-patches/2019-07/msg6.html

Andrew gave me some feedback a week ago, and I obviously don't remember
what it was because I was about to leave on PTO.  However, I do remember
I addressed his concerns before getting drunk on rum in tropical islands.


FWIW found a great coffee infused rum while in Kauai last week.  I'm not
a coffee fan, but it was wonderful.  The one bottle we brought back
isn't going to last until Cauldron and I don't think I can get a special
order filled before I leave :(


You must bring some to Cauldron before we believe you. :)

That's the problem.  The nearest place I can get it is in Vegas and
there's no distributor in Montreal.   I can special order it in our
state run stores, but it won't be here in time.

Of course, I don't mind if you don't believe me.  More for me in that
case...



Is the supports_type_p stuff there to placate the calls from ipa-cp?  I
can live with it in the short term, but it really feels like there
should be something in the ipa-cp client that avoids this silliness.


I am not happy with this either, but there are various places where
statements that are !stmt_interesting_for_vrp() are still setting a
range of VARYING, which is then being ignored at a later time.

For example, vrp_initialize:

if (!stmt_interesting_for_vrp (phi))
  {
tree lhs = PHI_RESULT (phi);
set_def_to_varying (lhs);
prop_set_simulate_again (phi, false);
  }

Also in evrp_range_analyzer::record_ranges_from_stmt(), where we if the
statement is interesting for VRP but extract_range_from_stmt() does not
produce a useful range, we also set a varying for a range we will never
use.  Similarly for a statement that is not interesting in this hunk.

Ugh.  One could perhaps argue that setting any kind of range in these
circumstances is silly.   But I suspect it's necessary due to the
optimistic handling of VR_UNDEFINED in value_range_base::union_helper.
It's all coming back to me now...




Then there is vrp_prop::visit_stmt() where we also set VARYING for types
that VRP will never handle:

case IFN_ADD_OVERFLOW:
case IFN_SUB_OVERFLOW:
case IFN_MUL_OVERFLOW:
case IFN_ATOMIC_COMPARE_EXCHANGE:
  /* These internal calls return _Complex integer type,
 which VRP does not track, but the immediate uses
 thereof might be interesting.  */
  if (lhs && TREE_CODE (lhs) == SSA_NAME)
{
  imm_use_iterator iter;
  use_operand_p use_p;
  enum ssa_prop_result res = SSA_PROP_VARYING;

  set_def_to_varying (lhs);

I've adjusted the patch so that set_def_to_varying will set the range to
VR_UNDEFINED if !supports_type_p.  This is a fail safe, as we can't
really do anything with a nonsensical range.  I just don't want to leave
the range in an indeterminate state.


I think VR_UNDEFINED is unsafe due to value_range_base::union_helper.
And that's a more general than this patch.  VR_UNDEFINED is _not_ a safe
range to set something to if we can't handle it.  We have to use VR_VARYING.

Why?  See the beginning of value_range_base::union_helper:

 /* VR0 has the resulting range if VR1 is undefined or VR0 is varying.  */
 if (vr1->undefined_p ()
 || vr0->varying_p ())
   return *vr0;

 /* VR1 has the resulting range if VR0 is undefined or VR1 is varying.  */
 if (vr0->undefined_p ()
 || vr1->varying_p ())
   return *vr1;
This can get called for something like

a =  ? name1 : name2;

If name1 was set to VR_UNDEFINED thinking that VR_UNDEFINED was a safe
value for something we can't handle, then we'll incorrectly return the
range for name2.


I think that if name1 was !supports_type_p, we will have never called
union/intersect.  We will have bailed at some point earlier.  However, I
do see your point about being consistent.



VR_UNDEFINED can only be used for the ranges of objects we haven't
processed.  If we can't produce a range for an object because the
statement is something we don't handle or just doesn't produce anythign
useful, then the right result is VR_VARYING.

This may be worth commenting at the definition site for VR_*.



I also noticed that Andrew's patch was setting num_vr_values to
num_ssa_names + num_ssa_names / 10.  I think he meant num_vr_values +
num_vr_values / 10.  Please verify the current incantation makes sense.

Going to assume this will be adjusted per the other messages in this thread.


Done.





diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 39ea22f0554..663dd6e2398 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -182,8 +182,10 

[PATCH] PR libstdc++/91456 make INVOKE work with uncopyable prvalues

2019-08-15 Thread Jonathan Wakely

In C++17 a function can return a prvalue of a type that cannot be moved
or copied. The current implementation of std::is_invocable_r uses
std::is_convertible to test the conversion to R required by INVOKE.
That fails for non-copyable prvalues, because std::is_convertible is
defined in terms of std::declval which uses std::add_rvalue_reference.
In C++17 conversion from R to R involves no copies and so is not the
same as conversion from R&& to R.

This commit changes std::is_invocable_r to check the conversion without
using std::is_convertible.

std::function also contains a similar check using std::is_convertible,
which can be fixed by simply reusing std::is_invocable_r (but because
std::is_invocable_r is not defined for C++11 it uses the underlying
std::__is_invocable_impl trait directly).

PR libstdc++/91456
* include/bits/std_function.h (__check_func_return_type): Remove.
(function::_Callable): Use std::__is_invocable_impl instead of
__check_func_return_type.
* include/std/type_traits (__is_invocable_impl): Add another defaulted
template parameter. Define a separate partial specialization for
INVOKE and INVOKE. For INVOKE replace is_convertible check
with a check that models delayed temporary materialization.
* testsuite/20_util/function/91456.cc: New test.
* testsuite/20_util/is_invocable/91456.cc: New test.

Tested x86_64-linux, committed to trunk.

I might backport this to gcc-9-branch too, after some time on trunk.

commit a9ae95a61b2d8b5ccbbaff1f9bd0b3ed70c600ed
Author: Jonathan Wakely 
Date:   Thu Aug 15 15:46:39 2019 +0100

PR libstdc++/91456 make INVOKE work with uncopyable prvalues

In C++17 a function can return a prvalue of a type that cannot be moved
or copied. The current implementation of std::is_invocable_r uses
std::is_convertible to test the conversion to R required by INVOKE.
That fails for non-copyable prvalues, because std::is_convertible is
defined in terms of std::declval which uses std::add_rvalue_reference.
In C++17 conversion from R to R involves no copies and so is not the
same as conversion from R&& to R.

This commit changes std::is_invocable_r to check the conversion without
using std::is_convertible.

std::function also contains a similar check using std::is_convertible,
which can be fixed by simply reusing std::is_invocable_r (but because
std::is_invocable_r is not defined for C++11 it uses the underlying
std::__is_invocable_impl trait directly).

PR libstdc++/91456
* include/bits/std_function.h (__check_func_return_type): Remove.
(function::_Callable): Use std::__is_invocable_impl instead of
__check_func_return_type.
* include/std/type_traits (__is_invocable_impl): Add another 
defaulted
template parameter. Define a separate partial specialization for
INVOKE and INVOKE. For INVOKE replace is_convertible check
with a check that models delayed temporary materialization.
* testsuite/20_util/function/91456.cc: New test.
* testsuite/20_util/is_invocable/91456.cc: New test.

diff --git a/libstdc++-v3/include/bits/std_function.h 
b/libstdc++-v3/include/bits/std_function.h
index 5733bf5f3f9..42f87873d55 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -293,10 +293,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 };
 
-  template
-using __check_func_return_type
-  = __or_, is_same<_From, _To>, is_convertible<_From, _To>>;
-
   /**
*  @brief Primary class template for std::function.
*  @ingroup functors
@@ -309,8 +305,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   private _Function_base
 {
   template::type>
-   struct _Callable : __check_func_return_type<_Res2, _Res> { };
+  typename _Res2 = __invoke_result<_Func&, _ArgTypes...>>
+   struct _Callable
+   : __is_invocable_impl<_Res2, _Res>::type
+   { };
 
   // Used so the return type convertibility checks aren't done when
   // performing overload resolution for copy construction/assignment.
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index d3f853d4ce2..44db2cade5d 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2883,14 +2883,49 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // __is_invocable (std::is_invocable for C++11)
 
-  template
+  // The primary template is used for invalid INVOKE expressions.
+  template::value, typename = void>
 struct __is_invocable_impl : false_type { };
 
+  // Used for valid INVOKE and INVOKE expressions.
   template
-struct __is_invocable_impl<_Result, _Ret, __void_t>
-: __or_, is_convertible>::type
+struct __is_invocable_impl<_Result, _Ret,
+  /* is_void<_Ret> = */ true,
+ 

[Bug lto/91307] -flto causes binary to vary

2019-08-15 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91307

--- Comment #9 from Jan Hubicka  ---
I am not 100% sure if hashing calle names works safely, since they will all be
something like "static construction" or so, so I guess one can construct two
different translation units that will end up with same hash. But perhaps we
could teach LTO FE to produce those names purely on symbols exported from the
LTO unit since those should be unique and use it everywhere (including the
lto_priv numbers)?
Of course unless we go the route of getting API to the plugin interface to
obtain full symbol name.

[Bug middle-end/91459] Tail-Call Optimization is not performed when return value is assumed.

2019-08-15 Thread mike.k at digitalcarbide dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91459

--- Comment #1 from mike.k at digitalcarbide dot com ---
'foo3' in the assembly output should be 'foo2'. I'd changed the function name
in my test code and did not update the assembly. Apologies.

[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-08-15 Thread doko at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #4 from Matthias Klose  ---
looks like .L34 should be referenced instead of of .L37.

[Bug middle-end/91459] New: Tail-Call Optimization is not performed when return value is assumed.

2019-08-15 Thread mike.k at digitalcarbide dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91459

Bug ID: 91459
   Summary: Tail-Call Optimization is not performed when return
value is assumed.
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mike.k at digitalcarbide dot com
  Target Milestone: ---

In situations where a function either returns a specific value or does not
return at all, GCC fails to perform tail call optimizations. This appears to
occur on all GCC versions with -O1, -O2, -O3, and -Os. It occurs with both the
C and C++ front-ends.

Observe:

/* This function is guaranteed to only return the value '1', else it does not
return.
// This is meant to emulate a function such as 'exec'.
*/
extern int function_returns_only_1_or_doesnt_return(int, int);

int foo1(int a, int b) {
const int result = function_returns_only_1_or_doesnt_return(a, b);
if (result == 1) {
return result;
}
else {
__builtin_unreachable();
}
}

int foo2(int a, int b) {
return function_returns_only_1_or_doesnt_return(a, b);
}


This results in the following output for -O3 on x86-64:

foo1(int, int):
  push rax
  call function_returns_only_1_or_doesnt_return(int, int)
  mov eax, 1
  pop rdx
  ret
foo3(int, int):
  jmp function_returns_only_1_or_doesnt_return(int, int)

While the behavior is correct, the tail-call optimization is far more optimal
and preserves the same semantics.

The same behavior occurs with other architectures as well, so it does not
appear to be a back-end issue.

require trampolines for pr85044

2019-08-15 Thread Alexandre Oliva
Testcases that require support for trampolines should be marked as
such; gcc.target/i386/pr85044.c was missing it.  Fixed.

Tested on x86_64-linux-gnu.  Ok to install?


for  gcc/testsuite/ChangeLog

* gcc.target/i386/pr85044.c: Require support for trampolines.
---
 gcc/testsuite/gcc.target/i386/pr85044.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr85044.c 
b/gcc/testsuite/gcc.target/i386/pr85044.c
index a25cc7fe3252..02ef91d3dbbb 100644
--- a/gcc/testsuite/gcc.target/i386/pr85044.c
+++ b/gcc/testsuite/gcc.target/i386/pr85044.c
@@ -1,4 +1,5 @@
 /* { dg-do run { target cet } } */
+/* { dg-require-effective-target trampolines } */
 /* { dg-options "-O2 -fcf-protection=branch" } */
 
 void callme (void (*callback) (void));

-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara


[PATCH][gensupport] PR 91255: Do not error out immediately on set_attr_alternative with define_subst

2019-08-15 Thread Kyrill Tkachov

Hi all,

I'm trying to add a define_subst use in the arm backend but am getting 
many build errors complaining about:

`set_attr_alternative' is unsupported by `define_subst'

Looking at the gensupport.c code it iterates over all define_insns and 
errors if any of them have set_attr_alternative.


The usecase I'm targetting doesn't involve patterns with 
set_attr_alternative, so I would like to make the define_subst handling
more robust to only error out if the define_subst is actually attempted 
on a set_attr_alternative.


This patch produces the error only if the set_attr_alternative attr 
matches the subst name.

This allows a build of the arm backend with a define_subst usage to succeed.

Bootstrapped and tested on arm-none-linux-gnueabihf and x86_64-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2019-08-15  Kyrylo Tkachov  

    PR other/91255
    * gensupport.c (has_subst_attribute): Error out on set_attr_alternative
    only if subst_name matches curr_attr string.

diff --git a/gcc/gensupport.c b/gcc/gensupport.c
index 1aab7119901..c64f683bc5c 100644
--- a/gcc/gensupport.c
+++ b/gcc/gensupport.c
@@ -788,9 +788,10 @@ has_subst_attribute (class queue_elem *elem, class queue_elem *subst_elem)
 	  return false;
 
 	case SET_ATTR_ALTERNATIVE:
-	  error_at (elem->loc,
-		"%s: `set_attr_alternative' is unsupported by "
-		"`define_subst'", XSTR (elem->data, 0));
+	  if (strcmp (XSTR (cur_attr, 0), subst_name) == 0)
+	error_at (elem->loc,
+		  "%s: `set_attr_alternative' is unsupported by "
+		  "`define_subst'", XSTR (elem->data, 0));
 	  return false;
 
 


[Bug tree-optimization/91457] FAIL: g++.dg/warn/Warray-bounds-4.C -std=gnu++98 (test for warnings, line 25)

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91457

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-15
 CC||msebor at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot 
gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Martin Sebor  ---
Confirmed.  Here's a slightly simplified/modified test case showing the
difference in output between an x86_64 native GCC and an hppa2.0w-hp-hpux11.11
cross.  Here too compute_objsize fails because it doesn't handle the ARRAY_REF
argument in the assignment a.a[0] = 0.

$ (set -x && cat t.C && gcc -O2 -S -Wall -fdump-tree-dom3=/dev/stdout t.C &&
/build/hppa2.0w-hp-hpux11.11/gcc-svn/gcc/xgcc -B
/build/hppa2.0w-hp-hpux11.11/gcc-svn/gcc -O2 -S -Wall
-fdump-tree-dom3=/dev/stdout t.C)
+ cat t.C
struct A
{
  char n, a[0];

  A () { a[0] = 0; }
};

void f (struct A*);

void g (void)
{
  struct A a;
  f ();
}
+ /build/gcc-svn/gcc/xgcc -B /build/gcc-svn/gcc -O2 -S -Wall
-fdump-tree-dom3=/dev/stdout t.C
t.C: In function ‘void g()’:
t.C:5:13: warning: array subscript 0 is above array bounds of ‘char [0]’
[-Warray-bounds]
5 |   A () { a[0] = 0; }
  |  ~~~^

;; Function g (_Z1gv, funcdef_no=3, decl_uid=2317, cgraph_uid=4,
symbol_order=3)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }
g ()
{
  struct A a;

   [local count: 1073741824]:
  a ={v} {CLOBBER};
  a.a[0] = 0;
  f ();
  a ={v} {CLOBBER};
  return;

}


+ /build/hppa2.0w-hp-hpux11.11/gcc-svn/gcc/xgcc -B
/build/hppa2.0w-hp-hpux11.11/gcc-svn/gcc -O2 -S -Wall
-fdump-tree-dom3=/dev/stdout t.C

;; Function g (_Z1gv, funcdef_no=3, decl_uid=1911, cgraph_uid=4,
symbol_order=3)

;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;; 2 succs { 1 }
g ()
{
  struct A a;

   [local count: 1073741824]:
  MEM[(char *) + 1B] = 0;
  f ();
  a ={v} {CLOBBER};
  return;

}


In constructor ‘A::A()’,
inlined from ‘void g()’ at t.C:12:12:
t.C:5:15: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
5 |   A () { a[0] = 0; }
  |  ~^~~

[Bug tree-optimization/91457] FAIL: g++.dg/warn/Warray-bounds-4.C -std=gnu++98 (test for warnings, line 25)

2019-08-15 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91457

Martin Sebor  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #3 from Martin Sebor  ---
Let me enhance the function.

[Bug c++/90884] Option `-Wctor-dtor-privacy' enables warning even for system headers

2019-08-15 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90884

--- Comment #6 from Marek Polacek  ---
(In reply to Jonathan Wakely from comment #5)
> Marek, should this be backported to release branches too?

Ha, I'm just about to backport a few patches to 9.3, so I'll include this one
too.

[Bug c++/71484] Class with implicit public constructor triggers `-Wctor-dtor-privacy`

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71484

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||diagnostic
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-15
 Ever confirmed|0   |1
   Severity|minor   |normal

--- Comment #5 from Jonathan Wakely  ---
Here's a case where I think the warning is too noisy:

template
class trait
{
  template
static U get();

public:
  using type = decltype(get());
};

trait i;

When compiled with -Wctor-dtor-privacy this warns:

tr.cc:2:7: warning: all member functions in class 'trait' are private
[-Wctor-dtor-privacy]
2 | class trait
  |   ^

The member function is supposed to be private. It's used by the
typedef-declaration, but only the typedef needs to be public.

The documentation for the warning says "Also warn if there are no non-private
methods, and there's at least one private member function that isn't a
constructor or destructor." I think the warning should be suppressed if those
private member functions are used by a public typedef.

[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-08-15 Thread doko at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #3 from Matthias Klose  ---
Created attachment 46718
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46718=edit
expvar assembler file

[Bug c++/90884] Option `-Wctor-dtor-privacy' enables warning even for system headers

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90884

--- Comment #5 from Jonathan Wakely  ---
Marek, should this be backported to release branches too?

[Bug c++/71484] Class with implicit public constructor triggers `-Wctor-dtor-privacy`

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71484

--- Comment #4 from Jonathan Wakely  ---
(In reply to Olivier Kannengieser from comment #2)
> In GCC 8.1, 
> 
> #pragma GCC diagnostic ignored "-Wctor-dtor-privacy",
> does not fully disable the diagnostic message:
> 
> --> without ignored diagnostic:
> 
> test.cpp:2:8: warning: ‘struct S’ only defines private constructors
> and has no friends [-Wctor-dtor-privacy]
> struct S{
> ^
> test.cpp:4:3: note: ‘constexpr S::S(const S&)’ is public, but
> requires an existing ‘struct S’ object
> S(const S&) = default;
> 
> 
> ---> with ignored diagnostic:
> 
> test.cpp:4:3: note: ‘constexpr S::S(const S&)’ is public, but
> requires an existing ‘struct S’ object
> S(const S&) = default;

That was Bug 90884 and is now fixed.

[Bug c++/71484] Class with implicit public constructor triggers `-Wctor-dtor-privacy`

2019-08-15 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71484

--- Comment #3 from Jonathan Wakely  ---
(In reply to Kyle J Strand from comment #0)
> A class whose only user-declared methods are `private`, but which *does*
> have an implicit public constructor, can trigger `-Wctor-dtor-privacy`.
> 
> See http://stackoverflow.com/q/33157248/1858225 for discussion and sample
> code.
> 
> The offending code is copied here for convenience:
> 
> struct foo
> {
>   private:
> static int test(void) { return 3; };
> };

That's by design, and is the documented behaviour of the warning. Nothing can
ever call that static function.

[PATCH] Reapply missing patch for libsanitizer.

2019-08-15 Thread Martin Liška
Hi.

There's forgotten patch for libsanitizer that was not listed
in LOCAL_PATCHES. I've just tested the patch on ppc64
(gcc110 compile farm machine) and I'm going to install the patch.

Martin
>From 82662f97b6bacf21eee1185bc116aa22c0c89b33 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 15 Aug 2019 17:23:14 +0200
Subject: [PATCH] Reapply missing patch for libsanitizer.

libsanitizer/ChangeLog:

2019-08-15  Martin Liska  

	* tsan/tsan_rtl_ppc64.S: Reapply.
---
 libsanitizer/tsan/tsan_rtl_ppc64.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libsanitizer/tsan/tsan_rtl_ppc64.S b/libsanitizer/tsan/tsan_rtl_ppc64.S
index 8285e21aa1e..9e533a71a9c 100644
--- a/libsanitizer/tsan/tsan_rtl_ppc64.S
+++ b/libsanitizer/tsan/tsan_rtl_ppc64.S
@@ -1,5 +1,6 @@
 #include "tsan_ppc_regs.h"
 
+.machine altivec
 .section .text
 .hidden __tsan_setjmp
 .globl _setjmp
-- 
2.22.0



[Bug go/91035] [10 Regression] gotools fails to build on s390x-linux-gnu

2019-08-15 Thread doko at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91035

--- Comment #2 from Matthias Klose  ---
confirmed with trunk 20190815

build dir on
https://people.debian.org/~doko/tmp/tst-gotools.tar.xz

.L37 referenced in expvar.o

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2019-08-15 Thread tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398

--- Comment #24 from Tamar Christina  ---
>   if (pb[len] != cur[len])

Ah sorry, it also access cur + len, so no, indeed I think only unrolling is
possible here. Unless cur was originally a static array and you can tell it's
size under LTO?

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2019-08-15 Thread tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org

--- Comment #23 from Tamar Christina  ---
? > The 'pb' pointer is the 'cur' pointer but moved back by 'delta'.
> > Presumably that means that all memory between 'pb' and 'delta' and could be
> > read in as wide a load as possible?
>
> A C language lawyer would agree with that.  But does it really help?
> The loop also accesses [cur + len, cur + len_limit].

Could we not emit a runtime check for this? Check if len <= delta and len +
len_limit <= delta, and if so emit a vectorized version and if not fall back to
the scalar unrolled code?

though if the iteration counts are so small as Wilco suggests then maybe it's
really not worth doing so.

Re: [PATCH 3/9] operand_equal_p: add support for OBJ_TYPE_REF.

2019-08-15 Thread Jan Hubicka
> On Tue, Aug 6, 2019 at 5:43 PM Martin Liska  wrote:
> >
> >
> > gcc/ChangeLog:
> 
> +   /* Virtual table call.  */
> +   case OBJ_TYPE_REF:
> + {
> +   if (!operand_equal_p (OBJ_TYPE_REF_EXPR (arg0),
> + OBJ_TYPE_REF_EXPR (arg1), flags))
> + return false;
> +   if (virtual_method_call_p (arg0))
> + {
> +   if (tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg0))
> +   != tree_to_uhwi (OBJ_TYPE_REF_TOKEN (arg1)))
> + return false;
> +   if (!types_same_for_odr (obj_type_ref_class (arg0),
> +obj_type_ref_class (arg1)))
> + return false;
> +   if (!operand_equal_p (OBJ_TYPE_REF_OBJECT (arg0),
> + OBJ_TYPE_REF_OBJECT (arg1), flags))
> + return false;
> 
> this all gets deep into the devirt machinery, including looking at
> ODR type hashes.  So I'm not sure if we really want to handle
> it this "optimistic" in operand_equal_p and completely ignore
> other operands when !virtual_method_call_p?  That is, why
> not compare OBJ_TYPE_REF_TOKEN/OBJECT always at least?

For !virtual_method_call_p we do not use OBJ_TYPE_REF at all yet obj-C frontend
produce it.  I think we should remove them somewhere during gimplification.
We can definitly turn "optimistic" to "pesimistic" and return false here.

Otherwise the checks makes sense to me - it the tests above passes devirt
machinery ought to give same results.
> 
> Do we then have cases where the OBJ_TYPE_REF is actually
> distinct according to the remaining check?

I am not sure what you mean here?

Honza
> 
> + }
> 
> 
> > 2019-07-24  Martin Liska  
> >
> > * fold-const.c (operand_equal_p): Support OBJ_TYPE_REF.
> > * tree.c (add_expr): Hash parts of OBJ_TYPE_REF.
> > ---
> >  gcc/fold-const.c | 21 +
> >  gcc/tree.c   |  9 +
> >  2 files changed, 30 insertions(+)
> >


Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)

2019-08-15 Thread Bernd Edlinger
On 8/15/19 2:54 PM, Richard Biener wrote:
> On Thu, 15 Aug 2019, Bernd Edlinger wrote:
> 
>
> Hmm.  So your patch overrides user-alignment here.  Woudln't it
> be better to do that more conciously by
>
>   if (! DECL_USER_ALIGN (decl)
>   || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
>   && targetm.slow_unaligned_access (DECL_MODE (decl), align)))
>
>>
>> ? I don't know why that would be better?
>> If the value is underaligned no matter why, pretend it was declared as
>> naturally aligned if that causes wrong code otherwise.
>> That was the idea here.
> 
> It would be better because then we ignore it and use what we'd use
> by default rather than inventing sth new.  And your patch suggests
> it might be needed to up align even w/o DECL_USER_ALIGN.
> 

Hmmm, you mean the constant 1.0i should not have DECL_USER_ALIGN set?
But it inherits the alignment from the destination variable, apparently.

did you mean
if (! DECL_USER_ALIGN (decl)
&& align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
&& ...
?

I can give it a try.

> IMHO whatever code later fails to properly use unaligned loads
> should be fixed instead rather than ignoring user requested alignment.
>
> Can you quote a short testcase that explains what exactly goes wrong?
> The struct-layout ones are awkward to look at...
>

 Sure,

 $ cat test.c
 _Complex float __attribute__((aligned(1))) cf;

 void foo (void)
 {
   cf = 1.0i;
 }

 $ arm-linux-gnueabihf-gcc -S test.c 
 during RTL pass: expand
 test.c: In function 'foo':
 test.c:5:6: internal compiler error: in gen_movsf, at 
 config/arm/arm.md:7003
 5 |   cf = 1.0i;
   |   ~~~^~
 0x7ba475 gen_movsf(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/config/arm/arm.md:7003
 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
../../gcc-trunk/gcc/recog.h:318
 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.c:3695
 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.c:3791
 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.c:3490
 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.c:3791
 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool)
../../gcc-trunk/gcc/expr.c:5855
 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
../../gcc-trunk/gcc/expr.c:5441
>>>
>>> Huh, so why didn't it trigger
>>>
>>>   /* Handle misaligned stores.  */
>>>   mode = TYPE_MODE (TREE_TYPE (to));
>>>   if ((TREE_CODE (to) == MEM_REF
>>>|| TREE_CODE (to) == TARGET_MEM_REF)
>>>   && mode != BLKmode
>>>   && !mem_ref_refers_to_non_mem_p (to)
>>>   && ((align = get_object_alignment (to))
>>>   < GET_MODE_ALIGNMENT (mode))
>>>   && (((icode = optab_handler (movmisalign_optab, mode))
>>>!= CODE_FOR_nothing)
>>>   || targetm.slow_unaligned_access (mode, align)))
>>> {
>>>
>>> ?  (_Complex float is 32bit aligned it seems, the DECL_RTL for the
>>> var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] >> 0x2aad1240 cf>) [1 cf+0 S8 A8]), SCmode is 32bit aligned.
>>>
>>> Ah, 'to' is a plain DECL here so the above handling is incomplete.
>>> IIRC component refs like __real cf = 0.f should be handled fine
>>> again(?).  So, does adding || DECL_P (to) fix the case as well?
>>>
>>
>> So I tried this instead of the varasm.c change:
>>
>> Index: expr.c
>> ===
>> --- expr.c   (revision 274487)
>> +++ expr.c   (working copy)
>> @@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool nontem
>>/* Handle misaligned stores.  */
>>mode = TYPE_MODE (TREE_TYPE (to));
>>if ((TREE_CODE (to) == MEM_REF
>> -   || TREE_CODE (to) == TARGET_MEM_REF)
>> +   || TREE_CODE (to) == TARGET_MEM_REF
>> +   || DECL_P (to))
>>&& mode != BLKmode
>> -  && !mem_ref_refers_to_non_mem_p (to)
>> +  && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to))
>>&& ((align = get_object_alignment (to))
>>< GET_MODE_ALIGNMENT (mode))
>>&& (((icode = optab_handler (movmisalign_optab, mode))
>>
>> Result, yes, it fixes this test case
>> but then I run all struct-layout-1.exp there are sill cases. where we have 
>> problems:
>>
>> In file included from 
>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_x.c:8:^M
>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h:
>>  In function 'test2112':^M
>> /home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:23:10:
>>  internal compiler error: in gen_movdf, at config/arm/arm.md:7107^M
>> /home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:62:3:
>>  note: in definition of 

[Bug tree-optimization/91458] New: FAIL: g++.dg/tree-ssa/pr19807.C -std=gnu++98 scan-tree-dump-times optimized "\\\\[\\\\(void .\\\\) \\\\+ 8B\\\\]" 3

2019-08-15 Thread danglin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91458

Bug ID: 91458
   Summary: FAIL: g++.dg/tree-ssa/pr19807.C  -std=gnu++98
scan-tree-dump-times optimized "[(void
.) + 8B]" 3
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa2.0w-hp-hpux11.11
Target: hppa2.0w-hp-hpux11.11
 Build: hppa2.0w-hp-hpux11.11

We have the following fails:

FAIL: g++.dg/tree-ssa/pr19807.C  -std=gnu++98  scan-tree-dump-times optimized
"[(void .) + 8B]" 3
FAIL: g++.dg/tree-ssa/pr19807.C  -std=gnu++14  scan-tree-dump-times optimized
"[(void .) + 8B]" 3
FAIL: g++.dg/tree-ssa/pr19807.C  -std=gnu++17  scan-tree-dump-times optimized
"[(void .) + 8B]" 3

spawn /test/gnu/gcc/objdir/gcc/testsuite/g++/../../xg++
-B/test/gnu/gcc/objdir/gcc/testsuite/g++/../../
/test/gnu/gcc/gcc/gcc/testsuite/g++.dg/tree-ssa/pr19807.C
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -nostdinc++
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/libstdc++-v3/include/hppa2.0w-hp-hpux11.11
-I/test/gnu/gcc/objdir/hppa2.0w-hp-hpux11.11/libstdc++-v3/include
-I/test/gnu/gcc/gcc/libstdc++-v3/libsupc++
-I/test/gnu/gcc/gcc/libstdc++-v3/include/backward
-I/test/gnu/gcc/gcc/libstdc++-v3/testsuite/util -fmessage-length=0 -std=gnu++98
-O -fdump-tree-optimized -S -o pr19807.s
PASS: g++.dg/tree-ssa/pr19807.C  -std=gnu++98 (test for excess errors)
g++.dg/tree-ssa/pr19807.C  -std=gnu++98 : pattern found 0 times
FAIL: g++.dg/tree-ssa/pr19807.C  -std=gnu++98  scan-tree-dump-times optimized
"\\[\\(void .\\) \\+ 8B\\]" 3

In tree dump, we have:

;; Function foo (_Z3foov, funcdef_no=0, decl_uid=1894, cgraph_uid=1,
symbol_orde
r=4)

foo ()
{
   [local count: 1073741824]:
  x =   [(void *) + 8B];
  y =   [(void *) + 8B];
  z =   [(void *) + 8B];
  return;

}

Similar fail:
FAIL: g++.dg/tree-ssa/ssa-dse-1.C   scan-tree-dump-times dse1
"MEM[(struct FixBuf *)& + [0-9]+B] = {}" 1

Here we have:
  MEM  [(struct FixBuf *)& + 28B] = {};

Re: [PATCH] Prevent LTO section collision for a symbol name starting with '*'.

2019-08-15 Thread Jan Hubicka
> On Fri, Aug 9, 2019 at 3:57 PM Martin Liška  wrote:
> >
> > Hi.
> >
> > The patch is about prevention of LTO section name clashing.
> > Now we have a situation where body of 2 functions is streamed
> > into the same ELF section. Then we'll end up with smashed data.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Ready to be installed?
> 
> I think the comment should mention why we skip a leading '*'
> at all.  IIRC this is some target mangling applied to DECL_ASSEMBLER_NAME?
DECL_ASSEMBLER_NAME works in a way that if it starts with "*"
it is copied verbatim to the linker ouptut.  If it is w/o "*"
then user_label_prefix is applied first, see 
symbol_table::assembler_names_equal_p

So if we skip "*" one can definitly construct testcases of different
function names ending up in same section especially when
user_label_prefix is non-empty, like on Windows I think it is "_".

> And section names cannot contain '*'?  Or do we need to actually
> indentify '*fn' and 'fn' as the same?  For the testcase what is the clashing
> symbol?  Can't we have many so that using "0" always is broken as well?

We may have duplicate symbols during the compile time->WPA streaming
since we do not do lto-symtab at compile time and user can use asm names
that way.  This is not limited to extern inlines, so it would be nice to
make this work reliably. I also plan support for keeping multiple
function bodies defined for one symbol in cases it is necessary (glibc
checking, when optimization flags are mismatches) for WPA->ltrans
streaming.

I was always considering option to simply use node->order ids to stream
sections.  Because of way node->order is merged we area always able to
recover the ID.

It is however kind of nice to see actual names in the objdump
of the LTO .o files.  I would not mind that much to see this go and it
would also save bit of space since symbol names can be long.

Honza
> 
> Richard.
> 
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2019-08-09  Martin Liska  
> >
> > PR lto/91393
> > PR lto/88220
> > * lto-streamer.c (lto_get_section_name): Replace '*' leading
> > character with '0'.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2019-08-09  Martin Liska  
> >
> > PR lto/91393
> > PR lto/88220
> > * gcc.dg/lto/pr91393_0.c: New test.
> > ---
> >  gcc/lto-streamer.c   | 15 ---
> >  gcc/testsuite/gcc.dg/lto/pr91393_0.c | 11 +++
> >  2 files changed, 23 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/lto/pr91393_0.c
> >
> >


Re: [PATCH] Make cdtor names stable for LTO (PR lto/91307).

2019-08-15 Thread Jan Hubicka
> On Thu, Aug 1, 2019 at 3:10 PM Martin Liška  wrote:
> >
> > Hi.
> >
> > In LTO WPA mode we don't have to append temp file name
> > to the global cdtor function names.
> 
> Is that true?  You can link with -r -flinker-output=rel and use
> multiple WPA phases whose results you then finally link.
> 
> So I don't think it's that easy.  You might be able to look at
> all_translation_units, pick the one with a sensible name
> (hmm, not sure if we actually have a name there) and the lowest
> UID and use that?  Thus, make the set of involved source files
> known and pick one.  Ah,
> 
> struct GTY(()) tree_translation_unit_decl {
>   struct tree_decl_common common;
>   /* Source language of this translation unit.  Used for DWARF output.  */
>   const char * GTY((skip(""))) language;
>   /* TODO: Non-optimization used to build this translation unit.  */
>   /* TODO: Root of a partial DWARF tree for global types and decls.  */
> };
> 
> so you might be able to get at a filename via the decls location,
> I'm not sure.  Do we have any LTO records per "input file" where we
> can stream main_input_filename to?

This is all bit sloppy.  If you incrmentally link into .o file and then
use LTO again to add more code, you will very likely run into conflict
with .lto_priv clones as well. Especially now when we made them more
stable.

I wondered if Linker should not provide us also with list of symbols
that are used in the unit, so we can safely produce more local ones?

Honza
> 
> > It helps to have a reproducible
> > builds with LTO mode.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > Ready to be installed?
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2019-08-01  Martin Liska  
> >
> > PR lto/91307
> > * tree.c (get_file_function_name): Use "wpa" when
> > we are in WPA LTO mode.
> > ---
> >  gcc/tree.c | 17 +++--
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> >
> >


  1   2   3   >