RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Richard Biener
On Wed, 15 Nov 2017, tamar.christ...@arm.com wrote:

> > -----Original Message-----
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: Wednesday, November 15, 2017 12:50
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > i...@airs.com
> > Subject: RE: [PATCH][GCC][mid-end] Allow larger copies when target
> > supports unaligned access [Patch (1/2)]
> > 
> > On Wed, 15 Nov 2017, Tamar Christina wrote:
> > 
> > >
> > >
> > > > -----Original Message-----
> > > > From: Richard Biener [mailto:rguent...@suse.de]
> > > > Sent: Wednesday, November 15, 2017 08:24
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > > > i...@airs.com
> > > > Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> > > > supports unaligned access [Patch (1/2)]
> > > >
> > > > On Tue, 14 Nov 2017, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This patch allows larger bitsizes to be used as copy size when the
> > > > > target does not have SLOW_UNALIGNED_ACCESS.
> > > > >
> > > > > fun3:
> > > > >   adrpx2, .LANCHOR0
> > > > >   add x2, x2, :lo12:.LANCHOR0
> > > > >   mov x0, 0
> > > > >   sub sp, sp, #16
> > > > >   ldrhw1, [x2, 16]
> > > > >   ldrbw2, [x2, 18]
> > > > >   add sp, sp, 16
> > > > >   bfi x0, x1, 0, 8
> > > > >   ubfxx1, x1, 8, 8
> > > > >   bfi x0, x1, 8, 8
> > > > >   bfi x0, x2, 16, 8
> > > > >   ret
> > > > >
> > > > > is turned into
> > > > >
> > > > > fun3:
> > > > >   adrpx0, .LANCHOR0
> > > > >   add x0, x0, :lo12:.LANCHOR0
> > > > >   sub sp, sp, #16
> > > > >   ldrhw1, [x0, 16]
> > > > >   ldrbw0, [x0, 18]
> > > > >   strhw1, [sp, 8]
> > > > >   strbw0, [sp, 10]
> > > > >   ldr w0, [sp, 8]
> > > > >   add sp, sp, 16
> > > > >   ret
> > > > >
> > > > > which avoids the bfi's for a simple 3 byte struct copy.
> > > > >
> > > > > > Regression tested on aarch64-none-linux-gnu and
> > > > > > x86_64-pc-linux-gnu with no regressions.
> > > > > >
> > > > > > This patch is just split off from the previous combined patch
> > > > > > with AArch64, with a testcase added.
> > > > > >
> > > > > > I assume Jeff's ACK from
> > > > > > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still
> > > > > > valid as the code did not change.
> > > >
> > > > Given your no_slow_unalign isn't mode specific can't you use the
> > > > existing non_strict_align?
> > >
> > > No, because non_strict_align checks whether the target supports
> > > unaligned access at all.
> > >
> > > This no_slow_unalign corresponds instead to the target's
> > > slow_unaligned_access, which checks whether the access you want to make
> > > has a greater cost than doing an aligned access. ARM, for instance,
> > > always returns 1 (the value of STRICT_ALIGNMENT) for
> > > slow_unaligned_access, while for non_strict_align it may return 0 or 1
> > > based on the options provided to the compiler.
> > >
> > > The problem is that I have no way to test STRICT_ALIGNMENT or
> > > slow_unaligned_access, so I had to hardcode some targets that I know it
> > > does work on.
> > 
> > I see.  But then the slow_unaligned_access implementation should use
> > non_strict_align as default somehow as SLOW_UNALIGNED_ACCESS is
> > defaulted to STRICT_ALIGN.
> > 
> > Given that SLOW_UNALIGNED_ACCESS has different values for different
> > modes it would also make sense to be more specific for the testcase in
> > question, like word_mode_slow_unaligned_access to tell this only applies to
> > word_mode?
> 
> Ah, that's fair enough. I've updated the patch and the new changelog is:

Did you attach the old patch? I don't see strict_align being tested in
the word_mode_no_slow_unalign test.

Richard.

> 
> gcc/
> 2017-11-15  Tamar Christina  
> 
>   * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
>   with fast unaligned access.
>   * doc/sourcebuild.texi (word_mode_no_slow_unalign): New.
>   
> gcc/testsuite/
> 2017-11-15  Tamar Christina  
> 
>   * gcc.dg/struct-simple.c: New.
>   * lib/target-supports.exp
>   (check_effective_target_word_mode_no_slow_unalign): New.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > > >
> > > > Otherwise the expr.c change looks ok.
> > > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > >
> > > > > gcc/
> > > > > 2017-11-14  Tamar Christina  
> > > > >
> > > > >   * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > > > >   with fast unaligned access.
> > > > >   * doc/sourcebuild.texi (no_slow_unalign): New.
> > > > >
> > > > > gcc/testsuite/
> > > > > 2017-11-14  Tamar Christina  

[PATCH] MicroBlaze use default ident output generation

2017-11-15 Thread Nathan Rossi
Remove the MicroBlaze specific TARGET_ASM_OUTPUT_IDENT definition, and
use the default.

This resolves issues caused by the hook's use of the .sdata2 section
directive: assembly emitted after the ident output is incorrectly placed
in the .sdata2 section instead of .text or whichever section was expected.
This results in assembly failures, including operations with symbols
across different segments.
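
To illustrate the failure mode, here is a standalone sketch of the string
the removed hook builds. It mirrors the structure of
microblaze_asm_output_ident but is not GCC code: build_ident_asm and its
parameters are hypothetical, and the literal directives "\t.sdata2" and
"\t.rodata" are assumed stand-ins for SDATA2_SECTION_ASM_OP and
READONLY_DATA_SECTION_ASM_OP. Note that the emitted text switches section
and never switches back.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the removed hook: format the top-level asm
   for an ident STRING into BUF.  Strings no longer than THRESHOLD bytes
   (including the terminating NUL) go to the small read-only section.
   Returns the snprintf result.  */
int
build_ident_asm (char *buf, size_t buflen, const char *string,
                 size_t threshold)
{
  size_t size = strlen (string) + 1;
  /* Assumed values of the two section macros.  */
  const char *section_asm_op = size <= threshold ? "\t.sdata2" : "\t.rodata";

  /* A section directive followed by the data; nothing restores the
     previous section afterwards.  */
  return snprintf (buf, buflen, "%s\n\t.ascii \"%s\\0\"\n",
                   section_asm_op, string);
}
```

Anything the compiler emits after this text without its own section
directive would be assembled into the section selected here.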

gcc/ChangeLog

2017-11-16  Nathan Rossi  

PR target/83013
* config/microblaze/microblaze-protos.h
(microblaze_asm_output_ident): Delete
* config/microblaze/microblaze.c (microblaze_asm_output_ident): Delete
* config/microblaze/microblaze.h (TARGET_ASM_OUTPUT_IDENT): Default
---
 gcc/config/microblaze/microblaze-protos.h |  1 -
 gcc/config/microblaze/microblaze.c        | 24 ------------------------
 gcc/config/microblaze/microblaze.h        |  2 +-
 3 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/gcc/config/microblaze/microblaze-protos.h b/gcc/config/microblaze/microblaze-protos.h
index 747ef35971..3d3e8c5a64 100644
--- a/gcc/config/microblaze/microblaze-protos.h
+++ b/gcc/config/microblaze/microblaze-protos.h
@@ -51,7 +51,6 @@ extern int microblaze_regno_ok_for_base_p (int, int);
 extern HOST_WIDE_INT microblaze_initial_elimination_offset (int, int);
 extern void microblaze_declare_object (FILE *, const char *, const char *,
const char *, int);
-extern void microblaze_asm_output_ident (const char *);
 extern int microblaze_legitimate_pic_operand (rtx);
 extern bool microblaze_tls_referenced_p (rtx);
 extern int symbol_mentioned_p (rtx);
diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c
index 7487523877..379f4c4d7f 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -3377,30 +3377,6 @@ microblaze_eh_return (rtx op0)
   emit_insn (gen_movsi (gen_rtx_MEM (Pmode, stack_pointer_rtx), op0));
 }
 
-/* Queue an .ident string in the queue of top-level asm statements.
-   If the string size is below the threshold, put it into .sdata2.
-   If the front-end is done, we must be being called from toplev.c.
-   In that case, do nothing.  */
-void 
-microblaze_asm_output_ident (const char *string)
-{
-  const char *section_asm_op;
-  int size;
-  char *buf;
-
-  if (symtab->state != PARSING)
-return;
-
-  size = strlen (string) + 1;
-  if (size <= microblaze_section_threshold)
-section_asm_op = SDATA2_SECTION_ASM_OP;
-  else
-section_asm_op = READONLY_DATA_SECTION_ASM_OP;
-
-  buf = ACONCAT ((section_asm_op, "\n\t.ascii \"", string, "\\0\"\n", NULL));
-  symtab->finalize_toplevel_asm (build_string (strlen (buf), buf));
-}
-
 static void
 microblaze_elf_asm_init_sections (void)
 {
diff --git a/gcc/config/microblaze/microblaze.h b/gcc/config/microblaze/microblaze.h
index 59cc1cc2e3..06155d3163 100644
--- a/gcc/config/microblaze/microblaze.h
+++ b/gcc/config/microblaze/microblaze.h
@@ -705,7 +705,7 @@ do { \
 #define STRING_ASM_OP  "\t.asciz\t"
 
 #undef TARGET_ASM_OUTPUT_IDENT
-#define TARGET_ASM_OUTPUT_IDENT microblaze_asm_output_ident
+#define TARGET_ASM_OUTPUT_IDENT default_asm_output_ident_directive
 
 /* Default to -G 8 */
 #ifndef MICROBLAZE_DEFAULT_GVALUE
-- 
2.15.0




Re: [x86,avx][patch] Fix PR82983

2017-11-15 Thread Kirill Yukhin
Hello Julia!
On 14 Nov 09:45, Koval, Julia wrote:
> Didn't get in the list for some reason.
> 
> > -----Original Message-----
> > From: Koval, Julia
> > Sent: Tuesday, November 14, 2017 10:29 AM
> > To: GCC Patches 
> > Cc: Kirill Yukhin 
> > Subject: [x86,avx][patch] Fix PR82983
> > 
> > Hi, this patch fixes the GFNI check, which didn't work properly in the gfni+sse case.
> > 
> > gcc/
> > * config/i386/gfniintrin.h: Add sse check.
> > * config/i386/i386.c (ix86_expand_builtin): Fix gfni check.
Your patch is OK for trunk. I've checked it in.

--
Thanks, K


Re: [PATCH][i386,AVX] Enable VBMI2 support [1/7]

2017-11-15 Thread Kirill Yukhin
Hello Julia!
On 25 Oct 11:18, Koval, Julia wrote:
> Thanks, fix it.
> 
> gcc/
>   * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET,
>   OPTION_MASK_ISA_AVX512VBMI2_UNSET): New.
>   (ix86_handle_option): Handle -mavx512vbmi2.
>   * config/i386/cpuid.h: Add bit_AVX512VBMI2.
>   * config/i386/driver-i386.c (host_detect_local_cpu): Handle new bit.
>   * config/i386/i386-c.c (__AVX512VBMI2__): New.
>   * config/i386/i386.c (ix86_target_string): Handle -mavx512vbmi2.
>   (ix86_valid_target_attribute_inner_p): Ditto.
>   * config/i386/i386.h (TARGET_AVX512VBMI2, TARGET_AVX512VBMI2_P): New.
>   * config/i386/i386.opt (mavx512vbmi2): New option.
>   * doc/invoke.texi: Add new option.
Your patch is OK. I've checked it into main trunk.

--
Thanks, K


Re: [patch][i386, AVX] GFNI enabling [4/4]

2017-11-15 Thread Kirill Yukhin
Hello Julia,
On 17 Oct 13:28, Koval, Julia wrote:
> Fixed changelog.
> 
> gcc/
> * config/i386/gfniintrin.h (_mm_gf2p8mul_epi8, _mm256_gf2p8mul_epi8,
> _mm_mask_gf2p8mul_epi8, _mm_maskz_gf2p8mul_epi8,
> _mm256_mask_gf2p8mul_epi8, _mm256_maskz_gf2p8mul_epi8,
> _mm512_mask_gf2p8mul_epi8, _mm512_maskz_gf2p8mul_epi8,
> _mm512_gf2p8mul_epi8): New intrinsics.
> * config/i386/i386-builtin-types.def
> (V64QI_FTYPE_V64QI_V64QI): New type.
> * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8mulb_v64qi,
> __builtin_ia32_vgf2p8mulb_v64qi_mask, __builtin_ia32_vgf2p8mulb_v32qi,
> __builtin_ia32_vgf2p8mulb_v32qi_mask, __builtin_ia32_vgf2p8mulb_v16qi,
> __builtin_ia32_vgf2p8mulb_v16qi_mask): New builtins.
> * config/i386/sse.md (vgf2p8mulb_*): New pattern.
> * config/i386/i386.c (ix86_expand_args_builtin): Handle new type.
> 
> gcc/testsuite/
> * gcc.target/i386/avx512f-gf2p8mulb-2.c: New runtime tests.
> * gcc.target/i386/avx512vl-gf2p8mulb-2.c: Ditto.
> * gcc.target/i386/gfni-1.c: Add tests for GF2P8MUL.
> * gcc.target/i386/gfni-2.c: Ditto.
> * gcc.target/i386/gfni-3.c: Ditto.
> * gcc.target/i386/gfni-4.c: Ditto.
Your patch is OK. I've checked it into main trunk.

--
Thanks, k


Re: Make istreambuf_iterator::_M_sbuf immutable and add debug checks

2017-11-15 Thread Petr Ovtchenkov
On Mon, 6 Nov 2017 22:19:22 +0100
François Dumont  wrote:

> Hi
> 
>      Any final decision regarding this patch ?
> 
> François

https://gcc.gnu.org/ml/libstdc++/2017-11/msg00036.html
https://gcc.gnu.org/ml/libstdc++/2017-11/msg00035.html
https://gcc.gnu.org/ml/libstdc++/2017-11/msg00037.html
https://gcc.gnu.org/ml/libstdc++/2017-11/msg00034.html

--

   - ptr


Re: [PATCH 3/4] libstdc++: avoid character accumulation in istreambuf_iterator

2017-11-15 Thread Petr Ovtchenkov
On Wed, 15 Nov 2017 22:31:11 +0100
Paolo Carlini  wrote:

> Hi,
> 
> On 15/11/2017 11:48, Petr Ovtchenkov wrote:
> > Ask associated streambuf for character when needed instead of
> > accumulate it in istreambuf_iterator object.
> >
> > Benefits from this:
> >- minus one class member in istreambuf_iterator
> >- trivial synchronization of states of istreambuf_iterator
> >  and associated streambuf
> > ---
> > >   libstdc++-v3/include/bits/streambuf_iterator.h | 34 +++++++++++++++-------------------
> > >   1 file changed, 15 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
> > > index 08fb13b..203da9d 100644
> > --- a/libstdc++-v3/include/bits/streambuf_iterator.h
> > +++ b/libstdc++-v3/include/bits/streambuf_iterator.h
> > @@ -95,19 +95,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > // NB: This implementation assumes the "end of stream" value
> > // is EOF, or -1.
> > mutable streambuf_type* _M_sbuf;
> > -  mutable int_type _M_c;
> Obviously this would be an ABI-breaking change, which certainly we don't 
> want. Unless I missed a detailed discussion of the non-trivial way to 
> avoid it in one of the recent threads about these topics...

Do we really need to worry about a frozen sizeof for an instantiated
template? (The patch removes a private member of the template.)

If yes, then

   int_type __dummy;

is all we need.
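
The size argument can be sketched in C. The three structs below are
hypothetical analogues of the iterator before the patch, after naively
dropping the member, and after keeping a placeholder; they are not the
libstdc++ definitions, and real ABI compatibility involves more than sizeof.

```c
#include <stddef.h>

/* Hypothetical C analogues of the iterator layout; not libstdc++ code.  */
struct iter_before { void *sbuf; int c; };      /* _M_sbuf and _M_c */
struct iter_naive  { void *sbuf; };             /* member removed */
struct iter_padded { void *sbuf; int dummy; };  /* placeholder kept */

/* Return nonzero if the padded layout keeps the size of the original
   and the offset of the first member.  */
int
layout_preserved (void)
{
  return sizeof (struct iter_padded) == sizeof (struct iter_before)
         && offsetof (struct iter_padded, sbuf)
            == offsetof (struct iter_before, sbuf);
}
```

The placeholder keeps the object size stable, but code compiled against the
old header may still read or write the old member, which the sketch does
not model.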

> 
> Paolo.

--

   - ptr


Re: [PATCH v2] [libcc1] Rename C{,P}_COMPILER_NAME and remove triplet from them

2017-11-15 Thread Sergio Durigan Junior
On Wednesday, November 15 2017, Jim Wilson wrote:

> On 11/13/2017 01:10 PM, Sergio Durigan Junior wrote:
>> On Tuesday, September 26 2017, I wrote:
>>
>>> Ping^2.
>>
>> Ping^3.
>>
>> I'm sending the updated ChangeLog/patch.  I'm also removing gdb-patches
>> from the Cc list.
>>
>> libcc1/ChangeLog:
>> 2017-09-01  Sergio Durigan Junior  
>>  Pedro Alves  
>>
>>  * Makefile.am: Remove references to c-compiler-name.h and
>>  cp-compiler-name.h
>>  * Makefile.in: Regenerate.
>>  * compiler-name.hh: New file.
>>  * libcc1.cc: Don't include c-compiler-name.h.  Include
>>  compiler-name.hh.
>>  * libcp1.cc: Don't include cp-compiler-name.h.  Include
>>  compiler-name.hh.
>
> OK.
>
> This is a gcc plugin for gdb, so it makes sense that gdb developers
> should be allowed to decide how it should work.

Thanks Jim and Alex for the review.

I don't have permission to push to the GCC repository, so if one of you
guys could do it for me I'd appreciate it.

Thank you,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


Re: [PATCH] Improve -Wmaybe-uninitialized documentation

2017-11-15 Thread Martin Sebor

On 11/15/2017 07:31 AM, Jonathan Wakely wrote:

The docs for -Wmaybe-uninitialized have some issues:

- That first sentence is looong.
- Apparently some C++ programmers think "automatic variable" means one
 declared with C++11 `auto`, rather than simply a local variable.
- The sentence about only warning when optimizing is stuck in between
 two chunks talking about longjmp, which could be inferred to mean
 only the setjmp/longjmp part of the warning depends on optimization.

This attempts to make it easier to parse and understand.


I've always found the description remarkably precise.  Particularly
the bit where it talks about the two paths, one initialized and the
other not.  Your rewording loses that distinction so I don't think
it's as accurate, or even correct.

To use an example, this would satisfy the new description:

  int f (void)
  {
    int i;
    return i;
  }

but it doesn't match GCC behavior (it triggers -Wuninitialized,
not -Wmaybe-uninitialized).  Unless the distinction is more
subtle than I ascribe to it I think it needs to be preserved
in the rewording.
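
For contrast, here is a sketch of the two-path case that the existing
wording describes, where the variable is initialized on one path and not on
the other. The function name g and its parameters are illustrative; whether
a particular GCC version and optimization level actually diagnoses it is not
something this example asserts.

```c
/* Illustrative only: 'i' is assigned on the path where FLAG is nonzero
   and left uninitialized on the other path.  */
int
g (int flag, int value)
{
  int i;
  if (flag)
    i = value;
  return i;   /* well defined only when flag != 0 */
}
```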

Martin


Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-15 Thread Martin Sebor

On 11/15/2017 03:51 AM, Richard Biener wrote:

On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:

On 11/14/2017 05:28 AM, Richard Biener wrote:


On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:


Richard, this thread may have been conflated with the one Re:
[PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
(PR 82455) They are about different things.

I'm still looking for approval of:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html



Sorry, I pointed to an outdated version.  This is the latest
version:

  https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html

My bad...




+  tree maxbound
+ = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));

this looks possibly bogus.  Can you instead use

  up_bound_p1
= wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
(TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));

please?  Note you are _not_ computing the proper upper bound here because
that
is what you compute plus low_bound.

+  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);

+
+  tree arg = TREE_OPERAND (ref, 0);
+  tree_code code = TREE_CODE (arg);
+  if (code == COMPONENT_REF)
+ {
+  HOST_WIDE_INT off;
+  if (tree base = get_addr_base_and_unit_offset (ref, &off))
+{
+  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
+  if (TREE_CODE (size) == INTEGER_CST)
+ up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);

I think I asked this multiple times now but given 'ref' is the
variable array-ref
a.b.c[i] when you call get_addr_base_and_unit_offset (ref, &off) you
always
get a NULL_TREE return value.

So I asked you to pass it 'arg' instead ... which gets you the offset of
a.b.c, which looks like what you intended to get anyway.

I also wonder what you compute here - you are looking at the size of
'base'
but that is the size of 'a'.  You don't even use the computed offset!
Which
means you could have used get_base_address instead!?  Also the type
of 'base' may be completely off given MEM[&blk + 8].b.c[i] would return
blk
as base which might be an array of chars and not in any way related to
the type of the innermost structure we access with COMPONENT_REFs.

Why are you only looking at COMPONENT_REF args anyways?  You
don't want to handle a.b[3][i]?

That is, I'd have expected you do

   if (get_addr_base_and_unit_offset (ref, &off))
     up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
(up_bound_p1), off));


^


Please see the attached update.

Martin
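
The arithmetic behind the suggested replacement can be sketched in plain C.
The helpers below are hypothetical and use uint64_t in place of GCC's
wide_int; they show that the mask expression and the signed maximum agree
once the result is truncated to the type's precision, and that the element
bound is that maximum divided by the element size. The precision is assumed
to be between 1 and 64 and the element size to be nonzero.

```c
#include <stdint.h>

/* Mask of the low PRECISION bits, for 1 <= PRECISION <= 64.  */
static uint64_t
low_bits (unsigned precision)
{
  return precision >= 64 ? UINT64_MAX : (UINT64_C (1) << precision) - 1;
}

/* Largest signed value representable in PRECISION bits.  */
uint64_t
signed_max (unsigned precision)
{
  return low_bits (precision) >> 1;
}

/* The same value computed as ~(1 << (precision - 1)), then truncated
   to PRECISION bits.  */
uint64_t
signed_max_via_mask (unsigned precision)
{
  return ~(UINT64_C (1) << (precision - 1)) & low_bits (precision);
}

/* Largest signed value divided by the element size, truncating.
   ELTSIZE must be nonzero.  */
uint64_t
max_element_count (unsigned precision, uint64_t eltsize)
{
  return signed_max (precision) / eltsize;
}
```

Without the truncation step the mask form yields a value with high bits set
for precisions below 64, which is one way the two spellings can diverge.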
PR tree-optimization/82588 - missing -Warray-bounds on a excessively large index
PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds inner indic

gcc/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583
	* tree-vrp.c (check_array_ref): Handle flexible array members,
	string literals, and inner indices.
	(search_for_addr_array): Add detail to diagnostics.

gcc/testsuite/ChangeLog:
	PR tree-optimization/82588
	PR tree-optimization/82583	
	* c-c++-common/Warray-bounds.c: New test.
	* gcc.dg/Warray-bounds-11.c: Adjust.
	* gcc.dg/Warray-bounds-22.c: New test.

diff --git a/gcc/testsuite/c-c++-common/Warray-bounds.c b/gcc/testsuite/c-c++-common/Warray-bounds.c
new file mode 100644
index 000..bea36fb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Warray-bounds.c
@@ -0,0 +1,259 @@
+/* PR tree-optimization/82588 - missing -Warray-bounds on an excessively
+   large index
+   { dg-do compile }
+   { dg-require-effective-target alloca }
+   { dg-options "-O2 -Warray-bounds -ftrack-macro-expansion=0" }  */
+
+#define SIZE_MAX  __SIZE_MAX__
+#define DIFF_MAX  __PTRDIFF_MAX__
+#define DIFF_MIN  (-DIFF_MAX - 1)
+
+#define offsetof(T, m)   __builtin_offsetof (T, m)
+
+typedef __PTRDIFF_TYPE__ ssize_t;
+typedef __SIZE_TYPE__ size_t;
+
+extern ssize_t signed_value (void)
+{
+  extern volatile ssize_t signed_value_source;
+  return signed_value_source;
+}
+
+extern size_t unsigned_value (void)
+{
+  extern volatile size_t unsigned_value_source;
+  return unsigned_value_source;
+}
+
+ssize_t signed_range (ssize_t min, ssize_t max)
+{
+  ssize_t val = signed_value ();
+  return val < min || max < val ? min : val;
+}
+
+typedef struct AX { int n; char ax[]; } AX;
+
+typedef struct A1 { int i; char a1[1]; } A1;
+typedef struct B { int i; struct A1 a1x[]; } B;
+
+void sink (int, ...);
+
+#define R(min, max) signed_range (min, max)
+#define T(expr) sink (0, expr)
+
+struct __attribute__ ((packed)) S16 { unsigned i: 16; };
+
+void farr_char (void)
+{
+  extern char ac[];
+
+  T (ac[DIFF_MIN]);   /* { dg-warning "array subscript -\[0-9\]+ is below array bounds of .char *\\\[]." } */
+  T (ac[-1]); /* { dg-warning "array subscript -1 is below array bounds" } */
+  T (ac[0]);
+
+  T (ac[DIFF_MAX - 1]);
+  T (ac[DIFF_MAX]);   /* { dg-warning "array subscript \[0-9\]+ is above array bounds" } */
+  T 


[PATCH, rs6000] correct implementation of _mm_add_pi32

2017-11-15 Thread Steven Munroe
A small thinko in the implementation of _mm_add_pi32 that only shows
when compiling for power9.
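
As a reminder of what the intrinsic computes, here is a portable scalar
sketch of the lane-wise addition. It is not the rs6000 implementation and
uses no vector built-ins; the name add_pi32_scalar is made up, and a
uint64_t stands in for __m64.

```c
#include <stdint.h>

/* Add the two 32-bit lanes of M1 and M2 independently, with
   wraparound in each lane and no carry between lanes.  */
uint64_t
add_pi32_scalar (uint64_t m1, uint64_t m2)
{
  uint32_t lo = (uint32_t) m1 + (uint32_t) m2;
  uint32_t hi = (uint32_t) (m1 >> 32) + (uint32_t) (m2 >> 32);
  return ((uint64_t) hi << 32) | lo;
}
```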

./gcc/ChangeLog:

2017-11-15  Steven Munroe  

* config/rs6000/mmintrin.h (_mm_add_pi32[_ARCH_PWR9]): Correct
parameter list for vec_splats.

Index: gcc/config/rs6000/mmintrin.h
===
--- gcc/config/rs6000/mmintrin.h (revision 254714)
+++ gcc/config/rs6000/mmintrin.h (working copy)
@@ -463,8 +463,8 @@ _mm_add_pi32 (__m64 __m1, __m64 __m2)
 #if _ARCH_PWR9
   __vector signed int a, b, c;
 
-  a = (__vector signed int)vec_splats (__m1, __m1);
-  b = (__vector signed int)vec_splats (__m2, __m2);
+  a = (__vector signed int)vec_splats (__m1);
+  b = (__vector signed int)vec_splats (__m2);
   c = vec_add (a, b);
   return (__builtin_unpack_vector_int128 ((__vector __int128_t)c, 0));
 #else




Re: Patch ping^2

2017-11-15 Thread Jim Wilson

On 11/14/2017 08:29 AM, Jakub Jelinek wrote:

On Mon, Nov 06, 2017 at 05:22:36PM +0100, Jakub Jelinek wrote:

I'd like to ping the:

   http://gcc.gnu.org/ml/gcc-patches/2017-10/msg01895.html
   PR debug/82718
   Fix DWARF5 .debug_loclist handling with hot/cold partitioning

patch.  Thanks


Ping^2.


The testcase doesn't fail on mainline anymore.  It also doesn't fail 
with the 2017-11-05 snapshot.  It does fail with the 2017-10-22 
snapshot.  Perhaps something subtle changed in the optimizer.  But at 
the moment the testcase isn't doing anything useful.  Maybe it can be fixed?


The dwarf2out.c patch looks OK to me though.

Jim


[PATCH] Fix PowerPC testsuite not to look for *.c*~ files

2017-11-15 Thread Michael Meissner
I was back-porting some changes to the IBM Advance Toolchain branch by
creating a patch file and applying it with patch.  I tend to always use the
-b option to patch to create backup files.  I had new failures because, for
the new files in the bfp, dfp, and vsu sub-directories, patch created an empty
*.c.~1~ file, and the .exp tried to run it as a test.  Since we don't have any
non-C files in those directories, I changed the glob to just *.c.  I verified
that we get the same number of failures, successes, etc. with and without the
patch applied.  Can I check this into the trunk?
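
The difference between the two patterns can be checked with POSIX fnmatch,
which for these simple patterns is assumed to behave like the Tcl glob used
by the .exp files. The helper name pattern_matches and the file names are
illustrative.

```c
#include <fnmatch.h>

/* Return 1 if NAME matches the shell-style PATTERN, else 0.  */
int
pattern_matches (const char *pattern, const char *name)
{
  return fnmatch (pattern, name, 0) == 0;
}
```

With this helper, "*.c*" accepts a backup name such as "test.c.~1~", while
"*.c" accepts only names that end in ".c".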

[gcc/testsuite]
2017-11-15  Michael Meissner  

* gcc.target/powerpc/bfp/bfp.exp: Look for *.c files, not *.c*
files to prevent ~ files from getting recognized.
* gcc.target/powerpc/dfp/dfp.exp: Likewise.
* gcc.target/powerpc/vsu/vsu.exp: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/testsuite/gcc.target/powerpc/bfp/bfp.exp
===
--- gcc/testsuite/gcc.target/powerpc/bfp/bfp.exp (revision 254782)
+++ gcc/testsuite/gcc.target/powerpc/bfp/bfp.exp (working copy)
@@ -34,7 +34,7 @@ load_lib torture-options.exp
 # Initialize.
 dg-init
 
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c*]] "" $DEFAULT_CFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "" $DEFAULT_CFLAGS
 
 # All done.
 dg-finish
Index: gcc/testsuite/gcc.target/powerpc/dfp/dfp.exp
===
--- gcc/testsuite/gcc.target/powerpc/dfp/dfp.exp (revision 254782)
+++ gcc/testsuite/gcc.target/powerpc/dfp/dfp.exp (working copy)
@@ -33,7 +33,7 @@ load_lib torture-options.exp
 # Initialize.
 dg-init
 
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c*]] "" $DEFAULT_CFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "" $DEFAULT_CFLAGS
 
 # All done.
 dg-finish
Index: gcc/testsuite/gcc.target/powerpc/vsu/vsu.exp
===
--- gcc/testsuite/gcc.target/powerpc/vsu/vsu.exp (revision 254782)
+++ gcc/testsuite/gcc.target/powerpc/vsu/vsu.exp (working copy)
@@ -34,7 +34,7 @@ load_lib torture-options.exp
 # Initialize.
 dg-init
 
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c*]] "" $DEFAULT_CFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] "" $DEFAULT_CFLAGS
 
 # All done.
 dg-finish


Re: [PATCH v2] [libcc1] Rename C{,P}_COMPILER_NAME and remove triplet from them

2017-11-15 Thread Jim Wilson

On 11/13/2017 01:10 PM, Sergio Durigan Junior wrote:

On Tuesday, September 26 2017, I wrote:


Ping^2.


Ping^3.

I'm sending the updated ChangeLog/patch.  I'm also removing gdb-patches
from the Cc list.

libcc1/ChangeLog:
2017-09-01  Sergio Durigan Junior  
Pedro Alves  

* Makefile.am: Remove references to c-compiler-name.h and
cp-compiler-name.h
* Makefile.in: Regenerate.
* compiler-name.hh: New file.
* libcc1.cc: Don't include c-compiler-name.h.  Include
compiler-name.hh.
* libcp1.cc: Don't include cp-compiler-name.h.  Include
compiler-name.hh.


OK.

This is a gcc plugin for gdb, so it makes sense that gdb developers 
should be allowed to decide how it should work.


Jim


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-15 Thread Sandra Loosemore

On 11/15/2017 10:38 AM, Wilco Dijkstra wrote:

Sandra Loosemore wrote:


I'd prefer that you remove the reference to configure options entirely
here.  Nowadays most GCC users install a package provided by their OS
distribution, Linaro, etc, rather than trying to build GCC from scratch.


OK, I've removed that reference. Similarly the FRAME_POINTER_REQUIRED
bit as that statement is not only irrelevant but also completely incorrect.


+Enabled at levels @option{-O}, @option{-O1}, @option{-O2}, @option{-O3},
+@option{-Os} and @option{-Og}.


This last sentence makes no sense.  If the option is now enabled by
default, then the optimization level is irrelevant.


It's enabled from -O onwards, so I've changed it to the standard form used
elsewhere and updated the table for -O:

+Enabled by default at @option{-O} and higher.

Here is the cleaned up and simplified version:

[snip]


Thanks, this patch is OK with me.

-Sandra


Re: [PATCH][aarch64] Fix pr81356 - copy empty string with wzr, not a ldrb/strb

2017-11-15 Thread Steve Ellcey
re-re-ping.

Steve Ellcey
sell...@cavium.com

On Tue, 2017-10-24 at 11:16 -0700, Steve Ellcey wrote:
> Re-ping.
> 
> Steve Ellcey
> sell...@cavium.com
> 
> On Mon, 2017-09-25 at 10:36 -0700, Steve Ellcey wrote:
> > 
> > Ping.
> > 
> > Steve Ellcey
> > sell...@cavium.com
> > 
> > 
> > On Fri, 2017-09-15 at 11:22 -0700, Steve Ellcey wrote:
> > > 
> > > 
> > > > PR 81356 points out that doing a __builtin_strcpy of an empty string on
> > > > aarch64 does a copy from memory instead of just writing out a zero byte.
> > > > In looking at this I found that it was because of
> > > > aarch64_use_by_pieces_infrastructure_p, which returns false for
> > > > STORE_BY_PIECES.  The comment says:
> > > > 
> > > >   /* STORE_BY_PIECES can be used when copying a constant string, but
> > > >      in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
> > > >      For now we always fail this and let the move_by_pieces code copy
> > > >      the string from read-only memory.  */
> > > > 
> > > > But this doesn't seem to be the case anymore.  When I remove this function
> > > > and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it, the code
> > > > for __builtin_strcpy of a constant string seems to be either better or the
> > > > same.  The only time I got more instructions after removing this function
> > > > was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk
> > > > instructions to create the source followed by a store instead of doing a
> > > > load/store of 8 bytes.  The comment may have been applicable for
> > > > -mstrict-align at one time but it doesn't seem to be the case now.  I still
> > > > get better code without this routine under that option as well.
> > > 
> > > Bootstrapped and tested without regressions, OK to checkin?
> > > 
> > > Steve Ellcey
> > > sell...@cavium.com
> > > 
> > > 
> > > 
> > > 2017-09-15  Steve Ellcey  
> > > 
> > >   PR target/81356
> > >   * config/aarch64/aarch64.c
> > > (aarch64_use_by_pieces_infrastructure_p):
> > >   Remove.
> > >   (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.
> > > 
> > > 
> > > 2017-09-15  Steve Ellcey  
> > > 
> > >   * gcc.target/aarch64/pr81356.c: New test.
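
A minimal sketch of the case from the PR, for readers who want to inspect
the generated code. This is not the pr81356.c testcase added by the patch;
the function name copy_empty is made up, and __builtin_strcpy is assumed to
be available (GCC or a compatible compiler).

```c
/* Copy an empty string into DST; only the terminating NUL is written.  */
void
copy_empty (char *dst)
{
  __builtin_strcpy (dst, "");
}
```

On aarch64 the thing to look for in the assembly is whether the zero byte is
stored directly or first loaded from read-only memory.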


Re: [PATCH][aarch64] Put vector fnma instruction into canonical form for better code generation.

2017-11-15 Thread Steve Ellcey
Re-ping with an added cc to the aarch64 maintainers.

Steve Ellcey

On Tue, 2017-10-24 at 11:06 -0700, Steve Ellcey wrote:
> Ping.
> 
> Steve Ellcey
> 
> On Fri, 2017-10-06 at 14:01 -0700, Steve Ellcey wrote:
> > 
> > This patch is a follow up to a discussion at:
> > 
> > https://gcc.gnu.org/ml/gcc/2017-06/msg00126.html
> > 
> > For some reason the simd version of fnma in aarch64-simd.md
> > is not in the canonical form of having the neg operator on 
> > the first operand and instead has it on the second.  This 
> > results in sub-optimal code generation (an extra dup instruction).
> > 
> > I have moved the 'neg', rebuilt GCC and retested with this patch
> > There were no regressions.  OK to checkin?
> > 
> > 
> > 2017-10-06  Steve Ellcey  
> > 
> > * config/aarch64/aarch64-simd.md (fnma4): Move neg
> > operator
> > to canonical location.
> > 
> > 
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-sim
> > d.md
> > index 12da8be..d9ced50 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -1777,9 +1777,8 @@
> >  (define_insn "fnma4"
> >    [(set (match_operand:VHSDF 0 "register_operand" "=w")
> >     (fma:VHSDF
> > -     (match_operand:VHSDF 1 "register_operand" "w")
> > -  (neg:VHSDF
> > -   (match_operand:VHSDF 2 "register_operand" "w"))
> > +     (neg:VHSDF (match_operand:VHSDF 1 "register_operand"
> > "w"))
> > +     (match_operand:VHSDF 2 "register_operand" "w")
> >       (match_operand:VHSDF 3 "register_operand" "0")))]
> >    "TARGET_SIMD"
> >    "fmls\\t%0., %1., %2."
> > 
> > 
> > 
> > 2017-10-06  Steve Ellcey  
> > 
> > * gcc.target/aarch64/fmls.c: New test.
> > 
> > 
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/fmls.c b/gcc/testsuite/gcc.target/aarch64/fmls.c
> > index e69de29..1ea0e6a 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/fmls.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/fmls.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3" } */
> > +
> > +#define vector __attribute__((vector_size(16)))
> > +vector double a = {1.0,1.0};
> > +vector double b = {2.0,2.0};
> > +double x = 3.0;
> > +
> > +
> > +void __attribute__ ((noinline))
> > > +vf (double x, vector double *v1, vector double *v2, vector double *result)
> > +{
> > +  vector double s = v1[0];
> > +  vector double t = -v2[0];
> > +  vector double m = {x,x};
> > +  vector double r = t * m + s;
> > +  result[0] = r;
> > +}
> > +/* { dg-final { scan-assembler-not "dup" } } */


Re: [PATCH] fix -mnop-mcount generate 5byte nop in 32bit.

2017-11-15 Thread Uros Bizjak
Hello!

> "-mnop-mcount" needs to emit a 5-byte "nop" instruction.
> However, recent GCC emits only a 4-byte "nop" in 32-bit mode.
> I have tested with GCC 5.4 and 7.2.

-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+fprintf (file, "1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop.  */

Even the above change is not correct, since it will be assembled in a
different way on 32 bit and 64 bit targets (size prefix will be added
on 64 bit targets). Attached patch fixes this issue by emitting a
stream of bytes.
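
As a rough illustration of the byte-stream approach (a sketch only, not the
committed code): the helper below builds the same kind of directive line.
SKETCH_ASM_BYTE, format_nop5 and modrm_mod are hypothetical names introduced
here, and the "\t.byte\t" spelling is an assumption about the target's byte
directive.  The third byte, 0x44, is a ModRM byte whose mod field is 01, so
an explicit 8-bit displacement (0x00) follows and the instruction stays at
five bytes.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Assumption: stand-in for the target's byte directive macro.
#define SKETCH_ASM_BYTE "\t.byte\t"

// The five bytes named in the patch comment: nopl 0(%[re]ax,%[re]ax,1).
static const unsigned char nop5[] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

// Build an assembler line that spells the bytes out explicitly, so the
// assembler cannot choose a shorter or prefixed encoding on its own.
std::string format_nop5()
{
  std::string line = "1:" SKETCH_ASM_BYTE;
  char buf[8];
  for (std::size_t i = 0; i < sizeof nop5; ++i)
    {
      std::snprintf(buf, sizeof buf, "0x%02x", nop5[i]);
      if (i != 0)
        line += ", ";
      line += buf;
    }
  line += "\n";
  return line;
}

// Upper two bits of a ModRM byte; 1 means an 8-bit displacement follows.
unsigned modrm_mod(unsigned char modrm)
{
  return modrm >> 6;
}
```

Writing the bytes directly trades readability of the assembly output for an
encoding that is identical on 32-bit and 64-bit targets.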

2017-11-15  Uros Bizjak  

* config/i386/i386.c (x86_print_call_or_nop): Emit 5 byte nop
explicitly as a stream of bytes.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline, will be committed to release branches.

Uros.
Index: i386.c
===
--- i386.c  (revision 254773)
+++ i386.c  (working copy)
@@ -40473,7 +40473,8 @@ static void
 x86_print_call_or_nop (FILE *file, const char *target)
 {
   if (flag_nop_mcount)
-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+/* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */
+fprintf (file, "1:" ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
   else
 fprintf (file, "1:\tcall\t%s\n", target);
 }


Re: [PATCH #2], make Float128 built-in functions work with -mabi=ieeelongdouble

2017-11-15 Thread Michael Meissner
David tells me that the patch to enable float128 built-in functions to work
with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
under AIX.

I changed the code to use a separate table, which is built on the first call.
If the insn was not generated, its code will just be CODE_FOR_nothing, and
the KF->TF mode conversion will not be done.
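
A minimal, self-contained sketch of the lazily built lookup table described
above.  The enumerators and the remap_code function are hypothetical
stand-ins for the real insn codes; the only property relied on is that the
"nothing" code is zero, so a zero-initialized table means "no mapping".

```cpp
#include <cstddef>

// Hypothetical stand-ins for insn codes; CODE_NOTHING must be zero.
enum sketch_code {
  CODE_NOTHING = 0,
  CODE_SQRT_KF, CODE_SQRT_TF,
  CODE_ADD_KF, CODE_ADD_TF,
  CODE_OTHER,
  NUM_SKETCH_CODES
};

struct map_pair {
  sketch_code from;  // code that appears in the built-in tables
  sketch_code to;    // code to use instead
};

// Map a code through a table that is filled in on the first call.
// Codes without an entry are returned unchanged.
sketch_code remap_code(sketch_code icode)
{
  static sketch_code table[NUM_SKETCH_CODES];  // zero-initialized
  static bool first_time = true;
  static const map_pair pairs[] = {
    { CODE_SQRT_KF, CODE_SQRT_TF },
    { CODE_ADD_KF,  CODE_ADD_TF  },
  };

  if (first_time)
    {
      first_time = false;
      for (std::size_t i = 0; i < sizeof pairs / sizeof pairs[0]; ++i)
        table[pairs[i].from] = pairs[i].to;
    }

  return table[icode] != CODE_NOTHING ? table[icode] : icode;
}
```

Unlike a switch, the table tolerates several "from" values collapsing to the
same number, which is what happens when the insns are disabled and every
code becomes the "nothing" value.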

I have tested this on a little endian power8 system and there were no
regressions.  Once David verifies that it builds on AIX, can I check this into
the trunk?

2017-11-15  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_expand_builtin): Do not use a
switch to map KFmode built-in functions to TFmode.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 254782)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16786,27 +16786,45 @@ rs6000_expand_builtin (tree exp, rtx tar
  double (KFmode) or long double is IEEE 128-bit (TFmode).  It is simpler if
  we only define one variant of the built-in function, and switch the code
  when defining it, rather than defining two built-ins and using the
- overload table in rs6000-c.c to switch between the two.  */
+ overload table in rs6000-c.c to switch between the two.  On some systems
+ like AIX, the KF/TF mode insns are not generated, and they return
+ CODE_FOR_nothing.  */
   if (FLOAT128_IEEE_P (TFmode))
-switch (icode)
-  {
-  default:
-   break;
+{
+  struct map_f128 {
+   enum insn_code from;/* KFmode insn code that is in the tables.  */
+   enum insn_code to;  /* TFmode insn code to use instead.  */
+  };
+
+  static enum insn_code map_insn_code[NUM_INSN_CODES];
+  static bool first_time = true;
+  const static struct map_f128 map[] = {
+   { CODE_FOR_sqrtkf2_odd, CODE_FOR_sqrttf2_odd },
+   { CODE_FOR_trunckfdf2_odd,  CODE_FOR_trunctfdf2_odd },
+   { CODE_FOR_addkf3_odd,  CODE_FOR_addtf3_odd },
+   { CODE_FOR_subkf3_odd,  CODE_FOR_subtf3_odd },
+   { CODE_FOR_mulkf3_odd,  CODE_FOR_multf3_odd },
+   { CODE_FOR_divkf3_odd,  CODE_FOR_divtf3_odd },
+   { CODE_FOR_fmakf4_odd,  CODE_FOR_fmatf4_odd },
+   { CODE_FOR_xsxexpqp_kf, CODE_FOR_xsxexpqp_tf },
+   { CODE_FOR_xsxsigqp_kf, CODE_FOR_xsxsigqp_tf },
+   { CODE_FOR_xststdcnegqp_kf, CODE_FOR_xststdcnegqp_tf },
+   { CODE_FOR_xsiexpqp_kf, CODE_FOR_xsiexpqp_tf },
+   { CODE_FOR_xsiexpqpf_kf,CODE_FOR_xsiexpqpf_tf },
+   { CODE_FOR_xststdcqp_kf,CODE_FOR_xststdcqp_tf },
+  };
+
+  if (first_time)
+   {
+ first_time = false;
+ gcc_assert ((int)CODE_FOR_nothing == 0);
+ for (i = 0; i < ARRAY_SIZE (map); i++)
+   map_insn_code[(int)map[i].from] = map[i].to;
+   }
 
-  case CODE_FOR_sqrtkf2_odd:   icode = CODE_FOR_sqrttf2_odd;   break;
-  case CODE_FOR_trunckfdf2_odd:icode = CODE_FOR_trunctfdf2_odd; break;
-  case CODE_FOR_addkf3_odd:    icode = CODE_FOR_addtf3_odd;    break;
-  case CODE_FOR_subkf3_odd:    icode = CODE_FOR_subtf3_odd;    break;
-  case CODE_FOR_mulkf3_odd:    icode = CODE_FOR_multf3_odd;    break;
-  case CODE_FOR_divkf3_odd:    icode = CODE_FOR_divtf3_odd;    break;
-  case CODE_FOR_fmakf4_odd:    icode = CODE_FOR_fmatf4_odd;    break;
-  case CODE_FOR_xsxexpqp_kf:   icode = CODE_FOR_xsxexpqp_tf;   break;
-  case CODE_FOR_xsxsigqp_kf:   icode = CODE_FOR_xsxsigqp_tf;   break;
-  case CODE_FOR_xststdcnegqp_kf:   icode = CODE_FOR_xststdcnegqp_tf; break;
-  case CODE_FOR_xsiexpqp_kf:   icode = CODE_FOR_xsiexpqp_tf;   break;
-  case CODE_FOR_xsiexpqpf_kf:  icode = CODE_FOR_xsiexpqpf_tf;  break;
-  case CODE_FOR_xststdcqp_kf:  icode = CODE_FOR_xststdcqp_tf;  break;
-  }
+  if (map_insn_code[(int)icode] != CODE_FOR_nothing)
+   icode = map_insn_code[(int)icode];
+}
 
   if (TARGET_DEBUG_BUILTIN)
 {


Re: [PATCH 3/4] libstdc++: avoid character accumulation in istreambuf_iterator

2017-11-15 Thread Paolo Carlini

Hi,

On 15/11/2017 11:48, Petr Ovtchenkov wrote:

> Ask the associated streambuf for the character when needed instead of
> accumulating it in the istreambuf_iterator object.
>
> Benefits from this:
>    - one fewer class member in istreambuf_iterator
>    - trivial synchronization of the states of istreambuf_iterator
>      and the associated streambuf
> ---
>   libstdc++-v3/include/bits/streambuf_iterator.h | 34 --
>   1 file changed, 15 insertions(+), 19 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
> index 08fb13b..203da9d 100644
> --- a/libstdc++-v3/include/bits/streambuf_iterator.h
> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h
> @@ -95,19 +95,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> // NB: This implementation assumes the "end of stream" value
> // is EOF, or -1.
> mutable streambuf_type* _M_sbuf;
> -  mutable int_type _M_c;

Obviously this would be an ABI-breaking change, which certainly we don't 
want. Unless I missed a detailed discussion of the non-trivial way to 
avoid it in one of the recent threads about these topics...


Paolo.


[PATCH] fix -mnop-mcount generate 5byte nop in 32bit.

2017-11-15 Thread 박한범
"-mnop-mcount" needs to emit a 5-byte "nop" instruction.
However, recent GCC emits only a 4-byte "nop" in 32-bit mode.
I have tested with GCC 5.4 and 7.2.


===
bug result
===
080485c5 :
 80485c5:   0f 1f 04 00             nopl   (%eax,%eax,1)
 80485c9:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80485cd:   83 e4 f0                and    $0xfff0,%esp

===
fixed result
===
08048598 :
 8048598:   0f 1f 44 00 01          nopl   0x1(%eax,%eax,1)
 804859d:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80485a1:   83 e4 f0                and    $0xfff0,%esp


Is it OK?


===
Index : gcc/config/i386/i386.c
===
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c6ca071..e574de3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40474,7 +40474,7 @@ static void
 x86_print_call_or_nop (FILE *file, const char *target)
 {
   if (flag_nop_mcount)
-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+fprintf (file, "1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
   else
 fprintf (file, "1:\tcall\t%s\n", target);
 }


Re: lambda-switch regression

2017-11-15 Thread Martin Sebor

On 11/15/2017 10:38 AM, David Malcolm wrote:

On Wed, 2017-11-15 at 12:25 -0500, David Malcolm wrote:

On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:

On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:

g++.dg/lambda/lambda-switch.C Has recently regressed.


g++.dg/cpp0x/lambda/lambda-switch.C


It appears the
location of a warning message has moved.

  l = []()  // { dg-warning "statement will never be executed" }
{
case 3: // { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the
first.  That does not seem useful.

I've not investigated what patch may have caused this, on the chance
someone might already know?

nathan


The warning was added in r236597 (aka
1398da0f786e120bb0b407e84f412aa9fc6d80ee):

+2016-05-23  Marek Polacek  
+
+   PR c/49859
+   * common.opt (Wswitch-unreachable): New option.
+   * doc/invoke.texi: Document -Wswitch-unreachable.
+   * gimplify.c (gimplify_switch_expr): Implement the
+   -Wswitch-unreachable warning.

which had it there (23:7).

r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
warning from 23:7 up to the "[] ()" at 19:6 in:

+2017-01-20  Marek Polacek  
+
+   PR c/64279
[...snip...]
+   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.

I tried it with some working copies I have to hand:
- works for me with r254387 (2017-11-03)
- fails for me with r254700 (2017-11-13)

so hopefully that helps track it down.

Dave


Searching in the November archives of the gcc-regression ML for
"lambda-switch.c":

https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd=extended=all=D
=lambda-switch.c=%2Fml%2Fgcc-regression%2F2017-11%2F%25

showed e.g.:
  https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
   "Regressions on trunk at revision 254648 vs revision 254623"

which says this is a new failure somewhere in that range; so it
presumably happened sometime on 2017-11-10 after r254623 and up to
(maybe ==) r254648.

Looking at:
   svn log -r r254623:r254648 |less
nothing jumps out at me as being related.

Hope this is helpful
Dave


Actually, https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00157.html
has a tighter range: r254628 vs r254635.

Looking at:
  svn log -r r254628:r254635 |less
I see msebor's r254630 ("PR c/81117 - Improve buffer overflow checking
in strncpy") has:

* gimple.c (gimple_build_call_from_tree): Set call location.

with:
+  gimple_set_location (call, EXPR_LOCATION (t));

Maybe that's it?  (nothing else in that commit range seems to affect
locations).


Yes, that's it.  Before the change there would be no location
associated with a GIMPLE call seen in gimple-fold.  The location
would only get added later, after folding.

The purpose of the lambda-switch.C test is to verify GCC doesn't
ICE on the ill-formed code.  The warning is incidental to the test
case so I've adjusted it to filter it out.

Martin



Re: Hurd port for gcc-7 go PATCH 1-3(15)

2017-11-15 Thread Svante Signell
On Wed, 2017-11-15 at 21:40 +0100, Matthias Klose wrote:
> On 06.11.2017 16:36, Svante Signell wrote:
> > Hi,
> > 
> > Attached are patches to enable gccgo to build properly on Debian
> > GNU/Hurd on gcc-7 (7-7.2.0-12).
> 
> sysinfo.go:6744:7: error: redefinition of 'SYS_IOCTL'
>  const SYS_IOCTL = _SYS_ioctl
>    ^
> sysinfo.go:6403:7: note: previous definition of 'SYS_IOCTL' was here
>  const SYS_IOCTL = 0
>    ^
> the patches break the build on any Linux architecture.  Please could you test
> your patches against a linux target as well?

I'm really sorry. I regularly do that, but missed this one for gcc-7. Do you
mean the patches against gcc-8 you asked me for? You wrote that gcc-7 is not of
interest and I should concentrate on gcc-8.

Again, I'm really sorry. Will fix this tomorrow hopefully.

Thanks!



[PATCH 2/4] libstdc++: istreambuf_iterator keep attached streambuf

2017-11-15 Thread Petr Ovtchenkov
istreambuf_iterator should not forget about the attached
streambuf when it reaches EOF.

Checks in debug mode no longer influence character
extraction in the istreambuf_iterator increment operators.
In this respect, behaviour in debug and non-debug mode
is now similar.

Test for a detached streambuf in istreambuf_iterator:
when istreambuf_iterator reaches EOF of the istream, it should not
forget about the attached streambuf.
The fact that EOF was reached in the stream does not imply that
the stream has reached the end of its life or that further input
operations are impossible.

Postfix increment (r++) returns a proxy object, because

  copies of the previous value of r are no longer
  required either to be dereferenceable or to be in
  the domain of ==.

i.e. a type that is usable only for dereferencing and extracting
the "previous" character.

istreambuf_iterator should have a constructor from the proxy object,
so the proxy should store a pointer to the streambuf object.
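
To illustrate the point that "EOF reached" is not the same as "end of life",
here is a small sketch that uses only std::stringbuf (it does not exercise
istreambuf_iterator itself, and it is not the test added by this patch).
The function name eof_is_not_end_of_life is hypothetical.  An empty
stringbuf reports EOF, but once a character is written to it, the next
sgetc() sees that character.

```cpp
#include <sstream>
#include <string>

// Returns true if a streambuf that reported EOF yields characters again
// after more data has been written to it.
bool eof_is_not_end_of_life()
{
  typedef std::char_traits<char> traits;
  std::stringbuf sb;  // default mode: in | out, initially empty

  // Nothing to read yet: sgetc() reports EOF.
  if (!traits::eq_int_type(sb.sgetc(), traits::eof()))
    return false;

  // New data arrives after EOF was observed.
  sb.sputc('x');
  if (!traits::eq_int_type(sb.sgetc(), traits::to_int_type('x')))
    return false;

  // Consume it; the buffer is at EOF again.
  sb.sbumpc();
  if (!traits::eq_int_type(sb.sgetc(), traits::eof()))
    return false;

  // And once more data can show up.
  sb.sputc('y');
  return traits::eq_int_type(sb.sgetc(), traits::to_int_type('y'));
}
```

An iterator that drops its streambuf pointer at the first EOF could not
observe the second character in a scenario like this one.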
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 67 ++
 .../24_iterators/istreambuf_iterator/3.cc  | 66 +
 2 files changed, 109 insertions(+), 24 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 69ee013..08fb13b 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -98,6 +98,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   mutable int_type _M_c;
 
 public:
+  class proxy
+  {
+  friend class istreambuf_iterator;
+  private:
+  proxy(int_type c, streambuf_type*sbuf_) :
+  _M_c(c),
+  _M_sbuf(sbuf_)
+  { }
+  int_type _M_c;
+  streambuf_type*  _M_sbuf;
+
+  public:
+  char_type
+  operator*() const
+  { return traits_type::to_char_type(_M_c); }
+  };
+
+public:
   ///  Construct end of input stream iterator.
   _GLIBCXX_CONSTEXPR istreambuf_iterator() _GLIBCXX_USE_NOEXCEPT
   : _M_sbuf(0), _M_c(traits_type::eof()) { }
@@ -116,6 +134,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator(streambuf_type* __s) _GLIBCXX_USE_NOEXCEPT
   : _M_sbuf(__s), _M_c(traits_type::eof()) { }
 
+  ///  Construct start of istreambuf iterator.
+  istreambuf_iterator(const proxy& __p) _GLIBCXX_USE_NOEXCEPT
+  : _M_sbuf(__p._M_sbuf), _M_c(traits_type::eof()) { }
+
   ///  Return the current character pointed to by iterator.  This returns
   ///  streambuf.sgetc().  It cannot be assigned.  NB: The result of
   ///  operator*() on an end of stream is undefined.
@@ -136,29 +158,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator&
   operator++()
   {
-   __glibcxx_requires_cond(!_M_at_eof(),
+   __glibcxx_requires_cond(_M_sbuf,
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
if (_M_sbuf)
  {
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   int_type __tmp =
+#endif
_M_sbuf->sbumpc();
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   __glibcxx_requires_cond(!traits_type::eq_int_type(__tmp,traits_type::eof()),
+   _M_message(__gnu_debug::__msg_inc_istreambuf)
+   ._M_iterator(*this));
+#endif
_M_c = traits_type::eof();
  }
return *this;
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
-  istreambuf_iterator
+  proxy
   operator++(int)
   {
-   __glibcxx_requires_cond(!_M_at_eof(),
+_M_get();
+   __glibcxx_requires_cond(_M_sbuf
+   && !traits_type::eq_int_type(_M_c,traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
-   istreambuf_iterator __old = *this;
+   proxy __old(_M_c, _M_sbuf);
if (_M_sbuf)
  {
-   __old._M_c = _M_sbuf->sbumpc();
+   _M_sbuf->sbumpc();
_M_c = traits_type::eof();
  }
return __old;
@@ -177,18 +209,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_get() const
   {
const int_type __eof = traits_type::eof();
-   int_type __ret = __eof;
-   if (_M_sbuf)
- {
-   if (!traits_type::eq_int_type(_M_c, __eof))
- __ret = _M_c;
-   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
-  __eof))
- _M_c = __ret;
-   else
- _M_sbuf = 0;
- }
-   return __ret;
+   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
+  _M_c = _M_sbuf->sgetc();
+   return _M_c;
   }
 
   bool
@@ -339,7 +362,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename 

[PATCH 1/4] Revert "2017-10-04 Petr Ovtchenkov <p...@void-ptr.info>"

2017-11-15 Thread Petr Ovtchenkov
This reverts commit 0dfbafdf338cc6899d146add5161e52efb02c067
(svn r253417).
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 59 ++
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 081afe5..69ee013 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,7 +95,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
   mutable streambuf_type*  _M_sbuf;
-  int_type _M_c;
+  mutable int_type _M_c;
 
 public:
   ///  Construct end of input stream iterator.
@@ -122,29 +122,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   char_type
   operator*() const
   {
-   int_type __c = _M_get();
-
 #ifdef _GLIBCXX_DEBUG_PEDANTIC
// Dereferencing a past-the-end istreambuf_iterator is a
// libstdc++ extension
-   __glibcxx_requires_cond(!_S_is_eof(__c),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_deref_istreambuf)
._M_iterator(*this));
 #endif
-   return traits_type::to_char_type(__c);
+   return traits_type::to_char_type(_M_get());
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
   istreambuf_iterator&
   operator++()
   {
-   __glibcxx_requires_cond(_M_sbuf &&
-   (!_S_is_eof(_M_c) || !_S_is_eof(_M_sbuf->sgetc())),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
-
-   _M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
+   if (_M_sbuf)
+ {
+   _M_sbuf->sbumpc();
+   _M_c = traits_type::eof();
+ }
return *this;
   }
 
@@ -152,14 +151,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator
   operator++(int)
   {
-   __glibcxx_requires_cond(_M_sbuf &&
-   (!_S_is_eof(_M_c) || !_S_is_eof(_M_sbuf->sgetc())),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
istreambuf_iterator __old = *this;
-   __old._M_c = _M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
+   if (_M_sbuf)
+ {
+   __old._M_c = _M_sbuf->sbumpc();
+   _M_c = traits_type::eof();
+ }
return __old;
   }
 
@@ -175,21 +176,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   int_type
   _M_get() const
   {
-   int_type __ret = _M_c;
-   if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret = _M_sbuf->sgetc()))
- _M_sbuf = 0;
+   const int_type __eof = traits_type::eof();
+   int_type __ret = __eof;
+   if (_M_sbuf)
+ {
+   if (!traits_type::eq_int_type(_M_c, __eof))
+ __ret = _M_c;
+   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
+  __eof))
+ _M_c = __ret;
+   else
+ _M_sbuf = 0;
+ }
return __ret;
   }
 
   bool
   _M_at_eof() const
-  { return _S_is_eof(_M_get()); }
-
-  static bool
-  _S_is_eof(int_type __c)
   {
const int_type __eof = traits_type::eof();
-   return traits_type::eq_int_type(__c, __eof);
+   return traits_type::eq_int_type(_M_get(), __eof);
   }
 };
 
@@ -367,14 +373,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename __is_iterator_type::traits_type traits_type;
   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
   typedef typename traits_type::int_type   int_type;
-  const int_type __eof = traits_type::eof();
 
   if (__first._M_sbuf && !__last._M_sbuf)
{
  const int_type __ival = traits_type::to_int_type(__val);
  streambuf_type* __sb = __first._M_sbuf;
  int_type __c = __sb->sgetc();
- while (!traits_type::eq_int_type(__c, __eof)
+ while (!traits_type::eq_int_type(__c, traits_type::eof())
 && !traits_type::eq_int_type(__c, __ival))
{
  streamsize __n = __sb->egptr() - __sb->gptr();
@@ -391,9 +396,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__c = __sb->snextc();
}
 
- __first._M_c = __eof;
+ if (!traits_type::eq_int_type(__c, traits_type::eof()))
+   __first._M_c = __c;
+ else
+   __first._M_sbuf = 0;
}
-
   return __first;
 }
 
-- 
2.10.1



[PATCH 3/4] libstdc++: avoid character accumulation in istreambuf_iterator

2017-11-15 Thread Petr Ovtchenkov
Ask the associated streambuf for the character when needed instead of
accumulating it in the istreambuf_iterator object.

Benefits from this:
  - one fewer class member in istreambuf_iterator
  - trivial synchronization of the states of istreambuf_iterator
    and the associated streambuf
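
The idea can be sketched with a deliberately simplified iterator.  This is
not the libstdc++ class; sketch_iterator, its proxy and the drain helper are
hypothetical names used only for illustration.  The iterator stores nothing
but the streambuf pointer and asks the streambuf for the current character
each time one is needed; only the proxy returned by postfix ++ keeps a
character.

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical, simplified iterator that caches no character.
class sketch_iterator
{
  typedef std::char_traits<char> traits;

public:
  // Result of postfix ++: remembers the character that was current.
  class proxy
  {
    friend class sketch_iterator;
    proxy(int c, std::streambuf* sb) : c_(c), sb_(sb) { }
    int c_;
    std::streambuf* sb_;

  public:
    char operator*() const
    { return std::char_traits<char>::to_char_type(c_); }
  };

  explicit sketch_iterator(std::streambuf* sb = 0) : sb_(sb) { }

  // Always reflects the streambuf's current state.
  char operator*() const
  { return traits::to_char_type(get()); }

  sketch_iterator& operator++()
  {
    if (sb_)
      sb_->sbumpc();
    return *this;
  }

  proxy operator++(int)
  {
    proxy old(get(), sb_);
    if (sb_)
      sb_->sbumpc();
    return old;
  }

  bool at_eof() const
  { return traits::eq_int_type(get(), traits::eof()); }

private:
  int get() const
  { return sb_ ? sb_->sgetc() : traits::eof(); }

  std::streambuf* sb_;
};

// Read everything currently available from the streambuf.
std::string drain(std::streambuf* sb)
{
  std::string out;
  for (sketch_iterator it(sb); !it.at_eof(); )
    out += *it++;
  return out;
}
```

Because no character is held in the iterator, there is no cached state that
can fall out of step with the streambuf.  The cost is a call to sgetc() on
every dereference.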
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 34 --
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 08fb13b..203da9d 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,19 +95,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
   mutable streambuf_type*  _M_sbuf;
-  mutable int_type _M_c;
 
 public:
   class proxy
   {
   friend class istreambuf_iterator;
   private:
-  proxy(int_type c, streambuf_type*sbuf_) :
+  proxy(int_type c, streambuf_type* sbuf_) :
   _M_c(c),
   _M_sbuf(sbuf_)
   { }
   int_type _M_c;
-  streambuf_type*  _M_sbuf;
+  streambuf_type* _M_sbuf;
 
   public:
   char_type
@@ -118,7 +117,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   ///  Construct end of input stream iterator.
   _GLIBCXX_CONSTEXPR istreambuf_iterator() _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(0), _M_c(traits_type::eof()) { }
+  : _M_sbuf(0) { }
 
 #if __cplusplus >= 201103L
   istreambuf_iterator(const istreambuf_iterator&) noexcept = default;
@@ -128,15 +127,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   ///  Construct start of input stream iterator.
   istreambuf_iterator(istream_type& __s) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__s.rdbuf()), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__s.rdbuf()) { }
 
   ///  Construct start of streambuf iterator.
   istreambuf_iterator(streambuf_type* __s) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__s), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__s) { }
 
   ///  Construct start of istreambuf iterator.
   istreambuf_iterator(const proxy& __p) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__p._M_sbuf), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__p._M_sbuf) { }
 
   ///  Return the current character pointed to by iterator.  This returns
   ///  streambuf.sgetc().  It cannot be assigned.  NB: The result of
@@ -147,11 +146,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifdef _GLIBCXX_DEBUG_PEDANTIC
// Dereferencing a past-the-end istreambuf_iterator is a
// libstdc++ extension
-   __glibcxx_requires_cond(!_M_at_eof(),
+   int_type __tmp = _M_get();
+   __glibcxx_requires_cond(!traits_type::eq_int_type(__tmp,traits_type::eof()),
_M_message(__gnu_debug::__msg_deref_istreambuf)
._M_iterator(*this));
-#endif
+   return traits_type::to_char_type(__tmp);
+#else
return traits_type::to_char_type(_M_get());
+#endif
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
@@ -172,7 +174,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 #endif
-   _M_c = traits_type::eof();
  }
return *this;
   }
@@ -181,17 +182,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   proxy
   operator++(int)
   {
-_M_get();
-   __glibcxx_requires_cond(_M_sbuf
-   && !traits_type::eq_int_type(_M_c,traits_type::eof()),
+int_type c = _M_get();
+   __glibcxx_requires_cond(!traits_type::eq_int_type(c,traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
-   proxy __old(_M_c, _M_sbuf);
+   proxy __old(c, _M_sbuf);
if (_M_sbuf)
  {
_M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
  }
return __old;
   }
@@ -209,9 +208,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_get() const
   {
const int_type __eof = traits_type::eof();
-   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
-  _M_c = _M_sbuf->sgetc();
-   return _M_c;
+   return _M_sbuf ? _M_sbuf->sgetc() : __eof;
   }
 
   bool
@@ -418,7 +415,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  else
__c = __sb->snextc();
}
- __first._M_c = __c;
}
   return __first;
 }
-- 
2.10.1



[PATCH 4/4] libstdc++: immutable _M_sbuf in istreambuf_iterator

2017-11-15 Thread Petr Ovtchenkov
There is no longer any need for _M_sbuf to be mutable in
istreambuf_iterator.
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 203da9d..e2b6707 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -94,7 +94,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // the "end of stream" iterator value.
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
-  mutable streambuf_type*  _M_sbuf;
+  streambuf_type* _M_sbuf;
 
 public:
   class proxy
-- 
2.10.1



Re: Hurd port for gcc-7 go PATCH 1-3(15)

2017-11-15 Thread Matthias Klose
On 06.11.2017 16:36, Svante Signell wrote:
> Hi,
> 
> Attached are patches to enable gccgo to build properly on Debian
> GNU/Hurd on gcc-7 (7-7.2.0-12).

sysinfo.go:6744:7: error: redefinition of 'SYS_IOCTL'
 const SYS_IOCTL = _SYS_ioctl
   ^
sysinfo.go:6403:7: note: previous definition of 'SYS_IOCTL' was here
 const SYS_IOCTL = 0
   ^
the patches break the build on any Linux architecture.  Please could you test
your patches against a linux target as well?


[PING**2] [PATCH] Add a warning for invalid function casts

2017-11-15 Thread Bernd Edlinger
Ping...

On 11/08/17 17:55, Bernd Edlinger wrote:
> Ping...
> 
> for the C++ part of this patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00559.html
> 
> 
> Thanks
> Bernd.
> 
>> On 10/10/17 00:30, Bernd Edlinger wrote:
>>> On 10/09/17 20:34, Martin Sebor wrote:
>>>> On 10/09/2017 11:50 AM, Bernd Edlinger wrote:
>>>>> On 10/09/17 18:44, Martin Sebor wrote:
>>>>>> On 10/07/2017 10:48 AM, Bernd Edlinger wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I think I have now something useful, it has a few more heuristics
>>>>>>> added, to reduce the number of false-positives so that it
>>>>>>> is able to find real bugs, for instance in openssl it triggers
>>>>>>> at a function cast which has already a TODO on it.
>>>>>>>
>>>>>>> The heuristics are:
>>>>>>> - handle void (*)(void) as a wild-card function type.
>>>>>>> - ignore volatile, const qualifiers on parameters/return.
>>>>>>> - handle any pointers as equivalent.
>>>>>>> - handle integral types, enums, and booleans of same precision
>>>>>>>     and signedness as equivalent.
>>>>>>> - stop parameter validation at the first "...".
>>>>>>
>>>>>> These sound quite reasonable to me.  I have a reservation about
>>>>>> just one of them, and some comments about other aspects of the
>>>>>> warning.  Sorry if this seems like a lot.  I'm hoping you'll
>>>>>> find the feedback constructive.
>>>>>>
>>>>>> I don't think using void(*)(void) to suppress the warning is
>>>>>> a robust solution because it's not safe to call a function that
>>>>>> takes arguments through such a pointer (especially not if one
>>>>>> or more of the arguments is a pointer).  Depending on the ABI,
>>>>>> calling a function that expects arguments with none could also
>>>>>> mess up the stack as the callee may pop arguments that were
>>>>>> never passed to it.
>>>>>>
>>>>>
>>>>> This is of course only a heuristic, and if there is no warning
>>>>> that does not mean any guarantee that there can't be a problem
>>>>> at runtime.  The heuristic is only meant to separate the
>>>>> bad from the very bad type-cast.  In my personal opinion there
>>>>> is not a single good type cast.
>>>>
>>>> I agree.  Since the warning uses one kind of a cast as an escape
>>>> mechanism from the checking it should be one whose result can
>>>> most likely be used to call the function without undefined
>>>> behavior.
>>>>
>>>> Since it's possible to call any function through a pointer to
>>>> a function with no arguments (simply by providing arguments of
>>>> matching types) it's a reasonable candidate.
>>>>
>>>> On the other hand, since it is not safe to call an arbitrary
>>>> function through void (*)(void), it's not as good a candidate.
>>>>
>>>> Another reason why I think a prototype-less function is a good
>>>> choice is because the alias and ifunc attributes already use it
>>>> as an escape mechanism from their type incompatibility warning.
>>>>
>>>
>>> I know of pre-existing code-bases where a type-cast to type:
>>> void (*) (void);
>>>
>>> .. is already used as a generic function pointer: libffi and
>>> libgo, I would not want to break these.
>>>
>>> Actually when I have a type:
>>> X (*) (...);
>>>
>>> I would like to make sure that the warning checks that
>>> only functions returning X are assigned.
>>>
>>> and for X (*) (Y, );
>>>
>>> I would like to check that anything returning X with
>>> first argument of type Y is assigned.
>>>
>>> There are code bases where such a scheme is used.
>>> For instance one that I myself maintain: the OPC/UA AnsiC Stack,
>>> where I have this type definition:
>>>
>>> typedef OpcUa_StatusCode (OpcUa_PfnInvokeService)(OpcUa_Endpoint hEndpoint, ...);
>>>
>>> And this plays well together with this warning, because only
>>> functions are assigned that match up to the ...);
>>> Afterwards this pointer is cast back to the original signature,
>>> so everything is perfectly fine.
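
A minimal sketch of the usage pattern under discussion: a function pointer
is stored in a generic void (*)(void) slot and cast back to its real
signature before it is called.  The names generic_fn, add and call_add are
hypothetical, and nothing here claims which casts the proposed warning would
or would not diagnose.

```cpp
// Generic "wildcard" function pointer type, as used for storage only.
typedef void (*generic_fn)(void);

static int add(int a, int b)
{
  return a + b;
}

// Cast the stored pointer back to the real signature before calling.
// Calling through generic_fn directly would be undefined behaviour.
static int call_add(generic_fn f, int a, int b)
{
  int (*real)(int, int) = (int (*)(int, int)) f;
  return real(a, b);
}
```

For example, call_add((generic_fn) add, 2, 3) calls add through the restored
signature and returns 5.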
>>>
>>> Regarding the cast from pointer to member to function, I see also a
>>> warning without -Wpedantic:
>>> warning: converting from »void (S::*)(int*)« to »void (*)(int*)«
>>> [-Wpmf-conversions]
>>>  F *pf = (F*)::foo;
>>>  ^~~
>>>
>>> And this one is even default-enabled, so I think that should be
>>> more than sufficient.
>>>
>>> I also changed the heuristic, so that your example with the enum should
>>> now work.  I did not add it to the test case, because it would
>>> break with -fshort-enums :(
>>>
>>> Attached I have an updated patch that extends this warning to the
>>> pointer-to-member function cast, and relaxes the heuristic on the
>>> benign integral type differences a bit further.
>>>
>>>
>>> Is it OK for trunk after bootstrap and reg-testing?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>


Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 5:59 PM, H.J. Lu  wrote:
> On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak  wrote:
>> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
>>> -mzeroupper is specified to generate vzeroupper instruction.  If it
>>> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
>>> always use -mzeroupper or -mno-zeroupper to override it.
>>>
>>> Sebastian, can you run the full test with it?
>>>
>>> OK for trunk if there is no regression?
>>
>> If we want to go this way, please add relevant tune flag (e.g.
>> X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
>> property of the processor model, not ISA.
>
> How about this?  OK for trunk if there are no regressions?

> gcc/
>
> PR target/82990
> * config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
> TARGET_AVX512ER check.
> (ix86_option_override_internal): Set MASK_VZEROUPPER if
> neither -mzeroupper nor -mno-zeroupper is used and
> TARGET_EMIT_VZEROUPPER is set.
> * config/i386/i386.h (TARGET_EMIT_VZEROUPPER): New.
> * config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.
>
> gcc/testsuite/
>
> PR target/82990
> * gcc.target/i386/pr82942-2.c: Add -mtune=knl.
> * gcc.target/i386/pr82990-1.c: New test.
> * gcc.target/i386/pr82990-2.c: Likewise.
> * gcc.target/i386/pr82990-3.c: Likewise.
> * gcc.target/i386/pr82990-4.c: Likewise.
> * gcc.target/i386/pr82990-5.c: Likewise.
> * gcc.target/i386/pr82990-6.c: Likewise.
> * gcc.target/i386/pr82990-7.c: Likewise.

OK.

Thanks,
Uros.
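
For readers following the thread, the decision described in the ChangeLog
above can be sketched as below.  This is only an illustration: UserChoice,
TuneModel and effective_vzeroupper are hypothetical names, not identifiers
from i386.c, and the real code works on option masks rather than an enum.

```cpp
#include <cassert>

// Hypothetical tri-state: was -mzeroupper / -mno-zeroupper given at all?
enum class UserChoice { Unset, On, Off };

// Hypothetical stand-in for a per-processor tuning flag such as the
// X86_TUNE_EMIT_VZEROUPPER flag proposed in the thread.
struct TuneModel {
  bool emit_vzeroupper;
};

// An explicit option always wins; otherwise the processor model decides.
inline bool effective_vzeroupper(UserChoice user, const TuneModel& tune) {
  switch (user) {
    case UserChoice::On:
      return true;
    case UserChoice::Off:
      return false;
    case UserChoice::Unset:
      break;
  }
  return tune.emit_vzeroupper;
}
```

The point of the split is that the default becomes a property of the tuning
target, while the command-line option stays a plain override.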


Re: [PATCH, rs6000] Correct some Power9 scheduling info

2017-11-15 Thread Pat Haugen
On 09/27/2017 12:56 PM, Pat Haugen wrote:
> The following patch corrects some Power9 resource requirements and
> instruction latencies. Bootstrap/regtest on powerpc64le-linux with no
> new regressions. Ok for trunk?

Updated patch follows. Bootstrap/regtest on powerpc64le-linux (Power9)
with no regressions. Ok for trunk?

-Pat

2017-11-15  Pat Haugen  

* rs6000/power9.md (power9fpdiv): New automaton and cpu_unit defined
for it.
(DU_C2_3_power9): Correct reservation combinations.
(FP_DIV_power9, VEC_DIV_power9): New.
(power9-alu): Split out rotate/shift...
(power9-rot): ...to here, correct dispatch resource.
(power9-cracked-alu, power9-mul, power9-mul-compare): Correct dispatch
resource.
(power9-fp): Correct latency.
(power9-sdiv): Add div/sqrt resource.
(power9-ddiv): Correct latency, add div/sqrt resource.
(power9-sqrt, power9-dsqrt): Add div/sqrt resource.
(power9-vecfdiv, power9-vecdiv): Correct latency, add div/sqrt
resource.
(power9-qpdiv, power9-qpmul): Adjust resource usage.
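
A rough way to picture the new div/sqrt resource: the patch below adds four
"false" units, a scalar div/sqrt holds any one of them for 8 cycles, and a
vector div/sqrt holds one of the pairs (0,1) or (2,3).  The sketch below is a
simplified model under those assumptions; DivUnits and its members are
hypothetical names, and it does not reproduce how the scheduler DFA actually
evaluates reservations.

```cpp
#include <array>
#include <cassert>

// Cycles a div/sqrt keeps its unit(s) busy in this model (the patch uses 8).
constexpr int kBlockCycles = 8;

// Hypothetical model: free_at[i] is the first cycle unit i is available.
struct DivUnits {
  std::array<int, 4> free_at{{0, 0, 0, 0}};

  // Scalar div/sqrt: take whichever single unit frees up first.
  // Returns the cycle at which the insn can actually issue.
  int issue_scalar(int cycle) {
    int best = 0;
    for (int i = 1; i < 4; ++i)
      if (free_at[i] < free_at[best]) best = i;
    int start = cycle > free_at[best] ? cycle : free_at[best];
    free_at[best] = start + kBlockCycles;
    return start;
  }

  // Vector div/sqrt: needs both units of pair (0,1) or of pair (2,3).
  int issue_vector(int cycle) {
    int ready01 = free_at[0] > free_at[1] ? free_at[0] : free_at[1];
    int ready23 = free_at[2] > free_at[3] ? free_at[2] : free_at[3];
    int base = ready01 <= ready23 ? 0 : 2;
    int ready = ready01 <= ready23 ? ready01 : ready23;
    int start = cycle > ready ? cycle : ready;
    free_at[base] = start + kBlockCycles;
    free_at[base + 1] = start + kBlockCycles;
    return start;
  }
};
```

In this model four scalar divides can start back to back, while a fifth has
to wait until one of the units is released; other VSU work is not blocked at
all, which is the intent stated in the patch comment.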


Index: gcc/config/rs6000/power9.md
===
--- gcc/config/rs6000/power9.md	(revision 254708)
+++ gcc/config/rs6000/power9.md	(working copy)
@@ -19,7 +19,7 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-(define_automaton "power9dsp,power9lsu,power9vsu,power9misc")
+(define_automaton "power9dsp,power9lsu,power9vsu,power9fpdiv,power9misc")
 
 (define_cpu_unit "lsu0_power9,lsu1_power9,lsu2_power9,lsu3_power9" "power9lsu")
 (define_cpu_unit "vsu0_power9,vsu1_power9,vsu2_power9,vsu3_power9" "power9vsu")
@@ -28,7 +28,11 @@
 ; Two fixed point divide units, not pipelined
 (define_cpu_unit "fx_div0_power9,fx_div1_power9" "power9misc")
 (define_cpu_unit "bru_power9,cryptu_power9,dfu_power9" "power9misc")
+; Create a false unit for use by non-pipelined FP div/sqrt
+(define_cpu_unit "fp_div0_power9,fp_div1_power9,fp_div2_power9,fp_div3_power9"
+		 "power9fpdiv")
 
+
 (define_cpu_unit "x0_power9,x1_power9,xa0_power9,xa1_power9,
 		  x2_power9,x3_power9,xb0_power9,xb1_power9,
 		  br0_power9,br1_power9" "power9dsp")
@@ -79,8 +83,7 @@
 
 ; 2-way cracked plus 3rd slot
 (define_reservation "DU_C2_3_power9" "x0_power9+x1_power9+xa0_power9|
-  x1_power9+x2_power9+xa0_power9|
-  x1_power9+x2_power9+xb0_power9|
+  x1_power9+x2_power9+xa1_power9|
   x2_power9+x3_power9+xb0_power9")
 
 ; 3-way cracked (consumes whole decode/dispatch cycle)
@@ -108,7 +111,19 @@
 
 (define_reservation "VSU_PRM_power9" "prm0_power9|prm1_power9")
 
+; Define the reservation to be used by FP div/sqrt which allows other insns
+; to be issued to the VSU, but blocks other div/sqrt for a number of cycles.
+; Note that the number of cycles blocked varies depending on insn, but we
+; just use the same number for all in order to keep the number of DFA states
+; reasonable.
+(define_reservation "FP_DIV_power9"
+		"fp_div0_power9*8|fp_div1_power9*8|fp_div2_power9*8|
+		 fp_div3_power9*8")
+(define_reservation "VEC_DIV_power9"
+		"fp_div0_power9*8+fp_div1_power9*8|
+		 fp_div2_power9*8+fp_div3_power9*8")
 
+
 ; LS Unit
 (define_insn_reservation "power9-load" 4
   (and (eq_attr "type" "load")
@@ -243,9 +258,7 @@
 
 ; Most ALU insns are simple 2 cycle, including record form
 (define_insn_reservation "power9-alu" 2
-  (and (ior (eq_attr "type" "add,exts,integer,logical,isel")
-	(and (eq_attr "type" "insert,shift")
-		 (eq_attr "dot" "no")))
+  (and (eq_attr "type" "add,exts,integer,logical,isel")
(eq_attr "cpu" "power9"))
   "DU_any_power9,VSU_power9")
 ; 5 cycle CR latency
@@ -252,12 +265,19 @@
 (define_bypass 5 "power9-alu"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
 
+; Rotate/shift prevent use of third slot
+(define_insn_reservation "power9-rot" 2
+  (and (eq_attr "type" "insert,shift")
+   (eq_attr "dot" "no")
+   (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
 ; Record form rotate/shift are cracked
 (define_insn_reservation "power9-cracked-alu" 2
   (and (eq_attr "type" "insert,shift")
(eq_attr "dot" "yes")
(eq_attr "cpu" "power9"))
-  "DU_C2_power9,VSU_power9")
+  "DU_C2_3_power9,VSU_power9")
 ; 7 cycle CR latency
 (define_bypass 7 "power9-cracked-alu"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
@@ -291,13 +311,13 @@
   (and (eq_attr "type" "mul")
(eq_attr "dot" "no")
(eq_attr "cpu" "power9"))
-  "DU_any_power9,VSU_power9")
+  "DU_slice_3_power9,VSU_power9")
 
 (define_insn_reservation "power9-mul-compare" 5
   (and (eq_attr "type" "mul")
(eq_attr "dot" "yes")
(eq_attr "cpu" "power9"))
-  "DU_C2_power9,VSU_power9")
+  "DU_C2_3_power9,VSU_power9")
 ; 10 cycle CR latency
 (define_bypass 10 "power9-mul-compare"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
@@ -349,7 +369,7 @@

[PATCH] Minor improvements to Filesystem tests

2017-11-15 Thread Jonathan Wakely

Make these tests a little more robust.

* testsuite/27_io/filesystem/iterators/directory_iterator.cc: Leave
error_code unset.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Check for past-the-end before dereferencing.
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.
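
As an illustration of the "check for past-the-end before dereferencing"
pattern these tests now follow, here is a minimal sketch using the C++17
std::filesystem API.  The helper name list_names is hypothetical and is not
part of the testsuite.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Collect entry names below root.  The iterator is compared against the end
// iterator before every dereference, and it is advanced with the error_code
// overload so that a failure to recurse does not throw.
inline std::vector<std::string> list_names(const fs::path& root) {
  std::vector<std::string> names;
  std::error_code ec;
  fs::recursive_directory_iterator it(root, ec);
  const fs::recursive_directory_iterator end;
  while (!ec && it != end) {
    names.push_back(it->path().filename().string());
    it.increment(ec);
  }
  return names;
}
```

If construction or an increment fails, the iterator becomes the end iterator,
so the loop simply stops instead of dereferencing an invalid position.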

commit ff95dc810ac57a0277d62bb122f7912d37a7cfd5
Author: Jonathan Wakely 
Date:   Wed Nov 15 18:10:52 2017 +

Minor improvements to Filesystem tests

* testsuite/27_io/filesystem/iterators/directory_iterator.cc: Leave
error_code unset.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Check for past-the-end before dereferencing.
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
index c3e6f01670a..9cdbd7aafa0 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
@@ -61,7 +61,6 @@ test01()
   ec = bad_ec;
   permissions(p, fs::perms::none, ec);
   VERIFY( !ec );
-  ec = bad_ec;
   iter = fs::directory_iterator(p, ec);
   VERIFY( ec );
   VERIFY( iter == end(iter) );
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc b/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
index 1ef450fc907..d41a1506d3b 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
@@ -87,6 +87,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   iter.increment(ec);  // should fail to recurse into p/d1/d2
   VERIFY( ec );
@@ -99,6 +100,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   ec = bad_ec;
   iter.increment(ec);  // should fail to recurse into p/d1/d2, so skip it
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc b/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
index 50cc7d45de8..584cfeed839 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
@@ -56,6 +56,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   ++iter;
   VERIFY( iter == end(iter) );
@@ -88,6 +89,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   iter.increment(ec);  // should fail to recurse into p/d1/d2
   VERIFY( ec );


Re: [PATCH] PR fortran/78240 -- kludge of the day

2017-11-15 Thread Fritz Reese
On Wed, Nov 15, 2017 at 1:13 PM, Steve Kargl
 wrote:
> On Tue, Nov 14, 2017 at 05:21:41PM -0500, Fritz Reese wrote:
>> On Tue, Nov 14, 2017 at 4:58 PM, Janus Weil  wrote:
>> > Hi guys,
>> >
>> > I see this new test case failing on x86_64-linux-gnu:
>> >
>> > FAIL: gfortran.dg/pr78240.f90   -O  (test for excess errors)
...
>>
>
> I've fixed the problem with this patch.
>
> 2017-11-15  Steven G. Kargl  
>
> PR fortran/78240
> gfortran.dg/pr78240.f90: Prune run-on errors.
>
>
> Index: gcc/testsuite/gfortran.dg/pr78240.f90
> ===
> --- gcc/testsuite/gfortran.dg/pr78240.f90   (revision 254779)
> +++ gcc/testsuite/gfortran.dg/pr78240.f90   (working copy)
> @@ -1,4 +1,5 @@
>  ! { dg-do compile }
> +! { dg-options "-w" }
>  !
>  ! PR fortran/78240
>  !
> @@ -8,5 +9,7 @@
>  !
>
>  program p
> -  integer x(n)/1/   ! { dg-error "Nonconstant array" }
> +  integer x(n)/1/   ! { dg-error "cannot appear in the expression" }
>  end
> +! { dg-prune-output "module or main program" }
> +! { dg-prune-output "Nonconstant array" }
>
> --
> Steve


Thanks! I was planning to commit the very same.

---
Fritz Reese


Re: [PATCH] PR fortran/78240 -- kludge of the day

2017-11-15 Thread Steve Kargl
On Tue, Nov 14, 2017 at 05:21:41PM -0500, Fritz Reese wrote:
> On Tue, Nov 14, 2017 at 4:58 PM, Janus Weil  wrote:
> > Hi guys,
> >
> > I see this new test case failing on x86_64-linux-gnu:
> >
> > FAIL: gfortran.dg/pr78240.f90   -O  (test for excess errors)
> >
> >
> > $ gfortran-8 pr78240.f90
> > pr78240.f90:11:12:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> > 1
> > Error: Variable ‘n’ cannot appear in the expression at (1)
> > pr78240.f90:11:14:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> >   1
> > Error: The module or main program array ‘x’ at (1) must have constant shape
> > pr78240.f90:11:19:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> >1
> > Error: Nonconstant array section at (1) in DATA statement
> > [...]
> 
> ... does anyone know how to tell dejagnu to expect multiple errors on
> a single line?
> 

I've fixed the problem with this patch.

2017-11-15  Steven G. Kargl  

PR fortran/78240
gfortran.dg/pr78240.f90: Prune run-on errors.


Index: gcc/testsuite/gfortran.dg/pr78240.f90
===
--- gcc/testsuite/gfortran.dg/pr78240.f90   (revision 254779)
+++ gcc/testsuite/gfortran.dg/pr78240.f90   (working copy)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-options "-w" }
 !
 ! PR fortran/78240
 !
@@ -8,5 +9,7 @@
 !
 
 program p
-  integer x(n)/1/   ! { dg-error "Nonconstant array" }
+  integer x(n)/1/   ! { dg-error "cannot appear in the expression" }
 end
+! { dg-prune-output "module or main program" }
+! { dg-prune-output "Nonconstant array" }

-- 
Steve


[PATCH] Add noexcept to generic std::size, std::empty and std::data

2017-11-15 Thread Jonathan Wakely

The standard doesn't say these are noexcept, but they can be.

* include/bits/range_access.h (size, empty, data): Add conditional
noexcept to generic overloads.

Tested powerpc64le-linux, committed to trunk.
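
To show what the conditional noexcept buys, here is a small standalone sketch
of the same idiom.  It lives in a hypothetical namespace sketch, and
NoThrowBox and MayThrowBox are made-up containers used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
// Same shape as the patched generic overload: the free function is noexcept
// exactly when the member call it forwards to is noexcept.
template <typename Container>
constexpr auto size(const Container& c) noexcept(noexcept(c.size()))
    -> decltype(c.size()) {
  return c.size();
}
}  // namespace sketch

// Hypothetical container whose size() promises not to throw.
struct NoThrowBox {
  std::size_t size() const noexcept { return 3; }
};

// Hypothetical container whose size() makes no such promise.
struct MayThrowBox {
  std::size_t size() const { return 5; }
};
```

With this form, noexcept(sketch::size(x)) mirrors noexcept(x.size()), so
callers that select code paths on the exception specification see the same
answer through the free function as through the member.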


commit 9348811e74851f9ce6594cbe1b98a855193867dc
Author: Jonathan Wakely 
Date:   Wed Nov 15 17:38:28 2017 +

Add noexcept to generic std::size, std::empty and std::data

* include/bits/range_access.h (size, empty, data): Add conditional
noexcept to generic overloads.

diff --git a/libstdc++-v3/include/bits/range_access.h b/libstdc++-v3/include/bits/range_access.h
index 3987c2addf1..2a037ad8082 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #endif // C++14
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 #define __cpp_lib_nonmember_container_access 201411
 
   /**
@@ -239,7 +239,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
  template <typename _Container>
 constexpr auto
-size(const _Container& __cont) -> decltype(__cont.size())
+size(const _Container& __cont) noexcept(noexcept(__cont.size()))
+-> decltype(__cont.size())
 { return __cont.size(); }
 
   /**
@@ -257,7 +258,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
  template <typename _Container>
 constexpr auto
-empty(const _Container& __cont) -> decltype(__cont.empty())
+empty(const _Container& __cont) noexcept(noexcept(__cont.empty()))
+-> decltype(__cont.empty())
 { return __cont.empty(); }
 
   /**
@@ -284,7 +286,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
  template <typename _Container>
 constexpr auto
-data(_Container& __cont) -> decltype(__cont.data())
+data(_Container& __cont) noexcept(noexcept(__cont.data()))
+-> decltype(__cont.data())
 { return __cont.data(); }
 
   /**
@@ -293,7 +296,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
  template <typename _Container>
 constexpr auto
-data(const _Container& __cont) -> decltype(__cont.data())
+data(const _Container& __cont) noexcept(noexcept(__cont.data()))
+-> decltype(__cont.data())
 { return __cont.data(); }
 
   /**


Re: [PATCH] make canonicalize_condition keep its promise

2017-11-15 Thread Peter Bergner
On 11/15/17 9:40 AM, Aaron Sawdey wrote:
> Index: gcc/rtlanal.c
> ===
> --- gcc/rtlanal.c   (revision 254553)
> +++ gcc/rtlanal.c   (working copy)
> @@ -5623,7 +5623,11 @@
>if (CC0_P (op0))
>  return 0;
>  
> -  return gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
> +  /* We promised to return a comparison.  */
> +  rtx ret = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
> +  if (COMPARISON_P (ret))
> +return ret;
> +  return 0;

I have no input on whether this approach is correct or not, but...
I know the return above this returns 0 as do other locations in
the file, but new code should return NULL_RTX.

Peter
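
For illustration, the shape of the check under discussion (build the
expression, return it only if it really is a comparison, otherwise return a
null value) could be modelled as below.  Code, Expr, is_comparison and
make_condition are hypothetical stand-ins, not GCC's rtx machinery; the null
pointer plays the role of NULL_RTX.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few rtx codes.
enum class Code { EQ, NE, LT, GE, PLUS, UNKNOWN };

// Hypothetical two-operand expression.
struct Expr {
  Code code;
  int op0;
  int op1;
};

inline bool is_comparison(Code c) {
  return c == Code::EQ || c == Code::NE || c == Code::LT || c == Code::GE;
}

// Build the expression in caller-provided storage, but hand it back only if
// it is a comparison; otherwise return a null pointer.
inline const Expr* make_condition(Expr& storage, Code code, int op0, int op1) {
  storage = Expr{code, op0, op1};
  return is_comparison(storage.code) ? &storage : nullptr;
}
```

Returning a named null value rather than a literal 0 keeps the intent obvious
at the call site, which is the style point raised above.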



Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:25 -0500, David Malcolm wrote:
> On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:
> > On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> > > g++.dg/lambda/lambda-switch.C has recently regressed.
> > 
> > g++.dg/cpp0x/lambda/lambda-switch.C
> > 
> > > It appears the 
> > > location of a warning message has moved.
> > > 
> > > l = []()  // { dg-warning "statement will never be executed" }
> > >   {
> > >   case 3: // { dg-error "case" }
> > > break;// { dg-error "break" }
> > >   };  <--- warning now here
> > > 
> > > We seem to be diagnosing the last line of the statement, not the
> > > first. 
> > > That does not seem useful.
> > > 
> > > I've not investigated what patch may have caused this, on the
> > > chance 
> > > someone might already know?
> > > 
> > > nathan
> > 
> > The warning was added in r236597 (aka
> > 1398da0f786e120bb0b407e84f412aa9fc6d80ee):
> > 
> > +2016-05-23  Marek Polacek  
> > +
> > +   PR c/49859
> > +   * common.opt (Wswitch-unreachable): New option.
> > +   * doc/invoke.texi: Document -Wswitch-unreachable.
> > +   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-unreachable
> > +   warning.
> > which had it there (23:7).
> > 
> > r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
> > warning from 23:7 up to the "[] ()" at 19:6 in:
> > 
> > +2017-01-20  Marek Polacek  
> > +
> > +   PR c/64279
> > [...snip...]
> > +   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.
> > 
> > I tried it with some working copies I have to hand:
> > - works for me with r254387 (2017-11-03)
> > - fails for me with r254700 (2017-11-13)
> > 
> > so hopefully that helps track it down.
> > 
> > Dave
> 
> Searching in the November archives of the gcc-regression ML for
> "lambda-switch.c":
> 
> https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd=extended=all=D
> =lambda-switch.c=%2Fml%2Fgcc-regression%2F2017-11%2F%25
> 
> showed e.g.:
>   https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
>"Regressions on trunk at revision 254648 vs revision 254623"
> 
> which says this is a new failure somewhere in that range; so it
> presumably happened sometime on 2017-11-10 after r254623 and up to
> (maybe ==) r254648.
> 
> Looking at:
>svn log -r r254623:r254648 |less
> nothing jumps out at me as being related.
> 
> Hope this is helpful
> Dave

Actually, https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00157.html
has a tighter range: r254628 vs r254635.

Looking at:
  svn log -r r254628:r254635 |less
I see msebor's r254630 ("PR c/81117 - Improve buffer overflow checking
in strncpy") has:

* gimple.c (gimple_build_call_from_tree): Set call location.

with:
+  gimple_set_location (call, EXPR_LOCATION (t));

Maybe that's it?  (nothing else in that commit range seems to affect
locations).

Dave
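
The narrowing done by hand above (one known-good revision, one known-bad
revision, then shrinking the window) is ordinary bisection.  A minimal sketch
follows, assuming the failure persists once introduced;
first_failing_revision and the fails predicate are hypothetical names, and in
practice the predicate would build and test the given revision.

```cpp
#include <cassert>
#include <functional>

// Return the first revision in (good, bad] for which fails(rev) is true.
// Preconditions assumed by this sketch: fails(good) is false, fails(bad) is
// true, and every revision after the first failing one also fails.
inline int first_failing_revision(int good, int bad,
                                  const std::function<bool(int)>& fails) {
  while (bad - good > 1) {
    int mid = good + (bad - good) / 2;
    if (fails(mid))
      bad = mid;   // failure is at mid or earlier
    else
      good = mid;  // failure was introduced after mid
  }
  return bad;
}
```

Each probe halves the window, so even the r254387..r254700 range mentioned
earlier in the thread needs fewer than ten builds.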


Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-15 Thread Wilco Dijkstra
Sandra Loosemore wrote:

> I'd prefer that you remove the reference to configure options entirely 
> here.  Nowadays most GCC users install a package provided by their OS 
> distribution, Linaro, etc, rather than trying to build GCC from scratch.

OK, I've removed that reference. Similarly the FRAME_POINTER_REQUIRED
bit as that statement is not only irrelevant but also completely incorrect.

> > +Enabled at levels @option{-O}, @option{-O1}, @option{-O2}, @option{-O3},
> > +@option{-Os} and @option{-Og}.
>
> This last sentence makes no sense.  If the option is now enabled by 
> default, then the optimization level is irrelevant.

It's enabled from -O onwards, so I've changed it to the standard form used
elsewhere and updated the table for -O:

+Enabled by default at @option{-O} and higher.

Here is the cleaned up and simplified version:


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2ef88e081f982f5619132cc33ce23c3fb542ae11..158c9ae3f1297a1265fc974cd3e6825d8f5be096 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7258,6 +7258,7 @@ compilation time.
 -fipa-reference @gol
 -fmerge-constants @gol
 -fmove-loop-invariants @gol
+-fomit-frame-pointer @gol
 -freorder-blocks @gol
 -fshrink-wrap @gol
 -fshrink-wrap-separate @gol
@@ -7282,9 +7283,6 @@ compilation time.
 -ftree-ter @gol
 -funit-at-a-time}
 
-@option{-O} also turns on @option{-fomit-frame-pointer} on machines
-where doing so does not interfere with debugging.
-
 @item -O2
 @opindex O2
 Optimize even more.  GCC performs nearly all supported optimizations
@@ -7436,29 +7434,18 @@ The default is @option{-ffp-contract=fast}.
 
 @item -fomit-frame-pointer
 @opindex fomit-frame-pointer
-Don't keep the frame pointer in a register for functions that
-don't need one.  This avoids the instructions to save, set up and
-restore frame pointers; it also makes an extra register available
-in many functions.  @strong{It also makes debugging impossible on
-some machines.}
-
-On some machines, such as the VAX, this flag has no effect, because
-the standard calling sequence automatically handles the frame pointer
-and nothing is saved by pretending it doesn't exist.  The
-machine-description macro @code{FRAME_POINTER_REQUIRED} controls
-whether a target machine supports this flag.  @xref{Registers,,Register
-Usage, gccint, GNU Compiler Collection (GCC) Internals}.
-
-The default setting (when not optimizing for
-size) for 32-bit GNU/Linux x86 and 32-bit Darwin x86 targets is
-@option{-fomit-frame-pointer}.  You can configure GCC with the
-@option{--enable-frame-pointer} configure option to change the default.
-
-Note that @option{-fno-omit-frame-pointer} doesn't force a new stack
-frame for all functions if it isn't otherwise needed, and hence doesn't
-guarantee a new frame pointer for all functions.
+Omit the frame pointer in functions that don't need one.  This avoids the
+instructions to save, set up and restore the frame pointer; on many targets
+it also makes an extra register available.
 
-Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
+On some targets this flag has no effect because the standard calling sequence
+always uses a frame pointer, so it cannot be omitted.
+
+Note that @option{-fno-omit-frame-pointer} doesn't guarantee the frame pointer
+is used in all functions.  Several targets always omit the frame pointer in
+leaf functions.
+
+Enabled by default at @option{-O} and higher.
 
 @item -foptimize-sibling-calls
 @opindex foptimize-sibling-calls
@@ -16753,9 +16740,7 @@ Certain other options, such as @option{-mid-shared-library} and
 @opindex momit-leaf-frame-pointer
 Don't keep the frame pointer in a register for leaf functions.  This
 avoids the instructions to save, set up and restore frame pointers and
-makes an extra register available in leaf functions.  The option
-@option{-fomit-frame-pointer} removes the frame pointer for all functions,
-which might make debugging harder.
+makes an extra register available in leaf functions.
 
 @item -mspecld-anomaly
 @opindex mspecld-anomaly


Re: lambda-switch regression

2017-11-15 Thread Martin Sebor

On 11/15/2017 06:03 AM, Nathan Sidwell wrote:

g++.dg/lambda/lambda-switch.C has recently regressed.  It appears the
location of a warning message has moved.

  l = []()// { dg-warning "statement will never be executed" }
{
case 3:// { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the first.
That does not seem useful.

I've not investigated what patch may have caused this, on the chance
someone might already know?


Bug 82988 points to my r254630 as the commit that triggered it.
I haven't yet looked into it.  There is some small chance that it
was caused by bug 82977 that Jakub just fixed.

Martin



Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:
> On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> > g++.dg/lambda/lambda-switch.C has recently regressed.
> 
> g++.dg/cpp0x/lambda/lambda-switch.C
> 
> > It appears the 
> > location of a warning message has moved.
> > 
> >   l = []()  // { dg-warning "statement will never be executed" }
> > {
> > case 3: // { dg-error "case" }
> >   break;// { dg-error "break" }
> > };  <--- warning now here
> > 
> > We seem to be diagnosing the last line of the statement, not the
> > first. 
> > That does not seem useful.
> > 
> > I've not investigated what patch may have caused this, on the
> > chance 
> > someone might already know?
> > 
> > nathan
> 
> The warning was added in r236597 (aka
> 1398da0f786e120bb0b407e84f412aa9fc6d80ee):
> 
> +2016-05-23  Marek Polacek  
> +
> +   PR c/49859
> +   * common.opt (Wswitch-unreachable): New option.
> +   * doc/invoke.texi: Document -Wswitch-unreachable.
> +   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-unreachable
> +   warning.
> 
> which had it there (23:7).
> 
> r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
> warning from 23:7 up to the "[] ()" at 19:6 in:
> 
> +2017-01-20  Marek Polacek  
> +
> +   PR c/64279
> [...snip...]
> +   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.
> 
> I tried it with some working copies I have to hand:
> - works for me with r254387 (2017-11-03)
> - fails for me with r254700 (2017-11-13)
> 
> so hopefully that helps track it down.
> 
> Dave

Searching in the November archives of the gcc-regression ML for
"lambda-switch.c":

https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd=extended=all=D=lambda-switch.c=%2Fml%2Fgcc-regression%2F2017-11%2F%25

showed e.g.:
  https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
   "Regressions on trunk at revision 254648 vs revision 254623"

which says this is a new failure somewhere in that range; so it
presumably happened sometime on 2017-11-10 after r254623 and up to
(maybe ==) r254648.

Looking at:
   svn log -r r254623:r254648 |less
nothing jumps out at me as being related.

Hope this is helpful
Dave


Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-15 Thread Segher Boessenkool
Hi!

On Tue, Nov 14, 2017 at 02:24:13PM -0600, Bill Schmidt wrote:
> +  for (i = 0; i < 16; ++i)
> + perm[i] = GEN_INT (reorder[i]);
> +
> +  pcv = force_reg (V16QImode,
> +   gen_rtx_CONST_VECTOR (V16QImode,
> +  gen_rtvec_v (16, perm)));
> +  emit_insn (gen_altivec_vperm_v8hi_direct (operands[0], subreg2,
> + subreg2, pcv));
> +  DONE;

Many whitespace problems on these lines, please fix.  More times later.

> +   (match_operand:V16QI 1 "vsx_register_operand" "wa")
> +   (parallel [(const_int 15) (const_int 14)
> +  (const_int 13) (const_int 12)
> +  (const_int 11) (const_int 10)
> +  (const_int  9) (const_int  8)
> +  (const_int  7) (const_int  6)
> +  (const_int  5) (const_int  4)
> +  (const_int  3) (const_int  2)
> +  (const_int  1) (const_int  0)])))]

Here, too.

The rest looks fine.  Thanks!


Segher
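
For readers unfamiliar with the quoted hunks: they build a 16-byte permute
control vector from a reorder table, and the second one selects bytes in the
order 15 down to 0, i.e. a full byte reversal.  The sketch below models only
that idea.  Bytes16, permute and reverse_control are hypothetical names, and
the endian-specific element numbering of the real instruction is deliberately
ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Simplified byte permute where both inputs are the same vector:
// result[i] = src[control[i] mod 16].
inline Bytes16 permute(const Bytes16& src, const Bytes16& control) {
  Bytes16 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = src[control[i] % 16];
  return out;
}

// Control vector 15, 14, ..., 0, matching the const_int list quoted above.
inline Bytes16 reverse_control() {
  Bytes16 c{};
  for (std::size_t i = 0; i < c.size(); ++i)
    c[i] = static_cast<std::uint8_t>(15 - i);
  return c;
}
```

Applying the reversing control twice gives back the original vector, which is
a quick sanity check for any such reorder table.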


Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Sandra Loosemore

On 11/15/2017 10:00 AM, Tamar Christina wrote:


On 11/15/2017 04:51 AM, Tamar Christina wrote:

Hi All,

This patch updates the documentation for AArch64 and ARM correcting
the use of the architecture namings by adding the -A suffix in appropriate
places.

Just to clarify, was the documentation previously using incorrect terminology,
or are there new non-A ARMv7 and ARMv8 architectures that invalidate
existing uses of those terms without the -A suffix?


Yes, there are the -M and -R suffixes/profiles. A lot of the documentation was
written before these existed. It is mainly a find and replace, but I tried to
determine for each change whether the instructions exist in the other profiles.
Hopefully they're all correct, but I'll leave that for the review.


OK.  I have no objection to the patch from a documentation point of 
view, but I'll defer to the port maintainers for technical review.


-Sandra


[PATCH, GCC/ARM] Do not clobber r4 in Armv8-M nonsecure call

2017-11-15 Thread Thomas Preudhomme

Hi,

Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite
the libcall they perform not writing to r4.  Furthermore, the
requirement for the branch target address to be in r4 as expected by
the libcall is modeled in a convoluted way in the define_insn patterns:
the address is a register match_operand constrained by the match_dup
for the clobber which is guaranteed to be r4 due to the expander.

This patch simplifies all this by simply requiring the address to be in
r4 and removing the clobbers. Expanders are left alone because
cmse_nonsecure_call_clear_caller_saved relies on branch target memory
attributes which would be lost if expanding to reg:SI R4_REGNUM.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.md (R4_REGNUM): Define constant.
(nonsecure_call_internal): Remove r4 clobber.
(nonsecure_call_value_internal): Likewise.
* config/arm/thumb1.md (nonsecure_call_reg_thumb1_v5): Remove second
clobber and resequence match_operands.
(nonsecure_call_value_reg_thumb1_v5): Likewise.
* config/arm/thumb2.md (nonsecure_call_reg_thumb2): Likewise.
(nonsecure_call_value_reg_thumb2): Likewise.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ddb9d8f359007c1d86d497aef0ff5fc0e4061813..6b0794ede9fbc5a4f41e1f4a92acb9b649a277bc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -30,6 +30,7 @@
 (define_constants
   [(R0_REGNUM 0)	; First CORE register
(R1_REGNUM	  1)	; Second CORE register
+   (R4_REGNUM	  4)	; Fifth CORE register
(IP_REGNUM	 12)	; Scratch register
(SP_REGNUM	 13)	; Stack pointer
(LR_REGNUM14)	; Return address register
@@ -8118,14 +8119,13 @@
 			   UNSPEC_NONSECURE_MEM)
 		(match_operand 1 "general_operand" ""))
 	  (use (match_operand 2 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[0] = replace_equiv_address (operands[0], tmp);
@@ -8210,14 +8210,13 @@
 UNSPEC_NONSECURE_MEM)
 			 (match_operand 2 "general_operand" "")))
 	  (use (match_operand 3 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[1], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[1] = replace_equiv_address (operands[1], tmp);
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 5d196a673355a7acf7d0ed30f21b997b815913f5..f91659386bf240172bd9a3076722683c8a50dff4 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1732,12 +1732,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb1_v5"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "register_operand" "l*r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse && !SIBLING_CALL_P (insn)"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
@@ -1779,12 +1778,11 @@
 (define_insn "*nonsecure_call_value_reg_thumb1_v5"
   [(set (match_operand 0 "" "")
 	(call (unspec:SI
-	   [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))]
+	   [(mem:SI (reg:SI R4_REGNUM))]
 	   UNSPEC_NONSECURE_MEM)
-	  (match_operand 2 "" "")))
-   (use (match_operand 3 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 1))]
+	  (match_operand 1 "" "")))
+   (use (match_operand 2 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 776d611d2538e790a5f504995050ffdfc51d7193..d56a8bd167575263edc2a4b3f66bda34a4a7a72a 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -555,12 +555,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb2"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB2 && use_cmse"
   

[PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-15 Thread Thomas Preudhomme

Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times, so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.
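
As a rough picture of the refactoring, the sketch below models two call sites
that each compute a set of registers to clear and then defer to one shared
helper.  Everything here is hypothetical: RegFile, RegMask and the two masks
are made up for illustration and do not reflect which registers arm.c
actually clears, nor how it emits the clearing instructions.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumRegs = 16;  // hypothetical register file size
using RegFile = std::array<std::uint32_t, kNumRegs>;
using RegMask = std::bitset<kNumRegs>;

// Shared helper: overwrite every register selected in to_clear.
inline void clear_registers(RegFile& regs, const RegMask& to_clear,
                            std::uint32_t clearing_value) {
  for (std::size_t i = 0; i < kNumRegs; ++i)
    if (to_clear.test(i)) regs[i] = clearing_value;
}

// Hypothetical mask for the call case: registers 0..12.
inline RegMask call_mask() { return RegMask(0x1FFF); }

// Hypothetical mask for the return case: registers 0..3 and 12.
inline RegMask return_mask() { return RegMask(0x100F); }

// The two former copies now differ only in the mask they pass down.
inline void clear_before_call(RegFile& regs, const RegMask& keep) {
  clear_registers(regs, call_mask() & ~keep, 0);
}

inline void clear_before_return(RegFile& regs, const RegMask& keep) {
  clear_registers(regs, return_mask() & ~keep, 0);
}
```

With the loop in one place, a fix to the clearing sequence automatically
applies to both paths, which is the maintenance argument made above.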

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.c (cmse_clear_registers): New function.
(cmse_nonsecure_call_clear_caller_saved): Replace register clearing
code by call to cmse_clear_registers.
(cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
to vmov instructions now generated.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
   return not_to_clear_mask;
 }
 
+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+		  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+
+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+{
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+}
+  clearing_regno = REGNO (clearing_reg);
+
+  /* Clear padding bits.  */
+  gcc_assert (padding_bits_len <= NUM_ARG_REGS);
+  for (i = 0, regno = R0_REGNUM; i < padding_bits_len; i++, regno++)
+{
+  uint64_t mask;
+  rtx rtx16, dest, cleared_reg = gen_rtx_REG (SImode, regno);
+
+  if (padding_bits_to_clear[i] == 0)
+	continue;
+
+  /* If this is a Thumb-1 target and SCRATCH_REG is not a low register, use
+	 CLEARING_REG as scratch.  */
+  if (TARGET_THUMB1
+	  && REGNO (scratch_reg) > LAST_LO_REGNUM)
+	{
+	  /* clearing_reg is not to be cleared, copy its value into scratch_reg
+	 such that we can use clearing_reg to clear the unused bits in the
+	 arguments.  */
+	  if ((clearing_regno > maxregno
+	   || !bitmap_bit_p (to_clear_bitmap, clearing_regno))
+	  && !saved_clearing)
+	{
+	  gcc_assert (clearing_regno <= LAST_LO_REGNUM);
+	  emit_move_insn (scratch_reg, clearing_reg);
+	  saved_clearing = true;
+	  saved_clearing_reg = scratch_reg;
+	}
+	  scratch_reg = clearing_reg;
+	}
+
+  /* Fill the lower half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) & 0xFFFF;
+  emit_move_insn (scratch_reg, gen_int_mode (mask, SImode));
+
+  /* Fill the top half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) >> 16;
+  rtx16 = gen_int_mode (16, SImode);
+  dest = gen_rtx_ZERO_EXTRACT (SImode, scratch_reg, rtx16, rtx16);
+  if (mask)
+	emit_insn (gen_rtx_SET (dest, gen_int_mode (mask, SImode)));
+
+  emit_insn (gen_andsi3 (cleared_reg, cleared_reg, scratch_reg));
+}
+  if (saved_clearing)
+emit_move_insn (clearing_reg, saved_clearing_reg);
+
+
+  /* Clear full registers.  */
+
+  /* If not marked for clearing, clearing_reg already does not 

Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Jonathan Wakely

On 15/11/17 10:04 -0700, Martin Sebor wrote:

On 11/15/2017 09:38 AM, Jonathan Wakely wrote:

On 15/11/17 09:30 -0700, Martin Sebor wrote:

On 11/15/2017 05:45 AM, Martin Liška wrote:

On 11/06/2017 07:29 PM, Martin Sebor wrote:

Sorry for being late with my comment.  I just spotted this minor
formatting issue.  Even though GCC isn't (yet) consistent about
it the keyword "constexpr" should be quoted in the error message
below (and, eventually, in all diagnostic messages).  Since the
patch has been committed by now this is just a reminder for us
to try to keep this in mind in the future.


Hi.

I've prepared a patch for that. If desired, I can fix the test-suite in a
follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"


If GCC had support for italics for defined terms of the language
or the grammar /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of two word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.


Did you check the latest draft? That should have been fixed.

Defined terms should only be italicized when introduced, not when
used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
constructor" are italicized, but are in normal font elsewhere. When
referring specifically to the keyword `constexpr` it should be in code
font.

Grammar productions are always italicized, but "constexpr function" is
not a grammar production.


Right, /constexpr function/ is a defined term (as is /constexpr
constructor/ and /constexpr if/).  As you say, its defining
occurrence is italicized in the text, and the rest aren't.
In contrast, in terms like "constexpr specifier," "constexpr"
is the keyword and it's always in monospace.

The challenge in GCC as I see it is to know how to decide which
of the two it is.  The difference between constexpr the keyword
and constexpr as part of a defined term is too subtle for most
people who don't work with the standard for a living.  So we end
up with these minor inconsistencies in the diagnostics.  I think
the easiest way to achieve consistency (in diagnostics) is to
always quote keywords.


Agreed. GCC also doesn't need to distinguish between the definition
and use of a standard term, it is always using the term, so can always
format it the same way.


Having italics would be a nice touch but
it would probably not improve consistency.



PS I was looking at the February 2017 draft.  The October version
looks quite a bit better.


These were the relevant fixes:
https://github.com/cplusplus/draft/issues/559
https://github.com/cplusplus/draft/issues/825
https://github.com/cplusplus/draft/pull/1153
https://github.com/cplusplus/draft/pull/1484

We've been trying to be more consistent about these kinds of formatting
issues in the standard, as it was a bit of a mess.



[PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-15 Thread Thomas Preudhomme

Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return has been
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.
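
To make the shape of the change easier to follow, here is a toy C sketch of
selecting registers with a bitmap rather than a single integer mask. It
assumes nothing about GCC's real sbitmap implementation: the toy_* names,
the register numbers (4 argument registers, VFP registers at 16..31) and the
fixed 128-bit capacity are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_BITMAP_WORDS 4	/* 4 * 32 = 128 bits of capacity.  */
#define TOY_NUM_ARG_REGS 4	/* Hypothetical: r0-r3.  */
#define TOY_FIRST_VFP 16	/* Hypothetical VFP register range.  */
#define TOY_LAST_VFP 31

struct toy_bitmap
{
  unsigned n_bits;
  uint32_t words[TOY_BITMAP_WORDS];
};

static void
toy_bitmap_init (struct toy_bitmap *b, unsigned n_bits)
{
  assert (n_bits <= TOY_BITMAP_WORDS * 32);
  b->n_bits = n_bits;
  memset (b->words, 0, sizeof b->words);
}

static void
toy_bitmap_set_range (struct toy_bitmap *b, unsigned start, unsigned count)
{
  for (unsigned i = start; i < start + count; i++)
    {
      assert (i < b->n_bits);
      b->words[i / 32] |= UINT32_C (1) << (i % 32);
    }
}

static void
toy_bitmap_clear_bit (struct toy_bitmap *b, unsigned bit)
{
  assert (bit < b->n_bits);
  b->words[bit / 32] &= ~(UINT32_C (1) << (bit % 32));
}

static bool
toy_bitmap_bit_p (const struct toy_bitmap *b, unsigned bit)
{
  assert (bit < b->n_bits);
  return (b->words[bit / 32] >> (bit % 32)) & 1;
}

/* Mark the argument and VFP registers for clearing, then exempt the
   register holding the call address when it is an argument register.  */
static void
toy_regs_to_clear (struct toy_bitmap *b, unsigned address_regnum)
{
  toy_bitmap_init (b, TOY_LAST_VFP + 1);
  toy_bitmap_set_range (b, 0, TOY_NUM_ARG_REGS);
  toy_bitmap_set_range (b, TOY_FIRST_VFP, TOY_LAST_VFP - TOY_FIRST_VFP + 1);
  if (address_regnum < TOY_NUM_ARG_REGS)
    toy_bitmap_clear_bit (b, address_regnum);
}
```

The point of the sketch is only that every access is bounds-checked against
the bitmap size, which a bare shift on a 64-bit integer does not give you.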

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
auto_sbitmap instead of integer bitfield to control register needing
clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9919f54242d9317125a104f9777d76a85de80e9b..7384b96fea0179334a6010b099df68c8e2a0fc32 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16990,10 +16990,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17024,18 +17025,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17043,7 +17047,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17064,6 +17070,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17076,10 +17083,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17138,7 +17153,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17147,7 +17162,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  && VFP_REGNO_OK_FOR_DOUBLE (regno)
-		  && to_clear_mask & (1LL << (regno + 1)))
+		  && bitmap_bit_p (to_clear_bitmap, (regno + 1)))
 		emit_move_insn (gen_rtx_REG (DFmode, regno++),
 CONST0_RTX (DFmode));
 		  else
@@ -17161,7 +17176,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  seq = get_insns ();
 	  end_sequence ();
 	  emit_insn_before (seq, insn);
-
 	}
 }
 }
@@ -25188,7 +25202,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if 

Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> g++.dg/lambda/lambda-switch.C Has recently regressed.  

g++.dg/cpp0x/lambda/lambda-switch.C

> It appears the 
> location of a warning message has moved.
> 
> l = []()  // { dg-warning "statement will never
> be executed" }
>   {
>   case 3: // { dg-error "case" }
> break;// { dg-error "break" }
>   };  <--- warning now here
> 
> We seem to be diagnosing the last line of the statement, not the
> first. 
> That does not seem useful.
> 
> I've not investigated what patch may have caused this, on the chance 
> someone might already know?
> 
> nathan

The warning was added in r236597 (aka
1398da0f786e120bb0b407e84f412aa9fc6d80ee):

+2016-05-23  Marek Polacek  
+
+   PR c/49859
+   * common.opt (Wswitch-unreachable): New option.
+   * doc/invoke.texi: Document -Wswitch-unreachable.
+   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-unreachable
+   warning.

which had it there (23:7).

r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
warning from 23:7 up to the "[] ()" at 19:6 in:

+2017-01-20  Marek Polacek  
+
+   PR c/64279
[...snip...]
+   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.

I tried it with some working copies I have to hand:
- works for me with r254387 (2017-11-03)
- fails for me with r254700 (2017-11-13)

so hopefully that helps track it down.

Dave


[PATCH, GCC/testsuite/ARM] Rework expectation for call to Armv8-M nonsecure function

2017-11-15 Thread Thomas Preudhomme

Hi,

Testcase gcc.target/arm/cmse/cmse-14.c checks whether bar is called via
__gnu_cmse_nonsecure_call libcall and not via a direct call. However the
pattern is a bit surprising in that it needs to explicitly allow "by"
due to allowing anything before the 'b'.

This patch rewrites the logic to look for b as a first non-whitespace
letter followed by anything (to match bl and conditional branches)
followed by some spaces and then bar.
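
For anyone wanting to try the new pattern outside DejaGnu, here is a small C
sketch using POSIX regcomp/regexec. It is only an approximation: the
testsuite uses Tcl regular expressions applied to the whole assembler
output, whereas this hypothetical helper (line_branches_to_bar) checks one
line at a time and spells \s as [[:space:]].

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE looks like a branch to bar according to a POSIX ERE
   translation of the new scan-assembler-not pattern, 0 otherwise.  */
static int
line_branches_to_bar (const char *line)
{
  static const char pattern[]
    = "^(.*[[:space:]])?bl?[^[:space:]]*[[:space:]]+bar";
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  int matched;

  assert (rc == 0);
  matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With this translation, "\tbl\tbar" and "\tbne\tbar" are reported as branches
to bar, while "\tbl\t__gnu_cmse_nonsecure_call" is not.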

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-01  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse-14.c: Change logic to match branch
instruction to bar.

Testing: Test still passes for both Armv8-M Baseline and Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
index 701e9ee7e318a07278099548f9b7042a1fde1204..df1ea52bec533c36a738d7d3b2b2ff749b0f3713 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
@@ -10,4 +10,4 @@ int foo (void)
 }
 
 /* { dg-final { scan-assembler "bl\t__gnu_cmse_nonsecure_call" } } */
-/* { dg-final { scan-assembler-not "b\[^ y\n\]*\\s+bar" } } */
+/* { dg-final { scan-assembler-not "^(.*\\s)?bl?\[^\\s]*\\s+bar" } } */


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Sebor

On 11/15/2017 09:38 AM, Jonathan Wakely wrote:

On 15/11/17 09:30 -0700, Martin Sebor wrote:

On 11/15/2017 05:45 AM, Martin Liška wrote:

On 11/06/2017 07:29 PM, Martin Sebor wrote:

Sorry for being late with my comment.  I just spotted this minor
formatting issue.  Even though GCC isn't (yet) consistent about
it the keyword "constexpr" should be quoted in the error message
below (and, eventually, in all diagnostic messages).  Since the
patch has been committed by now this is just a reminder for us
to try to keep this in mind in the future.


Hi.

I've prepared a patch for that. If desired, I can fix the test-suite in a
follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"


If GCC had support for italics for defined terms of the language
or the grammar /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of two word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.


Did you check the latest draft? That should have been fixed.

Defined terms should only be italicized when introduced, not when
used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
constructor" are italicized, but are in normal font elsewhere. When
referring specifically to the keyword `constexpr` it should be in code
font.

Grammar productions are always italicized, but "constexpr function" is
not a grammar production.


Right, /constexpr function/ is a defined term (as is /constexpr
constructor/ and /constexpr if/).  As you say, its defining
occurrence is italicized in the text, and the rest aren't.
In contrast, in terms like "constexpr specifier," "constexpr"
is the keyword and it's always in monospace.

The challenge in GCC as I see it is to know how to decide which
of the two it is.  The difference between constexpr the keyword
and constexpr as part of a defined term is too subtle for most
people who don't work with the standard for a living.  So we end
up with these minor inconsistencies in the diagnostics.  I think
the easiest way to achieve consistency (in diagnostics) is to
always quote keywords.  Having italics would be a nice touch but
it would probably not improve consistency.

Martin

PS I was looking at the February 2017 draft.  The October version
looks quite a bit better.



RE: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Tamar Christina
Hi Sandra,

> -Original Message-
> From: Sandra Loosemore [mailto:san...@codesourcery.com]
> Sent: Wednesday, November 15, 2017 16:38
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; James Greenhalgh ;
> Richard Earnshaw ; Marcus Shawcroft
> 
> Subject: Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates
> adding -A extensions.
> 
> On 11/15/2017 04:51 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch updates the documentation for AArch64 and ARM correcting
> > the use of the architecture namings by adding the -A suffix in appropriate
> places.
> 
> Just to clarify, was the documentation previously using incorrect terminology,
> or are there new non-A ARMv7 and ARMv8 architectures that invalidate
> existing uses of those terms without the -A suffix? 

Yes, there are the -M and -R suffixes/profiles. A lot of the documentation was
written before these existed. It is mainly a find and replace, but I tried to
determine for each change whether the instructions exist in the other profiles.
Hopefully they're all correct, but I'll leave that for the review.

> And, are the "appropriate
> places" all currently-unsuffixed uses, or just a subset of incorrect uses?
> 

It turned out I had to change all of them. For AArch64, for instance, we only
have the A profile, which is why all unsuffixed uses changed to -A. For AArch32
the explicitly different stuff already had the correct suffixes, so I changed
the rest to -A as well.

Tamar.

> The actual patch looks like search-and-replace to me and I have no objection
> to it, but I'd like to understand the rationale so that I can try to remember
> what the conventions are for future patch review
> 
> -Sandra


[PATCH, GCC/testsuite/ARM] Fix selection of effective target for cmse tests

2017-11-15 Thread Thomas Preudhomme

Hi,

Some of the tests in the gcc.target/arm/cmse directory (eg.
gcc.target/arm/cmse/mainline/bitfield-4.c) are failing when run without
an architecture specified in RUNTESTFLAGS due to them not adding the
option to select an Armv8-M architecture.

This patch fixes the issue by adding the right option from the exp file
so that no architecture fiddling is necessary in the individual tests.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: Add option to select Armv8-M Baseline
or Armv8-M Mainline when running the respective tests.
* gcc.target/arm/cmse/baseline/cmse-11.c: Remove architecture check and
selection.
* gcc.target/arm/cmse/baseline/cmse-13.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-2.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-6.c: Likewise.
* gcc.target/arm/cmse/baseline/softfp.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows
no regression. Running it for a toolchain defaulting to Armv8-M Baseline
but with RUNTESTFLAGS unset sees some FAIL->PASS.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
index 795544fe11d9d7f24086be16916a5bfee89d7b44..230b255963f56a6c29b91d2501b43fed6eda2476 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (int);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
index 7208a2cedd2f4f8296b2801d6f5e5d7838b26551..7ab3219e860e993e2eca3bbee2e885f59b7b3cb4 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" } */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 #include "../cmse-13.x"
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
index fec7dc10484b14db5796f5f431a9306c3b2e307c..d5115ecf2bdb3e87dc6a92244cb204e753f25b07 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 extern float bar (void);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
index 43d45e7a63e56edfebc203c8f0e516dc13fbbd65..cae4f343621d1a19a8893ea4950d33e5e1842fb5 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (double);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
index ca76e12cd9287fd12b7eb7add638973f5d314939..3d383ff6ee17677120e3e1e81726785c30f3b25c 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
+++ 

Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread H.J. Lu
On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak  wrote:
> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
>> -mzeroupper is specified to generate vzeroupper instruction.  If it
>> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
>> always use -mzeroupper or -mno-zeroupper to override it.
>>
>> Sebastian, can you run the full test with it?
>>
>> OK for trunk if there is no regression?
>
> If we want to go this way, please add relevant tune flag (e.g.
> X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
> property of the processor model, not ISA.

How about this?  OK for trunk if there are no regressions?


-- 
H.J.
From d9388c1b7f36e2310645aed4a4debefa65b5129e Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 14 Nov 2017 20:49:33 -0800
Subject: [PATCH] i386: Add X86_TUNE_EMIT_VZEROUPPER

Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should
be inserted before a transfer of control flow out of the function.  It is
turned on by default unless we are tuning for KNL.  Users can always use
-mzeroupper or -mno-zeroupper to override X86_TUNE_EMIT_VZEROUPPER.
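
The intended precedence can be sketched in plain C as follows. This is only
an illustration under assumed names: toy_options, toy_tune and
toy_override_options are hypothetical and do not correspond to GCC's option
structures; only the rule (explicit user choice wins, otherwise the tuning
decides) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_tune { TOY_TUNE_GENERIC, TOY_TUNE_KNL };

struct toy_options
{
  bool vzeroupper_set_by_user;	/* -mzeroupper or -mno-zeroupper given.  */
  bool vzeroupper;		/* Current value of the flag.  */
  enum toy_tune tune;
};

/* Hypothetical tuning query: on unless tuning for KNL.  */
static bool
toy_tune_emit_vzeroupper (enum toy_tune tune)
{
  return tune != TOY_TUNE_KNL;
}

/* Apply the default only when the user did not pick a setting.  */
static void
toy_override_options (struct toy_options *opts)
{
  assert (opts != NULL);
  if (!opts->vzeroupper_set_by_user && toy_tune_emit_vzeroupper (opts->tune))
    opts->vzeroupper = true;
}
```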

gcc/

	PR target/82990
	* config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
	TARGET_AVX512ER check.
	(ix86_option_override_internal): Set MASK_VZEROUPPER if
	neither -mzeroupper nor -mno-zeroupper is used and
	TARGET_EMIT_VZEROUPPER is set.
	* config/i386/i386.h (TARGET_EMIT_VZEROUPPER): New.
	* config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.

gcc/testsuite/

	PR target/82990
	* gcc.target/i386/pr82942-2.c: Add -mtune=knl.
	* gcc.target/i386/pr82990-1.c: New test.
	* gcc.target/i386/pr82990-2.c: Likewise.
	* gcc.target/i386/pr82990-3.c: Likewise.
	* gcc.target/i386/pr82990-4.c: Likewise.
	* gcc.target/i386/pr82990-5.c: Likewise.
	* gcc.target/i386/pr82990-6.c: Likewise.
	* gcc.target/i386/pr82990-7.c: Likewise.
---
 gcc/config/i386/i386.c|  5 +++--
 gcc/config/i386/i386.h|  2 ++
 gcc/config/i386/x86-tune.def  |  4 
 gcc/testsuite/gcc.target/i386/pr82942-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr82990-1.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr82990-2.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-3.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-4.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-5.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr82990-6.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-7.c |  6 ++
 11 files changed, 68 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-7.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5e84a09954..c6ca0712755 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2497,7 +2497,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return TARGET_AVX && !TARGET_AVX512ER
+  return TARGET_AVX
 	 && TARGET_VZEROUPPER && flag_expensive_optimizations
 	 && !optimize_size;
 }
@@ -4666,7 +4666,8 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_SEH && TARGET_CALL_MS2SYSV_XLOGUES)
 sorry ("-mcall-ms2sysv-xlogues isn%'t currently supported with SEH");
 
-  if (!(opts_set->x_target_flags & MASK_VZEROUPPER))
+  if (!(opts_set->x_target_flags & MASK_VZEROUPPER)
+  && TARGET_EMIT_VZEROUPPER)
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e3e55da4232..a45e2df5783 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -517,6 +517,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 	ix86_tune_features[X86_TUNE_AVOID_FALSE_DEP_FOR_BMI]
 #define TARGET_ONE_IF_CONV_INSN \
 	ix86_tune_features[X86_TUNE_ONE_IF_CONV_INSN]
+#define TARGET_EMIT_VZEROUPPER \
+	ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
 
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 99282c88341..19fd2b52b30 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -543,3 +543,7 @@ DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~0U)
arithmetic to 32bit via PROMOTE_MODE macro.  This code generation scheme
is usually used for RISC targets.  */
 DEF_TUNE (X86_TUNE_PROMOTE_QI_REGS, "promote_qi_regs", 0U)
+
+/* X86_TUNE_EMIT_VZEROUPPER: This enables vzeroupper instruction insertion
+   before a transfer of 

[PATCH, GCC/ARM] Fix ICE in Armv8-M Security Extensions code

2017-11-15 Thread Thomas Preudhomme

Hi,

Commit r253825 which introduced some sanity checks for sbitmap revealed
a bug in the conversion of cmse_nonsecure_entry_clear_before_return ()
to using bitmap structure. bitmap_and expects that the two bitmaps have
the same length, yet the code in
cmse_nonsecure_entry_clear_before_return () uses different sizes for
to_clear_bitmap and to_clear_arg_regs_bitmap, with the assumption that
bitmap_and would behave as if the bits not allocated were in fact zero.
This commit makes sure both bitmaps are equally sized.
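
The failure mode can be pictured with a toy C model of a sized bitmap whose
AND operation asserts that both operands have the same length. None of this
is GCC's sbitmap code: toy_bits, toy_make and toy_and are hypothetical, and
the model is capped at 64 bits for brevity.

```c
#include <assert.h>
#include <stdint.h>

struct toy_bits
{
  unsigned n_bits;	/* Number of valid bits, at most 64 here.  */
  uint64_t bits;
};

/* Create an empty toy bitmap of N_BITS bits.  */
static struct toy_bits
toy_make (unsigned n_bits)
{
  struct toy_bits b = { n_bits, 0 };
  assert (n_bits <= 64);
  return b;
}

/* AND two bitmaps, insisting (like the sanity check described above) that
   both operands and the destination have the same size.  */
static void
toy_and (struct toy_bits *dst, const struct toy_bits *a,
	 const struct toy_bits *b)
{
  assert (a->n_bits == b->n_bits && dst->n_bits == a->n_bits);
  dst->bits = a->bits & b->bits;
}

/* Build the argument-register bitmap with the same size as TO_CLEAR, so
   that toy_and accepts the pair.  Only the low four bits are set.  */
static struct toy_bits
toy_arg_regs_like (const struct toy_bits *to_clear)
{
  struct toy_bits arg_regs = toy_make (to_clear->n_bits);
  assert (to_clear->n_bits >= 4);
  arg_regs.bits = 0xF;
  return arg_regs;
}
```

Sizing the second bitmap from the first, as in toy_arg_regs_like, is the
same idea as allocating to_clear_arg_regs_bitmap from the size of
to_clear_bitmap.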

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-13  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_entry_clear_before_return): Allocate
to_clear_arg_regs_bitmap to the same size as to_clear_bitmap.

Testing: Bootstrapped GCC on arm-none-linux-gnueabihf target and
testsuite shows no regression. Running cmse.exp tests for Armv8-M
Baseline and Mainline shows FAIL->PASS for bitfield-1, bitfield-2,
bitfield-3 and struct-1 testcases.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db99303f3fb7a2196f48358e74fa4d98f31f045e..106e3edce0d6f2518eb391c436c5213a78d1275b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25205,7 +25205,8 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx reg_rtx;
-  auto_sbitmap to_clear_arg_regs_bitmap (R0_REGNUM + NUM_ARG_REGS);
+  int to_clear_bitmap_size = SBITMAP_SIZE ((sbitmap) to_clear_bitmap);
+  auto_sbitmap to_clear_arg_regs_bitmap (to_clear_bitmap_size);
 
   /* Padding bits to clear is not 0 so we know we are dealing with
 	 returning a composite type, which only uses r0.  Let's make sure that


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Jonathan Wakely

On 15/11/17 09:30 -0700, Martin Sebor wrote:

On 11/15/2017 05:45 AM, Martin Liška wrote:

On 11/06/2017 07:29 PM, Martin Sebor wrote:

Sorry for being late with my comment.  I just spotted this minor
formatting issue.  Even though GCC isn't (yet) consistent about
it the keyword "constexpr" should be quoted in the error message
below (and, eventually, in all diagnostic messages).  Since the
patch has been committed by now this is just a reminder for us
to try to keep this in mind in the future.


Hi.

I've prepared a patch for that. If desired, I can fix the test-suite in a follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"


If GCC had support for italics for defined terms of the language
or the grammar /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of the word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.


Did you check the latest draft? That should have been fixed.

Defined terms should only be italicized when introduced, not when
used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
constructor" are italicized, but are in normal font elsewhere. When
referring specifically to the keyword `constexpr` it should be in code
font.

Grammar productions are always italicized, but "constexpr function" is
not a grammar production.



Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Sandra Loosemore

On 11/15/2017 04:51 AM, Tamar Christina wrote:

Hi All,

This patch updates the documentation for AArch64 and ARM correcting the use of
the architecture namings by adding the -A suffix in appropriate places.


Just to clarify, was the documentation previously using incorrect 
terminology, or are there new non-A ARMv7 and ARMv8 architectures that 
invalidate existing uses of those terms without the -A suffix?  And, are 
the "appropriate places" all currently-unsuffixed uses, or just a subset 
of incorrect uses?


The actual patch looks like search-and-replace to me and I have no 
objection to it, but I'd like to understand the rationale so that I can 
try to remember what the conventions are for future patch review.


-Sandra


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Sebor

On 11/15/2017 05:45 AM, Martin Liška wrote:

On 11/06/2017 07:29 PM, Martin Sebor wrote:

Sorry for being late with my comment.  I just spotted this minor
formatting issue.  Even though GCC isn't (yet) consistent about
it the keyword "constexpr" should be quoted in the error message
below (and, eventually, in all diagnostic messages).  Since the
patch has been committed by now this is just a reminder for us
to try to keep this in mind in the future.


Hi.

I've prepared patch for that. If it's desired, I can fix test-suite follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"


If GCC had support for italics for defined terms of the language
or the grammar /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of the word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.




Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Christophe Lyon
On 15 November 2017 at 16:58, Kyrill  Tkachov
 wrote:
> Hi Christophe,
>
>
> On 15/11/17 15:31, Christophe Lyon wrote:
>>
>> Hi Kyrill,
>>
>>
>> On 8 November 2017 at 19:34, Kyrill  Tkachov
>>  wrote:
>>>
>>> On 06/06/17 14:17, James Greenhalgh wrote:

 On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:
>
> Hi all,
>
> On top of the previous vec_merge simplifications [1] we can add this
> pattern to perform
> a store of a vec_concat of two 64-bit values in distinct registers as
> an
> STP.
> This avoids constructing such a vector explicitly in a register and
> storing it as
> a Q register.
> This way for the code in the testcase we can generate:
>
> construct_lane_1:
>   ldp d1, d0, [x0]
>   fmov d3, 1.0e+0
>   fmov d2, 2.0e+0
>   fadd d4, d1, d3
>   fadd d5, d0, d2
>   stp d4, d5, [x1, 32]
>   ret
>
> construct_lane_2:
>   ldp x2, x0, [x0]
>   add x3, x2, 1
>   add x4, x0, 2
>   stp x3, x4, [x1, 32]
>   ret
>
> instead of the current:
> construct_lane_1:
>   ldp d0, d1, [x0]
>   fmov d3, 1.0e+0
>   fmov d2, 2.0e+0
>   fadd d0, d0, d3
>   fadd d1, d1, d2
>   dup v0.2d, v0.d[0]
>   ins v0.d[1], v1.d[0]
>   str q0, [x1, 32]
>   ret
>
> construct_lane_2:
>   ldp x2, x3, [x0]
>   add x0, x2, 1
>   add x2, x3, 2
>   dup v0.2d, x0
>   ins v0.d[1], x2
>   str q0, [x1, 32]
>   ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?

 OK.
>>>
>>>
>>> Thanks, I've committed this and the other patches in this series after
>>> rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
>>> The only conflict from updating the patch was that I had to use the
>>> store_16
>>> attribute rather than
>>> the old store2 for the new define_insn. This is what I've committed with
>>> r254551.
>>>
>>> Sorry for the delay in committing.
>>>
>> I've noticed that the new tests fail when testing with -mabi=ilp32:
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
>> stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
>> stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)
>>
>> Sorry if this has been reported before.
>
>
> Thank you for reporting this, I was not aware of it.
> My patch does indeed fail to generate the optimised sequence for
> -mabi=ilp32.
> During combine it fails to match:
> Failed to match this instruction:
> (set (mem:V2DF (plus:DI (reg/v/f:DI 79 [ z ])
> (const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
> (vec_concat:V2DF (reg:DF 81 [ y0 ])
> (reg:DF 84 [ y1 ])))
>
>
> but without the -mabi=ilp32 it does successfully match the equivalent
>
> (set (mem:V2DF (plus:DI (reg:DI 1 x1 [ z ])
> (const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
> (vec_concat:V2DF (reg:DF 81 [ y0 ])
> (reg:DF 84 [ y1 ])))
>
> The only difference is the index register being the hard reg x1.
> There's probably some subtlety in aarch64_classify_address that I'll need to
> dig into.
> In any case, can you please open a bug report for this so we can track it?

Sure, that's: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83009


> To be clear, the failure is just suboptimal codegen for the -mabi=ilp32
> case, not a wrong-code or ICE
> (though it should still be fixed, of course).
>
> Thanks again,
> Kyrill
>
>
>> Christophe
>>
>>> Kyrill
>>>
>>>
 Thanks,
 James

> 2017-06-06  Kyrylo Tkachov  
>
>   * config/aarch64/aarch64-simd.md (store_pair_lanes):
>   New pattern.
>   * config/aarch64/constraints.md (Uml): New constraint.
>   * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand):
> New
>   predicate.
>
> 2017-06-06  Kyrylo Tkachov  
>
>   * gcc.target/aarch64/store_v2vec_lanes.c: New test.


>


Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
> -mzeroupper is specified to generate vzeroupper instruction.  If it
> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
> always use -mzeroupper or -mno-zeroupper to override it.
>
> Sebastian, can you run the full test with it?
>
> OK for trunk if there is no regression?

If we want to go this way, please add relevant tune flag (e.g.
X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
property of the processor model, not ISA.

Uros.
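
To illustrate the distinction between an ISA bit and a per-processor tune flag, here is a minimal toy model in C. It is only a sketch: every identifier (toy_processor, m_TOY_KNL, TOY_TUNE_EMIT_VZEROUPPER, toy_default_zeroupper) is a hypothetical stand-in and not the real i386 backend definition.

```c
#include <stdbool.h>

/* Toy processor models; hypothetical, not GCC's real enum.  */
enum toy_processor { TOY_GENERIC, TOY_HASWELL, TOY_KNL, TOY_PROC_MAX };

/* One bit per processor model, in the style of the m_* masks.  */
#define m_TOY(p)   (1u << (p))
#define m_TOY_KNL  m_TOY (TOY_KNL)

/* Tune features; each entry holds the set of models that enable it.  */
enum toy_tune { TOY_TUNE_EMIT_VZEROUPPER, TOY_TUNE_MAX };

static const unsigned toy_tune_masks[TOY_TUNE_MAX] = {
  /* Enabled for every model except KNL, i.e. ~m_KNL.  */
  [TOY_TUNE_EMIT_VZEROUPPER] = ~m_TOY_KNL,
};

/* Default used when neither -mzeroupper nor -mno-zeroupper is given:
   consult the tuning of the selected processor model rather than the
   set of enabled ISA extensions.  */
bool
toy_default_zeroupper (enum toy_processor tune)
{
  return (toy_tune_masks[TOY_TUNE_EMIT_VZEROUPPER] & m_TOY (tune)) != 0;
}
```

With this shape, code built with an AVX512ER-capable ISA but tuned for another model would still get vzeroupper by default, which is the behaviour a model-based flag is meant to express.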


Re: [patches] Re: [PATCH] RISC-V: Add Jim Wilson as a maintainer

2017-11-15 Thread Palmer Dabbelt

On Tue, 07 Nov 2017 09:53:12 PST (-0800), Palmer Dabbelt wrote:

On Tue, 07 Nov 2017 09:47:37 PST (-0800), Jim Wilson wrote:

On Mon, Nov 6, 2017 at 6:39 PM, Palmer Dabbelt  wrote:


+riscv port Jim Wilson  



It is jimw not jim for the email address.  Please fix.


Sorry.  We're still pending approval, but

diff --git a/MAINTAINERS b/MAINTAINERS
index 9c3a56ea0941..222dad81f2bb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -93,8 +93,9 @@ pdp11 portPaul Koning 
 picochip port  Daniel Towner   
 powerpcspe portAndrew Jenner   

 riscv port Kito Cheng  
-riscv port Palmer Dabbelt  
+riscv port Palmer Dabbelt  
 riscv port Andrew Waterman 
+riscv port Jim Wilson  
 rl78 port  DJ Delorie  
 rs6000/powerpc portDavid Edelsohn  
 rs6000/powerpc portSegher Boessenkool  


Committed.


RE: [PATCH][GCC][ARM] Implement "arch" GCC pragma and "+" attributes [Patch (2/3)]

2017-11-15 Thread Tamar Christina


> -Original Message-
> From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
> Sent: Wednesday, November 15, 2017 10:11
> To: Tamar Christina ; Sandra Loosemore
> ; gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com
> Subject: Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and
> "+" attributes [Patch (2/3)]
> 
> Hi Tamar,
> 
> On 10/11/17 10:56, Tamar Christina wrote:
> > Hi Sandra,
> >
> > I've respun the patch with the docs changes you requested.
> >
> > Regards,
> > Tamar
> >
> > > -Original Message-
> > > From: Sandra Loosemore [mailto:san...@codesourcery.com]
> > > Sent: 07 November 2017 03:38
> > > To: Tamar Christina; gcc-patches@gcc.gnu.org
> > > Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com;
> > > Kyrylo Tkachov
> > > Subject: Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and
> > > "+" attributes [Patch (2/3)]
> > >
> > > On 11/06/2017 09:50 AM, Tamar Christina wrote:
> > > > Hi All,
> > > >
> > > > This patch adds support for the setting the architecture and
> > > > extensions using the target GCC pragma.
> > > >
> > > > #pragma GCC target ("arch=armv8-a+crc")
> > > >
> > > > It also supports a short hand where an extension is just added to
> > > > the current architecture without changing it
> > > >
> > > > #pragma GCC target ("+crc")
> > > >
> > > > Popping and pushing options also correctly reconfigure the global
> > > > state as expected.
> > > >
> > > > Also supported is using the __attribute__((target("...")))
> > > > attributes on functions to change the architecture or extension.
> > > >
> > > > Regtested on arm-none-eabi and no regressions.
> 
> This will need a bootstrap and test run on arm-none-linux-gnueabihf (like all
> arm changes).
> Your changelog at
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00387.html mentions some
> arm-c.c changes but I don't see any included in this patch?
> 
> The other changes look good and in line with what I would expect, but can
> you please post the arm-c.c changes if there are any?

Hi Kyrill,

Sorry, I moved this change into another patch that was already committed and
forgot to remove it from this changelog.

I've also already bootstrapped this and the 3rd patch when you asked for the
bootstrap of the first one, but I didn't mail the updated status to the list.

The correct changelog is 

gcc/
2017-11-15  Tamar Christina  

PR target/82641
* config/arm/arm.c (arm_valid_target_attribute_rec):
Parse "arch=" and "+".
(arm_valid_target_attribute_tree): Re-init global options.
(arm_option_override): Make non-static.
(arm_options_perform_arch_sanity_checks): Make errors fatal.
* config/arm/arm_acle.h (__ARM_FEATURE_CRC32): Replace with pragma.
* doc/extend.texi (ARM Function Attributes): Add pragma and target.

gcc/testsuite/
2017-11-15  Tamar Christina  

PR target/82641
* gcc.target/arm/pragma_arch_attribute.c: New.

Ok for trunk?

Thanks,
Tamar
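
For readers following along, a rough sketch of how the "+crc" shorthand described in this thread might be used. This is an assumption-laden example, not the testcase from the patch: DEMO_USE_ARM_CRC is a hypothetical opt-in macro, crc32_update and crc32_buffer are made-up names, and the ARM path assumes a compiler with this patch applied and a base architecture that accepts the CRC extension. Everywhere else a portable bitwise fallback is used, which is intended to compute the same CRC-32 step.

```c
#include <stdint.h>

/* DEMO_USE_ARM_CRC is a hypothetical opt-in macro for this sketch; define
   it only when building for an ARM target whose base architecture accepts
   the "+crc" extension.  */
#if defined (__arm__) && defined (DEMO_USE_ARM_CRC)
#include <arm_acle.h>

#pragma GCC push_options
#pragma GCC target ("+crc")
/* Compiled with the CRC extension added to the current architecture.  */
uint32_t
crc32_update (uint32_t crc, uint8_t byte)
{
  return __crc32b (crc, byte);
}
#pragma GCC pop_options

#else

/* Portable fallback: bitwise CRC-32 step using the reflected polynomial
   0xEDB88320.  */
uint32_t
crc32_update (uint32_t crc, uint8_t byte)
{
  crc ^= byte;
  for (int i = 0; i < 8; i++)
    crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  return crc;
}

#endif

/* CRC-32 of a buffer with the usual initial value and final inversion.  */
uint32_t
crc32_buffer (const unsigned char *buf, unsigned len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned i = 0; i < len; i++)
    crc = crc32_update (crc, buf[i]);
  return ~crc;
}
```

The same selection could presumably be written per function with __attribute__((target("+crc"))) instead of the push/pop pragmas; the pragma form is used here because it scopes cleanly around a group of functions.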

> 
> Thanks,
> Kyrill
> 
> > > >
> > > > Ok for trunk?
> > > >
> > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index
> > > >
> > >
> 8aa443f87fb700f7a723d736bdbd53b6c839656d..18d0ffa6820326ce7badf33001
> > > b1
> > > > c6a467c95883 100644
> > > > --- a/gcc/doc/extend.texi
> > > > +++ b/gcc/doc/extend.texi
> > > > @@ -3858,6 +3858,42 @@ Specifies the fpu for which to tune the
> > > performance of this function.
> > > >  The behavior and permissible arguments are the same as for the
> > > > @option{-mfpu=}  command-line option.
> > > >
> > > > +@item arch=
> > > > +@cindex @code{arch=} function attribute, ARM Specifies the
> > > > +architecture version and architectural extensions to use for this
> > > > +function.  The behavior and permissible arguments are the same as
> > > > +for the @option{-march=} command-line option.
> > > > +
> > > > +The above target attributes can be specified as follows:
> > > > +
> > > > +@smallexample
> > > > +__attribute__((target("@var{attr-string}")))
> > > > +int
> > > > +f (int a)
> > > > +@{
> > > > +  return a + 5;
> > > > +@}
> > > > +@end smallexample
> > > > +
> > > > +where @code{@var{attr-string}} is one of the attribute strings.
> > >
> > > This example doesn't illustrate anything useful, and in fact just
> > confuses
> > > things by introducing @var{attr-string}.  Please use an actual valid
> > attribute
> > > here, something like "arch=armv8-a" or whatever.
> > >
> > > Also, either kill the sentence fragment after the example, or be
> > careful to
> > > add @noindent before it to indicate it's a continuation of the
> > > previous paragraph.
> > >
> > > > +
> > > > +Additionally, the architectural extension string may be specified
> > > > +on its own.  This can be used to turn on and off particular
> > > > 

Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Kyrill Tkachov

Hi Christophe,

On 15/11/17 15:31, Christophe Lyon wrote:

Hi Kyrill,


On 8 November 2017 at 19:34, Kyrill  Tkachov
 wrote:

On 06/06/17 14:17, James Greenhalgh wrote:

On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:

Hi all,

On top of the previous vec_merge simplifications [1] we can add this
pattern to perform
a store of a vec_concat of two 64-bit values in distinct registers as an
STP.
This avoids constructing such a vector explicitly in a register and
storing it as
a Q register.
This way for the code in the testcase we can generate:

construct_lane_1:
  ldp d1, d0, [x0]
  fmov d3, 1.0e+0
  fmov d2, 2.0e+0
  fadd d4, d1, d3
  fadd d5, d0, d2
  stp d4, d5, [x1, 32]
  ret

construct_lane_2:
  ldp x2, x0, [x0]
  add x3, x2, 1
  add x4, x0, 2
  stp x3, x4, [x1, 32]
  ret

instead of the current:
construct_lane_1:
  ldp d0, d1, [x0]
  fmov d3, 1.0e+0
  fmov d2, 2.0e+0
  fadd d0, d0, d3
  fadd d1, d1, d2
  dup v0.2d, v0.d[0]
  ins v0.d[1], v1.d[0]
  str q0, [x1, 32]
  ret

construct_lane_2:
  ldp x2, x3, [x0]
  add x0, x2, 1
  add x2, x3, 2
  dup v0.2d, x0
  ins v0.d[1], x2
  str q0, [x1, 32]
  ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

OK.


Thanks, I've committed this and the other patches in this series after
rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
The only conflict from updating the patch was that I had to use the store_16
attribute rather than
the old store2 for the new define_insn. This is what I've committed with
r254551.

Sorry for the delay in committing.


I've noticed that the new tests fail when testing with -mabi=ilp32:
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)

Sorry if this has been reported before.


Thank you for reporting this, I was not aware of it.
My patch does indeed fail to generate the optimised sequence for 
-mabi=ilp32.

During combine it fails to match:
Failed to match this instruction:
(set (mem:V2DF (plus:DI (reg/v/f:DI 79 [ z ])
(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 
A128])

(vec_concat:V2DF (reg:DF 81 [ y0 ])
(reg:DF 84 [ y1 ])))


but without the -mabi=ilp32 it does successfully match the equivalent

(set (mem:V2DF (plus:DI (reg:DI 1 x1 [ z ])
(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 
A128])

(vec_concat:V2DF (reg:DF 81 [ y0 ])
(reg:DF 84 [ y1 ])))

The only difference is the index register being the hard reg x1.
There's probably some subtlety in aarch64_classify_address that I'll 
need to dig into.

In any case, can you please open a bug report for this so we can track it?
To be clear, the failure is just suboptimal codegen for the -mabi=ilp32 
case, not a wrong-code or ICE

(though it should still be fixed, of course).

Thanks again,
Kyrill
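
For context, source of roughly the following shape would produce the vec_concat stores discussed in this thread. This is a sketch reconstructed from the quoted assembly and RTL (the y0, y1 and z names appear in the dump); the actual store_v2vec_lanes.c may differ in its details.

```c
/* GCC vector extension: two 64-bit lanes in one 128-bit vector.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Build a vector from two separately computed doubles and store it.
   z[2] is at byte offset 32, matching the [x1, 32] address.  */
void
construct_lane_1 (double *y, v2df *z)
{
  double y0 = y[0] + 1;
  double y1 = y[1] + 2;
  v2df x = { y0, y1 };
  z[2] = x;
}

/* Same construction with 64-bit integers in general registers.  */
void
construct_lane_2 (long long *y, v2di *z)
{
  long long y0 = y[0] + 1;
  long long y1 = y[1] + 2;
  v2di x = { y0, y1 };
  z[2] = x;
}
```

Since the two lanes already live in distinct registers, a single store-pair can write them to memory without first assembling a Q register.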


Christophe


Kyrill



Thanks,
James


2017-06-06  Kyrylo Tkachov  

  * config/aarch64/aarch64-simd.md (store_pair_lanes):
  New pattern.
  * config/aarch64/constraints.md (Uml): New constraint.
  * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
  predicate.

2017-06-06  Kyrylo Tkachov  

  * gcc.target/aarch64/store_v2vec_lanes.c: New test.






RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread tamar . christina
> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Wednesday, November 15, 2017 12:50
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> i...@airs.com
> Subject: RE: [PATCH][GCC][mid-end] Allow larger copies when target
> supports unaligned access [Patch (1/2)]
> 
> On Wed, 15 Nov 2017, Tamar Christina wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Richard Biener [mailto:rguent...@suse.de]
> > > Sent: Wednesday, November 15, 2017 08:24
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > > i...@airs.com
> > > Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> > > supports unaligned access [Patch (1/2)]
> > >
> > > On Tue, 14 Nov 2017, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch allows larger bitsizes to be used as copy size when the
> > > > target does not have SLOW_UNALIGNED_ACCESS.
> > > >
> > > > fun3:
> > > > > adrp x2, .LANCHOR0
> > > > > add x2, x2, :lo12:.LANCHOR0
> > > > > mov x0, 0
> > > > > sub sp, sp, #16
> > > > > ldrh w1, [x2, 16]
> > > > > ldrb w2, [x2, 18]
> > > > > add sp, sp, 16
> > > > > bfi x0, x1, 0, 8
> > > > > ubfx x1, x1, 8, 8
> > > > > bfi x0, x1, 8, 8
> > > > > bfi x0, x2, 16, 8
> > > > > ret
> > > >
> > > > is turned into
> > > >
> > > > fun3:
> > > > > adrp x0, .LANCHOR0
> > > > > add x0, x0, :lo12:.LANCHOR0
> > > > > sub sp, sp, #16
> > > > > ldrh w1, [x0, 16]
> > > > > ldrb w0, [x0, 18]
> > > > > strh w1, [sp, 8]
> > > > > strb w0, [sp, 10]
> > > > > ldr w0, [sp, 8]
> > > > > add sp, sp, 16
> > > > > ret
> > > >
> > > > which avoids the bfi's for a simple 3 byte struct copy.
> > > >
> > > > Regression tested on aarch64-none-linux-gnu and
> > > > x86_64-pc-linux-gnu and
> > > no regressions.
> > > >
> > > > This patch is just splitting off from the previous combined patch
> > > > with
> > > > AArch64 and adding a testcase.
> > > >
> > > > I assume Jeff's ACK from
> > > > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still
> > > > valid as
> > > the code did not change.
> > >
> > > Given your no_slow_unalign isn't mode specific can't you use the
> > > existing non_strict_align?
> >
> > No because non_strict_align checks if the target supports unaligned
> > access at all,
> >
> > This no_slow_unalign corresponds instead to the target
> > slow_unaligned_access which checks that the access you want to make
> > has a greater cost than doing an aligned access. ARM for instance
> > always return 1 (value of STRICT_ALIGNMENT) for slow_unaligned_access
> > while for non_strict_align it may return 0 or 1 based on the options
> provided to the compiler.
> >
> > The problem is I have no way to test STRICT_ALIGNMENT or
> > slow_unaligned_access So I had to hardcode some targets that I know it
> does work on.
> 
> I see.  But then the slow_unaligned_access implementation should use
> non_strict_align as default somehow as SLOW_UNALIGNED_ACCESS is
> defaulted to STRICT_ALIGN.
> 
> Given that SLOW_UNALIGNED_ACCESS has different values for different
> modes it would also make sense to be more specific for the testcase in
> question, like word_mode_slow_unaligned_access to tell this only applies to
> word_mode?

Ah, that's fair enough. I've updated the patch and the new changelog is:


gcc/
2017-11-15  Tamar Christina  

* expr.c (copy_blkmode_to_reg): Fix bitsize for targets
with fast unaligned access.
* doc/sourcebuild.texi (word_mode_no_slow_unalign): New.

gcc/testsuite/
2017-11-15  Tamar Christina  

* gcc.dg/struct-simple.c: New.
* lib/target-supports.exp
(check_effective_target_word_mode_no_slow_unalign): New.

Ok for trunk?

Thanks,
Tamar
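
As a reminder of what kind of source this affects, a 3-byte struct returned by value looks roughly like the sketch below. The names struct3 and foo3 are assumptions inferred from the fun3 label in the quoted assembly; the real gcc.dg/struct-simple.c may be written differently.

```c
/* A 3-byte aggregate with no padding; it has no integer mode of its own,
   so returning it in a register means copying a block of bytes.  */
struct struct3 { char a, b, c; };

struct struct3 foo3 = { 'A', 'B', 'C' };

/* Returns the struct by value.  The size of the pieces used for this copy
   is what the patch changes on targets where unaligned access is not
   slow.  */
struct struct3
fun3 (void)
{
  return foo3;
}
```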

> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> > >
> > > Otherwise the expr.c change looks ok.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > >
> > > > gcc/
> > > > 2017-11-14  Tamar Christina  
> > > >
> > > > * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > > > with fast unaligned access.
> > > > * doc/sourcebuild.texi (no_slow_unalign): New.
> > > >
> > > > gcc/testsuite/
> > > > 2017-11-14  Tamar Christina  
> > > >
> > > > * gcc.dg/struct-simple.c: New.
> > > > * lib/target-supports.exp
> > > > (check_effective_target_no_slow_unalign): New.
> > > >
> > > >
> > >
> > > --
> > > Richard Biener 
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener 

Re: [testsuite, committed] Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15 Thread Martin Sebor

On 11/15/2017 08:12 AM, Tom de Vries wrote:

[ Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy
and strncat (PR 81117) ]

On 08/06/2017 10:07 PM, Martin Sebor wrote:

Part 3 of the series contains the meat of the patch: the new
-Wstringop-truncation option, and enhancements to -Wstringop-
overflow, and -Wpointer-sizeof-memaccess to detect misuses of
strncpy and strncat.

Martin

gcc-81117-3.diff


PR c/81117 - Improve buffer overflow checking in strncpy




gcc/testsuite/ChangeLog:

PR c/81117
* c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
* c-c++-common/Wstringop-overflow.c: Same.
* c-c++-common/Wstringop-truncation.c: Same.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust.
* c-c++-common/attr-nonstring-2.c: New test.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
* gcc.dg/torture/pr63554.c: Same.
* gcc.dg/Walloca-1.c: Disable macro tracking.



Hi,

this also caused a regression in strncpy-fix-1.c. I noticed it for nvptx
 (but I also saw it in other test results, f.i. for
x86_64-unknown-freebsd12.0 at
https://gcc.gnu.org/ml/gcc-testresults/2017-11/msg01276.html ).

On linux you don't see this unless you add -Wsystem-headers:


Yes, some Glibc versions (I think 2.24 and prior) define strncpy
as a macro.  The macro has been removed from newer versions, which
makes the warning show up inconsistently.  I test on Fedora 25 with
the older Glibc so I don't see all these warnings.

I'm tracking the problem in bug 82944.


...
$ gcc src/gcc/testsuite/gcc.dg/strncpy-fix-1.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -Wall
-Wsystem-headers -S -o strncpy-fix-1.s
In file included from /usr/include/string.h:630,
 from src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:6:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c: In function ‘f’:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:10:3: warning:
‘__builtin_strncpy’ output truncated before terminating nul copying 2
bytes from a string of the same length [-Wstringop-truncation]
...

Fixed by adding -Wno-stringop-truncation.

Committed as obvious.
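
For anyone unfamiliar with the diagnostic, the pattern it targets can be sketched as follows. This is an illustrative example with made-up function names, not the contents of strncpy-fix-1.c; whether the warning actually appears depends on the GCC version, the options used, and how the C library declares strncpy.

```c
#include <string.h>

/* Copies exactly two bytes from a two-character string, so no terminating
   nul is written.  With -Wall, a GCC that implements
   -Wstringop-truncation may warn about this call.  */
void
fill_tag (char tag[2])
{
  strncpy (tag, "ab", 2);
}

/* The same copy into a buffer with room for the terminator, which is then
   written explicitly, leaving a proper string behind.  */
void
fill_string (char str[3])
{
  strncpy (str, "ab", 2);
  str[2] = '\0';
}
```

When the missing nul is intentional, as for a fixed-width tag, memcpy expresses that more directly than strncpy.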


Thanks
Martin


[PATCH] make canonicalize_condition keep its promise

2017-11-15 Thread Aaron Sawdey
So, the story of this very small patch starts with me adding patterns
for ppc instructions bdz[tf] and bdnz[tf] such as this:

  [(set (pc)
(if_then_else
  (and
 (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 (const_int 1))
 (match_operator 3 "branch_comparison_operator"
  [(match_operand 4 "cc_reg_operand" "y,y,y,y")
   (const_int 0)]))
  (label_ref (match_operand 0))
  (pc)))
   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
(plus:P (match_dup 1)
(const_int -1)))
   (clobber (match_scratch:P 5 "=X,X,,r"))
   (clobber (match_scratch:CC 6 "=X,,,"))
   (clobber (match_scratch:CCEQ 7 "=X,,,"))]

However when this gets to the loop_doloop pass, we get an assert fail
in iv_number_of_iterations():

  gcc_assert (COMPARISON_P (condition));

This is happening because this branch insn tests two things ANDed
together so the and is at the top of the expression, not a comparison.

This condition is extracted from the insn by get_condition() which is
pretty straightforward, and which calls canonicalize_condition() before
returning it. Now, one could put a test for a jump condition that is
not a conditional test in here but the comment for
canonicalize_condition() says:

   (1) The code will always be a comparison operation (EQ, NE, GT, etc.).

So, this patch adds a test at the end that just returns 0 if the return
rtx is not a comparison. As it happens, doloop conversion is not needed
here because I'm already generating rtl for a branch-decrement counter
based loop.

If there is a better way to go about this please let me know and I'll
revise/retest.

Bootstrap and regtest pass on ppc64le and x86_64. Ok for trunk?

Thanks,
Aaron


2017-11-15  Aaron Sawdey  

* rtlanal.c (canonicalize_condition): Return 0 if final rtx
does not have a conditional at the top.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 254553)
+++ gcc/rtlanal.c   (working copy)
@@ -5623,7 +5623,11 @@
   if (CC0_P (op0))
 return 0;
 
-  return gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
+  /* We promised to return a comparison.  */
+  rtx ret = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
+  if (COMPARISON_P (ret))
+return ret;
+  return 0;
 }
 
 /* Given a jump insn JUMP, return the condition that will cause it to branch
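
The idea behind the check can be shown with a small toy model in C. This is only a sketch under stated assumptions: toy_expr, toy_comparison_p and toy_canonicalize_condition are hypothetical stand-ins for rtx, COMPARISON_P and canonicalize_condition, and none of the real canonicalization work is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical, not GCC's rtx codes.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_EQ, TOY_NE, TOY_GT, TOY_AND };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op0, *op1;
};

/* Analogue of COMPARISON_P: true only for comparison codes.  */
static bool
toy_comparison_p (const struct toy_expr *x)
{
  return x->code == TOY_EQ || x->code == TOY_NE || x->code == TOY_GT;
}

/* Return COND if it is a comparison at the top level, otherwise NULL so
   that callers can give up early instead of tripping an assert later.  */
const struct toy_expr *
toy_canonicalize_condition (const struct toy_expr *cond)
{
  if (!toy_comparison_p (cond))
    return NULL;
  return cond;
}
```

A condition such as (and (ne ...) (eq ...)) has TOY_AND at the top, so the function returns NULL for it, which mirrors the promise in the comment that the returned code is always a comparison.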


Re: [PATCH] Simplify floating point comparisons

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:28 PM, Wilco Dijkstra  
> wrote:

>> +(if (flag_unsafe_math_optimizations)
>> +  /* Simplify (C / x op 0.0) to x op 0.0 for C > 0.  */
>> +  (for op (lt le gt ge)
>> +   neg_op (gt ge lt le)
>> +    (simplify
>> +  (op (rdiv REAL_CST@0 @1) real_zerop@2)
>> +  (switch
>> +   (if (real_less (&dconst0, TREE_REAL_CST_PTR (@0)))
>
> Note that real_less (0., +Inf) so I think you either need to check C is 
> 'normal'
> or ! HONOR_INFINITIES.

Yes, it was missing an explicit check for infinity, now added.

> There's also the underflow issue I guess this is what 
> -funsafe-math-optimizations
> is for.  I think ignoring underflows is dangerous though.

We could change C / x > 0 to x >= 0 so the underflow case is included.
However that still means x == 0.0 would behave differently - so the question is
what exactly does -funsafe-math-optimizations allow?
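
To make the corner cases concrete, here is a small C sketch comparing the original and simplified forms at run time. The function names and the constants 1e-30f and 5.0f are chosen for illustration only, and the results assume IEEE single-precision arithmetic with default rounding.

```c
#include <stdbool.h>

/* Original form: (C / x) > 0 with C = 1e-30f.  The volatile store forces
   the quotient to be rounded to float.  */
bool
div_gt_zero (float x)
{
  volatile float q = 1e-30f / x;
  return q > 0.0f;
}

/* Form after the proposed simplification: x > 0.  */
bool
x_gt_zero (float x)
{
  return x > 0.0f;
}

/* Original form with >=: (5 / x) >= 0.  */
bool
div_ge_zero (float x)
{
  volatile float q = 5.0f / x;
  return q >= 0.0f;
}

/* Simplified form: x >= 0.  */
bool
x_ge_zero (float x)
{
  return x >= 0.0f;
}
```

For x = 1e30f the quotient 1e-30f / x underflows to zero, so div_gt_zero is false while x_gt_zero is true. For x = -0.0f the quotient 5 / x is negative infinity, so div_ge_zero is false while x_ge_zero is true, because -0.0 compares equal to 0.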


>> + (for cmp (lt le gt ge)
>> +  neg_cmp (gt ge lt le)
>> +  /* Simplify (x * C1) cmp C2 -> x cmp (C2 / C1), where C1 != 0.  */
>> +  (simplify
>> +   (cmp (mult @0 REAL_CST@1) REAL_CST@2)
>> +   (with
>> +    { tree tem = const_binop (RDIV_EXPR, type, @2, @1); }
>> +    (if (tem)
>> + (switch
>> +  (if (real_less (&dconst0, TREE_REAL_CST_PTR (@1)))
>> +   (cmp @0 { tem; }))
>> +  (if (real_less (TREE_REAL_CST_PTR (@1), &dconst0))
>> +   (neg_cmp @0 { tem; })))
>
>
> Drops possible overflow/underflow in x * C1 and may create underflow
> or overflow with C2/C1 which you should detect here at least.

I've added checks for this, however I thought -funsafe-math-optimizations is
allowed to insert/remove underflow/overflow, like in these cases:

(x * 1e20f) * 1e20f and (x * 1e40f) * 1e-30f.
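
A short C sketch of how folding the constants first can both introduce and remove an overflow. The helper names are made up for illustration. Because 1e40f is not a finite float, the second case uses 1e20f and 1e-30f as constants instead of the exact example above. Results assume IEEE single precision.

```c
#include <float.h>
#include <stdbool.h>

/* True if V is an infinity, i.e. the computation overflowed.  */
bool
is_overflowed (float v)
{
  return v > FLT_MAX || v < -FLT_MAX;
}

/* Evaluate (x * c1) * c2 as written, rounding each step to float.  */
float
mul_as_written (float x, float c1, float c2)
{
  volatile float t = x * c1;
  volatile float r = t * c2;
  return r;
}

/* Evaluate x * (c1 * c2), i.e. with the two constants combined first.  */
float
mul_reassociated (float x, float c1, float c2)
{
  volatile float c = c1 * c2;
  volatile float r = x * c;
  return r;
}
```

With x = 1e-30f and both constants 1e20f, the written order gives about 1e10 while the reassociated order multiplies by an infinite constant. With x = 1e30f, c1 = 1e20f and c2 = 1e-30f, it is the written order that overflows and the reassociated order that stays finite.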

> Existing overflows may be guarded against with a HONOR_INFINITIES check.

Not sure what you mean with this?

> When overflow/underflow can be disregarded is there any reason remaining to
> make this guarded by flag_unsafe_math_optimizations?  Are there any cases
> where rounding issues can flip the comparison result?

I think it needs to remain under -funsafe-math-optimizations. Here is the 
updated
version:


ChangeLog
2017-11-15  Wilco Dijkstra    
Jackson Woodruff  

gcc/
PR 71026/tree-optimization
* match.pd: Simplify floating point comparisons.

gcc/testsuite/
PR 71026/tree-optimization
* gcc.dg/associate_comparison_1.c: New.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index 4d56847d6889923938625beb579b7bbb0cbbad91..967dbf8946fd12a161330f4c8b58dada5d9cb871 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -359,6 +359,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (rdiv @0 (negate @1))
  (rdiv (negate @0) @1))
 
+(if (flag_unsafe_math_optimizations)
+  /* Simplify (C / x op 0.0) to x op 0.0 for C > 0.  */
+  (for op (lt le gt ge)
+   neg_op (gt ge lt le)
+(simplify
+  (op (rdiv REAL_CST@0 @1) real_zerop@2)
+  (if (!REAL_VALUE_ISINF (TREE_REAL_CST (@0)))
+   (switch
+   (if (real_less (&dconst0, TREE_REAL_CST_PTR (@0)))
+(op @1 @2))
+   /* For C < 0, use the inverted operator.  */
+   (if (real_less (TREE_REAL_CST_PTR (@0), &dconst0))
+(neg_op @1 @2)))
+
 /* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
 (for div (trunc_div ceil_div floor_div round_div exact_div)
  (simplify
@@ -3703,6 +3717,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(rdiv @2 @1))
(rdiv (op @0 @2) @1)))
 
+ (for cmp (lt le gt ge)
+  neg_cmp (gt ge lt le)
+  /* Simplify (x * C1) cmp C2 -> x cmp (C2 / C1), where C1 != 0.  */
+  (simplify
+   (cmp (mult @0 REAL_CST@1) REAL_CST@2)
+   (with
+{ tree tem = const_binop (RDIV_EXPR, type, @2, @1); }
+(if (tem
+&& !(REAL_VALUE_ISINF (TREE_REAL_CST (tem))
+ || (real_zerop (tem) && !real_zerop (@1
+ (switch
+  (if (real_less (&dconst0, TREE_REAL_CST_PTR (@1)))
+   (cmp @0 { tem; }))
+  (if (real_less (TREE_REAL_CST_PTR (@1), &dconst0))
+   (neg_cmp @0 { tem; })))
+
  /* Simplify sqrt(x) * sqrt(y) -> sqrt(x*y).  */
  (for root (SQRT CBRT)
   (simplify
diff --git a/gcc/testsuite/gcc.dg/associate_comparison_1.c 
b/gcc/testsuite/gcc.dg/associate_comparison_1.c
new file mode 100644
index 
..ceaba334cce770eb1cbec9283ba8a0c64f725630
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/associate_comparison_1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funsafe-math-optimizations -fdump-tree-optimized-raw" } */
+
+int
+cmp_mul_1 (float x)
+{
+  return x * 3 <= 100;
+}
+
+int
+cmp_mul_2 (float x)
+{
+  return x * -5 > 100;
+}
+
+int
+div_cmp_1 (float x, float y)
+{
+  return x / 3 <= y;
+}
+
+int
+div_cmp_2 (float x, float y)
+{
+  return x / 3 <= 1;
+}
+
+int
+inv_cmp (float x)
+{
+  return 5 / x >= 0;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "rdiv_expr" 

Re: [PATCH 02/14] Support for adding and stripping location_t wrapper nodes

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:11 +0100, Richard Biener wrote:
> On Wed, Nov 15, 2017 at 7:17 AM, Trevor Saunders  rg> wrote:
> > On Fri, Nov 10, 2017 at 04:45:17PM -0500, David Malcolm wrote:
> > > This patch provides a mechanism in tree.c for adding a wrapper
> > > node
> > > for expressing a location_t, for those nodes for which
> > > !CAN_HAVE_LOCATION_P, along with a new method of cp_expr.
> > > 
> > > It's called in later patches in the kit via that new method.
> > > 
> > > In this version of the patch, I use NON_LVALUE_EXPR for wrapping
> > > constants, and VIEW_CONVERT_EXPR for other nodes.
> > > 
> > > I also turned off wrapper nodes for EXCEPTIONAL_CLASS_P, for the
> > > sake
> > > of keeping the patch kit more minimal.
> > > 
> > > The patch also adds a STRIP_ANY_LOCATION_WRAPPER macro for
> > > stripping
> > > such nodes, used later on in the patch kit.
> > 
> > I happened to start reading this series near the end and was rather
> > confused by this macro since it changes variables in a rather
> > unhygienic
> > way.  Did you consider just defining a inline function to return
> > the
> > actual decl?  It seems like its not used that often so the slight
> > extra
> > syntax should be that big a deal compared to the explicitness.
> 
> Existing practice  (STRIP_NOPS & friends).  I'm fine either way,
> the patch looks good.
> 
> Eventually you can simplify things by doing less checking in
> location_wrapper_p, like only checking
> 
> +inline bool location_wrapper_p (const_tree exp)
> +{
> +  if ((TREE_CODE (exp) == NON_LVALUE_EXPR
> +   || (TREE_CODE (exp) == VIEW_CONVERT_EXPR
> +  && (TREE_TYPE (exp)
> + == TREE_TYPE (TREE_OPERAND (exp, 0))))))
> +return true;
> +  return false;
> +}
> 
> and renaming to maybe_location_wrapper_p.  After all you can't really
> distinguish location wrappers from non-location wrappers?  (and why
> would you want to?)

That's the implementation I originally tried.

As noted in an earlier thread about this, the problem I ran into was
(in g++.dg/conversion/reinterpret1.C):

  // PR c++/15076

  struct Y { Y(int &); };

  int v;
  Y y1(reinterpret_cast<int>(v));  // { dg-error "" }

where the "reinterpret_cast<int>" has the same type as the VAR_DECL v,
and hence the argument to y1 is a NON_LVALUE_EXPR around a VAR_DECL,
where both have the same type, and hence location_wrapper_p () on the
cast would return true.

Compare with:

  Y y1(v);

where the argument "v" with a location wrapper is a VIEW_CONVERT_EXPR
around a VAR_DECL.

With the simpler conditions you suggested above, both are treated as
location wrappers (leading to the dg-error in the test failing),
whereas with the condition in the patch, only the latter is treated as
a location wrapper, and an error is correctly emitted for the dg-error.
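
To make the difference concrete, here is a minimal toy model in C; it is not
GCC code.  The node struct, the integer type ids, and the names
simple_wrapper_p and strict_wrapper_p are all hypothetical.
strict_wrapper_p only assumes what the patch description says
(NON_LVALUE_EXPR wraps constants, VIEW_CONVERT_EXPR wraps other nodes), so
the real predicate may differ in detail.

```c
#include <stddef.h>

/* Toy stand-ins for a handful of tree codes.  */
enum code { VAR_DECL, INTEGER_CST, NON_LVALUE_EXPR, VIEW_CONVERT_EXPR };

/* A toy expression node: a code, a type id, and at most one operand.  */
struct node
{
  enum code code;
  int type;
  const struct node *op;
};

/* The simpler check: any NON_LVALUE_EXPR, or a VIEW_CONVERT_EXPR whose
   type matches the type of its operand.  */
int
simple_wrapper_p (const struct node *e)
{
  return e->code == NON_LVALUE_EXPR
         || (e->code == VIEW_CONVERT_EXPR
             && e->op != NULL
             && e->type == e->op->type);
}

/* Assumed stricter check: the wrapper code must also agree with the kind
   of node being wrapped (constants vs. everything else).  */
int
strict_wrapper_p (const struct node *e)
{
  if (e->op == NULL || e->type != e->op->type)
    return 0;
  if (e->code == NON_LVALUE_EXPR)
    return e->op->code == INTEGER_CST;
  if (e->code == VIEW_CONVERT_EXPR)
    return e->op->code != INTEGER_CST;
  return 0;
}
```

With v modelled as a VAR_DECL, a NON_LVALUE_EXPR around it (the cast case)
passes the simple check but not the strict one, while a VIEW_CONVERT_EXPR
around it passes both.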

Hope this sounds sane.  Maybe the function needs a more detailed
comment explaining this?

Thanks
Dave


> Thanks,
> Richard.
> 
> > Other than that the series seems reasonable, and I look forward to
> > having wrappers in more places.  I seem to remember something I
> > wanted
> > to warn about they would make much easier.
> > 
> > Thanks
> > 
> > Trev
> > 


Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Christophe Lyon
Hi Kyrill,


On 8 November 2017 at 19:34, Kyrill  Tkachov
 wrote:
>
> On 06/06/17 14:17, James Greenhalgh wrote:
>>
>> On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:
>>>
>>> Hi all,
>>>
>>> On top of the previous vec_merge simplifications [1] we can add this
>>> pattern to perform
>>> a store of a vec_concat of two 64-bit values in distinct registers as an
>>> STP.
>>> This avoids constructing such a vector explicitly in a register and
>>> storing it as
>>> a Q register.
>>> This way for the code in the testcase we can generate:
>>>
>>> construct_lane_1:
>>>  ldp d1, d0, [x0]
>>>  fmov d3, 1.0e+0
>>>  fmov d2, 2.0e+0
>>>  fadd d4, d1, d3
>>>  fadd d5, d0, d2
>>>  stp d4, d5, [x1, 32]
>>>  ret
>>>
>>> construct_lane_2:
>>>  ldp x2, x0, [x0]
>>>  add x3, x2, 1
>>>  add x4, x0, 2
>>>  stp x3, x4, [x1, 32]
>>>  ret
>>>
>>> instead of the current:
>>> construct_lane_1:
>>>  ldp d0, d1, [x0]
>>>  fmov d3, 1.0e+0
>>>  fmov d2, 2.0e+0
>>>  fadd d0, d0, d3
>>>  fadd d1, d1, d2
>>>  dup v0.2d, v0.d[0]
>>>  ins v0.d[1], v1.d[0]
>>>  str q0, [x1, 32]
>>>  ret
>>>
>>> construct_lane_2:
>>>  ldp x2, x3, [x0]
>>>  add x0, x2, 1
>>>  add x2, x3, 2
>>>  dup v0.2d, x0
>>>  ins v0.d[1], x2
>>>  str q0, [x1, 32]
>>>  ret
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> Ok for GCC 8?
>>
>> OK.
>
>
> Thanks, I've committed this and the other patches in this series after
> rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
> The only conflict from updating the patch was that I had to use the store_16
> attribute rather than
> the old store2 for the new define_insn. This is what I've committed with
> r254551.
>
> Sorry for the delay in committing.
>

I've noticed that the new tests fail when testing with -mabi=ilp32:
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)

Sorry if this has been reported before.

Christophe

> Kyrill
>
>
>> Thanks,
>> James
>>
>>> 2017-06-06  Kyrylo Tkachov  
>>>
>>>  * config/aarch64/aarch64-simd.md (store_pair_lanes):
>>>  New pattern.
>>>  * config/aarch64/constraints.md (Uml): New constraint.
>>>  * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
>>>  predicate.
>>>
>>> 2017-06-06  Kyrylo Tkachov  
>>>
>>>  * gcc.target/aarch64/store_v2vec_lanes.c: New test.
>>
>>
>


Re: [PATCH PR82726/PR70754][2/2]New fix by finding correct root reference in combined chains

2017-11-15 Thread Bin.Cheng
On Mon, Nov 13, 2017 at 1:20 PM, Richard Biener
 wrote:
> On Sat, Nov 11, 2017 at 11:19 AM, Bernhard Reutner-Fischer
>  wrote:
>> On Fri, Nov 10, 2017 at 02:14:25PM +, Bin.Cheng wrote:
>>> Hmm, the patch...
>>
>> +  /* Setup UID for all statements in dominance order.  */
>> +  basic_block *bbs = get_loop_body (loop);
>> +  for (i = 0; i < loop->num_nodes; i++)
>> +{
>> +  unsigned uid = 0;
>> +  basic_block bb = bbs[i];
>> +
>> +  for (gimple_stmt_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);
>> +  gsi_next (&bsi))
>> +   {
>> + gimple *stmt = gsi_stmt (bsi);
>> + if (!virtual_operand_p (gimple_phi_result (as_a <gphi *> (stmt))))
>> +   gimple_set_uid (stmt, uid);
>> +   }
>> +
>> +  for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi);
>> +  gsi_next (&bsi))
>> +   {
>> + gimple *stmt = gsi_stmt (bsi);
>> + if (gimple_code (stmt) != GIMPLE_LABEL && !is_gimple_debug (stmt))
>> +   gimple_set_uid (stmt, ++uid);
>> +   }
>>
>>   for (gimple_stmt_iterator bsi = gsi_start_nondebug_after_labels_bb (bb);
>>        !gsi_end_p (bsi);
>>        gsi_next_nondebug (&bsi))
>>     gimple_set_uid (gsi_stmt (bsi), ++uid);
>
> Or even better instead of the whole loop
>
> renumber_gimple_stmt_uids_in_blocks (bbs, loop->num_nodes);
>
> Ok with that change.
Right, here is the updated patch.  Will commit it later.

Thanks,
bin
>
> Thanks,
> Richard.
>
>> thanks,
>>
>> +}
>> +  free (bbs);
>>
From 28a21f4a86ed4e1b5a174b004c45bd4b8ede944f Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 1 Nov 2017 17:43:55 +
Subject: [PATCH 2/2] pr82726-2017.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c |  26 ++
 gcc/tree-predcom.c  | 138 
 2 files changed, 148 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
new file mode 100644
index 000..22bc59d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 --param tree-reassoc-width=4" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+#define N 40
+#define M 128
+unsigned int in[N+M];
+unsigned short out[N];
+
+/* Outer-loop vectorization. */
+
+void
+foo (){
+  int i,j;
+  unsigned int diff;
+
+  for (i = 0; i < N; i++) {
+diff = 0;
+for (j = 0; j < M; j+=8) {
+  diff += in[j+i];
+}
+out[i]=(unsigned short)diff;
+  }
+
+  return;
+}
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 24d7c9c..28dac82 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -1020,6 +1020,17 @@ order_drefs (const void *a, const void *b)
   return (*da)->pos - (*db)->pos;
 }
 
+/* Compares two drefs A and B by their position.  Callback for qsort.  */
+
+static int
+order_drefs_by_pos (const void *a, const void *b)
+{
+  const dref *const da = (const dref *) a;
+  const dref *const db = (const dref *) b;
+
+  return (*da)->pos - (*db)->pos;
+}
+
 /* Returns root of the CHAIN.  */
 
 static inline dref
@@ -2633,7 +2644,6 @@ combine_chains (chain_p ch1, chain_p ch2)
   bool swap = false;
   chain_p new_chain;
   unsigned i;
-  gimple *root_stmt;
   tree rslt_type = NULL_TREE;
 
   if (ch1 == ch2)
@@ -2675,31 +2685,55 @@ combine_chains (chain_p ch1, chain_p ch2)
   new_chain->refs.safe_push (nw);
 }
 
-  new_chain->has_max_use_after = false;
-  root_stmt = get_chain_root (new_chain)->stmt;
-  for (i = 1; new_chain->refs.iterate (i, &nw); i++)
-{
-  if (nw->distance == new_chain->length
-	  && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
-	{
-	  new_chain->has_max_use_after = true;
-	  break;
-	}
-}
-
   ch1->combined = true;
   ch2->combined = true;
   return new_chain;
 }
 
-/* Try to combine the CHAINS.  */
+/* Recursively update position information of all offspring chains to ROOT
+   chain's position information.  */
+
+static void
+update_pos_for_combined_chains (chain_p root)
+{
+  chain_p ch1 = root->ch1, ch2 = root->ch2;
+  dref ref, ref1, ref2;
+  for (unsigned j = 0; (root->refs.iterate (j, &ref)
+			&& ch1->refs.iterate (j, &ref1)
+			&& ch2->refs.iterate (j, &ref2)); ++j)
+ref1->pos = ref2->pos = ref->pos;
+
+  if (ch1->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch1);
+  if (ch2->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch2);
+}
+
+/* Returns true if statement S1 dominates statement S2.  */
+
+static bool
+pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  if (!bb1 || s1 == s2)
+return true;
+
+  if (bb1 == bb2)
+return gimple_uid (s1) < gimple_uid (s2);
+
+  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
+}
+
+/* Try to 

Re: [PATCH] Fix use-after-free in the strlen pass (PR tree-optimization/82977)

2017-11-15 Thread Martin Sebor

On 11/15/2017 01:28 AM, Richard Biener wrote:
> On Tue, 14 Nov 2017, Jeff Law wrote:
>
> > On 11/14/2017 02:30 PM, Jakub Jelinek wrote:
> > > On Tue, Nov 14, 2017 at 02:24:28PM -0700, Martin Sebor wrote:
> > > > On 11/14/2017 02:04 PM, Jakub Jelinek wrote:
> > > > > Hi!
> > > > >
> > > > > strlen_to_stridx.get (rhs1) returns an address into the hash_map, and
> > > > > strlen_to_stridx.put (lhs, *ps); (in order to be efficient) doesn't make a
> > > > > copy of the argument just in case, first inserts the slot into it which
> > > > > may cause reallocation, and only afterwards runs the copy ctor to assign
> > > > > the value into the new slot.  So, passing it a reference to something
> > > > > in the hash_map is wrong.  Fixed thusly, bootstrapped/regtested on
> > > > > x86_64-linux and i686-linux, ok for trunk?
> > > >
> > > > This seems like an unnecessary gotcha that should be possible to
> > > > avoid in the hash_map.  The corresponding standard containers
> > > > require it to work and so it's surprising when it doesn't in GCC.
> > > >
> > > > I've been looking at how this is implemented and it seems to me
> > > > that a fix should be doable by having the hash_map check to see if
> > > > the underlying table needs to expand and if so, create a temporary
> > > > copy of the element before reallocating it.
> > >
> > > That would IMHO just slow down and enlarge the hash_map for all users,
> > > even when most of them don't really need it.
> > > While it is reasonable for STL containers to make sure it works, we
> > > aren't using STL containers and can pose additional restrictions.
> >
> > But when we make our containers behave differently than the STL it makes
> > it much easier for someone to make a mistake such as this one.
> >
> > IMHO this kind of difference in behavior is silly and long term just
> > makes our jobs harder.
> >
> > I'd vote for fixing our containers.
>
> I'd argue that this is simply a programming error and I doubt the
> libstdc++ variant works by design/specification.

It's by design.  You can find the discussion of this very issue
in C++ standard library issue 526:

  http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#526

Martin
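
For readers following the thread, the hazard and the container-side remedy
discussed above (copy the element before the table may be reallocated) can
be sketched in plain C.  This is a toy growable array, not GCC's hash_map;
int_vec and vec_push are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* A toy growable array of ints; a stand-in for a table that may
   reallocate its storage on insertion.  */
struct int_vec
{
  int *data;
  size_t len;
  size_t cap;
};

/* Append *VALUE to V.  The value is copied into a local before any
   reallocation, so VALUE may safely point into V->data.
   Returns 0 on success, -1 if memory could not be allocated.  */
int
vec_push (struct int_vec *v, const int *value)
{
  int tmp = *value;

  if (v->len == v->cap)
    {
      size_t new_cap = v->cap ? v->cap * 2 : 1;
      int *p = realloc (v->data, new_cap * sizeof *p);
      if (p == NULL)
        return -1;
      v->data = p;
      v->cap = new_cap;
    }
  v->data[v->len++] = tmp;
  return 0;
}
```

Reading *VALUE only after the realloc call would be the analogue of the bug
fixed in the strlen pass: the pointer could refer to storage that has already
been freed.  Copying at the call site instead, as the committed fix does,
avoids the same problem without changing the container.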


Re: [PATCH][ARM] Fix more -Wreturn-type fallout

2017-11-15 Thread Kyrill Tkachov

Hi Sudi,

On 10/11/17 17:06, Sudi Das wrote:


Hi

This patch fixes a couple of more tests that are giving out warnings 
with -Wreturn-type:

- g++.dg/ext/pr57735.C
- gcc.target/arm/pr54300.C



Thank you for the patch.
I've committed it on your behalf with r254773.

Kyrill


*** gcc/testsuite/ChangeLog ***

2017-11-10  Sudakshina Das  

* g++.dg/ext/pr57735.C: Add -Wno-return-type for test.
* gcc.target/arm/pr54300.C (main): Add return type and
return a value.




[testsuite, committed] Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15 Thread Tom de Vries
[ Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy 
and strncat (PR 81117) ]


On 08/06/2017 10:07 PM, Martin Sebor wrote:

Part 3 of the series contains the meat of the patch: the new
-Wstringop-truncation option, and enhancements to -Wstringop-
overflow, and -Wpointer-sizeof-memaccess to detect misuses of
strncpy and strncat.

Martin

gcc-81117-3.diff


PR c/81117 - Improve buffer overflow checking in strncpy




gcc/testsuite/ChangeLog:

PR c/81117
* c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
* c-c++-common/Wstringop-overflow.c: Same.
* c-c++-common/Wstringop-truncation.c: Same.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust.
* c-c++-common/attr-nonstring-2.c: New test.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
* gcc.dg/torture/pr63554.c: Same.
* gcc.dg/Walloca-1.c: Disable macro tracking.



Hi,

this also caused a regression in strncpy-fix-1.c. I noticed it for nvptx 
 (but I also saw it in other test results, f.i. for 
x86_64-unknown-freebsd12.0 at 
https://gcc.gnu.org/ml/gcc-testresults/2017-11/msg01276.html ).


On linux you don't see this unless you add -Wsystem-headers:
...
$ gcc src/gcc/testsuite/gcc.dg/strncpy-fix-1.c 
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -Wall 
-Wsystem-headers -S -o strncpy-fix-1.s

In file included from /usr/include/string.h:630,
 from src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:6:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c: In function ‘f’:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:10:3: warning: 
‘__builtin_strncpy’ output truncated before terminating nul copying 2 
bytes from a string of the same length [-Wstringop-truncation]

...

Fixed by adding -Wno-stringop-truncation.

Committed as obvious.

Thanks,
- Tom
Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15  Tom de Vries  

	* gcc.dg/strncpy-fix-1.c: Add -Wno-stringop-truncation to dg-options.

---
 gcc/testsuite/gcc.dg/strncpy-fix-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/strncpy-fix-1.c b/gcc/testsuite/gcc.dg/strncpy-fix-1.c
index b8bc916..b4fd4aa 100644
--- a/gcc/testsuite/gcc.dg/strncpy-fix-1.c
+++ b/gcc/testsuite/gcc.dg/strncpy-fix-1.c
@@ -1,7 +1,7 @@
 /* Test that use of strncpy does not result in a "value computed is
not used" warning.  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wall" } */
+/* { dg-options "-O2 -Wall -Wno-stringop-truncation" } */
 
 #include <string.h>
 void


[PATCH 1/3][middle-end]PR78809 (Inline strcmp with small constant strings)

2017-11-15 Thread Qing Zhao
Hi,

This is the first patch for PR78809 (three patches in total):

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
inline strcmp with small constant strings

The design doc is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

This patch implements the first part of the change:

A. for strncmp (s1, s2, n)
 if one of "s1" or "s2" is a constant string, "n" is a constant, and
larger than the length of the constant string:
 change strncmp (s1, s2, n) to strcmp (s1, s2);
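
As a small illustration of why transformation A is safe (a sketch only, not
part of the patch): when n is larger than the length of the constant string,
the bounded comparison always reaches the constant's terminating NUL within
n bytes, so it stops at the same position as strcmp.  The helper names sign
and same_result below are hypothetical.

```c
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1; only the sign is specified
   by the C standard.  */
int
sign (int v)
{
  return (v > 0) - (v < 0);
}

/* Return 1 if strncmp (s, "fis", 4) and strcmp (s, "fis") agree in sign.
   Here 4 > strlen ("fis") == 3, so the length bound never cuts the
   comparison short.  */
int
same_result (const char *s)
{
  return sign (strncmp (s, "fis", 4)) == sign (strcmp (s, "fis"));
}
```

For example, same_result ("fish"), same_result ("fis") and same_result ("")
all hold.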

It adds the test case strcmpopt_1.c to gcc.dg.

Bootstrapped and tested on both x86 and aarch64, with no regressions.

Okay for commit?

thanks.

Qing

==

gcc/ChangeLog

2017-11-15  Qing Zhao  

   * gimple-fold.c (gimple_fold_builtin_string_compare): Add handling
   of replacing call to strncmp with corresponding call to strcmp when
   meeting conditions.

gcc/testsuite/ChangeLog

2017-11-15  Qing Zhao  

   PR middle-end/78809
   * gcc.dg/strcmpopt_1.c: New test.

---
 gcc/gimple-fold.c  | 15 +++
 gcc/testsuite/gcc.dg/strcmpopt_1.c | 28 
 2 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index adb6f3b..1ed6383 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2258,6 +2258,21 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
   return true;
 }
 
+  /* If length is larger than the length of one constant string, 
+ replace strncmp with corresponding strcmp */ 
+  if (fcode == BUILT_IN_STRNCMP 
+  && length > 0
+  && ((p2 && (size_t) length > strlen (p2)) 
+  || (p1 && (size_t) length > strlen (p1))))
+{
+  tree fn = builtin_decl_implicit (BUILT_IN_STRCMP);
+  if (!fn)
+return false;
+  gimple *repl = gimple_build_call (fn, 2, str1, str2);
+  replace_call_with_call_and_fold (gsi, repl);
+  return true;
+}
+
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_1.c b/gcc/testsuite/gcc.dg/strcmpopt_1.c
new file mode 100644
index 000..40596a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_1.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-fdump-tree-gimple" } */
+
+#include <string.h>
+#include <stdlib.h>
+
+int cmp1 (char *p)
+{
+  return strncmp (p, "fis", 4);
+}
+int cmp2 (char *q)
+{
+  return strncmp ("fis", q, 4);
+}
+
+int main ()
+{
+
+  char *p = "fish";
+  char *q = "fis\0";
+
+  if (cmp1 (p) == 0 || cmp2 (q) != 0)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp \\(" 2 "gimple" } } */
-- 
1.9.1



Re: [PATCH] Canonicalize constant multiplies in division

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra  
> wrote:

>>  (if (flag_reciprocal_math)
>> - /* Convert (A/B)/C to A/(B*C)  */
>> + /* Convert (A/B)/C to A/(B*C). */
>>   (simplify
>>    (rdiv (rdiv:s @0 @1) @2)
>> -   (rdiv @0 (mult @1 @2)))
>> +  (rdiv @0 (mult @1 @2)))
>> +
>> + /* Canonicalize x / (C1 * y) to (x * C2) / y.  */
>> + (if (optimize)
>
> why if (optimize) here?  The pattern you removed has no
> such check.  As discussed this may undo CSE of C1 * y
> so please check for a single-use on the mult with :s

I think that came from an earlier version of this patch. I've removed it
and added a single use check.

>> +  (simplify
>> +   (rdiv @0 (mult @1 REAL_CST@2))
>> +   (if (!real_zerop (@1))
>
> why this check?  The pattern below didn't have it.

Presumably to avoid the change when dividing by zero. I've removed it, here is
the updated version. This passes bootstrap and regress:


ChangeLog
2017-11-15  Wilco Dijkstra    
Jackson Woodruff  

gcc/
PR 71026/tree-optimization
* match.pd: Canonicalize constant multiplies in division.

gcc/testsuite/
PR 71026/tree-optimization
* gcc.dg/cse_recip.c: New test.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index b5042b783c0830a2da08c44bed39842a17911844..ea7d90ed977cfff991d74bee54e91ecb209b6030 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -344,10 +344,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (negate @0)))
 
 (if (flag_reciprocal_math)
- /* Convert (A/B)/C to A/(B*C)  */
+ /* Convert (A/B)/C to A/(B*C). */
  (simplify
   (rdiv (rdiv:s @0 @1) @2)
-   (rdiv @0 (mult @1 @2)))
+  (rdiv @0 (mult @1 @2)))
+
+ /* Canonicalize x / (C1 * y) to (x * C2) / y.  */
+ (simplify
+  (rdiv @0 (mult:s @1 REAL_CST@2))
+  (with
+   { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
+   (if (tem)
+(rdiv (mult @0 { tem; } ) @1))))
 
  /* Convert A/(B/C) to (A/B)*C  */
  (simplify
@@ -646,15 +654,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (tem)
  (rdiv { tem; } @1)
 
-/* Convert C1/(X*C2) into (C1/C2)/X  */
-(simplify
- (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
-  (if (flag_reciprocal_math)
-   (with
-{ tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
-(if (tem)
- (rdiv { tem; } @1)))))
-
 /* Simplify ~X & X as zero.  */
 (simplify
  (bit_and:c (convert? @0) (convert? (bit_not @0)))
diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
new file mode 100644
index 
..88cba9930c0eb1fdee22a797eff110cd9a14fcda
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cse_recip.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized-raw" } */
+
+void
+cse_recip (float x, float y, float *a)
+{
+  a[0] = y / (5 * x);
+  a[1] = y / (3 * x);
+  a[2] = y / x;
+}
+
+/* { dg-final { scan-tree-dump-times "rdiv_expr" 1 "optimized" } } */





[PATCH] Improve -Wmaybe-uninitialized documentation

2017-11-15 Thread Jonathan Wakely

The docs for -Wmaybe-uninitialized have some issues:

- That first sentence is looong.
- Apparently some C++ programmers think "automatic variable" means one
 declared with C++11 `auto`, rather than simply a local variable.
- The sentence about only warning when optimizing is stuck in between
 two chunks talking about longjmp, which could be inferred to mean
 only the setjmp/longjmp part of the warning depends on optimization.

This attempts to make it easier to parse and understand.

OK for trunk?

commit a923e297acfd7c0ca3d3820463450f38230ab4ea
Author: Jonathan Wakely 
Date:   Wed Nov 15 14:25:09 2017 +

Improve -Wmaybe-uninitialized documentation

* doc/invoke.texi (-Wmaybe-uninitialized): Rephrase more accurately.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 44273284483..fac4122fe3e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4970,14 +4970,17 @@ void store (int *i)
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized
-For an automatic variable, if there exists a path from the function
-entry to a use of the variable that is initialized, but there exist
-some other paths for which the variable is not initialized, the compiler
-emits a warning if it cannot prove the uninitialized paths are not
-executed at run time. These warnings are made optional because GCC is
-not smart enough to see all the reasons why the code might be correct
-in spite of appearing to have an error.  Here is one example of how
-this can happen:
+Warn if there exists a path from entry to a function to a use of an automatic
+(i.e.@ local) variable, for which the variable is not initialized, and the
+compiler cannot prove that the uninitialized path will not be executed at run
+time.
+
+These warnings are only possible in optimizing compilation, because otherwise
+GCC does not keep track of the state of variables.
+
+These warnings are optional because GCC is not smart enough to see all the
+reasons why the code might be correct in spite of appearing to have an error.
+Here is one example of how this can happen:
 
 @smallexample
 @group
@@ -5003,19 +5006,15 @@ warning, you need to provide a default case with assert(0) or
 similar code.
 
 @cindex @code{longjmp} warnings
-This option also warns when a non-volatile automatic variable might be
-changed by a call to @code{longjmp}.  These warnings as well are possible
-only in optimizing compilation.
-
-The compiler sees only the calls to @code{setjmp}.  It cannot know
-where @code{longjmp} will be called; in fact, a signal handler could
-call it at any point in the code.  As a result, you may get a warning
-even when there is in fact no problem because @code{longjmp} cannot
-in fact be called at the place that would cause a problem.
+This option also warns when a non-volatile automatic variable might be changed
+by a call to @code{longjmp}.  The compiler sees only the calls to
+@code{setjmp}.  It cannot know where @code{longjmp} will be called; in fact, a
+signal handler could call it at any point in the code.  As a result, you may
+get a warning even when there is in fact no problem because @code{longjmp}
+cannot in fact be called at the place that would cause a problem.
 
 Some spurious warnings can be avoided if you declare all the functions
-you use that never return as @code{noreturn}.  @xref{Function
-Attributes}.
+you use that never return as @code{noreturn}.  @xref{Function Attributes}.
 
 This warning is enabled by @option{-Wall} or @option{-Wextra}.
 


Re: [PATCH] Fix pr81706 tests on darwin

2017-11-15 Thread Dominique d'Humières
Committed as revision r254770.

Thanks for the review.

Dominique

> Le 13 nov. 2017 à 18:26, Mike Stump  a écrit :
> 
> On Nov 12, 2017, at 6:05 AM, Dominique d'Humières  wrote:
>> 
>> The following patch fixes pr81706 tests on darwin
>> 
>> --- ../_clean/gcc/testsuite/gcc.target/i386/pr81706.c 2017-10-26 07:16:18.0 +0200
>> +++ gcc/testsuite/gcc.target/i386/pr81706.c  2017-11-11 16:02:36.0 +0100
>> @@ -1,8 +1,8 @@
>> /* PR libstdc++/81706 */
>> /* { dg-do compile } */
>> /* { dg-options "-O3 -mavx2 -mno-avx512f" } */
>> -/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
>> -/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
>> +/* { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_cos" } } */
>> +/* { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_sin" } } */
>> 
>> #ifdef __cplusplus
>> extern "C" {
>> --- ../_clean/gcc/testsuite/g++.dg/ext/pr81706.C 2017-10-26 07:16:21.0 +0200
>> +++ gcc/testsuite/g++.dg/ext/pr81706.C   2017-11-09 21:41:36.0 +0100
>> @@ -1,8 +1,8 @@
>> // PR libstdc++/81706
>> // { dg-do compile { target i?86-*-* x86_64-*-* } }
>> // { dg-options "-O3 -mavx2 -mno-avx512f" }
>> -// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
>> -// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
>> +// { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_cos" } }
>> +// { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_sin" } }
>> 
>> #ifdef __cplusplus
>> extern "C" {
>> 
>> Is it OK?
> 
> Ok.



Re: Add __builtin_tgmath for better tgmath.h implementation (bug 81156)

2017-11-15 Thread Joseph Myers
On Wed, 15 Nov 2017, Richard Biener wrote:

> Thanks - I suppose we can't avoid the repeated expansion by sth like
> 
> #define exp(Val) ({ __typeof__ Val tem = Val; __TGMATH_UNARY_REAL_IMAG
> (tem, exp, cexp); })

Well, that still expands its argument twice.  You'd need to use 
__auto_type to avoid the double expansion.  And then you'd still have 
extremely complicated expansions (that are correspondingly unfriendly if a 
user makes a mistake with a call, e.g. an argument of unsupported type), 
and complications around getting the right semantics when decimal floating 
point is involved.  And use of ({ }) doesn't work in sizeof outside 
functions.  And that wouldn't help with cases such as 
__STDC_TGMATH_OPERATOR_EVALUATION__, whereas it would actually be easy to 
add __builtin_tgmath_operator that's handled the same as __builtin_tgmath 
but ends up calling a function based on evaluation formats and producing 
an EXCESS_PRECISION_EXPR.

(Clang overloadable functions in C don't avoid the multiple expansion 
either, or at least Clang's tgmath.h doesn't.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c

2017-11-15 Thread Tom de Vries

Hi,

I noticed that there is only one asyncwait testcase for C on trunk.

I've rewritten asyncwait-{1,2,3}.f90 into C (and changed the float math 
into int math to keep things as simple as possible).


Tested on top of trunk for host.

Tested on top of trunk, gcc-7-branch, openacc-gcc-7-branch, 
gomp-4-branch for nvptx.


On trunk for nvptx, I'm seeing execution failures at -O2. I've verified 
that I see the same failures with all the async and wait clauses 
removed. Also, it's not the only failure at -O2 for trunk, so that's 
probably some orthogonal issue.


Committed as obvious.

Thanks,
- Tom
Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c

2017-11-15  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: New test, copied
	from asyncwait-1.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: New test, copied
	from asyncwait-2.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: New test, copied
	from asyncwait-3.f90.  Rewrite into C.  Rewrite from float to int.

---
 .../libgomp.oacc-c-c++-common/f-asyncwait-1.c  | 297 +
 .../libgomp.oacc-c-c++-common/f-asyncwait-2.c  |  61 +
 .../libgomp.oacc-c-c++-common/f-asyncwait-3.c  |  63 +
 3 files changed, 421 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
new file mode 100644
index 000..cf85170
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
@@ -0,0 +1,297 @@
+/* { dg-do run } */
+
+/* Based on asyncwait-1.f90.  */
+
+#include <stdlib.h>
+
+#define N 64
+
+int
+main (void)
+{
+  int *a, *b, *c, *d, *e;
+
+  a = (int*)malloc (N * sizeof (*a));
+  b = (int*)malloc (N * sizeof (*b));
+  c = (int*)malloc (N * sizeof (*c));
+  d = (int*)malloc (N * sizeof (*d));
+  e = (int*)malloc (N * sizeof (*e));
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc parallel async
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 3)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 2) abort ();
+  if (b[i] != 2) abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+  c[i] = 0;
+  d[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N])
+  {
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  c[i] = (a[i] * 4) / a[i];
+
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 9)
+	abort ();
+  if (c[i] != 4)
+	abort ();
+  if (d[i] != 1)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+  c[i] = 0;
+  d[i] = 0;
+  e[i] = 0;
+}
+
+#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N])
+  {
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  c[i] = (a[i] * 4) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+
+#pragma acc parallel wait (1) async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  e[i] = a[i] + b[i] + c[i] + d[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 2)
+	abort ();
+  if (b[i] != 4)
+	abort ();
+  if (c[i] != 4)
+	abort ();
+  if (d[i] != 1)
+	abort ();
+  if (e[i] != 11)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc kernels async
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 3)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc kernels async (1)

[PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread H.J. Lu
-mvzeroupper requests generation of the vzeroupper instruction.  If neither
it nor -mno-vzeroupper is given, the default should depend on
!TARGET_AVX512ER.  Users can always pass -mvzeroupper or -mno-vzeroupper to
override the default.
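
The intended behaviour can be modelled with a small standalone sketch.  This
is not the i386.c code; struct vz_opts, its fields, and
apply_vzeroupper_default are hypothetical stand-ins for the target flags and
the option-override step.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant option state.  */
struct vz_opts
{
  bool vzeroupper;     /* current value of the flag */
  bool explicit_set;   /* true if the user passed either form of the option */
  bool avx512er;       /* true if the AVX512ER ISA is enabled */
};

/* Turn the flag on by default only when the user expressed no preference
   and AVX512ER is disabled; an explicit choice is left untouched.  */
void
apply_vzeroupper_default (struct vz_opts *o)
{
  if (!o->explicit_set && !o->avx512er)
    o->vzeroupper = true;
}
```

Under this model a KNL-like target (AVX512ER enabled) gets no vzeroupper
unless the user asks for it, and any other AVX target gets it unless the user
turns it off.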

Sebastian, can you run the full test with it?

OK for trunk if there is no regression?

Thanks.

H.J.
---
gcc/

PR target/82990
* config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
TARGET_AVX512ER check.
(ix86_option_override_internal): Set MASK_VZEROUPPER if
neither -mvzeroupper nor -mno-vzeroupper is used and AVX512ER is
disabled.

gcc/testsuite/

PR target/82990
* gcc.target/i386/pr82990-1.c: New test.
* gcc.target/i386/pr82990-2.c: Likewise.
* gcc.target/i386/pr82990-3.c: Likewise.
* gcc.target/i386/pr82990-4.c: Likewise.
---
 gcc/config/i386/i386.c|  5 +++--
 gcc/testsuite/gcc.target/i386/pr82990-1.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr82990-2.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-3.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-4.c |  6 ++
 5 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5e84a09954..2c729236a29 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2497,7 +2497,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return TARGET_AVX && !TARGET_AVX512ER
+  return TARGET_AVX
 && TARGET_VZEROUPPER && flag_expensive_optimizations
 && !optimize_size;
 }
@@ -4666,7 +4666,8 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_SEH && TARGET_CALL_MS2SYSV_XLOGUES)
 sorry ("-mcall-ms2sysv-xlogues isn%'t currently supported with SEH");
 
-  if (!(opts_set->x_target_flags & MASK_VZEROUPPER))
+  if (!(opts_set->x_target_flags & MASK_VZEROUPPER)
+  && !TARGET_AVX512ER_P (opts->x_ix86_isa_flags))
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-1.c b/gcc/testsuite/gcc.target/i386/pr82990-1.c
new file mode 100644
index 000..ff1d6d40eb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=knl -mvzeroupper" } */
+
+#include <immintrin.h>
+
+extern __m512d y, z;
+
+void
+pr82941 ()
+{
+  z = y;
+}
+
+/* { dg-final { scan-assembler-times "vzeroupper" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-2.c b/gcc/testsuite/gcc.target/i386/pr82990-2.c
new file mode 100644
index 000..0d3cb2333dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-2.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-vzeroupper" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-not "vzeroupper" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-3.c b/gcc/testsuite/gcc.target/i386/pr82990-3.c
new file mode 100644
index 000..201fa98d8d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -mavx512er -mvzeroupper -O2" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-times "vzeroupper" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-4.c b/gcc/testsuite/gcc.target/i386/pr82990-4.c
new file mode 100644
index 000..09f161c7291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-4.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -mno-avx512er -mno-vzeroupper -O2" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-not "vzeroupper" } } */
-- 
2.13.6



Re: [PATCH, rs6000] (v2) GIMPLE folding for vector compares

2017-11-15 Thread Segher Boessenkool
Hi Will,

On Tue, Nov 14, 2017 at 04:11:34PM -0600, Will Schmidt wrote:
>   Add support for gimple folding of vec_cmp_{eq,ge,gt,le,ne}
> for the integer data types.

The code looks fine, just some typographical stuff:

>   * config/rs6000/vsx.md (vcmpneb, vcmpneh, vcmpnew): Update to specify 
>   the not+eq combination.

Trailing space.

> +/*  Helper function to handle the gimple folding of a vector compare
> +operation.  This sets up true/false vectors, and uses the
> +VEC_COND_EXPR operation.
> +'code' indicates which comparison is to be made. (EQ, GT, ...).
> +'type' indicates the type of the result.  */

One space less in the comment indent.  Names of parameters are written in
CAPS, no quotes.

> +static void
> +fold_compare_helper (gimple_stmt_iterator *gsi, tree_code code, gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g = gimple_build_assign (lhs,
> + fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1));

That's not the standard indenting.  Maybe break the statement to make
it easier?  I.e.

  tree cmp = fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1);
  gimple *g = gimple_build_assign (lhs, cmp);

> @@ -16701,10 +16731,67 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
>}
>  
> +/* Vector compares; EQ, NE, GE, GT, LE.  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:
> +case ALTIVEC_BUILTIN_VCMPEQUH:
> +case ALTIVEC_BUILTIN_VCMPEQUW:
> +case P8V_BUILTIN_VCMPEQUD:
> +  {
> + fold_compare_helper (gsi, EQ_EXPR, stmt);
> + return true;
> +  }

There's no need to make a block here (a bunch more of this later).

> @@ -18260,10 +18347,27 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
>  case MISC_BUILTIN_UNPACK_TD:
>  case MISC_BUILTIN_UNPACK_V1TI:
>h.uns_p[0] = 1;
>break;
>  
> +  /* unsigned arguments, bool return (compares).  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:

The comment indent is wrong.

>/* unsigned arguments for 128-bit pack instructions.  */
>  case MISC_BUILTIN_PACK_TD:

Here too, but that is existing code :-)

Okay for trunk with those trivialities cleaned up.  Thanks!


Segher


[PR c++/81574] lambda capture of function reference

2017-11-15 Thread Nathan Sidwell
This patch fixes 81574.  Even when the capture default is '=', a 
reference to a function is captured by reference.  The init-capture case 
captures a pointer, via auto deduction machinery.  AFAICT that's the 
correct behaviour.
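
To make the two cases concrete, here is a small sketch.  The names
capture_demo, bump and calls are made up for illustration and are not part
of the patch or its testcase; it assumes a C++14 compiler.

```cpp
#include <cassert>
#include <type_traits>

static int calls = 0;
static void bump () { ++calls; }

// Hypothetical demo; B is a reference to a function, as in the PR.
static int
capture_demo (void (&b) ())
{
  calls = 0;

  // Capture default '=': B is still captured by reference, so the
  // call below reaches the original function.
  auto by_default = [=] { b (); };

  // Init-capture: P is deduced as if by "auto p (b);", so the
  // function decays to a pointer to function.
  auto by_init = [=, p (b)] {
    static_assert (std::is_pointer<std::decay_t<decltype (p)>>::value,
		   "init-capture of a function yields a pointer");
    p ();
  };

  by_default ();
  by_init ();
  return calls;
}
```

Calling capture_demo (bump) runs the function once through each lambda.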


applying to trunk.

nathan
--
Nathan Sidwell
2017-11-15  Nathan Sidwell  

	PR c++/81574
	* lambda.c (lambda_capture_field_type): Function references are
	always captured by reference.

	PR c++/81574
	* g++.dg/cpp1y/pr81574.C: New.

Index: cp/lambda.c
===
--- cp/lambda.c	(revision 254740)
+++ cp/lambda.c	(working copy)
@@ -245,7 +245,8 @@ lambda_capture_field_type (tree expr, bo
 {
   type = non_reference (unlowered_expr_type (expr));
 
-  if (!is_this && by_reference_p)
+  if (!is_this
+	  && (by_reference_p || TREE_CODE (type) == FUNCTION_TYPE))
 	type = build_reference_type (type);
 }
 
Index: testsuite/g++.dg/cpp1y/pr81574.C
===
--- testsuite/g++.dg/cpp1y/pr81574.C	(revision 0)
+++ testsuite/g++.dg/cpp1y/pr81574.C	(working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++14 } }
+// PR c++/81574 references to functions are captured by reference.
+
+// 8.1.5.2/10
+// For each entity captured by copy, ... an lvalue reference to the
+// referenced function type if the entity is a reference to a function
+
+void f (void (&b)())
+{
+  [=] {  b; } ();
+  [=, b(f)] { b; } ();
+  [=, b(b)] { b; } ();
+}


Re: [PATCH] Fix test-suite fallout of default -Wreturn-type.

2017-11-15 Thread Jonathan Wakely

On 06/11/17 15:12 +0100, Martin Liška wrote:

On 11/06/2017 02:58 PM, Paolo Carlini wrote:

Hi,

On 06/11/2017 14:37, Martin Liška wrote:

Thank you for the patch.
I'm going to install the remaining part that will fix x86_64 fallout.
All changes are quite obvious, so hope it's fine to install it.

I think so. Thanks.

Note that the 3 additional libstdc++-v3 changes aren't really necessary,
but those testcases are failing, seg faulting, at run time for unrelated
reasons. I don't know if Jonathan is already on that...

Paolo.


Right, adding "return 0;" to main() is just noise, it does nothing.


You're right, it started right when it was introduced in r254008.

I see:

g++ libstdc++-v3/testsuite/27_io/basic_ifstream/cons/char/path.cc -std=gnu++17 -I. -lstdc++fs && ./a.out
libstdc++-v3/testsuite/27_io/basic_ifstream/cons/char/path.cc:33: void test01(): Assertion 'f.is_open()' failed.
Aborted (core dumped)


I think that was PR libstdc++/82917 so should be fixed.



Re: [patch] backwards threader cleanups

2017-11-15 Thread Pedro Alves
On 11/15/2017 07:34 AM, Aldy Hernandez wrote:
> 
> 
> On 11/14/2017 02:38 PM, David Malcolm wrote:
>> On Tue, 2017-11-14 at 14:08 -0500, Aldy Hernandez wrote:
> 
>>https://gcc.gnu.org/codingconventions.html#Class_Form
>> says that:
>>
>> "When defining a class, first [...]
>> declare all public member functions,
>> [...]
>> then declare all non-public member functions, and
>> then declare all non-public member variables."
> 
> Wow, I did not expect that order.  Fixed.

...

>> (Is this a self-assign from this->speed_p? should the "speed_p" param
>> be renamed, e.g. to "speed_p_")
> 
> Yes.  Fixed.

The convention also says:

"When structs and/or classes have member functions, prefer to name
data members with a leading m_".

So in this case, the preference would be to rename this->speed_p to
m_speed_p instead.
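
A minimal sketch of what that looks like, combining the member ordering
quoted above with the m_ prefix.  The class name and accessor are
hypothetical and only illustrate the convention; they are not taken from
the threader patch.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate the naming and ordering.
class thread_jumper_sketch
{
public:
  // With a data member named speed_p, writing "speed_p = speed_p;" in
  // the constructor body would assign the parameter to itself.  The m_
  // prefix removes that ambiguity.
  explicit thread_jumper_sketch (bool speed_p) : m_speed_p (speed_p) {}

  bool optimizing_for_speed () const { return m_speed_p; }

private:
  // Non-public member variables come last.
  bool m_speed_p;
};
```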

Thanks,
Pedro Alves



lambda-switch regression

2017-11-15 Thread Nathan Sidwell
g++.dg/lambda/lambda-switch.C has recently regressed.  It appears the
location of a warning message has moved.


  l = []()  // { dg-warning "statement will never be executed" }
{
case 3: // { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the first.
That does not seem useful.


I've not investigated what patch may have caused this, on the chance 
someone might already know?


nathan
--
Nathan Sidwell


RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Richard Biener
On Wed, 15 Nov 2017, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: Wednesday, November 15, 2017 08:24
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > i...@airs.com
> > Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> > supports unaligned access [Patch (1/2)]
> > 
> > On Tue, 14 Nov 2017, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch allows larger bitsizes to be used as copy size when the
> > > target does not have SLOW_UNALIGNED_ACCESS.
> > >
> > > fun3:
> > >   adrpx2, .LANCHOR0
> > >   add x2, x2, :lo12:.LANCHOR0
> > >   mov x0, 0
> > >   sub sp, sp, #16
> > >   ldrhw1, [x2, 16]
> > >   ldrbw2, [x2, 18]
> > >   add sp, sp, 16
> > >   bfi x0, x1, 0, 8
> > >   ubfxx1, x1, 8, 8
> > >   bfi x0, x1, 8, 8
> > >   bfi x0, x2, 16, 8
> > >   ret
> > >
> > > is turned into
> > >
> > > fun3:
> > >   adrpx0, .LANCHOR0
> > >   add x0, x0, :lo12:.LANCHOR0
> > >   sub sp, sp, #16
> > >   ldrhw1, [x0, 16]
> > >   ldrbw0, [x0, 18]
> > >   strhw1, [sp, 8]
> > >   strbw0, [sp, 10]
> > >   ldr w0, [sp, 8]
> > >   add sp, sp, 16
> > >   ret
> > >
> > > which avoids the bfi's for a simple 3 byte struct copy.
> > >
> > > Regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and
> > no regressions.
> > >
> > > This patch is just splitting off from the previous combined patch with
> > > AArch64 and adding a testcase.
> > >
> > > I assume Jeff's ACK from
> > > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still valid as
> > the code did not change.
> > 
> > Given your no_slow_unalign isn't mode specific can't you use the existing
> > non_strict_align?
> 
> No because non_strict_align checks if the target supports unaligned access at 
> all,
> 
> This no_slow_unalign corresponds instead to the target slow_unaligned_access
> which checks that the access you want to make has a greater cost than doing an
> aligned access. ARM for instance always return 1 (value of STRICT_ALIGNMENT)
> for slow_unaligned_access while for non_strict_align it may return 0 or 1 
> based
> on the options provided to the compiler.
> 
> The problem is I have no way to test STRICT_ALIGNMENT or slow_unaligned_access
> So I had to hardcode some targets that I know it does work on.

I see.  But then the slow_unaligned_access implementation should use
non_strict_align as default somehow as SLOW_UNALIGNED_ACCESS is defaulted
to STRICT_ALIGN.

Given that SLOW_UNALIGNED_ACCESS has different values for different modes
it would also make sense to be more specific for the testcase in question,
like word_mode_slow_unaligned_access to tell this only applies to 
word_mode?
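
For readers without the testcase at hand, the kind of copy behind the
quoted fun3 assembly can be sketched as below.  This is an assumption
about its general shape (a 3-byte struct returned by value from static
data), not the contents of gcc.dg/struct-simple.c; all names are made up.

```cpp
#include <cassert>

// Three single-byte fields, so no padding on the usual ABIs.
struct three_bytes
{
  unsigned char a, b, c;
};

static_assert (sizeof (three_bytes) == 3, "sketch assumes a 3-byte struct");

static const three_bytes table_sketch[] = { { 1, 2, 3 }, { 4, 5, 6 } };

// Returning the struct by value means the three bytes have to be
// assembled into a register-sized value; how wide the individual
// copies may be is what the patch changes.
three_bytes
fun3_sketch ()
{
  return table_sketch[1];
}
```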

Thanks,
Richard.

> Thanks,
> Tamar
> > 
> > Otherwise the expr.c change looks ok.
> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > >
> > > gcc/
> > > 2017-11-14  Tamar Christina  
> > >
> > >   * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > >   with fast unaligned access.
> > >   * doc/sourcebuild.texi (no_slow_unalign): New.
> > >
> > > gcc/testsuite/
> > > 2017-11-14  Tamar Christina  
> > >
> > >   * gcc.dg/struct-simple.c: New.
> > >   * lib/target-supports.exp
> > >   (check_effective_target_no_slow_unalign): New.
> > >
> > >
> > 
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH] Fix PR82985

2017-11-15 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to branch,
testcase also to trunk.

Richard.

2017-11-15  Richard Biener  

PR tree-optimization/82985
Backport from mainline
2017-08-15  Richard Biener  

PR tree-optimization/81790
* tree-ssa-sccvn.c (vn_lookup_simplify_result): Handle both
CONSTRUCTORs from simplifying and VN.

* gcc.dg/torture/pr81790.c: New testcase.
* g++.dg/torture/pr82985.C: Likewise.

Index: gcc/testsuite/gcc.dg/torture/pr81790.c
===
--- gcc/testsuite/gcc.dg/torture/pr81790.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr81790.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--param sccvn-max-scc-size=10" } */
+
+typedef int a __attribute__ ((__vector_size__ (16)));
+typedef struct
+{
+  a b;
+} c;
+
+int d, e;
+
+void foo (c *ptr);
+
+void bar ()
+{
+  double b = 1842.9028;
+  c g, h;
+  if (d)
+b = 77.7998;
+  for (; e;)
+{
+  g.b = g.b = g.b + g.b;
+  h.b = (a){b};
+  h.b = h.b + h.b;
+}
+  foo (&g);
+  foo (&h);
+}
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 254492)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -1643,13 +1643,25 @@ static vn_nary_op_t vn_nary_op_insert_st
 /* Hook for maybe_push_res_to_seq, lookup the expression in the VN tables.  */
 
 static tree
-vn_lookup_simplify_result (code_helper rcode, tree type, tree *ops)
+vn_lookup_simplify_result (code_helper rcode, tree type, tree *ops_)
 {
   if (!rcode.is_tree_code ())
 return NULL_TREE;
+  tree *ops = ops_;
+  unsigned int length = TREE_CODE_LENGTH ((tree_code) rcode);
+  if (rcode == CONSTRUCTOR
+  /* ???  We're arriving here with SCCVNs view, decomposed CONSTRUCTOR
+and GIMPLEs / match-and-simplifies, CONSTRUCTOR as GENERIC tree.  */
+  && TREE_CODE (ops_[0]) == CONSTRUCTOR)
+{
+  length = CONSTRUCTOR_NELTS (ops_[0]);
+  ops = XALLOCAVEC (tree, length);
+  for (unsigned i = 0; i < length; ++i)
+   ops[i] = CONSTRUCTOR_ELT (ops_[0], i)->value;
+}
   vn_nary_op_t vnresult = NULL;
-  return vn_nary_op_lookup_pieces (TREE_CODE_LENGTH ((tree_code) rcode),
-  (tree_code) rcode, type, ops, &vnresult);
+  return vn_nary_op_lookup_pieces (length, (tree_code) rcode,
+  type, ops, &vnresult);
 }
 
 /* Return a value-number for RCODE OPS... either by looking up an existing
Index: gcc/testsuite/g++.dg/torture/pr82985.C
===
--- gcc/testsuite/g++.dg/torture/pr82985.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr82985.C  (working copy)
@@ -0,0 +1,458 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+namespace std {
+template < typename _Default > struct __detector { using type = _Default; };
+template < typename _Default, template < typename > class >
+using __detected_or = __detector< _Default >;
+template < typename _Default, template < typename > class _Op >
+using __detected_or_t = typename __detected_or< _Default, _Op >::type;
+template < typename > struct iterator_traits;
+template < typename _Tp > struct iterator_traits< _Tp * > {
+  typedef _Tp reference;
+};
+} // std
+using std::iterator_traits;
+template < typename _Iterator, typename > struct __normal_iterator {
+  typename iterator_traits< _Iterator >::reference operator*();
+  void operator++();
+};
+template < typename _IteratorL, typename _IteratorR, typename _Container >
+int operator!=(__normal_iterator< _IteratorL, _Container >,
+   __normal_iterator< _IteratorR, _Container >);
+namespace std {
+template < typename _Tp > struct allocator { typedef _Tp value_type; };
+struct __allocator_traits_base {
+  template < typename _Tp > using __pointer = typename _Tp::pointer;
+};
+template < typename _Alloc > struct allocator_traits : __allocator_traits_base {
+  using pointer = __detected_or_t< typename _Alloc::value_type *, __pointer >;
+};
+} // std
+typedef double __m128d __attribute__((__vector_size__(16)));
+typedef double __m256d __attribute__((__vector_size__(32)));
+enum { InnerVectorizedTraversal, LinearVectorizedTraversal };
+enum { ReadOnlyAccessors };
+template < int, typename Then, typename > struct conditional {
+  typedef Then type;
+};
+template < typename Then, typename Else > struct conditional< 0, Then, Else > {
+  typedef Else type;
+};
+template < typename, typename > struct is_same {
+  enum { value };
+};
+template < typename T > struct is_same< T, T > {
+  enum { value = 1 };
+};
+template < typename > struct traits;
+struct accessors_level {
+  enum { has_direct_access, has_write_access, value };
+};
+template < typename > 

[PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Liška
On 11/06/2017 07:29 PM, Martin Sebor wrote:
> Sorry for being late with my comment.  I just spotted this minor
> formatting issue.  Even though GCC isn't (yet) consistent about
> it the keyword "constexpr" should be quoted in the error message
> below (and, eventually, in all diagnostic messages).  Since the
> patch has been committed by now this is just a reminder for us
> to try to keep this in mind in the future.

Hi.

I've prepared a patch for that. If it's desired, I can fix the test-suite fallout as a follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"

Thanks,
Martin
>From eb554d8778be239a2edb06d21f98bda7e5153765 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 08:41:12 +0100
Subject: [PATCH] Add quotes for constexpr keyword.

gcc/cp/ChangeLog:

2017-11-15  Martin Liska  

	* class.c (finalize_literal_type_property): Add quotes for
	constexpr keyword.
	(explain_non_literal_class): Likewise.
	* constexpr.c (ensure_literal_type_for_constexpr_object): Likewise.
	(is_valid_constexpr_fn): Likewise.
	(check_constexpr_ctor_body): Likewise.
	(register_constexpr_fundef): Likewise.
	(explain_invalid_constexpr_fn): Likewise.
	(cxx_eval_builtin_function_call): Likewise.
	(cxx_eval_call_expression): Likewise.
	(cxx_eval_loop_expr): Likewise.
	(potential_constant_expression_1): Likewise.
	* decl.c (check_previous_goto_1): Likewise.
	(check_goto): Likewise.
	(grokfndecl): Likewise.
	(grokdeclarator): Likewise.
	* error.c (maybe_print_constexpr_context): Likewise.
	* method.c (process_subob_fn): Likewise.
	(defaulted_late_check): Likewise.
	* parser.c (cp_parser_compound_statement): Likewise.
---
 gcc/cp/class.c |  4 ++--
 gcc/cp/constexpr.c | 35 ++-
 gcc/cp/decl.c  | 12 ++--
 gcc/cp/error.c |  4 ++--
 gcc/cp/method.c|  6 +++---
 gcc/cp/parser.c|  2 +-
 6 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 586a32c436f..529f37f24ee 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -5368,7 +5368,7 @@ finalize_literal_type_property (tree t)
 	  DECL_DECLARED_CONSTEXPR_P (fn) = false;
 	  if (!DECL_GENERATED_P (fn)
 	  && pedwarn (DECL_SOURCE_LOCATION (fn), OPT_Wpedantic,
-			  "enclosing class of constexpr non-static member "
+			  "enclosing class of %<constexpr%> non-static member "
 			  "function %q+#D is not a literal type", fn))
 	explain_non_literal_class (t);
 	}
@@ -5406,7 +5406,7 @@ explain_non_literal_class (tree t)
 {
   inform (UNKNOWN_LOCATION,
 	  "  %q+T is not an aggregate, does not have a trivial "
-	  "default constructor, and has no constexpr constructor that "
+	  "default constructor, and has no %<constexpr%> constructor that "
 	  "is not a copy or move constructor", t);
   if (type_has_non_user_provided_default_constructor (t))
 	/* Note that we can't simply call locate_ctor because when the
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index d6b6843e804..e0a4133d89b 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -94,8 +94,8 @@ ensure_literal_type_for_constexpr_object (tree decl)
 	{
 	  if (DECL_DECLARED_CONSTEXPR_P (decl))
 	{
-	  error ("the type %qT of constexpr variable %qD is not literal",
-		 type, decl);
+	  error ("the type %qT of %<constexpr%> variable %qD "
+		 "is not literal", type, decl);
 	  explain_non_literal_class (type);
 	}
 	  else
@@ -177,7 +177,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 {
   ret = false;
   if (complain)
-	error ("inherited constructor %qD is not constexpr",
+	error ("inherited constructor %qD is not %<constexpr%>",
 	   DECL_INHERITED_CTOR (fun));
 }
   else
@@ -189,7 +189,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	ret = false;
 	if (complain)
 	  {
-		error ("invalid type for parameter %d of constexpr "
+		error ("invalid type for parameter %d of %<constexpr%> "
 		   "function %q+#D", DECL_PARM_INDEX (parm), fun);
 		explain_non_literal_class (TREE_TYPE (parm));
 	  }
@@ -201,7 +201,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
   ret = false;
   if (complain)
 	inform (DECL_SOURCE_LOCATION (fun),
-		"lambdas are implicitly constexpr only in C++17 and later");
+		"lambdas are implicitly %<constexpr%> only in C++17 and later");
 }
   else if (!DECL_CONSTRUCTOR_P (fun))
 {
@@ -211,7 +211,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	  ret = false;
 	  if (complain)
 	{
-	  error ("invalid return type %qT of constexpr function %q+D",
+	  error ("invalid return type %qT of %<constexpr%> function %q+D",
 		 rettype, fun);
 	  explain_non_literal_class (rettype);
 	}
@@ -225,7 +225,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	  ret = false;
 	  if (complain
 	  && pedwarn (DECL_SOURCE_LOCATION (fun), OPT_Wpedantic,

Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Martin Liška
On 11/15/2017 11:04 AM, Jakub Jelinek wrote:
> On Wed, Nov 15, 2017 at 10:54:23AM +0100, Martin Liška wrote:
>> gcc/c/ChangeLog:
>>
>> 2017-11-15  Martin Liska  
>>
>>  * c-decl.c (grokdeclarator):
>>  Compare warn_return_type for greater than zero.
>>  (start_function): Likewise.
>>  (finish_function): Likewise.
>>  * c-typeck.c (c_finish_return): Likewise.
>>
>> gcc/cp/ChangeLog:
>>
>> 2017-11-15  Martin Liska  
>>
>>  * decl.c (finish_function):
>>  Compare warn_return_type for greater than zero.
>>  * semantics.c (finish_return_stmt): Likewise.
> 
> The c/cp changes aren't really needed, are they?  Because
> in that case you guarantee in the post options handling it is
> 0 or 1.

Yep, you're right!

> 
> The rest looks good (except for Ada that Eric doesn't want to change).
> 
>   Jakub
> 


Done that and I'm going to install the patch.
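
For context on why the remaining checks compare against zero: a plain
truth test treats a negative "not set" value the same as "enabled".  The
sketch below assumes the tri-state encoding suggested by the removed
Fortran code (-1 for not set, 0 for off, 1 for on); the function names
are illustrative only.

```cpp
#include <cassert>

// Assumed encoding: -1 = not set on the command line, 0 = disabled,
// 1 = enabled.

// Old-style check: any non-zero value counts as "warn".
inline bool
truthiness_check (int warn_return_type)
{
  return warn_return_type != 0;
}

// New-style check: only an explicit or defaulted "on" counts.
inline bool
positive_check (int warn_return_type)
{
  return warn_return_type > 0;
}
```

The two checks differ only for the -1 case, which is the one a front end
sees when it never assigns a default of its own.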

Martin
>From c0934d0be85d40762d4bafbf9991b167b711736e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 09:16:23 +0100
Subject: [PATCH] Disable -Wreturn-type by default in all languages other from
 C++.

gcc/ChangeLog:

2017-11-15  Martin Liska  

	* tree-cfg.c (pass_warn_function_return::execute):
	Compare warn_return_type for greater than zero.

gcc/fortran/ChangeLog:

2017-11-15  Martin Liska  

	* options.c (gfc_post_options):
	Do not set default value of warn_return_type.
	* trans-decl.c (gfc_trans_deferred_vars):
	Compare warn_return_type for greater than zero.
	(generate_local_decl): Likewise
	(gfc_generate_function_code): Likewise.
---
 gcc/fortran/options.c| 3 ---
 gcc/fortran/trans-decl.c | 8 
 gcc/tree-cfg.c   | 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index c584a19e559..0ee6b7808d9 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -435,9 +435,6 @@ gfc_post_options (const char **pfilename)
 gfc_fatal_error ("Maximum subrecord length cannot exceed %d",
 		 MAX_SUBRECORD_LENGTH);
 
-  if (warn_return_type == -1)
-warn_return_type = 0;
-
   gfc_cpp_post_options ();
 
   if (gfc_option.allow_std & GFC_STD_F2008)
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 8efaae79ebc..60e7d8f79ee 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4198,7 +4198,7 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block)
 		  break;
 	}
 	  /* TODO: move to the appropriate place in resolve.c.  */
-	  if (warn_return_type && el == NULL)
+	  if (warn_return_type > 0 && el == NULL)
 	gfc_warning (OPT_Wreturn_type,
 			 "Return value of function %qs at %L not set",
 			 proc_sym->name, &proc_sym->declared_at);
@@ -5619,7 +5619,7 @@ generate_local_decl (gfc_symbol * sym)
   else if (sym->attr.flavor == FL_PROCEDURE)
 {
   /* TODO: move to the appropriate place in resolve.c.  */
-  if (warn_return_type
+  if (warn_return_type > 0
 	  && sym->attr.function
 	  && sym->result
 	  && sym != sym->result
@@ -6494,11 +6494,11 @@ gfc_generate_function_code (gfc_namespace * ns)
   if (result == NULL_TREE || artificial_result_decl)
 	{
 	  /* TODO: move to the appropriate place in resolve.c.  */
-	  if (warn_return_type && sym == sym->result)
+	  if (warn_return_type > 0 && sym == sym->result)
 	gfc_warning (OPT_Wreturn_type,
 			 "Return value of function %qs at %L not set",
 			 sym->name, &sym->declared_at);
-	  if (warn_return_type)
+	  if (warn_return_type > 0)
 	TREE_NO_WARNING(sym->backend_decl) = 1;
 	}
   if (result != NULL_TREE)
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 9a2fa1d98ca..f08a0547f0f 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -9071,7 +9071,7 @@ pass_warn_function_return::execute (function *fun)
 
   /* If we see "return;" in some basic block, then we do reach the end
  without returning a value.  */
-  else if (warn_return_type
+  else if (warn_return_type > 0
 	   && !TREE_NO_WARNING (fun->decl)
 	   && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0
 	   && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fun->decl))))
-- 
2.14.3



Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-11-15 Thread Pierre-Marie de Rodat

On 11/15/2017 12:16 PM, Richard Biener wrote:

Is it still okay to commit to gcc-7, now?


Yes.


Done. Thank you!

--
Pierre-Marie de Rodat


[PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Tamar Christina
Hi All,

This patch updates the documentation for AArch64 and ARM, correcting the
architecture names by adding the -A suffix in appropriate places.

Built on aarch64-none-elf and arm-none-eabi with no issues.

Ok for trunk?

Thanks,
Tamar

gcc/
2017-11-15  Tamar Christina  

* doc/extend.texi: Add -A suffix (ARMv8*-A, ARMv7-A).
* doc/invoke.texi: Add -A suffix (ARMv8*-A, ARMv7-A).
* doc/sourcebuild.texi: Add -A suffix (ARMv8*-A, ARMv7-A).

-- 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 63b58c0681e856da7ecc8c57c5d2f43613389a1d..a7a1ffcb852749b4e39facb434b2feda3534e77b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1045,7 +1045,7 @@ expressions are automatically promoted to @code{float}.
 
 The ARM target provides hardware support for conversions between
 @code{__fp16} and @code{float} values
-as an extension to VFP and NEON (Advanced SIMD), and from ARMv8 provides
+as an extension to VFP and NEON (Advanced SIMD), and from ARMv8-A provides
 hardware support for conversions between @code{__fp16} and @code{double}
 values.  GCC generates code using these hardware instructions if you
 compile with options to select an FPU that provides them;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e897d93070ae320f741aeba4d2490f8366843935..b2f044cf5fb75c44a180b2231284882728248952 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15504,8 +15504,8 @@ entirely disabled by the @samp{+nofp} option that follows it.
 Most extension names are generically named, but have an effect that is
 dependent upon the architecture to which it is applied.  For example,
 the @samp{+simd} option can be applied to both @samp{armv7-a} and
-@samp{armv8-a} architectures, but will enable the original ARMv7
-Advanced SIMD (Neon) extensions for @samp{armv7-a} and the ARMv8-a
+@samp{armv8-a} architectures, but will enable the original ARMv7-A
+Advanced SIMD (Neon) extensions for @samp{armv7-a} and the ARMv8-A
 variant for @samp{armv8-a}.
 
 The table below lists the supported extensions for each architecture.
@@ -15646,7 +15646,7 @@ Disable the floating-point and Advanced SIMD instructions.
 @item +crc
 The Cyclic Redundancy Check (CRC) instructions.
 @item +simd
-The ARMv8 Advanced SIMD and floating-point instructions.
+The ARMv8-A Advanced SIMD and floating-point instructions.
 @item +crypto
 The cryptographic instructions.
 @item +nocrypto
@@ -15658,7 +15658,7 @@ Disable the floating-point, Advanced SIMD and cryptographic instructions.
 @item armv8.1-a
 @table @samp
 @item +simd
-The ARMv8.1 Advanced SIMD and floating-point instructions.
+The ARMv8.1-A Advanced SIMD and floating-point instructions.
 
 @item +crypto
 The cryptographic instructions.  This also enables the Advanced SIMD and
@@ -15678,7 +15678,7 @@ The half-precision floating-point data processing instructions.
 This also enables the Advanced SIMD and floating-point instructions.
 
 @item +simd
-The ARMv8.1 Advanced SIMD and floating-point instructions.
+The ARMv8.1-A Advanced SIMD and floating-point instructions.
 
 @item +crypto
 The cryptographic instructions.  This also enables the Advanced SIMD and
@@ -15754,7 +15754,7 @@ The Cyclic Redundancy Check (CRC) instructions.
 @item +fp.sp
 The single-precision FPv5 floating-point instructions.
 @item +simd
-The ARMv8 Advanced SIMD and floating-point instructions.
+The ARMv8-A Advanced SIMD and floating-point instructions.
 @item +crypto
 The cryptographic instructions.
 @item +nocrypto
@@ -16173,9 +16173,9 @@ Divided syntax should be considered deprecated.
 
 @item -mrestrict-it
 @opindex mrestrict-it
-Restricts generation of IT blocks to conform to the rules of ARMv8.
+Restricts generation of IT blocks to conform to the rules of ARMv8-A.
 IT blocks can only contain a single 16-bit instruction from a select
-set of instructions. This option is on by default for ARMv8 Thumb mode.
+set of instructions. This option is on by default for ARMv8-A Thumb mode.
 
 @item -mprint-tune-info
 @opindex mprint-tune-info
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index d5a90e518d67fb289c8caf2e8f2237970b6649ea..9bb14da1a6f6ec76de72a0927a17909c4d2f0ad5 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1714,11 +1714,11 @@ Some multilibs may be incompatible with these options.
 
 @item arm_v8_1a_neon_ok
 @anchor{arm_v8_1a_neon_ok}
-ARM target supports options to generate ARMv8.1 Adv.SIMD instructions.
+ARM target supports options to generate ARMv8.1-A Adv.SIMD instructions.
 Some multilibs may be incompatible with these options.
 
 @item arm_v8_1a_neon_hw
-ARM target supports executing ARMv8.1 Adv.SIMD instructions.  Some
+ARM target supports executing ARMv8.1-A Adv.SIMD instructions.  Some
 multilibs may be incompatible with the options needed.  Implies
 arm_v8_1a_neon_ok.
 
@@ -1727,34 +1727,34 @@ ARM target supports acquire-release instructions.
 
 @item arm_v8_2a_fp16_scalar_ok
 

Re: [Patch, fortran] PR78990 [5/6/7 Regression] ICE when assigning polymorphic array function result

2017-11-15 Thread Dominique d'Humières
Hi Paul,

Your patch fixes the ICE and passes the tests. However I see

At line 22 of file pr78990.f90
Fortran runtime error: Attempting to allocate already allocated variable 'return_t1'

for the original tests (with mold or source). This runtime error depends on
the options:

% gfc pr78990.f90
% a.out
At line 22 of file pr78990.f90
Fortran runtime error: Attempting to allocate already allocated variable 'return_t1'

Error termination. Backtrace:
…
% gfc pr78990.f90 -fno-backtrace
% a.out
   0   0   0
% gfc pr78990.f90 -m32
% a.out
   0   0   0
% gfc pr78990.f90 -O
% a.out
   0   0   0

The problem seems related to the line

  print*,v2%i

Cheers,

Dominique




RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Tamar Christina


> -----Original Message-----
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Wednesday, November 15, 2017 08:24
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> i...@airs.com
> Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> supports unaligned access [Patch (1/2)]
> 
> On Tue, 14 Nov 2017, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch allows larger bitsizes to be used as copy size when the
> > target does not have SLOW_UNALIGNED_ACCESS.
> >
> > fun3:
> > adrp x2, .LANCHOR0
> > add x2, x2, :lo12:.LANCHOR0
> > mov x0, 0
> > sub sp, sp, #16
> > ldrh w1, [x2, 16]
> > ldrb w2, [x2, 18]
> > add sp, sp, 16
> > bfi x0, x1, 0, 8
> > ubfx x1, x1, 8, 8
> > bfi x0, x1, 8, 8
> > bfi x0, x2, 16, 8
> > ret
> >
> > is turned into
> >
> > fun3:
> > adrp x0, .LANCHOR0
> > add x0, x0, :lo12:.LANCHOR0
> > sub sp, sp, #16
> > ldrh w1, [x0, 16]
> > ldrb w0, [x0, 18]
> > strh w1, [sp, 8]
> > strb w0, [sp, 10]
> > ldr w0, [sp, 8]
> > add sp, sp, 16
> > ret
> >
> > which avoids the bfi's for a simple 3 byte struct copy.
> >
> > Regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and
> no regressions.
> >
> > This patch is just splitting off from the previous combined patch with
> > AArch64 and adding a testcase.
> >
> > I assume Jeff's ACK from
> > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still valid as
> the code did not change.
> 
> Given your no_slow_unalign isn't mode specific can't you use the existing
> non_strict_align?

No, because non_strict_align checks whether the target supports unaligned
access at all.

no_slow_unalign corresponds instead to the target's slow_unaligned_access,
which checks whether the access you want to make costs more than doing an
aligned access. ARM, for instance, always returns 1 (the value of
STRICT_ALIGNMENT) for slow_unaligned_access, while for non_strict_align it may
return 0 or 1 depending on the options provided to the compiler.

The problem is that I have no way to test STRICT_ALIGNMENT or
slow_unaligned_access, so I had to hardcode some targets that I know it works on.
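
For reference, a minimal sketch in C of the kind of 3-byte struct copy under
discussion. The names (struct s3, table, get_s3, pack_bytewise, pack_via_stack,
demo_s3) are made up for illustration and are not the contents of
gcc.dg/struct-simple.c. The two pack_* helpers only model the two quoted code
sequences: inserting one byte at a time versus storing to a stack slot and
reloading it as one word. The sketch assumes unsigned int is at least 4 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 3-byte struct; not the patch's actual testcase.  */
struct s3 { unsigned char a, b, c; };

static const struct s3 table[2] = { { 1, 2, 3 }, { 4, 5, 6 } };

/* Returning the struct by value is the copy the patch is about.  */
struct s3
get_s3 (int i)
{
  return table[i];
}

/* Models the "before" sequence: insert one byte at a time.  */
unsigned int
pack_bytewise (const struct s3 *p)
{
  unsigned int r = 0;
  r |= (unsigned int) p->a;
  r |= (unsigned int) p->b << 8;
  r |= (unsigned int) p->c << 16;
  return r;
}

/* Models the "after" sequence: copy into a stack slot, then reload the
   slot as one word.  The slot is zeroed here so the result is defined;
   the byte order of the result follows the host.  */
unsigned int
pack_via_stack (const struct s3 *p)
{
  unsigned char slot[sizeof (unsigned int)] = { 0 };
  unsigned int r;
  memcpy (slot, p, sizeof *p);
  memcpy (&r, slot, sizeof r);
  return r;
}

/* Usage: the byte-wise result is the same on every host.  */
void
demo_s3 (void)
{
  struct s3 v = get_s3 (0);
  assert (pack_bytewise (&v) == 0x030201u);
}
```

On a little-endian host the two helpers should agree; on a big-endian host
pack_via_stack places the bytes at the other end of the word.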

Thanks,
Tamar
> 
> Otherwise the expr.c change looks ok.
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> >
> > gcc/
> > 2017-11-14  Tamar Christina  
> >
> > * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > with fast unaligned access.
> > * doc/sourcebuild.texi (no_slow_unalign): New.
> >
> > gcc/testsuite/
> > 2017-11-14  Tamar Christina  
> >
> > * gcc.dg/struct-simple.c: New.
> > * lib/target-supports.exp
> > (check_effective_target_no_slow_unalign): New.
> >
> >
> 
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)


Re: [AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-15 Thread James Greenhalgh
On Wed, Nov 15, 2017 at 09:58:28AM +, Kyrill Tkachov wrote:
> Hi Kugan,
> 
> On 07/11/17 04:10, Kugan Vivekanandarajah wrote:
> > Hi,
> >
> > Attached patch implements the  vld1_*_x2 intrinsics as defined by the
> > neon document.
> >
> > Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is
> > this OK for trunk if no regressions?
> >
> 
> This looks mostly ok to me (though I cannot approve) modulo a couple of 
> minor type issues below.

Thanks for the review Kyrill!

I'm happy to trust Kyrill's knowledge of the back-end here, so the patch
is OK with the changes Kyrill requested.

Thanks for the patch!

James

> > gcc/ChangeLog:
> >
> > 2017-11-06  Kugan Vivekanandarajah 
> >
> > * config/aarch64/aarch64-simd.md (aarch64_ld1x2): New.
> > (aarch64_ld1x2): Likewise.
> > (aarch64_simd_ld1_x2): Likewise.
> > (aarch64_simd_ld1_x2): Likewise.
> > * config/aarch64/arm_neon.h (vld1_u8_x2): New.
> > (vld1_s8_x2): Likewise.
> > (vld1_u16_x2): Likewise.
> > (vld1_s16_x2): Likewise.
> > (vld1_u32_x2): Likewise.
> > (vld1_s32_x2): Likewise.
> > (vld1_u64_x2): Likewise.
> > (vld1_s64_x2): Likewise.
> > (vld1_f16_x2): Likewise.
> > (vld1_f32_x2): Likewise.
> > (vld1_f64_x2): Likewise.
> > (vld1_p8_x2): Likewise.
> > (vld1_p16_x2): Likewise.
> > (vld1_p64_x2): Likewise.
> > (vld1q_u8_x2): Likewise.
> > (vld1q_s8_x2): Likewise.
> > (vld1q_u16_x2): Likewise.
> > (vld1q_s16_x2): Likewise.
> > (vld1q_u32_x2): Likewise.
> > (vld1q_s32_x2): Likewise.
> > (vld1q_u64_x2): Likewise.
> > (vld1q_s64_x2): Likewise.
> > (vld1q_f16_x2): Likewise.
> > (vld1q_f32_x2): Likewise.
> > (vld1q_f64_x2): Likewise.
> > (vld1q_p8_x2): Likewise.
> > (vld1q_p16_x2): Likewise.
> > (vld1q_p64_x2): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2017-11-06  Kugan Vivekanandarajah 
> >
> > * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test.
> 
> +__extension__ extern __inline int8x8x2_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vld1_s8_x2 (const uint8_t *__a)
> 
> This should be "const int8_t *"
> 
>   +{
> +  int8x8x2_t ret;
> +  __builtin_aarch64_simd_oi __o;
> +  __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) 
> __a);
> +  ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0);
> +  ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1);
> +  return ret;
> +}
> 
> ...
> 
> +__extension__ extern __inline int32x2x2_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vld1_s32_x2 (const uint32_t *__a)
> 
> Likewise, this should be "const int32_t *"
> 
> +{
> +  int32x2x2_t ret;
> +  __builtin_aarch64_simd_oi __o;
> +  __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) 
> __a);
> +  ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
> +  ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
> +  return ret;
> +}
> +
> 
> 
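
To illustrate the pointer-type point in the review, here is a portable sketch
that does not use arm_neon.h. struct s8x8x2, load_s8_x2 and demo_load are
hypothetical stand-ins for int8x8x2_t and vld1_s8_x2. With a const uint8_t *
parameter, a caller holding a const int8_t * buffer would need a cast (C
compilers typically warn about the sign mismatch, and C++ rejects it); with
const int8_t * the call is clean.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int8x8x2_t: two vectors of eight int8_t.  */
struct s8x8x2 { int8_t val[2][8]; };

/* Stand-in for vld1_s8_x2: the parameter is const int8_t *, matching
   the signed element type of the result.  */
struct s8x8x2
load_s8_x2 (const int8_t *p)
{
  struct s8x8x2 ret;
  memcpy (ret.val[0], p, 8);      /* first 8 elements */
  memcpy (ret.val[1], p + 8, 8);  /* next 8 elements */
  return ret;
}

void
demo_load (void)
{
  int8_t buf[16];
  for (int i = 0; i < 16; i++)
    buf[i] = (int8_t) (i - 8);         /* -8 .. 7 */
  struct s8x8x2 r = load_s8_x2 (buf);  /* no cast needed */
  assert (r.val[0][0] == -8 && r.val[1][7] == 7);
}
```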


Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 10:11 AM, Pierre-Marie de Rodat
 wrote:
> Hello Richard,
>
> On 09/25/2017 01:54 PM, Richard Biener wrote:
>>
>> Ok for trunk and gcc-7 branch after a while.
>
> Is it still okay to commit to gcc-7, now?

Yes.

Richard.

> --
> Pierre-Marie de Rodat


Re: [PATCH 02/14] Support for adding and stripping location_t wrapper nodes

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 7:17 AM, Trevor Saunders  wrote:
> On Fri, Nov 10, 2017 at 04:45:17PM -0500, David Malcolm wrote:
>> This patch provides a mechanism in tree.c for adding a wrapper node
>> for expressing a location_t, for those nodes for which
>> !CAN_HAVE_LOCATION_P, along with a new method of cp_expr.
>>
>> It's called in later patches in the kit via that new method.
>>
>> In this version of the patch, I use NON_LVALUE_EXPR for wrapping
>> constants, and VIEW_CONVERT_EXPR for other nodes.
>>
>> I also turned off wrapper nodes for EXCEPTIONAL_CLASS_P, for the sake
>> of keeping the patch kit more minimal.
>>
>> The patch also adds a STRIP_ANY_LOCATION_WRAPPER macro for stripping
>> such nodes, used later on in the patch kit.
>
> I happened to start reading this series near the end and was rather
> confused by this macro since it changes variables in a rather unhygienic
> way.  Did you consider just defining an inline function to return the
> actual decl?  It seems like it's not used that often so the slight extra
> syntax shouldn't be that big a deal compared to the explicitness.

Existing practice  (STRIP_NOPS & friends).  I'm fine either way,
the patch looks good.
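
To make the trade-off concrete, a minimal sketch in C with a toy node type.
struct node, STRIP_WRAPPER, strip_wrapper and demo_strip are made-up names for
illustration and are not GCC's tree API. The macro reassigns the caller's
variable in place, in the style of STRIP_NOPS; the inline function leaves it
alone and returns the stripped node.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a tree node; not GCC's real tree type.  */
struct node
{
  int is_wrapper;        /* nonzero if this node only wraps OPERAND */
  struct node *operand;  /* the wrapped node, if any */
  int value;
};

/* Macro style: silently reassigns the variable named by EXP.  */
#define STRIP_WRAPPER(EXP)                         \
  do                                               \
    {                                              \
      while ((EXP) != NULL && (EXP)->is_wrapper)   \
        (EXP) = (EXP)->operand;                    \
    }                                              \
  while (0)

/* Function style: the caller's variable is untouched.  */
static inline struct node *
strip_wrapper (struct node *exp)
{
  while (exp != NULL && exp->is_wrapper)
    exp = exp->operand;
  return exp;
}

void
demo_strip (void)
{
  struct node leaf = { 0, NULL, 42 };
  struct node wrapper = { 1, &leaf, 0 };
  struct node *p = &wrapper;

  assert (strip_wrapper (p)->value == 42);
  assert (p == &wrapper);   /* unchanged by the function */

  STRIP_WRAPPER (p);
  assert (p == &leaf);      /* changed by the macro */
}
```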

Eventually you can simplify things by doing less checking in
location_wrapper_p, like only checking

+inline bool location_wrapper_p (const_tree exp)
+{
+  if ((TREE_CODE (exp) == NON_LVALUE_EXPR
+   || (TREE_CODE (exp) == VIEW_CONVERT_EXPR
+  && (TREE_TYPE (exp)
+ == TREE_TYPE (TREE_OPERAND (exp, 0)))
+return true;
+  return false;
+}

and renaming to maybe_location_wrapper_p.  After all you can't really
distinguish location wrappers from non-location wrappers?  (and why
would you want to?)

Thanks,
Richard.

> Other than that the series seems reasonable, and I look forward to
> having wrappers in more places.  I seem to remember something I wanted
> to warn about they would make much easier.
>
> Thanks
>
> Trev
>


Re: Add __builtin_tgmath for better tgmath.h implementation (bug 81156)

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 2:54 AM, Joseph Myers  wrote:
> Various implementations of C99/C11 <tgmath.h> have the property that
> their macro expansions contain many copies of the macro arguments, so
> resulting in exponential blowup of the size of macro expansions where
> a call to such a macro contains other such calls in the macro
> arguments.
>
> This patch adds a (C-only) language feature __builtin_tgmath designed
> to avoid this problem by implementing the <tgmath.h> function
> selection rules directly in the compiler.  The effect is that
> type-generic macros can be defined simply as
>
> #define pow(a, b) __builtin_tgmath (powf, pow, powl, \
> cpowf, cpow, cpowl, a, b)
>
> as in the example added to the manual, with each macro argument
> expanded exactly once.  The details of __builtin_tgmath are as
> described in the manual.  This is C-only since C++ uses function
> overloading and just defines <ctgmath> to include <ccomplex> and
> <cmath>.
>
> __builtin_tgmath handles C99/C11 type-generic macros, and _FloatN,
> _FloatNx and decimal floating-point types (following the proposed
> resolution to the floating-point TS DR#9 that makes the rules for
> finding a common type from arguments to a type-generic macro follow
> the usual arithmetic conversions after adjustment of integer arguments
> to _Decimal64 or double - or to _Complex double in the case of GNU
> complex integer arguments).
>
> Type-generic macros for functions from TS 18661 that round their
> results to a narrower type are handled, but there are still some
> unresolved questions regarding such macros so further changes in that
> regard may be needed in future.  The current implementation follows an
> older version of the DR#13 resolution (allowing a function for a
> wide-enough argument type to be selected if no exactly-matching
> function is available), but with appropriate calls to __builtin_tgmath
> is still fully compatible with the latest version of the resolution
> (not yet in the DR log), and allowing such not-exactly-matching
> argument types to be chosen in that case avoids needing another
> special case to treat integers as _Float64 instead of double in
> certain cases.
>
> Regarding other possible language/library features, not currently
> implemented in GCC:
>
> * Imaginary types could be naturally supported by allowing cases where
>   the type-generic type is an imaginary type T and arguments or return
>   types may be T (as at present), or the corresponding real type to T
>   (as at present), or (new) the corresponding real type if T is real
>   or imaginary but T if T is complex.  (tgmath.h would need a series
>   of functions such as
>
>   static inline _Imaginary double
>   __sin_imag (_Imaginary double __x)
>   {
> return _Imaginary_I * sinh (__imag__ __x);
>   }
>
>   to be used in __builtin_tgmath calls.)
>
> * __builtin_tgmath would use the constant rounding direction in the
>   presence of support for the FENV_ROUND / FENV_DEC_ROUND pragmas.
>   Support for those would also require a new __builtin_ to
>   cause a non-type-generic call to use the constant rounding
>   direction (it seems cleaner to add a new __builtin_ when
>   required than to make __builtin_tgmath handle a non-type-generic
>   case with only one function argument).
>
> * TS 18661-5 __STDC_TGMATH_OPERATOR_EVALUATION__ would require new
>   __builtin_ that evaluates with excess range and precision
>   like arithmetic operators do.
>
> * The proposed C bindings for IEEE 754-2018 augmented arithmetic
>   operations involve struct return types.  As currently implemented
>   __builtin_tgmath does not handle those, but support could be added.
>
> There are many error cases that the implementation diagnoses.  I've
> tried to ensure reasonable error messages for erroneous uses of
> __builtin_tgmath, but the errors for erroneous uses of the resulting
> type-generic macros (that is, when the non-function arguments have
> inappropriate types) are more important as they are more likely to be
> seen by users.
>
> GCC's own tgmath.h, as used for some targets, is updated in this
> patch.  I've tested those changes minimally, via adjusting
> gcc.dg/c99-tgmath-* locally to use that tgmath.h version.  I've also
> run the glibc testsuite (which has much more thorough tests of
> correctness of tgmath.h function selection) with a glibc patch to use
> __builtin_tgmath in glibc's tgmath.h.
>
> Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to
> mainline.

Thanks - I suppose we can't avoid the repeated expansion by sth like

#define exp(Val) ({ __typeof__ (Val) tem = (Val); __TGMATH_UNARY_REAL_IMAG
(tem, exp, cexp); })

?
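
Spelled out as a self-contained sketch, with made-up names (twice, twice_f,
twice_d, twice_l, next_value, demo_twice) standing in for exp and the real
tgmath.h internals, and C11 _Generic standing in for
__TGMATH_UNARY_REAL_IMAG. It relies on the GNU __typeof__ and
statement-expression extensions, so it only works inside a function body.
Note that Val still appears twice in the expansion (once in __typeof__, once
as the initializer), so nested calls would still grow the expansion, just
more slowly.

```c
#include <assert.h>

/* Hypothetical per-type functions standing in for expf/exp/expl.  */
static float twice_f (float x) { return 2.0f * x; }
static double twice_d (double x) { return 2.0 * x; }
static long double twice_l (long double x) { return 2.0L * x; }

/* Bind the argument to a temporary once, then dispatch on the
   temporary's type.  __typeof__ and ({ ... }) are GNU extensions;
   _Generic is C11.  Val appears twice in the expansion.  */
#define twice(Val)                                   \
  ({ __typeof__ (Val) tem_ = (Val);                  \
     _Generic (tem_,                                 \
               float: twice_f,                       \
               long double: twice_l,                 \
               default: twice_d) (tem_); })

static int calls;
static double next_value (void) { return (double) ++calls; }

void
demo_twice (void)
{
  calls = 0;
  double d = twice (next_value ());  /* argument evaluated once */
  assert (d == 2.0 && calls == 1);
  assert (sizeof (twice (1.0f)) == sizeof (float));
}
```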

Richard.

> gcc:
> 2017-11-15  Joseph Myers  
>
> PR c/81156
> * doc/extend.texi (Other Builtins): Document __builtin_tgmath.
> * ginclude/tgmath.h (__tg_cplx, __tg_ldbl, __tg_dbl, __tg_choose)
> (__tg_choose_2, __tg_choose_3, __TGMATH_REAL_1_2)
> 

Re: [PATCH][GCC][ARM][AArch64] Testsuite framework changes and execution tests [Patch (8/8)]

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 06/10/17 13:45, Tamar Christina wrote:

Hi All,

this is a minor respin of the patch with the comments addressed. Note 
this patch is now 7/8 in the series.



Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?



This looks ok to me from an arm perspective.

Kyrill


gcc/testsuite
2017-10-06  Tamar Christina  

* lib/target-supports.exp
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): New.
(check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
(add_options_for_arm_v8_2a_dotprod_neon): New.
(check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
(check_effective_target_vect_sdot_qi): New.
(check_effective_target_vect_udot_qi): New.
* gcc.target/arm/simd/vdot-exec.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
* gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.

From: Tamar Christina
Sent: Monday, September 4, 2017 2:01:40 PM
To: Christophe Lyon
Cc: gcc-patches@gcc.gnu.org; nd; James Greenhalgh; Richard Earnshaw; 
Marcus Shawcroft
Subject: RE: [PATCH][GCC][ARM][AArch64] Testsuite framework changes 
and execution tests [Patch (8/8)]


Hi Christophe,

> >
> > gcc/testsuite
> > 2017-09-01  Tamar Christina 
> >
> > * lib/target-supports.exp
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
> New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
> > (add_options_for_arm_v8_2a_dotprod_neon): New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
> > (check_effective_target_vect_sdot_qi): New.
> > (check_effective_target_vect_udot_qi): New.
> > * gcc.target/arm/simd/vdot-exec.c: New.
>
> Aren't you defining twice P() and ARR() in vdot-exec.c ?
> I'd expect a preprocessor error, did I read too quickly?
>

Yes, they are defined twice, but the definitions are exactly the same, so the
pre-processor doesn't care. I can leave only one if this is confusing.
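
A small sketch of the rule being relied on: C permits redefining a macro when
the new definition is identical to the old one. ARR_LEN, values, count_values
and demo_redefine are hypothetical names for illustration; the actual bodies
of P() and ARR() are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro.  The two definitions are token-for-token
   identical, so the second one is a permitted redefinition.  */
#define ARR_LEN(a) (sizeof (a) / sizeof ((a)[0]))
#define ARR_LEN(a) (sizeof (a) / sizeof ((a)[0]))

static int values[4];

size_t
count_values (void)
{
  return ARR_LEN (values);
}

void
demo_redefine (void)
{
  assert (count_values () == 4);
}
```

If the second definition differed in its replacement list or parameter names,
the compiler would be expected to diagnose it.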

>
> Thanks,
>
> Christophe
>
> > * gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
> > * gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.
> >
> > --




Re: [PATCH][GCC][ARM] Restrict TARGET_DOTPROD to baseline Armv8.2-a.

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 14/11/17 15:53, Tamar Christina wrote:

Hi All,

Dot Product is intended to only be available for Armv8.2-a and newer.
While this restriction is reflected in the intrinsics, the patterns
themselves were missing the Armv8.2-a bit.

While GCC would prevent invalid options e.g. `-march=armv8.1-a+dotprod`
we should prevent the pattern from being able to expand at all.

Regtested on arm-none-eabi and no issues.

Ok for trunk?



Ok.
Thanks,
Kyrill


Thanks,
Tamar

gcc/
2017-11-14  Tamar Christina  

* config/arm/arm.h (TARGET_DOTPROD): Add arm_arch8_2.

--




Re: [PATCH][GCC][ARM] Add Armv8.3-a to AArch32.

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 14/11/17 15:54, Tamar Christina wrote:

Hi All,

This patch adds Armv8.3-a as an architecture to the compiler with
the feature set inherited from Armv8.2-a.

Bootstrapped regtested on arm-none-linux-gnueabihf and no issues.



This is ok with a couple of ChangeLog nits.


gcc/
2017-11-14  Tamar Christina  

* config/arm/arm-cpus.in (armv8_3, ARMv8_3a, armv8.3-a): New
* config/arm/arm-tables.opt (armv8.3-a): New.


The convention is to say "Regenerated" for the whole file as it is not 
manually updated.



* doc/invoke.texi (ARM Options): Add armv8.3-a


Full stop at the end of sentence.

Thanks,
Kyrill

P.S. Can you please create an entry for this in the changes.html page [1]?
You can have a look at other similar entries for previous GCC releases 
for the format [2]


[1] https://gcc.gnu.org/about.html
[2] https://gcc.gnu.org/gcc-7/changes.html



Ok for trunk?

Thanks,
Tamar.

--




Re: [PATCH, rs6000] (v2) GIMPLE folding for vector compares

2017-11-15 Thread Richard Biener
On Tue, Nov 14, 2017 at 11:11 PM, Will Schmidt
 wrote:
>
> Hi,
>   Add support for gimple folding of vec_cmp_{eq,ge,gt,le,ne}
> for the integer data types.
>
> As part of this change, several define_insn stanzas have been added/updated
> in vsx.md that specify the "ne: -> not: + eq: " combinations to allow for the 
> generation
> of the desired vcmpne[bhw] instructions, where we otherwise
> would have generated a vcmpeq + vnor combination.  The defines
> also obsoleted the need for the UNSPEC versions of the same, so this ends up
> being just an update to those existing defines.
>
> Several entries have been added to the switch statement in
> builtin_function_type to identify the builtins having unsigned arguments.
>
> A handful of existing tests required updates to their specified optimization
> levels to continue to generate the desired code.  builtins-3-p9.c in 
> particular
> has been updated to reflect improved code gen with the higher specified
> optimization level.
> Testcase coverage is otherwise handled by the already-in-tree
> gcc.target/powerpc/fold-vec-cmp-*.c tests.
>
> Per feedback from the prior version, v2 changes also include:
>   * Reworked the actual folding to use a VEC_COND_EXPR.  For cleanliness, I
>   moved this to a new fold_build_vec_cmp() helper function, which itself
>   is based on build_vec_cmp() as found in typeck.c.
>   * Added an additional fold_compare_helper() function to further factor out
>   the steps that are common to all of the vector compare operations.
>
> Testing is currently underway on P6 and newer. OK for trunk?

The folding part looks good to me.

Richard.

> Thanks,
> -Will
>
>
> 2017-11-14  Will Schmidt  
> [gcc]
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
> folding of vector compares.
> (fold_build_vec_cmp): New helper function.
> (fold_compare_helper): New helper function.
> (builtin_function_type): Add compare builtins to the list of functions
> having unsigned arguments.
> * config/rs6000/vsx.md (vcmpneb, vcmpneh, vcmpnew): Update to specify
> the not+eq combination.
>
> [testsuite]
> * gcc.target/powerpc/builtins-3-p9.c: Add -O1, update
> expected codegen checks.
> * gcc.target/powerpc/vec-cmp-sel.c: Mark vars as volatile.
> * gcc.target/powerpc/vsu/vec-cmpne-0.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-1.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-2.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-3.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-4.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-5.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-6.c: Add -O1.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 2c80a2f..0317324 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16206,10 +16206,40 @@ rs6000_builtin_valid_without_lhs (enum 
> rs6000_builtins fn_code)
>  default:
>return false;
>  }
>  }
>
> +/*  Helper function to handle the gimple folding of a vector compare
> +operation.  This sets up true/false vectors, and uses the
> +VEC_COND_EXPR operation.
> +'code' indicates which comparison is to be made. (EQ, GT, ...).
> +'type' indicates the type of the result.  */
> +static tree
> +fold_build_vec_cmp (tree_code code, tree type,
> +   tree arg0, tree arg1)
> +{
> +  tree cmp_type = build_same_sized_truth_vector_type (type);
> +  tree zero_vec = build_zero_cst (type);
> +  tree minus_one_vec = build_minus_one_cst (type);
> +  tree cmp = fold_build2 (code, cmp_type, arg0, arg1);
> +  return fold_build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
> +}
> +
> +/* Helper function to handle the in-between steps for the
> +   vector compare built-ins.  */
> +static void
> +fold_compare_helper (gimple_stmt_iterator *gsi, tree_code code, gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g = gimple_build_assign (lhs,
> +   fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1));
> +  gimple_set_location (g, gimple_location (stmt));
> +  gsi_replace (gsi, g, true);
> +}
> +
>  /* Fold a machine-dependent built-in in GIMPLE.  (For folding into
> a constant, use rs6000_fold_builtin.)  */
>
>  bool
>  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> @@ -16701,10 +16731,67 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
> *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
>}
>
> +/* Vector compares; EQ, NE, GE, GT, LE.  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:
> +case ALTIVEC_BUILTIN_VCMPEQUH:
> +case ALTIVEC_BUILTIN_VCMPEQUW:
> +case P8V_BUILTIN_VCMPEQUD:
> +  {
> + 
