Re: Enable no-exec stacks for more targets using the Linux kernel

2017-09-18 Thread Jim Wilson
On Mon, 2017-09-18 at 22:03 +, Joseph Myers wrote:
> Thus, I'd like the architecture maintainers to advise on whether any
> such issues apply for their architecture.  If they do, that will
> provide the information needed for a comment on XFAILing the test in
> glibc.  If no such reasons apply for the patch to be problematic, I'd
> like it reviewed for each of those architectures (you may wish to do
> such testing as you see fit; I have *not* run any GCC tests with this
> patch, just tested building glibc and running the compilation tests
> with build-many-glibcs.py).

Unfortunately, I don't have access to ia64 hardware anymore, so I am
not able to verify that this works on hardware.  I would expect that
the patches work, and would recommend that you make the changes, and
then we can back them out later if someone runs into a problem.  It
just looks like an oversight due to lack of ia64 maintenance that this
wasn't done before.

The ia64 changes are OK.

Jim



Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Tim Shen via gcc-patches
On Mon, Sep 18, 2017 at 4:01 PM, Jonathan Wakely  wrote:
> On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:
>>
>> On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely 
>> wrote:

>>>> We need to rewrite this to check the lengths are equal first, and then
>>>> call the 3-argument version of std::equal.
>>>>
>>>> Alternatively, we could move the implementation of the C++14
>>>> std::equal overloads to __equal and make that available for C++11.
>>>> I'll try that.
>>>
>>>
>>>
>>> Here's a proof of concept patch for that. It's a bit ugly.
>>
>>
>> Instead of having iterator tags in the interface, we can probe the
>> random-access-ness inside __equal4/__equal4_p, can't we? It's similar
>> to the existing "if (_RAIters()) { ... }".
>>
>> I'd expect the patches to be renaming the current implementations and
>> adding wrappers, instead of adding new implementations.
>
>
> Well I decided to split the existing functions up and use tag
> dispatching, which is conceptually cleaner anyway. But as the
> RandomAccessIterator version doesn't need any operations that aren't
> valid for other categories, it's not strictly necessary. The tag
> dispatching version should generate slightly smaller code for
> unoptimized builds, but that's not very important.

Unoptimized builds don't inline small functions, so the first patch
generates two weak symbols instead of the one generated by the second
patch. It's unclear to me how the number of symbols would penalize
performance or binary size.

>
> Here's the patch doing it as you suggest. We can't call the new
> functions __equal because that name is already taken by a helper
> struct, hence __equal4.
>
> Do you prefer this version?

Yes, I prefer this version for readability reasons:
1) subjectively, less scattered code; and
2) ideally I want `if constexpr (...)`; the plain `if` version is closer to that.

I agree that it's not a big difference. I just wanted to point out the
small difference. I'm fine with either version.

Thanks for the prototyping!


-- 
Regards,
Tim Shen
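
For readers following the thread, the `if (_RAIters())` approach can be
sketched against the public standard library roughly as below. This is only
an illustration under stated assumptions: `equal4` and `demo_equal4` are
hypothetical names, `std::is_same` is used directly in place of the internal
`__and_` helper, and none of this is the committed libstdc++ code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the __equal4 helper discussed in the thread.
// Valid C++11: the category check is an ordinary `if` on a compile-time
// constant, not tag dispatching and not `if constexpr`.
template<typename It1, typename It2>
bool equal4(It1 first1, It1 last1, It2 first2, It2 last2)
{
  using RATag = std::random_access_iterator_tag;
  using Cat1 = typename std::iterator_traits<It1>::iterator_category;
  using Cat2 = typename std::iterator_traits<It2>::iterator_category;
  constexpr bool ra_iters = std::is_same<Cat1, RATag>::value
                            && std::is_same<Cat2, RATag>::value;
  if (ra_iters)
    {
      // Distances are cheap here, so compare the lengths first and then
      // use the three-argument std::equal.
      if (std::distance(first1, last1) != std::distance(first2, last2))
        return false;
      return std::equal(first1, last1, first2);
    }

  // Otherwise walk both ranges in lockstep; they must end together.
  for (; first1 != last1 && first2 != last2; ++first1, (void)++first2)
    if (!(*first1 == *first2))
      return false;
  return first1 == last1 && first2 == last2;
}

// Usage example mixing random-access and bidirectional iterators.
inline void demo_equal4()
{
  std::vector<int> v{1, 2, 3};
  std::list<int> same{1, 2, 3};
  std::list<int> longer{1, 2, 3, 4};
  assert(equal4(v.begin(), v.end(), same.begin(), same.end()));
  assert(!equal4(v.begin(), v.end(), longer.begin(), longer.end()));
}
```

Both branches must compile for every iterator category, which works here
because `std::distance` and the three-argument `std::equal` accept any input
iterator; only the cost differs.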


Re: Merge from trunk to gccgo branch

2017-09-18 Thread Ian Lance Taylor
On Mon, Sep 18, 2017 at 3:24 PM, Ian Lance Taylor  wrote:
> I merged revision 252949 from trunk to the gccgo branch.

Missed a patch.  Merged revision 252954 to the gccgo branch.

Ian


Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-18 Thread Daniel Santos
Mike, can you take a look at this please?

On 09/18/2017 10:17 AM, Dominique d'Humières wrote:
> This patch (r252896) breaks bootstrap on x86_64-apple-darwin10 configured with
>
> ../work/configure --prefix=/opt/gcc/gcc8w 
> --enable-languages=c,c++,fortran,objc,obj-c++,ada,lto --with-gmp=/opt/mp-new 
> --with-system-zlib --with-isl=/opt/mp-new --enable-lto --enable-plugin
>
> /opt/gcc/build_w/./gcc/xgcc -B/opt/gcc/build_w/./gcc/ 
> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/bin/ 
> -B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/lib/ -isystem 
> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/include -isystem 
> /opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/sys-include    -g -O2 -O2  -g -O2 
> -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format 
> -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition  -isystem 
> ./include   -mmacosx-version-min=10.5 -pipe -fno-common -g -DIN_LIBGCC2 
> -fbuilding-libgcc -fno-stack-protector   -mmacosx-version-min=10.5 -pipe 
> -fno-common -I. -I. -I../.././gcc -I../../../work/libgcc 
> -I../../../work/libgcc/. -I../../../work/libgcc/../gcc 
> -I../../../work/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
> avx_savms64_s.o -MT avx_savms64_s.o -MD -MP -MF avx_savms64_s.dep -DSHARED -c 
> -xassembler-with-cpp ../../../work/libgcc/config/i386/avx_savms64.S
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm15,-0x30(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm14,-0x20(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm13,-0x10(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm12, (%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm11, 0x10(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm10, 0x20(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm9, 0x30(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm8, 0x40(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm7, 0x50(%rax)'
> ../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
> %xmm6, 0x60(%rax)'
> make[3]: *** [avx_savms64_s.o] Error 1
>
> Dominique

Thanks for the report.  AVX has been out since early 2011 and Wikipedia
claims that AVX support was added to OSX in version 10.6.8 in June 2011
and you seem to be using 10.8.0.  I would presume that also means that
the assembler supports it.  So I'm going to guess that it's the
"-mmacosx-version-min=10.5" parameter.  Can you please try setting that
to 10.6.8 and let me know the result?  I don't know what the minimum
system requirements for GCC 8 are going to be, but if it includes these
older versions of OSX then I'll have to figure out how to cope with it
in the libgcc build.

Thanks,
Daniel


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Jonathan Wakely

On 18/09/17 10:58 -0700, Tim Shen via libstdc++ wrote:

On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely  wrote:

We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.



Here's a proof of concept patch for that. It's a bit ugly.


Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.


Well I decided to split the existing functions up and use tag
dispatching, which is conceptually cleaner anyway. But as the
RandomAccessIterator version doesn't need any operations that aren't
valid for other categories, it's not strictly necessary. The tag
dispatching version should generate slightly smaller code for
unoptimized builds, but that's not very important.

Here's the patch doing it as you suggest. We can't call the new
functions __equal because that name is already taken by a helper
struct, hence __equal4.

Do you prefer this version?


diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index f6149fecf9d..2ceba35e7b8 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -366,17 +366,17 @@ namespace __detail
 	   _BiIter __actual_end)
   {
 	if (!_M_icase)
-	  return std::equal(__expected_begin, __expected_end,
-			__actual_begin, __actual_end);
+	  return std::__equal4(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end);
 	typedef std::ctype<_CharT> __ctype_type;
 	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
-	return std::equal(__expected_begin, __expected_end,
-			  __actual_begin, __actual_end,
-			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
-			  {
-			return __fctyp.tolower(__lhs)
-== __fctyp.tolower(__rhs);
-			  });
+	return std::__equal4(__expected_begin, __expected_end,
+			 __actual_begin, __actual_end,
+			 [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			 {
+			   return __fctyp.tolower(__lhs)
+ == __fctyp.tolower(__rhs);
+			 });
   }
 
   bool _M_icase;
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index f68ecb22b82..ff5e94d9ae8 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -1082,6 +1082,58 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   return true;
 }
 
+#if __cplusplus >= 201103L
+  template<typename _II1, typename _II2>
+inline bool
+__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2)
+{
+  using _RATag = random_access_iterator_tag;
+  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
+  using _Cat2 = typename iterator_traits<_II2>::iterator_category;
+  using _RAIters = __and_<is_same<_Cat1, _RATag>, is_same<_Cat2, _RATag>>;
+  if (_RAIters())
+	{
+	  auto __d1 = std::distance(__first1, __last1);
+	  auto __d2 = std::distance(__first2, __last2);
+	  if (__d1 != __d2)
+	return false;
+	  return _GLIBCXX_STD_A::equal(__first1, __last1, __first2);
+	}
+
+  for (; __first1 != __last1 && __first2 != __last2;
+	  ++__first1, (void)++__first2)
+	if (!(*__first1 == *__first2))
+	  return false;
+  return __first1 == __last1 && __first2 == __last2;
+}
+
+  template<typename _II1, typename _II2, typename _BinaryPredicate>
+inline bool
+__equal4(_II1 __first1, _II1 __last1, _II2 __first2, _II2 __last2,
+	 _BinaryPredicate __binary_pred)
+{
+  using _RATag = random_access_iterator_tag;
+  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
+  using _Cat2 = typename iterator_traits<_II2>::iterator_category;
+  using _RAIters = __and_<is_same<_Cat1, _RATag>, is_same<_Cat2, _RATag>>;
+  if (_RAIters())
+	{
+	  auto __d1 = std::distance(__first1, __last1);
+	  auto __d2 = std::distance(__first2, __last2);
+	  if (__d1 != __d2)
+	return false;
+	  return _GLIBCXX_STD_A::equal(__first1, __last1, __first2,
+   __binary_pred);
+	}
+
+  for (; __first1 != __last1 && __first2 != __last2;
+	  ++__first1, (void)++__first2)
+	if (!bool(__binary_pred(*__first1, *__first2)))
+	  return false;
+  return __first1 == __last1 && __first2 == __last2;
+}
+#endif // C++11
+
 #if __cplusplus > 201103L
 
 #define __cpp_lib_robust_nonmodifying_seq_ops 201304
@@ -1112,24 +1164,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   __glibcxx_requires_valid_range(__first1, __last1);
   __glibcxx_requires_valid_range(__first2, __last2);
 
-  using _RATag = random_access_iterator_tag;
-  using _Cat1 = typename iterator_traits<_II1>::iterator_category;
-  using _Cat2 = typename 

Re: [PATCH] lra: make reload_pseudo_compare_func a proper comparator

2017-09-18 Thread Vladimir Makarov

On 09/15/2017 01:38 PM, Alexander Monakov wrote:

Hello,

I'd like to apply the following LRA patch to make qsort comparator
reload_pseudo_compare_func proper (right now it lacks transitivity
due to incorrect use of non_reload_pseudos bitmap, PR 68988).

This function was originally a proper comparator, and the problematic
code was added as a fix for PR 57878.  However, some time later the fix
for PR 60650 really fixed this LRA spill failure, making the original
fix unneeded.  So now GCC can revert to the original, simpler comparator.

The only question is what comparison order would be better for performance.
The check in question only matters for multi-reg pseudos, so it matters
mostly for 64-bit modes on 32-bit architectures.

To investigate that, I've bootstrapped GCC on 32-bit x86 in 4 configurations:

1. Current trunk.

[2-4 are with original PR 57878 fix reverted]
2. Original code, with ira_reg_class_max_nregs below regno_assign_info.freq
check.
3. Hybrid code, with i_r_c_max_nregs preferred over r_a_i.freq during the
second assignment pass, but not first.
4. With i_r_c_max_nregs above r_a_i.freq check, i.e. always do fragmentation
avoidance check before the frequency check. This is the original PR 57878
fix proposed by Wei Mi.

I have found that cc1 size is largest with variant 1, variants 2 and 3 result
in ~500 bytes size reduction, and variant 4 is further ~500 bytes smaller than
that (considering only the .text section, debug info variance is larger).

I have also tested variants 2 and 4 on SPEC CPU 2000: there's no significant
difference in performance (in fact generated code is the same on almost all
tests).

Therefore I suggest we go with variant 4, implemented by the following patch.

Bootstrapped and regtested on 32-bit x86, OK to apply?

Alexander, thank you for benchmarking your changes.  People frequently 
skip this.  The change looks reasonable to me, so you can commit it to 
the trunk.
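
To illustrate what a "proper comparator" means in this discussion: qsort
needs an ordering that is antisymmetric and transitive, and a plain
lexicographic comparison of per-element keys provides that. The sketch below
is hypothetical and is not the LRA code: `Pseudo`, `nregs`, `freq` and
`regno` only mimic the quantities named in the thread, and the "larger
first" directions are an assumption made for the example (register count
checked before frequency, as in variant 4).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record; the field names only mirror the thread's terminology.
struct Pseudo
{
  int regno;  // unique id, used as the final tie-breaker
  int nregs;  // number of hard registers needed
  int freq;   // usage frequency
};

// qsort-style three-way comparator.  Each step looks only at keys of the
// two elements (no external state such as a bitmap), so the ordering is
// antisymmetric and transitive.
inline int pseudo_compare(const Pseudo &a, const Pseudo &b)
{
  if (a.nregs != b.nregs)
    return a.nregs > b.nregs ? -1 : 1;  // more registers first
  if (a.freq != b.freq)
    return a.freq > b.freq ? -1 : 1;    // then higher frequency first
  if (a.regno != b.regno)
    return a.regno < b.regno ? -1 : 1;  // finally a unique key
  return 0;
}

// Returns the regnos in sorted order.
inline std::vector<int> sorted_regnos(std::vector<Pseudo> v)
{
  std::sort(v.begin(), v.end(),
            [](const Pseudo &a, const Pseudo &b)
            { return pseudo_compare(a, b) < 0; });
  std::vector<int> out;
  for (const Pseudo &p : v)
    out.push_back(p.regno);
  return out;
}

// Usage example with a self-check of the resulting order.
inline void demo_pseudo_compare()
{
  std::vector<Pseudo> v{{1, 1, 10}, {2, 2, 5}, {3, 1, 20}, {4, 2, 5}};
  std::vector<int> order = sorted_regnos(v);
  assert((order == std::vector<int>{2, 4, 3, 1}));
}
```

Returning -1, 0 or 1 from explicit comparisons rather than a subtraction
avoids integer overflow for extreme key values.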




libgo patch committed: always initialize str field in __go_string_slice result

2017-09-18 Thread Ian Lance Taylor
This patch to libgo always initializes the str field in the result of
__go_string_slice.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 252866)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-abe58fdc529378706d65d6b22e4871646eb9023e
+be69546afcac182cc93c569bc96665f0ef72d66a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/go-strslice.c
===
--- libgo/runtime/go-strslice.c (revision 251948)
+++ libgo/runtime/go-strslice.c (working copy)
@@ -18,10 +18,13 @@ __go_string_slice (String s, intgo start
   if (start > len || end < start || end > len)
 runtime_panicstring ("string index out of bounds");
   ret.len = end - start;
-  // If the length of the new string is zero, don't adjust the str
-  // field.  This ensures that we don't create a pointer to the next
-  // memory block, and thus keep it live unnecessarily.
-  if (ret.len > 0)
+  // If the length of the new string is zero, the str field doesn't
+  // matter, so just set it to nil.  This avoids the problem of
+  // s.str + start pointing just past the end of the string,
+  // which may keep the next memory block alive unnecessarily.
+  if (ret.len == 0)
+ret.str = nil;
+  else
 ret.str = s.str + start;
   return ret;
 }
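
The behaviour after this patch can be sketched in portable C++ as follows.
This is an illustration only: `StringView` and `string_slice` are
hypothetical stand-ins for libgo's `String` and `__go_string_slice`, the
bounds failure is reported with an exception instead of
`runtime_panicstring`, and the extra check for a negative start is an
assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for libgo's String: a pointer plus a length.
struct StringView
{
  const unsigned char *str;
  std::ptrdiff_t len;
};

inline StringView string_slice(StringView s, std::ptrdiff_t start,
                               std::ptrdiff_t end)
{
  if (start < 0 || start > s.len || end < start || end > s.len)
    throw std::out_of_range("string index out of bounds");
  StringView ret;
  ret.len = end - start;
  // For an empty result the pointer value does not matter, so always
  // initialize it to null instead of s.str + start, which could point
  // just past the end of the string and keep the next block alive.
  ret.str = ret.len == 0 ? nullptr : s.str + start;
  return ret;
}

// Usage example: a normal slice and an empty slice at the very end.
inline void demo_string_slice()
{
  const unsigned char data[] = {'h', 'e', 'l', 'l', 'o'};
  StringView s{data, 5};
  StringView mid = string_slice(s, 1, 3);
  assert(mid.str == data + 1 && mid.len == 2);
  StringView empty = string_slice(s, 5, 5);
  assert(empty.str == nullptr && empty.len == 0);
}
```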


Merge from trunk to gccgo branch

2017-09-18 Thread Ian Lance Taylor
I merged revision 252949 from trunk to the gccgo branch.

Ian


Enable no-exec stacks for more targets using the Linux kernel

2017-09-18 Thread Joseph Myers
Building glibc for many different configurations and running the
compilation parts of the testsuite runs into failures of the
elf/check-execstack test for hppa, ia64 and microblaze.

This fails because those configurations are not generating
.note.GNU-stack sections to indicate that programs do not need an
executable stack.  This patch fixes GCC to generate those sections on
those architectures (when configured for a target using the Linux
kernel), as it does on other architectures, together with adding that
section to libgcc .S sources, with the same code as used on other
architectures (or a variant using "#ifdef __linux__" instead of the
usual "#if defined(__ELF__) && defined(__linux__)" for microblaze, as
that configuration doesn't use elfos.h and so doesn't define __ELF__).

This suffices to eliminate that glibc test failure.  (For hppa, the
compilation parts of the glibc testsuite still fail because of the
separate elf/check-textrel failure.)  Now, there are some possible
reasons why a change such as this could be incorrect, and I don't know
enough about the architectures in question to rule them out: (a) if
the hardware architecture does not actually support the page
permissions required for no-exec stacks, (b) if, as on MIPS,
additional architecture-specific work elsewhere in the toolchain or
kernel would be needed for no-exec stacks to work, or (c) if in fact
one of the libgcc functions I've marked for no-exec stacks somehow
needs an executable stack.

Thus, I'd like the architecture maintainers to advise on whether any
such issues apply for their architecture.  If they do, that will
provide the information needed for a comment on XFAILing the test in
glibc.  If no such reasons apply for the patch to be problematic, I'd
like it reviewed for each of those architectures (you may wish to do
such testing as you see fit; I have *not* run any GCC tests with this
patch, just tested building glibc and running the compilation tests
with build-many-glibcs.py).

gcc:
2017-09-18  Joseph Myers  

* config/ia64/linux.h (TARGET_ASM_FILE_END): New macro.
* config/microblaze/linux.h (TARGET_ASM_FILE_END): Likewise.
* config/pa/pa.h (NEED_INDICATE_EXEC_STACK): Likewise.
* config/pa/pa-linux.h (NEED_INDICATE_EXEC_STACK): Likewise.
* config/pa/pa.c (pa_hpux_file_end): Rename to pa_file_end.
Define unconditionally, with [ASM_OUTPUT_EXTERNAL_REAL]
conditionals inside the function instead of around it.  Call
file_end_indicate_exec_stack if NEED_INDICATE_EXEC_STACK.
(TARGET_ASM_FILE_END): Define unconditionally to pa_file_end.

libgcc:
2017-09-18  Joseph Myers  

* config/ia64/crtbegin.S, config/ia64/crtend.S,
config/ia64/crti.S, config/ia64/crtn.S, config/ia64/lib1funcs.S,
config/microblaze/crti.S, config/microblaze/crtn.S,
config/microblaze/divsi3.S, config/microblaze/moddi3.S,
config/microblaze/modsi3.S, config/microblaze/muldi3_hard.S,
config/microblaze/mulsi3.S,
config/microblaze/stack_overflow_exit.S,
config/microblaze/udivsi3.S, config/microblaze/umodsi3.S,
config/pa/milli64.S: Add .note.GNU-stack section.

Index: gcc/config/ia64/linux.h
===
--- gcc/config/ia64/linux.h (revision 252935)
+++ gcc/config/ia64/linux.h (working copy)
@@ -81,3 +81,5 @@ do {  \
 
 /* Define this to be nonzero if static stack checking is supported.  */
 #define STACK_CHECK_STATIC_BUILTIN 1
+
+#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
Index: gcc/config/microblaze/linux.h
===
--- gcc/config/microblaze/linux.h   (revision 252935)
+++ gcc/config/microblaze/linux.h   (working copy)
@@ -57,3 +57,5 @@
 /* For the microblaze-*-linux* subtarget.  */
 #undef TARGET_OS_CPP_BUILTINS
 #define TARGET_OS_CPP_BUILTINS() GNU_USER_TARGET_OS_CPP_BUILTINS()
+
+#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
Index: gcc/config/pa/pa-linux.h
===
--- gcc/config/pa/pa-linux.h(revision 252935)
+++ gcc/config/pa/pa-linux.h(working copy)
@@ -141,3 +141,6 @@ along with GCC; see the file COPYING3.  If not see
 #define HAVE_sync_compare_and_swaphi 1
 #define HAVE_sync_compare_and_swapsi 1
 #define HAVE_sync_compare_and_swapdi 1
+
+#undef NEED_INDICATE_EXEC_STACK
+#define NEED_INDICATE_EXEC_STACK 1
Index: gcc/config/pa/pa.c
===
--- gcc/config/pa/pa.c  (revision 252935)
+++ gcc/config/pa/pa.c  (working copy)
@@ -159,9 +159,7 @@ static void pa_hpux64_gas_file_start (void) ATTRIB
 static void pa_hpux64_hpas_file_start (void) ATTRIBUTE_UNUSED;
 static void output_deferred_plabels (void);
 static void output_deferred_profile_counters (void) 

Re: [PATCH,rs6000] Replace swap of a loaded vector constant with load of a swapped vector constant

2017-09-18 Thread Segher Boessenkool
Hi Kelvin,

On Fri, Sep 15, 2017 at 03:04:52PM -0600, Kelvin Nilsen wrote:
> On Power8 little endian, two instructions are needed to load from the
> natural in-memory representation of a vector into a vector register: a
> load followed by a swap.  When the vector value to be loaded is a
> constant, more efficient code can be achieved by swapping the
> representation of the constant in memory so that only a load instruction
> is required.

I'll leave the review of the actual swaps part to Bill...  But some
comments:

> --- gcc/config/rs6000/rs6000-p8swap.c (revision 252768)
> +++ gcc/config/rs6000/rs6000-p8swap.c (working copy)
> @@ -342,7 +342,8 @@ const_load_sequence_p (swap_web_entry *insn_entry,
>FOR_EACH_INSN_INFO_USE (use, insn_info)
>  {
>struct df_link *def_link = DF_REF_CHAIN (use);
> -  if (!def_link || def_link->next)
> +  if (!def_link || !def_link->ref || DF_REF_IS_ARTIFICIAL (def_link->ref)
> +   || def_link->next)
>   return false;

You probably should adjust the comment before this a bit:
  /* Find the unique use in the swap and locate its def.  If the def
 isn't unique, punt.  */
no longer says what this does.

> @@ -370,6 +373,14 @@ const_load_sequence_p (swap_web_entry *insn_entry,
> if (!base_def_link || base_def_link->next)
>   return false;
>  
> +   /* Constants held on the stack are not "true" constants
> +* because their values are not part of the static load
> +* image.  If this constant's base reference is a stack
> +* or frame pointer, it is seen as an artificial
> +* reference. */

No leading asterisks in block comments.

> @@ -385,6 +396,25 @@ const_load_sequence_p (swap_web_entry *insn_entry,
> split_const (XVECEXP (tocrel_base, 0, 0), &base, &offset);
> if (GET_CODE (base) != SYMBOL_REF || !CONSTANT_POOL_ADDRESS_P (base))
>   return false;
> +   else
> + {
> +   /* FIXME: The conditions under which
> +*  ((GET_CODE (const_vector) == SYMBOL_REF) &&
> +*   !CONSTANT_POOL_ADDRESS_P (const_vector))
> +* are not well understood.  This code prevents
> +* an internal compiler error which will occur in
> +* replace_swapped_load_constant () if we were to return
> +* true.  Some day, we should figure out how to properly
> +* handle this condition in
> +* replace_swapped_load_constant () and then we can
> +* remove this special test.  */
> +   rtx const_vector = get_pool_constant (base);
> +   if (GET_CODE (const_vector) == SYMBOL_REF)
> + {
> +   if (!CONSTANT_POOL_ADDRESS_P (const_vector))
> + return false;
> + }
> + }
>   }
>  }
>return true;

It would be good to understand what this is about.  Some day :-)

> @@ -1281,6 +1311,189 @@ replace_swap_with_copy (swap_web_entry *insn_entry
>insn->set_deleted ();
>  }
>  
> +/* Given that swap_insn represents a swap of a load of a constant
> +   vector value, replace with a single instruction that loads a
> +   swapped variant of the original constant. 

(Trailing space).

> +static void
> +replace_swapped_load_constant (swap_web_entry *insn_entry, rtx swap_insn)
> +{
> +  /* Find the load.  */
> +  struct df_insn_info *insn_info = DF_INSN_INFO_GET (swap_insn);
> +  rtx_insn *load_insn = 0;
> +  df_ref use;
> +
> +  FOR_EACH_INSN_INFO_USE (use, insn_info)
> +{
> +  struct df_link *def_link = DF_REF_CHAIN (use);
> +  gcc_assert (def_link && !def_link->next);
> +  load_insn = DF_REF_INSN (def_link->ref);
> +  break;
> +}
> +  gcc_assert (load_insn);

A loop where you always break after the first iteration?  You probably
can write this simpler without a loop?  Or is this normal idiom?

> +  /* Find the embedded CONST_VECTOR.  We have to call toc_relative_expr_p
> + to set tocrel_base; otherwise it would be unnecessary as we've
> + already established it will return true.  */
> +  rtx base, offset;
> +  rtx tocrel_expr = SET_SRC (PATTERN (tocrel_insn));
> +  const_rtx tocrel_base;
> +  /* There is an extra level of indirection for small/large code models.  */
> +  if (GET_CODE (tocrel_expr) == MEM)
> +tocrel_expr = XEXP (tocrel_expr, 0);
> +  if (!toc_relative_expr_p (tocrel_expr, false, &tocrel_base, NULL))
> +gcc_unreachable ();
> +  split_const (XVECEXP (tocrel_base, 0, 0), &base, &offset);
> +  rtx const_vector = get_pool_constant (base);
> +  /* With the extra indirection, get_pool_constant will produce the
> + real constant from the reg_equal expression, so get the real
> + constant.  */
> +  if (GET_CODE (const_vector) == SYMBOL_REF)
> +const_vector = get_pool_constant (const_vector);
> +  gcc_assert (GET_CODE (const_vector) == CONST_VECTOR);
> +
> +  rtx new_mem;
> +  enum machine_mode mode = GET_MODE (const_vector);
> +  /* Create an adjusted constant from the original constant.  */
> +
> 

Re: [PATCH] detect incompatible aliases (PR c/81854)

2017-09-18 Thread Joseph Myers
On Mon, 18 Sep 2017, Martin Sebor wrote:

> It's meant as an escape hatch.  It allows declaring compatibility
> symbols, for example by the libstdc++ _GLIBCXX_3_4_SYMVER macro
> defined in libstdc++-v3/src/c++98/compatibility.cc.  The macro is
> used to declare compatibility functions of all sorts of incompatible
> types.  The originally posted patch had libstdc++ disable the warning
> for the file with the symbols but Jonathan preferred this solution.
> 
> It could perhaps be tightened up to detect some of the cases on your
> list but I'm not sure it's worth the effort and added complexity.
> Let me know if you feel differently (or have a different suggestion),
> otherwise I will go ahead and commit the patch as is.

Please add a comment explaining this reasoning and commit the patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [gotools] Fix some gotools testing problems

2017-09-18 Thread Ian Lance Taylor via gcc-patches
On Tue, Sep 12, 2017 at 6:18 AM, Rainer Orth
 wrote:
>
> A couple of gotools tests FAIL on Solaris, and there are several issues
> that make investigation particularly tedious.
>
> * The one invocation of gotest doesn't support passing additional flags
>   (--keep in particular).  Added below.
>
> * The order of some of the summaries isn't stable (cmd/go and runtime,
>   while cgo and carchive are, it seems).  I'm now sorting the output by
>   test name, both for make check output and the creation of gotools.sum.
>
>   Otherwise, it's hard to compare mail-report.log between builds.
>
> * The freshly built go, cgo, and gofmt rely on LD_LIBRARY_PATH to be set
>   so libgcc.so is found at runtime.  The right value is passed in from
>   the toplevel during a bootstrap, but missing when I run make check in
>   gotools manually.
>
> * The most glaring problem is getting your hands on the failing
>   executables: there are several levels of indirection here
>
>   make -> gotest ... -> go build -> gccgo -> a.out
>-> go test
>
>   Many of those steps also depend on lots of environment variables set
>   to the check-gcc or check-gccgo wrapper scripts and it's hard to
>   determine what's necessary to be able to lay my hand on the resulting
>   executable to run it under gdb.  I've found no way to pass e.g. -work
>   to go build from the outside (e.g. via an environment variable), nor
>   can I see the commands executed, so this means modifying the test
>   sources or running the full testsuite under truss in the hope of determining
>   all that's necessary to reproduce the build.
>
> Anyway, here's the trivial part so far.  Ok for mainline?
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>
> 2017-09-12  Rainer Orth  
>
> * Makefile.am (GOTESTFLAGS): New variable.
> (check-runtime): Pass it to gotest.
> (check-go-tools): Sort summary.
> (check-runtime): Likewise.
> (check-cgo-test): Likewise.
> (check-carchive-test): Likewise.
> (check): Likewise.
> * Makefile.in: Regenerate.


This patch is OK.  Go ahead and commit.  Thanks.

What I do for debugging is to run `make install`.  Then I can simply
run `GO_TESTING_GOTOOLS=yes go test` in the appropriate directory.  I
don't have a good solution for simpler debugging without running `make
install` first.

Ian


Re: [PATCH] detect incompatible aliases (PR c/81854)

2017-09-18 Thread Martin Sebor

On 09/12/2017 10:17 AM, Joseph Myers wrote:

On Thu, 17 Aug 2017, Martin Sebor wrote:


+ || (prototype_p (t1)
+ && prototype_p (t2)
+ && !types_compatible_p (t1, t2


Why the restriction to prototyped types?  I'd expect a warning for an
alias between unprototyped functions of types int () and void (), for
example.  Or for an alias between void () and void (char), as a function
with a char argument is not compatible with a non-prototype function in C.
Is this an issue with the problem being diagnosed at a point where the
langhooks for language-specific type compatibility rules aren't available?
If that's preventing diagnosing incompatibility involving unprototyped
functions, then the patch is OK.


It's meant as an escape hatch.  It allows declaring compatibility
symbols, for example by the libstdc++ _GLIBCXX_3_4_SYMVER macro
defined in libstdc++-v3/src/c++98/compatibility.cc.  The macro is
used to declare compatibility functions of all sorts of incompatible
types.  The originally posted patch had libstdc++ disable the warning
for the file with the symbols but Jonathan preferred this solution.

It could perhaps be tightened up to detect some of the cases on your
list but I'm not sure it's worth the effort and added complexity.
Let me know if you feel differently (or have a different suggestion),
otherwise I will go ahead and commit the patch as is.

Thanks
Martin


Re: [PATCH] PR target/80556

2017-09-18 Thread Simon Wright
On 18 Sep 2017, at 21:09, Iain Sandoe  wrote:
> 
> Hi Simon,
> 
>> On 29 Jun 2017, at 21:41, Simon Wright  wrote:
>> 
>> On 28 Jun 2017, at 18:40, Jeff Law  wrote:
>>> 
>>> On 06/09/2017 07:57 AM, Simon Wright wrote:
>>>>   2017-06-09 Simon Wright 
>>>> 
>>>>   PR target/80556
>>>>   * configure.ac (stage1_ldflags): For Darwin, include -lSystem.
>>>> (poststage1_ldflags): likewise.
>>>>   * configure: regenerated.
>>> I'm a bit confused here.  Isn't -lSystem included in darwin's LIB_SPEC
>>> in which case the right things ought to already be happening, shouldn't it?
>> 
>> The specs that involve -lSystem are
> 
>> I've rebuilt gcc-8-20170528 with this change alone (i.e. not the patch 
>> currently posted here), successfully.
>> 
>> If I propose this alternative patch, should it be a new post, or should I 
>> continue this thread?
> 
> thanks for the patch.
> 
> The basic idea seems sound - as a workaround (as noted in comment #20 in the 
> PR, we should really rationalise the libgcc/crts stuff to reflect the modern 
> world, but these things take time...).
> 
> The patch as you have it would apply to every version of Darwin.
> 
> AFAICT from the published sources, i386 Darwin should be able to work with 
> the libgcc unwinder (and all earlier Darwin *have* to) - so I’ve proposed a 
> modified patch in the PR that makes the changes specific to m64 x86 and 
> doesn’t make any alteration for PPC and/or Darwin < 10.

That sounds like the right thing to do. I hadn't considered the older 
hardware/os issues (I only have kit back to macOS 10.11, Darwin 15).



Re: [PATCH v2, middle-end]: Introduce memory_blockage named insn pattern

2017-09-18 Thread Uros Bizjak
On Tue, Sep 5, 2017 at 3:50 PM, Uros Bizjak  wrote:
> Revised patch, incorporates fixes from Alexander's review comments.
>
> I removed some implementation details from Alexander's description of
> memory_blockage named pattern.
>
>
> 2017-09-05  Uros Bizjak  
>
> * target-insns.def: Add memory_blockage.
> * optabs.c (expand_memory_blockage): New function.
> (expand_asm_memory_barrier): Rename ...
> (expand_asm_memory_blockage): ... to this.
> (expand_mem_thread_fence): Call expand_memory_blockage
> instead of expand_asm_memory_barrier.
> (expand_mem_signal_fence): Ditto.
> (expand_atomic_load): Ditto.
> (expand_atomic_store): Ditto.
> * doc/md.texi (Standard Pattern Names For Generation):
> Document memory_blockage instruction pattern.
>
> Bootstrapped and regression tested together with a followup x86 patch
> on x86_64-linux-gnu {,-m32}.
>
> OK for mainline?

PING, original patch at [1].

[1] https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00270.html

Uros.


Re: [PATCH] PR target/80556

2017-09-18 Thread Iain Sandoe
Hi Simon,

> On 29 Jun 2017, at 21:41, Simon Wright  wrote:
> 
> On 28 Jun 2017, at 18:40, Jeff Law  wrote:
>> 
>> On 06/09/2017 07:57 AM, Simon Wright wrote:
>>>   2017-06-09 Simon Wright 
>>> 
>>>   PR target/80556
>>>   * configure.ac (stage1_ldflags): For Darwin, include -lSystem.
>>> (poststage1_ldflags): likewise.
>>>   * configure: regenerated.
>> I'm a bit confused here.  Isn't -lSystem included in darwin's LIB_SPEC
>> in which case the right things ought to already be happening, shouldn't it?
> 
> The specs that involve -lSystem are

> I've rebuilt gcc-8-20170528 with this change alone (i.e. not the patch 
> currently posted here), successfully.
> 
> If I propose this alternative patch, should it be a new post, or should I 
> continue this thread?

thanks for the patch.

The basic idea seems sound - as a workaround (as noted in comment #20 in the 
PR, we should really rationalise the libgcc/crts stuff to reflect the modern 
world, but these things take time...).

The patch as you have it would apply to every version of Darwin.

AFAICT from the published sources, i386 Darwin should be able to work with the 
libgcc unwinder  (and all earlier Darwin *have* to) - so I’ve proposed a 
modified patch in the PR that makes the changes specific to m64 x86 and doesn’t 
make any alteration for PPC and/or Darwin < 10.

HTH,
Iain

Iain Sandoe
CodeSourcery / Mentor Embedded / Siemens



Re: [Ada] Validity check failure with packed array and pragma

2017-09-18 Thread Eric Botcazou
> That’s right, will do, thank you! Do I need to create a new ChangeLog
> entry in gcc/testsuite/ or is it fine if I just keep the current “New
> testcase.”?

I'm not sure anyone really cares so it's up to you I'd say.

-- 
Eric Botcazou


Re: [PATCH] PR libstdc++/81468 constrain std::chrono::time_point constructor

2017-09-18 Thread Jonathan Wakely

On 13/09/17 21:30 -0400, Tim Song wrote:

On Wed, Sep 13, 2017 at 10:55 AM, Jonathan Wakely  wrote:


+// DR 1177
+static_assert(is_constructible{},
+"can convert duration with one floating point rep to another");
+static_assert(is_constructible{},
+"can convert duration with integral rep to one with floating point rep");
+static_assert(!is_constructible{},
+"cannot convert duration with floating point rep to one with integral rep");
+static_assert(is_constructible{},
+"can convert duration with one integral rep to another");
+
+static_assert(!is_constructible>>{},
+"cannot convert duration to one with different period");
+static_assert(is_constructible>>{},
+"unless it has a floating-point representation");


"it" is a little ambiguous here unless you read the next message's
mention of "the original"...


+static_assert(is_constructible>>{},
+"or a period that is an integral multiple of the original");


This is backwards: duration<R1, P1> is convertible to duration<R2, P2> iff P1
is an integral multiple of P2, i.e., if the original's period is an integral
multiple of "its" period.

The static assert only passed because duration was used as the
destination type (presumably because of a copy/paste error).

Tim


Good catch, thanks.

I've committed this patch.


commit 5c021e19e0758e5ad7e47feadbd0632b15f85785
Author: Jonathan Wakely 
Date:   Mon Sep 18 19:04:25 2017 +0100

PR libstdc++/81468 fix test for duration conversions

PR libstdc++/81468
* testsuite/20_util/duration/cons/dr1177.cc: Fix incorrect test and
improve static assertion messages.

diff --git a/libstdc++-v3/testsuite/20_util/duration/cons/dr1177.cc b/libstdc++-v3/testsuite/20_util/duration/cons/dr1177.cc
index 28c881ccc79..d90cd27f482 100644
--- a/libstdc++-v3/testsuite/20_util/duration/cons/dr1177.cc
+++ b/libstdc++-v3/testsuite/20_util/duration/cons/dr1177.cc
@@ -36,6 +36,6 @@ static_assert(is_constructible{},
 static_assert(!is_constructible>>{},
 "cannot convert duration to one with different period");
 static_assert(is_constructible>>{},
-"unless it has a floating-point representation");
-static_assert(is_constructible>>{},
-"or a period that is an integral multiple of the original");
+"... unless the result type has a floating-point representation");
+static_assert(is_constructible>, duration>{},
+"... or the original's period is a multiple of the result's period");
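
The conversion rule discussed in this thread can be checked in isolation. The following is only a sketch: the type aliases and the periods (1 second and 1/2 second) are chosen for illustration and are not the ones used in dr1177.cc.

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

using std::chrono::duration;

// Illustrative types only; these are not taken from dr1177.cc.
using int_seconds   = duration<int>;                    // period = 1 s
using int_halves    = duration<int, std::ratio<1, 2>>;  // period = 1/2 s
using float_seconds = duration<float>;                  // period = 1 s

// The source period (1 s) is an integral multiple of the result period
// (1/2 s), so every value converts exactly and the conversion is allowed.
constexpr bool coarse_to_fine =
    std::is_constructible<int_halves, int_seconds>::value;

// The source period (1/2 s) is not a multiple of the result period (1 s);
// the conversion could truncate, so it is rejected for integral reps.
constexpr bool fine_to_coarse =
    std::is_constructible<int_seconds, int_halves>::value;

// A floating-point result type lifts the period restriction.
constexpr bool fine_to_float =
    std::is_constructible<float_seconds, int_halves>::value;

static_assert(coarse_to_fine, "source period is a multiple of the result's");
static_assert(!fine_to_coarse, "lossy integral conversion is rejected");
static_assert(fine_to_float, "floating-point result accepts any period");
```

Swapping the two arguments of the first trait reproduces the kind of copy/paste mistake that the committed fix addresses.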


demangler output format question

2017-09-18 Thread Nathan Sidwell
I discovered a bug in my fix for 82195, but it leads me towards a design 
question whose answer is not obvious to me.


The simplest testcase is:

void Foo () {
  // _ZZ3FoovENKUlT_E_clIiEEfS_
  [](auto) ->float {
struct Local {
  // _ZZZ3FoovENKUlT_E_clIiEEfS_EN5Local2fnEv
  static void fn () {}
};
Local::fn ();
return 0.0f;
  } (0);
}

Consider the demangled name of that Local::fn function.  We have to 
encode the enclosing template instantiation of the lambda function 
operator.  Where should that function's return type be?  I can think of 
3 alternatives:


1) as close to the operator () as possible:
Foo ()::{lambda(auto:1)}::float operator (int)::Local::fn ()

2) just before the 'lambda', the local class it's a member of
Foo ()::float {lambda(auto:1)}::operator (int)::Local::fn ()

3) just before the fully scoped name:
float Foo ()::{lambda(auto:1)}::operator (int)::Local::fn ()

#3 will be confusing if Local::fn and/or Foo are template 
instantiations, which contain their own return types.  We'd need 
parentheses or something.


#2 seemed like the natural place -- if one was mangling the class member 
fn on its own, one would write

  'float {lambda(auto:1)}::operator  (int)'
but that does separate the float from the entity it belongs to, which 
could be confusing in a complicated demangle.


Another alternative is to elide return types in such nested local names.

thoughts?

nathan

--
Nathan Sidwell


Re: [PATCH] Fix PR82220

2017-09-18 Thread Richard Sandiford
Richard Biener  writes:
> The following is said to fix a 482.sphinx3 regression.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>
> Richard.
>
> 2017-09-18  Richard Biener  
>
>   PR tree-optimization/82220
>   * tree-vect-loop.c (vect_estimate_min_profitable_iters): Exclude
>   epilogue niters from the min_profitable_iters compute.
>
> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c  (revision 252907)
> +++ gcc/tree-vect-loop.c  (working copy)
> @@ -3663,8 +3663,8 @@ vect_estimate_min_profitable_iters (loop
>  min_profitable_iters);
>  
>/* We want the vectorized loop to execute at least once.  */
> -  if (min_profitable_iters < (vf + peel_iters_prologue + peel_iters_epilogue))
> -min_profitable_iters = vf + peel_iters_prologue + peel_iters_epilogue;
> +  if (min_profitable_iters < (vf + peel_iters_prologue))
> +min_profitable_iters = vf + peel_iters_prologue;

Maybe we should still add 1 when peeling for gaps?

Even adding the prologue count seems a bit weird if we've guessed it to
be vf/2.  Wouldn't it be more profitable to vectorise with an iteration
count of 1 vector iteration + 1 peeled iteration than 1 vector iteration
+ vf-1 peeled iterations, at least in percentage terms?  Was just wondering
if we should only add peel_iters_prologue when npeels > 0.

Thanks,
Richard
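
For readers following along, the effect of the quoted hunk on the threshold can be sketched as below. The struct and function names are hypothetical stand-ins, not the code in tree-vect-loop.c; only the two clamp expressions mirror the diff.

```cpp
#include <algorithm>

// Hypothetical stand-in for the values the cost model has computed.
struct PeelInfo {
  int vf;                   // vectorization factor
  int peel_iters_prologue;  // scalar iterations peeled before the vector loop
  int peel_iters_epilogue;  // scalar iterations left over after it
};

// Clamp as it was before the patch: epilogue iterations are included.
int min_iters_before(int min_profitable_iters, const PeelInfo& p) {
  return std::max(min_profitable_iters,
                  p.vf + p.peel_iters_prologue + p.peel_iters_epilogue);
}

// Clamp after the patch: only one vector iteration plus the prologue.
int min_iters_after(int min_profitable_iters, const PeelInfo& p) {
  return std::max(min_profitable_iters, p.vf + p.peel_iters_prologue);
}
```

With vf = 4, a prologue of 2 and an epilogue of 3, a cost-model estimate of 1 is raised to 9 before the patch and to 6 after it. The question above is whether the prologue term should also be dropped when its count is only a guess.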


Re: [PATCH][aarch64] Fix error calls in aarch64 code so they can be translated

2017-09-18 Thread Martin Sebor

On 09/18/2017 12:26 PM, Steve Ellcey wrote:

This patch is for PR target/79868, where some aarch64 diagnostics are
said to be not translatable due to how they are implemented.  See the
bug report for more details on why the current setup of passing
the string 'pragma' or 'attribute' doesn't work.

This patch fixes it, unfortunately by increasing the number of calls we
have to 'error' (16 calls become 32 calls), but that seems to be the
most straightforward way to get translatable strings.


I haven't looked at all of them but from the few I have seen it
seems that rephrasing the messages along the following lines would
be a way to get around the translation issue and without increasing
the number of calls (though not without the conditional):

  error (is_pragma
 ? G_("missing name in %<#pragma target\(\"%s=\")%>")
 : G_("missing name in % attribute"),
 "arch");

The additional benefit of this approach is that it would also make
the quoting consistent with what seems to be the prevailing style
of these sorts of messages.  (It would be nice to eventually
converge on the same style/quoting and phrasing across all back
and front ends.)

Martin
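
The translation problem under discussion can be shown outside of GCC. In this sketch tr() is a hypothetical stand-in for a gettext-style lookup and simply returns its argument; the message text follows the wording in Steve's patch, and neither function is GCC code.

```cpp
#include <string>

// Hypothetical stand-in for a message-catalogue lookup. A real catalogue
// would return the translation of the complete msgid.
inline const char* tr(const char* msgid) { return msgid; }

// Hard to translate: the translator only sees a sentence fragment and
// cannot know which word will be appended or how to inflect around it.
std::string fragment_style(bool is_pragma) {
  std::string msg = tr("missing architecture name in 'arch' target ");
  msg += is_pragma ? "pragma" : "attribute";
  return msg;
}

// Translatable: each complete sentence is its own msgid, selected by a
// boolean, while there is still a single call site for the diagnostic.
std::string whole_sentence_style(bool is_pragma) {
  return is_pragma
             ? tr("missing architecture name in 'arch' target pragma")
             : tr("missing architecture name in 'arch' target attribute");
}
```

Both functions produce the same English text; the difference only shows up once tr() returns a translation, because the second form gives the translator two full sentences to work with.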



This patch is an update to the one I originally attached to the bug
report and I have fixed the issue that Frederic Marchal found in my
original patch.

OK to check in?

Steve Ellcey
sell...@cavium.com



2017-09-18  Steve Ellcey  

PR target/79868
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
Change argument type on aarch64_process_target_attr call.
* config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
Change argument type.
* config/aarch64/aarch64.c (aarch64_attribute_info): Change
field type.
(aarch64_handle_attr_arch): Change argument type, use boolean
argument to make different error calls.
(aarch64_handle_attr_cpu): Ditto.
(aarch64_handle_attr_tune): Ditto.
(aarch64_handle_attr_isa_flags): Ditto.
(aarch64_process_one_target_attr): Ditto.
(aarch64_process_target_attr): Ditto.
(aarch64_option_valid_attribute_p): Change argument type on
aarch64_process_target_attr call.






Re: [PATCH][x86] Knights Mill -march/-mtune options

2017-09-18 Thread Uros Bizjak
On Mon, Sep 18, 2017 at 12:42 PM, Peryt, Sebastian
 wrote:
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
>> Sent: Monday, September 18, 2017 12:23 PM
>> To: Peryt, Sebastian 
>> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
>> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
>>
>> On Mon, Sep 18, 2017 at 12:17 PM, Peryt, Sebastian
>>  wrote:
>> >> -Original Message-
>> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> >> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
>> >> Sent: Sunday, September 17, 2017 6:14 PM
>> >> To: Peryt, Sebastian 
>> >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
>> >> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
>> >>
>> >> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian
>> >> 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > This patch adds  options -march=/-mtune=knm for Knights Mill.
>> >> >
>> >> > 2017-09-14  Sebastian Peryt   gcc/
>> >> >
>> >> > * config.gcc: Support "knm".
>> >> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect 
>> >> > "knm".
>> >> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
>> >> > PROCESSOR_KNM.
>> >> > * config/i386/i386.c (m_KNM): Define.
>> >> > (processor_target_table): Add "knm".
>> >> > (PTA_KNM): Define.
>> >> > (ix86_option_override_internal): Add "knm".
>> >> > (ix86_issue_rate): Add PROCESSOR_KNM.
>> >> > (ix86_adjust_cost): Ditto.
>> >> > (ia32_multipass_dfa_lookahead): Ditto.
>> >> > (get_builtin_code_for_version): Handle PROCESSOR_KNM.
>> >> > (fold_builtin_cpu): Define M_INTEL_KNM.
>> >> > * config/i386/i386.h (TARGET_KNM): Define.
>> >> > (processor_type): Add PROCESSOR_KNM.
>> >> > * config/i386/x86-tune.def: Add m_KNM.
>> >> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.
>> >> >
>> >> >
>> >> > gcc/testsuite/
>> >> >
>> >> > * gcc.target/i386/funcspec-5.c: Test knm.
>> >> >
>> >> > Is it ok for trunk?
>> >>
>> >> You also have to update libgcc/cpuinfo.h together with
>> >> fold_builtin_cpu from i386.c. Please note that all new processor
>> >> types and subtypes have to be added at the end of the enum.
>> >>
>> >
>> > Uros,
>> >
>> > I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood
>> > that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of
>> > barrier, this is why I put KNM before that. Is that correct thinking?
>> > As for fold_builtin_cpu in i386.c I already have something like this:
>> >
>> > @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
>> >  M_AMDFAM15H,
>> >  M_INTEL_SILVERMONT,
>> >  M_INTEL_KNL,
>> > +M_INTEL_KNM,
>> >  M_AMD_BTVER1,
>> >  M_AMD_BTVER2,
>> >  M_CPU_SUBTYPE_START,
>> > @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
>> >{"bonnell", M_INTEL_BONNELL},
>> >{"silvermont", M_INTEL_SILVERMONT},
>> >{"knl", M_INTEL_KNL},
>> > +  {"knm", M_INTEL_KNM},
>> >{"amdfam10h", M_AMDFAM10H},
>> >{"barcelona", M_AMDFAM10H_BARCELONA},
>> >{"shanghai", M_AMDFAM10H_SHANGHAI},
>> >
>> > I couldn't find any other place where I'm supposed to add anything extra.
>>
>> Please look at libgcc/config/i386/cpuinfo.h. The comment here says that:
>>
>> /* Any new types or subtypes have to be inserted at the end. */
>>
>> The above patch should then add M_INTEL_KNM as the last entry *before*
>> M_CPU_SUBTYPE_START.
>>
>
> Sorry, I didn't notice this value at first. I believe now it's correct.

OK for mainline SVN (with updated ChangeLog).

Thanks,
Uros.
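
The "insert at the end" rule exists because the enumerator values are shared between the compiler and libgcc. A shortened, hypothetical version of the enum shows the effect; the names and values below are illustrative, and the assumption that subtypes are identified relative to the marker is stated as such in the comments.

```cpp
// Shortened, hypothetical model lists; not the real processor_model enum.
namespace before {
enum model { KNL = 1, BTVER1, BTVER2, SUBTYPE_START, NEHALEM };
}
namespace inserted_in_middle {
enum model { KNL = 1, KNM, BTVER1, BTVER2, SUBTYPE_START, NEHALEM };
}
namespace appended {
enum model { KNL = 1, BTVER1, BTVER2, KNM, SUBTYPE_START, NEHALEM };
}

// A value that was compiled into a binary against the old list.
constexpr int old_btver1 = before::BTVER1;

// Inserting KNM after KNL renumbers every CPU type that follows it.
constexpr bool middle_keeps_btver1 =
    (inserted_in_middle::BTVER1 == old_btver1);

// Appending KNM just before the marker leaves existing types untouched.
constexpr bool append_keeps_btver1 = (appended::BTVER1 == old_btver1);

// Assumption: subtypes are identified by their offset from the marker,
// so moving the marker itself does not change them.
constexpr bool append_keeps_subtype =
    (appended::NEHALEM - appended::SUBTYPE_START) ==
    (before::NEHALEM - before::SUBTYPE_START);

static_assert(!middle_keeps_btver1, "mid-list insertion shifts later types");
static_assert(append_keeps_btver1, "appending preserves existing types");
static_assert(append_keeps_subtype, "subtype offsets are unchanged");
```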


Re: [PR target/25512] Optimize certain equality tests on m68k

2017-09-18 Thread Andreas Schwab
I have checked in this patch to fix PR target/81613.

Andreas.

PR target/81613
* config/m68k/m68k.md (moveq feeding equality comparison): Check
that the registers are different.
---
 gcc/config/m68k/m68k.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index cd417e47a7..628e3889bb 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -7735,7 +7735,8 @@
   "peep2_reg_dead_p (2, operands[0])
&& peep2_reg_dead_p (2, operands[2])
&& (operands[3] == pc_rtx || operands[4] == pc_rtx)
-   && DATA_REG_P (operands[2])"
+   && DATA_REG_P (operands[2])
+   && !rtx_equal_p (operands[0], operands[2])"
   [(set (match_dup 2) (plus:SI (match_dup 2) (match_dup 6)))
(set (cc0) (compare (match_dup 2) (const_int 0)))
(set (pc) (if_then_else (match_op_dup 5 [(cc0) (const_int 0)])
-- 
2.14.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[PATCH, i386]: Add missing M_AMDFAM17H CPU type to fold_builtin_cpu

2017-09-18 Thread Uros Bizjak
We have to synchronize fold_builtin_cpu with libgcc.

2017-09-18  Uros Bizjak  

* config/i386/i386.c (fold_builtin_cpu): Add M_AMDFAM17H
to processor_model and "amdfam17h" to arch_names_table.
* doc/extend.texi (__builtin_cpu_is): Document amdfam17h CPU name.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1ed6f75..79454f5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -34418,6 +34418,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
 M_INTEL_KNL,
 M_AMD_BTVER1,
 M_AMD_BTVER2,
+M_AMDFAM17H,
 M_CPU_SUBTYPE_START,
 M_INTEL_COREI7_NEHALEM,
 M_INTEL_COREI7_WESTMERE,
@@ -34472,6 +34473,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
   {"bdver3", M_AMDFAM15H_BDVER3},
   {"bdver4", M_AMDFAM15H_BDVER4},
   {"btver2", M_AMD_BTVER2},
+  {"amdfam17h", M_AMDFAM17H},
   {"znver1", M_AMDFAM17H_ZNVER1},
 };
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index aa780a1..3efd398 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19913,8 +19913,11 @@ AMD Family 15h Bulldozer version 4.
 @item btver2
 AMD Family 16h CPU.
 
-@item znver1
+@item amdfam17h
 AMD Family 17h CPU.
+
+@item znver1
+AMD Family 17h Zen version 1.
 @end table
 
 Here is an example:


[committed] Add @findex for __builtin_shuffle (PR c/82234)

2017-09-18 Thread Jakub Jelinek
Hi!

Our docs were missing @findex entry for __builtin_shuffle, added thusly,
tested on x86_64-linux, committed to trunk.

2017-09-18  Jakub Jelinek  

PR c/82234
* doc/extend.texi: Add @findex entry for __builtin_shuffle.

--- gcc/doc/extend.texi.jj  2017-09-15 23:12:30.0 +0200
+++ gcc/doc/extend.texi 2017-09-18 10:37:32.166090507 +0200
@@ -9683,6 +9683,7 @@ For mixed operations between a scalar @c
 @code{s && v} is equivalent to @code{s?v!=0:0} (the evaluation is
 short-circuit) and @code{v && s} is equivalent to @code{v!=0 & (s?-1:0)}.
 
+@findex __builtin_shuffle
 Vector shuffling is available using functions
 @code{__builtin_shuffle (vec, mask)} and
 @code{__builtin_shuffle (vec0, vec1, mask)}.

Jakub
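
For reference, a small usage sketch of the two forms named in the documentation hunk above. It assumes GCC's vector extensions (Clang offers __builtin_shufflevector instead); the typedef and function names are made up for illustration.

```cpp
// GCC vector extension type: four 32-bit ints in one 16-byte vector.
typedef int v4si __attribute__((vector_size(16)));

// One-input form: element i of the result is vec[mask[i]].
v4si reverse(v4si vec) {
  v4si mask = {3, 2, 1, 0};
  return __builtin_shuffle(vec, mask);
}

// Two-input form: mask values 0..3 select from lo, 4..7 select from hi.
v4si interleave_low(v4si lo, v4si hi) {
  v4si mask = {0, 4, 1, 5};
  return __builtin_shuffle(lo, hi, mask);
}
```

The mask has to be an integer vector with the same element count and element width as the data vectors, which is why v4si is reused for it here.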


[C++ PATCH] Add a testcase for -std=c++1z

2017-09-18 Thread Jakub Jelinek
Hi!

On Fri, Sep 15, 2017 at 12:08:07PM -0400, Nathan Sidwell wrote:
> On 09/14/2017 04:26 PM, Jakub Jelinek wrote:
> > Hi!
> > 
> > Given https://herbsutter.com/2017/09/06/c17-is-formally-approved/
> > this patch makes -std=c++17 and -std=gnu++17 the documented options
> > and -std=c++1z and -std=gnu++1z deprecated aliases, adjusts diagnostics etc.
> > 
> > Bootstrapped/regtest on x86_64-linux and i686-linux, ok for trunk?
> > The changes in gcc/testsuite/ and libstdc++/testsuite apart from
> > *.exp files are just sed -i -e 's/1z/17/g' `find . -type f`.
> 
> I think the patch is good, modulo the issue Pedro pointed at.

After the patch we no longer have any testcases testing c++1z, so this patch
adds one.  It also tests that __cplusplus is equal to the value we want.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-09-18  Jakub Jelinek  

* g++.dg/cpp1z/cplusplus.C: Test that __cplusplus is equal to 201703L.
* g++.dg/cpp1z/cplusplus_1z.C: New test.

--- gcc/testsuite/g++.dg/cpp1z/cplusplus.C.jj   2017-09-15 18:11:04.0 +0200
+++ gcc/testsuite/g++.dg/cpp1z/cplusplus.C  2017-09-18 09:45:55.099541786 +0200
@@ -1,6 +1,6 @@
 // { dg-do compile }
 // { dg-options "-std=c++17" }
 
-#if __cplusplus <= 201402L
-#error "__cplusplus <= 201402L"
+#if __cplusplus != 201703L
+#error "__cplusplus != 201703L"
 #endif
--- gcc/testsuite/g++.dg/cpp1z/cplusplus_1z.C.jj   2017-09-18 09:46:19.351239666 +0200
+++ gcc/testsuite/g++.dg/cpp1z/cplusplus_1z.C   2017-09-18 09:46:26.730147741 +0200
@@ -0,0 +1,6 @@
+// { dg-do compile }
+// { dg-options "-std=c++1z" }
+
+#if __cplusplus != 201703L
+#error "__cplusplus != 201703L"
+#endif


Jakub


Re: [PATCH, rs6000 version 3] Add support for vec_xst_len_r() and vec_xl_len_r() builtins

2017-09-18 Thread Carl Love
GCC maintainers:

Addressed the comments from Segher about copying operands in
define_expand lxvll and stxvll.  Added new temp for the output of the
sldi instructions to give the allocator the freedom to select the
registers.  Removed constraints in the expanders.  Cleaned up issues
left over from the previous patch version.  Removed length attributes
that are now 4 rather than 8.

Tested on 
powerpc64le-unknown-linux-gnu (Power 9 LE),
powerpc64le-unknown-linux-gnu (Power 8 LE)  and
powerpc64le-unknown-linux-gnu (Power 8 BE) without regressions.

Please let me know if there are any additional issues to address.




2017-09-18  Carl Love  

* config/rs6000/rs6000-c.c (P9V_BUILTIN_VEC_XL_LEN_R,
P9V_BUILTIN_VEC_XST_LEN_R): Add support for builtins
vector unsigned char vec_xl_len_r (unsigned char *, size_t);
void vec_xst_len_r (vector unsigned char, unsigned char *, size_t);
* config/rs6000/altivec.h (vec_xl_len_r, vec_xst_len_r): Add defines.
* config/rs6000/rs6000-builtin.def (XL_LEN_R, XST_LEN_R): Add
definitions and overloading.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add case
statement for P9V_BUILTIN_XST_LEN_R.
(altivec_init_builtins): Add def_builtin for P9V_BUILTIN_STXVLL.
* config/rs6000/vsx.md (lxvll, stxvll, xl_len_r, xst_len_r): Add
define_expand and define_insn for the instructions and builtins.
* doc/extend.texi: Update the built-in documentation file for the new
built-in functions.
* config/rs6000/altivec.md (altivec_lvsl_reg, altivec_lvsr_reg): Add
define_insn for the instructions.

gcc/testsuite/ChangeLog:

2017-09-18  Carl Love  

* gcc.target/powerpc/builtins-5-p9-runnable.c: Add new runnable test file
for the new built-ins and the existing built-ins.
---
 gcc/config/rs6000/altivec.h|   2 +
 gcc/config/rs6000/altivec.md   |  20 +-
 gcc/config/rs6000/rs6000-builtin.def   |   4 +
 gcc/config/rs6000/rs6000-c.c   |   8 +
 gcc/config/rs6000/rs6000.c |  11 +-
 gcc/config/rs6000/vsx.md   |  64 +
 gcc/doc/extend.texi|   4 +
 .../gcc.target/powerpc/builtins-5-p9-runnable.c| 309 +
 8 files changed, 419 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-5-p9-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index c8e508cf0..94a4db24a 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -467,6 +467,8 @@
 #ifdef _ARCH_PPC64
 #define vec_xl_len __builtin_vec_lxvl
 #define vec_xst_len __builtin_vec_stxvl
+#define vec_xl_len_r __builtin_vec_xl_len_r
+#define vec_xst_len_r __builtin_vec_xst_len_r
 #endif
 
 #define vec_cmpnez __builtin_vec_vcmpnez
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 0aa1e3016..a01720545 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2542,6 +2542,15 @@
   DONE;
 })
 
+(define_insn "altivec_lvsl_reg"
+  [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
+   (unspec:V16QI
+   [(match_operand:DI 1 "gpc_reg_operand" "b")]
+   UNSPEC_LVSL_REG))]
+  "TARGET_ALTIVEC"
+  "lvsl %0,0,%1"
+  [(set_attr "type" "vecload")])
+
 (define_insn "altivec_lvsl_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
(unspec:V16QI [(match_operand:V16QI 1 "memory_operand" "Z")]
@@ -2551,7 +2560,7 @@
   [(set_attr "type" "vecload")])
 
 (define_expand "altivec_lvsr"
-  [(use (match_operand:V16QI 0 "register_operand" ""))
+  [(use (match_operand:V16QI 0 "altivec_register_operand" ""))
(use (match_operand:V16QI 1 "memory_operand" ""))]
   "TARGET_ALTIVEC"
 {
@@ -2574,6 +2583,15 @@
   DONE;
 })
 
+(define_insn "altivec_lvsr_reg"
+  [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
+   (unspec:V16QI
+   [(match_operand:DI 1 "gpc_reg_operand" "b")]
+   UNSPEC_LVSR_REG))]
+  "TARGET_ALTIVEC"
+  "lvsr %0,0,%1"
+  [(set_attr "type" "vecload")])
+
 (define_insn "altivec_lvsr_direct"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
(unspec:V16QI [(match_operand:V16QI 1 "memory_operand" "Z")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 850164a09..8f87ccea4 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2125,6 +2125,7 @@ BU_P9V_OVERLOAD_2 (VIESP, "insert_exp_sp")
 
 /* 2 argument vector functions added in ISA 3.0 (power9).  */
 BU_P9V_64BIT_VSX_2 (LXVL,  "lxvl", CONST,  lxvl)
+BU_P9V_64BIT_VSX_2 (XL_LEN_R,  "xl_len_r", CONST,  xl_len_r)
 
 BU_P9V_AV_2 (VEXTUBLX, "vextublx", CONST,  vextublx)
 BU_P9V_AV_2 

[PATCH][aarch64] Fix error calls in aarch64 code so they can be translated

2017-09-18 Thread Steve Ellcey
This patch is for PR target/79868, where some aarch64 diagnostics are
said to be not translatable due to how they are implemented.  See the
bug report for more details on why the current setup of passing
the string 'pragma' or 'attribute' doesn't work.

This patch fixes it, unfortunately by increasing the number of calls we
have to 'error' (16 calls become 32 calls), but that seems to be the
most straightforward way to get translatable strings.

This patch is an update to the one I originally attached to the bug
report and I have fixed the issue that Frederic Marchal found in my
original patch.

OK to check in?

Steve Ellcey
sell...@cavium.com



2017-09-18  Steve Ellcey  

PR target/79868
* config/aarch64/aarch64-c.c (aarch64_pragma_target_parse):
Change argument type on aarch64_process_target_attr call.
* config/aarch64/aarch64-protos.h (aarch64_process_target_attr):
Change argument type.
* config/aarch64/aarch64.c (aarch64_attribute_info): Change
field type.
(aarch64_handle_attr_arch): Change argument type, use boolean
argument to make different error calls.
(aarch64_handle_attr_cpu): Ditto.
(aarch64_handle_attr_tune): Ditto.
(aarch64_handle_attr_isa_flags): Ditto.
(aarch64_process_one_target_attr): Ditto.
(aarch64_process_target_attr): Ditto.
(aarch64_option_valid_attribute_p): Change argument type on
aarch64_process_target_attr call.


diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 177e638..c9945db 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -165,7 +165,7 @@ aarch64_pragma_target_parse (tree args, tree pop_target)
  information that it specifies.  */
   if (args)
 {
-  if (!aarch64_process_target_attr (args, "pragma"))
+  if (!aarch64_process_target_attr (args, true))
 	return false;
 
   aarch64_override_options_internal (&global_options);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index e67c2ed..4323e9e 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -445,7 +445,7 @@ bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE);
 
 void aarch64_init_builtins (void);
 
-bool aarch64_process_target_attr (tree, const char*);
+bool aarch64_process_target_attr (tree, bool);
 void aarch64_override_options_internal (struct gcc_options *);
 
 rtx aarch64_expand_builtin (tree exp,
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c14008..054b1d2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9554,7 +9554,7 @@ struct aarch64_attribute_info
   const char *name;
   enum aarch64_attr_opt_type attr_type;
   bool allow_neg;
-  bool (*handler) (const char *, const char *);
+  bool (*handler) (const char *, bool);
   enum opt_code opt_num;
 };
 
@@ -9562,7 +9562,7 @@ struct aarch64_attribute_info
PRAGMA_OR_ATTR is used in potential error messages.  */
 
 static bool
-aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_arch (const char *str, bool is_pragma)
 {
   const struct processor *tmp_arch = NULL;
   enum aarch64_parse_opt_result parse_res
@@ -9579,15 +9579,24 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
   switch (parse_res)
 {
   case AARCH64_PARSE_MISSING_ARG:
-	error ("missing architecture name in 'arch' target %s", pragma_or_attr);
+	if (is_pragma)
+	  error ("missing architecture name in 'arch' target pragma");
+	else
+	  error ("missing architecture name in 'arch' target attribute");
 	break;
   case AARCH64_PARSE_INVALID_ARG:
-	error ("unknown value %qs for 'arch' target %s", str, pragma_or_attr);
+	if (is_pragma)
+	  error ("unknown value %qs for 'arch' target pragma", str);
+	else
+	  error ("unknown value %qs for 'arch' target attribute", str);
 	aarch64_print_hint_for_arch (str);
 	break;
   case AARCH64_PARSE_INVALID_FEATURE:
-	error ("invalid feature modifier %qs for 'arch' target %s",
-	   str, pragma_or_attr);
+	if (is_pragma)
+	  error ("invalid feature modifier %qs for 'arch' target pragma", str);
+	else
+	  error ("invalid feature modifier %qs for 'arch' target attribute",
+		 str);
 	break;
   default:
 	gcc_unreachable ();
@@ -9600,7 +9609,7 @@ aarch64_handle_attr_arch (const char *str, const char *pragma_or_attr)
PRAGMA_OR_ATTR is used in potential error messages.  */
 
 static bool
-aarch64_handle_attr_cpu (const char *str, const char *pragma_or_attr)
+aarch64_handle_attr_cpu (const char *str, bool is_pragma)
 {
   const struct processor *tmp_cpu = NULL;
   enum aarch64_parse_opt_result parse_res
@@ -9620,15 +9629,24 @@ aarch64_handle_attr_cpu (const char *str, const char *pragma_or_attr)
   switch (parse_res)
 {
   case AARCH64_PARSE_MISSING_ARG:
-	error ("missing cpu name in 'cpu' target 

Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-18 Thread Janus Weil
2017-09-18 11:31 GMT+02:00 Dominique d'Humières :
> (1) real(16) is an order of magnitude slower than real(8) for the codes I 
> have tested (a long time ago). So its real utility is quite low.

I am fully aware that performance with quad-precision is lower than
with double precision. How much will certainly depend on the specifics
of the code in question. The flag I'm proposing would help in
evaluating this performance hit.


> (2) I think your time would be better used by dealing with your assigned PRs.

I think I can very well decide for myself where to waste my spare
time. There were actually times when I enjoyed contributing to
gfortran and reading this list very much, but recently it's really
becoming a PITA and I feel like I could spend my time on much nicer
things ...

Over and out,
Janus


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Tim Shen via gcc-patches
On Mon, Sep 18, 2017 at 10:26 AM, Jonathan Wakely  wrote:
>> We need to rewrite this to check the lengths are equal first, and then
>> call the 3-argument version of std::equal.
>>
>> Alternatively, we could move the implementation of the C++14
>> std::equal overloads to __equal and make that available for C++11.
>> I'll try that.
>
>
> Here's a proof of concept patch for that. It's a bit ugly.

Instead of having iterator tags in the interface, we can probe the
random-access-ness inside __equal4/__equal4_p, can't we? It's similar
to the existing "if (_RAIters()) { ... }".

I'd expect the patches to be renaming the current implementations and
adding wrappers, instead of adding new implementations.


-- 
Regards,
Tim Shen


Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-18 Thread Janus Weil
2017-09-18 16:08 GMT+02:00 Steve Kargl :
> On Mon, Sep 18, 2017 at 09:02:22AM +0200, Janus Weil wrote:
>> Hi Steve,
>>
>> >> attached is a (technically) simple patch that implements the compiler
>> >> flag "-fdefault-real-16" for gfortran.
>> >
>> > What about -fdefault-real-10?  If you're going to add bloat to the
>> > compiler, then you might as well do it right.
>>
>> well, yeah. If my only aim was to add bloat to the compiler out of
>> plain boredom and nastiness, then I might as well add
>> -fdefault-real-37. But I don't think that would be very useful.
>
> Why?  One gets 11-bits of additional precision (on most platforms)
> and a significant increase in the exponent range (+- ~1024 to
> +- ~16384).  REAL(10) maps to hardware floating point, which is
> faster than software quad precision.

Well, ok. If adding -fdefault-real-10 was a serious suggestion from
your side (which was not so easy to tell through all the sarcasm), I
can surely add that as well.

Cheers,
Janus
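
The precision and exponent figures quoted above can be read off the C++ side of the same hardware formats. This is only an analogy: it assumes that long double maps to the x87 extended format, which holds for GCC on x86 but not on every platform, and the struct and function names are made up for illustration.

```cpp
#include <limits>

// Mantissa width in bits (including the leading bit) and the binary
// exponent limit of a floating-point type.
struct FloatFormat {
  int mantissa_bits;
  int max_exponent;
};

template <typename T>
constexpr FloatFormat format_of() {
  return {std::numeric_limits<T>::digits,
          std::numeric_limits<T>::max_exponent};
}

constexpr FloatFormat kDouble = format_of<double>();
constexpr FloatFormat kLongDouble = format_of<long double>();
```

On a target where long double is the 80-bit x87 format, kLongDouble holds 64 mantissa bits and a max_exponent of 16384, against 53 and 1024 for an IEEE double, which matches the 11 extra bits and wider exponent range mentioned in the quoted message.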


Re: [PATCH 1/2] (header usage fix) remove unused system header includes

2017-09-18 Thread Mike Stump
I was hoping an RM would approve this as it seems just a hair beyond a normal 
darwin approval.  I'm fine with this, and it does help darwin.  Other ports 
should not care.

On Sep 18, 2017, at 10:30 AM, Jack Howarth  wrote:
> 
> Pinging for the final gcc 5.5 release.
> 
> On Mon, Aug 7, 2017 at 1:12 AM, Ryan Mounce  wrote:
>> 2017-08-05  Ryan Mounce  
>> 
>>cherry picked from trunk r235361
>>2016-04-22  Szabolcs Nagy  
>> 
>>* auto-profile.c: Remove  include.
>>    * diagnostic.c: Remove <new> include.
>>* genmatch.c: Likewise.
>>* pretty-print.c: Likewise.
>>* toplev.c: Likewise
>>* c/c-objc-common.c: Likewise.
>>* cp/error.c: Likewise.
>>* fortran/error.c: Likewise.
>> ---
>> gcc/ChangeLog | 14 ++
>> gcc/auto-profile.c|  1 -
>> gcc/c/c-objc-common.c |  2 --
>> gcc/cp/error.c|  2 --
>> gcc/diagnostic.c  |  2 --
>> gcc/fortran/error.c   |  2 --
>> gcc/genmatch.c|  1 -
>> gcc/pretty-print.c|  2 --
>> gcc/toplev.c  |  2 --
>> 9 files changed, 14 insertions(+), 14 deletions(-)
>> 
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 3b431ce83f4..f3280917ad8 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,17 @@
>> +2017-08-05  Ryan Mounce  
>> +
>> +   Backport from mainline
>> +   2016-04-22  Szabolcs Nagy  
>> +
>> +   * auto-profile.c: Remove  include.
>> +   * diagnostic.c: Remove <new> include.
>> +   * genmatch.c: Likewise.
>> +   * pretty-print.c: Likewise.
>> +   * toplev.c: Likewise
>> +   * c/c-objc-common.c: Likewise.
>> +   * cp/error.c: Likewise.
>> +   * fortran/error.c: Likewise.
>> +
>> 2017-07-31  Jakub Jelinek  
>> 
>>PR sanitizer/81604
>> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
>> index b8b02d174b4..a5e7225e338 100644
>> --- a/gcc/auto-profile.c
>> +++ b/gcc/auto-profile.c
>> @@ -21,7 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>> #include "config.h"
>> #include "system.h"
>> 
>> -#include 
>> #include 
>> #include 
>> 
>> diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
>> index 344d4e2949c..c1ec601f93c 100644
>> --- a/gcc/c/c-objc-common.c
>> +++ b/gcc/c/c-objc-common.c
>> @@ -38,8 +38,6 @@ along with GCC; see the file COPYING3.  If not see
>> #include "langhooks.h"
>> #include "c-objc-common.h"
>> 
>> -#include <new>  // For placement new.
>> -
>> static bool c_tree_printer (pretty_printer *, text_info *, const char *,
>>int, bool, bool, bool);
>> 
>> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
>> index 0c8bd66a325..f502127f34f 100644
>> --- a/gcc/cp/error.c
>> +++ b/gcc/cp/error.c
>> @@ -44,8 +44,6 @@ along with GCC; see the file COPYING3.  If not see
>> #include "ubsan.h"
>> #include "internal-fn.h"
>> 
>> -#include <new> // For placement-new.
>> -
>> #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
>> #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
>> 
>> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
>> index c43162269ec..1c3815c9f3d 100644
>> --- a/gcc/diagnostic.c
>> +++ b/gcc/diagnostic.c
>> @@ -41,8 +41,6 @@ along with GCC; see the file COPYING3.  If not see
>> # include 
>> #endif
>> 
>> -#include <new> // For placement new.
>> -
>> #define pedantic_warning_kind(DC)  \
>>   ((DC)->pedantic_errors ? DK_ERROR : DK_WARNING)
>> #define permissive_error_kind(DC) ((DC)->permissive ? DK_WARNING : DK_ERROR)
>> diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
>> index 18e127f8748..2f76de50c9e 100644
>> --- a/gcc/fortran/error.c
>> +++ b/gcc/fortran/error.c
>> @@ -34,8 +34,6 @@ along with GCC; see the file COPYING3.  If not see
>> #include "diagnostic-color.h"
>> #include "tree-diagnostic.h" /* tree_diagnostics_defaults */
>> 
>> -#include <new> /* For placement-new */
>> -
>> static int suppress_errors = 0;
>> 
>> static bool warnings_not_errors = false;
>> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
>> index 8f94ff09263..8f495616e2e 100644
>> --- a/gcc/genmatch.c
>> +++ b/gcc/genmatch.c
>> @@ -22,7 +22,6 @@ along with GCC; see the file COPYING3.  If not see
>> .  */
>> 
>> #include "bconfig.h"
>> -#include 
>> #include "system.h"
>> #include "coretypes.h"
>> #include "ggc.h"
>> diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
>> index 78d334eae88..6881d1aeabe 100644
>> --- a/gcc/pretty-print.c
>> +++ b/gcc/pretty-print.c
>> @@ -25,8 +25,6 @@ along with GCC; see the file COPYING3.  If not see
>> #include "pretty-print.h"
>> #include "diagnostic-color.h"
>> 
>> -#include <new> // For placement-new.
>> -
>> #if HAVE_ICONV
>> #include 
>> #endif
>> diff --git a/gcc/toplev.c b/gcc/toplev.c
>> index 17d05121026..237e24ef34e 100644
>> 

Re: [PATCH 0/2] backport c++ header fixes to gcc-5-branch

2017-09-18 Thread Mike Stump
I was hoping an RM would approve this as it seems just a hair beyond a normal 
darwin approval.  I'm fine with this, and it does help darwin.  Other ports 
should not care.

On Sep 18, 2017, at 10:31 AM, Jack Howarth  wrote:
> 
> Pinging for the final gcc 5.5 release.
> 
> On Mon, Aug 7, 2017 at 1:12 AM, Ryan Mounce  wrote:
>> Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81037
>> 
>> Bootstrap now succeeds using Xcode 9 toolchain.
>> 
>> Tested on macOS 10.13 beta, however same issue reported on macOS 10.12
>> with Xcode 9.
>> 
>> Ryan Mounce (2):
>>  (header usage fix) remove unused system header includes
>>  (header usage fix) include c++ headers in system.h
>> 
>> gcc/ChangeLog| 29 +
>> gcc/auto-profile.c   |  6 ++
>> gcc/c/c-objc-common.c|  2 --
>> gcc/config/sh/sh.c   |  2 +-
>> gcc/config/sh/sh_treg_combine.cc |  7 +++
>> gcc/cp/error.c   |  2 --
>> gcc/diagnostic.c |  2 --
>> gcc/fortran/error.c  |  2 --
>> gcc/fortran/trans-common.c   |  2 +-
>> gcc/genmatch.c   |  1 -
>> gcc/graphite-isl-ast-to-gimple.c |  2 +-
>> gcc/ipa-icf-gimple.c |  2 +-
>> gcc/ipa-icf.c|  2 +-
>> gcc/pretty-print.c   |  2 --
>> gcc/system.h | 12 
>> gcc/toplev.c |  2 --
>> 16 files changed, 51 insertions(+), 26 deletions(-)
>> 
>> --
>> 2.13.2 (Apple Git-90)
>> 





C++ PATCH for c++/82069, ICE with lambda in template

2017-09-18 Thread Jason Merrill
The code that avoids all implicit capture in template contexts got
confused by fold_non_dependent_expr, which temporarily clears
processing_template_decl.  Fixed by using uses_template_parms instead,
which does not depend on the current value of
processing_template_decl.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 985b8e5fde47ac972db3e7b25c0ef25980bc0a05
Author: Jason Merrill 
Date:   Sat Sep 16 07:45:02 2017 -0400

PR c++/82069

* semantics.c (process_outer_var_ref): Check uses_template_parms
instead of any_dependent_template_arguments_p.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 4f4c17f853d..3a3ae55aa44 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3347,8 +3347,7 @@ process_outer_var_ref (tree decl, tsubst_flags_t complain)
  time to implicitly capture.  */
   if (context == containing_function
   && DECL_TEMPLATE_INFO (containing_function)
-  && any_dependent_template_arguments_p (DECL_TI_ARGS
-(containing_function)))
+  && uses_template_parms (DECL_TI_ARGS (containing_function)))
 return decl;
 
   /* Core issue 696: "[At the July 2009 meeting] the CWG expressed
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template15.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template15.C
new file mode 100644
index 000..4da64f27fd8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template15.C
@@ -0,0 +1,11 @@
+// PR c++/82069
+// { dg-do compile { target c++11 } }
+
+struct A {
+  void foo(int *);
+};
+struct B : A {
+  template  void bar(int *p1) {
+[&] { foo(p1); };
+  }
+};


Re: [PATCH 0/2] backport c++ header fixes to gcc-5-branch

2017-09-18 Thread Jack Howarth
Pinging for the final gcc 5.5 release.

On Mon, Aug 7, 2017 at 1:12 AM, Ryan Mounce  wrote:
> Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81037
>
> Bootstrap now succeeds using Xcode 9 toolchain.
>
> Tested on macOS 10.13 beta, however same issue reported on macOS 10.12
> with Xcode 9.
>
> Ryan Mounce (2):
>   (header usage fix) remove unused system header includes
>   (header usage fix) include c++ headers in system.h
>
>  gcc/ChangeLog| 29 +
>  gcc/auto-profile.c   |  6 ++
>  gcc/c/c-objc-common.c|  2 --
>  gcc/config/sh/sh.c   |  2 +-
>  gcc/config/sh/sh_treg_combine.cc |  7 +++
>  gcc/cp/error.c   |  2 --
>  gcc/diagnostic.c |  2 --
>  gcc/fortran/error.c  |  2 --
>  gcc/fortran/trans-common.c   |  2 +-
>  gcc/genmatch.c   |  1 -
>  gcc/graphite-isl-ast-to-gimple.c |  2 +-
>  gcc/ipa-icf-gimple.c |  2 +-
>  gcc/ipa-icf.c|  2 +-
>  gcc/pretty-print.c   |  2 --
>  gcc/system.h | 12 
>  gcc/toplev.c |  2 --
>  16 files changed, 51 insertions(+), 26 deletions(-)
>
> --
> 2.13.2 (Apple Git-90)
>


Re: [PATCH 1/2] (header usage fix) remove unused system header includes

2017-09-18 Thread Jack Howarth
Pinging for the final gcc 5.5 release.

On Mon, Aug 7, 2017 at 1:12 AM, Ryan Mounce  wrote:
> 2017-08-05  Ryan Mounce  
>
> cherry picked from trunk r235361
> 2016-04-22  Szabolcs Nagy  
>
> * auto-profile.c: Remove  include.
> * diagnostic.c: Remove  include.
> * genmatch.c: Likewise.
> * pretty-print.c: Likewise.
> * toplev.c: Likewise
> * c/c-objc-common.c: Likewise.
> * cp/error.c: Likewise.
> * fortran/error.c: Likewise.
> ---
>  gcc/ChangeLog | 14 ++
>  gcc/auto-profile.c|  1 -
>  gcc/c/c-objc-common.c |  2 --
>  gcc/cp/error.c|  2 --
>  gcc/diagnostic.c  |  2 --
>  gcc/fortran/error.c   |  2 --
>  gcc/genmatch.c|  1 -
>  gcc/pretty-print.c|  2 --
>  gcc/toplev.c  |  2 --
>  9 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3b431ce83f4..f3280917ad8 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,17 @@
> +2017-08-05  Ryan Mounce  
> +
> +   Backport from mainline
> +   2016-04-22  Szabolcs Nagy  
> +
> +   * auto-profile.c: Remove  include.
> +   * diagnostic.c: Remove  include.
> +   * genmatch.c: Likewise.
> +   * pretty-print.c: Likewise.
> +   * toplev.c: Likewise
> +   * c/c-objc-common.c: Likewise.
> +   * cp/error.c: Likewise.
> +   * fortran/error.c: Likewise.
> +
>  2017-07-31  Jakub Jelinek  
>
> PR sanitizer/81604
> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
> index b8b02d174b4..a5e7225e338 100644
> --- a/gcc/auto-profile.c
> +++ b/gcc/auto-profile.c
> @@ -21,7 +21,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "config.h"
>  #include "system.h"
>
> -#include 
>  #include 
>  #include 
>
> diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
> index 344d4e2949c..c1ec601f93c 100644
> --- a/gcc/c/c-objc-common.c
> +++ b/gcc/c/c-objc-common.c
> @@ -38,8 +38,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "langhooks.h"
>  #include "c-objc-common.h"
>
> -#include <new>  // For placement new.
> -
>  static bool c_tree_printer (pretty_printer *, text_info *, const char *,
> int, bool, bool, bool);
>
> diff --git a/gcc/cp/error.c b/gcc/cp/error.c
> index 0c8bd66a325..f502127f34f 100644
> --- a/gcc/cp/error.c
> +++ b/gcc/cp/error.c
> @@ -44,8 +44,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ubsan.h"
>  #include "internal-fn.h"
>
> -#include <new> // For placement-new.
> -
>  #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
>  #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
>
> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index c43162269ec..1c3815c9f3d 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -41,8 +41,6 @@ along with GCC; see the file COPYING3.  If not see
>  # include 
>  #endif
>
> -#include <new> // For placement new.
> -
>  #define pedantic_warning_kind(DC)  \
>((DC)->pedantic_errors ? DK_ERROR : DK_WARNING)
>  #define permissive_error_kind(DC) ((DC)->permissive ? DK_WARNING : DK_ERROR)
> diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
> index 18e127f8748..2f76de50c9e 100644
> --- a/gcc/fortran/error.c
> +++ b/gcc/fortran/error.c
> @@ -34,8 +34,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "diagnostic-color.h"
>  #include "tree-diagnostic.h" /* tree_diagnostics_defaults */
>
> -#include <new> /* For placement-new */
> -
>  static int suppress_errors = 0;
>
>  static bool warnings_not_errors = false;
> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> index 8f94ff09263..8f495616e2e 100644
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -22,7 +22,6 @@ along with GCC; see the file COPYING3.  If not see
>  .  */
>
>  #include "bconfig.h"
> -#include 
>  #include "system.h"
>  #include "coretypes.h"
>  #include "ggc.h"
> diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
> index 78d334eae88..6881d1aeabe 100644
> --- a/gcc/pretty-print.c
> +++ b/gcc/pretty-print.c
> @@ -25,8 +25,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "pretty-print.h"
>  #include "diagnostic-color.h"
>
> -#include <new> // For placement-new.
> -
>  #if HAVE_ICONV
>  #include 
>  #endif
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 17d05121026..237e24ef34e 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -135,8 +135,6 @@ along with GCC; see the file COPYING3.  If not see
>  #define HAVE_prologue 0
>  #endif
>
> -#include 
> -
>  static void general_init (const char *, bool);
>  static void do_compile ();
>  static void process_options (void);
> --
> 2.13.2 (Apple Git-90)
>


Re: [libstdc++/71500] make back reference work with icase

2017-09-18 Thread Jonathan Wakely

On 15/09/17 16:39 +0100, Jonathan Wakely wrote:

On 04/09/17 03:31 -0700, Tim Shen via libstdc++ wrote:

This fixes the follow-up comments in 71500.

Back-reference matching is different from other matching, as the
content the back-reference refers to is at "run-time", aka during
regex_match(), not regex() compilation.

For compilation we do have an abstraction layer to catch all
comparison customizations, namely _M_translator in regex_compiler.h.
Until this patch, we had no abstraction for "run-time"
matching. I believe that back-reference is the only place that needs
run-time matching, so I just built a _Backref_matcher in
regex_executor.tcc.
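
To illustrate the distinction with the public API only (a minimal sketch, not part of the patch): the icase flag is fixed when the std::regex is constructed, but the text that \1 has to match is known only while regex_match() runs. The helper name doubled_ignoring_case is made up for this illustration. A mixed-case input such as "aA" is expected to match only with a library that compares back-references case-insensitively, which is the behaviour this patch aims for.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, for illustration only: true when `text` is one
// character followed by a back-reference to it, compared with icase.
bool doubled_ignoring_case(const std::string& text)
{
  // The flag is chosen here, at compile (construction) time of the regex...
  const std::regex re("(.)\\1", std::regex::icase);
  // ...but what group 1 captured is only known here, during matching.
  return std::regex_match(text, re);
}

// Usage sketch covering the cases that do not depend on the fix.
void backref_example()
{
  assert(doubled_ignoring_case("aa"));
  assert(!doubled_ignoring_case("ab"));
}
```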

Tested on x86_64-linux-gnu.

Thanks!

--
Regards,
Tim Shen



commit a97b7fecd319e031ffc489a956b8cf3dc63eeb26
Author: Tim Shen 
Date:   Mon Sep 4 03:19:35 2017 -0700

  PR libstdc++/71500
  * include/bits/regex_executor.tcc: Support icase in
  regex_traits<...> for back reference matches.
  * testsuite/28_regex/regression.cc: Test case.

diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 226e05856e1..f6149fecf9d 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -335,6 +335,54 @@ namespace __detail
  _M_states._M_queue(__state._M_next, _M_cur_results);
   }

+  template<typename _BiIter, typename _TraitsT>
+struct _Backref_matcher
+{
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   return _M_traits.transform(__expected_begin, __expected_end)
+   == _M_traits.transform(__actual_begin, __actual_end);
+  }
+
+  const _TraitsT& _M_traits;
+};
+
+  template<typename _BiIter, typename _CharT>
+struct _Backref_matcher<_BiIter, std::regex_traits<_CharT>>
+{
+  using _TraitsT = std::regex_traits<_CharT>;
+  _Backref_matcher(bool __icase, const _TraitsT& __traits)
+  : _M_icase(__icase), _M_traits(__traits) { }
+
+  bool
+  _M_apply(_BiIter __expected_begin,
+  _BiIter __expected_end, _BiIter __actual_begin,
+  _BiIter __actual_end)
+  {
+   if (!_M_icase)
+ return std::equal(__expected_begin, __expected_end,
+   __actual_begin, __actual_end);


This is only valid in C++14 and higher, because the 4-argument version
of std::equal isn't present in C++11.


+   typedef std::ctype<_CharT> __ctype_type;
+   const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
+   return std::equal(__expected_begin, __expected_end,
+ __actual_begin, __actual_end,


Same here.


+ [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+ {
+   return __fctyp.tolower(__lhs)
+   == __fctyp.tolower(__rhs);
+ });


We need to rewrite this to check the lengths are equal first, and then
call the 3-argument version of std::equal.

Alternatively, we could move the implementation of the C++14
std::equal overloads to __equal and make that available for C++11.
I'll try that.
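
The first option could look roughly like the following. This is a minimal sketch in user-level C++11, not libstdc++ code: the names ranges_equal and equal_ignoring_case are invented for the illustration, and plain std::tolower stands in for the ctype facet used in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical C++11 stand-in for the C++14 four-iterator std::equal.
template<typename It1, typename It2, typename Pred>
bool ranges_equal(It1 first1, It1 last1, It2 first2, It2 last2, Pred pred)
{
  // The 3-argument form reads as many elements from the second range as
  // the first range holds, so the lengths have to be compared up front.
  if (std::distance(first1, last1) != std::distance(first2, last2))
    return false;
  return std::equal(first1, last1, first2, pred);
}

// Case-insensitive comparison built on the helper above.
bool equal_ignoring_case(const std::string& a, const std::string& b)
{
  return ranges_equal(a.begin(), a.end(), b.begin(), b.end(),
                      [](char l, char r) {
                        return std::tolower(static_cast<unsigned char>(l))
                            == std::tolower(static_cast<unsigned char>(r));
                      });
}

// Usage sketch.
void equal_example()
{
  assert(equal_ignoring_case("abc", "ABC"));
  assert(!equal_ignoring_case("abc", "ABCD"));
}
```

For iterators that are not random-access, std::distance costs an extra pass over each range before the comparison starts.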


Here's a proof of concept patch for that. It's a bit ugly.


diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index f6149fecf9d..4b185cc9d1e 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -366,17 +366,21 @@ namespace __detail
 	   _BiIter __actual_end)
   {
 	if (!_M_icase)
-	  return std::equal(__expected_begin, __expected_end,
-			__actual_begin, __actual_end);
+	  return std::__equal4(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end,
+			   std::__iterator_category(__expected_begin),
+			   std::__iterator_category(__actual_begin));
 	typedef std::ctype<_CharT> __ctype_type;
 	const auto& __fctyp = use_facet<__ctype_type>(_M_traits.getloc());
-	return std::equal(__expected_begin, __expected_end,
-			  __actual_begin, __actual_end,
-			  [this, &__fctyp](_CharT __lhs, _CharT __rhs)
-			  {
-			return __fctyp.tolower(__lhs)
-== __fctyp.tolower(__rhs);
-			  });
+	return std::__equal4_p(__expected_begin, __expected_end,
+			   __actual_begin, __actual_end,
+			   [this, &__fctyp](_CharT __lhs, _CharT __rhs)
+			   {
+ return __fctyp.tolower(__lhs)
+   == __fctyp.tolower(__rhs);
+			   },
+			   std::__iterator_category(__expected_begin),
+			   std::__iterator_category(__actual_begin));
   }
 
   bool _M_icase;
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index f68ecb22b82..b7848b3de99 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ 

Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Jeff Law
On 09/18/2017 10:09 AM, Andreas Schwab wrote:
> On Sep 18 2017, Jeff Law  wrote:
> 
>> Can you confirm if the probe was in the red zone vs the live areas on
>> the stack?
> 
> It overwrites a nearby variable.  sp + 8 happens to be the address of
> file_entries_new_size.
> 
>0x000140e8 <+1172>:  mov r6, sp
>0x000140ec <+1176>:  add r3, r3, #7
>0x000140f0 <+1180>:  bic r3, r3, #7
>0x000140f4 <+1184>:  cmp r3, #4096   ; 0x1000
>0x000140f8 <+1188>:  bcc 0x14110 
>0x000140fc <+1192>:  sub r3, r3, #4096   ; 0x1000
>0x00014100 <+1196>:  sub sp, sp, #4096   ; 0x1000
>0x00014104 <+1200>:  cmp r3, #4096   ; 0x1000
>0x00014108 <+1204>:  str r0, [sp, #8]
>0x0001410c <+1208>:  bcs 0x140fc 
>0x00014110 <+1212>:  ldr r7, [r11, #-56] ; 0xffc8
>0x00014114 <+1216>:  sub sp, sp, r3
>0x00014118 <+1220>:  mov r1, #0
>0x0001411c <+1224>:  add r3, sp, #8
>0x00014120 <+1228>:  mov r0, r3
> => 0x00014124 <+1232>:  str r0, [sp, #8]
> 
> Andreas.
> 
Or better yet, include your .i and .s files in their entirety and the
gcc -v output.

Jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Jeff Law
On 09/18/2017 10:09 AM, Andreas Schwab wrote:
> On Sep 18 2017, Jeff Law  wrote:
> 
>> Can you confirm if the probe was in the red zone vs the live areas on
>> the stack?
> 
> It overwrites a nearby variable.  sp + 8 happens to be the address of
> file_entries_new_size.
> 
>0x000140e8 <+1172>:  mov r6, sp
>0x000140ec <+1176>:  add r3, r3, #7
>0x000140f0 <+1180>:  bic r3, r3, #7
>0x000140f4 <+1184>:  cmp r3, #4096   ; 0x1000
>0x000140f8 <+1188>:  bcc 0x14110 
>0x000140fc <+1192>:  sub r3, r3, #4096   ; 0x1000
>0x00014100 <+1196>:  sub sp, sp, #4096   ; 0x1000
>0x00014104 <+1200>:  cmp r3, #4096   ; 0x1000
>0x00014108 <+1204>:  str r0, [sp, #8]
>0x0001410c <+1208>:  bcs 0x140fc 
>0x00014110 <+1212>:  ldr r7, [r11, #-56] ; 0xffc8
>0x00014114 <+1216>:  sub sp, sp, r3
>0x00014118 <+1220>:  mov r1, #0
>0x0001411c <+1224>:  add r3, sp, #8
>0x00014120 <+1228>:  mov r0, r3
> => 0x00014124 <+1232>:  str r0, [sp, #8]
What is your exact configure target for gcc and glibc?  Additionally,
what's the git hash id of your glibc source tree?


I can't see how probing at sp+8 is ever valid here.  But the code I get
when compiling cache.i is significantly different than what you're
providing.  What I see is a probe at sp-4.

There's clearly something weird going on here.

jeff


[AArch64] PR71307: Define union class of POINTER+FP

2017-09-18 Thread Richard Sandiford
ALL_REGS doesn't function as a union class of POINTER_REGS and FP_REGS
since it includes the CC register as well.  REGNO_REG_CLASS (CC_REGNUM)
is NO_REGS, but of course NO_REGS rightly doesn't include CC_REGNUM.

Adding a union class for POINTER+FP allows the RA to use it as the
preferred or alternative class of a pseudo.  It also works as a
union class of GENERAL+FP for modes that aren't allowed in SP.
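
As a toy illustration of why ALL_REGS is not the union class, register classes can be modelled as bitmasks. The bit assignments below are invented for this sketch and are not the real REG_CLASS_CONTENTS values; only the class names come from the patch.

```cpp
#include <cassert>
#include <cstdint>

namespace toy {
// One bit per register group; purely illustrative.
constexpr std::uint32_t GENERAL = 1u << 0;  // general-purpose registers
constexpr std::uint32_t SP      = 1u << 1;  // stack pointer
constexpr std::uint32_t FP      = 1u << 2;  // FP/SIMD registers
constexpr std::uint32_t CC      = 1u << 3;  // condition-code register

constexpr std::uint32_t POINTER_REGS        = GENERAL | SP;
constexpr std::uint32_t FP_REGS             = FP;
constexpr std::uint32_t POINTER_AND_FP_REGS = POINTER_REGS | FP_REGS;
constexpr std::uint32_t ALL_REGS            = GENERAL | SP | FP | CC;

// True when `cls` contains exactly the registers of `a` and `b`.
constexpr bool is_exact_union(std::uint32_t cls, std::uint32_t a,
                              std::uint32_t b)
{
  return cls == (a | b);
}
}  // namespace toy

// Usage sketch.
void regclass_example()
{
  // ALL_REGS also carries the CC bit, so it is a superset, not the union.
  assert(!toy::is_exact_union(toy::ALL_REGS, toy::POINTER_REGS, toy::FP_REGS));
  assert(toy::is_exact_union(toy::POINTER_AND_FP_REGS, toy::POINTER_REGS,
                             toy::FP_REGS));
}
```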

This is also needed for the SVE port, which adds predicate registers
to the mix.

The combination of r252033 and this patch fixes PR71307.  Tested on
aarch64-linux-gnu.  Also tested on SPEC2k6, where there were no
differences outside the (mostly low) noise.  OK to install?

The main potential disadvantage I can see is that the -fsched-pressure
code isn't very good at handling union classes: it generally just updates
one pressure class for each pseudo.  I haven't found any specific examples
of that causing problems though.

Thanks,
Richard


2017-09-15  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
PR target/71307
* config/aarch64/aarch64.h (POINTER_AND_FP_REGS): New reg class.
(REG_CLASS_NAMES, REG_CLASS_CONTENTS): Update accordingly.
* config/aarch64/aarch64.c (aarch64_class_max_nregs): Handle
POINTER_AND_FP_REGS.

gcc/testsuite/
PR target/71307
* gcc.target/aarch64/vect_copy_lane_1.c: Remove XFAIL.

Index: gcc/config/aarch64/aarch64.h
===
--- gcc/config/aarch64/aarch64.h	2017-09-15 14:47:33.167333414 +0100
+++ gcc/config/aarch64/aarch64.h	2017-09-18 17:31:34.720209011 +0100
@@ -452,6 +452,7 @@ enum reg_class
   POINTER_REGS,
   FP_LO_REGS,
   FP_REGS,
+  POINTER_AND_FP_REGS,
   ALL_REGS,
   LIM_REG_CLASSES  /* Last */
 };
@@ -467,6 +468,7 @@ #define REG_CLASS_NAMES \
   "POINTER_REGS",  \
   "FP_LO_REGS",\
   "FP_REGS",   \
+  "POINTER_AND_FP_REGS",   \
   "ALL_REGS"   \
 }
 
@@ -479,6 +481,7 @@ #define REG_CLASS_CONTENTS \
   { 0x, 0x, 0x0003 },  /* POINTER_REGS */  \
   { 0x, 0x, 0x },   /* FP_LO_REGS  */  \
   { 0x, 0x, 0x },   /* FP_REGS  */ \
+  { 0x, 0x, 0x0003 },  /* POINTER_AND_FP_REGS */\
   { 0x, 0x, 0x0007 }   /* ALL_REGS */  \
 }
 
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c	2017-09-18 14:58:24.012256423 +0100
+++ gcc/config/aarch64/aarch64.c	2017-09-18 17:31:34.720209011 +0100
@@ -6009,6 +6009,7 @@ aarch64_class_max_nregs (reg_class_t reg
 case POINTER_REGS:
 case GENERAL_REGS:
 case ALL_REGS:
+case POINTER_AND_FP_REGS:
 case FP_REGS:
 case FP_LO_REGS:
   return
Index: gcc/testsuite/gcc.target/aarch64/vect_copy_lane_1.c
===
--- gcc/testsuite/gcc.target/aarch64/vect_copy_lane_1.c 2016-11-22 21:16:00.0 +
+++ gcc/testsuite/gcc.target/aarch64/vect_copy_lane_1.c 2017-09-18 17:31:34.720209011 +0100
@@ -45,8 +45,7 @@ BUILD_TEST (uint32x2_t,  uint32x4_t,  ,
 BUILD_TEST (float64x1_t, float64x2_t, , q, f64, 0, 1)
 BUILD_TEST (int64x1_t,  int64x2_t,, q, s64, 0, 1)
 BUILD_TEST (uint64x1_t, uint64x2_t,   , q, u64, 0, 1)
-/* XFAIL due to PR 71307.  */
-/* { dg-final { scan-assembler-times "dup\\td0, v1.d\\\[1\\\]" 3 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "dup\\td0, v1.d\\\[1\\\]" 3 } } */
 
 /* vcopyq_lane.  */
 BUILD_TEST (poly8x16_t, poly8x8_t, q, , p8, 15, 7)


Ping for some "make more use of ..." patches

2017-09-18 Thread Richard Sandiford
Ping for some minor cleanups that help with the move to variable-length
modes.  Segher has approved the combine.c parts (thanks).

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01339.html
Make more use of HWI_COMPUTABLE_MODE_P

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01452.html
Make more use of read_modify_subreg_p

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01343.html
Make more use of subreg_lowpart_offset

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01344.html
Make more use of subreg_size_lowpart_offset

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01345.html
Make more use of byte_lowpart_offset

Thanks,
Richard


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Andreas Schwab
On Sep 18 2017, Jeff Law  wrote:

> Can you confirm if the probe was in the red zone vs the live areas on
> the stack?

It overwrites a nearby variable.  sp + 8 happens to be the address of
file_entries_new_size.

   0x000140e8 <+1172>:  mov r6, sp
   0x000140ec <+1176>:  add r3, r3, #7
   0x000140f0 <+1180>:  bic r3, r3, #7
   0x000140f4 <+1184>:  cmp r3, #4096   ; 0x1000
   0x000140f8 <+1188>:  bcc 0x14110 
   0x000140fc <+1192>:  sub r3, r3, #4096   ; 0x1000
   0x00014100 <+1196>:  sub sp, sp, #4096   ; 0x1000
   0x00014104 <+1200>:  cmp r3, #4096   ; 0x1000
   0x00014108 <+1204>:  str r0, [sp, #8]
   0x0001410c <+1208>:  bcs 0x140fc 
   0x00014110 <+1212>:  ldr r7, [r11, #-56] ; 0xffc8
   0x00014114 <+1216>:  sub sp, sp, r3
   0x00014118 <+1220>:  mov r1, #0
   0x0001411c <+1224>:  add r3, sp, #8
   0x00014120 <+1228>:  mov r0, r3
=> 0x00014124 <+1232>:  str r0, [sp, #8]

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Jeff Law
On 09/18/2017 03:29 AM, Andreas Schwab wrote:
> On Jul 30 2017, Jeff Law  wrote:
> 
>> This patch introduces generic mechanisms to protect the dynamically
>> allocated stack space against stack-clash attacks.
>>
>> Changes since V2:
>>
>> Dynamic allocations can be emitted as unrolled inlined probes or with a
>> rotated loop.  Blockage insns are also properly emitted for the dynamic
>> area probes and the dynamic area probing now supports targets that may
>> make optimistic assumptions in their prologues.  Finally it uses the new
>> param to control the probing interval.
>>
>> Tests were updated to explicitly specify the guard and probing interval.
>>  New test to check inline/unrolled probes as well as rotated loop.
> 
> Does that work correctly when the VLA is smaller than the probe size
> (word_mode by default)?  I see a failure in glibc on armv7 where
> ldconfig is using a zero-size VLA, which is invalid in C, but it could
> also end up using a VLA of size 1.
And in the case where the allocation is nonzero, but less than a word,
isn't its size rounded up to STACK_BOUNDARY, which I'd expect to be a word?
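
The rounding in question is the usual round-up-to-a-power-of-two, the same arithmetic as the "add r3, r3, #7" / "bic r3, r3, #7" pair in the disassembly posted earlier in the thread. A minimal sketch follows; the function name round_up and the choice of 8 as the alignment are assumptions for the illustration. Rounding lifts a size of 1 to a full unit but leaves a size of 0 at 0, so the zero-size case is not covered by rounding alone.

```cpp
#include <cassert>
#include <cstdint>

// Round `size` up to a multiple of `align`; `align` must be a power of two.
constexpr std::uint32_t round_up(std::uint32_t size, std::uint32_t align)
{
  return (size + (align - 1)) & ~(align - 1);
}

// Usage sketch.
void round_up_example()
{
  assert(round_up(1, 8) == 8);  // a tiny allocation becomes one full unit
  assert(round_up(0, 8) == 0);  // a zero-size allocation stays empty
}
```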

Jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Jeff Law
On 09/18/2017 03:29 AM, Andreas Schwab wrote:
> On Jul 30 2017, Jeff Law  wrote:
> 
>> This patch introduces generic mechanisms to protect the dynamically
>> allocated stack space against stack-clash attacks.
>>
>> Changes since V2:
>>
>> Dynamic allocations can be emitted as unrolled inlined probes or with a
>> rotated loop.  Blockage insns are also properly emitted for the dynamic
>> area probes and the dynamic area probing now supports targets that may
>> make optimistic assumptions in their prologues.  Finally it uses the new
>> param to control the probing interval.
>>
>> Tests were updated to explicitly specify the guard and probing interval.
>>  New test to check inline/unrolled probes as well as rotated loop.
> 
> Does that work correctly when the VLA is smaller than the probe size
> (word_mode by default)?  I see a failure in glibc on armv7 where
> ldconfig is using a zero-size VLA, which is invalid in C, but it could
> also end up using a VLA of size 1.
For a dynamic allocation of size 0, we should be probing into the red
zone.  Alternately we could emit the branch around the probing bits.
I'd need to think about how that interacts with quirks of the aarch64
outgoing argument probing conventions though.

Can you confirm if the probe was in the red zone vs the live areas on
the stack?  The latter would be a serious issue obviously and I'd like
to track it down.  A testcase would be helpful.

Jeff


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Joseph Myers
On Mon, 18 Sep 2017, Andreas Schwab wrote:

> Does that work correctly when the VLA is smaller than the probe size
> (word_mode by default)?  I see a failure in glibc on armv7 where
> ldconfig is using a zero-size VLA, which is invalid in C, but it could
> also end up using a VLA of size 1.

FWIW, I'd consider zero-size VLAs (and VLAs with a positive dimension but 
whose elements are zero-size) to be a valid use of the GNU extension of 
zero-size objects - but still appropriate for -fsanitize=vla-bound to 
detect.  (But any enabled-by-default checks for VLA sizes, as discussed in 
bug 68065, ought to allow zero size.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] [i386, libgcc] PR 82196 -mcall-ms2sysv-xlogues emits wrong AVX/SSE MOV

2017-09-18 Thread Dominique d'Humières
This patch (r252896) breaks bootstrap on x86_64-apple-darwin10 configured with

../work/configure --prefix=/opt/gcc/gcc8w 
--enable-languages=c,c++,fortran,objc,obj-c++,ada,lto --with-gmp=/opt/mp-new 
--with-system-zlib --with-isl=/opt/mp-new --enable-lto --enable-plugin

/opt/gcc/build_w/./gcc/xgcc -B/opt/gcc/build_w/./gcc/ 
-B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/bin/ 
-B/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/lib/ -isystem 
/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/include -isystem 
/opt/gcc/gcc8w/x86_64-apple-darwin10.8.0/sys-include-g -O2 -O2  -g -O2 
-DIN_GCC-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format 
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition  -isystem 
./include   -mmacosx-version-min=10.5 -pipe -fno-common -g -DIN_LIBGCC2 
-fbuilding-libgcc -fno-stack-protector   -mmacosx-version-min=10.5 -pipe 
-fno-common -I. -I. -I../.././gcc -I../../../work/libgcc 
-I../../../work/libgcc/. -I../../../work/libgcc/../gcc 
-I../../../work/libgcc/../include  -DHAVE_CC_TLS -DUSE_EMUTLS -o 
avx_savms64_s.o -MT avx_savms64_s.o -MD -MP -MF avx_savms64_s.dep -DSHARED -c 
-xassembler-with-cpp ../../../work/libgcc/config/i386/avx_savms64.S
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm15,-0x30(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm14,-0x20(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm13,-0x10(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm12, (%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm11, 0x10(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm10, 0x20(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm9, 0x30(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm8, 0x40(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm7, 0x50(%rax)'
../../../work/libgcc/config/i386/savms64.h:47:no such instruction: `vmovaps 
%xmm6, 0x60(%rax)'
make[3]: *** [avx_savms64_s.o] Error 1

Dominique



Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Jeff Law
On 09/18/2017 03:29 AM, Andreas Schwab wrote:
> On Jul 30 2017, Jeff Law  wrote:
> 
>> This patch introduces generic mechanisms to protect the dynamically
>> allocated stack space against stack-clash attacks.
>>
>> Changes since V2:
>>
>> Dynamic allocations can be emitted as unrolled inlined probes or with a
>> rotated loop.  Blockage insns are also properly emitted for the dynamic
>> area probes and the dynamic area probing now supports targets that may
>> make optimistic assumptions in their prologues.  Finally it uses the new
>> param to control the probing interval.
>>
>> Tests were updated to explicitly specify the guard and probing interval.
>>  New test to check inline/unrolled probes as well as rotated loop.
> 
> Does that work correctly when the VLA is smaller than the probe size
> (word_mode by default)?  I see a failure in glibc on armv7 where
> ldconfig is using a zero-size VLA, which is invalid in C, but it could
> also end up using a VLA of size 1.
I don't have a test for that, but can probably create one.

ISTM that if the size is variable and zero at runtime, then we need to
either allocate a small chunk and probe or avoid probing.

jeff


Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-18 Thread Steve Kargl
On Mon, Sep 18, 2017 at 09:02:22AM +0200, Janus Weil wrote:
> Hi Steve,
> 
> >> attached is a (technically) simple patch that implements the compiler
> >> flag "-fdefault-real-16" for gfortran.
> >
> > What about -fdefault-real-10?  If you're going to add bloat to the
> > compiler, then you might as well do it right.
> 
> well, yeah. If my only aim was to add bloat to the compiler out of
> plain boredom and nastiness, then I might as well add
> -fdefault-real-37. But I don't think that would be very useful.

Why?  One gets 11 bits of additional precision (on most platforms)
and a significant increase in the exponent range (+- ~1024 to
+- ~16384).  REAL(10) maps to hardware floating point, which is
faster than software quad precision.
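
These figures can be checked from C++ on a given platform. This is a sketch under the assumption that long double is the x87 80-bit extended format, as it is with GCC on x86, which is the same format gfortran uses for REAL(10) there. On other targets long double may be a different format, so the difference is computed rather than hard-coded, and the function names are made up for the illustration.

```cpp
#include <cassert>
#include <limits>

// Extra significand bits of long double over double on this platform.
constexpr int extra_precision_bits()
{
  return std::numeric_limits<long double>::digits
       - std::numeric_limits<double>::digits;
}

// Largest binary exponent of long double on this platform.
constexpr int long_double_max_exponent()
{
  return std::numeric_limits<long double>::max_exponent;
}

// Usage sketch: the double side is fixed by IEEE binary64.
void precision_example()
{
  assert(std::numeric_limits<double>::digits == 53);
  assert(std::numeric_limits<double>::max_exponent == 1024);
}
```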

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: Let the target choose a vectorisation alignment

2017-09-18 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, Sep 18, 2017 at 1:58 PM, Richard Sandiford
>  wrote:
>> The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
>> there was also a hard-coded assumption that this was equal to the type
>> size.  This was inconvenient for SVE for two reasons:
>>
>> - When compiling for a specific power-of-2 SVE vector length, we might
>>   want to align to a full vector.  However, the TYPE_ALIGN is governed
>>   by the ABI alignment, which is 128 bits regardless of size.
>>
>> - For vector-length-agnostic code it doesn't usually make sense to align,
>>   since the runtime vector length might not be a power of two.  Even for
>>   power of two sizes, there's no guarantee that aligning to the previous
>>   16 bytes will be an improvement.
>>
>> This patch therefore adds a target hook to control the preferred
>> vectoriser (as opposed to ABI) alignment.
>>
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
>> Also tested by comparing the testsuite assembly output on at least one
>> target per CPU directory.  OK to install?
>
> Did you specifically choose to pass the hook a vector type rather than
> a mode?

It seemed like the safest thing to do for the default implementation,
e.g. in case we're vectorising "without SIMD" and thus without an
underlying vector mode.  I agree it probably doesn't make much
difference for non-default implementations.

> I suppose in peeling for alignment the target should be able to
> prevent peeling by returning element alignment from the hook?

Yeah.  This is what the SVE port does in the default vector-length
agnostic mode, and might also make sense in fixed-length mode.
Maybe it would be useful for other targets too, if unaligned accesses
have a negligible penalty for them.
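
To illustrate why returning element alignment disables peeling, here is a
small sketch of the peel-count arithmetic.  It is a toy model, not the
vectoriser's code: peel_iterations is a hypothetical name, and it assumes
the target alignment is a power of two and that both it and the
misalignment are multiples of the element size.

```cpp
// Number of scalar iterations to peel so that the next access is aligned
// to target_align bytes.  misalign is the current byte offset from the
// previous target_align boundary (0 <= misalign < target_align).
constexpr unsigned peel_iterations(unsigned misalign, unsigned target_align,
                                   unsigned elem_size) {
  return ((target_align - misalign) & (target_align - 1)) / elem_size;
}

// 4-byte elements, 16-byte target alignment, 8 bytes past a boundary:
// two more elements reach the next boundary.
static_assert(peel_iterations(8, 16, 4) == 2, "peel two elements");

// Already aligned: nothing to peel.
static_assert(peel_iterations(0, 16, 4) == 0, "no peeling needed");

// Target alignment equal to the element size: every element access is
// already aligned, so the peel count is always zero.
static_assert(peel_iterations(0, 4, 4) == 0, "element alignment never peels");
```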

Thanks,
Richard
> Ok.
>
> Thanks,
> Richard.


RE: [PATCH] [ARC] Check the assembler for gdwarf2 support.

2017-09-18 Thread Claudiu Zissulescu
> > gcc/
> > 2017-06-21  Claudiu Zissulescu  
> >
> > * configure.ac: Add arc and check if assembler supports gdwarf2.
> > * configure: Regenerate.
> OK.
> jeff

Committed. Thank you for your review,
Claudiu


Re: [AArch64] Tweak aarch64_classify_address interface

2017-09-18 Thread Richard Sandiford
Richard Sandiford  writes:
> James Greenhalgh  writes:
>> On Tue, Aug 22, 2017 at 10:23:47AM +0100, Richard Sandiford wrote:
>>> Previously aarch64_classify_address used an rtx code to distinguish
>>> LDP/STP addresses from normal addresses; the code was PARALLEL
>>> to select LDP/STP and anything else to select normal addresses.
>>> This patch replaces that parameter with a dedicated enum.
>>> 
>>> The SVE port will add another enum value that didn't map naturally
>>> to an rtx code.
>>> 
>>> Tested on aarch64-linux-gnu.  OK to install?
>>
>> I can't say I really like this new interface, I'd prefer two wrappers
>> aarch64_legitimate_address_p , aarch64_legitimate_ldp_address_p (or similar)
>> around your new interface, and for most code to simply call the wrapper.
>> Or an overloaded call that filled in ADDR_QUERY_M automatically, to save
>> that spreading through the backend.
>
> OK, I went for the second, putting the query type last and making it
> an optional argument.

By way of a ping, here's the patch updated to current trunk.

Tested on aarch64-linux-gnu.  OK to install?
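
For readers skimming the thread, the shape of the interface is roughly as
in the sketch below: a dedicated enum as a trailing parameter that defaults
to the "m" query.  Everything here is a stand-in: the Toy names are
hypothetical and the offset ranges are invented for illustration, not the
real AArch64 addressing rules.

```cpp
// Stand-in for aarch64_addr_query_type.
enum ToyAddrQuery { TOY_QUERY_M, TOY_QUERY_LDP_STP };

// Stand-in for the legitimate-address check.  The query type comes last
// and defaults to the common case, so most callers never mention it.
constexpr bool toy_legitimate_offset_p(long offset, bool /* strict_p */,
                                       ToyAddrQuery type = TOY_QUERY_M) {
  return type == TOY_QUERY_LDP_STP
           ? (offset >= -64 && offset < 64)    // invented pair range
           : (offset >= 0 && offset < 4096);   // invented normal range
}

// Ordinary callers use the three-argument form...
static_assert(toy_legitimate_offset_p(1024, false), "valid for 'm'");
static_assert(!toy_legitimate_offset_p(-8, false), "invalid for 'm'");

// ...and only load/store-pair queries spell out the type.
static_assert(toy_legitimate_offset_p(-8, false, TOY_QUERY_LDP_STP),
              "valid for a pair");
static_assert(!toy_legitimate_offset_p(1024, false, TOY_QUERY_LDP_STP),
              "invalid for a pair");
```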

Thanks,
Richard

2017-09-18  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* config/aarch64/aarch64-protos.h (aarch64_addr_query_type): New enum.
(aarch64_legitimate_address_p): Use it instead of an rtx code,
as an optional final parameter.
* config/aarch64/aarch64.c (aarch64_classify_address): Likewise.
(aarch64_legitimate_address_p): Likewise.
(aarch64_address_valid_for_prefetch_p): Update calls accordingly.
(aarch64_legitimate_address_hook_p): Likewise.
(aarch64_print_operand_address): Likewise.
(aarch64_address_cost): Likewise.
* config/aarch64/constraints.md (Umq, Ump): Likewise.
* config/aarch64/predicates.md (aarch64_mem_pair_operand): Likewise.

Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2017-09-18 14:41:37.369070450 +0100
+++ gcc/config/aarch64/aarch64-protos.h 2017-09-18 14:42:29.656488378 +0100
@@ -111,6 +111,19 @@ enum aarch64_symbol_type
   SYMBOL_FORCE_TO_MEM
 };
 
+/* Classifies the type of an address query.
+
+   ADDR_QUERY_M
+  Query what is valid for an "m" constraint and a memory_operand
+  (the rules are the same for both).
+
+   ADDR_QUERY_LDP_STP
+  Query what is valid for a load/store pair.  */
+enum aarch64_addr_query_type {
+  ADDR_QUERY_M,
+  ADDR_QUERY_LDP_STP
+};
+
 /* A set of tuning parameters contains references to size and time
cost models and vectors for address cost calculations, register
move costs and memory move costs.  */
@@ -427,7 +440,8 @@ bool aarch64_float_const_representable_p
 
 #if defined (RTX_CODE)
 
-bool aarch64_legitimate_address_p (machine_mode, rtx, RTX_CODE, bool);
+bool aarch64_legitimate_address_p (machine_mode, rtx, bool,
+  aarch64_addr_query_type = ADDR_QUERY_M);
 machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx);
 rtx aarch64_gen_compare_reg (RTX_CODE, rtx, rtx);
 rtx aarch64_load_tp (rtx);
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c 2017-09-18 14:41:37.373588926 +0100
+++ gcc/config/aarch64/aarch64.c 2017-09-18 14:42:29.657389742 +0100
@@ -4409,21 +4409,21 @@ virt_or_elim_regno_p (unsigned regno)
  || regno == ARG_POINTER_REGNUM);
 }
 
-/* Return true if X is a valid address for machine mode MODE.  If it is,
-   fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
-   effect.  OUTER_CODE is PARALLEL for a load/store pair.  */
+/* Return true if X is a valid address of type TYPE for machine mode MODE.
+   If it is, fill in INFO appropriately.  STRICT_P is true if
+   REG_OK_STRICT is in effect.  */
 
 static bool
 aarch64_classify_address (struct aarch64_address_info *info,
- rtx x, machine_mode mode,
- RTX_CODE outer_code, bool strict_p)
+ rtx x, machine_mode mode, bool strict_p,
+ aarch64_addr_query_type type = ADDR_QUERY_M)
 {
   enum rtx_code code = GET_CODE (x);
   rtx op0, op1;
 
   /* On BE, we use load/store pair for all large int mode load/stores.
  TI/TFmode may also use a load/store pair.  */
-  bool load_store_pair_p = (outer_code == PARALLEL
+  bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
|| mode == TImode
|| mode == TFmode
|| (BYTES_BIG_ENDIAN
@@ -4655,7 +4655,7 @@ aarch64_address_valid_for_prefetch_p (rt
   struct aarch64_address_info addr;
 
   /* PRFM accepts the same addresses as DImode...  */
-  bool res = 

Re: Let the target choose a vectorisation alignment

2017-09-18 Thread Richard Biener
On Mon, Sep 18, 2017 at 1:58 PM, Richard Sandiford
 wrote:
> The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
> there was also a hard-coded assumption that this was equal to the type
> size.  This was inconvenient for SVE for two reasons:
>
> - When compiling for a specific power-of-2 SVE vector length, we might
>   want to align to a full vector.  However, the TYPE_ALIGN is governed
>   by the ABI alignment, which is 128 bits regardless of size.
>
> - For vector-length-agnostic code it doesn't usually make sense to align,
>   since the runtime vector length might not be a power of two.  Even for
>   power of two sizes, there's no guarantee that aligning to the previous
>   16 bytes will be an improvement.
>
> This patch therefore adds a target hook to control the preferred
> vectoriser (as opposed to ABI) alignment.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory.  OK to install?

Did you specifically choose to pass the hook a vector type rather than
a mode?  I suppose in peeling for alignment the target should be able to
prevent peeling by returning element alignment from the hook?

Ok.

Thanks,
Richard.

> Richard
>
>
> 2017-09-18  Richard Sandiford  
> Alan Hayward  
> David Sherwood  
>
> gcc/
> * target.def (preferred_vector_alignment): New hook.
> * doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
> hook.
> * doc/tm.texi: Regenerate.
> * targhooks.h (default_preferred_vector_alignment): Declare.
> * targhooks.c (default_preferred_vector_alignment): New function.
> * tree-vectorizer.h (dataref_aux): Add a target_alignment field.
> Expand commentary.
> (DR_TARGET_ALIGNMENT): New macro.
> (aligned_access_p): Update commentary.
> (vect_known_alignment_in_bytes): New function.
> * tree-vect-data-refs.c (vect_calculate_required_alignment): New
> function.
> (vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
> Calculate the misalignment based on the target alignment rather than
> the vector size.
> (vect_update_misalignment_for_peel): Use DR_TARGET_ALIGNMENT
> rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
> (vect_enhance_data_refs_alignment): Mask the byte misalignment with
> the target alignment, rather than masking the element misalignment
> with the number of elements in a vector.  Also use the target
> alignment when calculating the maximum number of peels.
> (vect_find_same_alignment_drs): Use vect_calculate_required_alignment
> instead of TYPE_ALIGN_UNIT.
> (vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
> Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
> (vect_create_addr_base_for_vector_ref): Update call accordingly.
> (vect_create_data_ref_ptr): Likewise.
> (vect_setup_realignment): Realign by ANDing with
> -DR_TARGET_MISALIGNMENT.
> * tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
> the number of peels based on DR_TARGET_ALIGNMENT.
> * tree-vect-stmts.c (get_group_load_store_type): Compare the gap
> with the guaranteed alignment boundary when deciding whether
> overrun is OK.
> (vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
> relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
> (ensure_base_align): Remove stmt_info parameter.  Get the
> target base alignment from DR_TARGET_ALIGNMENT.
> (vectorizable_store): Update call accordingly.   Interpret
> DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
> TYPE_ALIGN_UNIT.
> (vectorizable_load): Likewise.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
> of alignment message.
> * gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
>
> Index: gcc/target.def
> ===
> *** gcc/target.def  2017-09-18 12:56:24.635070853 +0100
> --- gcc/target.def  2017-09-18 12:56:24.847378559 +0100
> *** misalignment value (@var{misalign}).",
> *** 1820,1825 
> --- 1820,1839 
>int, (enum vect_cost_for_stmt type_of_cost, tree vectype, int misalign),
>default_builtin_vectorization_cost)
>
> + DEFHOOK
> + (preferred_vector_alignment,
> +  "This hook returns the preferred alignment in bits for accesses to\n\
> + vectors of type @var{type} in vectorized code.  This might be less than\n\
> + or greater than the ABI-defined value returned by\n\
> + @code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the 

[PATCH][GRAPHITE] Fix PR69728

2017-09-18 Thread Richard Biener

The following fixes the (old) ICE in outer_projection_mupa that we now run
into with SPEC CPU 2006 as well.  As I don't understand what the
code does or how it should behave when the scheduling domain is empty,
the following simply adds a way to indicate failure where we'd
previously ICE.

Bootstrap and regtest running on x86_64-unknown-linux-gnu, will apply
once that has finished.

Richard.

2017-09-18  Richard Biener  

PR tree-optimization/69728
* graphite-sese-to-poly.c (schedule_error): New global.
(add_loop_schedule): Handle empty domain by failing the
schedule.
(build_original_schedule): Handle schedule_error.

* gfortran.dg/graphite/pr69728.f90: New testcase.
* gcc.dg/graphite/pr69728.c: Likewise.

Index: gcc/graphite-sese-to-poly.c
===
--- gcc/graphite-sese-to-poly.c (revision 252920)
+++ gcc/graphite-sese-to-poly.c (working copy)
@@ -1030,6 +1035,8 @@ outer_projection_mupa (__isl_take isl_un
   return isl_multi_union_pw_aff_from_union_pw_multi_aff (data.res);
 }
 
+static bool schedule_error;
+
 /* Embed SCHEDULE in the constraints of the LOOP domain.  */
 
 static isl_schedule *
@@ -1043,6 +1050,14 @@ add_loop_schedule (__isl_take isl_schedu
   if (empty < 0 || empty)
 return empty < 0 ? isl_schedule_free (schedule) : schedule;
 
+  isl_union_set *domain = isl_schedule_get_domain (schedule);
+  if (isl_union_set_is_empty (domain))
+{
+  schedule_error = true;
+  isl_union_set_free (domain);
+  return schedule;
+}
+
   isl_space *space = isl_set_get_space (iterators);
   int loop_index = isl_space_dim (space, isl_dim_set) - 1;
 
@@ -1063,7 +1078,6 @@ add_loop_schedule (__isl_take isl_schedu
   prefix = isl_multi_aff_set_tuple_id (prefix, isl_dim_out, label);
 
   int n = isl_multi_aff_dim (prefix, isl_dim_in);
-  isl_union_set *domain = isl_schedule_get_domain (schedule);
   isl_multi_union_pw_aff *mupa = outer_projection_mupa (domain, n);
   mupa = isl_multi_union_pw_aff_apply_multi_aff (mupa, prefix);
   return isl_schedule_insert_partial_schedule (schedule, mupa);
@@ -1169,6 +1183,8 @@ build_schedule_loop_nest (scop_p scop, i
 static bool
 build_original_schedule (scop_p scop)
 {
+  schedule_error = false;
+
   int i = 0;
   int n = scop->pbbs.length ();
   while (i < n)
@@ -1183,6 +1199,14 @@ build_original_schedule (scop_p scop)
   scop->original_schedule = add_in_sequence (scop->original_schedule, s);
 }
 
+  if (schedule_error)
+{
+  if (dump_file)
+   fprintf (dump_file, "[sese-to-poly] failed to build "
+"original schedule\n");
+  return false;
+}
+
   if (dump_file)
 {
   fprintf (dump_file, "[sese-to-poly] original schedule:\n");
Index: gcc/testsuite/gfortran.dg/graphite/pr69728.f90
===
--- gcc/testsuite/gfortran.dg/graphite/pr69728.f90  (nonexistent)
+++ gcc/testsuite/gfortran.dg/graphite/pr69728.f90  (working copy)
@@ -0,0 +1,26 @@
+! { dg-do compile }
+! { dg-options "-O3 -floop-nest-optimize" }
+SUBROUTINE rk_addtend_dry ( t_tend, t_tendf, t_save, rk_step, &
+h_diabatic, mut, msft, ide, jde,  &
+ims,ime, jms,jme, kms,kme,&
+its,ite, jts,jte, kts,kte)
+   IMPLICIT NONE
+   INTEGER ,  INTENT(IN   ) :: ide, jde, ims, ime, jms, jme, kms, kme, &
+   its, ite, jts, jte, kts, kte
+   INTEGER ,  INTENT(IN   ) :: rk_step
+   REAL , DIMENSION( ims:ime , kms:kme, jms:jme  ), &
+   INTENT(INOUT) :: t_tend, t_tendf
+   REAL , DIMENSION( ims:ime , kms:kme, jms:jme  ) , &
+   INTENT(IN   ) ::  t_save, h_diabatic
+   REAL , DIMENSION( ims:ime , jms:jme ) , INTENT(IN   ) :: mut, msft
+   INTEGER :: i, j, k
+   DO j = jts,MIN(jte,jde-1)
+   DO k = kts,kte-1
+   DO i = its,MIN(ite,ide-1)
+ IF(rk_step == 1)t_tendf(i,k,j) = t_tendf(i,k,j) +  t_save(i,k,j)
+  t_tend(i,k,j) =  t_tend(i,k,j) +  t_tendf(i,k,j)/msft(i,j)  &
+ +  mut(i,j)*h_diabatic(i,k,j)/msft(i,j)
+   ENDDO
+   ENDDO
+   ENDDO
+END SUBROUTINE rk_addtend_dry
Index: gcc/testsuite/gcc.dg/graphite/pr69728.c
===
--- gcc/testsuite/gcc.dg/graphite/pr69728.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr69728.c (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -floop-nest-optimize" } */
+
+int a[1];
+int b, c, d, e;
+void
+fn1 ()
+{
+  d = 9;
+  for (; c; c++)
+{
+  ++d;
+  b = 8;
+  for (; b; b--)
+   {
+ if (d)
+   break;
+ a[b] = e;
+   }
+}
+}


Re: Backport fix: [PATCH] Fix target attribute handling (PR c++/81355).

2017-09-18 Thread Jakub Jelinek
On Mon, Sep 18, 2017 at 03:01:53PM +0200, Martin Liška wrote:
> As discussed here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81224
> 
> We have fallout caused by the patch and its backport to active branches.
> I'm planning to revert the patch and install a patch that will ignore empty
> string values. I'm testing the patch.
> 
> Jakub, do we really want it also for GCC 7? Note that the problematic
> test-case is OK on the GCC 7 branch, as it contains your patch mentioned in
> the discussion.

The question is, has the GCC 7 patch solely changed testcases where we'd
ICE into ones where we warn, or are there cases where we used to accept
it and now warn?
Generally we don't want to introduce new warnings/errors on release branches
on something that used to be accepted, unless really necessary.

Jakub


[PATCH][GRAPHITE] Another SCOP verification goof

2017-09-18 Thread Richard Biener

The following enables another 35% more loop nest optimizations on
SPEC CPU 2006.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-09-18  Richard Biener  

* graphite-scop-detection.c (scop_detection::can_represent_loop):
Do not iterate to sibling loops but only to siblings of inner
loops.

Index: gcc/graphite-scop-detection.c
===
--- gcc/graphite-scop-detection.c   (revision 252923)
+++ gcc/graphite-scop-detection.c   (working copy)
@@ -975,11 +975,9 @@ scop_detection::can_represent_loop (loop
 {
   if (!can_represent_loop_1 (loop, scop))
 return false;
-  if (loop->inner && !can_represent_loop (loop->inner, scop))
-return false;
-  if (loop->next && !can_represent_loop (loop->next, scop))
-return false;
-
+  for (loop_p inner = loop->inner; inner; inner = inner->next)
+if (!can_represent_loop (inner, scop))
+  return false;
   return true;
 }
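
The difference between the old and new traversal can be seen on a toy loop
tree.  This is a sketch only: ToyLoop and its representable flag are
hypothetical stand-ins for loop_p and can_represent_loop_1, and the tree
below is invented for the example.

```cpp
struct ToyLoop {
  bool representable;    // stand-in for the per-loop check
  const ToyLoop *inner;  // first child loop
  const ToyLoop *next;   // next sibling loop
};

// Old behaviour: also walks the siblings of the loop being queried.
constexpr bool can_represent_old(const ToyLoop *loop) {
  return loop->representable
         && (!loop->inner || can_represent_old(loop->inner))
         && (!loop->next || can_represent_old(loop->next));
}

// New behaviour: only the loop itself and all loops nested inside it.
constexpr bool can_represent_new(const ToyLoop *loop) {
  if (!loop->representable)
    return false;
  for (const ToyLoop *inner = loop->inner; inner; inner = inner->next)
    if (!can_represent_new(inner))
      return false;
  return true;
}

// candidate has two good children and an unrelated, bad sibling.
constexpr ToyLoop unrelated_sibling{false, nullptr, nullptr};
constexpr ToyLoop child_b{true, nullptr, nullptr};
constexpr ToyLoop child_a{true, nullptr, &child_b};
constexpr ToyLoop candidate{true, &child_a, &unrelated_sibling};

// outer has a bad loop among the siblings of its first child.
constexpr ToyLoop bad_child{false, nullptr, nullptr};
constexpr ToyLoop first_child{true, nullptr, &bad_child};
constexpr ToyLoop outer{true, &first_child, nullptr};

static_assert(!can_represent_old(&candidate), "old: rejected via sibling");
static_assert(can_represent_new(&candidate), "new: sibling is ignored");
static_assert(!can_represent_new(&outer), "new: inner siblings still count");
```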
 


Backport fix: [PATCH] Fix target attribute handling (PR c++/81355).

2017-09-18 Thread Martin Liška
Hello.

As discussed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81224

We have fallout caused by the patch and its backport to active branches.
I'm planning to revert the patch and install a patch that will ignore empty string
values. I'm testing the patch.
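
The intended effect on the joined attribute string can be sketched in
isolation.  This is not the i386.c code: join_target_attrs is a
hypothetical helper that uses std::string instead of a hand-sized buffer,
and it leaves out the sorting and the "=,-" replacement that
sorted_attr_string also performs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Join the non-empty entries with ',' so that empty strings produce
// neither a doubled comma nor a trailing one.
std::string join_target_attrs(const std::vector<std::string> &args) {
  std::string out;
  for (const std::string &s : args) {
    if (s.empty())
      continue;  // skip empty string
    if (!out.empty())
      out += ',';
    out += s;
  }
  return out;
}

// Small self-check; call it from a test or from main.
inline void check_join_target_attrs() {
  assert(join_target_attrs({"avx", "", "sse4.2"}) == "avx,sse4.2");
  assert(join_target_attrs({"avx", ""}) == "avx");
  assert(join_target_attrs({"", ""}).empty());
}
```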

Jakub, do we really want it also for GCC 7? Note that the problematic test-case
is OK on the GCC 7 branch, as it contains your patch mentioned in the discussion.

Martin




>From d0f04048f86d2e13079900e8fee7fdf08643197a Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 18 Sep 2017 14:46:31 +0200
Subject: [PATCH 2/2] Ignore empty string in target attribute (PR c++/81355).

gcc/ChangeLog:

2017-09-18  Martin Liska  

	PR c++/81355
	* config/i386/i386.c (sorted_attr_string): Skip empty strings.
---
 gcc/config/i386/i386.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b41cb819227..b62932ac2de 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -36766,6 +36766,9 @@ sorted_attr_string (tree arglist)
 {
   const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
   size_t len = strlen (str);
+  /* Skip empty string.  */
+  if (len == 0)
+	continue;
   str_len_sum += len + 1;
   if (arg != arglist)
 	argnum++;
@@ -36780,11 +36783,21 @@ sorted_attr_string (tree arglist)
 {
   const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
   size_t len = strlen (str);
+  /* Skip empty string.  */
+  if (len == 0)
+	continue;
   memcpy (attr_str + str_len_sum, str, len);
   attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
   str_len_sum += len + 1;
 }
 
+  /* Strip ',' character at the end.  */
+  if (str_len_sum > 0 && attr_str[str_len_sum - 1] == ',')
+{
+  attr_str[str_len_sum - 1] = '\0';
+  str_len_sum--;
+}
+
   /* Replace "=,-" with "_".  */
   for (i = 0; i < strlen (attr_str); i++)
 if (attr_str[i] == '=' || attr_str[i]== '-')
-- 
2.14.1



[PATCH] Use built-in for std::make_integer_sequence

2017-09-18 Thread Jonathan Wakely

I forgot to commit this patch at the start of stage 1 when Jason added
the __integer_pack builtin.

This patch uses __has_builtin to detect Clang's __make_integer_seq
builtin and uses that instead when available. This means it works with
either __integer_pack or __make_integer_seq, whichever is available.

We _could_ keep the old implementation, for compilers that don't
support either builtin (I'm not sure if Intel icc supports either) but
I haven't done that here.
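
As a rough illustration of the detection and of what the alias has to
produce, independent of how the library generates it, consider the sketch
below.  The DEMO_ macro, demo_has_make_integer_seq and count_indices are
names made up for this example; no claim is made about which compilers
report the builtin.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Same nested test as in the patch: __has_builtin may only be used
// inside #if once we know the preprocessor understands it.
#ifdef __has_builtin
# if __has_builtin(__make_integer_seq)
#  define DEMO_HAS_MAKE_INTEGER_SEQ 1
# endif
#endif

// Reports whether the Clang-style builtin was detected.
constexpr bool demo_has_make_integer_seq() {
#ifdef DEMO_HAS_MAKE_INTEGER_SEQ
  return true;
#else
  return false;
#endif
}

// Whichever builtin is used, the result must be exactly these types.
static_assert(std::is_same<std::make_integer_sequence<int, 4>,
                           std::integer_sequence<int, 0, 1, 2, 3>>::value,
              "0..3 as int");
static_assert(std::is_same<std::make_index_sequence<0>,
                           std::index_sequence<>>::value,
              "empty sequence");

// Typical use: deduce the pack from a generated sequence.
template<std::size_t... I>
constexpr std::size_t count_indices(std::index_sequence<I...>) {
  return sizeof...(I);
}

static_assert(count_indices(std::make_index_sequence<5>{}) == 5, "five indices");
```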

* include/std/utility (_Itup_cat, _Make_integer_sequence): Remove.
(_Build_index_tuple, make_integer_sequence): Use built-in to generate
pack expansion.

Tested powerpc64le-linux, committed to trunk.


commit d208208c187b810b68cee66730866eb862aed09d
Author: Jonathan Wakely 
Date:   Mon Sep 18 12:44:00 2017 +0100

Use built-in for std::make_integer_sequence

* include/std/utility (_Itup_cat, _Make_integer_sequence): Remove.
(_Build_index_tuple, make_integer_sequence): Use built-in to generate
pack expansion.

diff --git a/libstdc++-v3/include/std/utility b/libstdc++-v3/include/std/utility
index c18bcb6f72d..29a626004f9 100644
--- a/libstdc++-v3/include/std/utility
+++ b/libstdc++-v3/include/std/utility
@@ -267,32 +267,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // extract the elements in a tuple.
   template struct _Index_tuple { };
 
-  // Concatenates two _Index_tuples.
-  template struct _Itup_cat;
-
-  template
-struct _Itup_cat<_Index_tuple<_Ind1...>, _Index_tuple<_Ind2...>>
-{
-  using __type = _Index_tuple<_Ind1..., (_Ind2 + sizeof...(_Ind1))...>;
-};
+#ifdef __has_builtin
+# if __has_builtin(__make_integer_seq)
+#  define _GLIBCXX_USE_MAKE_INTEGER_SEQ 1
+# endif
+#endif
 
   // Builds an _Index_tuple<0, 1, 2, ..., _Num-1>.
   template
 struct _Build_index_tuple
-: _Itup_cat::__type,
-   typename _Build_index_tuple<_Num - _Num / 2>::__type>
-{ };
-
-  template<>
-struct _Build_index_tuple<1>
 {
-  typedef _Index_tuple<0> __type;
-};
+#if _GLIBCXX_USE_MAKE_INTEGER_SEQ
+  template
+using _IdxTuple = _Index_tuple<_Indices...>;
 
-  template<>
-struct _Build_index_tuple<0>
-{
-  typedef _Index_tuple<> __type;
+  using __type = __make_integer_seq<_IdxTuple, size_t, _Num>;
+#else
+  using __type = _Index_tuple<__integer_pack(_Num)...>;
+#endif
 };
 
 #if __cplusplus > 201103L
@@ -307,23 +299,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static constexpr size_t size() { return sizeof...(_Idx); }
 };
 
-  template::__type>
-struct _Make_integer_sequence;
-
-  template
-struct _Make_integer_sequence<_Tp, _Num, _Index_tuple<_Idx...>>
-{
-  static_assert( _Num >= 0,
-"Cannot make integer sequence of negative length" );
-
-  typedef integer_sequence<_Tp, static_cast<_Tp>(_Idx)...> __type;
-};
-
   /// Alias template make_integer_sequence
   template
 using make_integer_sequence
-  = typename _Make_integer_sequence<_Tp, _Num>::__type;
+#if _GLIBCXX_USE_MAKE_INTEGER_SEQ
+  = __make_integer_seq;
+#else
+  = integer_sequence<_Tp, __integer_pack(_Num)...>;
+#endif
+
+#undef _GLIBCXX_USE_MAKE_INTEGER_SEQ
 
   /// Alias template index_sequence
   template


Let the target choose a vectorisation alignment

2017-09-18 Thread Richard Sandiford
The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
there was also a hard-coded assumption that this was equal to the type
size.  This was inconvenient for SVE for two reasons:

- When compiling for a specific power-of-2 SVE vector length, we might
  want to align to a full vector.  However, the TYPE_ALIGN is governed
  by the ABI alignment, which is 128 bits regardless of size.

- For vector-length-agnostic code it doesn't usually make sense to align,
  since the runtime vector length might not be a power of two.  Even for
  power of two sizes, there's no guarantee that aligning to the previous
  16 bytes will be an improvement.

This patch therefore adds a target hook to control the preferred
vectoriser (as opposed to ABI) alignment.
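
A toy model of the choice such a hook can make is sketched below.  It is
not the hook's real signature: ToyTarget, ToyVectorType and
toy_preferred_vector_alignment are hypothetical, and the policy for each
target kind only mirrors the two cases described above.

```cpp
enum class ToyTarget { Default, SveFixedLength, SveLengthAgnostic };

struct ToyVectorType {
  unsigned abi_align_bits;  // what TYPE_ALIGN would give
  unsigned size_bits;       // full vector size, if known at compile time
  unsigned element_bits;    // size of one element
};

// Preferred vectoriser alignment in bits.
constexpr unsigned toy_preferred_vector_alignment(ToyTarget target,
                                                  ToyVectorType type) {
  return target == ToyTarget::SveFixedLength      ? type.size_bits
         : target == ToyTarget::SveLengthAgnostic ? type.element_bits
                                                  : type.abi_align_bits;
}

// A 256-bit vector of 32-bit elements whose ABI alignment is 128 bits.
constexpr ToyVectorType v256{128, 256, 32};

static_assert(toy_preferred_vector_alignment(ToyTarget::Default, v256) == 128,
              "default: ABI alignment");
static_assert(toy_preferred_vector_alignment(ToyTarget::SveFixedLength,
                                             v256) == 256,
              "fixed length: align to a full vector");
static_assert(toy_preferred_vector_alignment(ToyTarget::SveLengthAgnostic,
                                             v256) == 32,
              "length agnostic: element alignment only");
```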

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory.  OK to install?

Richard


2017-09-18  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* target.def (preferred_vector_alignment): New hook.
* doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
hook.
* doc/tm.texi: Regenerate.
* targhooks.h (default_preferred_vector_alignment): Declare.
* targhooks.c (default_preferred_vector_alignment): New function.
* tree-vectorizer.h (dataref_aux): Add a target_alignment field.
Expand commentary.
(DR_TARGET_ALIGNMENT): New macro.
(aligned_access_p): Update commentary.
(vect_known_alignment_in_bytes): New function.
* tree-vect-data-refs.c (vect_calculate_required_alignment): New
function.
(vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
Calculate the misalignment based on the target alignment rather than
the vector size.
(vect_update_misalignment_for_peel): Use DR_TARGET_ALIGNMENT
rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
(vect_enhance_data_refs_alignment): Mask the byte misalignment with
the target alignment, rather than masking the element misalignment
with the number of elements in a vector.  Also use the target
alignment when calculating the maximum number of peels.
(vect_find_same_alignment_drs): Use vect_calculate_required_alignment
instead of TYPE_ALIGN_UNIT.
(vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
(vect_create_addr_base_for_vector_ref): Update call accordingly.
(vect_create_data_ref_ptr): Likewise.
(vect_setup_realignment): Realign by ANDing with
-DR_TARGET_MISALIGNMENT.
* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
the number of peels based on DR_TARGET_ALIGNMENT.
* tree-vect-stmts.c (get_group_load_store_type): Compare the gap
with the guaranteed alignment boundary when deciding whether
overrun is OK.
(vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
(ensure_base_align): Remove stmt_info parameter.  Get the
target base alignment from DR_TARGET_ALIGNMENT.
(vectorizable_store): Update call accordingly.   Interpret
DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
TYPE_ALIGN_UNIT.
(vectorizable_load): Likewise.

gcc/testsuite/
* gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
of alignment message.
* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.

Index: gcc/target.def
===
*** gcc/target.def  2017-09-18 12:56:24.635070853 +0100
--- gcc/target.def  2017-09-18 12:56:24.847378559 +0100
*** misalignment value (@var{misalign}).",
*** 1820,1825 
--- 1820,1839 
   int, (enum vect_cost_for_stmt type_of_cost, tree vectype, int misalign),
   default_builtin_vectorization_cost)
  
+ DEFHOOK
+ (preferred_vector_alignment,
+  "This hook returns the preferred alignment in bits for accesses to\n\
+ vectors of type @var{type} in vectorized code.  This might be less than\n\
+ or greater than the ABI-defined value returned by\n\
+ @code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the alignment of\n\
+ a single element, in which case the vectorizer will not try to optimize\n\
+ for alignment.\n\
+ \n\
+ The default hook returns @code{TYPE_ALIGN (@var{type})}, which is\n\
+ correct for most targets.",
+  HOST_WIDE_INT, (const_tree type),
+  default_preferred_vector_alignment)
+ 
  /* Return true if vector alignment is reachable (by peeling N
 iterations) for the given scalar type.  */
  DEFHOOK
Index: gcc/doc/tm.texi.in

[PATCH] PR libstdc++/71187 reimplement declval without add_rvalue_reference

2017-09-18 Thread Jonathan Wakely

This implements Eric Niebler's suggestion of a more lightweight
std::declval, which doesn't need to instantiate
std::add_rvalue_reference (and its base class and helpers).
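
The technique can be sketched outside the library as follows.  The
template parameter lists did not survive in the archived diff below, so
the default argument U = T&& is an assumption based on how this trick is
usually written; the demo namespace, declval_helper and NoDefault are
names invented for the example.

```cpp
#include <type_traits>

namespace demo {
  // Preferred overload: viable only when T&& is a valid type.
  template<typename T, typename U = T&&>
  U declval_helper(int);

  // Fallback, e.g. for void, where forming T&& fails.
  template<typename T>
  T declval_helper(long);

  // Declared only; meant for unevaluated operands.
  template<typename T>
  auto declval() noexcept -> decltype(declval_helper<T>(0));
}

static_assert(std::is_same<decltype(demo::declval<int>()), int&&>::value,
              "object types become rvalue references");
static_assert(std::is_same<decltype(demo::declval<int&>()), int&>::value,
              "reference collapsing keeps lvalue references");
static_assert(std::is_same<decltype(demo::declval<void>()), void>::value,
              "void falls back to the second overload");

// Typical use: name a member's return type without constructing the object.
struct NoDefault {
  NoDefault(int);
  double value() const;
};

static_assert(std::is_same<decltype(demo::declval<NoDefault>().value()),
                           double>::value,
              "works without a default constructor");
```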

PR libstdc++/71187
* include/std/type_traits (__declval): New function to deduce return
type of declval.
(__declval_protector::__delegate): Remove.
(declval): Use __declval instead of add_rvalue_reference and
__declval_protector::__delegate.
* testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error
lineno.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc:
Likewise.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc:
Likewise.

Tested powerpc64le-linux, committed to trunk.

commit 05558f97c64247cf421eff8b17570a0844794cbc
Author: Jonathan Wakely 
Date:   Fri Sep 15 14:49:08 2017 +0100

PR libstdc++/71187 reimplement declval without add_rvalue_reference

PR libstdc++/71187
* include/std/type_traits (__declval): New function to deduce return
type of declval.
(__declval_protector::__delegate): Remove.
(declval): Use __declval instead of add_rvalue_reference and
__declval_protector::__delegate.
* testsuite/20_util/declval/requirements/1_neg.cc: Adjust dg-error
lineno.
* testsuite/20_util/make_signed/requirements/typedefs_neg.cc:
Likewise.
* testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc:
Likewise.

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index f021c42396c..15b0d92bcb6 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -754,15 +754,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Destructible and constructible type properties.
 
-  template
-struct add_rvalue_reference;
-
   /**
*  @brief  Utility to simplify expressions used in unevaluated operands
*  @ingroup utilities
*/
+
+  template
+_Up
+__declval(int);
+
   template
-typename add_rvalue_reference<_Tp>::type declval() noexcept;
+_Tp
+__declval(long);
+
+  template
+auto declval() noexcept -> decltype(__declval<_Tp>(0));
 
   template
 struct extent;
@@ -2079,16 +2085,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __declval_protector
 {
   static const bool __stop = false;
-  static typename add_rvalue_reference<_Tp>::type __delegate();
 };
 
   template
-inline typename add_rvalue_reference<_Tp>::type
-declval() noexcept
+auto declval() noexcept -> decltype(__declval<_Tp>(0))
 {
   static_assert(__declval_protector<_Tp>::__stop,
"declval() must not be used!");
-  return __declval_protector<_Tp>::__delegate();
+  return __declval<_Tp>(0);
 }
 
   /// result_of
diff --git a/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc b/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
index 4e254e89191..17b41a007db 100644
--- a/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/declval/requirements/1_neg.cc
@@ -18,7 +18,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-error "static assertion failed" "" { target *-*-* } 2089 }
+// { dg-error "static assertion failed" "" { target *-*-* } 2093 }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/make_signed/requirements/typedefs_neg.cc b/libstdc++-v3/testsuite/20_util/make_signed/requirements/typedefs_neg.cc
index e3e80f91979..308155383f0 100644
--- a/libstdc++-v3/testsuite/20_util/make_signed/requirements/typedefs_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/make_signed/requirements/typedefs_neg.cc
@@ -47,4 +47,4 @@ void test01()
 // { dg-error "required from here" "" { target *-*-* } 39 }
 // { dg-error "required from here" "" { target *-*-* } 41 }
 
-// { dg-error "invalid use of incomplete type" "" { target *-*-* } 1754 }
+// { dg-error "invalid use of incomplete type" "" { target *-*-* } 1760 }
diff --git a/libstdc++-v3/testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc b/libstdc++-v3/testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc
index 86b0c2d6da7..412608e5669 100644
--- a/libstdc++-v3/testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc
@@ -47,5 +47,5 @@ void test01()
 // { dg-error "required from here" "" { target *-*-* } 39 }
 // { dg-error "required from here" "" { target *-*-* } 41 }
 
-// { dg-error "invalid use of incomplete type" "" { target *-*-* } 1650 }
+// { dg-error "invalid use of incomplete type" "" { target *-*-* } 1656 }
 


Prevent invalid register mode changes in combine

2017-09-18 Thread Richard Sandiford
This patch stops combine from changing the mode of an existing register
in-place if doing so would change the size of the underlying register
allocation, as given by REGMODE_NATURAL_SIZE.  Without this,
many tests fail in adjust_reg_mode after SVE is added.  One example
is gcc.c-torture/compile/20090401-1.c.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory.  OK to install?

Richard


2017-09-18  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* combine.c (can_change_dest_mode): Reject changes in
REGMODE_NATURAL_SIZE.

Index: gcc/combine.c
===
--- gcc/combine.c   2017-09-18 12:31:05.45926 +0100
+++ gcc/combine.c   2017-09-18 12:31:05.604645232 +0100
@@ -2451,6 +2451,12 @@ can_change_dest_mode (rtx x, int added_s
   if (!REG_P (x))
 return false;
 
+  /* Don't change between modes with different underlying register sizes,
+ since this could lead to invalid subregs.  */
+  if (REGMODE_NATURAL_SIZE (mode)
+  != REGMODE_NATURAL_SIZE (GET_MODE (x)))
+return false;
+
   regno = REGNO (x);
   /* Allow hard registers if the new mode is legal, and occupies no more
  registers than the old mode.  */
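
For readers who want the new check in isolation, here is a minimal
sketch.  It is not the GCC code: the toy_* names, the three modes and
their sizes are all made up for illustration, and toy_natural_size only
stands in for what a target's REGMODE_NATURAL_SIZE would return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy modes; real GCC uses machine_mode.  */
enum toy_mode { TOY_SI, TOY_DI, TOY_V4SI, TOY_NUM_MODES };

/* Assumed per-mode natural block size in bytes, standing in for the
   value a target's REGMODE_NATURAL_SIZE would give.  */
static unsigned
toy_natural_size (enum toy_mode mode)
{
  static const unsigned sizes[TOY_NUM_MODES] = { 8, 8, 16 };
  assert (mode < TOY_NUM_MODES);
  return sizes[mode];
}

/* Return true if a register currently in OLD_MODE may be given NEW_MODE
   in place.  Mirrors only the size test added by the patch; the other
   conditions of can_change_dest_mode are not modelled.  */
static bool
toy_can_change_mode (enum toy_mode old_mode, enum toy_mode new_mode)
{
  return toy_natural_size (old_mode) == toy_natural_size (new_mode);
}
```

With these assumed sizes, toy_can_change_mode (TOY_SI, TOY_DI) is
accepted while toy_can_change_mode (TOY_DI, TOY_V4SI) is rejected.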


Re: [PATCH] Bump downloaded ISL version to 0.18

2017-09-18 Thread Markus Trippelsdorf
On 2017.09.18 at 09:41 +0200, Richard Biener wrote:
> 
> Committed.
> 
> Richard.
> 
> 2017-09-18  Richard Biener  
> 
>   * download_prerequisites (isl): Bump version to 0.18.
> 
> Index: contrib/download_prerequisites
> ===
> --- contrib/download_prerequisites   (revision 252906)
> +++ contrib/download_prerequisites   (working copy)
> @@ -30,7 +30,7 @@ version='(unversioned)'
>  gmp='gmp-6.1.0.tar.bz2'
>  mpfr='mpfr-3.1.4.tar.bz2'
>  mpc='mpc-1.0.3.tar.gz'
> -isl='isl-0.16.1.tar.bz2'
> +isl='isl-0.18.tar.bz2'

As an obvious follow-up, I've updated the checksums:

commit 076d07cde56c0da62036f3bdd440fa5a160d5f6b
Author: trippels 
Date:   Mon Sep 18 11:25:13 2017 +

Update checksums for isl-0.18.tar.bz2

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@252921 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/contrib/prerequisites.md5 b/contrib/prerequisites.md5
index b8e89d43c8a8..cc71e0f4de68 100644
--- a/contrib/prerequisites.md5
+++ b/contrib/prerequisites.md5
@@ -1,4 +1,4 @@
 86ee6e54ebfc4a90b643a65e402c4048  gmp-6.1.0.tar.bz2
 b8a2f6b0e68bef46e53da2ac439e1cf4  mpfr-3.1.4.tar.bz2
 d6a1d5f8ddea3abd2cc3e98f58352d26  mpc-1.0.3.tar.gz
-ac1f25a0677912952718a51f5bc20f32  isl-0.16.1.tar.bz2
+11436d6b205e516635b666090b94ab32  isl-0.18.tar.bz2
diff --git a/contrib/prerequisites.sha512 b/contrib/prerequisites.sha512
index 808970778c74..cf6b93b8d6b8 100644
--- a/contrib/prerequisites.sha512
+++ b/contrib/prerequisites.sha512
@@ -1,4 +1,4 @@
 3c82aeab9c1596d4da8afac2eec38e429e84f3211e1a572cf8fd2b546493c44c039b922a1133eaaa48bd7f3e11dbe795a384e21ed95cbe3ecc58d7ac02246117  gmp-6.1.0.tar.bz2
 51066066ff2c12ed2198605ecf68846b0c96b548adafa5b80e0c786d0df488411a5e8973358fce7192dc977ad4e68414cf14500e3c39746de62465eb145bb819  mpfr-3.1.4.tar.bz2
 0028b76df130720c1fad7de937a0d041224806ce5ef76589f19c7b49d956071a683e2f20d154c192a231e69756b19e48208f2889b0c13950ceb7b3cfaf059a43  mpc-1.0.3.tar.gz
-c188667a84dc5bdddb4ab7c35f89c91bf15a8171f4fcaf41301cf285fb7328846d9a367c096012fec4cc69d244f0bc9e95d84c09ec097394cd4093076f2a041b  isl-0.16.1.tar.bz2
+85d0b40f4dbf14cb99d17aa07048cdcab2dc3eb527d2fbb1e84c41b2de5f351025370e57448b63b2b8a8cf8a0843a089c3263f9baee1542d5c2e1cb37ed39d94  isl-0.18.tar.bz2

-- 
Markus


Base subreg rules on REGMODE_NATURAL_SIZE rather than UNITS_PER_WORD

2017-09-18 Thread Richard Sandiford
Originally subregs operated at the word level and subreg offsets
were measured in words.  The offset units were later changed from
words to bytes (SUBREG_WORD became SUBREG_BYTE), but the fundamental
assumption that subregs should operate at the word level remained.
Whether (subreg:M1 (reg:M2 R2) N) is well-formed depended on the
way that M1 and M2 partitioned into words and whether the subword
part of N represented a lowpart.  However, some questions depended
instead on the macro REGMODE_NATURAL_SIZE, which was introduced
as part of the patch that moved from SUBREG_WORD to SUBREG_BYTE.
It is used to decide whether setting (subreg:M1 (reg:M2 R2) N)
clobbers all of R2 or just part of it (df_read_modify_subreg).

Using words doesn't really make sense for modern vector
architectures.  Vector registers are usually bigger than
a word and:

(a) setting the scalar lowpart of them usually clobbers the
rest of the register (contrary to the subreg rules,
where only the containing words should be clobbered).

(b) high words of vector registers are often not independently
addressable, even though that's what the subreg rules expect.

This patch therefore uses REGMODE_NATURAL_SIZE instead of
UNITS_PER_WORD to determine the size of the independently
addressable blocks in an inner register.

This is needed for SVE because the number of words in a vector
mode isn't known at compile time, so isn't a sensible basis
for calculating the number of registers.
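
As an illustration of the block-based view (a sketch under stated
assumptions, not code from the patch), the function below computes
which blocks of an inner register a store to a subreg touches.  The
name toy_blocks_modified and its parameters are hypothetical;
block_size plays the role of REGMODE_NATURAL_SIZE for the inner mode.

```c
#include <assert.h>
#include <limits.h>

/* Return a bitmask of the blocks of an inner register that a store to
   a subreg modifies.  INNER_SIZE is the size of the inner register,
   BLOCK_SIZE the size of one independently addressable block, OFFSET
   the subreg byte offset and OUTER_SIZE the size of the outer mode,
   all in bytes.  Bit I set means block I overlaps the subreg and is
   therefore treated as modified.  */
unsigned
toy_blocks_modified (unsigned inner_size, unsigned block_size,
                     unsigned offset, unsigned outer_size)
{
  assert (block_size > 0 && outer_size > 0);
  assert (offset + outer_size <= inner_size);

  unsigned first = offset / block_size;
  unsigned last = (offset + outer_size - 1) / block_size;
  assert (last < sizeof (unsigned) * CHAR_BIT);

  unsigned mask = 0;
  for (unsigned i = first; i <= last; i++)
    mask |= 1u << i;
  return mask;
}
```

For a 16-byte register split into 8-byte blocks, a 4-byte store at
offset 8 touches only block 1.  If the same register is one 16-byte
block, as on a typical vector register file, any store touches block 0,
i.e. the whole register.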

The only existing port to define REGMODE_NATURAL_SIZE is
64-bit SPARC, where FP registers are 32 bits.  (This is the
opposite of the use case for SVE, since the natural division
is smaller than a word.)  I compiled the testsuite before and
after the patch for sparc64-linux-gnu and the only test whose
assembly changed was g++.dg/debug/pr65678.C, where the order
of two independent stores was reversed and where a different
register was picked for one pseudo.  The new code was
otherwise equivalent to the old code.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory, with only the SPARC differences just mentioned.
OK to install?

Richard


2017-09-18  Richard Sandiford  
Alan Hayward  
David Sherwood  

gcc/
* doc/rtl.texi: Rewrite the subreg rules so that they partition
the inner register into REGMODE_NATURAL_SIZE bytes rather than
UNITS_PER_WORD bytes.
* emit-rtl.c (validate_subreg): Divide subregs into blocks
based on REGMODE_NATURAL_SIZE of the inner mode.
(gen_lowpart_common): Split the SCALAR_FLOAT_MODE_P and
!SCALAR_FLOAT_MODE_P cases.  Use REGMODE_NATURAL_SIZE for the latter.
* expr.c (store_constructor): Use REGMODE_NATURAL_SIZE to test
whether something is likely to occupy more than one register.

Index: gcc/doc/rtl.texi
===
--- gcc/doc/rtl.texi   2017-09-15 13:56:20.294149114 +0100
+++ gcc/doc/rtl.texi   2017-09-18 12:24:20.287485854 +0100
@@ -1921,19 +1921,32 @@ false.
 When @var{m1} is at least as narrow as @var{m2} the @code{subreg}
 expression is called @dfn{normal}.
 
+@findex REGMODE_NATURAL_SIZE
 Normal @code{subreg}s restrict consideration to certain bits of
-@var{reg}.  There are two cases.  If @var{m1} is smaller than a word,
-the @code{subreg} refers to the least-significant part (or
-@dfn{lowpart}) of one word of @var{reg}.  If @var{m1} is word-sized or
-greater, the @code{subreg} refers to one or more complete words.
-
-When used as an lvalue, @code{subreg} is a word-based accessor.
-Storing to a @code{subreg} modifies all the words of @var{reg} that
-overlap the @code{subreg}, but it leaves the other words of @var{reg}
+@var{reg}.  For this purpose, @var{reg} is divided into
+individually-addressable blocks in which each block has:
+
+@smallexample
+REGMODE_NATURAL_SIZE (@var{m2})
+@end smallexample
+
+bytes.  Usually the value is @code{UNITS_PER_WORD}; that is,
+most targets usually treat each word of a register as being
+independently addressable.
+
+There are two types of normal @code{subreg}.  If @var{m1} is known
+to be no bigger than a block, the @code{subreg} refers to the
+least-significant part (or @dfn{lowpart}) of one block of @var{reg}.
+If @var{m1} is known to be larger than a block, the @code{subreg} refers
+to two or more complete blocks.
+
+When used as an lvalue, @code{subreg} is a block-based accessor.
+Storing to a @code{subreg} modifies all the blocks of @var{reg} that
+overlap the @code{subreg}, but it leaves the other blocks of @var{reg}
 alone.
 
-When storing to a normal @code{subreg} that is smaller than a word,
-the other bits of the referenced word are usually left in an undefined
+When storing to a normal @code{subreg} that is smaller than a block,
+the other 

RE: [PATCH][x86] Knights Mill -march/-mtune options

2017-09-18 Thread Peryt, Sebastian
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Monday, September 18, 2017 12:23 PM
> To: Peryt, Sebastian 
> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
> 
> On Mon, Sep 18, 2017 at 12:17 PM, Peryt, Sebastian
>  wrote:
> >> -Original Message-
> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> >> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> >> Sent: Sunday, September 17, 2017 6:14 PM
> >> To: Peryt, Sebastian 
> >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
> >> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
> >>
> >> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian
> >> 
> >> wrote:
> >> > Hi,
> >> >
> >> > This patch adds  options -march=/-mtune=knm for Knights Mill.
> >> >
> >> > 2017-09-14  Sebastian Peryt   gcc/
> >> >
> >> > * config.gcc: Support "knm".
> >> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect 
> >> > "knm".
> >> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> >> > PROCESSOR_KNM.
> >> > * config/i386/i386.c (m_KNM): Define.
> >> > (processor_target_table): Add "knm".
> >> > (PTA_KNM): Define.
> >> > (ix86_option_override_internal): Add "knm".
> >> > (ix86_issue_rate): Add PROCESSOR_KNM.
> >> > (ix86_adjust_cost): Ditto.
> >> > (ia32_multipass_dfa_lookahead): Ditto.
> >> > (get_builtin_code_for_version): Handle PROCESSOR_KNM.
> >> > (fold_builtin_cpu): Define M_INTEL_KNM.
> >> > * config/i386/i386.h (TARGET_KNM): Define.
> >> > (processor_type): Add PROCESSOR_KNM.
> >> > * config/i386/x86-tune.def: Add m_KNM.
> >> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.
> >> >
> >> >
> >> > gcc/testsuite/
> >> >
> >> > * gcc.target/i386/funcspec-5.c: Test knm.
> >> >
> >> > Is it ok for trunk?
> >>
> >> You also have to update libgcc/cpuinfo.h together with
> >> fold_builtin_cpu from i386.c. Please note that all new processor
> >> types and subtypes have to be added at the end of the enum.
> >>
> >
> > Uros,
> >
> > I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood
> > that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of
> > barrier, this is why I put KNM before that. Is that correct thinking?
> > As for fold_builtin_cpu in i386.c I already have something like this:
> >
> > @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
> >  M_AMDFAM15H,
> >  M_INTEL_SILVERMONT,
> >  M_INTEL_KNL,
> > +M_INTEL_KNM,
> >  M_AMD_BTVER1,
> >  M_AMD_BTVER2,
> >  M_CPU_SUBTYPE_START,
> > @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
> >{"bonnell", M_INTEL_BONNELL},
> >{"silvermont", M_INTEL_SILVERMONT},
> >{"knl", M_INTEL_KNL},
> > +  {"knm", M_INTEL_KNM},
> >{"amdfam10h", M_AMDFAM10H},
> >{"barcelona", M_AMDFAM10H_BARCELONA},
> >{"shanghai", M_AMDFAM10H_SHANGHAI},
> >
> > I couldn't find any other place where I'm supposed to add anything extra.
> 
> Please look at libgcc/config/i386/cpuinfo.h. The comment here says that:
> 
> /* Any new types or subtypes have to be inserted at the end. */
> 
> The above patch should then add M_INTEL_KNM as the last entry *before*
> M_CPU_SUBTYPE_START.
> 

Sorry, I didn't notice this value at first. I believe now it's correct.

Sebastian

> > Additionally I updated one extra test I found -
> > gcc.target/i386/funcspec-56.inc
> >
> >> Oops, and AMDFAM17H processor type should not be there in cpuinfo.h.
> >
> > Sorry, I don't understand - it shouldn't be at this position, or in this 
> > enum at all?
> 
> This means I have to synchronize gcc part with libgcc. I'll do it later today.
> 
> Uros.


KNM_enabling_v3.patch
Description: KNM_enabling_v3.patch


Re: [PATCH][x86] Knights Mill -march/-mtune options

2017-09-18 Thread Uros Bizjak
On Mon, Sep 18, 2017 at 12:17 PM, Peryt, Sebastian
 wrote:
>> -Original Message-
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
>> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
>> Sent: Sunday, September 17, 2017 6:14 PM
>> To: Peryt, Sebastian 
>> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
>> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
>>
>> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian 
>> wrote:
>> > Hi,
>> >
>> > This patch adds  options -march=/-mtune=knm for Knights Mill.
>> >
>> > 2017-09-14  Sebastian Peryt   gcc/
>> >
>> > * config.gcc: Support "knm".
>> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
>> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
>> > PROCESSOR_KNM.
>> > * config/i386/i386.c (m_KNM): Define.
>> > (processor_target_table): Add "knm".
>> > (PTA_KNM): Define.
>> > (ix86_option_override_internal): Add "knm".
>> > (ix86_issue_rate): Add PROCESSOR_KNM.
>> > (ix86_adjust_cost): Ditto.
>> > (ia32_multipass_dfa_lookahead): Ditto.
>> > (get_builtin_code_for_version): Handle PROCESSOR_KNM.
>> > (fold_builtin_cpu): Define M_INTEL_KNM.
>> > * config/i386/i386.h (TARGET_KNM): Define.
>> > (processor_type): Add PROCESSOR_KNM.
>> > * config/i386/x86-tune.def: Add m_KNM.
>> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.
>> >
>> >
>> > gcc/testsuite/
>> >
>> > * gcc.target/i386/funcspec-5.c: Test knm.
>> >
>> > Is it ok for trunk?
>>
>> You also have to update libgcc/cpuinfo.h together with fold_builtin_cpu from
>> i386.c. Please note that all new processor types and subtypes have to be 
>> added
>> at the end of the enum.
>>
>
> Uros,
>
> I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood that
> CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of barrier,
> this is why I put KNM before that. Is that correct thinking? As for 
> fold_builtin_cpu
> in i386.c I already have something like this:
>
> @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
>  M_AMDFAM15H,
>  M_INTEL_SILVERMONT,
>  M_INTEL_KNL,
> +M_INTEL_KNM,
>  M_AMD_BTVER1,
>  M_AMD_BTVER2,
>  M_CPU_SUBTYPE_START,
> @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
>{"bonnell", M_INTEL_BONNELL},
>{"silvermont", M_INTEL_SILVERMONT},
>{"knl", M_INTEL_KNL},
> +  {"knm", M_INTEL_KNM},
>{"amdfam10h", M_AMDFAM10H},
>{"barcelona", M_AMDFAM10H_BARCELONA},
>{"shanghai", M_AMDFAM10H_SHANGHAI},
>
> I couldn't find any other place where I'm supposed to add anything extra.

Please look at libgcc/config/i386/cpuinfo.h. The comment here says that:

/* Any new types or subtypes have to be inserted at the end. */

The above patch should then add M_INTEL_KNM as the last entry *before*
M_CPU_SUBTYPE_START.
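
A small sketch of why the position matters, assuming the usual reason
for such a rule: code compiled earlier may have the old enumerator
values baked in.  The toy_* layouts below are hypothetical and much
shorter than the real enums; only the ordering idea is taken from the
thread.

```c
#include <assert.h>

/* Layout before the change (hypothetical, shortened).  */
enum toy_before
{ TOY_B_SILVERMONT, TOY_B_KNL, TOY_B_BTVER1, TOY_B_BTVER2, TOY_B_START };

/* New entry inserted in the middle: BTVER1 and BTVER2 get new values.  */
enum toy_middle
{ TOY_M_SILVERMONT, TOY_M_KNL, TOY_M_KNM, TOY_M_BTVER1, TOY_M_BTVER2,
  TOY_M_START };

/* New entry added as the last one before the marker: every existing
   entry keeps its value; only the marker moves.  */
enum toy_end
{ TOY_E_SILVERMONT, TOY_E_KNL, TOY_E_BTVER1, TOY_E_BTVER2, TOY_E_KNM,
  TOY_E_START };

/* Return 1 if inserting in the middle preserved the old values.  */
int
toy_middle_keeps_values (void)
{
  return (int) TOY_M_BTVER1 == (int) TOY_B_BTVER1
         && (int) TOY_M_BTVER2 == (int) TOY_B_BTVER2;
}

/* Return 1 if adding at the end preserved the old values.  */
int
toy_end_keeps_values (void)
{
  return (int) TOY_E_BTVER1 == (int) TOY_B_BTVER1
         && (int) TOY_E_BTVER2 == (int) TOY_B_BTVER2;
}

/* Self-check of the two claims made in the comments above.  */
void
toy_check_layouts (void)
{
  assert (!toy_middle_keeps_values ());
  assert (toy_end_keeps_values ());
}
```

In this toy layout, putting the new entry after KNL renumbers both
BTVER entries, while putting it directly before the marker leaves them
alone.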

> Additionally I updated one extra test I found - 
> gcc.target/i386/funcspec-56.inc
>
>> Oops, and AMDFAM17H processor type should not be there in cpuinfo.h.
>
> Sorry, I don't understand - it shouldn't be at this position, or in this enum 
> at all?

This means I have to synchronize gcc part with libgcc. I'll do it later today.

Uros.


Re: [Ada] Validity check failure with packed array and pragma

2017-09-18 Thread Pierre-Marie de Rodat

On 09/18/2017 12:02 PM, Eric Botcazou wrote:

You don't need this, just use:

--  { dg-options "-O -gnatn -gnatVa -gnatws" }

The -cargs/-margs trick is only needed for special switches like -dA.


That’s right, will do, thank you! Do I need to create a new ChangeLog 
entry in gcc/testsuite/ or is it fine if I just keep the current “New 
testcase.”?


--
Pierre-Marie de Rodat


RE: [PATCH][x86] Knights Mill -march/-mtune options

2017-09-18 Thread Peryt, Sebastian
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Sunday, September 17, 2017 6:14 PM
> To: Peryt, Sebastian 
> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin 
> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
> 
> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian 
> wrote:
> > Hi,
> >
> > This patch adds  options -march=/-mtune=knm for Knights Mill.
> >
> > 2017-09-14  Sebastian Peryt   gcc/
> >
> > * config.gcc: Support "knm".
> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> > PROCESSOR_KNM.
> > * config/i386/i386.c (m_KNM): Define.
> > (processor_target_table): Add "knm".
> > (PTA_KNM): Define.
> > (ix86_option_override_internal): Add "knm".
> > (ix86_issue_rate): Add PROCESSOR_KNM.
> > (ix86_adjust_cost): Ditto.
> > (ia32_multipass_dfa_lookahead): Ditto.
> > (get_builtin_code_for_version): Handle PROCESSOR_KNM.
> > (fold_builtin_cpu): Define M_INTEL_KNM.
> > * config/i386/i386.h (TARGET_KNM): Define.
> > (processor_type): Add PROCESSOR_KNM.
> > * config/i386/x86-tune.def: Add m_KNM.
> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.
> >
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/funcspec-5.c: Test knm.
> >
> > Is it ok for trunk?
> 
> You also have to update libgcc/cpuinfo.h together with fold_builtin_cpu from
> i386.c. Please note that all new processor types and subtypes have to be added
> at the end of the enum.
> 

Uros,

I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood that 
CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of barrier,
this is why I put KNM before that. Is that correct thinking? As for 
fold_builtin_cpu 
in i386.c I already have something like this:

@@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
 M_AMDFAM15H,
 M_INTEL_SILVERMONT,
 M_INTEL_KNL,
+M_INTEL_KNM,
 M_AMD_BTVER1,
 M_AMD_BTVER2,
 M_CPU_SUBTYPE_START,
@@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
   {"bonnell", M_INTEL_BONNELL},
   {"silvermont", M_INTEL_SILVERMONT},
   {"knl", M_INTEL_KNL},
+  {"knm", M_INTEL_KNM},
   {"amdfam10h", M_AMDFAM10H},
   {"barcelona", M_AMDFAM10H_BARCELONA},
   {"shanghai", M_AMDFAM10H_SHANGHAI},

I couldn't find any other place where I'm supposed to add anything extra.
Additionally I updated one extra test I found - gcc.target/i386/funcspec-56.inc

> Oops, and AMDFAM17H processor type should not be there in cpuinfo.h.

Sorry, I don't understand - it shouldn't be at this position, or in this enum 
at all?
> 
> Uros.

Thanks,
Sebastian

2017-09-18  Sebastian Peryt   

gcc/

* config.gcc: Support "knm".
* config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_KNM.
* config/i386/i386.c (m_KNM): Define.
(processor_target_table): Add "knm".
(PTA_KNM): Define.
(ix86_option_override_internal): Add "knm".
(ix86_issue_rate): Add PROCESSOR_KNM.
(ix86_adjust_cost): Ditto.
(ia32_multipass_dfa_lookahead): Ditto.
(get_builtin_code_for_version): Handle PROCESSOR_KNM.
(fold_builtin_cpu): Define M_INTEL_KNM.
* config/i386/i386.h (TARGET_KNM): Define.
(processor_type): Add PROCESSOR_KNM.
* config/i386/x86-tune.def: Add m_KNM.
* doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.

libgcc/
* config/i386/cpuinfo.h (processor_types): Add INTEL_KNM.
* config/i386/cpuinfo.c (get_intel_cpu): Detect Knights Mill.

gcc/testsuite/

* gcc.target/i386/funcspec-5.c: Test knm.
* gcc.target/i386/funcspec-56.inc: Test arch=knm.


KNM_enabling_v2.patch
Description: KNM_enabling_v2.patch


Re: 0006-Part-6.-Add-x86-tests-for-Intel-CET-implementation

2017-09-18 Thread Uros Bizjak
Hello!

> gcc/testsuite/
>
> * g++.dg/cet-notrack-1.C: New test.
> * gcc.target/i386/cet-intrin-1.c: Likewise.
> * gcc.target/i386/cet-intrin-10.c: Likewise.
> * gcc.target/i386/cet-intrin-2.c: Likewise.
> * gcc.target/i386/cet-intrin-3.c: Likewise.
> * gcc.target/i386/cet-intrin-4.c: Likewise.
> * gcc.target/i386/cet-intrin-5.c: Likewise.
> * gcc.target/i386/cet-intrin-6.c: Likewise.
> * gcc.target/i386/cet-intrin-7.c: Likewise.
> * gcc.target/i386/cet-intrin-8.c: Likewise.
> * gcc.target/i386/cet-intrin-9.c: Likewise.
> * gcc.target/i386/cet-label.c: Likewise.
> * gcc.target/i386/cet-notrack-1a.c: Likewise.
> * gcc.target/i386/cet-notrack-1b.c: Likewise.
> * gcc.target/i386/cet-notrack-2a.c: Likewise.
> * gcc.target/i386/cet-notrack-2b.c: Likewise.
> * gcc.target/i386/cet-notrack-3.c: Likewise.
> * gcc.target/i386/cet-notrack-4a.c: Likewise.
> * gcc.target/i386/cet-notrack-4b.c: Likewise.
> * gcc.target/i386/cet-notrack-5a.c: Likewise.
> * gcc.target/i386/cet-notrack-5b.c: Likewise.
> * gcc.target/i386/cet-notrack-6a.c: Likewise.
> * gcc.target/i386/cet-notrack-6b.c: Likewise.
> * gcc.target/i386/cet-notrack-7.c: Likewise.
> * gcc.target/i386/cet-property-1.c: Likewise.
> * gcc.target/i386/cet-property-2.c: Likewise.
> * gcc.target/i386/cet-rdssp-1.c: Likewise.
> * gcc.target/i386/cet-sjlj-1.c: Likewise.
> * gcc.target/i386/cet-sjlj-2.c: Likewise.
> * gcc.target/i386/cet-sjlj-3.c: Likewise.
> * gcc.target/i386/cet-switch-1.c: Likewise.
> * gcc.target/i386/cet-switch-2.c: Likewise.
> * lib/target-supports.exp (check_effective_target_cet): New
> proc.

A couple of questions:

+/* { dg-do compile } */
+/* { dg-options "-O2 -mcet" } */
+/* { dg-final { scan-assembler-times "setssbsy" 2 } } */
+
+#include 
+
+void f1 (void)
+{
+  __builtin_ia32_setssbsy ();
+}
+
+void f2 (void)
+{
+  _setssbsy ();
+}

Is there a reason that both the __builtin and the intrinsic versions are
tested in a couple of places? The intrinsic version is just a wrapper
for the __builtin, so IMO testing the intrinsic version should be enough.


diff --git a/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c b/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c
new file mode 100644
index 000..f9223a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cet-rdssp-1.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target cet } } */
+/* { dg-options "-O2 -finstrument-control-flow -mcet" } */

The "target cet" directive just checks that CET instructions can be
compiled. The test will (probably?) fail on targets with binutils that
can compile CET instructions, but the target itself doesn't support
CET. If this is the case, then check header has to be introduced, so
the test can be bypassed on targets without runtime support.

Uros.


Re: Transform (x / y) != 0 to x >=y and (x / y) == 0 to x < y if x, y are unsigned

2017-09-18 Thread Wilco Dijkstra
Richard Sandiford wrote:

> I don't think it's literally always.  Testing the inputs instead of a
> multi-use result tends to mean that all three are live at once.  If the
> == 0 condition is only one component of a more complex condition that
> relies on the result of division regardless, then it's possible for
> testing the inputs to be a pessimisation, particularly on two-address
> targets.

Sure, you can always make up a contrived example where one case is
better than another. In many cases it's just going to be a heuristic.
With division the best option is far more obvious than in most cases.

> Not saying that's a strong enough reason not to do it.  I just don't think
> that we can guarantee it will be better in *every* case.

Every single decision a compiler makes cannot be optimal for all cases -
what matters is what is best on average in real code.

Wilco


Re: Transform (x / y) != 0 to x >=y and (x / y) == 0 to x < y if x, y are unsigned

2017-09-18 Thread Prathamesh Kulkarni
On 15 September 2017 at 22:09, Marc Glisse  wrote:
> On Fri, 15 Sep 2017, Wilco Dijkstra wrote:
>
>> Marc Glisse wrote:
>>
>>> The question is whether, having computed c=a/b, it is cheaper to test a<b
>>> or c!=0.
>>> I think it is usually the second one, but not for all types on all
>>> targets. Although since
>>> you mention VRP, it is easier to do further optimizations using the
>>> information a<b.
>>
>> No, a<b [...] throughput on
>> all modern cores, so rather than having to wait until the division
>> finishes, you can
>> execute whatever depends on the comparison many cycles earlier.
>>
>> Generally you want to avoid division as much as possible and when that
>> fails
>> reduce any dependencies on the result of divisions.
>
>
> This would indicate that we do not need to check for single-use, makes the
> patch simpler, thanks.
> (let's ignore -Os)
Hi,
Thanks for the suggestions, I have updated the patch.
Is this OK ?
Bootstrap+test in progress on x86_64-unknown-linux-gnu.
I will try to address the right-shift-by-4 case in a follow-up patch.

Thanks,
Prathamesh
>
> --
> Marc Glisse
2017-09-18  Prathamesh Kulkarni  

* match.pd ((X / Y) == 0 -> X < Y): New pattern.
((X / Y) != 0 -> X >= Y): Likewise.

testsuite/
* gcc.dg/tree-ssa/cmpdiv.c: New test.
diff --git a/gcc/match.pd b/gcc/match.pd
index dbfceaf10a5..a9008f2437e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1266,6 +1266,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))))
(op @1 @0))))
 
+/* Transform:
+ * (X / Y) == 0 -> X < Y if X, Y are unsigned.
+ * (X / Y) != 0 -> X >= Y, if X, Y are unsigned.
+ */
+(for cmp (eq ne)
+ ocmp (lt ge)
+ (simplify
+  (cmp (trunc_div @0 @1) integer_zerop)
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && (VECTOR_TYPE_P (type) || !VECTOR_TYPE_P (TREE_TYPE (@0))))
+   (ocmp @0 @1))))
+
 /* X == C - X can never be true if C is odd.  */
 (for cmp (eq ne)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpdiv.c b/gcc/testsuite/gcc.dg/tree-ssa/cmpdiv.c
new file mode 100644
index 000..14161f5ea6f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpdiv.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+_Bool f1(unsigned x, unsigned y)
+{
+  unsigned t1 = x / y;
+  _Bool t2 = (t1 != 0);
+  return t2;
+}
+
+_Bool f2(unsigned x, unsigned y)
+{
+  unsigned t1 = x / y;
+  _Bool t2 = (t1 == 0);
+  return t2;
+}
+
+/* { dg-final { scan-tree-dump-not "trunc_div_expr" "optimized" } } */
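
The identity behind the two patterns can be checked by brute force.
The sketch below is independent of the patch; the toy_* names are made
up, and y == 0 is skipped because x / y is undefined there.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Return true if, for all unsigned x < LIMIT and 0 < y < LIMIT, the
   division-based tests agree with the plain comparisons:
     (x / y) == 0  <=>  x < y
     (x / y) != 0  <=>  x >= y  */
bool
toy_cmpdiv_identity_holds (unsigned limit)
{
  for (unsigned x = 0; x < limit; x++)
    for (unsigned y = 1; y < limit; y++)
      {
        if (((x / y) == 0) != (x < y))
          return false;
        if (((x / y) != 0) != (x >= y))
          return false;
      }
  return true;
}

/* Check a small exhaustive range plus the extreme operand values.  */
void
toy_cmpdiv_selfcheck (void)
{
  assert (toy_cmpdiv_identity_holds (64));
  assert (((UINT_MAX / 1u) != 0) == (UINT_MAX >= 1u));
  assert (((1u / UINT_MAX) == 0) == (1u < UINT_MAX));
  assert (((UINT_MAX / UINT_MAX) != 0) == (UINT_MAX >= UINT_MAX));
}
```

The same loop over signed operands would fail (for example -4 / 2 is
nonzero although -4 < 2), which is why the pattern is guarded by
TYPE_UNSIGNED.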


[PATCH] Fix PR82220

2017-09-18 Thread Richard Biener

The following is said to fix a 482.sphinx3 regression.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-09-18  Richard Biener  

PR tree-optimization/82220
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Exclude
epilogue niters from the min_profitable_iters compute.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c   (revision 252907)
+++ gcc/tree-vect-loop.c   (working copy)
@@ -3663,8 +3663,8 @@ vect_estimate_min_profitable_iters (loop
   min_profitable_iters);
 
   /* We want the vectorized loop to execute at least once.  */
-  if (min_profitable_iters < (vf + peel_iters_prologue + peel_iters_epilogue))
-min_profitable_iters = vf + peel_iters_prologue + peel_iters_epilogue;
+  if (min_profitable_iters < (vf + peel_iters_prologue))
+min_profitable_iters = vf + peel_iters_prologue;
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location,
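
To make the effect of the change concrete, here is a minimal sketch of
the clamp before and after the patch.  Only the arithmetic is taken
from the hunk above; the toy_* function names are hypothetical and the
rest of the cost model is not modelled.

```c
#include <assert.h>

/* Clamp as it was before the patch: the lower bound also counted the
   peeled epilogue iterations.  */
int
toy_clamp_old (int min_profitable_iters, int vf,
               int peel_iters_prologue, int peel_iters_epilogue)
{
  assert (vf > 0 && peel_iters_prologue >= 0 && peel_iters_epilogue >= 0);
  int bound = vf + peel_iters_prologue + peel_iters_epilogue;
  return min_profitable_iters < bound ? bound : min_profitable_iters;
}

/* Clamp after the patch: the vectorized loop must still run at least
   once, but epilogue iterations no longer raise the threshold.  */
int
toy_clamp_new (int min_profitable_iters, int vf, int peel_iters_prologue)
{
  assert (vf > 0 && peel_iters_prologue >= 0);
  int bound = vf + peel_iters_prologue;
  return min_profitable_iters < bound ? bound : min_profitable_iters;
}
```

With vf = 4, two prologue iterations, three epilogue iterations and a
computed threshold of 7, the old clamp raises the threshold to 9 while
the new one leaves it at 7.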


Re: [Ada] Validity check failure with packed array and pragma

2017-09-18 Thread Eric Botcazou
> 2017-09-18  Bob Duff  
> 
>   * gnat.dg/validity_check.adb: New testcase.

+--  { dg-options "-cargs -O -gnatn -gnatVa -gnatws -margs" }

You don't need this, just use:

--  { dg-options "-O -gnatn -gnatVa -gnatws" }

The -cargs/-margs trick is only needed for special switches like -dA.

-- 
Eric Botcazou


Re: 0004-Part-4.-Update-x86-backend-to-enable-Intel-CET

2017-09-18 Thread Uros Bizjak
Hello!

> gcc/
>
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_IBT_SET): New.
> (OPTION_MASK_ISA_SHSTK_SET): Likewise.
> (OPTION_MASK_ISA_IBT_UNSET): Likewise.
> (OPTION_MASK_ISA_SHSTK_UNSET): Likewise.
> (ix86_handle_option): Add -mibt, -mshstk, -mcet handling.
> * config.gcc (extra_headers): Add cetintrin.h for x86 targets.
> (extra_objs): Add cet.o for Linux/x86 targets.
> (tmake_file): Add i386/t-cet for Linux/x86 targets.
> * config/i386/cet.c: New file.
> * config/i386/cetintrin.h: Likewise.
> * config/i386/t-cet: Likewise.
> * config/i386/cpuid.h (bit_SHSTK): New.
> (bit_IBT): Likewise.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect and
> pass IBT and SHSTK bits.
> * config/i386/i386-builtin-types.def
> (VOID_FTYPE_UNSIGNED_PVOID): New.
> (VOID_FTYPE_UINT64_PVOID): Likewise.
> * config/i386/i386-builtin.def: Add CET intrinsics.
> * config/i386/i386-c.c (ix86_target_macros_internal): Add
> OPTION_MASK_ISA_IBT, OPTION_MASK_ISA_SHSTK handling.
> * config/i386/i386-passes.def: Add pass_insert_endbranch pass.
> * config/i386/i386-protos.h (make_pass_insert_endbranch): New
> prototype.
> * config/i386/i386.c (rest_of_insert_endbranch): New.
> (pass_data_insert_endbranch): Likewise.
> (pass_insert_endbranch): Likewise.
> (make_pass_insert_endbranch): Likewise.
> (ix86_notrack_prefixed_insn_p): Likewise.
> (ix86_target_string): Add -mibt, -mshstk flags.
> (ix86_option_override_internal): Add flag_instrument_control_flow
> processing.
> (ix86_valid_target_attribute_inner_p): Set OPT_mibt, OPT_mshstk.
> (ix86_print_operand): Add 'notrack' prefix output.
> (ix86_init_mmx_sse_builtins): Add CET intrinsics.
> (ix86_expand_builtin): Expand CET intrinsics.
> (x86_output_mi_thunk): Add 'endbranch' instruction.
> * config/i386/i386.h (TARGET_IBT): New.
> (TARGET_IBT_P): Likewise.
> (TARGET_SHSTK): Likewise.
> (TARGET_SHSTK_P): Likewise.
> * config/i386/i386.md (unspecv): Add UNSPECV_NOP_RDSSP,
> UNSPECV_INCSSP, UNSPECV_SAVEPREVSSP, UNSPECV_RSTORSSP,
> UNSPECV_WRSS, UNSPECV_WRUSS, UNSPECV_SETSSBSY, UNSPECV_CLRSSBSY.
> (builtin_setjmp_setup): New pattern.
> (builtin_longjmp): Likewise.
> (rdssp): Likewise.
> (incssp): Likewise.
> (saveprevssp): Likewise.
> (rstorssp): Likewise.
> (wrss): Likewise.
> (wruss): Likewise.
> (setssbsy): Likewise.
> (clrssbsy): Likewise.
> (nop_endbr): Likewise.
> * config/i386/i386.opt: Add -mcet, -mibt, -mshstk and -mcet-switch
> options.
> * config/i386/immintrin.h: Include .
> * config/i386/linux-common.h
> (file_end_indicate_exec_stack_and_cet): New prototype.
> (TARGET_ASM_FILE_END): New.

LGTM.

OK for mainline.

Thanks,
Uros.


[Ada] Fix spurious error on component of overloaded function result

2017-09-18 Thread Pierre-Marie de Rodat
This fixes a weird error given by the compiler on an access attribute applied
to the component of the result of a function call, if the called function
returns an access type designating a record containing the component declared
as aliased, and is overloaded with another function returning another access
type designating also a record containing a second component of the same name
but not declared as aliased.

The compiler wrongly complains that the prefix of the attribute is not declared
as aliased because the check is applied to a random interpretation (depending
on the declaration order among other things) of the overloaded component.  The
fix simply defers the check until after the right interpretation is chosen.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-09-18  Eric Botcazou  

* sem_attr.adb (Analyze_Access_Attribute): Move check for the presence
of the "aliased" keyword on the prefix from here to...
(Resolve_Attribute) : ...here.  Remove useless call
to Check_No_Implicit_Aliasing.
* sinfo.ads (Non_Aliased_Prefix): Delete.
(Set_Non_Aliased_Prefix): Likewise.
* sinfo.adb (Non_Aliased_Prefix): Delete.
(Set_Non_Aliased_Prefix): Likewise.

gcc/testsuite/

2017-09-18  Eric Botcazou  

* gnat.dg/overload.ads, gnat.dg/overload.adb: New testcase.

Index: sem_attr.adb
===
--- sem_attr.adb   (revision 252907)
+++ sem_attr.adb   (working copy)
@@ -1074,49 +1074,6 @@
end if;
 end loop;
  end;
-
- --  Check for aliased view. We allow a nonaliased prefix when within
- --  an instance because the prefix may have been a tagged formal
- --  object, which is defined to be aliased even when the actual
- --  might not be (other instance cases will have been caught in the
- --  generic). Similarly, within an inlined body we know that the
- --  attribute is legal in the original subprogram, and therefore
- --  legal in the expansion.
-
- if not Is_Aliased_View (P)
-   and then not In_Instance
-   and then not In_Inlined_Body
-   and then Comes_From_Source (N)
- then
---  Here we have a non-aliased view. This is illegal unless we
---  have the case of Unrestricted_Access, where for now we allow
---  this (we will reject later if expected type is access to an
---  unconstrained array with a thin pointer).
-
---  No need for an error message on a generated access reference
---  for the controlling argument in a dispatching call: error will
---  be reported when resolving the call.
-
-if Aname /= Name_Unrestricted_Access then
-   Error_Attr_P ("prefix of % attribute must be aliased");
-   Check_No_Implicit_Aliasing (P);
-
---  For Unrestricted_Access, record that prefix is not aliased
---  to simplify legality check later on.
-
-else
-   Set_Non_Aliased_Prefix (N);
-end if;
-
- --  If we have an aliased view, and we have Unrestricted_Access, then
- --  output a warning that Unchecked_Access would have been fine, and
- --  change the node to be Unchecked_Access.
-
- else
---  For now, hold off on this change ???
-
-null;
- end if;
   end Analyze_Access_Attribute;
 
   --
@@ -11120,24 +11077,56 @@
end if;
 end if;
 
---  Check for unrestricted access where expected type is a thin
---  pointer to an unconstrained array.
+--  Check for aliased view. We allow a nonaliased prefix when in
+--  an instance because the prefix may have been a tagged formal
+--  object, which is defined to be aliased even when the actual
+--  might not be (other instance cases will have been caught in
+--  the generic). Similarly, within an inlined body we know that
+--  the attribute is legal in the original subprogram, therefore
+--  legal in the expansion.
 
-if Non_Aliased_Prefix (N)
-  and then Has_Size_Clause (Typ)
-  and then RM_Size (Typ) = System_Address_Size
+if not (Is_Entity_Name (P)
+ and then Is_Overloadable (Entity (P)))
+  and then not (Nkind (P) = N_Selected_Component
+ and then
+Is_Overloadable (Entity (Selector_Name (P))))
+  and then not Is_Aliased_View (P)
+  and then not In_Instance
+  and then not In_Inlined_Body
+  and then Comes_From_Source (N)
 then
-   declare
-   

[Ada] Spurious error with unqualified aggregate in instantiation.

2017-09-18 Thread Pierre-Marie de Rodat
When an aggregate appears without an explicit qualification in a generic
unit, the compiler builds a qualified expression for it, using the type of
the aggregate and when possible the scope of that type, so that both of
these entities are properly resolved in an instantiation. This patch verifies
that the scope is not hidden by a local declaration, to prevent a spurious
visibility error in an instance.

The following must compile quietly:

   gcc -c foo.adb

---
with Langkit_Support.Lexical_Env;

procedure Foo is
   package Envs is new Langkit_Support.Lexical_Env (Natural);
begin
   null;
end Foo;
---
generic
   type T is private;
package Langkit_Support.Lexical_Env is

   type Record_Type is record
  V : T;
   end record;

   type Lexical_Env is null record;

   function Get
 (Self   : Lexical_Env;
  Value   : T;
  Filter : access function (R : Record_Type; Env : Lexical_Env)
return Boolean := null)
  return Boolean;

end Langkit_Support.Lexical_Env;
---
package body Langkit_Support.Lexical_Env is

   function Get
 (Self   : Lexical_Env;
  Value  : T;
  Filter : access function (R : Record_Type; Env : Lexical_Env)
return Boolean := null)
  return Boolean
   is
  Filtered_Out : constant Boolean :=
 Filter /= null
 and then not Filter ((V => Value), Self);
   begin
  return Filtered_Out;
   end Get;

end Langkit_Support.Lexical_Env;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Ed Schonberg  

* sem_ch12.adb (Save_References_In_Aggregate): When constructing a
qualified expression for an aggregate in a generic unit, verify that
the scope of the type is itself visible and not hidden, so that the
qualified expression is correctly resolved in any instance.

Index: sem_ch12.adb
===================================================================
--- sem_ch12.adb(revision 252907)
+++ sem_ch12.adb(working copy)
@@ -15118,10 +15118,10 @@
--  preserved. In order to preserve some of this information,
--  wrap the aggregate in a qualified expression, using the id
--  of its type. For further disambiguation we qualify the type
-   --  name with its scope (if visible) because both id's will have
-   --  corresponding entities in an instance. This resolves most of
-   --  the problems with missing type information on aggregates in
-   --  instances.
+   --  name with its scope (if visible and not hidden by a local
+   --  homograph) because both id's will have corresponding
+   --  entities in an instance. This resolves most of the problems
+   --  with missing type information on aggregates in instances.
 
if Present (N2)
  and then Nkind (N2) = Nkind (N)
@@ -15131,7 +15131,9 @@
then
   Nam := Make_Identifier (Loc, Chars (Typ));
 
-  if Is_Immediately_Visible (Scope (Typ)) then
+  if Is_Immediately_Visible (Scope (Typ))
+and then Current_Entity (Scope (Typ)) = Scope (Typ)
+  then
  Nam :=
Make_Selected_Component (Loc,
  Prefix=>


[Ada] Spurious error in ASIS on static predicate aspects.

2017-09-18 Thread Pierre-Marie de Rodat
This patch fixes an error in ASIS mode when processing queries on a static
predicate for an enumeration type that involves a case expression. 

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Ed Schonberg  

* sem_ch3.adb (Analyze_Declarations): In ASIS mode, at the end of the
declarative list in a subprogram body, analyze aspect specifications to
provide basic semantic information, because otherwise the aspect
specifications might only be analyzed during expansion, when related
subprograms are generated.

Index: sem_ch3.adb
===================================================================
--- sem_ch3.adb (revision 252907)
+++ sem_ch3.adb (working copy)
@@ -2666,6 +2666,16 @@
   Freeze_From := Last_Entity (Current_Scope);
 
else
+  --  For declarations in a subprogram body there is no issue
+  --  with name resolution in aspect specifications, but in
+  --  ASIS mode we need to preanalyze aspect specifications
+  --  that may otherwise only be analyzed during expansion
+  --  (e.g. during generation of a related subprogram).
+
+  if ASIS_Mode then
+ Resolve_Aspects;
+  end if;
+
   Freeze_All (First_Entity (Current_Scope), Decl);
   Freeze_From := Last_Entity (Current_Scope);
end if;
@@ -13510,6 +13520,7 @@
  end if;
 
  Constrain_Discriminated_Type (Def_Id, SI, Related_Nod);
+ Set_First_Private_Entity (Def_Id, First_Private_Entity (T_Ent));
 
  Set_Depends_On_Private (Def_Id, Has_Private_Component (Def_Id));
  Set_Corresponding_Record_Type (Def_Id,


[Ada] Ravenscar simple barriers and validity checks

2017-09-18 Thread Pierre-Marie de Rodat
If validity checks are enabled in Ravenscar mode, avoid incorrect error
messages complaining that simple barriers are not simple.

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Bob Duff  

* exp_ch9.adb (Is_Simple_Barrier_Name): Follow Original_Node, in case
validity checks have rewritten the tree.

Index: exp_ch9.adb
===================================================================
--- exp_ch9.adb (revision 252907)
+++ exp_ch9.adb (working copy)
@@ -6000,11 +6000,13 @@
 
   begin
  --  Check if the name is a component of the protected object. If
- --  the expander is active, the component has been transformed into
- --  a renaming of _object.all.component.
+ --  the expander is active, the component has been transformed into a
+ --  renaming of _object.all.component. Original_Node is needed in case
+ --  validity checking is enabled, in which case the simple object
+ --  reference will have been rewritten.
 
  if Expander_Active then
-Renamed := Renamed_Object (Entity (N));
+Renamed := Renamed_Object (Entity (Original_Node (N)));
 
 return
   Present (Renamed)


[Ada] Undefined symbol due to pragma Inline_Always

2017-09-18 Thread Pierre-Marie de Rodat
This patch modifies the semantics of pragma Inline_Always to require that the
pragma appears on the initial declaration of the related subprogram. This rule
ensures that the back end will properly carry out the "always" semantic of the
pragma, regardless of whether a call to the related subprogram comes from an
external or internal source.


------------
-- Source --
------------


--  pack.ads

package Pack is
   procedure Proc;
end Pack;

--  pack.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Pack is
   procedure Proc is
   begin
  Put_Line ("Proc");
   end Proc;
   pragma Inline_Always (Proc);
end Pack;

--  main.adb

with Pack;

procedure Main is
begin
   Pack.Proc;
end Main;


----------------------------
-- Compilation and output --
----------------------------


$ gnatmake -q main.adb
pack.adb:8:04: pragma "Inline_Always" must appear on initial declaration of
  subprogram "Proc" defined at pack.ads:2

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Hristian Kirtchev  

* sem_ch6.adb (Check_Inline_Pragma): Link the newly generated spec to
the preexisting body.
* sem_prag.adb (Check_Inline_Always_Placement): New routine.
(Process_Inline): Verify the placement of pragma Inline_Always. The
pragma must now appear on the initial declaration of the related
subprogram.

Index: sem_ch6.adb
===================================================================
--- sem_ch6.adb (revision 252910)
+++ sem_ch6.adb (working copy)
@@ -2882,6 +2882,11 @@
New_Copy_Tree (Specification (N)));
 
begin
+  --  Link the body and the generated spec
+
+  Set_Corresponding_Body (Decl, Body_Id);
+  Set_Corresponding_Spec (N, Subp);
+
   Set_Defining_Unit_Name (Specification (Decl), Subp);
 
   --  To ensure proper coverage when body is inlined, indicate
Index: sem_prag.adb
===================================================================
--- sem_prag.adb(revision 252910)
+++ sem_prag.adb(working copy)
@@ -9097,15 +9097,10 @@
  --  The entity of the first Ghost subprogram encountered while
  --  processing the arguments of the pragma.
 
- procedure Make_Inline (Subp : Entity_Id);
- --  Subp is the defining unit name of the subprogram declaration. If
- --  the pragma is valid, call Set_Inline_Flags on Subp, as well as on
- --  the corresponding body, if there is one present.
+ procedure Check_Inline_Always_Placement (Spec_Id : Entity_Id);
+ --  Verify the placement of pragma Inline_Always with respect to the
+ --  initial declaration of subprogram Spec_Id.
 
- procedure Set_Inline_Flags (Subp : Entity_Id);
- --  Set Has_Pragma_{No_Inline,Inline,Inline_Always} flag on Subp.
- --  Also set or clear Is_Inlined flag on Subp depending on Status.
-
  function Inlining_Not_Possible (Subp : Entity_Id) return Boolean;
  --  Returns True if it can be determined at this stage that inlining
  --  is not possible, for example if the body is available and contains
@@ -9116,6 +9111,222 @@
  --  ??? is business with link symbols still valid, or does it relate
  --  to front end ZCX which is being phased out ???
 
+ procedure Make_Inline (Subp : Entity_Id);
+ --  Subp is the defining unit name of the subprogram declaration. If
+ --  the pragma is valid, call Set_Inline_Flags on Subp, as well as on
+ --  the corresponding body, if there is one present.
+
+ procedure Set_Inline_Flags (Subp : Entity_Id);
+ --  Set Has_Pragma_{No_Inline,Inline,Inline_Always} flag on Subp.
+ --  Also set or clear Is_Inlined flag on Subp depending on Status.
+
+ -----------------------------------
+ -- Check_Inline_Always_Placement --
+ -----------------------------------
+
+ procedure Check_Inline_Always_Placement (Spec_Id : Entity_Id) is
+Spec_Decl : constant Node_Id := Unit_Declaration_Node (Spec_Id);
+
+function Compilation_Unit_OK return Boolean;
+pragma Inline (Compilation_Unit_OK);
+--  Determine whether pragma Inline_Always applies to a compatible
+--  compilation unit denoted by Spec_Id.
+
+function Declarative_List_OK return Boolean;
+pragma Inline (Declarative_List_OK);
+--  Determine whether the initial declaration of subprogram Spec_Id
+--  and the pragma appear in compatible declarative lists.
+
+function Subprogram_Body_OK return Boolean;
+pragma Inline (Subprogram_Body_OK);
+--  Determine whether pragma Inline_Always applies to a compatible
+--  subprogram body denoted by Spec_Id.
+
+-
+-- 

[Ada] Validity check failure with packed array and pragma

2017-09-18 Thread Pierre-Marie de Rodat
Fix a bug in which if validity checking is enabled, Initialize_Scalars
is enabled, optimization is turned on, and inlining is enabled, the
initialization of a packed array can cause Constraint_Error to be
incorrectly raised.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-09-18  Bob Duff  

* exp_ch3.adb (Build_Array_Init_Proc): If validity checking is enabled,
and it's a bit-packed array, pass False to the Consider_IS parameter of
Needs_Simple_Initialization.

gcc/testsuite/

2017-09-18  Bob Duff  

* gnat.dg/validity_check.adb: New testcase.
Index: exp_ch3.adb
===================================================================
--- exp_ch3.adb (revision 252913)
+++ exp_ch3.adb (working copy)
@@ -517,6 +517,10 @@
 
procedure Build_Array_Init_Proc (A_Type : Entity_Id; Nod : Node_Id) is
   Comp_Type: constant Entity_Id  := Component_Type (A_Type);
+  Comp_Type_Simple : constant Boolean :=
+Needs_Simple_Initialization
+  (Comp_Type, Consider_IS =>
+ not (Validity_Check_Copies and Is_Bit_Packed_Array (A_Type)));
   Body_Stmts   : List_Id;
   Has_Default_Init : Boolean;
   Index_List   : List_Id;
@@ -557,7 +561,7 @@
   Convert_To (Comp_Type,
 Default_Aspect_Component_Value (First_Subtype (A_Type)))));
 
- elsif Needs_Simple_Initialization (Comp_Type) then
+ elsif Comp_Type_Simple then
 Set_Assignment_OK (Comp);
 return New_List (
   Make_Assignment_Statement (Loc,
@@ -589,7 +593,7 @@
  --  the dummy Init_Proc needed for Initialize_Scalars processing.
 
  if not Has_Non_Null_Base_Init_Proc (Comp_Type)
-   and then not Needs_Simple_Initialization (Comp_Type)
+   and then not Comp_Type_Simple
and then not Has_Task (Comp_Type)
and then not Has_Default_Aspect (A_Type)
  then
@@ -679,7 +683,7 @@
   --  init_proc.
 
   Has_Default_Init := Has_Non_Null_Base_Init_Proc (Comp_Type)
-or else Needs_Simple_Initialization (Comp_Type)
+or else Comp_Type_Simple
 or else Has_Task (Comp_Type)
 or else Has_Default_Aspect (A_Type);
 
Index: ../testsuite/gnat.dg/validity_check.adb
===================================================================
--- ../testsuite/gnat.dg/validity_check.adb (revision 0)
+++ ../testsuite/gnat.dg/validity_check.adb (revision 0)
@@ -0,0 +1,18 @@
+--  { dg-do run }
+--  { dg-options "-cargs -O -gnatn -gnatVa -gnatws -margs" }
+
+pragma Initialize_Scalars;
+
+procedure Validity_Check is
+
+   type Small_Int is mod 2**6;
+
+   type Arr is array (1 .. 16) of Small_Int;
+   pragma Pack (Arr);
+
+   S : Small_Int;
+   A : Arr;
+
+begin
+   null;
+end;


[Ada] Iterable aspect of derived types

2017-09-18 Thread Pierre-Marie de Rodat
This patch avoids incorrect compilation errors if a derived type has a
parent type for which the Iterable aspect is specified, and a "for
... of" loop is used on an object of the derived type.

The following test should compile quietly.

gcc -c seqs-main.adb

package Seqs is
   type Container is null record
 with Iterable =>
   (First => First_Element,
Next => Next,
Has_Element => Has_Element,
Element => Get_Element);

   type Cursor is new Integer;
   type Element is new Boolean;
   type Element_Access is access all Element;

   function First_Element (Self : Container) return Cursor;
   function Next (Self : Container; C : Cursor) return Cursor;
   function Has_Element (Self : Container; C : Cursor) return Boolean;
   function Get_Element (Self : Container; C : Cursor) return Element_Access;

   type Derived is new Container;
end Seqs;

procedure Seqs.Main is
   S : Derived;
begin
   for X of S loop
  null;
   end loop;
end Seqs.Main;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Bob Duff  

* exp_ch5.adb (Build_Formal_Container_Iteration,
Expand_Formal_Container_Element_Loop): Convert the container to the
root type before passing it to the iteration operations, so it will be
of the right type.

Index: exp_ch5.adb
===================================================================
--- exp_ch5.adb (revision 252907)
+++ exp_ch5.adb (working copy)
@@ -74,6 +74,12 @@
--  Utility to create declarations and loop statement for both forms
--  of formal container iterators.
 
+   function Convert_To_Iterable_Type
+ (Container : Entity_Id; Loc : Source_Ptr) return Node_Id;
+   --  Returns New_Occurrence_Of (Container), possibly converted to an
+   --  ancestor type, if the type of Container inherited the Iterable
+   --  aspect_specification from that ancestor.
+
function Change_Of_Representation (N : Node_Id) return Boolean;
--  Determine if the right-hand side of assignment N is a type conversion
--  which requires a change of representation. Called only for the array
@@ -189,7 +195,7 @@
 Make_Function_Call (Loc,
   Name   => New_Occurrence_Of (First_Op, Loc),
   Parameter_Associations => New_List (
-New_Occurrence_Of (Container, Loc))));
+Convert_To_Iterable_Type (Container, Loc))));
 
   --  Statement that advances cursor in loop
 
@@ -200,7 +206,7 @@
 Make_Function_Call (Loc,
   Name   => New_Occurrence_Of (Next_Op, Loc),
   Parameter_Associations => New_List (
-New_Occurrence_Of (Container, Loc),
+Convert_To_Iterable_Type (Container, Loc),
 New_Occurrence_Of (Cursor, Loc))));
 
   --  Iterator is rewritten as a while_loop
@@ -211,13 +217,12 @@
 Make_Iteration_Scheme (Loc,
   Condition =>
 Make_Function_Call (Loc,
-  Name   =>
-New_Occurrence_Of (Has_Element_Op, Loc),
+  Name => New_Occurrence_Of (Has_Element_Op, Loc),
   Parameter_Associations => New_List (
-New_Occurrence_Of (Container, Loc),
+Convert_To_Iterable_Type (Container, Loc),
 New_Occurrence_Of (Cursor, Loc)))),
-  Statements   => Stats,
-  End_Label=> Empty);
+  Statements => Stats,
+  End_Label  => Empty);
end Build_Formal_Container_Iteration;
 
--
@@ -233,6 +238,26 @@
 not Same_Representation (Etype (Rhs), Etype (Expression (Rhs)));
end Change_Of_Representation;
 
+   ------------------------------
+   -- Convert_To_Iterable_Type --
+   ------------------------------
+
+   function Convert_To_Iterable_Type
+ (Container : Entity_Id; Loc : Source_Ptr) return Node_Id
+   is
+  Typ: constant Entity_Id  := Base_Type (Etype (Container));
+  Aspect : constant Node_Id := Find_Aspect (Typ, Aspect_Iterable);
+  Result : Node_Id := New_Occurrence_Of (Container, Loc);
+   begin
+  if Entity (Aspect) /= Typ then
+ Result := Make_Type_Conversion (Loc,
+ Subtype_Mark => New_Occurrence_Of (Entity (Aspect), Loc),
+ Expression   => Result);
+  end if;
+
+  return Result;
+   end Convert_To_Iterable_Type;
+
-
-- Expand_Assign_Array --
-
@@ -3207,7 +3232,7 @@
Make_Function_Call (Loc,
  Name   => New_Occurrence_Of (Element_Op, Loc),
  Parameter_Associations => New_List (
-   New_Occurrence_Of (Container, Loc),
+   Convert_To_Iterable_Type (Container, Loc),
 New_Occurrence_Of (Cursor, Loc))));
 
  Set_Statements (New_Loop,
@@ -3226,7 +3251,7 

[Ada] Implicit_Dereference with access to access and prefix notation

2017-09-18 Thread Pierre-Marie de Rodat
Fix a bug in which a call of the form X.Y (the prefix notation of Y(X))
where X is of a reference type (i.e. a type with the
Implicit_Dereference aspect specified), and the access
discriminant of X has a designated type that is also an access type,
incorrectly gets compilation errors.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-09-18  Bob Duff  

* sem_ch4.adb (Complete_Object_Operation): Do not insert 'Access for
reference types in the access-to-access case.

gcc/testsuite/

2017-09-18  Bob Duff  

* gnat.dg/tagged_prefix_call.adb: New testcase.

Index: sem_ch4.adb
===================================================================
--- sem_ch4.adb (revision 252913)
+++ sem_ch4.adb (working copy)
@@ -8554,14 +8554,21 @@
  ("expect variable in call to&", Prefix (N), Entity (Subprog));
 end if;
 
- --  Conversely, if the formal is an access parameter and the object
- --  is not, replace the actual with a 'Access reference. Its analysis
- --  will check that the object is aliased.
+ --  Conversely, if the formal is an access parameter and the object is
+ --  not an access type or a reference type (i.e. a type with the
+ --  Implicit_Dereference aspect specified), replace the actual with a
+ --  'Access reference. Its analysis will check that the object is
+ --  aliased.
 
  elsif Is_Access_Type (Formal_Type)
and then not Is_Access_Type (Etype (Obj))
+   and then (not Has_Implicit_Dereference (Etype (Obj))
+ or else
+   not Is_Access_Type
+ (Designated_Type
+(Etype (Get_Reference_Discriminant (Etype (Obj))))))
  then
---  A special case: A.all'access is illegal if A is an access to a
+--  A special case: A.all'Access is illegal if A is an access to a
 --  constant and the context requires an access to a variable.
 
 if not Is_Access_Constant (Formal_Type) then
Index: ../testsuite/gnat.dg/tagged_prefix_call.adb
===================================================================
--- ../testsuite/gnat.dg/tagged_prefix_call.adb (revision 0)
+++ ../testsuite/gnat.dg/tagged_prefix_call.adb (revision 0)
@@ -0,0 +1,24 @@
+--  { dg-do compile }
+
+procedure Tagged_Prefix_Call is
+
+   package Defs is
+  type Database_Connection_Record is abstract tagged null record;
+  type Database_Connection is access all Database_Connection_Record'Class;
+
+  procedure Start_Transaction
+(Self : not null access Database_Connection_Record'Class)
+  is null;
+
+  type DB_Connection (Elem : access Database_Connection)
+  is null record
+with Implicit_Dereference => Elem;
+   end Defs;
+
+   use Defs;
+
+   DB  : DB_Connection(null);
+
+begin
+   DB.Start_Transaction;
+end Tagged_Prefix_Call;


Re: [Patch, Fortran] PR 82018: -Wextra should imply -Wconversion-extra

2017-09-18 Thread Dominique d'Humières
As said in bugzilla, I am against this change. If you want to use 
-Wconversion-extra, just add it to your favorite options.

-Wconversion-extra is extremely noisy and -Wextra has been stable for some 
years. IMO we cannot afford to have people complaining about the change.

If you really want a synthetic option, why not a new one -Wnoisy (or 
-Weally-all, …) which implies '-Wall -Wextra  -Wconversion-extra …’?

Dominique

PS. Who wants

Warning: Conversion from 'REAL(4)' to 'REAL(8)' at (1) [-Wconversion-extra]

even if it may help detect things such as ‘pi8=acos(-1.0)’?
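
To make the ‘pi8=acos(-1.0)’ case concrete, here is a minimal C analogue. It is only a sketch and not gfortran code: a value computed in single precision and then widened to double keeps single-precision accuracy. The names widening_error and check_widening_error are made up for illustration, and a literal stands in for acos(-1.0) so that no math library is needed.

```c
#include <assert.h>

/* Hypothetical helper: absolute error left over when pi is obtained in
   single precision and only afterwards widened to double.  */
static double widening_error(void)
{
  float pi4 = 3.14159265358979323846f;  /* rounded to float here */
  double pi8_widened = pi4;             /* widening adds no digits */
  double pi8 = 3.14159265358979323846;  /* full double precision */
  double diff = pi8_widened - pi8;

  return diff < 0 ? -diff : diff;
}

/* Sanity check: the error is far above double epsilon (about 2.2e-16)
   but still within single-precision accuracy.  */
void check_widening_error(void)
{
  double err = widening_error();

  assert(err > 1e-9 && err < 1e-6);
}
```

The conversion itself is exact; the digits were already lost when the value was computed in the narrower type, which is the situation the warning quoted above points at.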



Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-18 Thread Uros Bizjak
On Thu, Sep 14, 2017 at 2:10 PM, Shalnov, Sergey
 wrote:
> Hi,
> GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead of 
> 256-bit AVX registers in the auto-vectorizer.
> This patch enables the command line option "mprefer-avx256" that reduces 
> 512-bit registers usage in "march=skylake-avx512" mode.
> This is the initial implementation of the option. Currently, 512-bit
> registers might still appear in some cases. I plan to continue fixing the
> cases where 512-bit registers appear.
> Sergey
>
> 2017-09-14  Sergey Shalnov  sergey.shal...@intel.com
> * config/i386/i386.opt (mprefer-avx256): New flag.
> * config/i386/i386.c (ix86_preferred_simd_mode): Prefer 256-bit AVX modes 
> when the flag -mprefer-avx256 is on.

Please rewrite integer mode handling in ix86_preferred_simd_mode to
some consistent form, like:

case E_QImode:
  if (TARGET_AVX512BW && !TARGET_PREFER_AVX256)
    return V64QImode;
  else if (TARGET_AVX && !TARGET_PREFER_AVX128)
    return V32QImode;
  else
    return V16QImode;

...

and ix86_autovectorize_vector_sizes to some more readable form, like:

static unsigned int
ix86_autovectorize_vector_sizes (void)
{
  unsigned int bytesizes = 16;

  if (TARGET_AVX && !TARGET_PREFER_AVX128)
    bytesizes |= 32;
  if (TARGET_AVX512F && !TARGET_PREFER_AVX256)
    bytesizes |= 64;

  return bytesizes;
}
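
For readers who want to try the bitmask form outside of GCC, a standalone sketch follows. It assumes that each set bit is a vector size in bytes; struct target_flags, vector_sizes, largest_size and check_vector_sizes are hypothetical stand-ins and not the real TARGET_* macros or any GCC interface.

```c
#include <assert.h>

/* Hypothetical stand-ins for the TARGET_* macros used above.  */
struct target_flags
{
  int avx, avx512f, prefer_avx128, prefer_avx256;
};

/* Same shape as the suggested ix86_autovectorize_vector_sizes: every
   set bit is a vector size in bytes that may be tried.  */
static unsigned int vector_sizes(const struct target_flags *t)
{
  unsigned int bytesizes = 16;

  if (t->avx && !t->prefer_avx128)
    bytesizes |= 32;
  if (t->avx512f && !t->prefer_avx256)
    bytesizes |= 64;

  return bytesizes;
}

/* Largest size present in MASK, or 0 if MASK is empty.  */
static unsigned int largest_size(unsigned int mask)
{
  unsigned int size = 0;

  for (unsigned int bit = 1; bit != 0 && bit <= mask; bit <<= 1)
    if (mask & bit)
      size = bit;

  return size;
}

void check_vector_sizes(void)
{
  /* AVX-512 target with the new preference set: 64 is not offered.  */
  struct target_flags skx256 = { 1, 1, 0, 1 };

  assert(vector_sizes(&skx256) == (16 | 32));
  assert(largest_size(vector_sizes(&skx256)) == 32);
}
```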

Uros.


Re: [Patch, Fortran] PR 82018: -Wextra should imply -Wconversion-extra

2017-09-18 Thread Janus Weil
> As a sidenote, I made an observation that is not directly related to
> the patch: The second warning in the test case, on "i4 = i8" shows
> [-Wconversion] when compiled with -Wall, but [-Wconversion-extra] when
> compiled with -Wextra. Does anyone understand how that inconsistency
> comes about?

Btw, this is probably due to constructs like:

if ((warn_conversion || warn_conversion_extra) ...

which occur in a few places in arith.c. In this sense, the
documentation is apparently not fully correct, claiming that
-Wconversion-extra does not imply -Wconversion? It probably does not
do so for all warnings, but for some of them it does apparently?

Maybe it would be clearer to just make -Wconversion-extra imply
-Wconversion (and document that), in order to avoid confusion?
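
A toy model of how such a shared condition could produce the two different labels is sketched below. This is a guess at the mechanism and not code taken from arith.c; struct warn_flags and conversion_label are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two warning flags.  */
struct warn_flags
{
  int conversion;
  int conversion_extra;
};

/* Returns the option name a diagnostic would be tagged with, or NULL
   if neither flag is set.  Purely illustrative.  */
static const char *conversion_label(const struct warn_flags *w)
{
  if (!(w->conversion || w->conversion_extra))
    return NULL;

  return w->conversion ? "-Wconversion" : "-Wconversion-extra";
}

void check_conversion_label(void)
{
  struct warn_flags wall = { 1, 0 };    /* -Wall turns on -Wconversion */
  struct warn_flags wextra = { 0, 1 };  /* -Wextra with the patch */

  assert(strcmp(conversion_label(&wall), "-Wconversion") == 0);
  assert(strcmp(conversion_label(&wextra), "-Wconversion-extra") == 0);
}
```

In this model the same source line is diagnosed under either flag and only the reported label changes, which would match the observation quoted above.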

Cheers,
Janus


Re: [Patch, Fortran] PR 82143: add a -fdefault-real-16 flag

2017-09-18 Thread Dominique d'Humières
As said in bugzilla

(1) real(16) is an order of magnitude slower than real(8) for the codes I have 
tested (a long time ago). So its real utility is quite low.

(2) I think your time would be better used by dealing with your assigned PRs.

But now the wasted time is done, I don’t have further objection.

Dominique

Note: Using -fdefault-* for "production" is a very bad idea.



Re: [Patch, Fortran] PR 82018: -Wextra should imply -Wconversion-extra

2017-09-18 Thread Janus Weil
Hi Thomas,

>> here is a small patch that enables -Wconversion-extra with -Wextra and
>> updates the documentation.
>
> I grepped for warn_conversion_extra and found 14 occurrences in the
> gfortran source tree.
>
> Are we sure we want to enable each of these warnings with -Wextra?

I'd say: Yes, why not. They're all similar in style, it seems. Which
one are you worried about, in particular?

AFAICS, these warnings will essentially be triggered on compile-time
constants, for which the compiler knows that they can be converted
e.g. from real to integer without change in value. For such constants,
I guess it is easy (and advisable) to change them.

If you're writing '2.0' and assign it to an integer, then you probably
just mean '2'. I'd say the warning is justified for such cases.
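
The "without change in value" property is easy to state in code. The C sketch below only illustrates the idea and is not how gfortran decides anything; converts_exactly and check_converts_exactly are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if X can be converted to int and back without changing
   its value, 0 otherwise.  */
static int converts_exactly(double x)
{
  /* Reject values outside the int range first (this also rejects
     NaN), because an out-of-range conversion is undefined in C.  */
  if (!(x >= (double) INT_MIN && x <= (double) INT_MAX))
    return 0;

  return (double) (int) x == x;
}

void check_converts_exactly(void)
{
  assert(converts_exactly(2.0));   /* '2.0' really is just '2' */
  assert(!converts_exactly(2.5));  /* the fraction would be lost */
}
```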

Cheers,
Janus


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08 - V3

2017-09-18 Thread Andreas Schwab
On Jul 30 2017, Jeff Law  wrote:

> This patch introduces generic mechanisms to protect the dynamically
> allocated stack space against stack-clash attacks.
>
> Changes since V2:
>
> Dynamic allocations can be emitted as unrolled inlined probes or with a
> rotated loop.  Blockage insns are also properly emitted for the dynamic
> area probes and the dynamic area probing now supports targets that may
> make optimistic assumptions in their prologues.  Finally it uses the new
> param to control the probing interval.
>
> Tests were updated to explicitly specify the guard and probing interval.
>  New test to check inline/unrolled probes as well as rotated loop.

Does that work correctly when the VLA is smaller than the probe size
(word_mode by default)?  I see a failure in glibc on armv7 where
ldconfig is using a zero-size VLA, which is invalid in C, but it could
also end up using a VLA of size 1.
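
The kind of allocation in question can be reduced to something like the following C sketch. It is a hypothetical reproducer shape, not the ldconfig code, and fill_small_vla and check_small_vla are made-up names; the interesting case is n == 1, where the dynamic area is smaller than a word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills an N-byte VLA and returns its last byte.  N must be at least
   1, since a zero-size VLA is undefined behavior in C.  */
static unsigned char fill_small_vla(size_t n, unsigned char value)
{
  assert(n >= 1);

  unsigned char buf[n];  /* dynamic stack allocation of n bytes */

  memset(buf, value, n);
  return buf[n - 1];
}

void check_small_vla(void)
{
  /* A one-byte VLA: smaller than any word-sized probe.  */
  assert(fill_small_vla(1, 0x5a) == 0x5a);
}
```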

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Better fix for the x86_64 -mcmodel=large ICEs (PR target/82145)

2017-09-18 Thread Uros Bizjak
On Sun, Sep 17, 2017 at 7:27 PM, Jakub Jelinek  wrote:
> On Sun, Sep 17, 2017 at 05:47:11PM +0200, Uros Bizjak wrote:
>> > The postreload change is ok.
>>
>> The revert is OK even without approval.

I see.

The patch is also OK.

Thanks,
Uros.

> Well, it isn't a pure reversion, it is reversion plus addition of
>   const char *name = LABEL_NAME (label);
>   PUT_CODE (label, NOTE);
>   NOTE_KIND (label) = NOTE_INSN_DELETED_LABEL;
>   NOTE_DELETED_LABEL_NAME (label) = name;
> to the end of ix86_init_large_pic_reg.
>
>> >> 2017-09-13  Jakub Jelinek  
>> >>
>> >>   PR target/82145
>> >>   * postreload.c (reload_cse_simplify_operands): Skip
>> >>   NOTE_INSN_DELETED_LABEL similarly to skipping CODE_LABEL.
>> >>   * config/i386/i386.c (ix86_init_large_pic_reg): Revert 2017-09-01
>> >>   changes.  Turn CODE_LABEL into NOTE_INSN_DELETED_LABEL immediately.
>> >>   (ix86_init_pic_reg): Revert 2017-09-01 changes.
>> >>
>> >>   * gcc.target/i386/pr82145.c: New test.
>
> Jakub


[Ada] Scalar_Storage_Order support in conjunction with overlay

2017-09-18 Thread Pierre-Marie de Rodat
Toggling the scalar storage order by means of type punning or aliasing is not
supported in the general case, but it might be reasonable to support simple
overlays precisely used to test the effect of the attribute.

The following procedure must give the same output at all optimization levels:

with System;
with Interfaces; use Interfaces;
with Ada.Text_IO; use Ada.Text_IO;

procedure P is

  type U8Array is Array (Natural Range <>) of Unsigned_8;

  type One_Element_Unpacked is record
Value : Integer;
  end record;

  type One_Element_Packed is new One_Element_Unpacked;
  for One_Element_Packed use record
Value at 0 range 0 .. 31;
  end record;

  One_Element_Packed_Size : constant Positive := 32;
  for One_Element_Packed'Bit_Order use System.High_Order_First;
  for One_Element_Packed'Scalar_Storage_Order use System.High_Order_First;
  for One_Element_Packed'Size use One_Element_Packed_Size;

  subtype One_Element_Byte_Array
is U8Array(1 .. (One_Element_Packed'Object_Size/Unsigned_8'Object_Size));

  function F (Input : in One_Element_Packed) return One_Element_Byte_Array is
Result : constant One_Element_Byte_Array;
pragma Import (Ada, Result);
for Result'Address use Input'Address;
  begin
return Result;
  end;

  a : constant One_Element_Packed := (Value => 12);
  a_bytes : constant One_Element_Byte_Array := F (a);

begin
  Put("Record with single component byte representation:");
  for element of a_bytes loop
Put(element'img & " ");
  end loop;
end;
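
As a point of reference, the byte sequence this test should print can be worked out by hand. The C sketch below assumes a 32-bit Integer and only illustrates the high-order-first layout; store_high_order_first and check_high_order_first are made-up names, not part of the patch or of GNAT.

```c
#include <assert.h>
#include <stdint.h>

/* Stores VALUE into OUT with the most significant byte first, which is
   the layout System.High_Order_First requests for the component.  */
static void store_high_order_first(uint32_t value, unsigned char out[4])
{
  out[0] = (unsigned char) (value >> 24);
  out[1] = (unsigned char) (value >> 16);
  out[2] = (unsigned char) (value >> 8);
  out[3] = (unsigned char) value;
}

void check_high_order_first(void)
{
  unsigned char bytes[4];

  store_high_order_first(12, bytes);  /* same value as in the test */
  assert(bytes[0] == 0 && bytes[1] == 0 && bytes[2] == 0);
  assert(bytes[3] == 12);
}
```

For Value => 12 this gives the bytes 0 0 0 12, whatever the endianness of the host.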

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Eric Botcazou  

* sem_ch13.adb (Analyze_Attribute_Definition_Clause) : Mark
the entity as being volatile for an overlay that toggles the scalar
storage order.

Index: sem_ch13.adb
===================================================================
--- sem_ch13.adb(revision 252907)
+++ sem_ch13.adb(working copy)
@@ -5084,6 +5084,22 @@
 Register_Address_Clause_Check
   (N, U_Ent, No_Uint, O_Ent, Off);
  end if;
+
+ --  If the overlay changes the storage order, mark the
+ --  entity as being volatile to block any optimization
+ --  for it since the construct is not really supported
+ --  by the back end.
+
+ if (Is_Record_Type (Etype (U_Ent))
+  or else Is_Array_Type (Etype (U_Ent)))
+   and then (Is_Record_Type (Etype (O_Ent))
+  or else Is_Array_Type (Etype (O_Ent)))
+   and then Reverse_Storage_Order (Etype (U_Ent))
+  /= Reverse_Storage_Order (Etype (O_Ent))
+ then
+Set_Treat_As_Volatile (U_Ent);
+ end if;
+
   else
  --  If this is not an overlay, mark a variable as being
  --  volatile to prevent unwanted optimizations. It's a


Re: Backports for GCC 6 branch

2017-09-18 Thread Martin Liška
On 09/16/2017 12:11 PM, Eric Botcazou wrote:
>> One more that I've just tested.
> 
> On which platform?  Here's what I have on x86_64-suse-linux:
> 
> FAIL: gcc.dg/asan/pr81224.c   -O0  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -O0  (test for excess errors)
> FAIL: gcc.dg/asan/pr81224.c   -O1  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -O1  (test for excess errors)
> FAIL: gcc.dg/asan/pr81224.c   -O2  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -O2  (test for excess errors)
> FAIL: gcc.dg/asan/pr81224.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  (test for excess errors)
> FAIL: gcc.dg/asan/pr81224.c   -O3 -g  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -O3 -g  (test for excess errors)
> FAIL: gcc.dg/asan/pr81224.c   -Os  (internal compiler error)
> FAIL: gcc.dg/asan/pr81224.c   -Os  (test for excess errors)
> 


Hi.

Sorry for the breakage, fixed by removal of the test-case.
It needs missing bits that will not be backported to GCC-[56] branches.

Martin


[Ada] Crash on mutable record component with box initialization

2017-09-18 Thread Pierre-Marie de Rodat
This patch fixes a compiler abort on a record declaration that includes a
mutable record component whose default value is an aggregate that includes
a box-initialized component whose value depends on a discriminant of the
component.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

2017-09-18  Ed Schonberg  

* exp_ch3.adb (Replace_Discriminant_References): New procedure,
subsidiary of Build_Assignment, used to handle the initialization code
for a mutable record component whose default value is an aggregate that
sets the values of the discriminants of the components.

gcc/testsuite/

2017-09-18  Ed Schonberg  

* gnat.dg/default_variants.adb: New testcase.
Index: exp_ch3.adb
===================================================================
--- exp_ch3.adb (revision 252907)
+++ exp_ch3.adb (working copy)
@@ -1782,6 +1782,42 @@
  Lhs  : Node_Id;
  Res  : List_Id;
 
+ function Replace_Discr_Ref (N : Node_Id) return Traverse_Result;
+ --  Analysis of the aggregate has replaced discriminants by their
+ --  corresponding discriminals, but these are irrelevant when the
+ --  component has a mutable type and is initialized with an aggregate.
+ --  Instead, they must be replaced by the values supplied in the
+ --  aggregate, that will be assigned during the expansion of the
+ --  assignment.
+
+ ---
+ -- Replace_Discr_Ref --
+ ---
+
+ function Replace_Discr_Ref (N : Node_Id) return Traverse_Result is
+Val : Node_Id;
+ begin
+if Is_Entity_Name (N)
+  and then Present (Entity (N))
+  and then Is_Formal (Entity (N))
+  and then Present (Discriminal_Link (Entity (N)))
+then
+   Val :=
+  Make_Selected_Component (N_Loc,
+Prefix => New_Copy_Tree (Lhs),
+Selector_Name => New_Occurrence_Of
+  (Discriminal_Link (Entity (N)), N_Loc));
+   if Present (Val) then
+  Rewrite (N, New_Copy_Tree (Val));
+   end if;
+end if;
+
+return OK;
+ end Replace_Discr_Ref;
+
+ procedure Replace_Discriminant_References is
+   new Traverse_Proc (Replace_Discr_Ref);
+
   begin
  Lhs :=
Make_Selected_Component (N_Loc,
@@ -1789,6 +1825,22 @@
  Selector_Name => New_Occurrence_Of (Id, N_Loc));
  Set_Assignment_OK (Lhs);
 
+ if Nkind (Exp) = N_Aggregate
+   and then Has_Discriminants (Typ)
+   and then not Is_Constrained (Base_Type (Typ))
+ then
+--  The aggregate may provide new values for the discriminants
+--  of the component, and other components may depend on those
+--  discriminants. Previous analysis of those expressions have
+--  replaced the discriminants by the formals of the initialization
+--  procedure for the type, but these are irrelevant in the
+--  enclosing initialization procedure: those discriminant
+--  references must be replaced by the values provided in the
+--  aggregate.
+
+Replace_Discriminant_References (Exp);
+ end if;
+
  --  Case of an access attribute applied to the current instance.
  --  Replace the reference to the type by a reference to the actual
  --  object. (Note that this handles the case of the top level of
Index: ../testsuite/gnat.dg/default_variants.adb
===
--- ../testsuite/gnat.dg/default_variants.adb   (revision 0)
+++ ../testsuite/gnat.dg/default_variants.adb   (revision 0)
@@ -0,0 +1,28 @@
+--  { dg-do compile }
+
+procedure Default_Variants is
+
+   type Variant_Kind is (A, B);
+
+   function Get_Default_Value (Kind : in Variant_Kind) return Natural is (10);
+
+   type Variant_Type (Kind : Variant_Kind := A) is
+  record
+ Common : Natural := Get_Default_Value (Kind);
+ case Kind is
+when A =>
+   A_Value : Integer := Integer'First;
+when B =>
+   B_Value : Natural := Natural'First;
+ end case;
+  end record;
+
+   type Containing_Type is tagged
+  record
+ Variant_Data : Variant_Type :=
+   (Kind => B, Common => <>, B_Value => 1);
+  end record;
+
+begin
+null;
+end Default_Variants;


[Ada] Crash on illegal current instance

2017-09-18 Thread Pierre-Marie de Rodat
If the type_mark of a qualified_expression refers to the current
instance of the type, do not crash; instead give a proper error
message. This is illegal by RM-8.6(17).

The following test should get an error:

current_instance_default.ads:2:54: current instance not allowed

package Current_Instance_Default is
   type Color is (Red, Orange) with Default_Value => Color'(Red); -- ERROR:
end Current_Instance_Default;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Bob Duff  

* sem_ch4.adb (Analyze_Qualified_Expression): Give an error if the type
mark refers to the current instance. Set the type to Any_Type in that
case, to avoid later crashes.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 252907)
+++ sem_ch4.adb (working copy)
@@ -3930,6 +3930,23 @@
   Set_Etype (N, Any_Type);
   Find_Type (Mark);
   T := Entity (Mark);
+
+  if Nkind_In
+(Enclosing_Declaration (N),
+ N_Formal_Type_Declaration,
+ N_Full_Type_Declaration,
+ N_Incomplete_Type_Declaration,
+ N_Protected_Type_Declaration,
+ N_Private_Extension_Declaration,
+ N_Private_Type_Declaration,
+ N_Subtype_Declaration,
+ N_Task_Type_Declaration)
+and then T = Defining_Identifier (Enclosing_Declaration (N))
+  then
+ Error_Msg_N ("current instance not allowed", Mark);
+ T := Any_Type;
+  end if;
+
   Set_Etype (N, T);
 
   if T = Any_Type then


[RFC][PATCH] ipa: fix dumping with deleted multiversioning nodes

2017-09-18 Thread Evgeny Kudryashov

Hello,

The code below causes an internal compiler error in cc1plus (trunk on 
x86-64) if it is compiled with -fdump-ipa-cgraph.


int foo () __attribute__ ((target ("default")));
int foo () __attribute__ ((target ("sse4.2")));

__attribute__ ((target ("sse4.2")))
int foo ()
{
  return 1;
}

The error occurs in cgraph_node::dump (gcc/cgraph.c:2065), specifically
in the following fragment:


cgraph_function_version_info *vi = function_version ();
if (vi != NULL)
  {
    fprintf (f, "  Version info: ");
    if (vi->prev != NULL)
      {
        fprintf (f, "prev: ");
        fprintf (f, "%s ", vi->prev->this_node->dump_asm_name ());
      }
    if (vi->next != NULL)
      {
        fprintf (f, "next: ");
        fprintf (f, "%s ", vi->next->this_node->dump_asm_name ());
      }
    if (vi->dispatcher_resolver != NULL_TREE)
      fprintf (f, "dispatcher: %s",
               lang_hooks.decl_printable_name (vi->dispatcher_resolver, 2));

    fprintf (f, "\n");
  }


The expression "vi->{prev,next}->this_node" can be null when the
neighbouring version belongs to an unused symbol that was already removed.


Is it intentional that removing a cgraph node does not also remove 
versions?


As a solution, I suggest also deleting the node's version info during
node removal.
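
A minimal sketch of the idea, assuming simplified stand-in types: Node and VersionInfo below are hypothetical and only model the prev/next/this_node links, not the real cgraph_node or cgraph_function_version_info. It shows that unlinking the version record at removal time keeps the chain safe to walk from the remaining nodes.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for cgraph_node and
// cgraph_function_version_info; not the real GCC types.
struct Node;

struct VersionInfo {
  Node *this_node = nullptr;
  VersionInfo *prev = nullptr;
  VersionInfo *next = nullptr;
};

struct Node {
  VersionInfo version;  // embedded for brevity; GCC keeps these in a hash table
  bool removed = false;
};

// Chain B's version record directly after A's.
void link_versions(Node &a, Node &b) {
  a.version.this_node = &a;
  b.version.this_node = &b;
  a.version.next = &b.version;
  b.version.prev = &a.version;
}

// Unlink V from its neighbours, as the prev/next fix-up in the proposed
// static delete_function_version does.
void unlink_version(VersionInfo &v) {
  if (v.prev != nullptr)
    v.prev->next = v.next;
  if (v.next != nullptr)
    v.next->prev = v.prev;
  v.prev = v.next = nullptr;
}

// Removing a node also drops it from the version chain, so a later dump
// never follows prev/next into a record whose this_node is gone.
void remove_node(Node &n) {
  assert(!n.removed);
  unlink_version(n.version);
  n.version.this_node = nullptr;
  n.removed = true;
}

// True if every neighbour reachable from N still refers to a live node,
// i.e. a dump in the style of cgraph_node::dump would not dereference null.
bool chain_is_dumpable(const Node &n) {
  const VersionInfo &v = n.version;
  return (v.prev == nullptr || v.prev->this_node != nullptr)
         && (v.next == nullptr || v.next->this_node != nullptr);
}
```

With three chained nodes, removing the middle one leaves the outer two linked to each other. If the unlink step were skipped, the outer nodes would still reach a record with a null this_node, which is the situation described above.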


Regards,
Evgeny.

* cgraph.c (delete_function_version): New, broken out from...
(cgraph_node::delete_function_version): ...here.  Rename to
cgraph_node::delete_function_version_by_decl.  Update all uses.
(cgraph_node::remove): Call delete_function_version.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 8bffdec..3d0cefb 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -190,30 +190,34 @@ cgraph_node::insert_new_function_version (void)
   return version_info_node;
 }
 
-/* Remove the cgraph_function_version_info and cgraph_node for DECL.  This
-   DECL is a duplicate declaration.  */
-void
-cgraph_node::delete_function_version (tree decl)
+/* Remove the cgraph_function_version_info node given by DECL_V.  */
+static void
+delete_function_version (cgraph_function_version_info *decl_v)
 {
-  cgraph_node *decl_node = cgraph_node::get (decl);
-  cgraph_function_version_info *decl_v = NULL;
-
-  if (decl_node == NULL)
-return;
-
-  decl_v = decl_node->function_version ();
-
   if (decl_v == NULL)
 return;
 
   if (decl_v->prev != NULL)
-   decl_v->prev->next = decl_v->next;
+decl_v->prev->next = decl_v->next;
 
   if (decl_v->next != NULL)
 decl_v->next->prev = decl_v->prev;
 
   if (cgraph_fnver_htab != NULL)
 cgraph_fnver_htab->remove_elt (decl_v);
+}
+
+/* Remove the cgraph_function_version_info and cgraph_node for DECL.  This
+   DECL is a duplicate declaration.  */
+void
+cgraph_node::delete_function_version_by_decl (tree decl)
+{
+  cgraph_node *decl_node = cgraph_node::get (decl);
+
+  if (decl_node == NULL)
+return;
+
+  delete_function_version (decl_node->function_version ());
 
   decl_node->remove ();
 }
@@ -1844,6 +1848,7 @@ cgraph_node::remove (void)
   remove_callers ();
   remove_callees ();
   ipa_transforms_to_apply.release ();
+  delete_function_version (function_version ());
 
   /* Incremental inlining access removed nodes stored in the postorder list.
  */
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c668b37..1303db0 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1272,7 +1272,7 @@ public:
 
   /* Remove the cgraph_function_version_info and cgraph_node for DECL.  This
  DECL is a duplicate declaration.  */
-  static void delete_function_version (tree decl);
+  static void delete_function_version_by_decl (tree decl);
 
   /* Add the function FNDECL to the call graph.
  Unlike finalize_function, this function is intended to be used
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 858747e..50fa1ba 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2566,7 +2566,7 @@ next_arg:;
   DECL_FUNCTION_VERSIONED (newdecl) = 1;
   /* newdecl will be purged after copying to olddecl and is no longer
  a version.  */
-  cgraph_node::delete_function_version (newdecl);
+  cgraph_node::delete_function_version_by_decl (newdecl);
 }
 
   if (TREE_CODE (newdecl) == FUNCTION_DECL)


[Ada] Spurious unreferenced warning in pragma Linker_Section

2017-09-18 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby actuals within the aspect/pragma
Linker_Section were incorrectly flagged as unreferenced, causing spurious
warnings when compiling with -gnatwu. Pragma_Linker_Section is now correctly
classified within the Sig_Flags table as significant, correcting this behavior.


-- Source --


--  constants.ads

package Constants is
   Custom_Section : constant String := ".customsection";
end Constants;

--  foo.ads

with Constants;

package Foo is
   Variable : Natural
 with Linker_Section => Constants.Custom_Section;
end Foo;

--  bar.ads

with Constants;

package Bar is
   Variable : Natural := 1
 with Linker_Section => Constants.Custom_Section;
end Bar;


-- Compilation and output --


$ gnatmake -q -c -gnatwa foo.ads
$ gnatmake -q -c -gnatwa bar.ads

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Justin Squirek  

* sem_prag.adb (Is_Non_Significant_Pragma_Reference): Change the
constant indication for Pragma_Linker_Section.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 252907)
+++ sem_prag.adb(working copy)
@@ -29548,7 +29548,7 @@
   Pragma_Linker_Constructor => -1,
   Pragma_Linker_Destructor  => -1,
   Pragma_Linker_Options => -1,
-  Pragma_Linker_Section =>  0,
+  Pragma_Linker_Section => -1,
   Pragma_List   =>  0,
   Pragma_Lock_Free  =>  0,
   Pragma_Locking_Policy =>  0,


[Ada][PR ada/71358] GNAT.Command_Line: crash in Getopt on empty Config

2017-09-18 Thread Pierre-Marie de Rodat
This patch revisits the fix for a bug in GNAT.Command_Line.Getopt: instead of
checking everywhere that a pointer is not null, we allocate a dummy object and
remove all null pointer checks.
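
The approach is essentially the null-object pattern. A rough sketch of it, assuming hypothetical names: ConfigRecord, local_config and max_switch_len are illustrative only and are not part of GNAT.Command_Line.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for Command_Line_Configuration_Record.
struct ConfigRecord {
  std::vector<std::string> switches;
};

using Config = std::shared_ptr<ConfigRecord>;

// Replace a null configuration by an empty record once, up front.
Config local_config(const Config &config) {
  return config ? config : std::make_shared<ConfigRecord>();
}

// Width of the widest switch name. The loop needs no null check because
// local_config never returns a null pointer.
std::size_t max_switch_len(const Config &config) {
  const Config local = local_config(config);
  assert(local != nullptr);
  std::size_t max_len = 0;
  for (const std::string &s : local->switches)
    if (s.size() > max_len)
      max_len = s.size();
  return max_len;
}
```

The trade-off is one small allocation when the configuration is null, in exchange for dropping the scattered checks from every use site.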

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Bob Duff  

Alternate fix for PR ada/71358
* libgnat/g-comlin.adb (Getopt): Remove manual null access checks.
Instead, make a local copy of Config, and if it's null, allocate an
empty Command_Line_Configuration_Record, so we won't crash on null
pointer dereference.

Index: libgnat/g-comlin.adb
===
--- libgnat/g-comlin.adb(revision 252907)
+++ libgnat/g-comlin.adb(working copy)
@@ -3153,18 +3153,16 @@
 
  New_Line;
 
- if Section /= "" and then Config.Switches /= null then
+ if Section /= "" then
 Put_Line ("Switches after " & Section);
  end if;
 
  --  Compute size of the switches column
 
- if Config.Switches /= null then
-for S in Config.Switches'Range loop
-   Max_Len := Natural'Max
- (Max_Len, Switch_Name (Config.Switches (S), Section)'Length);
-end loop;
- end if;
+ for S in Config.Switches'Range loop
+Max_Len := Natural'Max
+  (Max_Len, Switch_Name (Config.Switches (S), Section)'Length);
+ end loop;
 
  if Config.Aliases /= null then
 for A in Config.Aliases'Range loop
@@ -3177,28 +3175,26 @@
 
  --  Display the switches
 
- if Config.Switches /= null then
-for S in Config.Switches'Range loop
-   declare
-  N : constant String :=
-Switch_Name (Config.Switches (S), Section);
+ for S in Config.Switches'Range loop
+declare
+   N : constant String :=
+ Switch_Name (Config.Switches (S), Section);
 
-   begin
-  if N /= "" then
- Put (" ");
- Put (N);
- Put ((1 .. Max_Len - N'Length + 1 => ' '));
+begin
+   if N /= "" then
+  Put (" ");
+  Put (N);
+  Put ((1 .. Max_Len - N'Length + 1 => ' '));
 
- if Config.Switches (S).Help /= null then
-Put (Config.Switches (S).Help.all);
- end if;
-
- New_Line;
+  if Config.Switches (S).Help /= null then
+ Put (Config.Switches (S).Help.all);
   end if;
-   end;
-end loop;
- end if;
 
+  New_Line;
+   end if;
+end;
+ end loop;
+
  --  Display the aliases
 
  if Config.Aliases /= null then
@@ -3348,6 +3344,7 @@
   Parser  : Opt_Parser := Command_Line_Parser;
   Concatenate : Boolean := True)
is
+  Local_Config: Command_Line_Configuration := Config;
   Getopt_Switches : String_Access;
   C   : Character := ASCII.NUL;
 
@@ -3373,22 +3370,22 @@
  --  Do automatic handling when possible
 
  if Index /= -1 then
-case Config.Switches (Index).Typ is
+case Local_Config.Switches (Index).Typ is
when Switch_Untyped =>
   null;   --  no automatic handling
 
when Switch_Boolean =>
-  Config.Switches (Index).Boolean_Output.all :=
-Config.Switches (Index).Boolean_Value;
+  Local_Config.Switches (Index).Boolean_Output.all :=
+Local_Config.Switches (Index).Boolean_Value;
   return;
 
when Switch_Integer =>
   begin
  if Parameter = "" then
-Config.Switches (Index).Integer_Output.all :=
-  Config.Switches (Index).Integer_Default;
+Local_Config.Switches (Index).Integer_Output.all :=
+  Local_Config.Switches (Index).Integer_Default;
  else
-Config.Switches (Index).Integer_Output.all :=
+Local_Config.Switches (Index).Integer_Output.all :=
   Integer'Value (Parameter);
  end if;
 
@@ -3402,8 +3399,8 @@
   return;
 
when Switch_String =>
-  Free (Config.Switches (Index).String_Output.all);
-  Config.Switches (Index).String_Output.all :=
+  Free (Local_Config.Switches (Index).String_Output.all);
+  Local_Config.Switches (Index).String_Output.all :=
 new String'(Parameter);
   return;
 end case;
@@ -3441,45 

[Ada] Allow generic package holding state

2017-09-18 Thread Pierre-Marie de Rodat
SPARK RM 7.2.6(5) allows the hidden state of a generic package to be Part_Of
the state of the generic package. This was not properly supported. Now fixed.

The following code compiles without errors.

$ gcc -c gen.ads

generic
   J : Integer;
package Gen with
  SPARK_Mode => On,
  Abstract_State => State
is
   G : Boolean;
private
   H : Boolean with Part_Of => State;
end Gen;

Tested on x86_64-pc-linux-gnu, committed on trunk

2017-09-18  Yannick Moy  

* sem_util.adb (Find_Placement_In_State_Space): Allow generic package
holding state.

Index: sem_util.adb
===
--- sem_util.adb(revision 252907)
+++ sem_util.adb(working copy)
@@ -7922,7 +7922,7 @@
 
   Context := Scope (Item_Id);
   while Present (Context) and then Context /= Standard_Standard loop
- if Ekind (Context) = E_Package then
+ if Is_Package_Or_Generic_Package (Context) then
 Pack_Id := Context;
 
 --  A package body is a cut off point for the traversal as the item


Re: [patch] Fix PR target/81361

2017-09-18 Thread Jakub Jelinek
On Mon, Sep 18, 2017 at 09:50:45AM +0200, Eric Botcazou wrote:
> .set L$set$24,LEFDE3-LASFDE3
>   .long L$set$24  # FDE Length
> LASFDE3:
>   .long   LASFDE3-EH_frame1   # FDE CIE offset
>   .quad   LCOLDB1-.   # FDE initial location
>   .set L$set$25,LCOLDE1-LCOLDB1
>   .quad L$set$25  # FDE address range
>   .byte   0x8 # uleb128 0x8; Augmentation size
>   .quad   LLSDAC5-.   # Language Specific Data Area
>   .byte   0x1 # DW_CFA_set_loc
>   .quad   LCFI1-.
>   .byte   0xe # DW_CFA_def_cfa_offset
>   .byte   0x10# uleb128 0x10
>   .byte   0x83# DW_CFA_offset, column 0x3
>   .byte   0x2 # uleb128 0x2
> 
> Note the DW_CFA_set_loc operation: it's the only case where the compiler emits
> it (DW_CFA_advance_loc4 is usually emitted) and is the source of the problem,
> since it appears that the PC-relative relocation is not applied to the operand
> of the DW_CFA_set_loc (unlike in the 2 other cases in the FDE).

That sounds like a Darwin bug in the handling of DW_CFA_set_loc in the
.eh_frame section.  The DW_CFA_set_loc operand is an encoded pointer: it is
always an absolute address in the .debug_frame section, and otherwise it uses
whatever encoding the CIE augmentation specifies, i.e. the same as e.g. the
FDE initial location pointer.  As that is LCOLDB1-. in the same FDE, it means
LCFI1-. is right.

That said, there is indeed no reason to emit DW_CFA_set_loc when we have a
label, so your patch is ok for trunk.  That doesn't mean Darwin shouldn't be
fixed.  libgcc unwind-dw2.c for DW_CFA_set_loc uses read_encoded_value
and so I believe should DTRT.
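
To illustrate the point about encoded pointers, here is a simplified sketch of decoding a PC-relative operand. It is not libgcc's read_encoded_value: decode_pointer and its field_addr parameter are hypothetical, only 8-byte formats on a 64-bit target are handled, and the DW_EH_PE_* values are the usual ones from the exception-frame pointer-encoding tables.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Subset of the exception-frame pointer encodings.
constexpr unsigned char DW_EH_PE_absptr = 0x00;
constexpr unsigned char DW_EH_PE_udata8 = 0x04;
constexpr unsigned char DW_EH_PE_sdata8 = 0x0c;
constexpr unsigned char DW_EH_PE_pcrel  = 0x10;

// Decode an 8-byte encoded pointer stored at FIELD.  FIELD_ADDR is the
// run-time address of the field itself (hypothetical parameter; a real
// unwinder would derive it from the pointer it is reading through).
std::uint64_t decode_pointer(unsigned char encoding,
                             const unsigned char *field,
                             std::uint64_t field_addr) {
  const unsigned char format = encoding & 0x0f;
  assert(format == DW_EH_PE_absptr || format == DW_EH_PE_udata8
         || format == DW_EH_PE_sdata8);

  std::uint64_t value;
  std::memcpy(&value, field, sizeof value);

  // "LABEL-." stores the distance from the field to the label, so the
  // field's own address has to be added back.
  if ((encoding & 0x70) == DW_EH_PE_pcrel)
    value += field_addr;
  return value;
}
```

A consumer that treated the same bytes as an absolute address would end up with the small label-minus-field distance instead of the label's address, which is the kind of mismatch that would break unwinding through the cold part.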

> This DW_CFA_set_loc instruction is emitted by add_cfis_to_fde for the second 
> FDE generated for the cold part of a function but doesn't seem necessary any 
> more, since there is a label (LCOLDB1) to be used now (this can also be seen 
> on Linux with the -fno-dwarf2-cfi-asm option).
> 
> Bootstrapped/regtested on x86-64/Linux by me and various versions of Darwin by
> Iain, Dominique and myself.  OK for the mainline?
> 
> 
> 2017-09-18  Eric Botcazou  
> 
>   PR target/81361
>   * dwarf2cfi.c (add_cfis_to_fde): Do not generate DW_CFA_set_loc
>   after switching to a new text section.

Jakub


[patch] Fix PR target/81361

2017-09-18 Thread Eric Botcazou
Hi,

exception handling is currently broken in all languages for Darwin at -O2 on 
the mainline because of what appears to be a bug in either the assembler or 
the system unwinder.  The problem occurs when the compiler decides to split a 
function into hot & cold parts and the cold part is active wrt EH; in this 
case, the compiler generates a FDE for each part (the Darwin port doesn't use 
the CFI assembler directives) and the second FDE looks like:

.set L$set$24,LEFDE3-LASFDE3
.long L$set$24  # FDE Length
LASFDE3:
.long   LASFDE3-EH_frame1   # FDE CIE offset
.quad   LCOLDB1-.   # FDE initial location
.set L$set$25,LCOLDE1-LCOLDB1
.quad L$set$25  # FDE address range
.byte   0x8 # uleb128 0x8; Augmentation size
.quad   LLSDAC5-.   # Language Specific Data Area
.byte   0x1 # DW_CFA_set_loc
.quad   LCFI1-.
.byte   0xe # DW_CFA_def_cfa_offset
.byte   0x10# uleb128 0x10
.byte   0x83# DW_CFA_offset, column 0x3
.byte   0x2 # uleb128 0x2

Note the DW_CFA_set_loc operation: it's the only case where the compiler emits 
it (DW_CFA_advance_loc4 is usually emitted) and is the source of the problem, 
since it appears that the PC-relative relocation is not applied to the operand 
of the DW_CFA_set_loc (unlike in the 2 other cases in the FDE).

This DW_CFA_set_loc instruction is emitted by add_cfis_to_fde for the second 
FDE generated for the cold part of a function but doesn't seem necessary any 
more, since there is a label (LCOLDB1) to be used now (this can also be seen 
on Linux with the -fno-dwarf2-cfi-asm option).
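
As a rough illustration of why the two opcodes are interchangeable once the FDE starts at the cold label, the sketch below models only the location counter of a CFI interpreter. CfaOp, LocInsn and apply are hypothetical names, and the addresses in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>

// Only the two location-changing opcodes discussed here.
enum class CfaOp { set_loc, advance_loc4 };

struct LocInsn {
  CfaOp op;
  std::uint64_t operand;  // set_loc: decoded address; advance_loc4: delta
};

// Apply one instruction to the current location LOC.  CODE_ALIGN is the
// CIE's code alignment factor.
std::uint64_t apply(std::uint64_t loc, const LocInsn &insn,
                    std::uint64_t code_align = 1) {
  switch (insn.op) {
  case CfaOp::set_loc:
    return insn.operand;                     // jump to the given address
  case CfaOp::advance_loc4:
    return loc + insn.operand * code_align;  // move relative to LOC
  }
  return loc;
}
```

If the FDE's initial location is the cold label at, say, 0x2000 and the first CFI label sits at 0x2004, then an advance_loc4 of 4 and a set_loc of 0x2004 lead to the same row location. The relative form only needs a difference of two labels in the same section, so no PC-relative operand has to be resolved.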

Bootstrapped/regtested on x86-64/Linux by me and various versions of Darwin by 
Iain, Dominique and myself.  OK for the mainline?


2017-09-18  Eric Botcazou  

PR target/81361
* dwarf2cfi.c (add_cfis_to_fde): Do not generate DW_CFA_set_loc
after switching to a new text section.

-- 
Eric Botcazou

Index: dwarf2cfi.c
===
--- dwarf2cfi.c	(revision 252749)
+++ dwarf2cfi.c	(working copy)
@@ -2209,20 +2209,13 @@ add_cfis_to_fde (void)
 {
   dw_fde_ref fde = cfun->fde;
   rtx_insn *insn, *next;
-  /* We always start with a function_begin label.  */
-  bool first = false;
 
   for (insn = get_insns (); insn; insn = next)
 {
   next = NEXT_INSN (insn);
 
   if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
-	{
-	  fde->dw_fde_switch_cfi_index = vec_safe_length (fde->dw_fde_cfi);
-	  /* Don't attempt to advance_loc4 between labels
-	 in different sections.  */
-	  first = true;
-	}
+	fde->dw_fde_switch_cfi_index = vec_safe_length (fde->dw_fde_cfi);
 
   if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_CFI)
 	{
@@ -2247,8 +2240,7 @@ add_cfis_to_fde (void)
 
 	  /* Set the location counter to the new label.  */
 	  xcfi = new_cfi ();
-	  xcfi->dw_cfi_opc = (first ? DW_CFA_set_loc
-  : DW_CFA_advance_loc4);
+	  xcfi->dw_cfi_opc = DW_CFA_advance_loc4;
 	  xcfi->dw_cfi_oprnd1.dw_cfi_addr = label;
 	  vec_safe_push (fde->dw_fde_cfi, xcfi);
 
@@ -2263,7 +2255,6 @@ add_cfis_to_fde (void)
 	  insn = NEXT_INSN (insn);
 	}
 	  while (insn != next);
-	  first = false;
 	}
 }
 }


Re: [RFC][PATCH 1/5] Add separate parms for rtl unroller

2017-09-18 Thread Richard Biener
On Mon, Sep 18, 2017 at 3:36 AM, Kugan Vivekanandarajah wrote:
> Hi Richard,
>
> On 15 September 2017 at 19:31, Richard Biener wrote:
>> On Fri, Sep 15, 2017 at 3:27 AM, Kugan Vivekanandarajah wrote:
>>> This patch adds separate params for rtl unroller so that they can be
>>> tuned accordingly. Default values I have are based on some testing on
>>> aarch64. I am happy to leave it as the current value and set them in
>>> the back-end.
>>
>> PARAM_MAX_AVERAGE_UNROLLED_INSNS is only used by the RTL
>> unroller.  Why should we separate PARAM_MAX_UNROLL_TIMES?
>>
>> PARAM_MAX_UNROLLED_INSNS is only used by gimple passes
>> that perform unrolling.  Since GIMPLE is three-address it should
>> match RTL reasonably well -- but I'd be ok in having a separate param
>> for those.  But I wouldn't name those 'partial'.
>>
>> That said, those are magic numbers and I expect we can find some
>> that work well on RTL and GIMPLE.
>
> Thanks for the review. I am mostly interested in having separate
> params for RTL runtime unrolling as this is different to what GIMPLE
> unroller does.

Why?  Do you just want to have more magic knobs to machine-auto-tune?

> Maybe I should have separate params only for the
> runtime unrolling (or the partial unroller) and let RTL/GIMPLE share
> the others. Any preference here? Any preference on the name?
>
> I am suspecting that RTL unroller which does the same as GIMPLE is
> kind of obsolete now?

We do not have a GIMPLE loop unroller pass.  On GIMPLE we only do
complete peeling as a separate pass and several passes perform
unrolling as part of their transform.

Richard.

> Thanks,
> Kugan
>
>
>
>
>
>> Richard.
>>
>>>
>>> Thanks,
>>> Kugan
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> 2017-09-12  Kugan Vivekanandarajah  
>>>
>>> * loop-unroll.c (decide_unroll_constant_iterations): Use new params.
>>> (decide_unroll_runtime_iterations): Likewise.
>>> (decide_unroll_stupid): Likewise.
>>> * params.def (DEFPARAM): Separate and add new params for rtl unroller.


[PATCH] Bump downloaded ISL version to 0.18

2017-09-18 Thread Richard Biener

Committed.

Richard.

2017-09-18  Richard Biener  

* download_prerequisites (isl): Bump version to 0.18.

Index: contrib/download_prerequisites
===
--- contrib/download_prerequisites  (revision 252906)
+++ contrib/download_prerequisites  (working copy)
@@ -30,7 +30,7 @@ version='(unversioned)'
 gmp='gmp-6.1.0.tar.bz2'
 mpfr='mpfr-3.1.4.tar.bz2'
 mpc='mpc-1.0.3.tar.gz'
-isl='isl-0.16.1.tar.bz2'
+isl='isl-0.18.tar.bz2'
 
 base_url='ftp://gcc.gnu.org/pub/gcc/infrastructure/'
 


[PATCH][GRAPHITE] Enhance handled data-refs

2017-09-18 Thread Richard Biener

The following removes odd restrictions from data-ref handling, resulting
in 15% more optimized loop nests in SPEC CPU 2006.  We're now also
running into existing PRs when building 481.wrf, I'll have a second
look into the respective PRs as a followup.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-09-18  Richard Biener  

* graphite-scop-detection.c (scop_detection::stmt_has_simple_data_ref):
Simplify.
(build_alias_set): Reject aliases with no access function.

Index: gcc/graphite-scop-detection.c
===
--- gcc/graphite-scop-detection.c   (revision 252905)
+++ gcc/graphite-scop-detection.c   (working copy)
@@ -1338,40 +1338,23 @@ scop_detection::stmt_has_simple_data_ref
 {
   loop_p nest = outermost_loop_in_sese (scop, gimple_bb (stmt));
   loop_p loop = loop_containing_stmt (stmt);
-  vec<data_reference_p> drs = vNULL;
+  if (!loop_in_sese_p (loop, scop))
+loop = nest;
 
-  graphite_find_data_references_in_stmt (nest, loop, stmt, &drs);
+  auto_vec<data_reference_p> drs;
+  if (! graphite_find_data_references_in_stmt (nest, loop, stmt, &drs))
+return false;
 
   int j;
   data_reference_p dr;
   FOR_EACH_VEC_ELT (drs, j, dr)
 {
-  int nb_subscripts = DR_NUM_DIMENSIONS (dr);
-
-  if (nb_subscripts < 1)
-   {
- free_data_refs (drs);
+  for (unsigned i = 0; i < DR_NUM_DIMENSIONS (dr); ++i)
+   if (! graphite_can_represent_scev (DR_ACCESS_FN (dr, i)))
  return false;
-   }
-
-  tree ref = DR_REF (dr);
-
-  for (int i = nb_subscripts - 1; i >= 0; i--)
-   {
- if (!graphite_can_represent_scev (DR_ACCESS_FN (dr, i))
- || (TREE_CODE (ref) != ARRAY_REF && TREE_CODE (ref) != MEM_REF
- && TREE_CODE (ref) != COMPONENT_REF))
-   {
- free_data_refs (drs);
- return false;
-   }
-
- ref = TREE_OPERAND (ref, 0);
-   }
 }
 
-free_data_refs (drs);
-return true;
+  return true;
 }
 
 /* GIMPLE_ASM and GIMPLE_CALL may embed arbitrary side effects.
@@ -1875,7 +1858,8 @@ build_alias_set (scop_p scop)
{
  /* Dependences in the same alias set need to be handled
 by just looking at DR_ACCESS_FNs.  */
- if (DR_NUM_DIMENSIONS (dr1->dr) != DR_NUM_DIMENSIONS (dr2->dr)
+ if (DR_NUM_DIMENSIONS (dr1->dr) == 0
+ || DR_NUM_DIMENSIONS (dr1->dr) != DR_NUM_DIMENSIONS (dr2->dr)
  || ! operand_equal_p (DR_BASE_OBJECT (dr1->dr),
DR_BASE_OBJECT (dr2->dr),
OEP_ADDRESS_OF)

