[patch][x86_64]: AMD znver2 enablement

2018-10-30 Thread Kumar, Venkataramanan
Hi Maintainers,

Please find attached the patch that enables support for the next-generation
AMD Zen CPU via -march=znver2.
As of now, znver2 uses the same costs and scheduler descriptions written
for znver1.

We will update the scheduler descriptions and costing for znver2 later, as we
get more information.

Ok for trunk?

Regards,
Venkat.

ChangeLog gcc:
	* common/config/i386/i386-common.c (processor_alias_table): Add
	znver2 entry.
	* config.gcc (i[34567]86-*-linux* | ...): Add znver2.
	(case ${target}): Add znver2.
	* config/i386/driver-i386.c (host_detect_local_cpu): Let
	-march=native recognize znver2 processors.
	* config/i386/i386-c.c (ix86_target_macros_internal): Add znver2.
	* config/i386/i386.c (m_znver2): New definition.
	(m_ZNVER): New definition.
	(m_AMD_MULTIPLE): Includes m_znver2.
	(processor_cost_table): Add znver2 entry.
	(processor_target_table): Add znver2 entry.
	(get_builtin_code_for_version): Set priority for
	PROCESSOR_ZNVER2.
	(processor_model): Add M_AMDFAM17H_ZNVER2.
	(arch_names_table): Ditto.
	(ix86_reassociation_width): Include znver2.
	* config/i386/i386.h (TARGET_znver2): New definition.
	(struct ix86_size_cost): Add TARGET_ZNVER2.
	(enum processor_type): Add PROCESSOR_ZNVER2.
	* config/i386/i386.md (define_attr "cpu"): Add znver2.
	* config/i386/x86-tune-costs.h (processor_costs): Add znver2 costs.
	* config/i386/x86-tune-sched.c (ix86_issue_rate): Add znver2.
	(ix86_adjust_cost): Add znver2.
	* config/i386/x86-tune.def: Replace m_ZNVER1 by m_ZNVER.
	* gcc/doc/extend.texi: Add details about znver2.
	* gcc/doc/invoke.texi: Add details about znver2.

ChangeLog libgcc:
	* config/i386/cpuinfo.c (get_amd_cpu): Add znver2.
	(processor_subtypes): Ditto.

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index f12806e..ff13ea5 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1677,6 +1677,16 @@ const pta processor_alias_table[] =
   | PTA_RDRND | PTA_MOVBE | PTA_MWAITX | PTA_ADX | PTA_RDSEED
   | PTA_CLZERO | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES
   | PTA_SHA | PTA_LZCNT | PTA_POPCNT},
+  {"znver2", PROCESSOR_ZNVER2, CPU_ZNVER1,
+PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
+  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_AVX2
+  | PTA_BMI | PTA_BMI2 | PTA_F16C | PTA_FMA | PTA_PRFCHW
+  | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT | PTA_FSGSBASE
+  | PTA_RDRND | PTA_MOVBE | PTA_MWAITX | PTA_ADX | PTA_RDSEED
+  | PTA_CLZERO | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES
+  | PTA_SHA | PTA_LZCNT | PTA_POPCNT | PTA_CLWB | PTA_RDPID
+  | PTA_WBNOINVD},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
 PTA_64BIT | PTA_MMX |  PTA_SSE  | PTA_SSE2 | PTA_SSE3
   | PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_PRFCHW
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 93dc297..a47e6c3 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -664,11 +664,11 @@ pentium4 pentium4m pentiumpro prescott lakemont"
 # 64-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
 x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
-bdver3 bdver4 znver1 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
-core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
-sandybridge ivybridge haswell broadwell bonnell silvermont knl knm \
-skylake-avx512 cannonlake icelake-client icelake-server skylake goldmont \
-goldmont-plus tremont x86-64 native"
+bdver3 bdver4 znver1 znver2 btver1 btver2 k8 k8-sse3 opteron \
+opteron-sse3 nocona core2 corei7 corei7-avx core-avx-i core-avx2 atom \
+slm nehalem westmere sandybridge ivybridge haswell broadwell bonnell \
+silvermont knl knm skylake-avx512 cannonlake icelake-client icelake-server \
+skylake goldmont goldmont-plus tremont x86-64 native"
 
 # Additional x86 processors supported by --with-cpu=.  Each processor
 # MUST be separated by exactly one space.
@@ -3336,6 +3336,10 @@ case ${target} in
arch=znver1
cpu=znver1
;;
+  znver2-*)
+   arch=znver2
+   cpu=znver2
+   ;;
   bdver4-*)
 arch=bdver4
 cpu=bdver4
@@ -3453,6 +3457,10 @@ case ${target} in
arch=znver1
cpu=znver1
;;
+  znver2-*)
+   arch=znver2
+   cpu=znver2
+   ;;
   bdver4-*)
 arch=bdver4
 cpu=bdver4
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 8c830bd..95ba393 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -649,6 +649,8 @@ const char 

Re: [PATCH, d] Disable D on systems where it is known not to work.

2018-10-30 Thread Alan Modra
On Tue, Oct 30, 2018 at 09:07:30PM +0100, Iain Buclaw wrote:
> On Tue, 30 Oct 2018 at 20:50, Andreas Schwab wrote:
> >
> > On Oct 30 2018, Iain Buclaw wrote:
> >
> > > This turns off the D front end where there have been reported bootstrap
> > > problems that need further investigation.  Also added a configure.tgt
> > > for libphobos to allow enabling for targets where there's known good
> > > runtime support backed by existing continuous integration.
> >
> > Why do you need that?  The D frontend isn't built by default.
> >
> 
> As far as I have seen, all automated builders for gcc are configured
> with --enable-languages=all however, which is pulling the D frontend
> in.

You can add powerpc64*-*-* to the list of targets that fail bootstrap
with --enable-languages=all.

../libphobos/src/std/math.d:242:5: error: static assert  "Only 64-bit, 80-bit, and 128-bit reals are supported for LittleEndian CPUs"
  242 | static assert(real.mant_dig == 53 || real.mant_dig == 64
  | ^

-- 
Alan Modra
Australia Development Lab, IBM


Update GCC to autoconf 2.69, automake 1.15.1

2018-10-30 Thread Joseph Myers
This patch (diffs to generated files omitted below) updates GCC to use
autoconf 2.69 and automake 1.15.1.  (That's not the latest automake
version, but it's the one used by binutils-gdb, with which consistency
is desirable, and in any case seems a useful incremental update that
should make a future update to 1.16.1 easier.)

The changes are generally similar to the binutils-gdb ones, and are
copied from there where shared files and directories are involved
(there are some further changes to such shared directories, however,
which I'd expect to apply to binutils-gdb once this patch is in GCC).
Largely, obsolete AC_PREREQ calls are removed, while many
AC_LANG_SOURCE calls are added to avoid warnings from aclocal and
autoconf.  Multilib support is no longer included in core automake,
meaning that multilib.am needs copying from automake's contrib
directory into the GCC source tree.  Autoconf 2.69 has Go support, so
local copies of that support are removed.  I hope the D support will
soon be submitted to upstream autoconf so the local copy of that can
be removed in a future update.

Note that the regeneration did not include regeneration of
fixincludes/config.h.in (attempting such regeneration resulted in all
the USED_FOR_TARGET conditionals disappearing; and I don't see
anything in the fixincludes/ directory that would result in such
conditionals being generated, unlike in the gcc/ directory).  Also
note that libvtv/testsuite/other-tests/Makefile.in was not
regenerated; that directory is not listed as a subdirectory for which
Makefile.in gets regenerated by calling "automake" in libvtv/, so I'm
not sure how it's meant to be regenerated.

While I mostly fixed the warnings shown when running aclocal / automake /
autoconf, there were various such warnings from automake in the
libgfortran, libgo, libgomp, liboffloadmic, libsanitizer, libphobos
directories that I did not fix, preferring to leave those to the
relevant subsystem maintainers.  Specifically, most of those warnings
were of the following form (example from libgfortran):

Makefile.am:48: warning: source file 'caf/single.c' is in a subdirectory,
Makefile.am:48: but option 'subdir-objects' is disabled
automake: warning: possible forward-incompatibility.
automake: At least a source file is in a subdirectory, but the 'subdir-objects'
automake: automake option hasn't been enabled.  For now, the corresponding output
automake: object file(s) will be placed in the top-level directory.  However,
automake: this behaviour will change in future Automake versions: they will
automake: unconditionally cause object files to be placed in the same subdirectory
automake: of the corresponding sources.
automake: You are advised to start using 'subdir-objects' option throughout your
automake: project, to avoid future incompatibilities.

I think it's best for the relevant maintainers to add subdir-objects
and do any other associated Makefile.am changes needed.  In some cases
the paths in the warnings involved ../; I don't know if that adds any
extra complications to the use of subdir-objects.

I've tested this with native, cross and Canadian cross builds.  The
risk of any OS-specific issues should, I hope, be rather lower than if a
libtool upgrade were included (we *should* do such an upgrade at some
point, but it's more complicated - it involves identifying all our
local libtool changes to see if any aren't included in the upstream
version we update to, and reverting an upstream libtool patch that's
inappropriate for use in GCC); I think it would be better to get this
update into GCC so that people can test in different configurations
and we can fix any issues found, rather than to try to get more and
more testing done before it goes in.

top level:
2018-10-31  Joseph Myers  

* multilib.am: New file.  From automake.

Merge from binutils-gdb:
2018-06-19  Simon Marchi  

* libtool.m4: Use AC_LANG_SOURCE.
* configure.ac: Remove AC_PREREQ, use AC_LANG_SOURCE.
* ar-lib: New file.
* test-driver: New file.
* configure: Re-generate.

config:
2018-10-31  Joseph Myers  

* math.m4, tls.m4: Use AC_LANG_SOURCE.

Merge from binutils-gdb:
2018-06-19  Simon Marchi  

* override.m4 (_GCC_AUTOCONF_VERSION): Bump from 2.64 to 2.69.

fixincludes:
2018-10-31  Joseph Myers  

* configure.ac: Remove AC_PREREQ.
* aclocal.m4, configure: Regenerate.

gcc:
2018-10-31  Joseph Myers  

* configure.ac: Remove AC_PREREQ.  Use AC_LANG_SOURCE.  Use single
line for second argument of AC_DEFINE_UNQUOTED.
* aclocal.m4, config.in, configure: Regenerate.

gnattools:
2018-10-31  Joseph Myers  

* configure.ac: Remove AC_PREREQ.
* configure: Regenerate.

gotools:
2018-10-31  Joseph Myers  

* config/go.m4: Remove file.
* Makefile.am (ACLOCAL_AMFLAGS): Do not use -I ./config.
* configure.ac: Remove AC_PREREQ.  Do not include config/go.m4.
* 

Re: Building old GCC with new GCC: stage1 32-bit libstdc++ fails to build after building 64-bit libstdc++

2018-10-30 Thread Matthew Krupcale
Hello,

This is an update and a ping to my original patch message[1].

I've attached an updated set of patches:

1) gcc48-stage1-build-libstdc++.patch: Makefile.tpl patch portion
which allowed me to build GCC 4.8.3 using GCC 8.1
2) gcc82-stage1-build-libstdc++.patch: analogous patch for GCC 8.2
potentially allowing future GCC to build GCC 8.2 (untested).

These patches no longer contain the generated Makefile.in diff and
have a more general comment (i.e. not specific to the CXXABI_1.3.9 issue
cited), since this issue might be more general.

Best,
Matthew Krupcale

[1] https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00176.html
diff --git a/Makefile.tpl b/Makefile.tpl
index 3233a788d..7faf98db5 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -294,7 +294,7 @@ BASE_TARGET_EXPORTS = \
 	WINDRES="$(WINDRES_FOR_TARGET)"; export WINDRES; \
 	WINDMC="$(WINDMC_FOR_TARGET)"; export WINDMC; \
 @if gcc-bootstrap
-	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
+	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH_BUILD_LIBSTDC++)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
 @endif gcc-bootstrap
 	$(RPATH_ENVVAR)=`echo "$(HOST_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
 	TARGET_CONFIGDIRS="$(TARGET_CONFIGDIRS)"; export TARGET_CONFIGDIRS;
@@ -308,6 +308,17 @@ NORMAL_TARGET_EXPORTS = \
 	$(BASE_TARGET_EXPORTS) \
 	CXX="$(CXX_FOR_TARGET) $(XGCC_FLAGS_FOR_TARGET) $$TFLAGS"; export CXX;
 
+# Use the target libstdc++ only after stage1 since the build libstdc++ is
+# required by some stage1 host modules (e.g. cc1, cc1plus, lto1)
+POSTSTAGE1_RPATH_EXPORT = \
+@if target-libstdc++-v3-bootstrap
+	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH_libstdc++-v3)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR);
+@endif target-libstdc++-v3-bootstrap
+
+# Similar, for later GCC stages.
+POSTSTAGE1_TARGET_EXPORTS = \
+	$(POSTSTAGE1_RPATH_EXPORT)
+
 # Where to find GMP
 HOST_GMPLIBS = @gmplibs@
 HOST_GMPINC = @gmpinc@
@@ -531,6 +542,10 @@ all:
 TARGET_LIB_PATH = [+ FOR target_modules +][+
   IF lib_path +]$(TARGET_LIB_PATH_[+module+])[+ ENDIF lib_path +][+
   ENDFOR target_modules +]$(HOST_LIB_PATH_gcc)
+# Use the build rather than the target libstdc++
+TARGET_LIB_PATH_BUILD_LIBSTDC++ = [+ FOR target_modules +][+
+  IF lib_path +][+ IF (not (= (get "module") "libstdc++-v3")) +]$(TARGET_LIB_PATH_[+module+])[+ ENDIF +][+ ENDIF lib_path +][+
+  ENDFOR target_modules +]$(HOST_LIB_PATH_gcc)
 [+ FOR target_modules +][+ IF lib_path +]
 @if target-[+module+]
 TARGET_LIB_PATH_[+module+] = $$r/$(TARGET_SUBDIR)/[+module+]/[+lib_path+]:
@@ -1275,6 +1290,7 @@ maybe-[+make_target+]-[+module+]: [+make_target+]-[+module+]
 [+ configure prefix="target-" subdir="$(TARGET_SUBDIR)"
 	 check_multilibs=true
 	 exports="$(RAW_CXX_TARGET_EXPORTS)"
+	 poststage1_exports="$(POSTSTAGE1_TARGET_EXPORTS)"
 	 host_alias=(get "host" "${target_alias}")
 	 target_alias=(get "target" "${target_alias}")
 	 args="$(TARGET_CONFIGARGS)" no-config-site=true +]
@@ -1286,6 +1302,7 @@ maybe-[+make_target+]-[+module+]: [+make_target+]-[+module+]
 [+ configure prefix="target-" subdir="$(TARGET_SUBDIR)"
 	 check_multilibs=true
 	 exports="$(NORMAL_TARGET_EXPORTS)"
+	 poststage1_exports="$(POSTSTAGE1_TARGET_EXPORTS)"
 	 host_alias=(get "host" "${target_alias}")
 	 target_alias=(get "target" "${target_alias}")
 	 args="$(TARGET_CONFIGARGS)" no-config-site=true +]
diff --git a/Makefile.tpl b/Makefile.tpl
index 1f23b79b4..8f048efc1 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -294,7 +294,7 @@ BASE_TARGET_EXPORTS = \
 	WINDRES="$(WINDRES_FOR_TARGET)"; export WINDRES; \
 	WINDMC="$(WINDMC_FOR_TARGET)"; export WINDMC; \
 @if gcc-bootstrap
-	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
+	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH_BUILD_LIBSTDC++)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
 @endif gcc-bootstrap
 	$(RPATH_ENVVAR)=`echo "$(HOST_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
 	TARGET_CONFIGDIRS="$(TARGET_CONFIGDIRS)"; export TARGET_CONFIGDIRS;
@@ -308,6 +308,17 @@ NORMAL_TARGET_EXPORTS = \
 	$(BASE_TARGET_EXPORTS) \
 	CXX="$(CXX_FOR_TARGET) $(XGCC_FLAGS_FOR_TARGET) $$TFLAGS"; export CXX;
 
+# Use the target libstdc++ only after stage1 since the build libstdc++ is
+# required by some stage1 host modules (e.g. cc1, cc1plus, lto1)
+POSTSTAGE1_RPATH_EXPORT = \
+@if target-libstdc++-v3-bootstrap
+	$(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH_libstdc++-v3)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR);
+@endif target-libstdc++-v3-bootstrap
+
+# Similar, for later GCC stages.
+POSTSTAGE1_TARGET_EXPORTS = \
+	$(POSTSTAGE1_RPATH_EXPORT)
+
 # Where to find GMP
 HOST_GMPLIBS = @gmplibs@
 

Re: [PATCH] Make __PRETTY_FUNCTION__-like functions mergeable string csts (PR c++/64266).

2018-10-30 Thread Jason Merrill

On 10/29/18 9:37 AM, Jason Merrill wrote:

On Fri, Oct 26, 2018 at 3:14 AM Martin Liška  wrote:

On 10/24/18 7:24 PM, Jason Merrill wrote:

On Tue, Oct 23, 2018 at 4:59 AM Martin Liška  wrote:

However, I still see some minor ICEs; it's probably related to decay_conversion
in cp_fname_init:

1) ./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-__func__2.C

/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-__func__2.C:6:17: internal compiler error: Segmentation fault
6 | [] { return __func__; }();
  | ^~~~
0x1344568 crash_signal
 /home/marxin/Programming/gcc/gcc/toplev.c:325
0x76bc310f ???
 /usr/src/debug/glibc-2.27-6.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x9db134 is_capture_proxy(tree_node*)


Hi.



The problem in both tests is that is_capture_proxy thinks your
__func__ VAR_DECL with DECL_VALUE_EXPR is a capture proxy, since it is
neither an anonymous union proxy nor a structured binding.


I see; however, I'm a rookie in the area of the C++ FE. Would this problem
with lambdas be solvable?



The standard says,

The function-local predefined variable __func__ is defined as if a
definition of the form
static const char __func__[] = "function-name ";
had been provided, where function-name is an implementation-defined
string. It is unspecified whether such a variable has an address
distinct from that of any other object in the program.

So changing the type of __func__ (from array to pointer) still breaks
conformance.  And we need to keep the type checks from pretty4.C, even
though the checks for strings being distinct need to go.
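
The array-versus-pointer distinction in the quoted wording can be checked directly. The sketch below is illustrative only and is not taken from pretty4.C; `func_name_is_char_array` is a hypothetical name. The reference is stripped before inspecting the type because some compilers report `decltype(__func__)` as a reference to the array.

```cpp
#include <cstring>
#include <type_traits>

// Returns true when __func__ behaves like an array of const char whose
// size is the length of the name plus the terminating NUL.
inline bool func_name_is_char_array() {
  // Strip a possible reference so the check is portable across compilers.
  typedef std::remove_reference<decltype(__func__)>::type func_type;
  static_assert(std::is_array<func_type>::value,
                "__func__ must have array type, not pointer type");
  static_assert(std::is_same<std::remove_extent<func_type>::type,
                             const char>::value,
                "the element type must be const char");
  // sizeof sees the whole array, which a pointer type would not allow.
  return sizeof(__func__) == std::strlen(__func__) + 1;
}
```

If the type were changed to a pointer, the first static_assert would fail at compile time, which is the kind of conformance break described above.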


I added the following patch, which puts the type back to const char[] (instead
of char *), and I made the variable static. Now I see the pretty4.C testcase
passing again. To be honest, I'm not convinced about the FE changes, so help
would be appreciated.


OK, I'll poke at it.


This further patch teaches is_capture_proxy about function-name 
variables, and changes the handling of __PRETTY_FUNCTION__ in template 
instantiations to create new variables rather than instantiations.
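
As a rough illustration of why template instantiations need their own function-name variables, the sketch below shows that each instantiation yields a different string. `__PRETTY_FUNCTION__` is a GNU extension, and `pretty_name` is a hypothetical function used only for this example.

```cpp
#include <string>

// Each instantiation gets its own __PRETTY_FUNCTION__ text, because the
// string spells out the template arguments (GNU extension).
template <typename T>
std::string pretty_name() {
  return __PRETTY_FUNCTION__;
}
```

Comparing `pretty_name<int>()` with `pretty_name<long>()` gives two distinct strings, so the variable cannot simply be shared between instantiations.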


The C++ changes are OK with this.

commit 100f6f38f9d8bbb5d77ba7a4ccb86a4c85aa2cd9
Author: Jason Merrill 
Date:   Mon Oct 29 22:35:59 2018 -0400

fname-p

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 8454cb4e178..f1a10297e79 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -3122,6 +3122,14 @@ struct GTY(()) lang_decl {
   (DECL_NAME (NODE) \
&& id_equal (DECL_NAME (NODE), "__PRETTY_FUNCTION__"))
 
+/* For a DECL, true if it is __func__ or similar.  */
+#define DECL_FNAME_P(NODE)	\
+  (VAR_P (NODE) && DECL_NAME (NODE) && DECL_ARTIFICIAL (NODE)	\
+   && DECL_HAS_VALUE_EXPR_P (NODE)\
+   && (id_equal (DECL_NAME (NODE), "__PRETTY_FUNCTION__")	\
+   || id_equal (DECL_NAME (NODE), "__FUNCTION__")		\
+   || id_equal (DECL_NAME (NODE), "__func__")))
+
 /* Nonzero if the variable was declared to be thread-local.
We need a special C++ version of this test because the middle-end
DECL_THREAD_LOCAL_P uses the symtab, so we can't use it for
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index cce9cc99ff1..bde8023ce3b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -4465,7 +4465,7 @@ cp_fname_init (const char* name, tree *type_p)
 static tree
 cp_make_fname_decl (location_t loc, tree id, int type_dep)
 {
-  const char *const name = (type_dep && processing_template_decl
+  const char *const name = (type_dep && in_template_function ()
 			? NULL : fname_as_string (type_dep));
   tree type;
   tree init = cp_fname_init (name, &type);
@@ -7064,8 +7064,9 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 	init = NULL_TREE;
 	  release_tree_vector (cleanups);
 	}
-  else if (!DECL_PRETTY_FUNCTION_P (decl))
+  else
 	{
+	  gcc_assert (!DECL_PRETTY_FUNCTION_P (decl));
 	  /* Deduce array size even if the initializer is dependent.  */
 	  maybe_deduce_size_from_array_init (decl, init);
 	  /* And complain about multiple initializers.  */
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 297327f1ab6..318671bbcd0 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -262,6 +262,7 @@ is_capture_proxy (tree decl)
 	  && DECL_HAS_VALUE_EXPR_P (decl)
 	  && !DECL_ANON_UNION_VAR_P (decl)
 	  && !DECL_DECOMPOSITION_P (decl)
+	  && !DECL_FNAME_P (decl)
 	  && LAMBDA_FUNCTION_P (DECL_CONTEXT (decl)));
 }
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6975027076e..510264d38a0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16702,6 +16702,10 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
 	register_local_specialization (inst, decl);
 	break;
 	  }
+	else if (DECL_PRETTY_FUNCTION_P (decl))
+	  decl = make_fname_decl (DECL_SOURCE_LOCATION (decl),
+  DECL_NAME (decl),
+  true/*DECL_PRETTY_FUNCTION_P (decl)*/);
 	else if (DECL_IMPLICIT_TYPEDEF_P (decl)
 		 && LAMBDA_TYPE_P (TREE_TYPE (decl)))
 	  /* Don't copy the 

Re: [C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-30 Thread Paolo Carlini

Hi,

On 30/10/18 21:37, Jason Merrill wrote:

On 10/26/18 2:02 PM, Paolo Carlini wrote:

On 26/10/18 17:18, Jason Merrill wrote:
On Fri, Oct 26, 2018 at 4:52 AM Paolo Carlini wrote:

On 24/10/18 22:41, Jason Merrill wrote:

On 10/15/18 12:45 PM, Paolo Carlini wrote:

 && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
+   && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
  && MAYBE_CLASS_TYPE_P (declspecs->type))
I would think that the MAYBE_CLASS_TYPE_P here should be CLASS_TYPE_P,
and then we can remove the TYPENAME_TYPE check.  Or do we want to
allow template type parameters for some reason?

Indeed, it would be nice to just use OVERLOAD_TYPE_P. However it seems
we at least want to let through TEMPLATE_TYPE_PARMs representing 'auto'
- otherwise Dodji's check a few lines below which fixed c++/51473
doesn't work anymore - and also BOUND_TEMPLATE_TEMPLATE_PARM, otherwise
we regress on template/spec32.C and template/ttp22.C because we don't
diagnose the shadowing anymore. Thus, I would say either we keep on
using MAYBE_CLASS_TYPE_P or we pick what we need, possibly we add a comment?

Aha.  I guess the answer is not to restrict that test any more, but
instead to fix the code further down so it gives a proper diagnostic
rather than call warn_misplaced_attr_for_class_type.


I see. Thus something like the below? It passes testing on x86_64-linux.



+  if ((!declared_type || TREE_CODE (declared_type) == DECLTYPE_TYPE)
+  && ! saw_friend && !error_p)
 permerror (input_location, "declaration does not declare anything");


I see no reason to make this specific to decltype.  Maybe move this 
diagnostic into the final 'else' block with the other declspec 
diagnostics and not look at declared_type at all?


I'm not sure I fully understand: if we do that we still want to at
least minimally check that declared_type is null, like we already do,
and then we simply accept the new testcase. Is that OK? Because, as I
probably mentioned at some point, all the other compilers I have at hand
issue a "does not declare anything" diagnostic, and we likewise do that
for the legacy __typeof. Not looking into declared_type *at all* doesn't
work with plain class types and enums, of course. Or did you mean something
entirely different?



+  if (declspecs->attributes && warn_attributes && declared_type
+  && TREE_CODE (declared_type) != DECLTYPE_TYPE)


I think we do want to give a diagnostic about useless attributes, not 
skip it.


Agreed. FWIW the attached tests fine.

Thanks, Paolo.

///

Index: decl.c
===
--- decl.c  (revision 265636)
+++ decl.c  (working copy)
@@ -4798,9 +4798,7 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
 declared_type = declspecs->type;
   else if (declspecs->type == error_mark_node)
 error_p = true;
-  if (declared_type == NULL_TREE && ! saw_friend && !error_p)
-permerror (input_location, "declaration does not declare anything");
-  else if (declared_type != NULL_TREE && type_uses_auto (declared_type))
+  if (declared_type && type_uses_auto (declared_type))
 {
   error_at (declspecs->locations[ds_type_spec],
"%<auto%> can only be specified for variables "
@@ -4842,7 +4840,9 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
 
   else
 {
-  if (decl_spec_seq_has_spec_p (declspecs, ds_inline))
+  if (!declared_type && ! saw_friend && !error_p)
+   permerror (input_location, "declaration does not declare anything");
+  else if (decl_spec_seq_has_spec_p (declspecs, ds_inline))
error_at (declspecs->locations[ds_inline],
  "%<inline%> can only be specified for functions");
   else if (decl_spec_seq_has_spec_p (declspecs, ds_virtual))
@@ -4909,7 +4909,7 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
"no attribute can be applied to "
"an explicit instantiation");
}
-  else
+  else if (TREE_CODE (declared_type) != DECLTYPE_TYPE)
warn_misplaced_attr_for_class_type (loc, declared_type);
 }
 


Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Stafford Horne
On Tue, Oct 30, 2018 at 03:57:03PM +, Richard Henderson wrote:
> On 10/30/18 12:18 PM, Stafford Horne wrote:
> > OK, I was just being lazy allowing the spill.  Do you think the split/expand
> > would be an RTL using left shift / right shift?  Can you think of something
> > more clever?  Since "real" hardware does not usually support shifts with an
> > immediate we will need 1 instruction to load shift amount. i.e.
> > 
> >   l.ori %0, r0, 24
> >   l.sll %1, %1, %0
> >   l.sra %0, %1, %0
> 
> This clobbers %1.

Right, it was just a rough idea to create another r/r pattern.
 
> So, ouch.  I think we will want to avoid creating this particular pattern in
> the first place unless l.exts exists then.  We would use another pattern like
> 
> (define_insn "*sign_extend_mem"
>   [(set (match_operand:SI 0 "register_operand" "=r")
>   (sign_extend:SI
> (match_operand:HI 1 "memory_operand" "m")))]
>   ""
>   "l.lhs\t%0, %1")
> 
> following the TARGET_SEXT pattern.  In this way combine can use this pattern
> without getting us into trouble with the register allocator later.

OK, that's simple enough then.  I had thought you were asking for creating
another r/r pattern using define_split.  That might be better than requiring a
memory load/store to do sign extension, but I guess we can optimize that later
if needed. (That was my original thought.)
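
For reference, the shift-based sign extension from the quoted l.sll / l.sra sequence can be sketched as below. This only illustrates the arithmetic and is not or1k back-end code; `sign_extend_byte` is a hypothetical name, and the shift amount of 24 is taken from the quoted instructions, so it extends the low 8 bits of a 32-bit value.

```cpp
#include <cstdint>

// Sign-extend the low 8 bits of x: shift them to the top of the word, then
// shift back with an arithmetic right shift so the sign bit is replicated.
// ASSUMES two's-complement conversion and arithmetic right shift of negative
// values, which GCC provides and C++20 guarantees.
inline std::int32_t sign_extend_byte(std::uint32_t x) {
  const int shift = 24;  // 32 - 8, the constant loaded by l.ori above
  return static_cast<std::int32_t>(x << shift) >> shift;
}
```

Unlike the three-instruction sequence quoted above, the source operand is not modified here, which is the clobber problem a register-to-register pattern would have to solve with a scratch register.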

-Stafford


Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Stafford Horne
On Tue, Oct 30, 2018 at 10:49:53AM -0500, Segher Boessenkool wrote:
> On Tue, Oct 30, 2018 at 09:49:18PM +0900, Stafford Horne wrote:
> > Hello,
> > 
> > On Sun, Oct 28, 2018 at 05:54:47PM -0500, Segher Boessenkool wrote:
> > > Yes, like that.  It also easily can handle the other combos (those with
> > > STACK_POINTER), and it is easier if you have to switch FRAME_GROWS_DOWNWARD
> > > ("false" is better on some args, but "true" is required for ssp).
> > > 
> > > Your code is fine as-is of course.
> > 
> > Just to be clear, when you say 'as-is' did you mean the original v3 patch?  Or
> > are you referring to followup patch I posted with the some_offset (from) -
> > some_offset (to) logic.
> 
> Either.  Both.  I meant the orig big patch, v3 if that's what it was.

Alright, thanks, I just didn't want to misunderstand.

-Stafford



Re: [PATCH] use MAX_OFILE_ALIGNMENT to validate attribute aligned (PR 87795)

2018-10-30 Thread Joseph Myers
On Tue, 30 Oct 2018, Martin Sebor wrote:

> So it seems that the attribute handler should be using this macro
> instead.  I also took the liberty to add more detail to the error

Note that it should only be used for alignments relevant to the object 
file - *not* for alignments of variables with automatic storage duration 
(and thus not for alignments of types / struct fields, because such types 
might only be used on the stack) since GCC supports arbitrary alignments 
on the stack via dynamically realigning it.

So you need testcases that verify that large alignments are still allowed 
for types / fields / on the stack, even when the object file only supports 
smaller alignments.
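
A test along those lines might look roughly like the sketch below. It is not a dejagnu testcase from the patch: the 1024-byte figure is an arbitrary illustration, and `Wide` and `stack_object_is_aligned` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// A type whose alignment may exceed what the object file format supports;
// it is only ever instantiated with automatic storage duration here.
struct alignas(1024) Wide {
  unsigned char byte;
};
static_assert(alignof(Wide) == 1024, "type alignment must be accepted");

// Returns true if over-aligned automatic variables really land on the
// requested boundary, which relies on dynamic stack realignment.
inline bool stack_object_is_aligned() {
  constexpr std::size_t kAlign = 1024;
  alignas(kAlign) unsigned char buffer[16] = {0};
  Wide wide = {0};
  return reinterpret_cast<std::uintptr_t>(buffer) % kAlign == 0
         && reinterpret_cast<std::uintptr_t>(&wide) % kAlign == 0;
}
```

A variable with static storage duration and the same alignment is the case that would need to be checked against MAX_OFILE_ALIGNMENT instead.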

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: std::vector<bool> fix & enhancements

2018-10-30 Thread Jonathan Wakely

On 30/10/18 07:28 +0100, François Dumont wrote:
Following Marc Glisse's change to ignore the _M_start offset, I wanted to go
a little step further and just remove it in _GLIBCXX_INLINE_VERSION
mode.

I also fix a regression we already fixed on mainstream std::vector
regarding the noexcept qualification of the move constructor with allocator.

And I implemented the same optimizations as in std::vector for
allocators always comparing equal and for the std::swap operation.

I also avoid re-implementing in vector::operator[] the same code
already implemented in iterator::operator[], but this one should
perhaps go in a different commit.



	* include/bits/stl_bvector.h
	[_GLIBCXX_INLINE_VERSION](_Bvector_impl_data::_M_start): Define as
	_Bit_type*.
	(_Bvector_impl_data(const _Bvector_impl_data&)): New.
	(_Bvector_impl_data(_Bvector_impl_data&&)): Delegate to latter.
	(_Bvector_impl_data::operator=(const _Bvector_impl_data&)): New.
	(_Bvector_impl_data::_M_move_data(_Bvector_impl_data&&)): Use latter.
	(_Bvector_impl_data::_M_reset()): Likewise.
	(_Bvector_impl_data::_M_begin()): New.
	(_Bvector_impl_data::_M_cbegin()): New.
	(_Bvector_impl_data::_M_start_p()): New.
	(_Bvector_impl_data::_M_set_start(_Bit_type*)): New.
	(_Bvector_impl_data::_M_swap_data): New.
	(_Bvector_impl::_Bvector_impl(_Bvector_impl&&)): Implement explicitly.
	(_Bvector_impl::_Bvector_impl(_Bit_alloc_type&&, _Bvector_impl&&)): New.
	(_Bvector_base::_Bvector_base(_Bvector_base&&, const allocator_type&)):
	New.
	(_Bvector_base::_M_deallocate()): Adapt.
	(vector::vector(const vector&, const allocator_type&)): Adapt.
	(vector::vector(vector&&, const allocator_type&, true_type)): New.
	(vector::vector(vector&&, const allocator_type&, false_type)): New.
	(vector::vector(vector&&, const allocator_type&)): Use the latter two.
	(vector::vector(const vector&, const allocator_type&)): Adapt.
	(vector::begin()): Adapt.
	(vector::cbegin()): Adapt.
	(vector::operator[](size_type)): Use iterator operator[].
	(vector::swap(vector&)): Adapt.
	(vector::flip()): Adapt.
	(vector::_M_initialize(size_type)): Adapt.
	(vector::_M_initialize_value(bool)): Adapt.
	* include/bits/vector.tcc:
	(vector::_M_reallocate(size_type)): Adapt.
	(vector::_M_fill_insert(iterator, size_type, bool)): Adapt.
	(vector::_M_insert_range<_FwdIter>(iterator, _FwdIter, _FwdIter,
	std::forward_iterator_tag)): Adapt.
	(vector::_M_insert_aux(iterator, bool)): Adapt.
	(std::hash<vector<bool>>::operator()): Adapt.
	* testsuite/23_containers/vector/bool/cons/noexcept_move_construct.cc:
	Add check.

Tested under Linux x86_64.

Ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 8fbef7a1a3a..81b4a75236d 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -437,7 +437,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER

  struct _Bvector_impl_data
  {
+#if !_GLIBCXX_INLINE_VERSION
_Bit_iterator   _M_start;
+#else
+   _Bit_type*  _M_start;
+#endif


An alternative that would require fewer changes elsewhere in the file
would be:

#if !_GLIBCXX_INLINE_VERSION
_Bit_iterator   _M_start;
#else
   // We don't need the offset field for the start pointer,
   // it's always zero.
   struct {
 _Bit_type* _M_p;
 // Allow assignment from iterators (assume offset is zero):
 void operator=(_Bit_iterator __it) { _M_p = __it._M_p; }
   } _M_start;
#endif

Now the rest of the file doesn't need any checks for
_GLIBCXX_INLINE_VERSION (which I'd prefer to avoid, because it
clutters the code up with extra conditionals for a mode that
**nobody** uses in practice).
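
To see the effect of that wrapper in isolation, here is a reduced sketch. The types are hypothetical stand-ins, not the libstdc++ definitions: `BitIterator` plays the role of _Bit_iterator and `ImplData` the role of _Bvector_impl_data.

```cpp
// Hypothetical stand-ins for the libstdc++ internals, reduced to the members
// this idea needs.
typedef unsigned long BitWord;

struct BitIterator {
  BitWord *p;
  unsigned offset;
};

struct ImplData {
  // The start position is a bare pointer because its offset is always zero.
  struct Start {
    BitWord *p;
    // Accept iterators so existing "start = iterator" code still compiles.
    void operator=(BitIterator it) { p = it.p; }
  } start;
  BitIterator finish;
};

// Checks that assigning an iterator keeps only the pointer part and that the
// wrapper is no larger than a plain pointer.
inline bool assignment_keeps_pointer() {
  BitWord storage[2] = {0, 0};
  BitIterator first = {storage, 0};
  ImplData data;
  data.start = first;  // calls Start::operator=(BitIterator)
  data.finish = first;
  return data.start.p == storage
         && sizeof(data.start) == sizeof(BitWord *);
}
```

Code that reads the start position would still need to go through the pointer member, but code that assigns to it stays unchanged, which is where the saving in conditionals comes from.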



_Bit_iterator   _M_finish;
_Bit_pointer    _M_end_of_storage;

@@ -447,32 +451,74 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER

#if __cplusplus >= 201103L
_Bvector_impl_data(_Bvector_impl_data&& __x) noexcept
-   : _M_start(__x._M_start), _M_finish(__x._M_finish)
-   , _M_end_of_storage(__x._M_end_of_storage)
+   : _Bvector_impl_data(__x)
{ __x._M_reset(); }

+   _Bvector_impl_data(const _Bvector_impl_data&) = default;
+   _Bvector_impl_data&
+   operator=(const _Bvector_impl_data&) = default;
void
_M_move_data(_Bvector_impl_data&& __x) noexcept
{
- this->_M_start = __x._M_start;
- this->_M_finish = __x._M_finish;
- this->_M_end_of_storage = __x._M_end_of_storage;
+ *this = __x;
  __x._M_reset();
}
+#else
+   _Bvector_impl_data(const _Bvector_impl_data& __x)
+ : _M_start(__x._M_start), _M_finish(__x._M_finish)
+ , _M_end_of_storage(__x._M_end_of_storage)
+   { }


Do we need this definition? Won't the compiler generate the same thing
for us anyway?
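
As a small illustration of the question, here is a sketch with a hypothetical three-pointer struct (Data and copies_match are made-up names, not libstdc++ code). When no copy operations and no move operations are declared, the compiler generates memberwise copy construction and copy assignment, in C++98 as well as later modes. The C++11 branch of the patch is different because it declares a move constructor, and a user-declared move constructor makes the implicit copy operations deleted unless they are explicitly defaulted.

```cpp
#include <cassert>

// Hypothetical three-member struct standing in for _Bvector_impl_data.
struct Data {
  int* start;
  int* finish;
  int* end_of_storage;
  Data() : start(0), finish(0), end_of_storage(0) { }
  // No copy constructor or copy assignment is declared here, so the
  // compiler generates memberwise versions.
};

// Returns true when both implicit copy operations copied every member.
inline bool copies_match(const Data& a) {
  Data b(a);      // implicit copy constructor
  Data c;
  c = a;          // implicit copy assignment
  return b.start == a.start && b.finish == a.finish
      && b.end_of_storage == a.end_of_storage
      && c.start == a.start && c.finish == a.finish
      && c.end_of_storage == a.end_of_storage;
}
```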


+   _Bvector_impl_data&
+   operator=(const _Bvector_impl_data& __x)
+   {

Re: std::vector fix & enhancements

2018-10-30 Thread François Dumont
Running the tests in C++98 mode shows that I had forgotten a 'return *this;'
in _Bvector_impl_data::operator=.

So here is the patch again.

On 10/30/18 7:28 AM, François Dumont wrote:
Following Marc Glisse's change to ignore the _M_start offset, I wanted to go
a little step further and just remove it in _GLIBCXX_INLINE_VERSION mode.

I also fix a regression we already fixed on mainstream std::vector
regarding the noexcept qualification of the move constructor with allocator.

And I implemented the same optimizations as in std::vector for
allocators that always compare equal and for the std::swap operation.

I also avoid re-implementing in vector::operator[] the same code
already implemented in iterator::operator[], but this one should
perhaps go in a different commit.
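
The optimization for allocators that always compare equal is typically done by tag dispatch, which is what the true_type and false_type constructor overloads in the ChangeLog suggest. A minimal sketch of the idea, assuming C++17 for std::allocator_traits::is_always_equal; ArenaAlloc and can_steal_storage are hypothetical names for illustration and are not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical stateful allocator: two instances are interchangeable only
// when they refer to the same arena id.
template<typename T>
struct ArenaAlloc {
  typedef T value_type;
  int arena;
  explicit ArenaAlloc(int id) : arena(id) { }
  template<typename U>
  ArenaAlloc(const ArenaAlloc<U>& other) : arena(other.arena) { }
  T* allocate(std::size_t n)
  { return static_cast<T*>(::operator new(n * sizeof(T))); }
  void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template<typename T, typename U>
bool operator==(const ArenaAlloc<T>& a, const ArenaAlloc<U>& b)
{ return a.arena == b.arena; }

template<typename T, typename U>
bool operator!=(const ArenaAlloc<T>& a, const ArenaAlloc<U>& b)
{ return !(a == b); }

// Allocators that always compare equal: storage can be taken over directly.
template<typename Alloc>
bool can_steal_storage(const Alloc&, const Alloc&, std::true_type)
{ return true; }

// Otherwise the decision needs a run-time comparison.
template<typename Alloc>
bool can_steal_storage(const Alloc& a, const Alloc& b, std::false_type)
{ return a == b; }

// Entry point: choose the overload at compile time.
template<typename Alloc>
bool can_steal_storage(const Alloc& a, const Alloc& b)
{
  typedef typename std::allocator_traits<Alloc>::is_always_equal tag;
  return can_steal_storage(a, b, tag());
}
```

With std::allocator the first overload is selected and no comparison is made at run time; with the stateful allocator the answer depends on the two instances.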



    * include/bits/stl_bvector.h
    [_GLIBCXX_INLINE_VERSION](_Bvector_impl_data::_M_start): Define as
    _Bit_type*.
    (_Bvector_impl_data(const _Bvector_impl_data&)): New.
    (_Bvector_impl_data(_Bvector_impl_data&&)): Delegate to latter.
    (_Bvector_impl_data::operator=(const _Bvector_impl_data&)): New.
    (_Bvector_impl_data::_M_move_data(_Bvector_impl_data&&)): Use latter.
    (_Bvector_impl_data::_M_reset()): Likewise.
    (_Bvector_impl_data::_M_begin()): New.
    (_Bvector_impl_data::_M_cbegin()): New.
    (_Bvector_impl_data::_M_start_p()): New.
    (_Bvector_impl_data::_M_set_start(_Bit_type*)): New.
    (_Bvector_impl_data::_M_swap_data): New.
    (_Bvector_impl::_Bvector_impl(_Bvector_impl&&)): Implement
    explicitly.
    (_Bvector_impl::_Bvector_impl(_Bit_alloc_type&&, _Bvector_impl&&)):
    New.
    (_Bvector_base::_Bvector_base(_Bvector_base&&, const allocator_type&)):
    New.
    (_Bvector_base::_M_deallocate()): Adapt.
    (vector::vector(const vector&, const allocator_type&)): Adapt.
    (vector::vector(vector&&, const allocator_type&, true_type)): New.
    (vector::vector(vector&&, const allocator_type&, false_type)): New.
    (vector::vector(vector&&, const allocator_type&)): Use latters.
    (vector::vector(const vector&, const allocator_type&)): Adapt.
    (vector::begin()): Adapt.
    (vector::cbegin()): Adapt.
    (vector::operator[](size_type)): Use iterator operator[].
    (vector::swap(vector&)): Adapt.
    (vector::flip()): Adapt.
    (vector::_M_initialize(size_type)): Adapt.
    (vector::_M_initialize_value(bool)): Adapt.
    * include/bits/vector.tcc:
    (vector::_M_reallocate(size_type)): Adapt.
    (vector::_M_fill_insert(iterator, size_type, bool)): Adapt.
    (vector::_M_insert_range<_FwdIter>(iterator, _FwdIter, _FwdIter,
    std::forward_iterator_tag)): Adapt.
    (vector::_M_insert_aux(iterator, bool)): Adapt.
    (std::hash<vector<bool, _Alloc>>::operator()): Adapt.
    * testsuite/23_containers/vector/bool/cons/noexcept_move_construct.cc:
    Add check.

Tested under Linux x86_64.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 0bcfd19fd3e..dd0f00fe07f 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -437,7 +437,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   struct _Bvector_impl_data
   {
+#if !_GLIBCXX_INLINE_VERSION
 	_Bit_iterator	_M_start;
+#else
+	_Bit_type*	_M_start;
+#endif
 	_Bit_iterator	_M_finish;
 	_Bit_pointer	_M_end_of_storage;
 
@@ -447,32 +451,75 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
 #if __cplusplus >= 201103L
 	_Bvector_impl_data(_Bvector_impl_data&& __x) noexcept
-	: _M_start(__x._M_start), _M_finish(__x._M_finish)
-	, _M_end_of_storage(__x._M_end_of_storage)
+	: _Bvector_impl_data(__x)
 	{ __x._M_reset(); }
 
+	_Bvector_impl_data(const _Bvector_impl_data&) = default;
+	_Bvector_impl_data&
+	operator=(const _Bvector_impl_data&) = default;
+
 	void
 	_M_move_data(_Bvector_impl_data&& __x) noexcept
 	{
-	  this->_M_start = __x._M_start;
-	  this->_M_finish = __x._M_finish;
-	  this->_M_end_of_storage = __x._M_end_of_storage;
+	  *this = __x;
 	  __x._M_reset();
 	}
+#else
+	_Bvector_impl_data(const _Bvector_impl_data& __x)
+	: _M_start(__x._M_start), _M_finish(__x._M_finish)
+	, _M_end_of_storage(__x._M_end_of_storage)
+	{ }
+
+	_Bvector_impl_data&
+	operator=(const _Bvector_impl_data& __x)
+	{
+	  _M_start = __x._M_start;
+	  _M_finish = __x._M_finish;
+	  _M_end_of_storage = __x._M_end_of_storage;
+	  return *this;
+	}
+#endif
+
+	_Bit_iterator
+	_M_begin() const _GLIBCXX_NOEXCEPT
+	{ return _Bit_iterator(_M_start_p(), 0); }
+
+	_Bit_const_iterator
+	_M_cbegin() const _GLIBCXX_NOEXCEPT
+	{ return _Bit_const_iterator(_M_start_p(), 0); }
+
+	_Bit_type*
+	_M_start_p() const _GLIBCXX_NOEXCEPT
+#if !_GLIBCXX_INLINE_VERSION
+	{ return _M_start._M_p; }
+#else
+	{ return _M_start; }
+#endif
+
+	void
+	_M_set_start(_Bit_type* __p) _GLIBCXX_NOEXCEPT
+#if !_GLIBCXX_INLINE_VERSION
+	{ _M_start._M_p = __p; }
+#else
+	{ _M_start = __p; }
 #endif
 
 	void
 	_M_reset() _GLIBCXX_NOEXCEPT
+	{ *this = _Bvector_impl_data(); }
+
+	void
+	_M_swap_data(_Bvector_impl_data& __x) 

Re: [Patch, fortran] PR40196 - [F03] [F08] Type parameter inquiry (str%len, a%kind) and Complex parts (z%re, z%im)

2018-10-30 Thread Paul Richard Thomas
Hi Thomas,

I tried failing cases of that kind; or assignment to len/kind part refs and
returned correct errors. Must check where I was going wrong.

Paul from a chilly Garching-bei-Muenchen


On Sun, 28 Oct 2018, 13:38 Thomas Koenig wrote:
> Hi Paul,
>
>
> >> inq would be easier to understand and unambiguous imho.
> >
> > Why? inquiry_type seems fine to me.
>
> I think Bernhard means the name of the member, i.
>
> I think it makes sense to leave as it is - gfc_ref is a
> struct that occurs a lot in complicated expressions, and the other
> members are one and two letters, too.
>
> > snip
> >> Is the switch really worth it? I'd have used a plain chain of strcmp,
> >> fwiw.
> >
> > I have done it. However, I might revert in order to combine the switch
> > block where I set the typespec for the primary expression.
>
> Whatever suits you best.
>
> > I haven't added testcases for errors. Does anybody think that this is
> > necessary?
>
> Might not be a bad idea to run through at least each new error message
> again.
>
> There is one illegal test case which ICEs:
>
> $ cat b.f90
> program main
>character(len=:), allocatable :: a
>allocate(a,source="abc")
>a%len = 2
>print *,a
> end
> $ gfortran b.f90
> gimplification failed:
> (integer(kind=4)) .a   type   size 
>  unit-size 
>  align:32 warn_if_not_align:0 symtab:0 alias-set -1
> canonical-type 0x7f138acd15e8 precision:32 min  0x7f138acbcd68 -2147483648> max 
>  pointer_to_this >
>
>  arg:0   type   size 
>  unit-size 
>  align:64 warn_if_not_align:0 symtab:0 alias-set -1
> canonical-type 0x7f138acd1738 precision:64 min  0x7f138acbcdf8 -9223372036854775808> max  9223372036854775807>
>  pointer_to_this >
>  used DI b.f90:1:0 size 
> unit-size 
>  align:64 warn_if_not_align:0 context  0x7f138ae83200 MAIN__>
>  chain  0x7f138ae82540>
>  used unsigned DI b.f90:2:0 size  64> unit-size 
>  align:64 warn_if_not_align:0 context  0x7f138ae83200 MAIN__
> b.f90:4:0:
>
>  4 |   a%len = 2
>|
> internal compiler error: gimplification failed
> 0xb45602 gimplify_expr(tree_node**, gimple**, gimple**, bool
> (*)(tree_node*), int)
>  ../../trunk/gcc/gimplify.c:12568
>
> Regards
>
> Thomas
>


LTO partitioning performance & increase default number of partitions

2018-10-30 Thread Jan Hubicka
Hi,
this patch increases lto-partitions to 128.  This makes the ltrans.o file sizes
grow from 458MB to 651MB, which is still not perfect but a lot better than
previously.  On Firefox the growth is smaller (only about 10%), which is
probably caused by the "unified build" they use, where they merge multiple
sources via #include to reduce the number of objects "only" to about 8000.
I will do testing w/o unified build this week as well.

What is interesting, however, is that even on my 8-core, 16-hyperthread
Bulldozer machine this reduces both overall time and user time:

partitions  real       user        sys
16:         4m25.586s  30m0.760s   0m21.772s
32:         4m16.163s  28m58.992s  0m28.996s
32:         3m17.889s  28m57.012s  0m29.084s
64:         2m55.663s  27m46.344s  0m39.568s
64:         2m57.010s  27m48.812s  0m39.192s
128:        2m52.978s  27m43.616s  0m47.964s
256:        2m54.915s  27m56.324s  1m2.272s
512:        3m2.762s   28m20.696s  1m25.616s
512:        3m1.851s   28m20.124s  1m23.812s

1to1:       4m34.263s  31m49.760s  1m56.804s
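
To compare rows of the table above, the readings can be converted to seconds and divided. The following is a small sketch only; the helper names to_seconds and speedup are made up for illustration.

```cpp
#include <cassert>

// Convert a "XmY.ZZZs" reading, split into its two parts, into seconds.
inline double to_seconds(int minutes, double seconds) {
  return minutes * 60 + seconds;
}

// How many times faster the second measurement is than the first.
inline double speedup(double before, double after) {
  return before / after;
}
```

For example, speedup(to_seconds(4, 25.586), to_seconds(2, 52.978)) compares the wall-clock time at 16 partitions with the time at 128 partitions and comes out at roughly 1.5.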

Firefox actually prefers even more partitions: it seems that the ideal size for
partition memory use is about 80MB, which is probably hard to achieve generally.
I plan to fine-tune this at the beginning of stage3, but I want to increase
partitioning now so we hit possible negative performance effects earlier.

The WPA stage has some obvious bottlenecks:
Time variable                  usr           sys          wall           GGC
 phase opt and generate  :  39.34 ( 75%)   0.62 (  6%)  39.98 ( 65%)   360751 kB ( 26%)
 phase stream in         :  11.88 ( 23%)   0.46 (  5%)  12.36 ( 20%)  1050929 kB ( 74%)
 ipa function summary    :   0.17 (  0%)   0.03 (  0%)   0.23 (  0%)    68036 kB (  5%)
 ipa cp                  :   0.83 (  2%)   0.07 (  1%)   0.98 (  2%)   127680 kB (  9%)
 ipa inlining heuristics :  30.90 ( 59%)   0.05 (  1%)  30.96 ( 50%)   118731 kB (  8%)
 lto stream inflate      :   2.94 (  6%)   0.15 (  2%)   2.95 (  5%)        0 kB (  0%)
 ipa lto gimple in       :   1.10 (  2%)   0.32 (  3%)   1.32 (  2%)   162967 kB ( 12%)
 ipa lto decl in         :   7.51 ( 14%)   0.18 (  2%)   7.77 ( 13%)   748707 kB ( 53%)
 whopr partitioning      :   1.45 (  3%)   0.02 (  0%)   1.48 (  2%)     5451 kB (  0%)
 ipa icf                 :   2.71 (  5%)   0.07 (  1%)   2.76 (  4%)    12571 kB (  1%)
 TOTAL                   :  52.15          9.62         61.86          1413731 kB

 - we may be in a position to look for a faster compression library (to save
   6% of WPA)
 - icf and profile merging still bring in too many function bodies (to save
   12% of GGC memory)
 - the inliner got slower.  The reason is twofold.  It now spends about 15% in
   the hashtable mapping summaries to symbol nodes (we used to have an array,
   which was removed by Martin) and we spend a lot of time in sreal
   computation.  This can be micro-optimized, and I have some patches to speed
   it up noticeably by getting function contexts handled better.
 - I have noticed that ltrans spends an absurd amount of time in
   lookup_external_ref (up to 20% in large partitions), which may affect the
   table above in favour of more partitioning.

Still, we could get important wins by reducing the amount of decl streaming
(I will do some tests on simplifying function types, arrays and enums to see
if there is low-hanging fruit left), but we do a lot better than ever before.

Bootstrapped/regtested x86_64-linux, committed.

Honza


* params.def (lto-partitions): Set to 128 (instead of 32).
Index: params.def
===
--- params.def  (revision 265573)
+++ params.def  (working copy)
@@ -1103,7 +1103,7 @@ DEFPARAM (PARAM_IPA_MAX_AA_STEPS,
 DEFPARAM (PARAM_LTO_PARTITIONS,
  "lto-partitions",
  "Number of partitions the program should be split to.",
- 32, 1, 0)
+ 128, 1, 0)
 
 DEFPARAM (MIN_PARTITION_SIZE,
  "lto-min-partition",


Re: Avoid tests failures in C++98

2018-10-30 Thread Jonathan Wakely

On 30/10/18 22:42 +0100, François Dumont wrote:
Running some tests in C++98 mode shows that the errors checked by the
following tests are those of the C++11 mode.

This patch adds target c++11 so that those tests are ignored in
earlier modes.


    * testsuite/23_containers/deque/requirements/dr438/assign_neg.cc:
    Add target c++11.
    * testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc:
    Likewise.
    * testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc:
    Likewise.
    * testsuite/23_containers/deque/requirements/dr438/insert_neg.cc:
    Likewise.

Ok to commit ?


No, because then we wouldn't test those members in C++98.

We should either duplicate the tests and have one copy using
{ target c++11 } and the other using { dg-options "-std=gnu++98" }, or
we should fix the expected errors to work for C++98 too. I have local
patches to do the latter, that I haven't submitted yet.




Re: [PATCH] use MAX_OFILE_ALIGNMENT to validate attribute aligned (PR 87795)

2018-10-30 Thread Jeff Law
On 10/30/18 3:40 PM, Martin Sebor wrote:
> Bug 87795 - Excessive alignment permitted for functions and labels
> points out that the handler for attribute aligned makes it possible
> for unsupported alignments to be accepted by the front end only to
> be either rejected later on by some targets for variables, or to
> cause an ICE for overaligned functions.
> 
> The reason for the problems is that the attribute handler considers
> any power of two alignment valid whose log2  is less than
> HOST_BITS_PER_INT - LOG2_BITS_PER_UNIT, but later parts of GCC
> assume values of at most MAX_OFILE_ALIGNMENT / BITS_PER_UNIT.
> The internals manual documents MAX_OFILE_ALIGNMENT as:
> 
>   Biggest alignment supported by the object file format of this
>   machine. Use this macro to limit the alignment which can be
>   specified using the __attribute__ ((aligned (n))) construct.
> 
> So it seems that the attribute handler should be using this macro
> instead.  I also took the liberty to add more detail to the error
> messages.  Attached is a patch that makes this change.  Tested on
> x86_64-linux, plus using cross-compilers for arm, hppa64, pdp11,
> and powerpc64.
> 
> Martin
> 
> gcc-87795.diff
> 
> PR c/87795 - Excessive alignment permitted for functions and labels
> 
> gcc/c-family/ChangeLog:
> 
>   PR c/87795
>   * c-common.c (check_user_alignment): Use MAX_OFILE_ALIGNMENT.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/87795
>   * gcc.dg/attr-aligned.c: New test.
OK
jeff


Avoid tests failures in C++98

2018-10-30 Thread François Dumont
Running some tests in C++98 mode shows that the errors checked by the
following tests are those of the C++11 mode.

This patch adds target c++11 so that those tests are ignored in
earlier modes.


    * testsuite/23_containers/deque/requirements/dr438/assign_neg.cc:
    Add target c++11.
    * testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc:
    Likewise.
    * testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc:
    Likewise.
    * testsuite/23_containers/deque/requirements/dr438/insert_neg.cc:
    Likewise.

Ok to commit ?

François

diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
index 22c9049283c..24c236decfb 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do compile }
+// { dg-do compile { target c++11 } }
 // { dg-prune-output "no matching function .*_M_fill_assign" }
 
 #include <deque>
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
index 5b63af2da84..5a055d52458 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do compile }
+// { dg-do compile  { target c++11 } }
 // { dg-prune-output "no matching function .*_M_fill_initialize" }
 
 #include <deque>
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
index ebb0dc17c39..78230241b67 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do compile }
+// { dg-do compile  { target c++11 } }
 // { dg-prune-output "no matching function .*_M_fill_initialize" }
 
 #include <deque>
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
index c351720b195..55ad8e5950f 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do compile }
+// { dg-do compile { target c++11 } }
 // { dg-prune-output "no matching function .*_M_fill_insert" }
 
 #include <deque>


[PATCH] use MAX_OFILE_ALIGNMENT to validate attribute aligned (PR 87795)

2018-10-30 Thread Martin Sebor

Bug 87795 - Excessive alignment permitted for functions and labels
points out that the handler for attribute aligned makes it possible
for unsupported alignments to be accepted by the front end only to
be either rejected later on by some targets for variables, or to
cause an ICE for overaligned functions.

The reason for the problems is that the attribute handler considers
any power of two alignment valid whose log2  is less than
HOST_BITS_PER_INT - LOG2_BITS_PER_UNIT, but later parts of GCC
assume values of at most MAX_OFILE_ALIGNMENT / BITS_PER_UNIT.
The internals manual documents MAX_OFILE_ALIGNMENT as:

  Biggest alignment supported by the object file format of this
  machine. Use this macro to limit the alignment which can be
  specified using the __attribute__ ((aligned (n))) construct.

So it seems that the attribute handler should be using this macro
instead.  I also took the liberty to add more detail to the error
messages.  Attached is a patch that makes this change.  Tested on
x86_64-linux, plus using cross-compilers for arm, hppa64, pdp11,
and powerpc64.
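
The check described above can be sketched in isolation as follows. This is not the GCC implementation: kBitsPerUnit and check_alignment are hypothetical stand-ins, the maximum is passed in as a plain number of bits because the real MAX_OFILE_ALIGNMENT is target-specific, and the diagnostics are left out.

```cpp
#include <cassert>

// Hypothetical stand-in for BITS_PER_UNIT; the real value is target-specific.
const unsigned kBitsPerUnit = 8;

// Returns the base-2 log of the byte alignment, or -1 when the request
// would be rejected.  max_ofile_bits plays the role of MAX_OFILE_ALIGNMENT.
inline int check_alignment(long long align_bytes,
                           unsigned long long max_ofile_bits) {
  if (align_bytes <= 0)
    return -1;                              // zero or negative
  if (align_bytes & (align_bytes - 1))
    return -1;                              // not a power of two
  if ((unsigned long long) align_bytes > max_ofile_bits / kBitsPerUnit)
    return -1;                              // exceeds the object file maximum
  int log2 = 0;
  while ((align_bytes >>= 1) != 0)
    ++log2;
  return log2;
}
```

With an assumed maximum of 32768 bits (4096 bytes), a request for 4096 is accepted and a request for 8192 is rejected.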

Martin
PR c/87795 - Excessive alignment permitted for functions and labels

gcc/c-family/ChangeLog:

	PR c/87795
	* c-common.c (check_user_alignment): Use MAX_OFILE_ALIGNMENT.

gcc/testsuite/ChangeLog:

	PR c/87795
	* gcc.dg/attr-aligned.c: New test.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 265630)
+++ gcc/c-family/c-common.c	(working copy)
@@ -5123,17 +5123,20 @@ c_init_attributes (void)
 #undef DEF_ATTR_TREE_LIST
 }
 
-/* Check whether ALIGN is a valid user-specified alignment.  If so,
-   return its base-2 log; if not, output an error and return -1.  If
-   ALLOW_ZERO then 0 is valid and should result in a return of -1 with
-   no error.  */
+/* Check whether the byte alignment ALIGN is a valid user-specified
+   alignment less than MAX_OFILE_ALIGNMENT when converted to bits.
+   If so, return ALIGN's base-2 log; if not, output an error and
+   return -1.  If ALLOW_ZERO then 0 is valid and should result in
+   a return of -1 with no error.  */
+
 int
 check_user_alignment (const_tree align, bool allow_zero)
 {
-  int i;
+  int log2bitalign;
 
   if (error_operand_p (align))
 return -1;
+
   if (TREE_CODE (align) != INTEGER_CST
   || !INTEGRAL_TYPE_P (TREE_TYPE (align)))
 {
@@ -5140,20 +5143,35 @@ check_user_alignment (const_tree align, bool allow
   error ("requested alignment is not an integer constant");
   return -1;
 }
-  else if (allow_zero && integer_zerop (align))
+
+  if (allow_zero && integer_zerop (align))
 return -1;
-  else if (tree_int_cst_sgn (align) == -1
-   || (i = tree_log2 (align)) == -1)
+
+  if (tree_int_cst_sgn (align) == -1
+  || (log2bitalign = tree_log2 (align)) == -1)
 {
-  error ("requested alignment is not a positive power of 2");
+  error ("requested alignment %qE is not a positive power of 2",
+	 align);
   return -1;
 }
-  else if (i >= HOST_BITS_PER_INT - LOG2_BITS_PER_UNIT)
+
+  unsigned maxalign = MAX_OFILE_ALIGNMENT / BITS_PER_UNIT;
+  if (tree_to_shwi (align) > maxalign)
 {
-  error ("requested alignment is too large");
+  error ("requested alignment %qE exceeds object file maximum %u",
+	 align, maxalign);
   return -1;
 }
-  return i;
+
+  /* The following is probably redundant given the test above.  */
+  if (log2bitalign >= HOST_BITS_PER_INT - LOG2_BITS_PER_UNIT)
+{
+  error ("requested alignment %qE exceeds maximum %u",
+	 align, 1U << (HOST_BITS_PER_INT - 1));
+  return -1;
+}
+
+  return log2bitalign;
 }
 
 /* Determine the ELF symbol visibility for DECL, which is either a
Index: gcc/testsuite/gcc.dg/attr-aligned.c
===
--- gcc/testsuite/gcc.dg/attr-aligned.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/attr-aligned.c	(working copy)
@@ -0,0 +1,35 @@
+/* PR c/87795 - Excessive alignment permitted for functions and labels
+   { dg-do compile } */
+
+/* Hardcode a few known values for testing the tight bounds.  */
+#if __hpux__ && __hppa__ && __LP64__
+#  define ALIGN_MAX   4096
+#  define ALIGN_TOO_BIG   (ALIGN_MAX << 1)
+#elif pdp11
+#  define ALIGN_MAX   2
+#  define ALIGN_TOO_BIG   4
+#elif __powerpc64__ || __x86_64__
+#  define ALIGN_MAX   0x1000
+#else
+   /* Guaranteed to be accepted regardless of the target.  */
+#  define ALIGN_MAX  __BIGGEST_ALIGNMENT__
+#endif
+
+#ifndef ALIGN_TOO_BIG
+   /* Guaranteed to be rejected regardless of the target.  */
+#  define ALIGN_TOO_BIG   (0x1000 << 1)
+#endif
+
+__attribute__ ((aligned (ALIGN_MAX))) const char c_max = 0;
+__attribute__ ((aligned (ALIGN_MAX))) char v_max;
+__attribute__ ((aligned (ALIGN_MAX))) void f_max (void);
+
+_Static_assert (_Alignof (c_max) == ALIGN_MAX);
+_Static_assert (_Alignof (v_max) == ALIGN_MAX);
+
+
+__attribute__ ((aligned (ALIGN_TOO_BIG))) char

Re: [PATCH, AArch64 v2 06/11] Add visibility to libfunc constructors

2018-10-30 Thread James Greenhalgh
This one needs some other reviewers copied in, who may have missed that
it is not an AArch64-only patch (it looks fine to me).

James

On Tue, Oct 02, 2018 at 11:19:10AM -0500, Richard Henderson wrote:
>   * optabs-libfuncs.c (build_libfunc_function_visibility):
>   New, split out from...
>   (build_libfunc_function): ... here.
>   (init_one_libfunc_visibility): New, split out from ...
>   (init_one_libfunc): ... here.
> ---
>  gcc/optabs-libfuncs.h |  2 ++
>  gcc/optabs-libfuncs.c | 26 --
>  2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/optabs-libfuncs.h b/gcc/optabs-libfuncs.h
> index 0669ea1fdd7..cf39da36887 100644
> --- a/gcc/optabs-libfuncs.h
> +++ b/gcc/optabs-libfuncs.h
> @@ -63,7 +63,9 @@ void gen_satfract_conv_libfunc (convert_optab, const char *,
>  void gen_satfractuns_conv_libfunc (convert_optab, const char *,
>  machine_mode, machine_mode);
>  
> +tree build_libfunc_function_visibility (const char *, symbol_visibility);
>  tree build_libfunc_function (const char *);
> +rtx init_one_libfunc_visibility (const char *, symbol_visibility);
>  rtx init_one_libfunc (const char *);
>  rtx set_user_assembler_libfunc (const char *, const char *);
>  
> diff --git a/gcc/optabs-libfuncs.c b/gcc/optabs-libfuncs.c
> index bd0df8baa37..73a28e9ca7a 100644
> --- a/gcc/optabs-libfuncs.c
> +++ b/gcc/optabs-libfuncs.c
> @@ -719,10 +719,10 @@ struct libfunc_decl_hasher : ggc_ptr_hash
>  /* A table of previously-created libfuncs, hashed by name.  */
>  static GTY (()) hash_table<libfunc_decl_hasher> *libfunc_decls;
>  
> -/* Build a decl for a libfunc named NAME.  */
> +/* Build a decl for a libfunc named NAME with visibility VIS.  */
>  
>  tree
> -build_libfunc_function (const char *name)
> +build_libfunc_function_visibility (const char *name, symbol_visibility vis)
>  {
>/* ??? We don't have any type information; pretend this is "int foo ()".  */
>tree decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
> @@ -731,7 +731,7 @@ build_libfunc_function (const char *name)
>DECL_EXTERNAL (decl) = 1;
>TREE_PUBLIC (decl) = 1;
>DECL_ARTIFICIAL (decl) = 1;
> -  DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
> +  DECL_VISIBILITY (decl) = vis;
>DECL_VISIBILITY_SPECIFIED (decl) = 1;
>gcc_assert (DECL_ASSEMBLER_NAME (decl));
>  
> @@ -742,11 +742,19 @@ build_libfunc_function (const char *name)
>return decl;
>  }
>  
> +/* Build a decl for a libfunc named NAME.  */
> +
> +tree
> +build_libfunc_function (const char *name)
> +{
> +  return build_libfunc_function_visibility (name, VISIBILITY_DEFAULT);
> +}
> +
>  /* Return a libfunc for NAME, creating one if we don't already have one.
> -   The returned rtx is a SYMBOL_REF.  */
> +   The decl is given visibility VIS.  The returned rtx is a SYMBOL_REF.  */
>  
>  rtx
> -init_one_libfunc (const char *name)
> +init_one_libfunc_visibility (const char *name, symbol_visibility vis)
>  {
>tree id, decl;
>hashval_t hash;
> @@ -763,12 +771,18 @@ init_one_libfunc (const char *name)
>  {
>/* Create a new decl, so that it can be passed to
>targetm.encode_section_info.  */
> -  decl = build_libfunc_function (name);
> +  decl = build_libfunc_function_visibility (name, vis);
>*slot = decl;
>  }
>return XEXP (DECL_RTL (decl), 0);
>  }
>  
> +rtx
> +init_one_libfunc (const char *name)
> +{
> +  return init_one_libfunc_visibility (name, VISIBILITY_DEFAULT);
> +}
> +
>  /* Adjust the assembler name of libfunc NAME to ASMSPEC.  */
>  
>  rtx
> -- 
> 2.17.1
> 


Re: [PATCH, AArch64 v2 09/11] aarch64: Force TImode values into even registers

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:13AM -0500, Richard Henderson wrote:
> The LSE CASP instruction requires values to be placed in even
> register pairs.  A solution involving two additional register
> classes was rejected in favor of the much simpler solution of
> simply requiring all TImode values to be aligned.

OK.

Thanks,
James

> 
>   * config/aarch64/aarch64.c (aarch64_hard_regno_mode_ok): Force
>   16-byte modes held in GP registers to use an even regno.
> ---
>  gcc/config/aarch64/aarch64.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 49b47382b5d..ce4d7e51d00 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1451,10 +1451,14 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
>if (regno == FRAME_POINTER_REGNUM || regno == ARG_POINTER_REGNUM)
>  return mode == Pmode;
>  
> -  if (GP_REGNUM_P (regno) && known_le (GET_MODE_SIZE (mode), 16))
> -return true;
> -
> -  if (FP_REGNUM_P (regno))
> +  if (GP_REGNUM_P (regno))
> +{
> +  if (known_le (GET_MODE_SIZE (mode), 8))
> + return true;
> +  else if (known_le (GET_MODE_SIZE (mode), 16))
> + return (regno & 1) == 0;
> +}
> +  else if (FP_REGNUM_P (regno))
>  {
>if (vec_flags & VEC_STRUCT)
>   return end_hard_regno (mode, regno) - 1 <= V31_REGNUM;
> -- 
> 2.17.1
> 
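
The rule in the quoted patch can be sketched on its own as follows. This is an illustration only: gp_regno_mode_ok is a hypothetical helper that takes the mode size in bytes directly, and rejecting sizes above 16 bytes is an assumption about the fall-through, which the quoted hunk does not show.

```cpp
#include <cassert>

// Sketch: may a general-purpose register with this number hold a value of
// the given size?  Values of up to 8 bytes fit in any register; 16-byte
// values (such as TImode) must start at an even register number so that
// they form an even/odd pair.
inline bool gp_regno_mode_ok(unsigned regno, unsigned mode_size_bytes) {
  if (mode_size_bytes <= 8)
    return true;
  if (mode_size_bytes <= 16)
    return (regno & 1) == 0;
  return false;   // assumption: larger modes are not held in GP registers
}
```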


Re: [C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-30 Thread Jason Merrill

On 10/26/18 2:02 PM, Paolo Carlini wrote:

On 26/10/18 17:18, Jason Merrill wrote:
On Fri, Oct 26, 2018 at 4:52 AM Paolo Carlini wrote:

On 24/10/18 22:41, Jason Merrill wrote:

On 10/15/18 12:45 PM, Paolo Carlini wrote:

 && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
+   && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
  && MAYBE_CLASS_TYPE_P (declspecs->type))

I would think that the MAYBE_CLASS_TYPE_P here should be CLASS_TYPE_P,
and then we can remove the TYPENAME_TYPE check.  Or do we want to
allow template type parameters for some reason?

Indeed, it would be nice to just use OVERLOAD_TYPE_P. However it seems
we at least want to let through TEMPLATE_TYPE_PARMs representing 'auto'
- otherwise Dodji's check a few lines below which fixed c++/51473
doesn't work anymore - and also BOUND_TEMPLATE_TEMPLATE_PARM, otherwise
we regress on template/spec32.C and template/ttp22.C because we don't
diagnose the shadowing anymore. Thus, I would say either we keep on
using MAYBE_CLASS_TYPE_P or we pick what we need, possibly we add a 
comment?

Aha.  I guess the answer is not to restrict that test any more, but
instead to fix the code further down so it gives a proper diagnostic
rather than call warn_misplaced_attr_for_class_type.


I see. Thus something like the below? It passes testing on x86_64-linux.



+  if ((!declared_type || TREE_CODE (declared_type) == DECLTYPE_TYPE)
+  && ! saw_friend && !error_p)
 permerror (input_location, "declaration does not declare anything");


I see no reason to make this specific to decltype.  Maybe move this 
diagnostic into the final 'else' block with the other declspec 
diagnostics and not look at declared_type at all?



+  if (declspecs->attributes && warn_attributes && declared_type
+  && TREE_CODE (declared_type) != DECLTYPE_TYPE)


I think we do want to give a diagnostic about useless attributes, not 
skip it.


Jason


V2 [PATCH] i386: Use scalar operand in SF/DF/SI/DI vec_dup patterns

2018-10-30 Thread H.J. Lu
On Mon, Oct 29, 2018 at 2:02 PM Uros Bizjak wrote:
>
> On Sat, Oct 27, 2018 at 8:03 AM H.J. Lu wrote:
> >
> > Use scalar operand in SF/DF/SI/DI vec_dup patterns which enables combiner
> > to generate
> >
> > (set (reg:V8SF 84)
> >  (vec_duplicate:V8SF (mem/c:SF (symbol_ref:DI ("y")
> >
> > const_vector_duplicate_operand is added for constant vector broadcast.
> > We split
> >
> > (set (reg:V16SF 86)
> >  (const_vector:V16SF
> >[(const_double:SF 2.0e+0 [0x0.8p+2]) repeated x16])
> >
> > to
> >
> > (set (reg:V16SF 86)
> >  (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")
>
> Why not at the expand time? Rewrite vector constant as vec_duplicate
> from memory and combine will do the stuff for you. We do have _bcst
> instruction patterns.
>

Here is the updated patch to do that.  OK for trunk?

Thanks.


-- 
H.J.
From 0c2ffe8a627c64263805baba8c9d9754dbb30f4b Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 2 Oct 2018 14:27:55 -0700
Subject: [PATCH] i386: Use scalar operand in SF/DF/SI/DI vec_dup patterns

Use scalar operand in SF/DF/SI/DI vec_dup patterns for AVX512 which
enables combiner to generate

(set (reg:V8SF 84)
 (vec_duplicate:V8SF (mem/c:SF (symbol_ref:DI ("y")

To support it, the following changes are made:

1. For AVX512 broadcast instructions from integer register operand, we
only need to broadcast integer to integer vectors.
2. Replace nonimmediate_operand with register_operand in vec_dup patterns
since memory operand size is wrong.  Add vec_dup patterns with
memory_operand of correct operand size.
3. Replace duplicated vec_dup patterns with subreg.
4. Update AVX512 broadcast expanders to optimize constant SF/DF/SI/DI
vector broadcasts.
5. Add const_vector_duplicate_operand for constant vector broadcast.
We split

(set (reg:V16SF 86)
 (const_vector:V16SF
   [(const_double:SF 2.0e+0 [0x0.8p+2]) repeated x16])

to

(set (reg:V16SF 86)
 (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")

before IRA so that IRA can turn

(set (reg:V16SF 86)
 (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1")
(set (reg:V16SF 90)
 (plus:V16SF (reg/v:V16SF 85 [ x ])
		 (reg:V16SF 86)))

into

(set (reg:V16SF 90)
 (plus:V16SF
   (vec_duplicate:V16SF (mem/u/c:SF (symbol_ref/u:DI ("*.LC1"
   (reg/v:V16SF 85 [ x ])))

gcc/

	PR target/87537
	PR target/87767
	* config/i386/i386-builtin-types.def: Replace
	CODE_FOR_avx2_vec_dupv4sf, CODE_FOR_avx2_vec_dupv8sf and
	CODE_FOR_avx2_vec_dupv4df with CODE_FOR_vec_dupv4sf,
	CODE_FOR_vec_dupv8sf and CODE_FOR_vec_dupv4df, respectively.
	* config/i386/i386.c (ix86_expand_args_builtin): Handle
	SF/DF/SI/DI constant vector broadcast.
	(expand_vec_perm_1): Updated.  Duplicate them from source operand.
	* config/i386/i386.md (SF to DF splitter): Replace
	gen_avx512f_vec_dupv16sf_1 with gen_avx512f_vec_dupv16sf.
	* config/i386/predicates.md (const_vector_duplicate_operand): New.
	* config/i386/sse.md (VF48_AVX512VL): New.
	(avx2_vec_dup): Removed.
	(avx2_vec_dupv8sf_1): Likewise.
	(avx512f_vec_dup_1): Likewise.
	(avx2_pbroadcast_1): Likewise.
	(avx2_vec_dupv4df): Likewise.
	(_vec_dup_1): Likewise.
	(_vec_dup:V48_AVX512VL): Likewise.
	(avx2_pbroadcast): Replace nonimmediate_operand with
	register_operand.
	(_vec_dup:VI48_AVX512VL): Likewise.
	(_vec_dup:VI12_AVX512VL): Likewise.
	(_vec_dup:VF48_AVX512VL): New.
	(*const_vec_dup): Likewise.
	(_vec_dup:VI48_AVX512VL): Likewise.
	(_vec_dup_1:VI48_AVX512VL): Likewise.
	(_vec_dup_gpr): Replace
	V48_AVX512VL with VI48_AVX512VL.
	(*avx_vperm_broadcast_): Replace gen_avx2_vec_dupv8sf with
	gen_vec_dupv8sf.

gcc/testsuite/

	PR target/87537
	PR target/87767
	* gcc.target/i386/avx2-vbroadcastss_ps256-1.c: Updated.
	* gcc.target/i386/avx512vl-vbroadcast-3.c: Likewise.
	* gcc.target/i386/avx512-binop-7.h: New file.
	* gcc.target/i386/avx512f-add-sf-zmm-7.c: Likewise.
	* gcc.target/i386/avx512f-add-si-zmm-7.c: Likewise.
	* gcc.target/i386/avx512vl-add-di-xmm-7.c: Likewise.
	* gcc.target/i386/avx512vl-add-sf-xmm-7.c: Likewise.
	* gcc.target/i386/avx512vl-add-sf-ymm-7.c: Likewise.
	* gcc.target/i386/avx512vl-add-si-xmm-7.c: Likewise.
	* gcc.target/i386/avx512vl-add-si-ymm-7.c: Likewise.
	* gcc.target/i386/pr87537-2.c: Likewise.
	* gcc.target/i386/pr87537-3.c: Likewise.
	* gcc.target/i386/pr87537-4.c: Likewise.
	* gcc.target/i386/pr87537-5.c: Likewise.
	* gcc.target/i386/pr87537-6.c: Likewise.
	* gcc.target/i386/pr87537-7.c: Likewise.
	* gcc.target/i386/pr87537-8.c: Likewise.
	* gcc.target/i386/pr87537-9.c: Likewise.
---
 gcc/config/i386/i386-builtin.def  |   6 +-
 gcc/config/i386/i386.c| 212 --
 gcc/config/i386/i386.md   |   2 +-
 gcc/config/i386/predicates.md |  13 ++
 gcc/config/i386/sse.md| 145 +---
 .../i386/avx2-vbroadcastss_ps256-1.c  |   3 +-
 .../gcc.target/i386/avx512-binop-7.h  |  12 +
 

Re: [PATCH, AArch64 v2 05/11] aarch64: Emit LSE st instructions

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:09AM -0500, Richard Henderson wrote:
> When the result of an operation is not used, we can ignore the
> result by storing to XZR.  For two of the memory models, using
> XZR with LD<op> has a preferred assembler alias, ST<op>.

ST<OP> has different semantics from LD<OP>; in particular, ST<OP> is not
ordered by a DMB LD, so this could weaken the LDADD and break C11 semantics.

The relevant Arm Arm text is:

  If the destination register is not one of WZR or XZR, LDADDA and
  LDADDAL load from memory with acquire semantics

  LDADDL and LDADDAL store to memory with release semantics.

  LDADD has no memory ordering requirements.

I'm taking this to mean that even if the result is unused, using XZR is not
a valid transformation; it weakens the expected acquire semantics to
unordered.

The example I have from Will Deacon on an internal bug database is:

  P0 (atomic_int* y,atomic_int* x) {
    atomic_store_explicit(x,1,memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(y,1,memory_order_relaxed);
  }

  P1 (atomic_int* y,atomic_int* x) {
    int r0 = atomic_fetch_add_explicit(y,1,memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    int r1 = atomic_load_explicit(x,memory_order_relaxed);
  }

  The outcome where y == 2 and P1 has r0 = 1 and r1 = 0 is illegal.

This example comes from a while back in my memory; so copying Will for
any more detailed questions.

My impression is that this transformation is not safe, and so the patch is
not OK.
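
As a rough illustration of the litmus test above, here is a C++ sketch of
the variant that matters for this patch: the result of the fetch-add is
discarded, which is the case where the compiler would pick the ST<OP> alias.
The function names are hypothetical.  If the final value of y is 2, the
fetch-add must have read the value stored by P0, so the release and acquire
fences synchronize and P1 must then see x == 1.  A run of this program can
only fail to observe a bug; it cannot prove its absence.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the two-thread litmus test once.  Returns true if the forbidden
// outcome (final y == 2 while P1 read x == 0) was observed.
bool forbidden_outcome_once ()
{
  std::atomic<int> x (0), y (0);
  int r1 = -1;

  std::thread p0 ([&] {
    x.store (1, std::memory_order_relaxed);
    std::atomic_thread_fence (std::memory_order_release);
    y.store (1, std::memory_order_relaxed);
  });
  std::thread p1 ([&] {
    // Result deliberately unused: this is the case under discussion.
    y.fetch_add (1, std::memory_order_relaxed);
    std::atomic_thread_fence (std::memory_order_acquire);
    r1 = x.load (std::memory_order_relaxed);
  });
  p0.join ();
  p1.join ();

  return y.load (std::memory_order_relaxed) == 2 && r1 == 0;
}

// Repeats the test and counts how often the forbidden outcome appeared.
int count_forbidden (int iterations)
{
  int count = 0;
  for (int i = 0; i < iterations; ++i)
    if (forbidden_outcome_once ())
      ++count;
  return count;
}
```

On a conforming implementation count_forbidden should always return 0.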

Thanks,
James

> 
>   * config/aarch64/atomics.md (aarch64_atomic__lse):
>   Use ST for relaxed and release models; load to XZR otherwise;
>   remove the now unnecessary scratch register.
> 
>   * gcc.target/aarch64/atomic-inst-ldadd.c: Expect stadd{,l}.
>   * gcc.target/aarch64/atomic-inst-ldlogic.c: Similarly.
> ---
>  .../gcc.target/aarch64/atomic-inst-ldadd.c| 18 ---
>  .../gcc.target/aarch64/atomic-inst-ldlogic.c  | 54 ---
>  gcc/config/aarch64/atomics.md | 15 +++---
>  3 files changed, 57 insertions(+), 30 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c 
> b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c
> index 4b2282c6861..db2206186b4 100644
> --- a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c
> +++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldadd.c
> @@ -67,20 +67,26 @@ TEST (add_load_notreturn, ADD_LOAD_NORETURN)
>  TEST (sub_load, SUB_LOAD)
>  TEST (sub_load_notreturn, SUB_LOAD_NORETURN)
>  
> -/* { dg-final { scan-assembler-times "ldaddb\t" 16} } */
> +/* { dg-final { scan-assembler-times "ldaddb\t" 8} } */
>  /* { dg-final { scan-assembler-times "ldaddab\t" 32} } */
> -/* { dg-final { scan-assembler-times "ldaddlb\t" 16} } */
> +/* { dg-final { scan-assembler-times "ldaddlb\t" 8} } */
>  /* { dg-final { scan-assembler-times "ldaddalb\t" 32} } */
> +/* { dg-final { scan-assembler-times "staddb\t" 8} } */
> +/* { dg-final { scan-assembler-times "staddlb\t" 8} } */
>  
> -/* { dg-final { scan-assembler-times "ldaddh\t" 16} } */
> +/* { dg-final { scan-assembler-times "ldaddh\t" 8} } */
>  /* { dg-final { scan-assembler-times "ldaddah\t" 32} } */
> -/* { dg-final { scan-assembler-times "ldaddlh\t" 16} } */
> +/* { dg-final { scan-assembler-times "ldaddlh\t" 8} } */
>  /* { dg-final { scan-assembler-times "ldaddalh\t" 32} } */
> +/* { dg-final { scan-assembler-times "staddh\t" 8} } */
> +/* { dg-final { scan-assembler-times "staddlh\t" 8} } */
>  
> -/* { dg-final { scan-assembler-times "ldadd\t" 32} } */
> +/* { dg-final { scan-assembler-times "ldadd\t" 16} } */
>  /* { dg-final { scan-assembler-times "ldadda\t" 64} } */
> -/* { dg-final { scan-assembler-times "ldaddl\t" 32} } */
> +/* { dg-final { scan-assembler-times "ldaddl\t" 16} } */
>  /* { dg-final { scan-assembler-times "ldaddal\t" 64} } */
> +/* { dg-final { scan-assembler-times "stadd\t" 16} } */
> +/* { dg-final { scan-assembler-times "staddl\t" 16} } */
>  
>  /* { dg-final { scan-assembler-not "ldaxr\t" } } */
>  /* { dg-final { scan-assembler-not "stlxr\t" } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c 
> b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c
> index 4879d52b9b4..b8a53e0a676 100644
> --- a/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c
> +++ b/gcc/testsuite/gcc.target/aarch64/atomic-inst-ldlogic.c
> @@ -101,54 +101,72 @@ TEST (xor_load_notreturn, XOR_LOAD_NORETURN)
>  
>  /* Load-OR.  */
>  
> -/* { dg-final { scan-assembler-times "ldsetb\t" 8} } */
> +/* { dg-final { scan-assembler-times "ldsetb\t" 4} } */
>  /* { dg-final { scan-assembler-times "ldsetab\t" 16} } */
> -/* { dg-final { scan-assembler-times "ldsetlb\t" 8} } */
> +/* { dg-final { scan-assembler-times "ldsetlb\t" 4} } */
>  /* { dg-final { scan-assembler-times "ldsetalb\t" 16} } */
> +/* { dg-final { scan-assembler-times "stsetb\t" 4} } */
> +/* { dg-final { scan-assembler-times "stsetlb\t" 4} } */

Re: [PATCH, AArch64 v2 04/11] aarch64: Improve atomic-op lse generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:08AM -0500, Richard Henderson wrote:
> Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
> iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
> logical for ldclr aka bic.

OK.

Thanks,
James

> 
>   * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
>   (aarch64_atomic_ldop_supported_p): Remove.
>   (aarch64_gen_atomic_ldop): Remove.
>   * config/aarch64/atomic.md (atomic_):
>   Fully expand LSE operations here.
>   (atomic_fetch_): Likewise.
>   (atomic__fetch): Likewise.
>   (aarch64_atomic__lse): Drop atomic_op iterator
>   and use ATOMIC_LDOP instead; use register_operand for the input;
>   drop the split and emit insns directly.
>   (aarch64_atomic_fetch__lse): Likewise.
>   (aarch64_atomic__fetch_lse): Remove.
>   (@aarch64_atomic_load): Remove.
> ---
>  gcc/config/aarch64/aarch64-protos.h |   2 -
>  gcc/config/aarch64/aarch64.c| 176 -
>  gcc/config/aarch64/atomics.md   | 197 +++-
>  gcc/config/aarch64/iterators.md |   5 +-
>  4 files changed, 108 insertions(+), 272 deletions(-)
> 



Re: [PATCH, AArch64 v2 03/11] aarch64: Improve swp generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:07AM -0500, Richard Henderson wrote:
> Allow zero as an input; fix constraints; avoid unnecessary split.

OK.

James

> 
>   * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
>   (aarch64_gen_atomic_ldop): Don't call it.
>   * config/aarch64/atomics.md (atomic_exchange):
>   Use aarch64_reg_or_zero.
>   (aarch64_atomic_exchange): Likewise.
>   (aarch64_atomic_exchange_lse): Remove split; remove & from
>   operand 0; use aarch64_reg_or_zero for input; merge ...
>   (@aarch64_atomic_swp): ... this and remove.
> ---


Re: [PATCH, d] Disable D on systems where it is known not to work.

2018-10-30 Thread Iain Buclaw
On Tue, 30 Oct 2018 at 20:50, Andreas Schwab  wrote:
>
> On Okt 30 2018, Iain Buclaw  wrote:
>
> > This turns off D front-end where there's been reported bootstrap
> > problems that need further investigation.  Also added a configure.tgt
> > for libphobos to allow enabling for targets where there's known good
> > runtime support backed by existing continuous integration.
>
> Why do you need that?  The D frontend isn't built by default.
>

However, as far as I have seen, all automated builders for gcc are
configured with --enable-languages=all, which pulls the D frontend in.

-- 
Iain


Re: [PATCH, AArch64 v2 02/11] aarch64: Improve cas generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:06AM -0500, Richard Henderson wrote:
> Do not zero-extend the input to the cas for subword operations;
> instead, use the appropriate zero-extending compare insns.
> Correct the predicates and constraints for immediate expected operand.

OK, modulo two very dull style comments.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index fbec54fe5da..0e2b85de1e3 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1613,6 +1613,33 @@ aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
>return cc_reg;
>  }
>  
> +/* Similarly, but maybe zero-extend Y if Y_MODE < SImode.  */
> +
> +static rtx
> +aarch64_gen_compare_reg_maybe_ze(RTX_CODE code, rtx x, rtx y,
> + machine_mode y_mode)

Space before the bracket: aarch64_gen_compare_reg_maybe_ze (RTX_CODE

> @@ -14187,26 +14197,32 @@ aarch64_expand_compare_and_swap (rtx operands[])
>/* The CAS insn requires oldval and rval overlap, but we need to
>have a copy of oldval saved across the operation to tell if
>the operation is successful.  */
> -  if (mode == QImode || mode == HImode)
> - rval = copy_to_mode_reg (SImode, gen_lowpart (SImode, oldval));
> -  else if (reg_overlap_mentioned_p (rval, oldval))
> -rval = copy_to_mode_reg (mode, oldval);
> -  else
> - emit_move_insn (rval, oldval);
> +  if (reg_overlap_mentioned_p (rval, oldval))
> +rval = copy_to_mode_reg (r_mode, oldval);
> +  else 

Trailing space on else.

> + emit_move_insn (rval, gen_lowpart (r_mode, oldval));
> +
>emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,
>  newval, mod_s));
> -  aarch64_gen_compare_reg (EQ, rval, oldval);
> +  cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
>  }

Thanks,
James



Re: [PATCH, d] Disable D on systems where it is known not to work.

2018-10-30 Thread Andreas Schwab
On Okt 30 2018, Iain Buclaw  wrote:

> This turns off D front-end where there's been reported bootstrap
> problems that need further investigation.  Also added a configure.tgt
> for libphobos to allow enabling for targets where there's known good
> runtime support backed by existing continuous integration.

Why do you need that?  The D frontend isn't built by default.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:05AM -0500, Richard Henderson wrote:
> The cas insn is a single insn, and if expanded properly need not
> be split after reload.  Use the proper inputs for the insn.

OK.

Thanks,
James

> 
>   * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
>   Force oldval into the rval register for TARGET_LSE; emit the compare
>   during initial expansion so that it may be deleted if unused.
>   (aarch64_gen_atomic_cas): Remove.
>   * config/aarch64/atomics.md (@aarch64_compare_and_swap_lse):
>   Change = to +r for operand 0; use match_dup for operand 2;
>   remove is_weak and mod_f operands as unused.  Drop the split
>   and merge with...
>   (@aarch64_atomic_cas): ... this pattern's output; remove.
>   (@aarch64_compare_and_swap_lse): Similarly.
>   (@aarch64_atomic_cas): Similarly.


Re: C++ PATCH to implement C++20 P0892R2 - explicit(bool) [v4]

2018-10-30 Thread Jason Merrill

On 10/29/18 6:15 PM, Marek Polacek wrote:

On Wed, Oct 24, 2018 at 02:55:14PM -0400, Jason Merrill wrote:

On 10/12/18 12:32 PM, Marek Polacek wrote:

+   EXPLICIT_SPECIFIER is used in case the explicit-specifier, if any, has
+   value-dependent expression.  */
  static void
  cp_parser_decl_specifier_seq (cp_parser* parser,
  cp_parser_flags flags,
  cp_decl_specifier_seq *decl_specs,
- int* declares_class_or_enum)
+ int* declares_class_or_enum,
+ tree* explicit_specifier)


Why not add the explicit-specifier to cp_decl_specifier_seq?  They don't
live very long, so making them bigger isn't a concern.  Then more of the
handling could move into grokdeclarator along with the other explicit
handling.


Great -- that simplifies things.


@@ -12822,6 +12844,17 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
if (!uses_template_parms (DECL_TI_ARGS (t)))
return t;
+  /* Handle explicit(dependent-expr).  */
+  if (DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (t))
+   {
+ tree spec = lookup_explicit_specifier (t);
+ spec = tsubst_copy_and_build (spec, args, complain, in_decl,
+   /*function_p=*/false,
+   /*i_c_e_p=*/true);
+ spec = build_explicit_specifier (spec, complain);
+ DECL_NONCONVERTING_P (t) = (spec == boolean_true_node);
+   }


This is setting DECL_NONCONVERTING_P on the template, rather than the
instantiation r, which hasn't been created yet at this point; this handling
needs to move further down in the function.


Hmm, interesting that that worked, too.  Anyway, fixed.  Thanks!


It worked for the testcase because that bit was then copied to the 
instantiation.  I'm surprised that converting b0 in explicit13.C worked, 
though.



+  /* Handle explicit(dependent-expr).  */
+  if (DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P (t))
+{
+  tree spec = lookup_explicit_specifier (t);
+  spec = tsubst_copy_and_build (spec, args, complain, in_decl,
+   /*function_p=*/false,
+   /*i_c_e_p=*/true);
+  spec = build_explicit_specifier (spec, complain);
+  DECL_NONCONVERTING_P (r) = (spec == boolean_true_node);
+}


It still surprises me that you don't need to store the partially 
instantiated explicit-specifier, but explicit13.C does seem to cover the 
cases I would expect to break.



Bootstrapped/regtested on x86_64-linux, ok for trunk?


OK.

Jason
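
For context, a minimal sketch of what explicit(dependent-expr) looks like at
the source level.  The class name wrapper is hypothetical and this is not
one of the testcases from the patch; it assumes a compiler that accepts
C++20 explicit(bool).  The explicit-specifier depends on the template
parameter, so whether the constructor is a converting constructor is only
known per instantiation.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
struct wrapper
{
  // Explicit unless T is an integral type; the condition is
  // value-dependent until wrapper<T> is instantiated.
  explicit (!std::is_integral<T>::value) wrapper (T) {}
};

// wrapper<int>: constructor is not explicit, implicit conversion works.
static_assert (std::is_convertible<int, wrapper<int>>::value,
	       "int converts implicitly");
// wrapper<double>: constructor is explicit, only direct-init works.
static_assert (!std::is_convertible<double, wrapper<double>>::value,
	       "double does not convert implicitly");
static_assert (std::is_constructible<wrapper<double>, double>::value,
	       "double can still be used for direct-initialization");
```

Each instantiation gets its own answer, which is why the flag has to be set
on the instantiated declaration rather than on the template.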


Re: [patch] various OpenACC reduction enhancements - ME and nvptx changes

2018-10-30 Thread Cesar Philippidis
On 10/5/18 07:07, Tom de Vries wrote:
> On 6/29/18 8:19 PM, Cesar Philippidis wrote:
>> The attached patch includes the nvptx and GCC ME reductions enhancements.
>>
>> Is this patch OK for trunk? It bootstrapped / regression tested cleanly
>> for x86_64 with nvptx offloading.
>>
> 
> These need fixing:
> ...
> === ERROR type #5: trailing whitespace (4 error(s)) ===
> gcc/config/nvptx/nvptx.c:5139:0:██
> gcc/config/nvptx/nvptx.c:5660:8:  do█
> gcc/config/nvptx/nvptx.c:5702:0:██
> gcc/config/nvptx/nvptx.c:5726:0:██
> ...

Sorry. The attached patch fixes that.

> Otherwise, nvptx part LGTM.
Tomorrow's my last day at Mentor, so either Thomas or Julian will need
to commit it once the other patches get approved.

Thanks,
Cesar
	gcc/
	* config/nvptx/nvptx.c (nvptx_propagate_unified): New.
	(nvptx_split_blocks): Call it for cond_uni insn.
	(nvptx_expand_cond_uni): New.
	(enum nvptx_builtins): Add NVPTX_BUILTIN_COND_UNI.
	(nvptx_init_builtins): Initialize it.
	(nvptx_expand_builtin):
	(nvptx_generate_vector_shuffle): Change integral SHIFT operand to
	tree BITS operand.
	(nvptx_vector_reduction): New.
	(nvptx_adjust_reduction_type): New.
	(nvptx_goacc_reduction_setup): Use it to adjust the type of ref_to_res.
	(nvptx_goacc_reduction_init): Don't update LHS if it doesn't exist.
	(nvptx_goacc_reduction_fini): Call nvptx_vector_reduction for vector.
	Use it to adjust the type of ref_to_res.
	(nvptx_goacc_reduction_teardown):
	* config/nvptx/nvptx.md (cond_uni): New pattern.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 9903a273863..acb490a9a90 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2863,6 +2863,52 @@ nvptx_reorg_uniform_simt ()
 }
 }
 
+/* UNIFIED is a cond_uni insn.  Find the branch insn it affects, and
+   mark that as unified.  We expect to be in a single block.  */
+
+static void
+nvptx_propagate_unified (rtx_insn *unified)
+{
+  rtx_insn *probe = unified;
+  rtx cond_reg = SET_DEST (PATTERN (unified));
+  rtx pat = NULL_RTX;
+
+  /* Find the comparison.  (We could skip this and simply scan to he
+ blocks' terminating branch, if we didn't care for self
+ checking.)  */
+  for (;;)
+{
+  probe = next_real_insn (probe);
+  if (!probe)
+	break;
+  pat = PATTERN (probe);
+
+  if (GET_CODE (pat) == SET
+	  && GET_RTX_CLASS (GET_CODE (SET_SRC (pat))) == RTX_COMPARE
+	  && XEXP (SET_SRC (pat), 0) == cond_reg)
+	break;
+  gcc_assert (NONJUMP_INSN_P (probe));
+}
+  gcc_assert (pat);
+  rtx pred_reg = SET_DEST (pat);
+
+  /* Find the branch.  */
+  do
+probe = NEXT_INSN (probe);
+  while (!JUMP_P (probe));
+
+  pat = PATTERN (probe);
+  rtx itec = XEXP (SET_SRC (pat), 0);
+  gcc_assert (XEXP (itec, 0) == pred_reg);
+
+  /* Mark the branch's condition as unified.  */
+  rtx unspec = gen_rtx_UNSPEC (BImode, gen_rtvec (1, pred_reg),
+			   UNSPEC_BR_UNIFIED);
+  bool ok = validate_change (probe, &XEXP (itec, 0), unspec, false);
+
+  gcc_assert (ok);
+}
+
 /* Loop structure of the function.  The entire function is described as
a NULL loop.  */
 
@@ -2964,6 +3010,9 @@ nvptx_split_blocks (bb_insn_map_t *map)
 	continue;
 	  switch (recog_memoized (insn))
 	{
+	case CODE_FOR_cond_uni:
+	  nvptx_propagate_unified (insn);
+	  /* FALLTHROUGH */
 	default:
 	  seen_insn = true;
 	  continue;
@@ -5083,6 +5132,21 @@ nvptx_expand_cmp_swap (tree exp, rtx target,
   return target;
 }
 
+/* Expander for the compare unified builtin.  */
+
+static rtx
+nvptx_expand_cond_uni (tree exp, rtx target, machine_mode mode, int ignore)
+{
+  if (ignore)
+return target;
+
+  rtx src = expand_expr (CALL_EXPR_ARG (exp, 0),
+			 NULL_RTX, mode, EXPAND_NORMAL);
+
+  emit_insn (gen_cond_uni (target, src));
+
+  return target;
+}
 
 /* Codes for all the NVPTX builtins.  */
 enum nvptx_builtins
@@ -5092,6 +5156,7 @@ enum nvptx_builtins
   NVPTX_BUILTIN_WORKER_ADDR,
   NVPTX_BUILTIN_CMP_SWAP,
   NVPTX_BUILTIN_CMP_SWAPLL,
+  NVPTX_BUILTIN_COND_UNI,
   NVPTX_BUILTIN_MAX
 };
 
@@ -5129,6 +5194,7 @@ nvptx_init_builtins (void)
(PTRVOID, ST, UINT, UINT, NULL_TREE));
   DEF (CMP_SWAP, "cmp_swap", (UINT, PTRVOID, UINT, UINT, NULL_TREE));
   DEF (CMP_SWAPLL, "cmp_swapll", (LLUINT, PTRVOID, LLUINT, LLUINT, NULL_TREE));
+  DEF (COND_UNI, "cond_uni", (integer_type_node, integer_type_node, NULL_TREE));
 
 #undef DEF
 #undef ST
@@ -5161,6 +5227,9 @@ nvptx_expand_builtin (tree exp, rtx target, rtx ARG_UNUSED (subtarget),
 case NVPTX_BUILTIN_CMP_SWAPLL:
   return nvptx_expand_cmp_swap (exp, target, mode, ignore);
 
+case NVPTX_BUILTIN_COND_UNI:
+  return nvptx_expand_cond_uni (exp, target, mode, ignore);
+
 default: gcc_unreachable ();
 }
 }
@@ -5284,7 +5353,7 @@ nvptx_get_worker_red_addr (tree type, tree offset)
 
 static void
 nvptx_generate_vector_shuffle (location_t loc,
-			   tree dest_var, tree var, unsigned shift,
+			   tree dest_var, tree var, tree bits,
 			   gimple_seq *seq)
 

[PATCH, d] Disable D on systems where it is known not to work.

2018-10-30 Thread Iain Buclaw
Hi,

This turns off D front-end where there's been reported bootstrap
problems that need further investigation.  Also added a configure.tgt
for libphobos to allow enabling for targets where there's known good
runtime support backed by existing continuous integration.

In both cases, the check can be overridden by explicitly requesting D or
libphobos at configure time.

The guards will be loosened as I go through each target configuration.
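
The decision made by the top-level configure fragment can be modelled in a
few lines of C++.  This is only a sketch of the logic: the function name
d_disabled_by_default is hypothetical, and it uses POSIX fnmatch to stand in
for the shell's case pattern matching.

```cpp
#include <cassert>
#include <fnmatch.h>
#include <string>

// Returns true if D would be added to unsupported_languages.
// enable_languages is the comma-separated --enable-languages value,
// target is the canonical target triplet.
bool d_disabled_by_default (const std::string &enable_languages,
			    const std::string &target)
{
  // Mirrors: case ,${enable_languages}, in *,d,*) ;;
  // An explicit "d" in the list overrides the guard.
  const std::string langs = "," + enable_languages + ",";
  if (langs.find (",d,") != std::string::npos)
    return false;

  // Mirrors the target patterns in the inner case statement.
  static const char *const patterns[]
    = { "*-*-darwin*", "*-*-cygwin*", "*-*-mingw*" };
  for (const char *pattern : patterns)
    if (fnmatch (pattern, target.c_str (), 0) == 0)
      return true;
  return false;
}
```

Note that --enable-languages=all does not contain a literal "d" entry, so
on the listed targets D is still disabled for builders configured that way.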

-- 
Iain

---
ChangeLog:

2018-10-30  Iain Buclaw  

PR bootstrap/87788
PR d/87799
* configure: Rebuild.
* configure.ac: Disable D on systems where it is known not to work.

libphobos/ChangeLog:

2018-10-30  Iain Buclaw  

PR bootstrap/87789
PR d/87818
PR d/87819
* configure.tgt: New file.

---
diff --git a/configure b/configure
index 77e7e1869ba..20741aef7e3 100755
--- a/configure
+++ b/configure
@@ -3345,6 +3345,40 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   esac
 fi
 
+# Disable D on systems where it is known to not work.
+# For testing, you can override this with --enable-languages=d.
+case ,${enable_languages}, in
+  *,d,*)
+;;
+  *)
+case "${target}" in
+  *-*-darwin* | *-*-cygwin* | *-*-mingw*)
+	unsupported_languages="$unsupported_languages d"
+	;;
+esac
+;;
+esac
+
+# Disable libphobos on unsupported systems.
+# For testing, you can override this with --enable-libphobos.
+if test -d ${srcdir}/libphobos; then
+if test x$enable_libphobos = x; then
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for libphobos support" >&5
+$as_echo_n "checking for libphobos support... " >&6; }
+	if (srcdir=${srcdir}/libphobos; \
+		. ${srcdir}/configure.tgt; \
+		test -n "$UNSUPPORTED")
+	then
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+	noconfigdirs="$noconfigdirs target-libphobos"
+	else
+	{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+	fi
+fi
+fi
+
 # Disable Fortran for some systems.
 case "${target}" in
   mmix-*-*)
diff --git a/configure.ac b/configure.ac
index 1e5979dc043..b10212b3be5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -674,6 +674,37 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   esac
 fi
 
+# Disable D on systems where it is known to not work.
+# For testing, you can override this with --enable-languages=d.
+case ,${enable_languages}, in
+  *,d,*)
+;;
+  *)
+case "${target}" in
+  *-*-darwin* | *-*-cygwin* | *-*-mingw*)
+	unsupported_languages="$unsupported_languages d"
+	;;
+esac
+;;
+esac
+
+# Disable libphobos on unsupported systems.
+# For testing, you can override this with --enable-libphobos.
+if test -d ${srcdir}/libphobos; then
+if test x$enable_libphobos = x; then
+	AC_MSG_CHECKING([for libphobos support])
+	if (srcdir=${srcdir}/libphobos; \
+		. ${srcdir}/configure.tgt; \
+		test -n "$UNSUPPORTED")
+	then
+	AC_MSG_RESULT([no])
+	noconfigdirs="$noconfigdirs target-libphobos"
+	else
+	AC_MSG_RESULT([yes])
+	fi
+fi
+fi
+
 # Disable Fortran for some systems.
 case "${target}" in
   mmix-*-*)
diff --git a/libphobos/configure.tgt b/libphobos/configure.tgt
new file mode 100644
index 000..8afd350c755
--- /dev/null
+++ b/libphobos/configure.tgt
@@ -0,0 +1,36 @@
+# -*- shell-script -*-
+# Copyright (C) 2018 Free Software Foundation, Inc.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# This is the target specific configuration file.  This is invoked by the
+# autoconf generated configure script.  Putting it in a separate shell file
+# lets us skip running autoconf when modifying target specific information.
+
+# Disable the libphobos or libdruntime components on untested or known
+# broken systems.  More targets shall be added after testing.
+case "${target}" in
+  arm*-*-linux*)
+	;;
+  mips*-*-linux*)
+	;;
+  x86_64-*-kfreebsd*-gnu | i?86-*-kfreebsd*-gnu)
+	;;
+  x86_64-*-linux* | i?86-*-linux*)
+	;;
+  *)
+	UNSUPPORTED=1
+	;;
+esac


Re: [patch, fortran] Fix PR 85896, type confusion with min and max

2018-10-30 Thread Janne Blomqvist
On Tue, Oct 30, 2018 at 8:57 PM Thomas Koenig  wrote:

> Hello world,
>
> the attached patchlet fixes a rejects-valid bug by simply ignoring the
> type for max and min during simplification.  This is correct
> because setting the type of a generic intrinsic function has
> no effect.
>
> It is a rare pleasure to fix a bug by removing code only :-)
>
> Regression-tested. OK for trunk?
>

Ok, thanks.

-- 
Janne Blomqvist


Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-30 Thread James Greenhalgh
On Thu, Oct 25, 2018 at 05:53:22AM -0500, Martin Liška wrote:
> On 10/24/18 7:48 PM, Martin Sebor wrote:
> > On 10/24/2018 03:52 AM, Martin Liška wrote:
> >> On 10/23/18 6:31 PM, Martin Sebor wrote:
> >>> On 10/22/2018 07:05 AM, Martin Liška wrote:
>  On 10/16/18 6:57 PM, James Greenhalgh wrote:
> > On Mon, Oct 08, 2018 at 05:34:52AM -0500, Martin Liška wrote:
> >> Hi.
> >>
> >> I'm attaching updated version of the patch.
> >
> > Can't say I'm thrilled by the allocation/free (aarch64_parse_extension
> > allocates, everyone else has to free) responsibilities here.
> 
>  Agreed.
> 
> >
> > If you can clean that up I'd be much happier. The overall patch is OK.
> 
>  I rewrote that to use std::string, hope it's improvement?
> >>>
> >>
> >> Hi Martin
> >>
> >>> If STR below is not nul-terminated the std::string ctor is not
> >>> safe.
> >>
> >> Appreciate the help. The string should be null-terminated, it either comes
> >> from GCC command line or it's a value of an attribute in source code.
> >>
> >>  If it is nul-terminated but LEN is equal to its length
> >>> then the nul assignment should be unnecessary.  If LEN is less
> >>> than its length and the goal is to truncate the string then
> >>> calling resize() would be the right way to do it.  Otherwise,
> >>> assigning a nul to an element into the middle won't truncate
> >>> (it will leave the remaining elements there).  (This may not
> >>> matter if the string isn't appended to after that.)
> >>
> >> That's new for me, I reworked the patch to use resize. Btw. it sounds
> >> a candidate for a new warning ;) ? Must be quite common mistake?
> > 
> > I should have also mentioned that there is a constructor that
> > takes a pointer and a count:
> > 
> >   *invalid_extension = std::string (str, len);
> > 
> > That would be even better than calling resize (sorry about that).
> 
> That's fine, I'm sending an updated patch. Tested just locally as a cross compiler
> in valgrind.
> 
> > 
> > There are lots of opportunities for warnings about misuses of
> > the standard library.  I think we need to first solve
> > the -Wno-system-headers problem (which disables most warnings
> > for standard library headers).
> 
> I see!

OK.

Thanks,
James
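
The std::string point from the quoted discussion can be shown with a short
sketch.  The helper names are hypothetical and not taken from the aarch64
patch; they compare three ways of keeping only the first LEN characters of a
nul-terminated string, assuming LEN does not exceed the string length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Writing a nul into the middle does not truncate: size () is unchanged
// and the remaining characters are still part of the string.
std::string truncate_by_nul (const char *str, std::size_t len)
{
  std::string s (str);
  if (len < s.size ())
    s[len] = '\0';
  return s;
}

// resize () really shortens the string.
std::string truncate_by_resize (const char *str, std::size_t len)
{
  std::string s (str);
  if (len < s.size ())
    s.resize (len);
  return s;
}

// The pointer-and-count constructor copies exactly LEN characters and
// never reads past them.
std::string truncate_by_ctor (const char *str, std::size_t len)
{
  return std::string (str, len);
}
```

For the input "sve+crypto" with a length of 3, the first helper still
returns a 10-character string, while the other two return "sve".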




Re: Turn complete to incomplete types in free_lang_data

2018-10-30 Thread Jan Hubicka
Hi,
this is the variant I re-tested and committed.  It works by adding the type
to the worklist: ordering issues in free lang data are very hard to debug, so
I think it is more robust to avoid introducing more surprises, and we will
definitely also want to simplify function types eventually.  I am now
re-benchmarking builds and will also try firefox/libreoffice without unified
build, which stresses merging a lot more.

lto-bootstrapped/regtested x86_64-linux.

Honza
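
The general shape of the approach (map each complete type to a single
incomplete copy, and queue new copies on a worklist so they are processed
like any other type) can be sketched outside of GCC as follows.  Everything
here is a toy model with hypothetical names; it does not use GCC's tree,
hash_map or free_lang_data_d.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a record type.
struct toy_type
{
  std::string name;
  bool complete;
};

class incomplete_type_cache
{
public:
  // Return the incomplete variant of T, creating it on first use.
  // Types that are already incomplete (or null) are returned as is.
  toy_type *incomplete_of (toy_type *t)
  {
    if (!t || !t->complete)
      return t;
    auto it = map_.find (t);
    if (it == map_.end ())
      {
	std::unique_ptr<toy_type> copy (new toy_type {t->name, false});
	// Newly created types go on the worklist so later processing
	// sees them too, instead of relying on visiting order.
	worklist_.push_back (copy.get ());
	it = map_.emplace (t, std::move (copy)).first;
      }
    return it->second.get ();
  }

  std::size_t pending () const { return worklist_.size (); }

private:
  std::unordered_map<const toy_type *, std::unique_ptr<toy_type>> map_;
  std::vector<toy_type *> worklist_;
};
```

Asking twice for the incomplete variant of the same type yields the same
object, and only one entry is queued.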

* tree.c
(free_lang_data_d, add_tree_to_fld_list, fld_worklist_push): Move
head in file.
(free_lang_data_in_type): Forward declare.
(fld_type_variant_equal_p): New function.
(fld_type_variant): New function
(fld_incomplete_types): New hash.
(fld_incomplete_type_of): New function
(fld_simplfied-type): New function.
(free_lang_data_in_decl): Add fld parameter; simplify type of FIELD_DECL
(free_lang_data): Allocate and free fld_incomplete_type; update call
of free_lang_data_in_decl.

Index: tree.c
===
--- tree.c  (revision 265573)
+++ tree.c  (working copy)
@@ -5037,7 +5037,163 @@ protected_set_expr_location (tree t, loc
   if (CAN_HAVE_LOCATION_P (t))
 SET_EXPR_LOCATION (t, loc);
 }
+
+/* Data used when collecting DECLs and TYPEs for language data removal.  */
+
+struct free_lang_data_d
+{
+  free_lang_data_d () : decls (100), types (100) {}
+
+  /* Worklist to avoid excessive recursion.  */
+  auto_vec<tree> worklist;
+
+  /* Set of traversed objects.  Used to avoid duplicate visits.  */
+  hash_set<tree> pset;
+
+  /* Array of symbols to process with free_lang_data_in_decl.  */
+  auto_vec<tree> decls;
+
+  /* Array of types to process with free_lang_data_in_type.  */
+  auto_vec<tree> types;
+};
+
+
+/* Add type or decl T to one of the list of tree nodes that need their
+   language data removed.  The lists are held inside FLD.  */
+
+static void
+add_tree_to_fld_list (tree t, struct free_lang_data_d *fld)
+{
+  if (DECL_P (t))
+fld->decls.safe_push (t);
+  else if (TYPE_P (t))
+fld->types.safe_push (t);
+  else
+gcc_unreachable ();
+}
+
+/* Push tree node T into FLD->WORKLIST.  */
+
+static inline void
+fld_worklist_push (tree t, struct free_lang_data_d *fld)
+{
+  if (t && !is_lang_specific (t) && !fld->pset.contains (t))
+fld->worklist.safe_push ((t));
+}
+
+
 
+/* Do same comparsion as check_qualified_type skipping lang part of type
+   and be more permissive about type names: we only care that names are
+   same (for diagnostics) and that ODR names are the same.  */
+
+static bool
+fld_type_variant_equal_p (tree t, tree v)
+{
+  if (TYPE_QUALS (t) != TYPE_QUALS (v)
+  || TYPE_NAME (t) != TYPE_NAME (v)
+  || TYPE_ALIGN (t) != TYPE_ALIGN (v)
+  || !attribute_list_equal (TYPE_ATTRIBUTES (t),
+   TYPE_ATTRIBUTES (v)))
+return false;
+
+  return true;
+}
+
+/* Find variant of FIRST that match T and create new one if necessary.  */
+
+static tree
+fld_type_variant (tree first, tree t, struct free_lang_data_d *fld)
+{
+  if (first == TYPE_MAIN_VARIANT (t))
+return t;
+  for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
+if (fld_type_variant_equal_p (t, v))
+  return v;
+  tree v = build_variant_type_copy (first);
+  TYPE_READONLY (v) = TYPE_READONLY (t);
+  TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
+  TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
+  TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
+  TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
+  TYPE_NAME (v) = TYPE_NAME (t);
+  TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
+  add_tree_to_fld_list (v, fld);
+  return v;
+}
+
+/* Map complete types to incomplete types.  */
+
+static hash_map<tree, tree> *fld_incomplete_types;
+
+/* For T being aggregate type try to turn it into a incomplete variant.
+   Return T if no simplification is possible.  */
+
+static tree
+fld_incomplete_type_of (tree t, struct free_lang_data_d *fld)
+{
+  if (!t)
+return NULL;
+  if (POINTER_TYPE_P (t))
+{
+  tree t2 = fld_incomplete_type_of (TREE_TYPE (t), fld);
+  if (t2 != TREE_TYPE (t))
+   {
+ tree first;
+ if (TREE_CODE (t) == POINTER_TYPE)
+   first = build_pointer_type_for_mode (t2, TYPE_MODE (t),
+   TYPE_REF_CAN_ALIAS_ALL (t));
+ else
+   first = build_reference_type_for_mode (t2, TYPE_MODE (t),
+   TYPE_REF_CAN_ALIAS_ALL (t));
+ add_tree_to_fld_list (first, fld);
+ return fld_type_variant (first, t, fld);
+   }
+  return t;
+}
+  if (!RECORD_OR_UNION_TYPE_P (t) || !COMPLETE_TYPE_P (t))
+return t;
+  if (TYPE_MAIN_VARIANT (t) == t)
+{
+  bool existed;
+  tree &copy
+= fld_incomplete_types->get_or_insert (t, &existed);
+
+  if (!existed)
+   {
+ copy = build_distinct_type_copy (t);
+
+ /* It is possible type was not seen by 

Re: [PATCH, testsuite, c-compat] Handle another c/l option not recognised by older GCC or clang.

2018-10-30 Thread Jeff Law
On 10/29/18 1:53 PM, Iain Sandoe wrote:
> Hi
> 
> When using ALT_CC/CXX_UNDER_TEST in the compat/struct-layout-1 tests, the c/l 
> options provided to the “alt” compiler need to avoid latest and greatest GCC 
> capability.  The patch tests to see if the ‘alt’ compiler can handle 
> -fno-diagnostics-show-line-numbers.
> 
> OK for trunk?
> thanks
> Iain
> 
> gcc/testsuite/
> 
>   * lib/c-compat.exp (compat-use-alt-compiler): handle 
> -fno-diagnostics-show-line-numbers.
>   (compat_setup_dfp): Likewise.
OK
jeff


[PATCH 4/4] [og8] Attach / Detach compiler tests

2018-10-30 Thread Cesar Philippidis
This patch introduces a couple of compiler tests for the OpenACC
attach and detach clauses.

I've committed it to openacc-gcc-8-branch.

Cesar
2018-10-30  Cesar Philippidis  

	gcc/testsuite/
	* c-c++-common/goacc/mdc-1.c: New test.
	* c-c++-common/goacc/mdc-2.c: New test.
	* g++.dg/goacc/mdc.C: New test.
---
 gcc/testsuite/c-c++-common/goacc/mdc-1.c | 54 +++
 gcc/testsuite/c-c++-common/goacc/mdc-2.c | 62 +
 gcc/testsuite/g++.dg/goacc/mdc.C | 68 
 3 files changed, 184 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-2.c
 create mode 100644 gcc/testsuite/g++.dg/goacc/mdc.C

diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
new file mode 100644
index 000..c20b94ddbdc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
@@ -0,0 +1,54 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+/* { dg-additional-options "-fdump-tree-omplower" } */
+
+void
+t1 ()
+{
+  struct foo {
+int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z;
+
+#pragma acc enter data copyin(s)
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+{
+  s.e = z;
+#pragma acc parallel loop attach(s.e)
+  for (int i = 0; i < 10; i++)
+s.a[i] = s.e[i];
+
+
+  a = s.e;
+#pragma acc enter data attach(a)
+#pragma acc exit data detach(a)
+}
+
+#pragma acc enter data copyin(a)
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e)
+
+#pragma acc data attach(s.e)
+{
+}
+#pragma acc exit data delete(a)
+
+#pragma acc exit data detach(a) finalize
+#pragma acc exit data detach(s.a) finalize
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.always_pointer:s.a .pointer assign, bias: 0.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.struct:s .len: 1.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.struct:s .len: 1.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:s.a .len: 8.." 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
new file mode 100644
index 000..ebfb99d4caf
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
@@ -0,0 +1,62 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z, scalar, **y;
+
+#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+{
+  s.e = z;
+#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */
+  for (int i = 0; i < 10; i++)
+s.a[i] = s.e[i];
+
+  a = s.e;
+#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(a)
+}
+
+#pragma acc enter data attach(z[:5]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:5]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[1:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[1:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data 

[PATCH 3/4] [og8] Attach / Detach C++ FE changes

2018-10-30 Thread Cesar Philippidis
As noted here, this patch adds support for attach and detach in the
C++ front end.  Unlike trunk, OG8 has some preliminary support for the
'this' pointer.
Consequently, finish_omp_clauses had to take care of a couple more cases
in order to get libgomp.oacc-c++/this.C to work.

I've committed this patch to openacc-gcc-8-branch.

Cesar
2018-10-30  Cesar Philippidis  

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Scan for attach and detach.
	(cp_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_{ATTACH,
	DETACH}.
	(cp_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK): Add support for attach and detach.
	(OACC_ENTER_DATA_CLAUSE_MASK): Likewise.
	(cp_parser_oacc_declare): Likewise.
	(OACC_KERNELS_CLAUSE_MASK): Likewise.
	(OACC_PARALLEL_CLAUSE_MASK): Likewise.
	* semantics.c (handle_omp_array_sections_1): Reject subarrays for
	attach and detach.
	(cp_oacc_check_attachments): New function.
	(finish_omp_clauses): Use it. Also, allow structure fields and
	class members to appear in OpenACC data clauses.
---
 gcc/cp/parser.c| 28 +-
 gcc/cp/semantics.c | 71 +-
 2 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 9a8ec70bb17..8161d6301df 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31266,6 +31266,8 @@ cp_parser_omp_clause_name (cp_parser *parser, bool consume_token = true)
 	result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'b':
 	  if (!strcmp ("bind", p))
@@ -31290,6 +31292,8 @@ cp_parser_omp_clause_name (cp_parser *parser, bool consume_token = true)
 	result = PRAGMA_OMP_CLAUSE_DEFAULTMAP;
 	  else if (!strcmp ("depend", p))
 	result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -31679,11 +31683,13 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list)
 }
 
 /* OpenACC 2.5:
+   attach ( variable-list )
copy ( variable-list )
copyin ( variable-list )
copyout ( variable-list )
create ( variable-list )
delete ( variable-list )
+   detach ( variable-list )
present ( variable-list ) */
 
 static tree
@@ -31693,6 +31699,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
 {
+case PRAGMA_OACC_CLAUSE_ATTACH:
+  kind = GOMP_MAP_ATTACH;
+  break;
 case PRAGMA_OACC_CLAUSE_COPY:
   kind = GOMP_MAP_TOFROM;
   break;
@@ -31708,6 +31717,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
 case PRAGMA_OACC_CLAUSE_DELETE:
   kind = GOMP_MAP_RELEASE;
   break;
+case PRAGMA_OACC_CLAUSE_DETACH:
+  kind = GOMP_MAP_DETACH;
+  break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
   break;
@@ -33851,6 +33863,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 		 clauses, here);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_BIND:
 	  clauses = cp_parser_oacc_clause_bind (parser, clauses);
 	  c_name = "bind";
@@ -33883,6 +33899,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 	  clauses = cp_parser_omp_clause_default (parser, clauses, here, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -36904,10 +36924,12 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
  structured-block  */
 
 #define OACC_DATA_CLAUSE_MASK		\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) )
@@ -37107,6 +37129,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 
 #define OACC_ENTER_DATA_CLAUSE_MASK	\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\

Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)

2018-10-30 Thread Jeff Law
On 10/29/18 3:59 PM, Martin Sebor wrote:
> PR 87041 - -Wformat "reading through null pointer" on unreachable
> code is a complaint about -Wformat false positives due to null
> arguments to %s directives in unreachable printf calls.  The warning
> is issued by the front end, too early to know whether or not the call
> is ever made.
> 
> The -Wformat-overflow has had the ability to detect null pointers
> in %s and similar directives to sprintf calls since GCC 7 without
> these false positives, but the warning doesn't consider stream or
> file I/O functions like printf/fprintf.  To resolve the bug report
> I have enhanced -Wformat-overflow to consider all printf-like
> functions, including user-defined ones declared attribute format
> (printf).
> 
> Besides null pointers the enhancement also makes it possible to
> detect other problems (like out-of-range arguments and output in
> excess of INT_MAX bytes).  It also lays the groundwork for
> checking user-defined printf-like functions for buffer overflow
> (once a suitable attribute is added to indicate which arguments
> are the destination buffer pointer and the buffer size).
> 
> With that, I have removed the null checking from -Wformat (again,
> only for printf-like functions).
> 
> Martin
> 
> gcc-87041.diff
> 
> PR middle-end/87041 - -Wformat reading through null pointer on unreachable 
> code
> 
> gcc/ChangeLog:
> 
>   PR middle-end/87041
>   * gimple-ssa-sprintf.c (format_directive): Use %G to include
>   inlining context.
>   (sprintf_dom_walker::compute_format_length):
>   Avoid setting POSUNDER4K here.
>   (get_destination_size): Handle null argument values.
>   (get_user_idx_format): New function.
>   (sprintf_dom_walker::handle_gimple_call): Handle all printf-like
>   functions, including user-defined with attribute format printf.
>   Use %G to include inlining context.
>   Set POSUNDER4K here.
> 
> gcc/c-family/ChangeLog:
> 
>   PR middle-end/87041
>   * c-format.c (check_format_types): Avoid diagnosing null pointer
>   arguments to printf-family of functions.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR middle-end/87041
>   * gcc.c-torture/execute/fprintf-2.c: New test.
>   * gcc.c-torture/execute/printf-2.c: Same.
>   * gcc.c-torture/execute/user-printf.c: Same.
>   * gcc.dg/tree-ssa/builtin-fprintf-warn-1.c: Same.
>   * gcc.dg/tree-ssa/builtin-printf-2.c: Same.
>   * gcc.dg/tree-ssa/builtin-printf-warn-1.c: Same.
>   * gcc.dg/tree-ssa/user-printf-warn-1.c: Same.
OK.

Note some folks might complain about dropping the warning from the
front-end.  Their (largely reasonable) argument is that warning out of
the front-end is stable across releases and doesn't depend on
optimizations.  Of course the downside of warning out of the front-end
is false positives like we see in this PR.
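
For readers who have not met the attribute, here is a minimal sketch in C
of the kind of user-defined printf-like function the enhanced warning is
meant to cover.  The names user_snprintf and describe are made up for this
illustration and are not taken from the patch or its tests; the null check
in describe stands in for the "unreachable call" situation from the PR.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-defined printf-like wrapper.  The format attribute
   tells the compiler that argument 3 is a printf format string and that
   the variadic arguments to check start at argument 4.  */
__attribute__ ((format (printf, 3, 4)))
static int
user_snprintf (char *dst, size_t size, const char *fmt, ...)
{
  assert (fmt != NULL);
  va_list ap;
  va_start (ap, fmt);
  int n = vsnprintf (dst, size, fmt, ap);
  va_end (ap);
  return n;
}

/* NAME may be null.  The "%s" directive that consumes NAME is only
   reached when NAME is non-null, so a warning that runs before the
   optimizers know this could report a false positive here.  */
static int
describe (char *dst, size_t size, const char *name)
{
  if (name == NULL)
    return user_snprintf (dst, size, "%s", "(none)");
  return user_snprintf (dst, size, "name=%s", name);
}
```

Calling describe with a null name writes "(none)"; calling it with "gcc"
writes "name=gcc".  Whether a particular compiler version warns on code
like this depends on the options and optimization level in use.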

jeff


Re: [AArch64] Add Saphira pipeline description.

2018-10-30 Thread James Greenhalgh
On Tue, Oct 30, 2018 at 05:12:58AM -0500, Sameera Deshpande wrote:
> On Fri, 26 Oct 2018 at 13:33, Sameera Deshpande
>  wrote:
> >
> > Hi!
> >
> > Please find attached the patch to add a pipeline description for the
> > Qualcomm Saphira core.  It is tested with a bootstrap and make check,
> > with no regressions.
> >
> > Ok for trunk?

OK.

I wonder if there's anything we can do to improve maintainability in these
cases where two pipeline models have considerable overlaps. 

Thanks,
James

> >
> > gcc/
> > Changelog:
> >
> > 2018-10-26 Sameera Deshpande 
> >
> > * config/aarch64/aarch64-cores.def (saphira): Use saphira pipeline.
> > * config/aarch64/aarch64.md: Include saphira.md
> > * config/aarch64/saphira.md: New file for pipeline description.
> >
> > --
> > - Thanks and regards,
> >   Sameera D.
> 
> Hi!
> 
> Please find attached updated patch.
> Bootstrap and make check passed without regression. Ok for trunk?
> 
> -- 
> - Thanks and regards,
>   Sameera D.

> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 3d876b8..8e4c646 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -90,7 +90,7 @@ AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2
>  /* ARMv8.4-A Architecture Processors.  */
>  
>  /* Qualcomm ('Q') cores. */
> -AARCH64_CORE("saphira", saphira,falkor,8_4A,  
> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
> 0x51, 0xC01, -1)
> +AARCH64_CORE("saphira", saphira,saphira,8_4A,  
> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
> 0x51, 0xC01, -1)
>  
>  /* ARMv8-A big.LITTLE implementations.  */
>  
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index a014a01..f951354 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -298,6 +298,7 @@
>  (include "../arm/cortex-a57.md")
>  (include "../arm/exynos-m1.md")
>  (include "falkor.md")
> +(include "saphira.md")
>  (include "thunderx.md")
>  (include "../arm/xgene1.md")
>  (include "thunderx2t99.md")
> diff --git a/gcc/config/aarch64/saphira.md b/gcc/config/aarch64/saphira.md
> new file mode 100644
> index 000..bbf1c5c
> --- /dev/null
> +++ b/gcc/config/aarch64/saphira.md
> @@ -0,0 +1,583 @@
> +;; Saphira pipeline description
> +;; Copyright (C) 2017-2018 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but
> +;; WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +;; General Public License for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_automaton "saphira")
> +
> +;; Complex int instructions (e.g. multiply and divide) execute in the X
> +;; pipeline.  Simple int instructions execute in the X, Y, Z and B pipelines.
> +
> +(define_cpu_unit "saphira_x" "saphira")
> +(define_cpu_unit "saphira_y" "saphira")
> +
> +;; Branches execute in the Z or B pipeline or in one of the int pipelines 
> depending
> +;; on how complex it is.  Simple int insns (like movz) can also execute here.
> +
> +(define_cpu_unit "saphira_z" "saphira")
> +(define_cpu_unit "saphira_b" "saphira")
> +
> +;; Vector and FP insns execute in the VX and VY pipelines.
> +
> +(define_automaton "saphira_vfp")
> +
> +(define_cpu_unit "saphira_vx" "saphira_vfp")
> +(define_cpu_unit "saphira_vy" "saphira_vfp")
> +
> +;; Loads execute in the LD pipeline.
> +;; Stores execute in the ST pipeline, for address, data, and
> +;; vector data.
> +
> +(define_automaton "saphira_mem")
> +
> +(define_cpu_unit "saphira_ld" "saphira_mem")
> +(define_cpu_unit "saphira_st" "saphira_mem")
> +
> +;; The GTOV and VTOG pipelines are for general to vector reg moves, and vice
> +;; versa.
> +
> +(define_cpu_unit "saphira_gtov" "saphira")
> +(define_cpu_unit "saphira_vtog" "saphira")
> +
> +;; Common reservation combinations.
> +
> +(define_reservation "saphira_vxvy" "saphira_vx|saphira_vy")
> +(define_reservation "saphira_zb"   "saphira_z|saphira_b")
> +(define_reservation "saphira_xyzb" "saphira_x|saphira_y|saphira_z|saphira_b")
> +
> +;; SIMD Floating-Point Instructions
> +
> +(define_insn_reservation "saphira_afp_1_vxvy" 1
> +  (and (eq_attr "tune" "saphira")
> +   (eq_attr "type" 
> "neon_fp_neg_s,neon_fp_neg_d,neon_fp_abs_s,neon_fp_abs_d,neon_fp_neg_s_q,neon_fp_neg_d_q,neon_fp_abs_s_q,neon_fp_abs_d_q"))
> +  "saphira_vxvy")
> +
> +(define_insn_reservation 

[PATCH 2/4] [og8] Attach / Detach C FE changes

2018-10-30 Thread Cesar Philippidis
As noted here, this patch
adds support for attach and detach in the C front end. The only major
difference between this and the trunk patch is that OG8 supports the acc
routine bind clause, so the trunk patch didn't apply cleanly. Other than
that, these patches are identical.

I've committed this patch to openacc-gcc-8-branch.

Cesar
2018-10-30  Cesar Philippidis  

	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Scan for attach and detach.
	(c_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_{ATTACH,
	DETACH}.
	(c_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK): Add support for attach and detach.
	(OACC_ENTER_DATA_CLAUSE_MASK): Likewise.
	(OACC_KERNELS_CLAUSE_MASK): Likewise.
	(OACC_PARALLEL_CLAUSE_MASK): Likewise.
	* c-typeck.c (handle_omp_array_sections_1): Reject subarrays for
	attach and detach.
	(c_oacc_check_attachments): New function.
	(c_finish_omp_clauses): Use it.
---
 gcc/c/c-parser.c | 27 +++-
 gcc/c/c-typeck.c | 55 +---
 2 files changed, 78 insertions(+), 4 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 578c0660c54..ffc5fe9b0d3 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11226,6 +11226,8 @@ c_parser_omp_clause_name (c_parser *parser, bool consume_token = true)
 	result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'b':
 	  if (!strcmp ("bind", p))
@@ -11252,6 +11254,8 @@ c_parser_omp_clause_name (c_parser *parser, bool consume_token = true)
 	result = PRAGMA_OACC_CLAUSE_DELETE;
 	  else if (!strcmp ("depend", p))
 	result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -11675,11 +11679,13 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
 }
 
 /* OpenACC 2.5:
+   attach (variable-list )
copy ( variable-list )
copyin ( variable-list )
copyout ( variable-list )
create ( variable-list )
delete ( variable-list )
+   detach ( variable-list )
present ( variable-list ) */
 
 static tree
@@ -11689,6 +11695,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
 {
+case PRAGMA_OACC_CLAUSE_ATTACH:
+  kind = GOMP_MAP_ATTACH;
+  break;
 case PRAGMA_OACC_CLAUSE_COPY:
   kind = GOMP_MAP_TOFROM;
   break;
@@ -11704,6 +11713,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
 case PRAGMA_OACC_CLAUSE_DELETE:
   kind = GOMP_MAP_RELEASE;
   break;
+case PRAGMA_OACC_CLAUSE_DETACH:
+  kind = GOMP_MAP_DETACH;
+  break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
   break;
@@ -14083,6 +14095,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 		 clauses);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_BIND:
 	  clauses = c_parser_oacc_clause_bind (parser, clauses);
 	  c_name = "bind";
@@ -14115,6 +14131,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_omp_clause_default (parser, clauses, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -14589,7 +14609,8 @@ c_parser_oacc_cache (location_t loc, c_parser *parser)
 */
 
 #define OACC_DATA_CLAUSE_MASK		\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -14773,6 +14794,7 @@ c_parser_oacc_declare (c_parser *parser)
 #define OACC_ENTER_DATA_CLAUSE_MASK	\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
@@ -14782,6 +14804,7 @@ c_parser_oacc_declare (c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << 

[PATCH 1/4] [og8] Attach / Detach generic infrastructure

2018-10-30 Thread Cesar Philippidis
As mentioned here, this patch series adds support for the new attach /
detach clauses introduced in OpenACC 2.6 to the C and C++ front ends.

There is one notable difference between this patch and the one I posted
for trunk. This patch tweaks GOMP_MAP_DEEP_COPY because OG8 has a lot of
other map types for acc declare and dynamic arrays. I suspect that
change would be required for trunk too, eventually.
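
As a rough illustration of what the attach / detach clauses are for, the
sketch below does a manual deep copy of a struct with a pointer member.
It is modelled loosely on the mdc-1.c test from this series; struct vec
and double_and_sum are hypothetical names, and the exact clause
combination is an assumption that has not been verified on an
accelerator.  Compiled without -fopenacc the pragmas are ignored and the
loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate with a pointer member.  */
struct vec
{
  int *data;
  size_t len;
};

/* Double every element of HOST[0..N-1] and return the sum of the
   doubled values.  */
static long
double_and_sum (int *host, size_t n)
{
  assert (host != NULL);
  struct vec s = { host, n };
  long sum = 0;

  /* Copy the struct itself; its pointer member still holds a host
     address on the device at this point.  */
#pragma acc enter data copyin(s)
  /* Copy the array the pointer member refers to.  */
#pragma acc enter data copyin(host[0:n])
  /* Make the device copy of s.data point at the device copy of the
     array.  */
#pragma acc enter data attach(s.data)

#pragma acc parallel loop present(s) reduction(+:sum)
  for (size_t i = 0; i < s.len; i++)
    {
      s.data[i] *= 2;
      sum += s.data[i];
    }

  /* Undo the attach before copying the array back and releasing s.  */
#pragma acc exit data detach(s.data)
#pragma acc exit data copyout(host[0:n])
#pragma acc exit data delete(s)
  return sum;
}
```

For an input of {1, 2, 3} the function leaves {2, 4, 6} in the array and
returns 12 when run on the host.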

I've committed this patch to openacc-gcc-8-branch.

Cesar
2018-10-30  Cesar Philippidis  

	gcc/
	* gimplify.c (gimplify_adjust_omp_clauses): Filter out
	GOMP_MAP_STRUCT for acc exit data.
	(gimplify_omp_target_update): Promote GOMP_MAP_DETACH
	to GOMP_MAP_FORCE_DETACH when the finalize clause is present.
	* omp-low.c (lower_omp_target): Add support for GOMP_MAP_{ATTACH,
	DETACH, FORCE_DETACH}.
	* tree-pretty-print.c (dump_omp_clause): Likewise.

	gcc/c-family/
	* c-pragma.h (enum pragma_omp_clause): Define
	PRAGMA_OACC_CLAUSE_{ATTACH,DETACH}.

	include/
	* gomp-constants.h (GOMP_MAP_DEEP_COPY): Define.
	(enum gomp_map_kind): Add GOMP_MAP_{ATTACH, DETACH, FORCE_DETACH}.
---
 gcc/c-family/c-pragma.h  |  2 ++
 gcc/gimplify.c   | 12 +---
 gcc/omp-low.c|  3 +++
 gcc/tree-pretty-print.c  |  9 +
 include/gomp-constants.h |  9 +
 5 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 8b392486615..bce915187c1 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -131,12 +131,14 @@ enum pragma_omp_clause {
 
   /* Clauses for OpenACC.  */
   PRAGMA_OACC_CLAUSE_ASYNC,
+  PRAGMA_OACC_CLAUSE_ATTACH,
   PRAGMA_OACC_CLAUSE_AUTO,
   PRAGMA_OACC_CLAUSE_BIND,
   PRAGMA_OACC_CLAUSE_COPY,
   PRAGMA_OACC_CLAUSE_COPYOUT,
   PRAGMA_OACC_CLAUSE_CREATE,
   PRAGMA_OACC_CLAUSE_DELETE,
+  PRAGMA_OACC_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_DEVICEPTR,
   PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
   PRAGMA_OACC_CLAUSE_DEVICE_TYPE,
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index fda0d69caf7..9be0b70fc7f 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9468,7 +9468,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 		}
 	}
 	  else if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_STRUCT
-		   && code == OMP_TARGET_EXIT_DATA)
+		   && (code == OMP_TARGET_EXIT_DATA
+		   || code == OACC_EXIT_DATA))
 	remove = true;
 	  else if (DECL_SIZE (decl)
 		   && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST
@@ -11156,8 +11157,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
 			   OMP_CLAUSE_FINALIZE))
 {
-  /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
-	 semantics apply to all mappings of this OpenACC directive.  */
+  /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
+	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
+	 to all mappings of this OpenACC directive.  */
   bool finalize_marked = false;
   for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
@@ -11171,6 +11173,10 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	  OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
 	  finalize_marked = true;
 	  break;
+	case GOMP_MAP_DETACH:
+	  OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
+	  finalize_marked = true;
+	  break;
 	default:
 	  /* Check consistency: libgomp relies on the very first data
 		 mapping clause being marked, so make sure we did that before
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index a219b825488..e559211f413 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -8185,6 +8185,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
 	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
+	  case GOMP_MAP_ATTACH:
+	  case GOMP_MAP_DETACH:
+	  case GOMP_MAP_FORCE_DETACH:
 	gcc_assert (is_gimple_omp_oacc (stmt));
 	break;
 	  default:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 05a163d8956..ecbb51646b0 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -778,6 +778,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	case GOMP_MAP_DECLARE_DEALLOCATE:
 	  pp_string (pp, "declare_deallocate");
 	  break;
+	case GOMP_MAP_ATTACH:
+	  pp_string (pp, "attach");
+	  break;
+	case GOMP_MAP_DETACH:
+	  pp_string (pp, "detach");
+	  break;
+	case GOMP_MAP_FORCE_DETACH:
+	  pp_string (pp, "force_detach");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index 9ef51c04994..c6cd48805e0 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -44,6 +44,8 @@
 #define GOMP_MAP_FLAG_SPECIAL_4		(1 << 6)
 #define GOMP_MAP_FLAG_SPECIAL		

Re: [PATCH] xfail ira-shrink-wrap-prep tests (PR87708)

2018-10-30 Thread Jeff Law
On 10/30/18 11:59 AM, Segher Boessenkool wrote:
> After r265398, the ira-shrinkwrap-prep-[12].c tests fail on all
> targets, because the IRA feature tested can only move hard registers
> down, and we no longer have hard registers for the function parameters
> at this stage.
> 
> Is this okay for trunk?
> 
> 
> 2018-10-30  Segher Boessenkool  
> 
> gcc/testsuite/
>   PR rtl-optimization/87708
>   gcc.dg/ira-shrinkwrap-prep-1.c: xfail test.
>   gcc.dg/ira-shrinkwrap-prep-2.c: xfail test.
OK
jeff


[patch, fortran] Fix PR 85896, type confusion with min and max

2018-10-30 Thread Thomas Koenig

Hello world,

the attached patchlet fixes a rejects-valid bug by simply ignoring the
type for max and min during simplification.  This is correct
because setting the type of a generic intrinsic function has
no effect.

It is a rare pleasure to fix a bug by removing code only :-)

Regression-tested. OK for trunk?

Regards

Thomas

2018-10-30  Thomas Koenig  

PR fortran/85896
* simplify.c (simplify_min_max): Do not convert the type of the
return expression.

2018-10-30  Thomas Koenig  

PR fortran/85896
* gfortran.dg/min_max_type.f90: New test.
Index: simplify.c
===
--- simplify.c	(revision 265502)
+++ simplify.c	(working copy)
@@ -4961,11 +4961,9 @@ static gfc_expr *
 simplify_min_max (gfc_expr *expr, int sign)
 {
   gfc_actual_arglist *arg, *last, *extremum;
-  gfc_intrinsic_sym * specific;
 
   last = NULL;
   extremum = NULL;
-  specific = expr->value.function.isym;
 
   arg = expr->value.function.actual;
 
@@ -4995,15 +4993,6 @@ simplify_min_max (gfc_expr *expr, int sign)
   if (expr->value.function.actual->next != NULL)
 return NULL;
 
-  /* Convert to the correct type and kind.  */
-  if (expr->ts.type != BT_UNKNOWN)
-return gfc_convert_constant (expr->value.function.actual->expr,
-	expr->ts.type, expr->ts.kind);
-
-  if (specific->ts.type != BT_UNKNOWN)
-return gfc_convert_constant (expr->value.function.actual->expr,
-	specific->ts.type, specific->ts.kind);
-
   return gfc_copy_expr (expr->value.function.actual->expr);
 }
 


Re: [PATCH v2] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ilya Leoshkevich


> On 30.10.2018 at 18:22, Ulrich Weigand wrote:
> 
> Ilya Leoshkevich wrote:
> 
>> @@ -8223,6 +8237,18 @@ find_constant_pool_ref (rtx x, rtx *ref)
>>   && XINT (x, 1) == UNSPECV_POOL_ENTRY)
>> return;
>> 
>> +  if (SYMBOL_REF_P (x)
>> +  && CONSTANT_POOL_ADDRESS_P (x)
>> +  && s390_symbol_larl_p (x))
>> +{
>> +  if (*ref == NULL_RTX)
>> +*ref = x;
>> +  else
>> +gcc_assert (*ref == x);
>> +
>> +  return;
>> +}
> 
> This definitely looks wrong.  If we haven't annotated the address,
> it should *not* be found by find_constant_pool_ref, since we are
> not going to replace it!  That was the whole point of not annotating
> it in the first place …

There are two use cases for find_constant_pool_ref ().  One is indeed
replacing annotated references.  The other (in s390_mainpool_start ()
and s390_chunkify_start ()) is creating pool entries.  So I've decided
to let it find unannotated references for the second use case.

This impacts the first use case as well, that's why I have also changed
replace_constant_pool_ref () to ignore unannotated references.
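
To make the split between the two use cases concrete, here is a toy model
in C.  It is only a sketch: struct operand, find_pool_ref and
replace_pool_ref are hypothetical stand-ins, not the s390 back end's rtx
walkers, and the real functions operate on RTL rather than on an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one operand of an insn.  */
struct operand
{
  const void *pool_sym;  /* Constant-pool symbol referenced, or NULL.  */
  int annotated;         /* Nonzero if the reference was annotated.  */
};

/* Use case 2 (creating pool entries): record in *REF the pool symbol
   referenced by OPS[0..N-1], annotated or not.  All references found
   must name the same symbol.  */
static void
find_pool_ref (const struct operand *ops, size_t n, const void **ref)
{
  for (size_t i = 0; i < n; i++)
    {
      if (ops[i].pool_sym == NULL)
        continue;
      if (*ref == NULL)
        *ref = ops[i].pool_sym;
      else
        assert (*ref == ops[i].pool_sym);
    }
}

/* Use case 1 (replacing references): rewrite only the annotated
   references to REF; unannotated ones are left alone.  Return the
   number of operands rewritten.  */
static int
replace_pool_ref (struct operand *ops, size_t n, const void *ref,
                  const void *replacement)
{
  int count = 0;
  for (size_t i = 0; i < n; i++)
    if (ops[i].pool_sym == ref && ops[i].annotated)
      {
        ops[i].pool_sym = replacement;
        count++;
      }
  return count;
}
```

With one unannotated and one annotated reference to the same symbol, the
finder reports the symbol while the replacer rewrites only the annotated
operand.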


Remove duplicate line in gcov's output

2018-10-30 Thread Eric Botcazou
...coming from a duplicate line in gcov's source code.

Tested on x86_64-suse-linux, applied on mainline as obvious.


2018-10-30  Eric Botcazou  

* gcov.c (output_lines): Remove duplicate line.

-- 
Eric Botcazou
Index: gcov.c
===
--- gcov.c	(revision 265582)
+++ gcov.c	(working copy)
@@ -2951,8 +2951,6 @@ output_lines (FILE *gcov_file, const sou
 	 SGR_SEQ (COLOR_BG_GREEN) "> 10%" SGR_RESET "\n");
 
   fprintf (gcov_file, DEFAULT_LINE_START "Source:%s\n", src->coverage.name);
-
-  fprintf (gcov_file, DEFAULT_LINE_START "Source:%s\n", src->coverage.name);
   if (!multiple_files)
 {
   fprintf (gcov_file, DEFAULT_LINE_START "Graph:%s\n", bbg_file_name);


[PATCH] xfail ira-shrink-wrap-prep tests (PR87708)

2018-10-30 Thread Segher Boessenkool
After r265398, the ira-shrinkwrap-prep-[12].c tests fail on all
targets, because the IRA feature tested can only move hard registers
down, and we no longer have hard registers for the function parameters
at this stage.

Is this okay for trunk?


2018-10-30  Segher Boessenkool  

gcc/testsuite/
PR rtl-optimization/87708
gcc.dg/ira-shrinkwrap-prep-1.c: xfail test.
gcc.dg/ira-shrinkwrap-prep-2.c: xfail test.

---
 gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c | 6 +++---
 gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c 
b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
index 24ea45f..f290b9c 100644
--- a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-1.c
@@ -24,6 +24,6 @@ bar (long a)
   return r;
 }
 
-/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } 
} */
-/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
-/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue"  
} } */
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira" } } 
*/
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira" { xfail 
*-*-* } } } */
+/* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" 
{ xfail powerpc*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c 
b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
index a23ac4e..6212c95 100644
--- a/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
+++ b/gcc/testsuite/gcc.dg/ira-shrinkwrap-prep-2.c
@@ -29,7 +29,7 @@ bar (long a)
   return r;
 }
 
-/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira"  } 
} */
-/* { dg-final { scan-rtl-dump "Split live-range of register" "ira"  } } */
+/* { dg-final { scan-rtl-dump "Will split live ranges of parameters" "ira" } } 
*/
+/* { dg-final { scan-rtl-dump "Split live-range of register" "ira" { xfail 
*-*-* } } } */
 /* XFAIL due to PR70681.  */ 
 /* { dg-final { scan-rtl-dump "Performing shrink-wrapping" "pro_and_epilogue" 
{ xfail arm*-*-* powerpc*-*-* } } } */
-- 
1.8.3.1



Re: Fix D compilation on Solaris

2018-10-30 Thread Iain Buclaw
On Tue, 30 Oct 2018 at 14:13, Rainer Orth  wrote:
>
> Rainer Orth  writes:
>
> > * On sparc, I didn't get that far, unfortunately: as I mentioned, many
> >   compilations die with SIGBUS:
> >
> > libtool: compile:  /var/gcc/regression/trunk/11.5-gcc/build/./gcc/gdc 
> > -B/var/gcc/regression/trunk/11.5-gcc/build/./gcc/ 
> > -B/vol/gcc/sparc-sun-solaris2.11/bin/ -B/vol/gcc/sparc-sun-solaris2.11/lib/ 
> > -isystem /vol/gcc/sparc-sun-solaris2.11/include -isystem 
> > /vol/gcc/sparc-sun-solaris2.11/sys-include -fno-checking -fPIC -O2 -g 
> > -nostdinc -I /vol/gcc/src/hg/trunk/local/libphobos/libdruntime -I . -c 
> > /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d 
> > -fversion=Shared -o core/.libs/thread.o
> > d21: internal compiler error: Bus Error
> > 0xbb5507 crash_signal
> > /vol/gcc/src/hg/trunk/local/gcc/toplev.c:325
> > 0x518700 IntegerExp::toInteger()
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.c:2943
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> > 0x4d1543 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6017
> > 0x4d1543 interpret(Statement*, InterState*)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6024
> > 0x4d263b interpretFunction
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:906
> > 0x4d263b interpretFunction
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:726
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> > 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> > 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> > 0x5243d7 DeclarationExp::accept(Visitor*)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.h:661
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> > 0x4d1543 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6017
> > 0x4d1543 interpret(Statement*, InterState*)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6024
> > 0x4d263b interpretFunction
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:906
> > 0x4d263b interpretFunction
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:726
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> > 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> > 0x4d05c3 interpret
> > /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> >
> >   Will need to dig further here.
>
> It's exactly as I suspected:
>
> $ d21 /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d 
> -mcpu=v9 -fversion=Shared -I 
> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime -o thread.s
>

Hi Rainer,

Thanks for looking into this.

My first suspect here would be 'struct UnionExp', see d/dmd/expression.h

Upstream dmd uses a poor man's alignment, which from what I recall is
there to be compatible with the dmc compiler.

// Ensure that the union is suitably aligned.
real_t for_alignment_only;

What happens if you were to replace that with marking the type as
__attribute__ ((aligned (8))) ?
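
To make the comparison concrete, here is a minimal sketch of the two approaches. It is not the real UnionExp: the 24-byte payload, the use of long double in place of real_t, and the names exp_by_member, exp_by_attr, member_alignment and attr_alignment are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for UnionExp: the alignment comes from a dummy
   member, so it is whatever the target gives that member's type.
   long double is an assumption here; the real member is a real_t.  */
union exp_by_member
{
  char payload[24];
  long double for_alignment_only;
};

/* Same payload, but the alignment is requested explicitly.  */
union exp_by_attr
{
  char payload[24];
} __attribute__ ((aligned (8)));

/* Report the alignment each variant ends up with.  */
size_t
member_alignment (void)
{
  return _Alignof (union exp_by_member);
}

size_t
attr_alignment (void)
{
  return _Alignof (union exp_by_attr);
}
```

With the attribute the alignment no longer depends on how the target aligns the dummy member, which is the property the question above is probing.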

-- 
Iain


Re: [PATCH 2/2] asm inline

2018-10-30 Thread Marek Polacek
On Tue, Oct 30, 2018 at 05:30:34PM +, Segher Boessenkool wrote:
> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c
> @@ -10064,7 +10064,7 @@ build_asm_stmt (tree cv_qualifier, tree args)
> are subtly different.  We use a ASM_EXPR node to represent this.  */
>  tree
>  build_asm_expr (location_t loc, tree string, tree outputs, tree inputs,
> - tree clobbers, tree labels, bool simple)
> + tree clobbers, tree labels, bool simple, bool is_inline)

The new parameter should probably be described in the comment above.

Marek


[PATCH 2/2] asm inline

2018-10-30 Thread Segher Boessenkool
The Linux kernel people want a feature that makes GCC pretend some
inline assembler code is tiny (while it would think it is huge), so
that such code will be inlined essentially always instead of
essentially never.

This patch lets you say "asm inline" instead of just "asm", with the
result that that inline assembler is always counted as minimum cost
for inlining.  It implements this for C and C++.
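
As a rough illustration of the syntax, assuming a compiler that includes this patch (older compilers reject the inline qualifier), a function using it might look like the sketch below. The function add_one is made up for the example, and the alternate keyword spellings are used only so the snippet also compiles in strict ISO modes.

```c
/* Hypothetical example function, not part of the patch.  */
static inline int
add_one (int x)
{
  /* An empty asm that only ties X to a register.  The inline qualifier
     asks the inliner to count this statement at minimum size, however
     long the template string is.  */
  __asm__ __inline__ ("" : "+r" (x));
  return x + 1;
}
```

The qualifier changes only the size estimate used for inlining decisions; the asm itself is emitted as before.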


2018-10-30  Segher Boessenkool  

* doc/extend.texi (Using Assembly Language with C): Document asm inline.
(Size of an asm): Fix typo.  Document asm inline.
* gimple-pretty-print.c (dump_gimple_asm): Handle asm inline.
* gimple.h (enum gf_mask): Add GF_ASM_INLINE.
(gimple_asm_set_volatile): Fix typo.
* gimple_asm_inline_p: New.
* gimple_asm_set_inline: New.
* gimplify.c (gimplify_asm_expr): Propagate the asm inline flag from
tree to gimple.
* ipa-icf-gimple.c (func_checker::compare_gimple_asm): Compare the
gimple_asm_inline_p flag, too.
* tree-core.h (tree_base): Document that protected_flag is ASM_INLINE_P
in an ASM_EXPR.
* tree-inline.c (estimate_num_insns): If gimple_asm_inline_p return
a minimum size for an asm.
* tree.h (ASM_INLINE_P): New.

gcc/c/
* c-parser.c (c_parser_asm_statement): Detect the inline keyword
after asm.  Pass a flag for it to build_asm_expr.
* c-tree.h (build_asm_expr): Update declaration.
* c-typeck.c (build_asm_stmt): Add is_inline parameter.  Use it to
set ASM_INLINE_P.

gcc/cp/
* cp-tree.h (finish_asm_stmt): Update declaration.
* parser.c (cp_parser_asm_definition): Detect the inline keyword
after asm.  Pass a flag for it to finish_asm_stmt.
* pt.c (tsubst_expr): Pass the ASM_INLINE_P flag to finish_asm_stmt.
* semantics.c (finish_asm_stmt): Add inline_p parameter.  Use it to
set ASM_INLINE_P.

gcc/testsuite/
* c-c++-common/torture/asm-inline.c: New testcase.

---
 gcc/c/c-parser.c| 15 +--
 gcc/c/c-tree.h  |  3 +-
 gcc/c/c-typeck.c|  3 +-
 gcc/cp/cp-tree.h|  2 +-
 gcc/cp/parser.c | 15 ++-
 gcc/cp/pt.c |  2 +-
 gcc/cp/semantics.c  |  3 +-
 gcc/doc/extend.texi | 10 -
 gcc/gimple-pretty-print.c   |  2 +
 gcc/gimple.h| 24 ++-
 gcc/gimplify.c  |  1 +
 gcc/ipa-icf-gimple.c|  3 ++
 gcc/testsuite/c-c++-common/torture/asm-inline.c | 53 +
 gcc/tree-core.h |  3 ++
 gcc/tree-inline.c   |  3 ++
 gcc/tree.h  |  3 ++
 16 files changed, 133 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/torture/asm-inline.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index ce9921e..b28b712 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -6283,11 +6283,12 @@ c_parser_for_statement (c_parser *parser, bool ivdep, 
unsigned short unroll,
 }
 
 /* Parse an asm statement, a GNU extension.  This is a full-blown asm
-   statement with inputs, outputs, clobbers, and volatile and goto tag
-   allowed.
+   statement with inputs, outputs, clobbers, and volatile, inline, and goto
+   tags allowed.
 
asm-qualifier:
  type-qualifier
+ inline
  goto
 
asm-qualifier-list:
@@ -6315,7 +6316,7 @@ static tree
 c_parser_asm_statement (c_parser *parser)
 {
   tree quals, str, outputs, inputs, clobbers, labels, ret;
-  bool simple, is_goto;
+  bool simple, is_inline, is_goto;
   location_t asm_loc = c_parser_peek_token (parser)->location;
   int section, nsections;
 
@@ -6323,6 +6324,7 @@ c_parser_asm_statement (c_parser *parser)
   c_parser_consume_token (parser);
 
   quals = NULL_TREE;
+  is_inline = false;
   is_goto = false;
   for (bool done = false; !done; )
 switch (c_parser_peek_token (parser)->keyword)
@@ -6340,6 +6342,10 @@ c_parser_asm_statement (c_parser *parser)
c_parser_peek_token (parser)->value);
c_parser_consume_token (parser);
break;
+  case RID_INLINE:
+   is_inline = true;
+   c_parser_consume_token (parser);
+   break;
   case RID_GOTO:
is_goto = true;
c_parser_consume_token (parser);
@@ -6423,7 +6429,8 @@ c_parser_asm_statement (c_parser *parser)
 c_parser_skip_to_end_of_block_or_statement (parser);
 
   ret = build_asm_stmt (quals, build_asm_expr (asm_loc, str, outputs, inputs,
-  clobbers, labels, simple));
+  clobbers, labels, simple,
+  

[PATCH 0/2] asm qualifiers (PR55681) and asm input

2018-10-30 Thread Segher Boessenkool
Hi!

This is the same "asm input" patch as before, but now preceded by a patch
that makes all orderings of volatile/goto/inline valid, also the other type
qualifiers for C, and also repetitions for C.

Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?


Segher


 gcc/c/c-parser.c| 79 -
 gcc/c/c-tree.h  |  3 +-
 gcc/c/c-typeck.c|  3 +-
 gcc/cp/cp-tree.h|  2 +-
 gcc/cp/parser.c | 92 +
 gcc/cp/pt.c |  2 +-
 gcc/cp/semantics.c  |  3 +-
 gcc/doc/extend.texi | 10 ++-
 gcc/gimple-pretty-print.c   |  2 +
 gcc/gimple.h| 24 ++-
 gcc/gimplify.c  |  1 +
 gcc/ipa-icf-gimple.c|  3 +
 gcc/testsuite/c-c++-common/torture/asm-inline.c | 53 ++
 gcc/tree-core.h |  3 +
 gcc/tree-inline.c   |  3 +
 gcc/tree.h  |  3 +
 16 files changed, 221 insertions(+), 65 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/torture/asm-inline.c

-- 
1.8.3.1



[PATCH 1/2] asm qualifiers (PR55681)

2018-10-30 Thread Segher Boessenkool
PR55681 observes that currently only one qualifier is allowed for
inline asm, so that e.g. "volatile asm" is allowed, "const asm" is also
okay (with a warning), but "const volatile asm" gives an error.  Also
"const const asm" is an error (while "const const int" is okay for C),
"goto" has to be last, and "_Atomic" isn't handled at all.

This patch fixes all these.  It allows any order of qualifiers (and
goto), allows them all for C, allows duplications for C.  For C++ it
still allows only a single volatile and single goto, but in any order.
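
The C behaviour described above can be modelled outside the compiler with a small stand-alone loop. This is only a sketch: the token enum, struct asm_quals and scan_asm_quals are hypothetical names, and the real parser works on RID_* keywords and emits a warning where this sketch just counts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds standing in for the parser's keywords.  */
enum tok
{
  TOK_VOLATILE, TOK_CONST, TOK_RESTRICT, TOK_ATOMIC, TOK_GOTO,
  TOK_OPEN_PAREN
};

struct asm_quals
{
  bool is_volatile;
  bool is_goto;
  int ignored;   /* Qualifiers accepted but otherwise ignored.  */
};

/* Consume qualifiers in any order, repetitions allowed, until the first
   token that is not a qualifier.  Returns the number of tokens consumed.  */
size_t
scan_asm_quals (const enum tok *toks, size_t n, struct asm_quals *out)
{
  size_t i = 0;

  out->is_volatile = false;
  out->is_goto = false;
  out->ignored = 0;

  for (bool done = false; !done && i < n; )
    switch (toks[i])
      {
      case TOK_VOLATILE:
        out->is_volatile = true;
        i++;
        break;
      case TOK_CONST:
      case TOK_RESTRICT:
      case TOK_ATOMIC:
        out->ignored++;
        i++;
        break;
      case TOK_GOTO:
        out->is_goto = true;
        i++;
        break;
      default:
        done = true;
      }

  return i;
}
```

Because every case falls back into the same loop, no ordering between goto and the type qualifiers is imposed, and a repeated qualifier is simply seen twice.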


2018-10-30  Segher Boessenkool  

gcc/c/
PR inline-asm/55681
* c-parser.c (c_parser_for_statement): Update grammar.  Allow any
combination of type-qualifiers and goto in any order, with repetitions
allowed.

gcc/cp/
PR inline-asm/55681
* parser.c (cp_parser_using_directive): Update grammar.  Allow any
combination of volatile and goto in any order, without repetitions.

---
 gcc/c/c-parser.c | 66 +++-
 gcc/cp/parser.c  | 77 +---
 2 files changed, 89 insertions(+), 54 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index ee66ce8..ce9921e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -6283,23 +6283,31 @@ c_parser_for_statement (c_parser *parser, bool ivdep, 
unsigned short unroll,
 }
 
 /* Parse an asm statement, a GNU extension.  This is a full-blown asm
-   statement with inputs, outputs, clobbers, and volatile tag
+   statement with inputs, outputs, clobbers, and volatile and goto tag
allowed.
 
+   asm-qualifier:
+ type-qualifier
+ goto
+
+   asm-qualifier-list:
+ asm-qualifier-list asm-qualifier
+ asm-qualifier
+
asm-statement:
- asm type-qualifier[opt] ( asm-argument ) ;
- asm type-qualifier[opt] goto ( asm-goto-argument ) ;
+ asm asm-qualifier-list[opt] ( asm-argument ) ;
 
asm-argument:
  asm-string-literal
  asm-string-literal : asm-operands[opt]
  asm-string-literal : asm-operands[opt] : asm-operands[opt]
- asm-string-literal : asm-operands[opt] : asm-operands[opt] : 
asm-clobbers[opt]
-
-   asm-goto-argument:
+ asm-string-literal : asm-operands[opt] : asm-operands[opt] \
+   : asm-clobbers[opt]
  asm-string-literal : : asm-operands[opt] : asm-clobbers[opt] \
: asm-goto-operands
 
+   The form with asm-goto-operands is valid if and only if the
+   asm-qualifier-list contains goto, and is the only allowed form in that case.
Qualifiers other than volatile are accepted in the syntax but
warned for.  */
 
@@ -6313,30 +6321,32 @@ c_parser_asm_statement (c_parser *parser)
 
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_ASM));
   c_parser_consume_token (parser);
-  if (c_parser_next_token_is_keyword (parser, RID_VOLATILE))
-{
-  quals = c_parser_peek_token (parser)->value;
-  c_parser_consume_token (parser);
-}
-  else if (c_parser_next_token_is_keyword (parser, RID_CONST)
-  || c_parser_next_token_is_keyword (parser, RID_RESTRICT))
-{
-  warning_at (c_parser_peek_token (parser)->location,
- 0,
- "%E qualifier ignored on asm",
- c_parser_peek_token (parser)->value);
-  quals = NULL_TREE;
-  c_parser_consume_token (parser);
-}
-  else
-quals = NULL_TREE;
 
+  quals = NULL_TREE;
   is_goto = false;
-  if (c_parser_next_token_is_keyword (parser, RID_GOTO))
-{
-  c_parser_consume_token (parser);
-  is_goto = true;
-}
+  for (bool done = false; !done; )
+switch (c_parser_peek_token (parser)->keyword)
+  {
+  case RID_VOLATILE:
+   quals = c_parser_peek_token (parser)->value;
+   c_parser_consume_token (parser);
+   break;
+  case RID_CONST:
+  case RID_RESTRICT:
+  case RID_ATOMIC:
+   warning_at (c_parser_peek_token (parser)->location,
+   0,
+   "%E qualifier ignored on asm",
+   c_parser_peek_token (parser)->value);
+   c_parser_consume_token (parser);
+   break;
+  case RID_GOTO:
+   is_goto = true;
+   c_parser_consume_token (parser);
+   break;
+  default:
+   done = true;
+  }
 
   /* ??? Follow the C++ parser rather than using the
  lex_untranslated_string kludge.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index ebe326e..d44fd4d 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -19196,22 +19196,34 @@ cp_parser_using_directive (cp_parser* parser)
 
 /* Parse an asm-definition.
 
+  asm-qualifier:
+volatile
+goto
+
+  asm-qualifier-list:
+asm-qualifier
+asm-qualifier-list asm-qualifier
+
asm-definition:
  asm ( string-literal ) ;
 
GNU Extension:
 
asm-definition:
- asm volatile [opt] ( string-literal ) ;
- asm volatile [opt] ( string-literal : asm-operand-list [opt] ) ;
- asm volatile [opt] ( 

Re: [PATCH v2] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ulrich Weigand
Ilya Leoshkevich wrote:

> @@ -8223,6 +8237,18 @@ find_constant_pool_ref (rtx x, rtx *ref)
>&& XINT (x, 1) == UNSPECV_POOL_ENTRY)
>  return;
>  
> +  if (SYMBOL_REF_P (x)
> +  && CONSTANT_POOL_ADDRESS_P (x)
> +  && s390_symbol_larl_p (x))
> +{
> +  if (*ref == NULL_RTX)
> + *ref = x;
> +  else
> + gcc_assert (*ref == x);
> +
> +  return;
> +}

This definitely looks wrong.  If we haven't annotated the address,
it should *not* be found by find_constant_pool_ref, since we are
not going to replace it!  That was the whole point of not annotating
it in the first place ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ulrich Weigand
Ilya Leoshkevich wrote:
> On 30.10.2018 at 16:20, Ulrich Weigand wrote:
> > Not sure that this is fully correct either.  *Some* instructions, like
> > e.g. floating-point loads, do not accept relative operands.  And even
> > for the relative loads that exist, there may be slightly different
> > restrictions on what addresses are allowed (e.g. LGRL only accepts
> > 8-byte aligned addresses, while LARL accepts 2-byte aligned addresses).
> 
> In such instructions SYMBOL_REFs appear only inside MEMs.  There is the
> special case for that in annotate_constant_pool_refs (), which ensures
> that UNSPEC_LTREFs are generated and recursive calls are not made.  So,
> the newly introduced check should have no effect on such instructions.

The point is, they should *not* get annotated either, since the
instructions actually can handle relative accesses to the literal
pool, so we don't need to use a base register.

But I think that still would work; alignment shouldn't be an issue after
all since literal pool contents are always naturally aligned.  So we only
need to recognize whether the insn alternative accepts relative
operands at all, which we could do as suggested below.

> > It seems the underlying problem is that we have predicates/constraints
> > that accept literal pool differences for two distinct reasons now:
> > either because they can be naturally handled via relative addressing,
> > or because they are supposed to be transformed to base-register addressing
> > later on.  We really need to distinguish the two cases in some way.
> >
> > Maybe it would make sense to check which alternative/constraint matched
> > the insn, and decide based on that whether we need to rewrite to base-
> > register addressing or not?
> 
> We currently have "type" attribute, which has the value "larl" for most
> relative addressing alternatives.  In a few cases it's "load" / "store",
> but that looks like an omission to me: e.g. for "lhrl" it's "larl", but
> for "lrl" it's "load".  We could query it from the back-end with
> get_attr_type () and compare the result with TYPE_LARL.

Good point, that looks like it should work.  For those cases where you do
have to change the type attribute, we need to verify that scheduler
properties do not change; this should really only be an issue with z10
(i.e. the 2097.md scheduler description).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH/AARCH64] Add OcteonTX for -mcpu=

2018-10-30 Thread Richard Earnshaw (lists)
On 30/10/2018 17:06, Andrew Pinski wrote:
> Hi all,
>   There was a name change of the Products, ThunderX T81 and ThunderX
> T83 to OcteonTX family name.  This change was done a few years ago but
> I had not submitted the change at that time.  This is also the first
> patch in a series to add OcteonTX 2 support to GCC.
> 
> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regression.
> 

You're missing a documentation update.

R.

> Thanks,
> Andrew Pinski
> 
> gcc/ChangeLog:
> * config/aarch64/aarch64-cores.def (octeontx): New.
> (octeontx81): Likewise.
> (octeontx83): Likewise.
> * config/aarch64/aarch64-tune.md: Regenerate.
> 
> 
> addoctx.diff.txt
> 
> Index: gcc/config/aarch64/aarch64-cores.def
> ===
> --- gcc/config/aarch64/aarch64-cores.def  (revision 265605)
> +++ gcc/config/aarch64/aarch64-cores.def  (working copy)
> @@ -58,6 +58,12 @@
> this order is required to handle variant correctly. */
>  AARCH64_CORE("thunderxt88p1", thunderxt88p1, thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO,  thunderxt88,  
> 0x43, 0x0a1, 0)
>  AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderxt88,  
> 0x43, 0x0a1, -1)
> +
> +/* OcteonTX is the official name for T81/T83. */
> +AARCH64_CORE("octeontx",  octeontx,  thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a0, -1)
> +AARCH64_CORE("octeontx81",octeontxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
> +AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
> +
>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
>  
> Index: gcc/config/aarch64/aarch64-tune.md
> ===
> --- gcc/config/aarch64/aarch64-tune.md(revision 265605)
> +++ gcc/config/aarch64/aarch64-tune.md(working copy)
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> - 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> + 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> 



Re: [PATCH, testsuite, c-compat] Handle another c/l option not recognised by older GCC or clang.

2018-10-30 Thread Mike Stump
On Oct 29, 2018, at 12:53 PM, Iain Sandoe  wrote:
> 
> When using ALT_CC/CXX_UNDER_TEST in the compat/struct-layout-1 tests, the c/l 
> options provided to the “alt” compiler need to avoid latest and greatest GCC 
> capability.  The patch tests to see if the ‘alt’ compiler can handle 
> -fno-diagnostics-show-line-numbers.
> 
> OK for trunk?

Ok.

[PATCH/AARCH64] Add OcteonTX for -mcpu=

2018-10-30 Thread Andrew Pinski
Hi all,
  There was a name change of the Products, ThunderX T81 and ThunderX
T83 to OcteonTX family name.  This change was done a few years ago but
I had not submitted the change at that time.  This is also the first
patch in a series to add OcteonTX 2 support to GCC.

OK?  Bootstrapped and tested on aarch64-linux-gnu with no regression.

Thanks,
Andrew Pinski

gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (octeontx): New.
(octeontx81): Likewise.
(octeontx83): Likewise.
* config/aarch64/aarch64-tune.md: Regenerate.
Index: gcc/config/aarch64/aarch64-cores.def
===
--- gcc/config/aarch64/aarch64-cores.def(revision 265605)
+++ gcc/config/aarch64/aarch64-cores.def(working copy)
@@ -58,6 +58,12 @@
this order is required to handle variant correctly. */
 AARCH64_CORE("thunderxt88p1", thunderxt88p1, thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO,thunderxt88,  
0x43, 0x0a1, 0)
 AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderxt88,  0x43, 
0x0a1, -1)
+
+/* OcteonTX is the official name for T81/T83. */
+AARCH64_CORE("octeontx",  octeontx,  thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a0, -1)
+AARCH64_CORE("octeontx81",octeontxt81,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a2, -1)
+AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a3, -1)
+
 AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a2, -1)
 AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a3, -1)
 
Index: gcc/config/aarch64/aarch64-tune.md
===
--- gcc/config/aarch64/aarch64-tune.md  (revision 265605)
+++ gcc/config/aarch64/aarch64-tune.md  (working copy)
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))


Re: [PATCH] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ilya Leoshkevich



> On 30.10.2018 at 16:20, Ulrich Weigand wrote:
> 
> Ilya Leoshkevich wrote:
>> On 29.10.2018 at 19:45, Ulrich Weigand wrote:
>> 
>>> This is true.  But something else must still be going on here.  Note that
>>> many other instruction patterns might contain constant pool addresses,
>>> since they are accepted e.g. by the 'b' constraint.  In all of those
>>> cases, we shouldn't add the UNSPEC_LTREF.  So just checking for the
>>> specific LARL instruction pattern in annotate_constant_pool_refs does
>>> not feel like a correct fix here.
>> 
>> I have changed the patch to skip all larl_operands, regardless of which
>> context they appear in.  Regtest is running.
> 
> Not sure that this is fully correct either.  *Some* instructions, like
> e.g. floating-point loads, do not accept relative operands.  And even
> for the relative loads that exist, there may be slightly different
> restrictions on what addresses are allowed (e.g. LGRL only accepts
> 8-byte aligned addresses, while LARL accepts 2-byte aligned addresses).

In such instructions SYMBOL_REFs appear only inside MEMs.  There is the
special case for that in annotate_constant_pool_refs (), which ensures
that UNSPEC_LTREFs are generated and recursive calls are not made.  So,
the newly introduced check should have no effect on such instructions.

> 
> It seems the underlying problem is that we have predicates/constraints
> that accept literal pool differences for two distinct reasons now:
> either because they can be naturally handled via relative addressing,
> or because they are supposed to be transformed to base-register addressing
> later on.  We really need to distinguish the two cases in some way.
> 
> Maybe it would make sense to check which alternative/constraint matched
> the insn, and decide based on that whether we need to rewrite to base-
> register addressing or not?

We currently have a "type" attribute, which has the value "larl" for most
relative addressing alternatives.  In a few cases it's "load" / "store",
but that looks like an omission to me: e.g. for "lhrl" it's "larl", but
for "lrl" it's "load".  We could query it from the back-end with
get_attr_type () and compare the result with TYPE_LARL.


Re: [PATCH][rs6000] use index form addresses more often for ldbrx/stdbrx

2018-10-30 Thread Segher Boessenkool
On Tue, Oct 30, 2018 at 10:22:48AM -0500, Aaron Sawdey wrote:
> I had to make one more change to make this actually work. In
> rs6000_force_indexed_or_indirect_mem() it was necessary to
> return the updated rtx.

Yes, that probably works better ;-)

> Bootstrap/regtest passes on ppc64le (power7, power9), ok for trunk?

Okay.  Thanks!


Segher


> 2018-10-30  Aaron Sawdey  
> 
>   * config/rs6000/rs6000.md (bswapdi2): Force address into register
>   if not in indexed or indirect form.
>   (bswapdi2_load): Change predicate to indexed_or_indirect_operand.
>   (bswapdi2_store): Ditto.
>   * config/rs6000/rs6000.c (rs6000_force_indexed_or_indirect_mem): New
>   helper function.
>   * config/rs6000/rs6000-protos.h (rs6000_force_indexed_or_indirect_mem):
>   Prototype for helper function.


Re: [PATCH] detect missing nuls in address of const char (PR 87756)

2018-10-30 Thread Martin Sebor

On 10/30/2018 09:54 AM, Jeff Law wrote:

On 10/30/18 9:44 AM, Martin Sebor wrote:

On 10/30/2018 09:27 AM, Jeff Law wrote:

On 10/29/18 5:51 PM, Martin Sebor wrote:

The missing nul detection fails when the argument of the %s or
similar sprintf directive is the address of a non-nul character
constant such as in:

  const char c = 'a';
  int f (void)
  {
    return snprintf (0, 0, "%s", &c);
  }

This is because the string_constant function only succeeds for
arguments that refer to STRING_CSTs, not to individual characters.

For the same reason, calls to memchr() such as the one below aren't
folded into constants:

  const char d = '\0';
  void* g (void)
  {
    return memchr (&d, 0, 1);
  }

To detect and diagnose the missing nul in the first example and
to fold the second, the attached patch modifies string_constant
to return a synthesized STRING_CST object for such references
(while also indicating whether such an object is properly
nul-terminated).

Tested on x86_64-linux.

Martin

gcc-87756.diff

PR tree-optimization/87756 - missing unterminated argument warning
using address of a constant character

gcc/ChangeLog:

PR tree-optimization/87756
* expr.c (string_constant): Synthesize a string literal from
the address of a constant character.
* tree.c (build_string_literal): Add an argument.
* tree.h (build_string_literal): Same.

gcc/testsuite/ChangeLog:

PR tree-optimization/87756
* gcc.dg/builtin-memchr-2.c: New test.
* gcc.dg/builtin-memchr-3.c: Same.
* gcc.dg/warn-sprintf-no-nul-2.c: Same.

Index: gcc/expr.c
===
--- gcc/expr.c(revision 265496)
+++ gcc/expr.c(working copy)
@@ -11484,18 +11484,40 @@ string_constant (tree arg, tree
*ptr_offset, tree
 offset = off;
 }

-  if (!init || TREE_CODE (init) != STRING_CST)
+  if (!init)
 return NULL_TREE;

+  *ptr_offset = offset;
+
+  tree eltype = TREE_TYPE (init);
+  tree initsize = TYPE_SIZE_UNIT (eltype);
   if (mem_size)
-*mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
+*mem_size = initsize;
+
   if (decl)
 *decl = array;

-  gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (init)))
-   >= TREE_STRING_LENGTH (init));
+  if (TREE_CODE (init) == INTEGER_CST)
+{
+  /* For a reference to (address of) a single constant character,
+ store the native representation of the character in CHARBUF.   */
+  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
+  int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
+  if (len > 0)
+{
+  /* Construct a string literal with elements of ELTYPE and
+ the representation above.  Then strip
+ the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
+  init = build_string_literal (len, (char *)charbuf, eltype);
+  init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
+}
+}

-  *ptr_offset = offset;
+  if (TREE_CODE (init) != STRING_CST)
+return NULL_TREE;
+
+  gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH (init));
+
   return init;
 }

Index: gcc/tree.c
===
--- gcc/tree.c(revision 265496)
+++ gcc/tree.c(working copy)



Index: gcc/tree.h
===
--- gcc/tree.h(revision 265496)
+++ gcc/tree.h(working copy)
@@ -4194,7 +4194,7 @@ extern tree build_call_expr_internal_loc_array (lo
 extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
int, ...);
 extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
-extern tree build_string_literal (int, const char *);
+extern tree build_string_literal (int, const char *, tree = char_type_node);

 /* Construct various nodes representing data types.  */

There's only about a dozen calls to build_string_literal.  Instead of
defaulting the argument, just fix them.  OK with that change.  Make
sure to catch those in config/{rs6000,i386}/ and cp/


Why?  Default arguments (and overloading) exist in C++ to deal
with just this case: to avoid having to provide the common
argument value while letting callers provide a different value
when they need to.  What purpose will it serve to make these
unnecessary changes and to force new callers to provide
the default argument value?  It will only make using
the function more error-prone and its callers harder
to read.

I find them much harder to read, especially once you get more than one.

In cases where we have a small number of call sites we should just fix
them.  I think that we're well in that range here.  If there's a large
number of call sites, then overloading via default args makes plenty of
sense.
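
To make the two positions in this exchange concrete, here is a small
sketch. It is not GCC code: elem_kind, make_literal_defaulted and
make_literal_explicit are hypothetical stand-ins (GCC's
build_string_literal takes a tree for the element type). Variant A
supplies the common value through a default argument; variant B requires
every caller to spell it out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an element type; GCC itself passes a tree here.
enum class elem_kind { narrow_char, wide_char };

// Variant A: the common value is supplied by a default argument, so only
// the unusual caller has to mention the element kind.
static std::string make_literal_defaulted (int len, const char *str,
                                           elem_kind kind = elem_kind::narrow_char)
{
  const char *prefix = kind == elem_kind::narrow_char ? "char:" : "wchar:";
  return std::string (prefix) + std::string (str, len);
}

// Variant B: no default, so every call site states the element kind.
static std::string make_literal_explicit (int len, const char *str,
                                          elem_kind kind)
{
  return make_literal_defaulted (len, str, kind);
}
```

With variant A an existing call such as make_literal_defaulted (2, "ab")
keeps compiling unchanged; with variant B each of the existing call sites
has to be edited once, after which the element kind is visible at every
call.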


Sorry, but I still don't see why or agree with the rule.  There
is just one call site out of the 14 that provides a value other
than the default.  There are APIs in GCC with default arguments
with fewer 

Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Segher Boessenkool
On Tue, Oct 30, 2018 at 09:49:18PM +0900, Stafford Horne wrote:
> Hello,
> 
> On Sun, Oct 28, 2018 at 05:54:47PM -0500, Segher Boessenkool wrote:
> > Yes, like that.  It also easily can handle the other combos (those with
> > STACK_POINTER), and it is easier if you have to switch FRAME_GROWS_DOWNWARD
> > ("false" is better on some args, but "true" is required for ssp).
> > 
> > Your code is fine as-is of course.
> 
> Just to be clear, when you say 'as-is' did you mean the original v3 patch?  Or
> are you referring to followup patch I posted with the some_offset (from) -
> some_offset (to) logic.

Either.  Both.  I meant the orig big patch, v3 if that's what it was.

I am not a global reviewer; this is all just my opinion :-)


Segher


[RFT PATCH, middle end]: Fix PR58372, internal compiler error: ix86_compute_frame_layout

2018-10-30 Thread Uros Bizjak
Hello!

Function calls generated directly through emit_library_call (for the
testcase from the PR, the compiler builds a call to _Unwind_SjLj_Register
via sjlj_emit_function_enter) miss a whole lot of stack realignment
setup. There is an update to crtl->preferred_stack_boundary present,
but several updates for SUPPORTS_STACK_ALIGNMENT targets are missing,
including the DRAP setup that may be required.

The attached patch introduces additional updates to the stack realignment
crtl variables in emit_library_call_value_1, based on what
expand_stack_alignment from cfgexpand.c does. In addition to updating
preferred_stack_boundary, it updates stack_alignment_estimated and
stack_alignment_needed. The patch also updates the dependent variables
stack_realign_needed and stack_realign_tried. Additionally, if needed,
the DRAP register is prepared.
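
The bookkeeping described above can be modelled outside the compiler as
follows. This is only a sketch under stated assumptions: stack_state and
note_library_call are hypothetical names, the struct merely mirrors the
crtl fields mentioned in this message, and the DRAP handling is left out
entirely.

```cpp
#include <cassert>

// Simplified, hypothetical model of the crtl fields named in the message.
struct stack_state
{
  unsigned preferred_stack_boundary;
  unsigned stack_alignment_estimated;
  unsigned stack_alignment_needed;
  bool stack_realign_needed;
  bool stack_realign_tried;
};

// Raise each alignment value to at least BOUNDARY (never lower it), then
// recompute the dependent flags relative to the INCOMING stack boundary.
static void note_library_call (stack_state &s, unsigned boundary,
                               unsigned incoming)
{
  if (boundary > s.preferred_stack_boundary)
    s.preferred_stack_boundary = boundary;
  if (boundary > s.stack_alignment_estimated)
    s.stack_alignment_estimated = boundary;
  if (boundary > s.stack_alignment_needed)
    s.stack_alignment_needed = boundary;

  s.stack_realign_needed = incoming < s.stack_alignment_estimated;
  s.stack_realign_tried = s.stack_realign_needed;
}
```

The point of the model is that all three alignment values only ever grow,
and that the realignment flags are derived from the estimate rather than
set independently.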

2018-10-30  Uros Bizjak  

PR middle-end/58372
* calls.c (emit_library_call_value_1): For SUPPORTS_STACK_ALIGNMENT
targets, also update crtl->stack_alignment_estimated,
crtl->stack_alignment_needed, crtl->stack_realign_needed,
crtl->stack_realign_tried and prepare DRAP register if needed.

testsuite/ChangeLog:

2018-10-30  Uros Bizjak  

PR middle-end/58372
* g++.target/i386/pr58372.C: New test.

The patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32}. Additionally, the testcase from PR (and a couple of similar
ones) were compiled for i686-w64-mingw32 target with various
combinations of -mpreferred-stack-boundary= -mincoming-stack-boundary=
-mforce-drap and -m{no-}accumulate-outgoing-args.

The patch is posted as RFT, to leave some time for possible tests on
other targets, comments and possible approval.

Uros.
Index: calls.c
===
--- calls.c (revision 265582)
+++ calls.c (working copy)
@@ -4736,9 +4736,41 @@ emit_library_call_value_1 (int retval, rtx orgfun,
 
   /* Ensure current function's preferred stack boundary is at least
  what we need.  */
-  if (crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
-crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
+  unsigned int preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
 
+  if (preferred_stack_boundary > crtl->preferred_stack_boundary)
+crtl->preferred_stack_boundary = preferred_stack_boundary;
+
+  if (SUPPORTS_STACK_ALIGNMENT)
+{
+  if (preferred_stack_boundary > crtl->stack_alignment_estimated)
+   crtl->stack_alignment_estimated = preferred_stack_boundary;
+  if (preferred_stack_boundary > crtl->stack_alignment_needed)
+   crtl->stack_alignment_needed = preferred_stack_boundary;
+
+  crtl->stack_realign_needed
+   = INCOMING_STACK_BOUNDARY < crtl->stack_alignment_estimated;
+  crtl->stack_realign_tried = crtl->stack_realign_needed;
+
+  if (crtl->drap_reg == NULL_RTX)
+   {
+ rtx drap_rtx = targetm.calls.get_drap_rtx ();
+
+ /* stack_realign_drap and drap_rtx must match.  */
+ gcc_assert ((stack_realign_drap != 0) == (drap_rtx != NULL));
+
+ /* Do nothing if NULL is returned, which means DRAP is not needed.  */
+ if (drap_rtx != NULL)
+   {
+ crtl->args.internal_arg_pointer = drap_rtx;
+
+ /* Call fixup_tail_calls to clean up REG_EQUIV note if DRAP is
+needed. */
+ fixup_tail_calls ();
+   }
+   }
+}
+
   /* If this kind of value comes back in memory,
  decide where in memory it should come back.  */
   if (outmode != VOIDmode)
Index: testsuite/g++.target/i386/pr58372.C
===
--- testsuite/g++.target/i386/pr58372.C (nonexistent)
+++ testsuite/g++.target/i386/pr58372.C (working copy)
@@ -0,0 +1,9 @@
+/* PR target/58372 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+__attribute__ ((__target__ ("rdrnd")))
+void f (unsigned int *b) noexcept
+{
+  __builtin_ia32_rdrand32_step (b);
+}


Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Richard Henderson
On 10/30/18 12:18 PM, Stafford Horne wrote:
> OK, I was just being lazy allowing the spill.  Do you think the split/expand
> would be an RTL using left shift / right shift?  Can you think of something
> more clever?  Since "real" hardware does not usually support shifts with an
> immediate we will need 1 instruction to load shift amount. i.e.
> 
>   l.ori %0, r0, 24
>   l.sll %1, %1, %0
>   l.sra %0, %1, %0

This clobbers %1.

So, ouch.  I think we will want to avoid creating this particular pattern in
the first place unless l.exts exists then.  We would use another pattern like

(define_insn "*sign_extend_mem"
  [(set (match_operand:SI 0 "register_operand" "=r")
(sign_extend:SI
  (match_operand:HI 1 "memory_operand" "m")))]
  ""
  "l.lhs\t%0, %1")

following the TARGET_SEXT pattern.  In this way combine can use this pattern
without getting us into trouble with the register allocator later.
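
For reference, the shift-left / shift-right sign extension quoted above
can be written in C++ as below. This is a sketch, not part of the port:
sext8_by_shifts is a made-up name, and it relies on the conversion to a
signed type and the right shift of a negative value behaving as on
mainstream two's-complement compilers (both are guaranteed only from
C++20 on).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 8 bits of X with a shift-left / shift-right pair,
// following the three-instruction sequence quoted in the message.
static std::int32_t sext8_by_shifts (std::uint32_t x)
{
  const unsigned amount = 24;            // l.ori: load the shift count
  std::uint32_t shifted = x << amount;   // l.sll: move bit 7 up to bit 31
  // l.sra: arithmetic shift back down, replicating the sign bit.
  return static_cast<std::int32_t> (shifted) >> amount;
}
```

Unlike the quoted assembly, the sketch keeps the intermediate value in a
separate temporary, which is exactly what the instruction sequence cannot
do without clobbering its input operand or asking for another register.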

> I am submitting patches on my git branch or1k-port-4. Just in case you want to
> track progress.

Will do.


r~


Re: [PATCH] detect missing nuls in address of const char (PR 87756)

2018-10-30 Thread Jeff Law
On 10/30/18 9:44 AM, Martin Sebor wrote:
> On 10/30/2018 09:27 AM, Jeff Law wrote:
>> On 10/29/18 5:51 PM, Martin Sebor wrote:
>>> The missing nul detection fails when the argument of the %s or
>>> similar sprintf directive is the address of a non-nul character
>>> constant such as in:
>>>
>>>   const char c = 'a';
>>>   int f (void)
>>>   {
>>>     return snprintf (0, 0, "%s", &c);
>>>   }
>>>
>>> This is because the string_constant function only succeeds for
>>> arguments that refer to STRING_CSTs, not to individual characters.
>>>
>>> For the same reason, calls to memchr() such as the one below aren't
>>> folded into constants:
>>>
>>>   const char d = '\0';
>>>   void* g (void)
>>>   {
>>>     return memchr (&d, 0, 1);
>>>   }
>>>
>>> To detect and diagnose the missing nul in the first example and
>>> to fold the second, the attached patch modifies string_constant
>>> to return a synthesized STRING_CST object for such references
>>> (while also indicating whether such an object is properly
>>> nul-terminated).
>>>
>>> Tested on x86_64-linux.
>>>
>>> Martin
>>>
>>> gcc-87756.diff
>>>
>>> PR tree-optimization/87756 - missing unterminated argument warning
>>> using address of a constant character
>>>
>>> gcc/ChangeLog:
>>>
>>> PR tree-optimization/87756
>>> * expr.c (string_constant): Synthesize a string literal from
>>> the address of a constant character.
>>> * tree.c (build_string_literal): Add an argument.
>>> * tree.h (build_string_literal): Same.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR tree-optimization/87756
>>> * gcc.dg/builtin-memchr-2.c: New test.
>>> * gcc.dg/builtin-memchr-3.c: Same.
>>> * gcc.dg/warn-sprintf-no-nul-2.c: Same.
>>>
>>> Index: gcc/expr.c
>>> ===
>>> --- gcc/expr.c    (revision 265496)
>>> +++ gcc/expr.c    (working copy)
>>> @@ -11484,18 +11484,40 @@ string_constant (tree arg, tree
>>> *ptr_offset, tree
>>>  offset = off;
>>>  }
>>>
>>> -  if (!init || TREE_CODE (init) != STRING_CST)
>>> +  if (!init)
>>>  return NULL_TREE;
>>>
>>> +  *ptr_offset = offset;
>>> +
>>> +  tree eltype = TREE_TYPE (init);
>>> +  tree initsize = TYPE_SIZE_UNIT (eltype);
>>>    if (mem_size)
>>> -    *mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
>>> +    *mem_size = initsize;
>>> +
>>>    if (decl)
>>>  *decl = array;
>>>
>>> -  gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (init)))
>>> -   >= TREE_STRING_LENGTH (init));
>>> +  if (TREE_CODE (init) == INTEGER_CST)
>>> +    {
>>> +  /* For a reference to (address of) a single constant character,
>>> + store the native representation of the character in CHARBUF.   */
>>> +  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
>>> +  int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
>>> +  if (len > 0)
>>> +    {
>>> +  /* Construct a string literal with elements of ELTYPE and
>>> + the representation above.  Then strip
>>> + the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
>>> +  init = build_string_literal (len, (char *)charbuf, eltype);
>>> +  init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
>>> +    }
>>> +    }
>>>
>>> -  *ptr_offset = offset;
>>> +  if (TREE_CODE (init) != STRING_CST)
>>> +    return NULL_TREE;
>>> +
>>> +  gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH
>>> (init));
>>> +
>>>    return init;
>>>  }
>>>
>>> Index: gcc/tree.c
>>> ===
>>> --- gcc/tree.c    (revision 265496)
>>> +++ gcc/tree.c    (working copy)
>>
>>> Index: gcc/tree.h
>>> ===
>>> --- gcc/tree.h    (revision 265496)
>>> +++ gcc/tree.h    (working copy)
>>> @@ -4194,7 +4194,7 @@ extern tree build_call_expr_internal_loc_array (lo
>>>  extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
>>>     int, ...);
>>>  extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
>>> -extern tree build_string_literal (int, const char *);
>>> +extern tree build_string_literal (int, const char *, tree =
>>> char_type_node);
>>>
>>>  /* Construct various nodes representing data types.  */
>> There's only about a dozen calls to build_string_literal.  Instead of
>> defaulting the argument, just fix them.    OK with that change.  Make
>> sure to catch those in config/{rs6000,i386}/ and cp/
> 
> Why?  Default arguments (and overloading) exist in C++ to deal
> with just this case: to avoid having to provide the common
> argument value while letting callers provide a different value
> when they need to.  What purpose will it serve to make these
> unnecessary changes and to force new callers to provide
> the default argument value?  It will only make using
> the function more error-prone and its callers harder
> to read.

I find them much harder to read especially once you 

Re: [C++ Patch] Improve locations in flexible array members diagnostic

2018-10-30 Thread Jason Merrill
OK.
On Tue, Oct 30, 2018 at 11:23 AM Paolo Carlini  wrote:
>
> Hi,
>
> today I noticed quite a few additional places where we can exploit
> declarator->id_loc, the below are the first bits. Tested x86_64-linux.
>
> Thanks, Paolo.
>
> //
>


Re: [PATCH] detect missing nuls in address of const char (PR 87756)

2018-10-30 Thread Martin Sebor

On 10/30/2018 09:27 AM, Jeff Law wrote:

On 10/29/18 5:51 PM, Martin Sebor wrote:

The missing nul detection fails when the argument of the %s or
similar sprintf directive is the address of a non-nul character
constant such as in:

  const char c = 'a';
  int f (void)
  {
    return snprintf (0, 0, "%s", &c);
  }

This is because the string_constant function only succeeds for
arguments that refer to STRING_CSTs, not to individual characters.

For the same reason, calls to memchr() such as the one below aren't
folded into constants:

  const char d = '\0';
  void* g (void)
  {
    return memchr (&d, 0, 1);
  }

To detect and diagnose the missing nul in the first example and
to fold the second, the attached patch modifies string_constant
to return a synthesized STRING_CST object for such references
(while also indicating whether such an object is properly
nul-terminated).

Tested on x86_64-linux.

Martin

gcc-87756.diff

PR tree-optimization/87756 - missing unterminated argument warning using 
address of a constant character

gcc/ChangeLog:

PR tree-optimization/87756
* expr.c (string_constant): Synthesize a string literal from
the address of a constant character.
* tree.c (build_string_literal): Add an argument.
* tree.h (build_string_literal): Same.

gcc/testsuite/ChangeLog:

PR tree-optimization/87756
* gcc.dg/builtin-memchr-2.c: New test.
* gcc.dg/builtin-memchr-3.c: Same.
* gcc.dg/warn-sprintf-no-nul-2.c: Same.

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 265496)
+++ gcc/expr.c  (working copy)
@@ -11484,18 +11484,40 @@ string_constant (tree arg, tree *ptr_offset, tree
offset = off;
 }

-  if (!init || TREE_CODE (init) != STRING_CST)
+  if (!init)
 return NULL_TREE;

+  *ptr_offset = offset;
+
+  tree eltype = TREE_TYPE (init);
+  tree initsize = TYPE_SIZE_UNIT (eltype);
   if (mem_size)
-*mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
+*mem_size = initsize;
+
   if (decl)
 *decl = array;

-  gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (init)))
-  >= TREE_STRING_LENGTH (init));
+  if (TREE_CODE (init) == INTEGER_CST)
+{
+  /* For a reference to (address of) a single constant character,
+store the native representation of the character in CHARBUF.   */
+  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
+  int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
+  if (len > 0)
+   {
+ /* Construct a string literal with elements of ELTYPE and
+the representation above.  Then strip
+the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
+ init = build_string_literal (len, (char *)charbuf, eltype);
+ init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
+   }
+}

-  *ptr_offset = offset;
+  if (TREE_CODE (init) != STRING_CST)
+return NULL_TREE;
+
+  gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH (init));
+
   return init;
 }

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 265496)
+++ gcc/tree.c  (working copy)



Index: gcc/tree.h
===
--- gcc/tree.h  (revision 265496)
+++ gcc/tree.h  (working copy)
@@ -4194,7 +4194,7 @@ extern tree build_call_expr_internal_loc_array (lo
 extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
   int, ...);
 extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
-extern tree build_string_literal (int, const char *);
+extern tree build_string_literal (int, const char *, tree = char_type_node);

 /* Construct various nodes representing data types.  */

There's only about a dozen calls to build_string_literal.  Instead of
defaulting the argument, just fix them.  OK with that change.  Make
sure to catch those in config/{rs6000,i386}/ and cp/


Why?  Default arguments (and overloading) exist in C++ to deal
with just this case: to avoid having to provide the common
argument value while letting callers provide a different value
when they need to.  What purpose will it serve to make these
unnecessary changes and to force new callers to provide
the default argument value?  It will only make using
the function more error-prone and its callers harder
to read.

Martin


Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Segher Boessenkool
On Tue, Oct 30, 2018 at 08:26:00PM +0900, Stafford Horne wrote:
> On Mon, Oct 29, 2018 at 04:42:43PM +, Richard Henderson wrote:
> > On 10/29/18 4:34 PM, Segher Boessenkool wrote:
> > > Is there some better documentation available?  This is what google found
> > > for me.  I would have liked better docs (more compact, etc.)  Links to
> > > such would be great to have in readings.html :-)
> > 
> > https://openrisc.io/architecture
> > 
> > and especially the v1.2 pdf linked from there
> > 
> > https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.2-rev0.pdf
> 
> Thanks,
> 
> I meant to point this out during my previous reply.  Also, I will send a patch
> for adding this to wwwdocs.
> 
>   https://www.gnu.org/software/gcc/readings.html

I figured out how I most likely found the out-of-date page btw: I googled
"openrisc xori" (no quotes).


Segher


Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-30 Thread Ramana Radhakrishnan
On Fri, Oct 5, 2018 at 5:50 PM Thomas Preudhomme
 wrote:
>
> Hi Ramana and Kyrill,
>
> I've reworked the patch to add some documentation of the option
> conflict and reworked the -mword-relocation logic slightly to set the
> variable explicitly in PIC mode rather than test for PIC and word
> relocation everywhere.

Ok.

Thanks,
Ramana

>
> ChangeLog entries are now as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-10-02  Thomas Preud'homme  
>
> PR target/87374
> * config/arm/arm.c (arm_option_check_internal): Disable the combined
> use of -mslow-flash-data and -mword-relocations.
> (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> flag_pic.
> * doc/invoke.texi (-mword-relocations): Mention conflict with
> -mslow-flash-data.
> (-mslow-flash-data): Reciprocally.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-09-25  Thomas Preud'homme  
>
> PR target/87374
> * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> -mword-relocations would be passed when compiling the test.
> * gcc.target/arm/movsi_movt.c: Likewise.
> * gcc.target/arm/pr81863.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
>
> On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
>  wrote:
> >
> > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > Hi Ramana,
> > >
> > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > >  wrote:
> > >>
> > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > >>> Hi Thomas,
> > >>>
> > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> >  Hi,
> > 
> >  GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >  is no way to load an address, both literal pools and MOVW/MOVT being
> >  forbidden. This patch gives an error message when both options are
> >  specified by the user and adds the corresponding dg-skip-if directives for
> >  tests that use either of these options.
> > 
> >  ChangeLog entries are as follows:
> > 
> >  *** gcc/ChangeLog ***
> > 
> >  2018-09-25  Thomas Preud'homme  
> > 
> > PR target/87374
> > * config/arm/arm.c (arm_option_check_internal): Disable the 
> >  combined
> > use of -mslow-flash-data and -mword-relocations.
> > 
> >  *** gcc/testsuite/ChangeLog ***
> > 
> >  2018-09-25  Thomas Preud'homme  
> > 
> > PR target/87374
> > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data 
> >  and
> > -mword-relocations would be passed when compiling the test.
> > * gcc.target/arm/movsi_movt.c: Likewise.
> > * gcc.target/arm/pr81863.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > 
> > 
> >  Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> >  targeting arm-none-eabi. Modified tests get skipped as expected when
> >  running the testsuite with -mslow-flash-data (pr81863.c) or
> >  -mword-relocations (all the others).
> > 
> > 
> >  Is this ok for trunk? I'd also appreciate guidance on whether this is
> >  worth a backport. It's a simple patch but on the other hand it only
> >  prevents some option combination, it does not fix anything so I have
> >  mixed feelings.
> > >>>
> > >>> In my opinion -mslow-flash-data is more of a tuning option rather than 
> > >>> a security/ABI feature
> > >>> and therefore erroring out on its combination with -mword-relocations 
> > >>> feels odd.
> > >>> I'm leaning more towards making -mword-relocations or any other option 
> > >>> that really requires constant pools
> > >>> to bypass/disable the effects of -mslow-flash-data instead.
> > >>
> > >> -mslow-flash-data and -mword-relocations are contradictory in their
> > >> expectations. mslow-flash-data is for not putting anything in the
> > >> literal pool whereas mword-relocations is purely around the use of movw
> > >> / movt instructions for word sized values. I wish we had called
> > >> -mslow-flash-data something else (probably -mno-literal-pools).
> > >> -mslow-flash-data is used primarily by M-profile users and
> > >> 

Re: [PATCH] detect missing nuls in address of const char (PR 87756)

2018-10-30 Thread Jeff Law
On 10/29/18 5:51 PM, Martin Sebor wrote:
> The missing nul detection fails when the argument of the %s or
> similar sprintf directive is the address of a non-nul character
> constant such as in:
> 
>   const char c = 'a';
>   int f (void)
>   {
>     return snprintf (0, 0, "%s", &c);
>   }
> 
> This is because the string_constant function only succeeds for
> arguments that refer to STRING_CSTs, not to individual characters.
> 
> For the same reason, calls to memchr() such as the one below aren't
> folded into constants:
> 
>   const char d = '\0';
>   void* g (void)
>   {
>     return memchr (&d, 0, 1);
>   }
> 
> To detect and diagnose the missing nul in the first example and
> to fold the second, the attached patch modifies string_constant
> to return a synthesized STRING_CST object for such references
> (while also indicating whether such an object is properly
> nul-terminated).
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> gcc-87756.diff
> 
> PR tree-optimization/87756 - missing unterminated argument warning using 
> address of a constant character
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/87756
>   * expr.c (string_constant): Synthesize a string literal from
>   the address of a constant character.
>   * tree.c (build_string_literal): Add an argument.
>   * tree.h (build_string_literal): Same.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/87756
>   * gcc.dg/builtin-memchr-2.c: New test.
>   * gcc.dg/builtin-memchr-3.c: Same.
>   * gcc.dg/warn-sprintf-no-nul-2.c: Same.
> 
> Index: gcc/expr.c
> ===
> --- gcc/expr.c(revision 265496)
> +++ gcc/expr.c(working copy)
> @@ -11484,18 +11484,40 @@ string_constant (tree arg, tree *ptr_offset, tree
>   offset = off;
>  }
>  
> -  if (!init || TREE_CODE (init) != STRING_CST)
> +  if (!init)
>  return NULL_TREE;
>  
> +  *ptr_offset = offset;
> +
> +  tree eltype = TREE_TYPE (init);
> +  tree initsize = TYPE_SIZE_UNIT (eltype);
>if (mem_size)
> -*mem_size = TYPE_SIZE_UNIT (TREE_TYPE (init));
> +*mem_size = initsize;
> +
>if (decl)
>  *decl = array;
>  
> -  gcc_checking_assert (tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (init)))
> ->= TREE_STRING_LENGTH (init));
> +  if (TREE_CODE (init) == INTEGER_CST)
> +{
> +  /* For a reference to (address of) a single constant character,
> +  store the native representation of the character in CHARBUF.   */
> +  unsigned char charbuf[MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT];
> +  int len = native_encode_expr (init, charbuf, sizeof charbuf, 0);
> +  if (len > 0)
> + {
> +   /* Construct a string literal with elements of ELTYPE and
> +  the representation above.  Then strip
> +  the ADDR_EXPR (ARRAY_REF (...)) around the STRING_CST.  */
> +   init = build_string_literal (len, (char *)charbuf, eltype);
> +   init = TREE_OPERAND (TREE_OPERAND (init, 0), 0);
> + }
> +}
>  
> -  *ptr_offset = offset;
> +  if (TREE_CODE (init) != STRING_CST)
> +return NULL_TREE;
> +
> +  gcc_checking_assert (tree_to_shwi (initsize) >= TREE_STRING_LENGTH (init));
> +
>return init;
>  }
>  
> Index: gcc/tree.c
> ===
> --- gcc/tree.c(revision 265496)
> +++ gcc/tree.c(working copy)

> Index: gcc/tree.h
> ===
> --- gcc/tree.h(revision 265496)
> +++ gcc/tree.h(working copy)
> @@ -4194,7 +4194,7 @@ extern tree build_call_expr_internal_loc_array (lo
>  extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
>  int, ...);
>  extern tree build_alloca_call_expr (tree, unsigned int, HOST_WIDE_INT);
> -extern tree build_string_literal (int, const char *);
> +extern tree build_string_literal (int, const char *, tree = char_type_node);
>  
>  /* Construct various nodes representing data types.  */
There's only about a dozen calls to build_string_literal.  Instead of
defaulting the argument, just fix them.  OK with that change.  Make
sure to catch those in config/{rs6000,i386}/ and cp/

jeff



[C++ Patch] Improve locations in flexible array members diagnostic

2018-10-30 Thread Paolo Carlini

Hi,

today I noticed quite a few additional places where we can exploit 
declarator->id_loc, the below are the first bits. Tested x86_64-linux.


Thanks, Paolo.

//

/cp
2018-10-30  Paolo Carlini  

* decl.c (grokdeclarator): Use declarator->id_loc in diagnostic
about flexible array members.

/testsuite
2018-10-30  Paolo Carlini  

* g++.dg/cpp1z/has-unique-obj-representations1.C: Test location too.
* g++.dg/ext/flexarray-mangle-2.C: Likewise.
* g++.dg/ext/flexarray-mangle.C: Likewise.
* g++.dg/ext/flexarray-subst.C: Likewise.
* g++.dg/ext/flexary10.C: Likewise.
* g++.dg/ext/flexary11.C: Likewise.
* g++.dg/ext/flexary14.C: Likewise.
* g++.dg/ext/flexary16.C: Likewise.
* g++.dg/ext/flexary26.C: Likewise.
* g++.dg/ext/flexary27.C: Likewise.
* g++.dg/ext/flexary7.C: Likewise.
* g++.dg/ext/pr71290.C: Likewise.

Index: cp/decl.c
===
--- cp/decl.c   (revision 265616)
+++ cp/decl.c   (working copy)
@@ -12210,7 +12223,7 @@ grokdeclarator (const cp_declarator *declarator,
  /* Do not warn on flexible array members in system
 headers because glibc uses them.  */;
else if (name)
- pedwarn (input_location, OPT_Wpedantic,
+ pedwarn (declarator->id_loc, OPT_Wpedantic,
   "ISO C++ forbids flexible array member %qs", name);
else
  pedwarn (input_location, OPT_Wpedantic,
Index: testsuite/g++.dg/cpp1z/has-unique-obj-representations1.C
===
--- testsuite/g++.dg/cpp1z/has-unique-obj-representations1.C(revision 265616)
+++ testsuite/g++.dg/cpp1z/has-unique-obj-representations1.C(working copy)
@@ -9,7 +9,7 @@ struct V { int i : INTB * 3 / 4; int j : INTB / 4
 struct W {};
 struct X : public W { int i; void bar (); };
 struct Y {
-  char a[3]; char b[];   // { dg-warning "forbids flexible array member" }
+  char a[3]; char b[];   // { dg-warning "19:ISO C\\+\\+ forbids flexible array member" }
 };
 struct Z { int a; float b; };
 struct A { int i : INTB * 2; int j; }; // { dg-warning "exceeds its type" }
Index: testsuite/g++.dg/ext/flexarray-mangle-2.C
===
--- testsuite/g++.dg/ext/flexarray-mangle-2.C   (revision 265616)
+++ testsuite/g++.dg/ext/flexarray-mangle-2.C   (working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  char a[];   // { dg-warning "forbids flexible array member" }
+  char a[];   // { dg-warning "8:ISO C\\+\\+ forbids flexible array member" }
 };
 
 // Declare but do not define function templates.
Index: testsuite/g++.dg/ext/flexarray-mangle.C
===
--- testsuite/g++.dg/ext/flexarray-mangle.C (revision 265616)
+++ testsuite/g++.dg/ext/flexarray-mangle.C (working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  char a[];   // { dg-warning "forbids flexible array member" }
+  char a[];   // { dg-warning "8:ISO C\\+\\+ forbids flexible array member" }
 };
 
 // Declare but do not define function templates.
Index: testsuite/g++.dg/ext/flexarray-subst.C
===
--- testsuite/g++.dg/ext/flexarray-subst.C  (revision 265616)
+++ testsuite/g++.dg/ext/flexarray-subst.C  (working copy)
@@ -5,7 +5,7 @@
 
 struct A {
   int n;
-  char a[];   // { dg-warning "forbids flexible array member" }
+  char a[];   // { dg-warning "8:ISO C\\+\\+ forbids flexible array member" }
 };
 
 template 
Index: testsuite/g++.dg/ext/flexary10.C
===
--- testsuite/g++.dg/ext/flexary10.C(revision 265616)
+++ testsuite/g++.dg/ext/flexary10.C(working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  int a[];  // { dg-warning "forbids flexible array member" }
+  int a[];  // { dg-warning "7:ISO C\\+\\+ forbids flexible array member" }
 };
 
 struct A foo (void)
Index: testsuite/g++.dg/ext/flexary11.C
===
--- testsuite/g++.dg/ext/flexary11.C(revision 265616)
+++ testsuite/g++.dg/ext/flexary11.C(working copy)
@@ -4,7 +4,7 @@
 
 struct A {
   int n;
-  char a[];   // { dg-error "forbids flexible array member" }
+  char a[];   // { dg-error "8:ISO C\\+\\+ forbids flexible array member" }
 };
 
 void f ()
Index: testsuite/g++.dg/ext/flexary14.C
===
--- testsuite/g++.dg/ext/flexary14.C(revision 265616)
+++ testsuite/g++.dg/ext/flexary14.C(working copy)
@@ -10,7 +10,7 @@ struct A { typedef int X; };
 template  int foo (T&, typename A::X = 0) { return 0; }
 
 struct B {
-  int n, a[]; // { dg-error "forbids flexible array 

Re: [PATCH][rs6000] use index form addresses more often for ldbrx/stdbrx

2018-10-30 Thread Aaron Sawdey
I had to make one more change to make this actually work. In
rs6000_force_indexed_or_indirect_mem() it was necessary to
return the updated rtx.
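
The reason the return value matters can be shown with a small stand-alone
sketch. It is not the rs6000 code: mem_ref and force_into_register are
hypothetical, and the struct is only a stand-in for a memory reference
whose address may or may not already be in a register.

```cpp
#include <cassert>

// Hypothetical stand-in for a memory reference; real code uses GCC's rtx.
struct mem_ref
{
  int address;
  bool address_in_register;
};

// X is received by value, so any replacement made here is invisible to
// the caller unless the (possibly updated) value is handed back.
static mem_ref force_into_register (mem_ref x)
{
  if (!x.address_in_register)
    x.address_in_register = true;   // stands in for the address rewrite
  return x;
}
```

A caller therefore has to write src = force_into_register (src), as the
patch does for src and dest; calling the helper and ignoring its result
would leave the original operand unchanged.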

Bootstrap/regtest passes on ppc64le (power7, power9), ok for trunk?

Thanks!
   Aaron

2018-10-30  Aaron Sawdey  

* config/rs6000/rs6000.md (bswapdi2): Force address into register
if not in indexed or indirect form.
(bswapdi2_load): Change predicate to indexed_or_indirect_operand.
(bswapdi2_store): Ditto.
* config/rs6000/rs6000.c (rs6000_force_indexed_or_indirect_mem): New
helper function.
* config/rs6000/rs6000-protos.h (rs6000_force_indexed_or_indirect_mem):
Prototype for helper function.


Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 265588)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -47,6 +47,7 @@
 extern bool legitimate_indirect_address_p (rtx, int);
 extern bool legitimate_indexed_address_p (rtx, int);
 extern bool avoiding_indexed_address_p (machine_mode);
+extern rtx rs6000_force_indexed_or_indirect_mem (rtx x);

 extern rtx rs6000_got_register (rtx);
 extern rtx find_addr_reg (rtx);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 265588)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -8423,7 +8423,23 @@
   return false;
 }

+/* Helper function for making sure we will make full
+   use of indexed addressing.  */

+rtx
+rs6000_force_indexed_or_indirect_mem (rtx x)
+{
+  machine_mode m = GET_MODE (x);
+  if (!indexed_or_indirect_operand (x, m))
+{
+  rtx addr = XEXP (x, 0);
+  addr = force_reg (Pmode, addr);
+  x = replace_equiv_address_nv (x, addr);
+}
+  return x;
+}
+
+
 /* Implement the TARGET_LEGITIMATE_COMBINED_INSN hook.  */

 static bool
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 265588)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -2512,9 +2512,15 @@
   if (TARGET_POWERPC64 && TARGET_LDBRX)
 {
   if (MEM_P (src))
-   emit_insn (gen_bswapdi2_load (dest, src));
+{
+ src = rs6000_force_indexed_or_indirect_mem (src);
+ emit_insn (gen_bswapdi2_load (dest, src));
+}
   else if (MEM_P (dest))
-   emit_insn (gen_bswapdi2_store (dest, src));
+{
+ dest = rs6000_force_indexed_or_indirect_mem (dest);
+ emit_insn (gen_bswapdi2_store (dest, src));
+}
   else if (TARGET_P9_VECTOR)
emit_insn (gen_bswapdi2_xxbrd (dest, src));
   else
@@ -2535,13 +2541,13 @@
 ;; Power7/cell has ldbrx/stdbrx, so use it directly
 (define_insn "bswapdi2_load"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-   (bswap:DI (match_operand:DI 1 "memory_operand" "Z")))]
+   (bswap:DI (match_operand:DI 1 "indexed_or_indirect_operand" "Z")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "ldbrx %0,%y1"
   [(set_attr "type" "load")])

 (define_insn "bswapdi2_store"
-  [(set (match_operand:DI 0 "memory_operand" "=Z")
+  [(set (match_operand:DI 0 "indexed_or_indirect_operand" "=Z")
(bswap:DI (match_operand:DI 1 "gpc_reg_operand" "r")))]
   "TARGET_POWERPC64 && TARGET_LDBRX"
   "stdbrx %1,%y0"



-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [doc PATCH] clarify attribute optimize and target syntax

2018-10-30 Thread Jeff Law
On 10/25/18 12:35 PM, Martin Sebor wrote:
> While testing the optimize and target attributes and comparing
> the results to what the manual describes I noticed that some
> syntactic forms aren't fully documented for both attributes.
> Specifically, the optimize attribute doesn't mention that each
> string argument can be a comma-separated list of -f option
> suffixes, e.g., like so:
> 
>   __attribute__ ((optimize ("foo,bar"))) void f (void);
> 
> (The target attribute does mention it.)
> 
> The attached patch amends the manual to describe this feature
> consistently and in some detail.  While I was there, I also made
> a few minor changes for consistency: besides adding the full
> syntactic form of the attribute (including arguments), I removed
> the quotes around the string argument (typically, in @var{string}
> the string isn't quoted), and corrected a minor grammatical issue.
> 
> Martin
> 
> gcc-doc-attr-optimize-target.diff
> 
> gcc/ChangeLog:
> 
>   * doc/extend.texi (optimize): Clarify/expand attribute documentation.
>   (target, pragma GCC optimize, pragma GCC target): Ditto.
OK
jeff
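
A minimal sketch of the two spellings discussed in the quoted mail: one string holding a comma-separated list of -f suffixes, and the same suffixes as separate string arguments. The particular options (unroll-loops, omit-frame-pointer) and the function names are assumptions chosen for illustration; compilers other than GCC may ignore the attribute with a warning.

```cpp
#include <cassert>

// One string argument, two comma-separated -f option suffixes.
__attribute__ ((optimize ("unroll-loops,omit-frame-pointer")))
int sum_to (int n)
{
  int s = 0;
  for (int i = 1; i <= n; ++i)
    s += i;
  return s;
}

// The same two options, spelled as separate string arguments.
__attribute__ ((optimize ("unroll-loops", "omit-frame-pointer")))
int sum_to_alt (int n)
{
  int s = 0;
  for (int i = 1; i <= n; ++i)
    s += i;
  return s;
}

// The attribute only changes how the functions are optimized,
// never what they compute.
void check_sums ()
{
  assert (sum_to (10) == 55);
  assert (sum_to_alt (10) == 55);
}
```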


Re: [PATCH] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ulrich Weigand
Ilya Leoshkevich wrote:
> Am 29.10.2018 um 19:45 schrieb Ulrich Weigand :
>
> > This is true.  But something else must still be going on here.  Note that
> > many other instruction patterns might contain constant pool addresses,
> > since they are accepted e.g. by the 'b' constraint.  In all of those
> > cases, we shouldn't add the UNSPEC_LTREF.  So just checking for the
> > specific LARL instruction pattern in annotate_constant_pool_refs does
> > not feel like a correct fix here.
>
> I have changed the patch to skip all larl_operands, regardless of which
> context they appear in.  Regtest is running.

Not sure that this is fully correct either.  *Some* instructions, like
e.g. floating-point loads, do not accept relative operands.  And even
for the relative loads that exist, there may be slightly different
restrictions on what addresses are allowed (e.g. LGRL only accepts
8-byte aligned addresses, while LARL accepts 2-byte aligned addresses).

It seems the underlying problem is that we have predicates/constraints
that accept literal pool differences for two distinct reasons now:
either because they can be naturally handled via relative addressing,
or because they are supposed to be transformed to base-register addressing
later on.  We really need to distinguish the two cases in some way.

Maybe it would make sense to check which alternative/constraint matched
the insn, and decide based on that whether we need to rewrite to base-
register addressing or not?

> > In fact, before r265490, the pattern for movdi_larl could also contain a
> > constant pool address, so why didn't the problem occur then?  What's the
> > difference whether this is part of movdi_larl or just movdi?
> 
> The difference is usage of "X" constraint.  Before, when we initially
> chose movdi_larl, we could still put UNSPEC_LTREF inside it without
> consequences, because during UNSPEC_LTREF lifetime only constraints are
> checked.

Huh, I see.  That was certainly unintentional.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Fix bug 86293

2018-10-30 Thread Jeff Law
On 10/29/18 10:39 PM, Nicholas Krause wrote:
> This fixes bug 86293 in the GCC Bugzilla.  Basically,
> a variable is unused in certain build configuration scenarios
> and must be marked with the unused attribute
> to avoid this build error.  Built and regtested on x86_64_gnu,
> ok for trunk?
> 
> Signed-off-by: Nicholas Krause 
> ---
>  libitm/method-serial.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libitm/method-serial.cc b/libitm/method-serial.cc
> index e4804946a34..ab23d0b5660 100644
> --- a/libitm/method-serial.cc
> +++ b/libitm/method-serial.cc
> @@ -306,7 +306,7 @@ GTM::gtm_thread::serialirr_mode ()
>// We're already serial, so we don't need to ensure privatization 
> safety
>// for other transactions here.
>gtm_word priv_time = 0;
> -  bool ok = disp->trycommit (priv_time);
> +  bool ok __attribute__((unused)) = disp->trycommit (priv_time);
>// Given that we're already serial, the trycommit better work.
>assert (ok);
Thanks.  Installed.

jeff
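
A reduced sketch of the pattern the patch touches. The helper try_commit is a hypothetical stand-in for disp->trycommit (priv_time). When the file is compiled with -DNDEBUG, assert expands to nothing, the variable has no remaining use, and GCC's -Wunused-variable (fatal under -Werror) fires unless the variable is marked unused.

```cpp
#include <cassert>

// Hypothetical stand-in for disp->trycommit (priv_time).
static bool try_commit (int &commits)
{
  ++commits;
  return true;
}

int commit_once ()
{
  int commits = 0;
  // The only use of 'ok' is inside assert.  With NDEBUG defined that use
  // disappears, so the attribute keeps the build warning-free.
  bool ok __attribute__ ((unused)) = try_commit (commits);
  assert (ok);
  return commits;
}
```

The call itself is still made in both configurations; only the check of its result depends on NDEBUG.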


[PATCH v2] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ilya Leoshkevich
Bootstrapped and regtested on s390x-redhat-linux.

Changes since v1:
* Removed unnecessary gen_rtx_CONST call.
* UNSPEC_LTREF are now omitted for all relatively addressed pool
  entries, and not only those which occur in LARL-like patterns.

r265490 allowed the compiler to choose in a more flexible way whether to
use load or load-address-relative-long (LARL) instruction.  When it
chose LARL for literal pool references, the latter ones were rewritten
by pass_s390_early_mach to use UNSPEC_LTREF, which assumes base register
usage, which in turn is not compatible with LARL.  The end result was an
ICE because of unrecognizable insn.

UNSPEC_LTREF and friends are necessary in order to communicate the
dependency on the base register to pass_sched2.  When LARL is used, no
base register is necessary, so in such cases the rewrite must be
avoided.

gcc/ChangeLog:

2018-10-26  Ilya Leoshkevich  

PR target/87762
* config/s390/predicates.md (larl_operand): Use
s390_symbol_larl_p () to reduce code duplication.
* config/s390/s390-protos.h (s390_symbol_larl_p): New function.
* config/s390/s390.c (s390_symbol_larl_p): New function.
(annotate_constant_pool_refs): Skip LARL operands.
(find_constant_pool_ref): Handle non-annotated literal pool
references, which are usable with LARL.
(replace_constant_pool_ref): Skip LARL operands.
---
 gcc/config/s390/predicates.md |  9 ++---
 gcc/config/s390/s390-protos.h |  1 +
 gcc/config/s390/s390.c| 30 ++
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 98a824e77b7..0e431302479 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -151,9 +151,7 @@
   if (GET_CODE (op) == LABEL_REF)
 return true;
   if (SYMBOL_REF_P (op))
-return (!SYMBOL_FLAG_NOTALIGN2_P (op)
-   && SYMBOL_REF_TLS_MODEL (op) == 0
-   && s390_rel_address_ok_p (op));
+return s390_symbol_larl_p (op);
 
   /* Everything else must have a CONST, so strip it.  */
   if (GET_CODE (op) != CONST)
@@ -176,10 +174,7 @@
   if (GET_CODE (op) == LABEL_REF)
 return true;
   if (SYMBOL_REF_P (op))
-return (!SYMBOL_FLAG_NOTALIGN2_P (op)
-   && SYMBOL_REF_TLS_MODEL (op) == 0
-   && s390_rel_address_ok_p (op));
-
+return s390_symbol_larl_p (op);
 
   /* Now we must have a @GOTENT offset or @PLT stub
  or an @INDNTPOFF TLS offset.  */
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 96fa705f879..a5cd80fa446 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -157,6 +157,7 @@ extern void s390_indirect_branch_via_thunk (unsigned int 
regno,
rtx comparison_operator,
enum s390_indirect_branch_type 
type);
 extern void s390_indirect_branch_via_inline_thunk (rtx execute_target);
+extern bool s390_symbol_larl_p (rtx);
 #endif /* RTX_CODE */
 
 /* s390-c.c routines */
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 29a829f48ea..bac215f647c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2816,6 +2816,16 @@ s390_decompose_constant_pool_ref (rtx *ref, rtx *disp, 
bool *is_ptr,
   return true;
 }
 
+/* Return true iff SYMBOL_REF X can be used with a LARL instruction. */
+
+bool
+s390_symbol_larl_p (rtx x)
+{
+  return (!SYMBOL_FLAG_NOTALIGN2_P (x)
+ && SYMBOL_REF_TLS_MODEL (x) == 0
+ && s390_rel_address_ok_p (x));
+}
+
 /* Decompose a RTL expression ADDR for a memory address into
its components, returned in OUT.
 
@@ -8120,6 +8130,10 @@ annotate_constant_pool_refs (rtx *x)
   int i, j;
   const char *fmt;
 
+  /* Skip LARL operands, because they don't require a base register.  */
+  if (larl_operand (*x, VOIDmode))
+return;
+
   gcc_assert (GET_CODE (*x) != SYMBOL_REF
  || !CONSTANT_POOL_ADDRESS_P (*x));
 
@@ -8223,6 +8237,18 @@ find_constant_pool_ref (rtx x, rtx *ref)
   && XINT (x, 1) == UNSPECV_POOL_ENTRY)
 return;
 
+  if (SYMBOL_REF_P (x)
+  && CONSTANT_POOL_ADDRESS_P (x)
+  && s390_symbol_larl_p (x))
+{
+  if (*ref == NULL_RTX)
+   *ref = x;
+  else
+   gcc_assert (*ref == x);
+
+  return;
+}
+
   gcc_assert (GET_CODE (x) != SYMBOL_REF
  || !CONSTANT_POOL_ADDRESS_P (x));
 
@@ -8264,6 +8290,10 @@ replace_constant_pool_ref (rtx *x, rtx ref, rtx offset)
   int i, j;
   const char *fmt;
 
+  /* Skip LARL operands, because they don't require a base register.  */
+  if (larl_operand (*x, VOIDmode))
+return;
+
   gcc_assert (*x != ref);
 
   if (GET_CODE (*x) == UNSPEC
-- 
2.19.0



[PATCH] PR libstdc++/87809 avoid invalid expressions in exception specifications

2018-10-30 Thread Jonathan Wakely

If the allocator isn't default constructible then checking if the
default constructor throws in an exception specification makes the
declaration invalid. Use the type trait instead.

PR libstdc++/87809
* include/bits/forward_list.h (_Fwd_list_impl::_Fwd_list_impl()): Use
trait in exception-specification instead of possibly invalid
expression.
* include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()):
Likewise.
* include/bits/stl_list.h (_List_impl::_List_impl()): Likewise.
* include/bits/stl_vector.h (_Vector_impl::_Vector_impl()): Likewise.
* testsuite/23_containers/forward_list/cons/87809.cc: New test.
* testsuite/23_containers/list/cons/87809.cc: New test.
* testsuite/23_containers/vector/bool/cons/87809.cc: New test.
* testsuite/23_containers/vector/cons/87809.cc: New test.

Tested powerpc64le-linux, committed to trunk.

This needs to be backported to gcc-8-branch as well.
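
A reduced sketch of the change, outside libstdc++. The allocator-like types and the Impl template are hypothetical; the point is only that the trait is a valid constant expression for every Alloc, while the expression Alloc() is not even well-formed when Alloc has no default constructor.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical allocator-like types, for illustration only.
struct NoDefaultAlloc { explicit NoDefaultAlloc (int) { } };
struct NothrowAlloc   { NothrowAlloc () noexcept { } };
struct ThrowingAlloc  { ThrowingAlloc () { } };

template<typename Alloc>
struct Impl : Alloc
{
  // Old form (kept as a comment): the operand Alloc () is itself invalid
  // when Alloc has no default constructor.
  //   Impl () noexcept (noexcept (Alloc ())) : Alloc () { }

  // New form: ask the trait instead of evaluating the expression.
  Impl () noexcept (std::is_nothrow_default_constructible<Alloc>::value)
  : Alloc ()
  { }

  explicit Impl (const Alloc &a) : Alloc (a) { }
};

static_assert (std::is_nothrow_default_constructible<Impl<NothrowAlloc> >::value,
               "noexcept is propagated from the allocator");
static_assert (!std::is_nothrow_default_constructible<Impl<ThrowingAlloc> >::value,
               "a throwing allocator makes the constructor potentially throwing");

// Impl stays usable with an allocator that cannot be default constructed,
// as long as the default constructor of Impl is never called.
void check_impl ()
{
  Impl<NoDefaultAlloc> impl (NoDefaultAlloc (1));
  (void) impl;
  assert (!std::is_default_constructible<NoDefaultAlloc>::value);
}
```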


commit 4d9fc3aa6ee2bc8a5589e4b9709b3ea6c00c87be
Author: Jonathan Wakely 
Date:   Tue Oct 30 14:09:04 2018 +

PR libstdc++/87809 avoid invalid expressions in exception specifications

If the allocator isn't default constructible then checking if the
default constructor throws in an exception specification makes the
declaration invalid. Use the type trait instead.

PR libstdc++/87809
* include/bits/forward_list.h (_Fwd_list_impl::_Fwd_list_impl()): 
Use
trait in exception-specification instead of possibly invalid
expression.
* include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()):
Likewise.
* include/bits/stl_list.h (_List_impl::_List_impl()): Likewise.
* include/bits/stl_vector.h (_Vector_impl::_Vector_impl()): 
Likewise.
* testsuite/23_containers/forward_list/cons/87809.cc: New test.
* testsuite/23_containers/list/cons/87809.cc: New test.
* testsuite/23_containers/vector/bool/cons/87809.cc: New test.
* testsuite/23_containers/vector/cons/87809.cc: New test.

diff --git a/libstdc++-v3/include/bits/forward_list.h 
b/libstdc++-v3/include/bits/forward_list.h
index ebec3b5c818..1d01a54721b 100644
--- a/libstdc++-v3/include/bits/forward_list.h
+++ b/libstdc++-v3/include/bits/forward_list.h
@@ -293,7 +293,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_Fwd_list_node_base _M_head;
 
_Fwd_list_impl()
- noexcept( noexcept(_Node_alloc_type()) )
+ noexcept(is_nothrow_default_constructible<_Node_alloc_type>::value)
: _Node_alloc_type(), _M_head()
{ }
 
diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index 19c16839cfa..3752897272f 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -473,8 +473,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
: public _Bit_alloc_type, public _Bvector_impl_data
{
public:
- _Bvector_impl()
-   _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )
+ _Bvector_impl() _GLIBCXX_NOEXCEPT_IF(
+   is_nothrow_default_constructible<_Bit_alloc_type>::value)
  : _Bit_alloc_type()
  { }
 
diff --git a/libstdc++-v3/include/bits/stl_list.h 
b/libstdc++-v3/include/bits/stl_list.h
index 3544981698c..cb8aa88d548 100644
--- a/libstdc++-v3/include/bits/stl_list.h
+++ b/libstdc++-v3/include/bits/stl_list.h
@@ -372,7 +372,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   {
__detail::_List_node_header _M_node;
 
-   _List_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Node_alloc_type()) )
+   _List_impl() _GLIBCXX_NOEXCEPT_IF(
+   is_nothrow_default_constructible<_Node_alloc_type>::value)
: _Node_alloc_type()
{ }
 
diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 40debd62396..05f9b7ef6c3 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -125,7 +125,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   struct _Vector_impl
: public _Tp_alloc_type, public _Vector_impl_data
   {
-   _Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) )
+   _Vector_impl() _GLIBCXX_NOEXCEPT_IF(
+   is_nothrow_default_constructible<_Tp_alloc_type>::value)
: _Tp_alloc_type()
{ }
 
diff --git a/libstdc++-v3/testsuite/23_containers/forward_list/cons/87809.cc 
b/libstdc++-v3/testsuite/23_containers/forward_list/cons/87809.cc
new file mode 100644
index 000..cde9dfff3e0
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/forward_list/cons/87809.cc
@@ -0,0 +1,42 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as 

[PATCH] PR libstdc++/87784 fix dynamic_bitset::push_back

2018-10-30 Thread Jonathan Wakely

Previously the _M_Nb member was incremented before calling
_M_unchecked_set which meant that the bit being set was out of bounds.
It either set the wrong bit in an allocated word, or accessed beyond the
end of the allocated memory in the _M_w vector. The fix for the bug is
to update the _M_Nb member after using it as an index.

As an optimisation, when a new block needs to be appended the call to
_M_unchecked_set can be avoided by appending a block with the least
significant bit already set to the desired value.

PR libstdc++/87784
* include/tr2/dynamic_bitset (dynamic_bitset::push_back): When there
are no unused bits in the last block, append a new block with the
right value so the bit doesn't need to be set. Only increment size
after setting the new bit, not before.
* testsuite/tr2/dynamic_bitset/pr87784.cc: New test.

Tested powerpc64le-linux, committed to trunk.
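
The ordering described above can be sketched with a toy bitset. This is not the tr2 implementation: the class, its members, and the assumption that unused bits of the last block are always zero are all illustrative.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Toy bitset (hypothetical) that uses the old size as the index of the
// new bit and increments the size only afterwards.
class ToyBitset
{
  typedef unsigned long block_type;
  static constexpr std::size_t bits_per_block = CHAR_BIT * sizeof (block_type);

  std::vector<block_type> words_;
  std::size_t nbits_ = 0;

public:
  std::size_t size () const { return nbits_; }

  bool test (std::size_t pos) const
  { return (words_[pos / bits_per_block] >> (pos % bits_per_block)) & 1; }

  void push_back (bool bit)
  {
    if (nbits_ % bits_per_block == 0)
      // No unused bits left (or no block yet): append a block whose
      // least significant bit already holds the new value.
      words_.push_back (block_type (bit));
    else if (bit)
      // Unused bits are zero in this toy, so only a 1 needs writing.
      words_[nbits_ / bits_per_block]
        |= block_type (1) << (nbits_ % bits_per_block);
    ++nbits_;   // the new bit becomes part of the set only now
  }
};

// Push a pattern that crosses at least one block boundary and read it back.
void check_push_back ()
{
  ToyBitset b;
  for (std::size_t i = 0; i < 70; ++i)
    b.push_back (i % 3 == 0);
  assert (b.size () == 70);
  for (std::size_t i = 0; i < 70; ++i)
    assert (b.test (i) == (i % 3 == 0));
}
```

Incrementing nbits_ before computing the index, as the old code did with _M_Nb, would write one position too far and, at a block boundary, index past the last allocated word.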

commit 9c6be7dac22486040e1a724e8f6f83d5c34d0fb4
Author: Jonathan Wakely 
Date:   Tue Oct 30 12:30:09 2018 +

PR libstdc++/87784 fix dynamic_bitset::push_back

Previously the _M_Nb member was incremented before calling
_M_unchecked_set which meant that the bit being set was out of bounds.
It either set the wrong bit in an allocated word, or accessed beyond the
end of the allocated memory in the _M_w vector. The fix for the bug is
to update the _M_Nb member after using it as an index.

As an optimisation, when a new block needs to be appended the call to
_M_unchecked_set can be avoided by appending a block with the least
significant bit already set to the desired value.

PR libstdc++/87784
* include/tr2/dynamic_bitset (dynamic_bitset::push_back): When there
are no unused bits in the last block, append a new block with the
right value so the bit doesn't need to be set. Only increment size
after setting the new bit, not before.
* testsuite/tr2/dynamic_bitset/pr87784.cc: New test.

diff --git a/libstdc++-v3/include/tr2/dynamic_bitset 
b/libstdc++-v3/include/tr2/dynamic_bitset
index f76c8faf6e3..9e5c8170c81 100644
--- a/libstdc++-v3/include/tr2/dynamic_bitset
+++ b/libstdc++-v3/include/tr2/dynamic_bitset
@@ -727,10 +727,11 @@ namespace tr2
   void
   push_back(bool __bit)
   {
-   if (size_t __offset = this->size() % bits_per_block == 0)
- this->_M_do_append_block(block_type(0), this->_M_Nb);
+   if (this->size() % bits_per_block == 0)
+ this->_M_do_append_block(block_type(__bit), this->_M_Nb);
+   else
+ this->_M_unchecked_set(this->_M_Nb, __bit);
++this->_M_Nb;
-   this->_M_unchecked_set(this->_M_Nb, __bit);
   }
 
   /**
diff --git a/libstdc++-v3/testsuite/tr2/dynamic_bitset/pr87784.cc 
b/libstdc++-v3/testsuite/tr2/dynamic_bitset/pr87784.cc
new file mode 100644
index 000..52dc3893a31
--- /dev/null
+++ b/libstdc++-v3/testsuite/tr2/dynamic_bitset/pr87784.cc
@@ -0,0 +1,76 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+#include <tr2/dynamic_bitset>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  std::tr2::dynamic_bitset<> b;
+  VERIFY( b.size() == 0 );
+  VERIFY( b.find_first() == b.size() );
+  b.push_back(0);
+  VERIFY( b.size() == 1 );
+  VERIFY( b.find_first() == b.size() );
+  b.push_back(0);
+  VERIFY( b.size() == 2 );
+  VERIFY( b.find_first() == b.size() );
+
+  b.push_back(1);
+  VERIFY( b.size() == 3 );
+  VERIFY( b.find_first() == b.size() - 1 );
+  b.push_back(1);
+  VERIFY( b.size() == 4 );
+  VERIFY( b.find_first() == b.size() - 2 );
+  b.push_back(0);
+  VERIFY( b.size() == 5 );
+  VERIFY( b.find_first() == b.size() - 3 );
+
+  b.clear();
+  VERIFY( b.size() == 0 );
+  VERIFY( b.find_first() == b.size() );
+  b.push_back(1);
+  VERIFY( b.size() == 1 );
+  VERIFY( b.find_first() == 0 );
+  b.push_back(1);
+  VERIFY( b.size() == 2 );
+  VERIFY( b.find_first() == 0 );
+  b.push_back(1);
+  VERIFY( b.size() == 3 );
+  VERIFY( b.find_first() == 0 );
+
+  b.clear();
+  b.append(2u);
+  VERIFY( b.size() == b.bits_per_block );
+  VERIFY( b.find_first() == 1 );
+  b <<= 1;
+  VERIFY( b.find_first() == 2 );
+  b <<= 3;
+  VERIFY( b.find_first() == 5 );
+  b <<= 6;
+  VERIFY( 

Re: Ping: [PATCH, testsuite]: check for weak support

2018-10-30 Thread Paul Koning



> On Oct 30, 2018, at 10:17 AM, Jeff Law  wrote:
> 
> On 10/30/18 6:55 AM, Paul Koning wrote:
>> Ping.  Ok to commit?
>> 
>>  paul
>> 
>>> On Oct 25, 2018, at 2:57 PM, Paul Koning  wrote:
>>> 
>>> I ran into failures due to no weak symbol support in my target.  This 
>>> patch cures that.  Is it right?  The test case uses "weakref" so I'm not 
>>> 100% sure that checking for "weak" support is correct.  If not, I can put 
>>> in a skip-if check for the target (pdp11) instead.
>>> 
>>> paul
>>> 
>>> ChangeLog:
>>> 
>>> 2018-10-25  Paul Koning  
>>> 
>>> * gcc.dg/tree-ssa/attr-alias.c: Skip if no weak support.
> OK.  This would fall under the obvious rule IMHO.  There's a "weak"
> attribute so clearly the test needs to require weak support :-)
> 
> jeff

Thanks.  Committed.

paul



Re: Turn complete to incomplete types in free_lang_data

2018-10-30 Thread Richard Biener
On Mon, 29 Oct 2018, Jan Hubicka wrote:

> Hi,
> this is cleaner version of the patch.  During weekend I did some tests with
> firefox, libreoffice and gcc builds and it seems to work well. For firefox
> it reaches almost linear scalability of the ltrans files (they are 1.9GB,
> after increasing number of partitions to 128 they grow to 2GB that looks
> quite acceptable. Resulting .s and libxul binary are actually bigger than
> that when debug info is enabled).
> 
> So if that works in other cases, I will increas lto-partitions and probably
> declare it good enough for this stage1 and will try to move to other things.
> 
> Concerning the two things we have discussed, I am keeping the recursion to
> free_lang_data_in_type for now, as reordering seems just a temporary solution
> until we do more freeing from both types and decls (eventually I want to free
> subtypes of function types, which also bring in a lot of context; on Firefox
> they account for another 5% of stream data volume).
> 
> One option would be to change the walking order for this stage1 and worry
> about it next stage1.  The other option would be to simply push the newly created
> types to the fld queues.

If that works this sounds best.

> I also did not share fld_type_variant_equal_p with check_base_type since
> that one also checks context, which we do not want to do.  It could probably be
> cleaned up incrementally - I wonder why Objective-C needs it.
> 
> I looked into enabling free_lang_data by default.  My main problem there is
> what to do with late langhooks, say the one for variably_modified_type_p.
> I suppose we could just declare the middle-end self-contained wrt analyzing
> IL and try to disable them one by one after free lang data has run?

Yes, that was the original idea.  IIRC enabling free-lang-data 
unconditionally mostly works fine apart from some testcases issues
with late diagnostics and dump scanning.

> What, however, do we want to do about late warnings?  Do we have some idea how
> many there are and what kind of % modifiers need to be printed correctly?
> Say a late warning wants to print a type; how are we going to do that?

Well, we print the type in the middle-end way - basically 
install the LTO variant of the langhooks.

You fail to free fld_incomplete_types btw.  The patch looks sensible
with that change - possibly with removing the recursion and pushing
to the worklist instead.

Thanks,
Richard.

> Bootstrapped/regtested x86_64-linux.
> 
> Honza
>   * tree.c (free_lang_data_in_type): Forward declare.
>   (fld_type_variant_equal_p): New function.
>   (fld_type_variant): New function
>   (fld_incomplete_types): New hash.
>   (fld_incomplete_type_of): New function
>   (fld_simplfied-type): New function.
>   (free_lang_data_in_decl): New.
> Index: tree.c
> ===
> --- tree.c(revision 265573)
> +++ tree.c(working copy)
> @@ -265,6 +265,8 @@
>  static void print_debug_expr_statistics (void);
>  static void print_value_expr_statistics (void);
>  
> +static void free_lang_data_in_type (tree type);
> +
>  tree global_trees[TI_MAX];
>  tree integer_types[itk_none];
>  
> @@ -5038,6 +5041,115 @@
>  SET_EXPR_LOCATION (t, loc);
>  }
>  
> +/* Do same comparsion as check_qualified_type skipping lang part of type
> +   and be more permissive about type names: we only care that names are
> +   same (for diagnostics) and that ODR names are the same.  */
> +
> +static bool
> +fld_type_variant_equal_p (tree t, tree v)
> +{
> +  if (TYPE_QUALS (t) != TYPE_QUALS (v)
> +  || TYPE_NAME (t) != TYPE_NAME (v)
> +  || TYPE_ALIGN (t) != TYPE_ALIGN (v)
> +  || !attribute_list_equal (TYPE_ATTRIBUTES (t),
> + TYPE_ATTRIBUTES (v)))
> +return false;
> +
> +  return true;
> +}
> +
> +/* Find variant of FIRST that match T and create new one if necessary.  */
> +
> +static tree
> +fld_type_variant (tree first, tree t)
> +{
> +  if (first == TYPE_MAIN_VARIANT (t))
> +return t;
> +  for (tree v = first; v; v = TYPE_NEXT_VARIANT (v))
> +if (fld_type_variant_equal_p (t, v))
> +  return v;
> +  tree v = build_variant_type_copy (first);
> +  TYPE_READONLY (v) = TYPE_READONLY (t);
> +  TYPE_VOLATILE (v) = TYPE_VOLATILE (t);
> +  TYPE_ATOMIC (v) = TYPE_ATOMIC (t);
> +  TYPE_RESTRICT (v) = TYPE_RESTRICT (t);
> +  TYPE_ADDR_SPACE (v) = TYPE_ADDR_SPACE (t);
> +  TYPE_NAME (v) = TYPE_NAME (t);
> +  TYPE_ATTRIBUTES (v) = TYPE_ATTRIBUTES (t);
> +  return v;
> +}
> +
> +/* Map complete types to incomplete types.  */
> +
> +static hash_map<tree, tree> *fld_incomplete_types;
> +
> +/* For T being aggregate type try to turn it into a incomplete variant.
> +   Return T if no simplification is possible.  */
> +
> +static tree
> +fld_incomplete_type_of (tree t)
> +{
> +  if (!t)
> +return NULL;
> +  if (POINTER_TYPE_P (t))
> +{
> +  tree t2 = fld_incomplete_type_of (TREE_TYPE (t));
> +  if (t2 != TREE_TYPE (t))
> + {
> +   tree first;
> +  

Re: Ping: [PATCH, testsuite]: check for weak support

2018-10-30 Thread Jeff Law
On 10/30/18 6:55 AM, Paul Koning wrote:
> Ping.  Ok to commit?
> 
>   paul
> 
>> On Oct 25, 2018, at 2:57 PM, Paul Koning  wrote:
>>
>> I ran into failures due to no weak symbol support in my target.  This 
>> patch cures that.  Is it right?  The test case uses "weakref" so I'm not 100% 
>> sure that checking for "weak" support is correct.  If not, I can put in a 
>> skip-if check for the target (pdp11) instead.
>>
>>  paul
>>
>> ChangeLog:
>>
>> 2018-10-25  Paul Koning  
>>
>>  * gcc.dg/tree-ssa/attr-alias.c: Skip if no weak support.
OK.  This would fall under the obvious rule IMHO.  There's a "weak"
attribute so clearly the test needs to require weak support :-)

jeff


[PATCH] Simplify replace_trapping_overflow

2018-10-30 Thread Richard Biener


Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-10-30  Richard Biener  

* tree-eh.c (replace_trapping_overflow): Simplify ABS_EXPR case
using ABSU_EXPR.
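
A source-level analogue of what the simplified case computes. This is an assumption about the semantics for illustration, not the GIMPLE itself: the absolute value is produced in the unsigned type, where negation wraps instead of overflowing, and the patch then converts the result back to the original signed type.

```cpp
#include <cassert>
#include <climits>

// Absolute value computed entirely in unsigned arithmetic, so even
// INT_MIN has a well-defined result.
unsigned int absu (int x)
{
  unsigned int u = static_cast<unsigned int> (x);
  return x < 0 ? 0u - u : u;
}

void check_absu ()
{
  assert (absu (7) == 7u);
  assert (absu (-5) == 5u);
  // |INT_MIN| does not fit in int but is representable as unsigned.
  assert (absu (INT_MIN) == static_cast<unsigned int> (INT_MAX) + 1u);
}
```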

diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
index 1c7d9dc1d59..046405b8a15 100644
--- a/gcc/tree-eh.c
+++ b/gcc/tree-eh.c
@@ -2759,27 +2759,9 @@ replace_trapping_overflow (tree *tp, int *walk_subtrees, 
void *data)
 
   if (TREE_CODE (*tp) == ABS_EXPR)
{
- tree op = TREE_OPERAND (*tp, 0);
- op = save_expr (op);
- /* save_expr skips simple arithmetics, which is undesirable
-here, if it might trap due to flag_trapv.  We need to
-force a SAVE_EXPR in the COND_EXPR condition, to evaluate
-it before the comparison.  */
- if (EXPR_P (op)
- && TREE_CODE (op) != SAVE_EXPR
- && walk_tree (&op, find_trapping_overflow, NULL, NULL))
-   {
- op = build1_loc (EXPR_LOCATION (op), SAVE_EXPR, type, op);
- TREE_SIDE_EFFECTS (op) = 1;
-   }
- /* Change abs (op) to op < 0 ? -op : op and handle the NEGATE_EXPR
-like other signed integer trapping operations.  */
- tree cond = fold_build2 (LT_EXPR, boolean_type_node,
-  op, build_int_cst (type, 0));
- tree neg = fold_build1 (NEGATE_EXPR, utype,
- fold_convert (utype, op));
- *tp = fold_build3 (COND_EXPR, type, cond,
-fold_convert (type, neg), op);
+ TREE_SET_CODE (*tp, ABSU_EXPR);
+ TREE_TYPE (*tp) = utype;
+ *tp = fold_convert (type, *tp);
}
   else
{


[PATCH] Fix PRs70359/86270

2018-10-30 Thread Richard Biener


This picks up work from earlier this year where Aldy worked on
undoing forwprop during out-of-SSA to improve coalescing across
backedges.

The following patch first rectifies the existing code which
is meant to insert necessary copies in places where it then
allows coalescing and thus avoids splitting of the backedge.

The current code handles conflicts with uses in the exit condition
badly, which is why the patch inserts the copy before the definition
of the backedge value instead of on the backedge.  It also expands
the constraint of
handling only single-BB loops (because of trivially_conflicts_p
restrictions).  Also we can coalesce vars with different
SSA_NAME_VAR just fine now.

This makes the cases in the PRs keep their natural loop
form where originally they had their backedge split because
the inserted copies didn't do the job.

The testcases may be a bit awkward (and hopefully survive
solaris assembler woes...)

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

From 812187f4eabef40a42594bad48244047f3b37b89 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Tue, 30 Oct 2018 14:46:05 +0100
Subject: [PATCH] fix-pr70359

PR middle-end/70359
PR middle-end/86270
* tree-outof-ssa.c (insert_backedge_copies): Restrict
copy generation to useful cases.  Place the copy before
the definition of the backedge value when possible.

* gcc.target/i386/pr70359.c: New testcase.
* gcc.target/i386/pr86270.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/i386/pr70359.c 
b/gcc/testsuite/gcc.target/i386/pr70359.c
new file mode 100644
index 000..85b7017e386
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70359.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+char* inttostr(int i, char* buf, int len)
+{
+  unsigned int ui = (i > 0) ? i : -i;
+  char *p = buf + len - 1;
+  *p = '\0';
+  do {
+*--p = '0' + (ui % 10);
+  } while ((ui /= 10) != 0);
+  if (i < 0) {
+*--p = '-';
+  }
+  return p;
+}
+
+/* In out-of-SSA we should have avoided splitting the latch edge of the
+   loop by inserting copies.  */
+/* { dg-final { scan-assembler-times "L\[0-9\]+:" 2 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr86270.c 
b/gcc/testsuite/gcc.target/i386/pr86270.c
new file mode 100644
index 000..81841ef5bd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr86270.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int *a;
+long len;
+
+int
+test ()
+{
+  for (int i = 0; i < len + 1; i++)
+a[i]=i;
+}
+
+/* Check we do not split the backedge but keep nice loop form.  */
+/* { dg-final { scan-assembler-times "L\[0-9\]+:" 2 } } */
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index fca15e5f898..efb75c207e6 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -1171,15 +1171,19 @@ insert_backedge_copies (void)
{
  tree arg = gimple_phi_arg_def (phi, i);
  edge e = gimple_phi_arg_edge (phi, i);
+ if (!(e->flags & EDGE_DFS_BACK)
+ /* If the backedge is already split there's nothing
+to avoid.  */
+ || e->src != bb)
+   continue;
 
  /* If the argument is not an SSA_NAME, then we will need a
-constant initialization.  If the argument is an SSA_NAME with
-a different underlying variable then a copy statement will be
-needed.  */
- if ((e->flags & EDGE_DFS_BACK)
- && (TREE_CODE (arg) != SSA_NAME
- || SSA_NAME_VAR (arg) != SSA_NAME_VAR (result)
- || trivially_conflicts_p (bb, result, arg)))
+constant initialization.  If the argument is an SSA_NAME then
+a copy statement may be needed.  First handle the case
+where we cannot insert before the argument definition.  */
+ if (TREE_CODE (arg) != SSA_NAME
+ || (gimple_code (SSA_NAME_DEF_STMT (arg)) == GIMPLE_PHI
+ && trivially_conflicts_p (bb, result, arg)))
{
  tree name;
  gassign *stmt;
@@ -1226,6 +1230,34 @@ insert_backedge_copies (void)
gsi_insert_after (, stmt, GSI_NEW_STMT);
  SET_PHI_ARG_DEF (phi, i, name);
}
+ /* Insert a copy before the definition of the backedge value
+and adjust all conflicting uses.  */
+ else if (trivially_conflicts_p (bb, result, arg))
+   {
+ gimple *def = SSA_NAME_DEF_STMT (arg);
+ if (gimple_nop_p (def)
+ || gimple_code (def) == GIMPLE_PHI)
+   continue;
+ tree name = copy_ssa_name (result);
+ gimple *stmt = gimple_build_assign (name, result);
+ imm_use_iterator imm_iter;
+ 

Re: Fix D compilation on Solaris

2018-10-30 Thread Rainer Orth
Rainer Orth  writes:

> * On sparc, I didn't get that far, unfortunately: as I mentioned, many
>   compilations die with SIGBUS:
>
> libtool: compile:  /var/gcc/regression/trunk/11.5-gcc/build/./gcc/gdc 
> -B/var/gcc/regression/trunk/11.5-gcc/build/./gcc/ 
> -B/vol/gcc/sparc-sun-solaris2.11/bin/ -B/vol/gcc/sparc-sun-solaris2.11/lib/ 
> -isystem /vol/gcc/sparc-sun-solaris2.11/include -isystem 
> /vol/gcc/sparc-sun-solaris2.11/sys-include -fno-checking -fPIC -O2 -g 
> -nostdinc -I /vol/gcc/src/hg/trunk/local/libphobos/libdruntime -I . -c 
> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d 
> -fversion=Shared -o core/.libs/thread.o
> d21: internal compiler error: Bus Error
> 0xbb5507 crash_signal
> /vol/gcc/src/hg/trunk/local/gcc/toplev.c:325
> 0x518700 IntegerExp::toInteger()
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.c:2943
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> 0x4d1543 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6017
> 0x4d1543 interpret(Statement*, InterState*)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6024
> 0x4d263b interpretFunction
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:906
> 0x4d263b interpretFunction
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:726
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> 0x5243d7 DeclarationExp::accept(Visitor*)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.h:661
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> 0x4d1543 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6017
> 0x4d1543 interpret(Statement*, InterState*)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6024
> 0x4d263b interpretFunction
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:906
> 0x4d263b interpretFunction
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:726
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
> 0x4d14df interpret(Expression*, InterState*, CtfeGoal)
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5994
> 0x4d05c3 interpret
> /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
>
>   Will need to dig further here.

It's exactly as I suspected:

$ d21 /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d -mcpu=v9 
-fversion=Shared -I /vol/gcc/src/hg/trunk/local/libphobos/libdruntime -o 
thread.s

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x00518700 in IntegerExp::toInteger (this=)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.c:2942
2942return value;
(gdb) where
#0  0x00518700 in IntegerExp::toInteger (this=)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/expression.c:2942
#1  0x004d8f24 in Interpreter::visit(BinExp*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#2  0x005268ec in CmpExp::accept(Visitor*) ()
#3  0x004d05c4 in interpret (pue=0xffbfbbf4, e=0x22778f0, istate=0xffbfc144, 
goal=ctfeNeedRvalue)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:5984
#4  0x004d6254 in Interpreter::visit(ForStatement*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#5  0x005aec68 in ForStatement::accept(Visitor*) ()
#6  0x004d475c in Interpreter::visit(CompoundStatement*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#7  0x005b5708 in CompoundStatement::accept(Visitor*) ()
#8  0x004d4810 in Interpreter::visit(ScopeStatement*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#9  0x005aec08 in ScopeStatement::accept(Visitor*) ()
#10 0x004d475c in Interpreter::visit(CompoundStatement*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#11 0x005b5708 in CompoundStatement::accept(Visitor*) ()
#12 0x004d475c in Interpreter::visit(CompoundStatement*) ()
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/declaration.h:292
#13 0x005b5708 in CompoundStatement::accept(Visitor*) ()
#14 0x004d1544 in interpret (istate=, s=, 
pue=0xffbfc07c) at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6017
#15 interpret (s=0x22ea0f8, istate=0xffbfc144)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:6024
#16 0x004d263c in interpretFunction (thisarg=0x22f6198, 
arguments=, istate=0xffbfcb5c, fd=0x2270cc0)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:906
#17 interpretFunction (fd=0x2270cc0, istate=0xffbfcb5c, 
arguments=, thisarg=)
at /vol/gcc/src/hg/trunk/local/gcc/d/dmd/dinterpret.c:726
#18 0x004e22ac in 

Ping: [PATCH, testsuite]: check for weak support

2018-10-30 Thread Paul Koning
Ping.  Ok to commit?

paul

> On Oct 25, 2018, at 2:57 PM, Paul Koning  wrote:
> 
> I ran into failures due to no weak symbol support in my target.  This patch 
> cures that.  Is it right?  The test case uses "weakref" so I'm not 100% sure 
> that checking for "weak" support is correct.  If not, I can put in a skip-if 
> check for the target (pdp11) instead.
> 
>   paul
> 
> ChangeLog:
> 
> 2018-10-25  Paul Koning  
> 
>   * gcc.dg/tree-ssa/attr-alias.c: Skip if no weak support.
> 
> Index: testsuite/gcc.dg/tree-ssa/attr-alias.c
> ===
> --- testsuite/gcc.dg/tree-ssa/attr-alias.c(revision 265404)
> +++ testsuite/gcc.dg/tree-ssa/attr-alias.c(working copy)
> @@ -1,5 +1,6 @@
> /* { dg-do compile } */
> /* { dg-require-alias "" } */
> +/* { dg-require-weak "" } */
> /* { dg-options "-O2 -fdump-tree-optimized -std=gnu89" } */
> void abort (void);
> __attribute__ ((weak))
> 



Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Stafford Horne
Hello,

On Sun, Oct 28, 2018 at 05:54:47PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Oct 29, 2018 at 06:47:23AM +0900, Stafford Horne wrote:
> > On Sat, Oct 27, 2018 at 09:57:30PM -0500, Segher Boessenkool wrote:
> > > > +/* Helper for defining INITIAL_ELIMINATION_OFFSET.
> > > > +   We allow the following eliminiations:
> > > > + FP -> HARD_FP or SP
> > > > + AP -> HARD_FP or SP
> > > > +
> > > > +   HFP and AP are the same which is handled below.  */
> > > > +
> > > > +HOST_WIDE_INT
> > > > +or1k_initial_elimination_offset (int from, int to)
> > > 
> > > You could calculate this as  some_offset (from) - some_offset (to)  with
> > > some_offset a simple helper function.  That gives you all possible
> > > eliminations :-)
> > > 
> > > (Each offset is very cheap to compute in your case, so that's not a 
> > > problem).
> > 
> > Right, do you mean something like the following?  I think it would work,
> > but I am not sure it makes the code easier to read.  Do you think there
> > would be much benefit in supporting all possible eliminations?
> 
> Yes, like that.  It also easily can handle the other combos (those with
> STACK_POINTER), and it is easier if you have to switch FRAME_GROWS_DOWNWARD
> ("false" is better on some args, but "true" is required for ssp).
> 
> Your code is fine as-is of course.

Just to be clear, when you say 'as-is', do you mean the original v3 patch?  Or
are you referring to the follow-up patch I posted with the some_offset (from) -
some_offset (to) logic?
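
For readers following the thread, the some_offset (from) - some_offset (to)
idea can be sketched as below.  This is only an illustrative, self-contained
sketch: the enum, the frame_info struct and its fields, and the layout
(locals above the outgoing args, saved registers above the locals) are
assumptions made for the example, not the or1k port's real frame layout and
not the code from either version of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register numbers, for illustration only.  */
enum elim_reg { ELIM_SP, ELIM_HARD_FP, ELIM_FP, ELIM_AP };

/* Hypothetical frame description; all sizes are in bytes.  */
struct frame_info
{
  long args_size;   /* outgoing argument area, at the bottom */
  long local_size;  /* local variables, above the outgoing args */
  long save_size;   /* callee-saved registers, at the top */
};

/* Offset of register R from the stack pointer after the prologue,
   under the assumed layout above.  */
static long
some_offset (const struct frame_info *f, enum elim_reg r)
{
  assert (f != NULL);
  switch (r)
    {
    case ELIM_SP:
      return 0;
    case ELIM_FP:
      return f->args_size + f->local_size;
    case ELIM_HARD_FP:
    case ELIM_AP:
      /* HFP and AP are the same, as in the comment quoted above.  */
      return f->args_size + f->local_size + f->save_size;
    }
  return 0;
}

/* Every (from, to) pair falls out of one subtraction.  */
static long
initial_elimination_offset (const struct frame_info *f,
			    enum elim_reg from, enum elim_reg to)
{
  return some_offset (f, from) - some_offset (f, to);
}
```

With a frame of 16 bytes of outgoing args, 32 bytes of locals and 8 bytes of
saved registers, FP -> SP comes out as 48 and AP -> HARD_FP as 0 in this
model, and the combinations involving SP need no extra cases.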

-Stafford


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2018-10-30 Thread Martin Liška
On 10/30/18 11:03 AM, Jakub Jelinek wrote:
> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>> +hashtab_chk_error ()
>> +{
>> +  fprintf (stderr, "hash table checking failed: "
>> +   "equal operator returns true for a pair "
>> +   "of values with a different hash value");
> 
> BTW, either use internal_error here, or at least if using fprintf
> terminate with \n, in your recent mail I saw:
> ...different hash valueduring RTL pass: vartrack
> ^^

Sure, fixed in attached patch.

Martin

> 
>> +  gcc_unreachable ();
>> +}
> 
>   Jakub
> 

From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 29 Oct 2018 09:38:21 +0100
Subject: [PATCH] Sanitize equals and hash functions in hash-tables.

---
 gcc/hash-table.h | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index bd83345c7b8..694eedfc4be 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -503,6 +503,7 @@ private:
 
   value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
   value_type *find_empty_slot_for_expand (hashval_t);
+  void verify (const compare_type &comparable, hashval_t hash);
   bool too_empty_p (unsigned int);
   void expand ();
   static bool is_deleted (value_type )
@@ -882,8 +883,12 @@ hash_table
   if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
 expand ();
 
-  m_searches++;
+#if ENABLE_EXTRA_CHECKING
+if (insert == INSERT)
+  verify (comparable, hash);
+#endif
 
+  m_searches++;
   value_type *first_deleted_slot = NULL;
   hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
   hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
@@ -930,6 +935,39 @@ hash_table
   return _entries[index];
 }
 
+#if ENABLE_EXTRA_CHECKING
+
+/* Report a hash table checking error.  */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+static void
+hashtab_chk_error ()
+{
+  fprintf (stderr, "hash table checking failed: "
+	   "equal operator returns true for a pair "
+	   "of values with a different hash value\n");
+  gcc_unreachable ();
+}
+
+/* Verify that all existing elements in the hash table which are
+   equal to COMPARABLE have an equal HASH value provided as argument.  */
+
+template<typename Descriptor, template<typename Type> class Allocator>
+void
+hash_table<Descriptor, Allocator>
+::verify (const compare_type &comparable, hashval_t hash)
+{
+  for (size_t i = 0; i < m_size; i++)
+{
+  value_type *entry = &m_entries[i];
+  if (!is_empty (*entry) && !is_deleted (*entry)
+	  && hash != Descriptor::hash (*entry)
+	  && Descriptor::equal (*entry, comparable))
+	hashtab_chk_error ();
+}
+}
+#endif
+
 /* This function deletes an element with the given COMPARABLE value
from hash table starting with the given HASH.  If there is no
matching element in the hash table, this function does nothing. */
-- 
2.19.0
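
The invariant that verify () enforces (elements that compare equal must hash
to the same value) can be demonstrated outside GCC with a small sketch.
Everything below is hypothetical and only for illustration: the function
names are made up, plain non-negative ints stand in for table entries, and
none of this is the hash-table.h API.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int (*hash_fn) (int);
typedef bool (*equal_fn) (int, int);

/* Return false if some entry compares equal to CANDIDATE but hashes
   differently, i.e. the condition the patch reports as an error.  */
static bool
hash_equal_consistent (const int *entries, size_t n, int candidate,
		       hash_fn hash, equal_fn equal)
{
  unsigned int h = hash (candidate);
  for (size_t i = 0; i < n; i++)
    if (h != hash (entries[i]) && equal (entries[i], candidate))
      return false;
  return true;
}

/* Equality that only looks at the last decimal digit.  */
static bool
equal_mod10 (int a, int b)
{
  return a % 10 == b % 10;
}

/* Broken for equal_mod10: 1 and 11 are "equal" but hash differently.  */
static unsigned int
hash_identity (int v)
{
  return (unsigned int) v;
}

/* Consistent with equal_mod10.  */
static unsigned int
hash_mod10 (int v)
{
  return (unsigned int) (v % 10);
}
```

For entries {1, 2, 3} and candidate 11, the check fails with hash_identity
and passes with hash_mod10.  As in the patch, the scan is linear in the table
size, which is presumably why it is guarded by ENABLE_EXTRA_CHECKING.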



Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Stafford Horne
On Sun, Oct 28, 2018 at 01:56:29AM +, Richard Henderson wrote:
> On 10/27/18 5:37 AM, Stafford Horne wrote:
> > +(define_insn "zero_extendhisi2"
> > +  [(set (match_operand:SI 0 "register_operand""=r,r")
> > +   (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
> > +  ""
> > +  "@
> > +   l.exthz\t%0, %1
> > +   l.lhz\t%0, %1"
> > +  [(set_attr "insn_support" "sext,*")])
> > +
> > +(define_insn "zero_extendqisi2"
> > +  [(set (match_operand:SI 0 "register_operand""=r,r")
> > +   (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
> > +  ""
> > +  "@
> > +   l.extbz\t%0, %1
> > +   l.lbz\t%0, %1"
> > +  [(set_attr "insn_support" "sext,*")])
> 
> The !sext r/r case is just l.andi.

OK.

> > +;; Sign extension patterns
> > +
> > +;; We can do memory extensions with a single load
> > +(define_insn "extendhisi2"
> > +  [(set (match_operand:SI 0 "register_operand" "=r,r")
> > +   (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand"  "r,m")))]
> > +  ""
> > +  "@
> > +   l.exths\t%0, %1
> > +   l.lhs\t%0, %1"
> > +  [(set_attr "insn_support" "sext,*")])
> > +
> > +(define_insn "extendqisi2"
> > +  [(set (match_operand:SI 0 "register_operand" "=r,r")
> > +   (sign_extend:SI (match_operand:QI 1 "nonimmediate_operand"  "r,m")))]
> > +  ""
> > +  "@
> > +   l.extbs\t%0, %1
> > +   l.lbs\t%0, %1"
> > +  [(set_attr "insn_support" "sext,*")])
> 
> You don't really want to give the register allocator no choice but to spill to
> memory in the !sext case.  Another r/r case with a splitter that is conditional
> on !sext would work.

OK, I was just being lazy allowing the spill.  Do you think the split/expand
would be RTL using a left shift / right shift pair?  Can you think of something
more clever?  Since "real" hardware does not usually support shifts with an
immediate, we will need one instruction to load the shift amount, i.e.:

  l.ori %0, r0, 24
  l.sll %1, %1, %0
  l.sra %0, %1, %0

If we support shift with immediate it would just be:

  l.slli %1, %1, 24
  l.srai %0, %1, 24

But I can't think of anything better.
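
The shift pair above can be modelled in C to check that it really
sign-extends.  This is a minimal sketch, assuming 32-bit registers and an
arithmetic right shift on signed values (GCC provides this, but ISO C leaves
it implementation-defined); the function names are made up for the example.
The shift count of 24 matches the QImode case shown above, HImode would use
16, and the last helper mirrors the l.andi suggestion for zero extension.

```c
#include <stdint.h>

/* Sign-extend the low byte: shift it to the top, then shift back
   arithmetically so the sign bit is replicated.  */
static int32_t
sext_qi_via_shifts (uint32_t reg)
{
  uint32_t shifted = reg << 24;		/* l.slli %1, %1, 24 */
  return (int32_t) shifted >> 24;	/* l.srai %0, %1, 24 */
}

/* Same idea for a 16-bit value.  */
static int32_t
sext_hi_via_shifts (uint32_t reg)
{
  uint32_t shifted = reg << 16;
  return (int32_t) shifted >> 16;
}

/* Zero extension of the low byte needs only a mask.  */
static uint32_t
zext_qi_via_and (uint32_t reg)
{
  return reg & 0xff;			/* l.andi %0, %1, 0xff */
}
```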

> Otherwise, OK.

Thanks,

I am submitting patches on my git branch or1k-port-4, in case you want to
track progress.

 - Stafford


[PATCH] Fix __builtin_expect_with_probability documentation

2018-10-30 Thread Jonathan Wakely

* doc/extend.texi: Fix prototype and description of
__builtin_expect_with_probability.

Committed to trunk, as discussed on the gcc list.

commit 9fb8a9f94f673e7dd12b3a625829db6caaeab99e
Author: Jonathan Wakely 
Date:   Tue Oct 30 12:15:14 2018 +

Fix __builtin_expect_with_probability documentation

* doc/extend.texi: Fix prototype and description of
__builtin_expect_with_probability.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8772f3afe6b..4dbb2da39e4 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12025,12 +12025,12 @@ when testing pointer or floating-point values.
 @end deftypefn
 
 @deftypefn {Built-in Function} long __builtin_expect_with_probability
-(long @var{exp}, long @var{c}, long @var{probability})
+(long @var{exp}, long @var{c}, double @var{probability})
 
-The built-in has same semantics as @code{__builtin_expect},
-but user can provide expected probability (in percent) for value of @var{exp}.
-Last argument @var{probability} is of float type and valid values
-are in inclusive range 0.0f and 1.0f.
+This function has the same semantics as @code{__builtin_expect},
+but the caller provides the expected probability that @var{exp} == @var{c}.
+The last argument, @var{probability}, is a floating-point value in the
+range 0.0 to 1.0, inclusive.
 @end deftypefn
 
 @deftypefn {Built-in Function} void __builtin_trap (void)
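
A short usage sketch of the corrected prototype may help.  The wrapper macro
EXPECT_WITH_PROBABILITY and the function count_nonzero are hypothetical names
invented for this example; the fallback branch is only there so the snippet
still compiles with compilers that lack the built-in or lack __has_builtin.
Note that the probability is passed as a floating-point constant between 0.0
and 1.0, not as a percentage.

```c
/* Use the built-in when the compiler advertises it; otherwise fall
   back to the plain expression (no hint).  */
#if defined (__has_builtin)
# if __has_builtin(__builtin_expect_with_probability)
#  define EXPECT_WITH_PROBABILITY(exp, c, p) \
     __builtin_expect_with_probability ((exp), (c), (p))
# endif
#endif
#ifndef EXPECT_WITH_PROBABILITY
# define EXPECT_WITH_PROBABILITY(exp, c, p) (exp)
#endif

/* Count the non-zero elements of V[0..N-1].  */
static int
count_nonzero (const int *v, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    /* Hint: (v[i] != 0) == 1 is expected about 90% of the time.  */
    if (EXPECT_WITH_PROBABILITY (v[i] != 0, 1, 0.9))
      count++;
  return count;
}
```

The hint affects only optimization decisions such as block layout; the value
of the expression, and so the result of count_nonzero, is unchanged.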


Re: [PATCH] S/390: Allow LARL of literal pool entries

2018-10-30 Thread Ilya Leoshkevich



> Am 29.10.2018 um 19:45 schrieb Ulrich Weigand :
> 
> Ilya Leoshkevich wrote:
> 
>> 
>> UNSPEC_LTREF and friends are necessary in order to communicate the
>> dependency on the base register to pass_sched2.  When LARL is used, no
>> base register is necessary, so in such cases the rewrite must be
>> avoided.
> 
> This is true.  But something else must still be going on here.  Note that
> many other instruction patterns might contain constant pool addresses,
> since they are accepted e.g. by the 'b' constraint.  In all of those
> cases, we shouldn't add the UNSPEC_LTREF.  So just checking for the
> specific LARL instruction pattern in annotate_constant_pool_refs does
> not feel like a correct fix here.

I have changed the patch to skip all larl_operands, regardless of which
context they appear in.  Regtest is running.

> 
> In fact, before r265490, the pattern for movdi_larl could also contain a
> constant pool address, so why didn't the problem occur then?  What's the
> difference whether this is part of movdi_larl or just movdi?
> 

The difference is usage of "X" constraint.  Before, when we initially
chose movdi_larl, we could still put UNSPEC_LTREF inside it without
consequences, because during UNSPEC_LTREF lifetime only constraints are
checked.  Example:

pr59037.c.274r.reload: Recognized as movdi_larl.
(insn 7 9 14 2 (set (reg/f:DI 1 %r1 [65])
(const:DI (plus:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2])
(const_int 16 [0x10] 1269 {*movdi_larl})

pr59037.c.281r.early_mach: Rewrite with UNSPEC_LTREF.
pr59037.c.291r.cprop_hardreg: We fail here today, because constraints
  don't match.  We would have also failed in
  the past, if predicates were also checked.
(insn 7 9 14 2 (set (reg/f:DI 1 %r1 [65])
(plus:DI (unspec:DI [
(symbol_ref/u:DI ("*.LC0") [flags 0x2])
(reg:DI 5 %r5)
] UNSPEC_LTREF)
(const_int 16 [0x10]))) 1269 {*movdi_larl})

pr59037.c.306r.shorten: Re-recognized as la_64.
(insn 7 21 14 (set (reg/f:DI 1 %r1 [65])
(plus:DI (reg:DI 5 %r5)
(const:DI (plus:DI (unspec:DI [
(label_ref:DI 35)
(label_ref:DI 34)
] UNSPEC_POOL_OFFSET)
(const_int 16 [0x10]) 1272 {*la_64})


>> @@ -8184,7 +8200,8 @@ annotate_constant_pool_refs (rtx *x)
>>rtx addr = gen_rtx_UNSPEC (Pmode, gen_rtvec (2, sym, base),
>>   UNSPEC_LTREF);
>> 
>> -  SET_SRC (*x) = plus_constant (Pmode, addr, off);
>> +  SET_SRC (*x) = gen_rtx_CONST (Pmode,
>> +plus_constant (Pmode, addr, off));
> 
> This looks like an unrelated change ... it seems incorrect to me, given
> the UNSPEC_LTREF actually contains a register reference, so it shouldn't
> really be CONST.  (And if it were, why make the change just here and not
> everywhere a UNSPEC_LTREF is generated?)

You are right, this is a leftover from the first attempt to fix the
symptom: larl_operand did not match non-CONST PLUS rtxs.  I have removed
it from the patch and will send it separately later, if it somehow
proves useful.



Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-30 Thread Martin Liška
On 10/29/18 6:43 PM, Michael Ploujnikov wrote:
> Thanks for installing the patch while I figure out the SVN access.
> - Michael

Installed as r265621.

Martin


Re: [PATCH v3 3/3] or1k: gcc: initial support for openrisc

2018-10-30 Thread Stafford Horne
On Mon, Oct 29, 2018 at 04:42:43PM +, Richard Henderson wrote:
> On 10/29/18 4:34 PM, Segher Boessenkool wrote:
> > Is there some better documentation available?  This is what google found
> > for me.  I would have liked better docs (more compact, etc.)  Links to
> > such would be great to have in readings.html :-)
> 
> https://openrisc.io/architecture
> 
> and especially the v1.2 pdf linked from there
> 
> https://raw.githubusercontent.com/openrisc/doc/master/openrisc-arch-1.2-rev0.pdf

Thanks,

I meant to point this out during my previous reply.  Also, I will send a patch
for adding this to wwwdocs.

  https://www.gnu.org/software/gcc/readings.html

-Stafford


Re: [patch] Don't allow the pool allocator to be configured to allocate zero-sized objects

2018-10-30 Thread Richard Biener
On Tue, Oct 30, 2018 at 11:55 AM Richard Earnshaw (lists)
 wrote:
>
> PR bootstrap/87747 would have been significantly easier to track down if
> the pool allocator had faulted an attempt to configure it to allocate
> zero-sized objects.  Instead, this slipped through and we later hit
> memory corruption when the assumed size turned out to be different to
> the configured size.
>
> While, theoretically, there might be a use case for this, it seems
> unlikely to me that GCC would have such a use.  So this patch adds a
> checking assert that the object size is not zero.
>
> * alloc-pool.h (base_pool_allocator ::initialize): 
> Assert
> that the allocation size is not zero.
>
> OK?

OK.


Fix D compilation on Solaris

2018-10-30 Thread Rainer Orth
I just tried building D on Solaris 11/SPARC and x86 and ran into a
couple of issues.  The following patch is at least enough to have the
build finish on Solaris 11/x86, but on SPARC d21 runs into several BUS
errors (probably due to alignment issues).

One comment up front: I believe it would be good if libphobos had a
configure.tgt like several other target libraries so users won't run
into D related build failures with --enable-languages=all for targets
known not to work.

Here are the issues I ran into:

* On sparc, the build first aborted with

In file included from ./tm_d.h:7,
 from /vol/gcc/src/hg/trunk/local/gcc/config/default-d.c:21:
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sparc-protos.h:45:47: error: use 
of enum 'memmodel' without previous declaration
   45 | extern void sparc_emit_membar_for_model (enum memmodel, int, int);
  |   ^~~~

  Unlike glibc-d.c, default-d.c fails to include memmodel.h.  The patch
  below fixes that.

* However, default-d.c isn't very useful for Solaris.  Instead, I've
  added sol2-d.c which implements a proper TARGET_D_OS_VERSIONS.

* Next, the sparc build ran into

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/stdc/fenv.d:700:9: 
error: static assert  "Unimplemented architecture"
  700 | static assert(0, "Unimplemented architecture");
  | ^

  and indeed the file lacked SPARC definitions.  However, even the x86
  ones are really Linux/x86, so I added appropriate definitions for
  Solaris/x86, too.

* Next, I ran into

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/rt/sections.d:70:5: error: 
static assert  (is(typeof() == void* function() nothrow 
@nogc)) is false
   70 | static assert(is(typeof() == void* function() 
nothrow @nogc));
  | ^
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/rt/sections.d:70:5: error: 
static assert  (is(typeof() == void* function() nothrow 
@nogc)) is false
   70 | static assert(is(typeof() == void* function() 
nothrow @nogc));
  | ^

  To get me further along, I added dummy definitions to
  sections_solaris.d.  However, I wonder if it wouldn't be better to
  adapt sections_elf_shared.d to Solaris instead?

* Next error on sparc:

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/sys/posix/ucontext.d:951:26:
 error: undefined identifier '_NGREG'
  951 | alias greg_t[_NGREG] gregset_t;
  |  ^
d21: internal compiler error: Bus Error

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/sys/posix/ucontext.d:1017:21:
 error: undefined identifier 'fpregset_t', did you mean alias 'gregset_t'?
 1017 | fpregset_t  fpregs;
  | ^

  ucontext.d lacked several SPARC and SPARC64 definitions here.  While
  adding D versions of system types manually in a few cases is workable
  if tedious, I wonder if there isn't an easier way to do this.  Imagine
  a new port where there are many more definitions to add...

* Next error:

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d:989:21: error: 
cannot modify immutable expression m_isRTClass
  989 | m_isRTClass = true;
  | ^
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/thread.d:997:21: error: 
cannot modify immutable expression m_isRTClass
  997 | m_isRTClass = false;
  | ^

  It seems weird to assign to an immutable variable, so I removed that
  attribute.

* Next:

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/rt/sections_solaris.d:63:5: 
error: @nogc function 'rt.sections_solaris.initSections' cannot call non-@nogc 
function 'rt.sections_solaris.SectionGroup.moduleGroup'
   63 | _sections.moduleGroup = ModuleGroup(mbeg[0 .. mend - mbeg]);
  | ^
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/rt/sections_solaris.d:63:5: 
error: function 'rt.sections_solaris.SectionGroup.moduleGroup' is not nothrow
   63 | _sections.moduleGroup = ModuleGroup(mbeg[0 .. mend - mbeg]);
  | ^
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/rt/sections_solaris.d:59:6: 
error: nothrow function 'rt.sections_solaris.initSections' may throw
   59 | void initSections() nothrow @nogc
  |  ^

  Adding nothrow @nogc to moduleGroup definition fixed that.

* Next:

/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/sys/posix/aio.d:127:5: 
error: static assert  "Unsupported platform"
  127 | static assert(false, "Unsupported platform");
  | ^

  The file needed a definition of the Solaris versions.  The one I added
  is enough to get the build to continue, but may well need an
  additional largefile version...

* Next and final build error:

/vol/gcc/src/hg/trunk/local/libphobos/src/std/datetime/systime.d:229:25: 
error:undefined identifier ‘clock_gettime’
  229 | if (clock_gettime(clockArg, ) != 0)
  | ^

Re: [Patch, regrename] Fix PR87330 : ICE in scan_rtx_reg, at regrename.c

2018-10-30 Thread Sameera Deshpande
On Tue, 30 Oct 2018 at 16:16, Richard Earnshaw (lists)
 wrote:
>
> On 30/10/2018 10:09, Sameera Deshpande wrote:
> > On Tue, 9 Oct 2018 at 04:08, Eric Botcazou  wrote:
> >>
> >>> Other notes need not be changed, as they don't hold renamed register
> >>> information.
> >>>
> >>> Ok for trunk?
> >>
> >> No, REG_DEAD & REG_UNUSED note must be recomputed by passes consuming them.
> >>
> >>> 2018-10-09 Sameera Deshpande  >>>
> >>> * gcc/regrename.c (regrename_do_replace): Add condition to alter
> >>> regname if note has same register marked dead in notes.
> >>
> >> No gcc/ prefix in gcc/ChangeLog.
> >>
> >> --
> >> Eric Botcazou
> >
> > Hi Eric,
> >
> > Thanks for your comments.
> >
> > Please find attached updated patch invoking data flow for updating the
> > REG_DEAD and REG_UNUSED notes.
> >
> > As this change is made in falkor specific file, adding James and
> > Richard for review.
> >
> > Ok for trunk?
> >
> > Changelog:
> >
> > 2018-10-30 Sameera Deshpande  >
> > * gcc/config/aarch64/falkor-tag-collision-avoidance.c
> > (execute_tag_collision_avoidance): Invoke df_note_add_problem to
> > recompute REG_DEAD and REG_UNUSED notes before analysis.
> >
>
> 'Call df_note_add_problem.' is enough.
>
> OK with that change.
>
> R.
>
> >
> > bug87330.patch
> >
> > diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c 
> > b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> > index fb6568f..4ca9d66 100644
> > --- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> > +++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> > @@ -805,6 +805,7 @@ execute_tag_collision_avoidance ()
> >df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
> >df_chain_add_problem (DF_UD_CHAIN);
> >df_compute_regs_ever_live (true);
> > +  df_note_add_problem ();
> >df_analyze ();
> >df_set_flags (DF_DEFER_INSN_RESCAN);
> >
> >
>
Thanks Richard! Patch committed at revision 265618.

-- 
- Thanks and regards,
  Sameera D.


[patch] Don't allow the pool allocator to be configured to allocate zero-sized objects

2018-10-30 Thread Richard Earnshaw (lists)
PR bootstrap/87747 would have been significantly easier to track down if
the pool allocator had faulted an attempt to configure it to allocate
zero-sized objects.  Instead, this slipped through and we later hit
memory corruption when the assumed size turned out to be different to
the configured size.

While, theoretically, there might be a use case for this, it seems
unlikely to me that GCC would have such a use.  So this patch adds a
checking assert that the object size is not zero.

* alloc-pool.h (base_pool_allocator ::initialize): 
Assert
that the allocation size is not zero.

OK?

diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
index c0a12920558..d2ee0005761 100644
--- a/gcc/alloc-pool.h
+++ b/gcc/alloc-pool.h
@@ -256,6 +256,7 @@ base_pool_allocator ::initialize ()
   size_t size = m_size;
 
   gcc_checking_assert (m_name);
+  gcc_checking_assert (m_size);
 
   /* Make size large enough to store the list header.  */
   if (size < sizeof (allocation_pool_list*))
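
To illustrate why the early check is useful, here is a minimal sketch of a
pool initializer that rejects a zero object size before doing any size
adjustment.  The names simple_pool, free_node and simple_pool_initialize are
hypothetical and this is not the alloc-pool.h implementation; it only mirrors
the two steps visible in the hunk above (assert, then make the size large
enough for the free-list header), plus an assumed pointer-size rounding.

```c
#include <assert.h>
#include <stddef.h>

/* Free-list header stored inside each unused element.  */
struct free_node
{
  struct free_node *next;
};

struct simple_pool
{
  const char *name;
  size_t elt_size;	/* adjusted per-object size in bytes */
};

static void
simple_pool_initialize (struct simple_pool *pool, const char *name,
			size_t size)
{
  assert (name != NULL);
  /* Fail at configuration time rather than corrupting memory later.  */
  assert (size != 0);

  /* Make size large enough to store the list header.  */
  if (size < sizeof (struct free_node))
    size = sizeof (struct free_node);

  /* Round up to a multiple of the pointer size (an assumption of this
     sketch, not taken from the patch).  */
  size = (size + sizeof (void *) - 1) / sizeof (void *) * sizeof (void *);

  pool->name = name;
  pool->elt_size = size;
}
```

Without the second assert, a requested size of zero would silently be bumped
to the header size here, hiding the caller's mistake in much the same way the
message above describes.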


Re: [PATCH, GCC/ARM, ping3] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-30 Thread Thomas Preudhomme
Ping?

Best regards,

Thomas

On Tue, 23 Oct 2018 at 10:10, Thomas Preudhomme
 wrote:
>
> Ping?
>
> Best regards,
>
> Thomas
>
> On Mon, 15 Oct 2018 at 16:01, Thomas Preudhomme
>  wrote:
> >
> > Ping?
> >
> > Best regards,
> >
> > Thomas
> > On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme
> >  wrote:
> > >
> > > Hi Ramana and Kyrill,
> > >
> > > I've reworked the patch to add some documentation of the option
> > > conflict and reworked the -mword-relocation logic slightly to set the
> > > > variable explicitly in PIC mode rather than test for PIC and word
> > > relocation everywhere.
> > >
> > > ChangeLog entries are now as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-10-02  Thomas Preud'homme  
> > >
> > > PR target/87374
> > > * config/arm/arm.c (arm_option_check_internal): Disable the combined
> > > use of -mslow-flash-data and -mword-relocations.
> > > (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> > > * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> > > flag_pic.
> > > * doc/invoke.texi (-mword-relocations): Mention conflict with
> > > -mslow-flash-data.
> > > (-mslow-flash-data): Reciprocally.
> > >
> > > *** gcc/testsuite/ChangeLog ***
> > >
> > > 2018-09-25  Thomas Preud'homme  
> > >
> > > PR target/87374
> > > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> > > -mword-relocations would be passed when compiling the test.
> > > * gcc.target/arm/movsi_movt.c: Likewise.
> > > * gcc.target/arm/pr81863.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > >
> > > Is this ok for trunk?
> > >
> > > Best regards,
> > >
> > > Thomas
> > >
> > > On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
> > >  wrote:
> > > >
> > > > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > > > Hi Ramana,
> > > > >
> > > > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > > > >  wrote:
> > > > >>
> > > > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > > > >>> Hi Thomas,
> > > > >>>
> > > > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> > > >  Hi,
> > > > 
> > > >  GCC ICEs under -mslow-flash-data and -mword-relocations because 
> > > >  there
> > > >  is no way to load an address, both literal pools and MOVW/MOVT 
> > > >  being
> > > >  forbidden. This patch gives an error message when both options are
> > > >  specified by the user and adds the according dg-skip-if directives 
> > > >  for
> > > >  tests that use either of these options.
> > > > 
> > > >  ChangeLog entries are as follows:
> > > > 
> > > >  *** gcc/ChangeLog ***
> > > > 
> > > >  2018-09-25  Thomas Preud'homme  
> > > > 
> > > > PR target/87374
> > > > * config/arm/arm.c (arm_option_check_internal): Disable the 
> > > >  combined
> > > > use of -mslow-flash-data and -mword-relocations.
> > > > 
> > > >  *** gcc/testsuite/ChangeLog ***
> > > > 
> > > >  2018-09-25  Thomas Preud'homme  
> > > > 
> > > > PR target/87374
> > > > * gcc.target/arm/movdi_movt.c: Skip if both 
> > > >  -mslow-flash-data and
> > > > -mword-relocations would be passed when compiling the test.
> > > > * gcc.target/arm/movsi_movt.c: Likewise.
> > > > * gcc.target/arm/pr81863.c: Likewise.
> > > > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > > > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > > > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > > > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > > > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > > > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > > > 
> > > > 
> > > >  Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> > > >  targeting arm-none-eabi. Modified tests get skipped as expected 
> > > >  when
> > > >  running the testsuite with -mslow-flash-data (pr81863.c) or
> > > >  -mword-relocations (all the others).
> > > > 
> > > > 
> > > >  Is this ok for trunk? I'd also appreciate guidance on whether this 
> > > >  is
> > > >  worth a backport. It's a simple patch but on the other hand it only
> > > >  prevents some option combination, it does not fix anything so I 
> > > >  have
> > > >  mixed feelings.
> > > > >>>
> > > > >>> In my opinion -mslow-flash-data is more of a tuning option rather 
> > > > >>> than a security/ABI feature
> > > > 

Re: [Patch, regrename] Fix PR87330 : ICE in scan_rtx_reg, at regrename.c

2018-10-30 Thread Richard Earnshaw (lists)
On 30/10/2018 10:09, Sameera Deshpande wrote:
> On Tue, 9 Oct 2018 at 04:08, Eric Botcazou  wrote:
>>
>>> Other notes need not be changed, as they don't hold renamed register
>>> information.
>>>
>>> Ok for trunk?
>>
>> No, REG_DEAD & REG_UNUSED notes must be recomputed by passes consuming them.
>>
>>> 2018-10-09 Sameera Deshpande
>>> * gcc/regrename.c (regrename_do_replace): Add condition to alter
>>> regname if note has same register marked dead in notes.
>>
>> No gcc/ prefix in gcc/ChangeLog.
>>
>> --
>> Eric Botcazou
> 
> Hi Eric,
> 
> Thanks for your comments.
> 
> Please find attached updated patch invoking data flow for updating the
> REG_DEAD and REG_UNUSED notes.
> 
> As this change is made in a Falkor-specific file, I am adding James and
> Richard for review.
> 
> Ok for trunk?
> 
> Changelog:
> 
> 2018-10-30 Sameera Deshpande  
> * gcc/config/aarch64/falkor-tag-collision-avoidance.c
> (execute_tag_collision_avoidance): Invoke df_note_add_problem to
> recompute REG_DEAD and REG_UNUSED notes before analysis.
> 

'Call df_note_add_problem.' is enough.

OK with that change.

R.

> 
> bug87330.patch
> 
> diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c 
> b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> index fb6568f..4ca9d66 100644
> --- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> +++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
> @@ -805,6 +805,7 @@ execute_tag_collision_avoidance ()
>    df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
>    df_chain_add_problem (DF_UD_CHAIN);
>    df_compute_regs_ever_live (true);
> +  df_note_add_problem ();
>    df_analyze ();
>    df_set_flags (DF_DEFER_INSN_RESCAN);
>  
> 



Re: hash-table violation in gcc/cp/pt.c

2018-10-30 Thread Martin Liška
On 10/30/18 11:25 AM, Martin Liška wrote:
> On 10/29/18 12:04 PM, Martin Liška wrote:
>> 3) lookup_template_class_1
>>
>> $ ./xg++ -B. 
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C -c 
>> -fchecking=3
>> hash table checking failed: equal operator returns true for a pair of values 
>> with a different hash 
>> value/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C: In 
>> instantiation of ‘struct B’:
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:15:8:   
>> required from here
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:8:17: 
>> internal compiler error: in find_slot_with_hash, at hash-table.h:905
>> 8 | friend bool foo (const B& a);
>>   | ^~~
>> 0xa265a4 hash_table> xcallocator>::find_slot_with_hash(spec_entry* const&, unsigned int, 
>> insert_option)
>>  /home/marxin/Programming/gcc/gcc/hash-table.h:905
>> 0xa042ce lookup_template_class_1
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:9629
>> 0xa042ce lookup_template_class(tree_node*, tree_node*, tree_node*, 
>> tree_node*, int, int)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:9674
>> 0xa03670 tsubst_aggr_type
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:12679
>> 0x9fefcd tsubst(tree_node*, tree_node*, int, tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:14294
>> 0x9fe1a9 tsubst(tree_node*, tree_node*, int, tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:14285
>> 0xa0d8bd tsubst_arg_types
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:13891
>> 0xa0dc24 tsubst_function_type
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:14032
>> 0x9fe790 tsubst(tree_node*, tree_node*, int, tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:14769
>> 0x9f2c7c tsubst_function_decl
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:12921
>> 0xa02d27 tsubst_template_decl
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:13214
>> 0x9f4416 tsubst_decl
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:13316
>> 0x9ff0ca tsubst(tree_node*, tree_node*, int, tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:14212
>> 0xa1dfd0 tsubst_friend_function
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:10310
>> 0xa1dfd0 instantiate_class_template_1
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:11359
>> 0xa1dfd0 instantiate_class_template(tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/pt.c:11424
>> 0xa66b22 complete_type(tree_node*)
>>  /home/marxin/Programming/gcc/gcc/cp/typeck.c:138
>> 0x9023c7 start_decl_1(tree_node*, bool)
>>  /home/marxin/Programming/gcc/gcc/cp/decl.c:5278
>> 0x92a15f start_decl(cp_declarator const*, cp_decl_specifier_seq*, int, 
>> tree_node*, tree_node*, tree_node**)
>>  /home/marxin/Programming/gcc/gcc/cp/decl.c:5241
>> 0x9c1944 cp_parser_init_declarator
>>  /home/marxin/Programming/gcc/gcc/cp/parser.c:19750
> 
> This one is about an inconsistency between:
> 
> /* Returns a hash for a template TMPL and template arguments ARGS.  */
> 
> static hashval_t
> hash_tmpl_and_args (tree tmpl, tree args)
> {
>   hashval_t val = iterative_hash_object (DECL_UID (tmpl), 0);
>   return iterative_hash_template_arg (args, val);
> }
> 
> iterative_hash_template_arg is problematic; it is inconsistent with:
> bool
> spec_hasher::equal (spec_entry *e1, spec_entry *e2)
> {
>   int equal;
> 
>   ++comparing_specializations;
>   equal = (e1->tmpl == e2->tmpl
>  && comp_template_args (e1->args, e2->args));
> ...
> 
> where comp_template_args (e1->args, e2->args) returns true, but
> iterative_hash_template_arg values are different.
> 
> Could a C++ maintainer please take a look?
> 
> Thanks,
> Martin
> 

The same spec_hasher type is also involved in 4)

0xa265a4 hash_table::find_slot_with_hash(spec_entry* 
const&, unsigned int, insert_option)
/home/marxin/Programming/gcc/gcc/hash-table.h:905
0x9e35e6 register_specialization
/home/marxin/Programming/gcc/gcc/cp/pt.c:1534
0xa22ac3 check_explicit_specialization(tree_node*, tree_node*, int, int, 
tree_node*)
/home/marxin/Programming/gcc/gcc/cp/pt.c:3243
0x91552d grokfndecl
/home/marxin/Programming/gcc/gcc/cp/decl.c:9106
...

Martin


hash-table violation in gcc/cp/pt.c

2018-10-30 Thread Martin Liška
On 10/29/18 12:04 PM, Martin Liška wrote:
> 3) lookup_template_class_1
> 
> $ ./xg++ -B. 
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C -c 
> -fchecking=3
> hash table checking failed: equal operator returns true for a pair of values 
> with a different hash 
> value/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C: In 
> instantiation of ‘struct B’:
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:15:8:   
> required from here
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/template/ttp23.C:8:17: 
> internal compiler error: in find_slot_with_hash, at hash-table.h:905
> 8 | friend bool foo (const B& a);
>   | ^~~
> 0xa265a4 hash_table xcallocator>::find_slot_with_hash(spec_entry* const&, unsigned int, 
> insert_option)
>   /home/marxin/Programming/gcc/gcc/hash-table.h:905
> 0xa042ce lookup_template_class_1
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:9629
> 0xa042ce lookup_template_class(tree_node*, tree_node*, tree_node*, 
> tree_node*, int, int)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:9674
> 0xa03670 tsubst_aggr_type
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:12679
> 0x9fefcd tsubst(tree_node*, tree_node*, int, tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:14294
> 0x9fe1a9 tsubst(tree_node*, tree_node*, int, tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:14285
> 0xa0d8bd tsubst_arg_types
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:13891
> 0xa0dc24 tsubst_function_type
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:14032
> 0x9fe790 tsubst(tree_node*, tree_node*, int, tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:14769
> 0x9f2c7c tsubst_function_decl
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:12921
> 0xa02d27 tsubst_template_decl
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:13214
> 0x9f4416 tsubst_decl
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:13316
> 0x9ff0ca tsubst(tree_node*, tree_node*, int, tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:14212
> 0xa1dfd0 tsubst_friend_function
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:10310
> 0xa1dfd0 instantiate_class_template_1
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:11359
> 0xa1dfd0 instantiate_class_template(tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/pt.c:11424
> 0xa66b22 complete_type(tree_node*)
>   /home/marxin/Programming/gcc/gcc/cp/typeck.c:138
> 0x9023c7 start_decl_1(tree_node*, bool)
>   /home/marxin/Programming/gcc/gcc/cp/decl.c:5278
> 0x92a15f start_decl(cp_declarator const*, cp_decl_specifier_seq*, int, 
> tree_node*, tree_node*, tree_node**)
>   /home/marxin/Programming/gcc/gcc/cp/decl.c:5241
> 0x9c1944 cp_parser_init_declarator
>   /home/marxin/Programming/gcc/gcc/cp/parser.c:19750

This one is about an inconsistency between:

/* Returns a hash for a template TMPL and template arguments ARGS.  */

static hashval_t
hash_tmpl_and_args (tree tmpl, tree args)
{
  hashval_t val = iterative_hash_object (DECL_UID (tmpl), 0);
  return iterative_hash_template_arg (args, val);
}

iterative_hash_template_arg is problematic; it is inconsistent with:
bool
spec_hasher::equal (spec_entry *e1, spec_entry *e2)
{
  int equal;

  ++comparing_specializations;
  equal = (e1->tmpl == e2->tmpl
   && comp_template_args (e1->args, e2->args));
...

where comp_template_args (e1->args, e2->args) returns true, but
iterative_hash_template_arg values are different.
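
To illustrate the invariant the checker enforces (keys that compare
equal must hash alike), here is a minimal standalone sketch.  It does
not use GCC's hash_table or the real pt.c functions: loose_equal,
raw_hash, folded_hash and hash_agrees_with_equal are hypothetical
stand-ins, with letter case playing the role of whatever detail
comp_template_args ignores but the hash still sees.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for comp_template_args: an equality test that
// ignores one detail of the key (here, letter case).
static bool loose_equal(const std::string &a, const std::string &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (std::tolower(static_cast<unsigned char>(a[i]))
        != std::tolower(static_cast<unsigned char>(b[i])))
      return false;
  return true;
}

// Hypothetical stand-in for the problematic hash: it still depends on
// the detail that loose_equal ignores.
static std::size_t raw_hash(const std::string &s) {
  std::size_t h = 0;
  for (unsigned char c : s)
    h = h * 31 + c;
  return h;
}

// A hash that agrees with loose_equal: fold case first, then hash.
static std::size_t folded_hash(const std::string &s) {
  std::size_t h = 0;
  for (unsigned char c : s)
    h = h * 31 + static_cast<unsigned char>(std::tolower(c));
  return h;
}

// Returns true if every pair of equal keys has the same hash value.
template <typename Hash>
static bool hash_agrees_with_equal(const std::vector<std::string> &keys,
                                   Hash hash) {
  for (const std::string &a : keys)
    for (const std::string &b : keys)
      if (loose_equal(a, b) && hash(a) != hash(b))
        return false;
  return true;
}

// Small driver, meant to be called from main().
static void demo() {
  const std::vector<std::string> keys = {"Foo", "foo", "bar"};
  assert(!hash_agrees_with_equal(keys, raw_hash));    // invariant broken
  assert(hash_agrees_with_equal(keys, folded_hash));  // invariant holds
}
```

In this sketch the fix is to make the hash ignore the same detail as
the equality test; whether pt.c should change the hash or the
comparison is a separate question for the maintainers.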

Could a C++ maintainer please take a look?

Thanks,
Martin


Re: [AArch64] Add Saphira pipeline description.

2018-10-30 Thread Sameera Deshpande
On Fri, 26 Oct 2018 at 13:33, Sameera Deshpande
 wrote:
>
> Hi!
>
> Please find attached the patch to add a pipeline description for the
> Qualcomm Saphira core.  It is tested with a bootstrap and make check,
> with no regressions.
>
> Ok for trunk?
>
> gcc/
> Changelog:
>
> 2018-10-26 Sameera Deshpande 
>
> * config/aarch64/aarch64-cores.def (saphira): Use saphira pipeline.
> * config/aarch64/aarch64.md: Include saphira.md
> * config/aarch64/saphira.md: New file for pipeline description.
>
> --
> - Thanks and regards,
>   Sameera D.

Hi!

Please find attached the updated patch.
Bootstrap and make check passed without regression. Ok for trunk?

-- 
- Thanks and regards,
  Sameera D.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 3d876b8..8e4c646 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -90,7 +90,7 @@ AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2
 /* ARMv8.4-A Architecture Processors.  */
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("saphira", saphira,falkor,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
+AARCH64_CORE("saphira", saphira,saphira,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
 
 /* ARMv8-A big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a014a01..f951354 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -298,6 +298,7 @@
 (include "../arm/cortex-a57.md")
 (include "../arm/exynos-m1.md")
 (include "falkor.md")
+(include "saphira.md")
 (include "thunderx.md")
 (include "../arm/xgene1.md")
 (include "thunderx2t99.md")
diff --git a/gcc/config/aarch64/saphira.md b/gcc/config/aarch64/saphira.md
new file mode 100644
index 000..bbf1c5c
--- /dev/null
+++ b/gcc/config/aarch64/saphira.md
@@ -0,0 +1,583 @@
+;; Saphira pipeline description
+;; Copyright (C) 2017-2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+(define_automaton "saphira")
+
+;; Complex int instructions (e.g. multiply and divide) execute in the X
+;; pipeline.  Simple int instructions execute in the X, Y, Z and B pipelines.
+
+(define_cpu_unit "saphira_x" "saphira")
+(define_cpu_unit "saphira_y" "saphira")
+
+;; Branches execute in the Z or B pipeline or in one of the int pipelines depending
+;; on how complex it is.  Simple int insns (like movz) can also execute here.
+
+(define_cpu_unit "saphira_z" "saphira")
+(define_cpu_unit "saphira_b" "saphira")
+
+;; Vector and FP insns execute in the VX and VY pipelines.
+
+(define_automaton "saphira_vfp")
+
+(define_cpu_unit "saphira_vx" "saphira_vfp")
+(define_cpu_unit "saphira_vy" "saphira_vfp")
+
+;; Loads execute in the LD pipeline.
+;; Stores execute in the ST pipeline, for address, data, and
+;; vector data.
+
+(define_automaton "saphira_mem")
+
+(define_cpu_unit "saphira_ld" "saphira_mem")
+(define_cpu_unit "saphira_st" "saphira_mem")
+
+;; The GTOV and VTOG pipelines are for general to vector reg moves, and vice
+;; versa.
+
+(define_cpu_unit "saphira_gtov" "saphira")
+(define_cpu_unit "saphira_vtog" "saphira")
+
+;; Common reservation combinations.
+
+(define_reservation "saphira_vxvy" "saphira_vx|saphira_vy")
+(define_reservation "saphira_zb"   "saphira_z|saphira_b")
+(define_reservation "saphira_xyzb" "saphira_x|saphira_y|saphira_z|saphira_b")
+
+;; SIMD Floating-Point Instructions
+
+(define_insn_reservation "saphira_afp_1_vxvy" 1
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" "neon_fp_neg_s,neon_fp_neg_d,neon_fp_abs_s,neon_fp_abs_d,neon_fp_neg_s_q,neon_fp_neg_d_q,neon_fp_abs_s_q,neon_fp_abs_d_q"))
+  "saphira_vxvy")
+
+(define_insn_reservation "saphira_afp_2_vxvy" 2
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" "neon_fp_minmax_s,neon_fp_minmax_d,neon_fp_reduc_minmax_s,neon_fp_reduc_minmax_d,neon_fp_compare_s,neon_fp_compare_d,neon_fp_round_s,neon_fp_round_d,neon_fp_minmax_s_q,neon_fp_minmax_d_q,neon_fp_compare_s_q,neon_fp_compare_d_q,neon_fp_round_s_q,neon_fp_round_d_q"))
+  "saphira_vxvy")
+
+(define_insn_reservation "saphira_afp_3_vxvy" 3
+  (and (eq_attr "tune" "saphira")
+   (eq_attr "type" 

Re: Add a loop versioning pass

2018-10-30 Thread Richard Biener


(Sorry for breaking threading -- I composed a review mail offline, but
gmail has no way of sending that nicely, nor does it have a way to
bounce messages...)

> This patch adds a pass that versions loops with variable index strides
> for the case in which the stride is 1.  E.g.:
> 
> for (int i = 0; i < n; ++i)
>   x[i * stride] = ...;
> 
> becomes:
> 
> if (stride == 1)
>   for (int i = 0; i < n; ++i)
>     x[i] = ...;
> else
>   for (int i = 0; i < n; ++i)
>     x[i * stride] = ...;
> 
> This is useful for both vector code and scalar code, and in some cases
> can enable further optimisations like loop interchange or pattern
> recognition.
> 
> The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3
> and a 2.4% improvement for 465.tonto.  I haven't found any SPEC tests
> that regress.
> 
> Sizewise, there's a 10% increase in .text for both 554.roms_r and
> 465.tonto.  That's obviously a lot, but in tonto's case it's because
> the whole program is written using assumed-shape arrays and pointers,
> so a large number of functions really do benefit from versioning.
> roms likewise makes heavy use of assumed-shape arrays, and that
> improvement in performance IMO justifies the code growth.

Ouch.  I know that at least with LTO IPA-CP can do "quite" some
propagation of constant strides.  Not sure if we're aggressive
enough in actually doing the cloning for all cases we figure out
strides though.  But my question is how we can avoid doing the
versioning for loops in the copy that did not have the IPA-CPed
stride of one?  Ideally we'd be able to mark individual references
as {definitely,likely,unlikely,not}-unit-stride?
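
For readers following along, the transformation quoted above can be
written out by hand roughly as follows.  This is only a source-level
sketch of the idea, not what the pass emits; fill_strided,
fill_versioned and same_result are hypothetical names, and the stride
is assumed to be at least 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: every store multiplies the index by a run-time stride.
static void fill_strided(std::vector<int> &x, std::size_t n,
                         std::size_t stride) {
  for (std::size_t i = 0; i < n; ++i)
    x[i * stride] = static_cast<int>(i) + 1;
}

// Hand-versioned form: the stride == 1 copy has contiguous accesses,
// the other copy is the original loop unchanged.
static void fill_versioned(std::vector<int> &x, std::size_t n,
                           std::size_t stride) {
  if (stride == 1) {
    for (std::size_t i = 0; i < n; ++i)
      x[i] = static_cast<int>(i) + 1;
  } else {
    for (std::size_t i = 0; i < n; ++i)
      x[i * stride] = static_cast<int>(i) + 1;
  }
}

// Both forms must store the same values (stride assumed >= 1).
static bool same_result(std::size_t n, std::size_t stride) {
  std::vector<int> a(n * stride, 0), b(n * stride, 0);
  fill_strided(a, n, stride);
  fill_versioned(b, n, stride);
  return a == b;
}
```

The duplicated loop body is where the .text growth discussed above
comes from: every versioned loop exists twice in the output.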

> The next biggest .text increase is 4.5% for 548.exchange2_r.  I did see
> a small (0.4%) speed improvement there, but although both 3-iteration runs
> produced stable results, that might still be noise.  There was a slightly
> larger (non-noise) improvement for a 256-bit SVE model.
> 
> 481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but
> without any noticeable improvement in performance.  No other test grew
> by more than 2%.
> 
> Although the main SPEC beneficiaries are all Fortran tests, the
> benchmarks we use for SVE also include some C and C++ tests that
> benefit.

Did you see any slowdown, for example because versioning was forced
to be on an innermost loop?  I'm thinking of the testcase in
PR87561 where we do have strided accesses in the innermost loop.

Since you cite performance numbers, how did you measure them?
I assume -Ofast -march=native, but did you check with -flto?

> Using -frepack-arrays gives the same benefits in many Fortran cases.
> The problem is that using that option inappropriately can force a full
> array copy for arguments that the function only reads once, and so it
> isn't really something we can turn on by default.  The new pass is
> supposed to give most of the benefits of -frepack-arrays without
> the risk of unnecessary repacking.
> 
> The patch therefore enables the pass by default at -O3.

I think that's reasonable.

One possible enhancement would be to add a value-profile for the
strides so we can guide this optimization better.

The pass falls foul of the C++ class habit of making a small method out
of everything.  That makes the code very hard to follow.  Please inline
single-use methods into their callers wherever possible so that the
code reads more like GCC code (using the GCC API).

The pass contains an awful lot of heuristics :/  As with the
interchange pass last year, I would suggest ripping most of it out
and first laying down the infrastructure for the cases you can
positively identify without applying heuristics or "hacks" like
stripping semantically required casts.  That also makes it clear
which testcases test which code path.  That said, all the analysis of
multiplications/pluses/factors was extremely hard to review, and I
have no overall picture of why this is all so complicated or
necessary.

> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2018-10-24  Richard Sandiford  
> 
> gcc/
>   * doc/invoke.texi (-fversion-loops-for-strides): Document
>   (loop-versioning-group-size, loop-versioning-max-inner-insns)
>   (loop-versioning-max-outer-insns): Document new --params.
>   * Makefile.in (OBJS): Add gimple-loop-versioning.o.
>   * common.opt (fversion-loops-for-strides): New option.
>   * opts.c (default_options_table): Enable fversion-loops-for-strides
>   at -O3.
>   * params.def (PARAM_LOOP_VERSIONING_GROUP_SIZE)
>   (PARAM_LOOP_VERSIONING_MAX_INNER_INSNS)
>   (PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS): New parameters.
>   * passes.def: Add pass_loop_versioning.
>   * timevar.def (TV_LOOP_VERSIONING): New time variable.
>   * tree-ssa-propagate.h
>   (substitute_and_fold_engine::substitute_and_fold): Add an optional
>   block parameter.
>   * tree-ssa-propagate.c
>   (substitute_and_fold_engine::substitute_and_fold): 
