Re: PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread Jakub Jelinek
On Wed, Feb 13, 2019 at 12:43:32AM +, Joseph Myers wrote:
> On Wed, 13 Feb 2019, Jakub Jelinek wrote:
> 
> > On Tue, Feb 12, 2019 at 11:21:04PM +, Joseph Myers wrote:
> > > I think this is changing architecture-independent code in a way that is 
> > > not clearly safe based on the architecture-independent options design, in 
> > > order to address an architecture-specific problem.  The exclusion of 
> > 
> > Actually, I think it is a problem common to many backends, in particular
> > those where *_host_detect_local_cpu emits for -m*=native sometimes more than
> > one option, so at least i386, s390, rs6000, maybe also those that emit just
> > one option because it likely ends up at a different spot on the command line
> > from where -m{arch,cpu,tune}=native was originally present (that would be
> > aarch64, alpha, arm, mips and sparc).  I guess the user expectation is that
> > -march=native -march=foobar will be handled as
> > -march=foobar, rather than -march=native -march=foobar -march=my_great_cpu 
> > -mfoo -mbar
> 
> It seems right in the march= case to handle that combination as 
> -march=foobar - but it's less clear if that must always be the case for 
> Joined options with negative versions (at least, the semantics would need 
> defining more carefully in options.texi, with an analysis of existing 
> affected options).

We have only a few Joined/JoinedOrMissing options with Negative:
find . -name \*.opt | xargs grep -B1 'Joined.*[[:blank:]]Negative\|[[:blank:]]Negative.*Joined\|^Negative.*Joined'
./config/s390/s390.opt-mstack-guard=
./config/s390/s390.opt:Target RejectNegative Negative(mno-stack-guard) Joined UInteger Var(s390_stack_guard) Save
./common.opt-gdwarf
./common.opt:Common Driver JoinedOrMissing Negative(gdwarf-)
./common.opt-gdwarf-
./common.opt:Common Driver Joined UInteger Var(dwarf_version) Init(4) Negative(gstabs)
./common.opt-gstabs
./common.opt:Common Driver JoinedOrMissing Negative(gstabs+)
./common.opt-gstabs+
./common.opt:Common Driver JoinedOrMissing Negative(gvms)
./common.opt-gvms
./common.opt:Common Driver JoinedOrMissing Negative(gxcoff)
./common.opt-gxcoff
./common.opt:Common Driver JoinedOrMissing Negative(gxcoff+)
./common.opt-gxcoff+
./common.opt:Common Driver JoinedOrMissing Negative(gdwarf)
./fortran/lang.opt-cpp=
./fortran/lang.opt:Fortran Joined Negative(nocpp) Undocumented NoDWARFRecord

The patch does indeed change behavior for, say:
gcc -c test.s -gstabs2 -gdwarf-4 -gstabs3
gcc: error: debug format ‘dwarf-2’ conflicts with prior selection
gcc: error: debug format ‘stabs’ conflicts with prior selection
(previously this produced the errors above; now it is accepted as -gstabs3)
but wouldn't that be an advantage here (the last option wins)?
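
To make the "last option wins" pruning concrete, here is a minimal sketch. It is not the driver's actual code: the `Switch` struct, the `group` field (standing in for options linked through a Negative(...) chain) and `prune_switches` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of a command-line switch. `group` stands in
// for the set of options linked through a Negative(...) chain in the .opt
// files; an empty group means the switch is never pruned.
struct Switch {
  std::string text;
  std::string group;
};

// Keep a switch only if no later switch belongs to the same group,
// i.e. the last option of each group wins.
std::vector<std::string> prune_switches(const std::vector<Switch>& args) {
  std::vector<std::string> kept;
  for (std::size_t i = 0; i < args.size(); ++i) {
    bool overridden = false;
    if (!args[i].group.empty()) {
      for (std::size_t j = i + 1; j < args.size(); ++j) {
        if (args[j].group == args[i].group) {
          overridden = true;
          break;
        }
      }
    }
    if (!overridden)
      kept.push_back(args[i].text);
  }
  return kept;
}
```

Under this model, -gstabs2 -gdwarf-4 -gstabs3 would reach the compiler proper as just -gstabs3, so the conflict diagnostics never fire.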

For s390 (which is odd, as there is no Negative(mno-stack-size) on the
very similar mstack-size= option), I believe the end result shouldn't change:
the driver will no longer pass all 3 options for
-mno-stack-guard -mstack-guard=64 -mno-stack-guard
but just the last one (similarly for other combinations), and the option
handling in cc1 etc. seems to handle it the same way anyway (last option
wins).

And finally, the Fortran -cpp= option is internally generated from -cpp, which
should have normal Negative processing with -nocpp.

Jakub


[PATCH 2/2] RISC-V: Support ELF attribute

2019-02-12 Thread Kito Cheng
From: Kito Cheng 

This patch adds a configure-time option,
--with-riscv-attribute=[yes|no|default], and a
run-time option, -mriscv-attribute, to control the output of ELF attributes.

This feature is enabled by default only for ELF/bare-metal target
configurations.

Kito Cheng 
Monk Chiang  

ChangeLog:
gcc:
* common/config/riscv/riscv-common.c: Include sstream.
(riscv_subset_list::to_string): New.
(riscv_arch_str): Likewise.
* config.gcc (riscv*-*-*): Handle --with-riscv-attribute=.
* config.in: Regen.
* config/riscv/riscv-protos.h (riscv_arch_str): New.
* config/riscv/riscv.c (INCLUDE_STRING): Defined.
(riscv_emit_attribute): New.
(riscv_file_start): Emit attribute if needed.
(riscv_option_override): Init riscv_emit_attribute_p.
* config/riscv/riscv.opt (mriscv-attribute): New option.
* configure.ac (riscv*-*-*): Check whether binutils supports ELF
attributes.
* doc/install.texi: Document --with-riscv-attribute.
* doc/invoke.texi: Document -mriscv-attribute.

gcc/testsuite:

* gcc.target/riscv/attribute-1.c: New.
* gcc.target/riscv/attribute-2.c: Likewise.
* gcc.target/riscv/attribute-3.c: Likewise.
* gcc.target/riscv/attribute-4.c: Likewise.
* gcc.target/riscv/attribute-5.c: Likewise.
* gcc.target/riscv/attribute-6.c: Likewise.
* gcc.target/riscv/attribute-7.c: Likewise.
* gcc.target/riscv/attribute-8.c: Likewise.
* gcc.target/riscv/attribute-9.c: Likewise.
---
 gcc/common/config/riscv/riscv-common.c   | 38 
 gcc/config.gcc   | 26 ++-
 gcc/config.in|  6 +
 gcc/config/riscv/riscv-protos.h  |  3 +++
 gcc/config/riscv/riscv.c | 25 ++
 gcc/config/riscv/riscv.opt   |  4 +++
 gcc/configure.ac |  7 +
 gcc/doc/install.texi |  8 ++
 gcc/doc/invoke.texi  |  7 -
 gcc/testsuite/gcc.target/riscv/attribute-1.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-2.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-3.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-4.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-5.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-6.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-7.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-8.c |  7 +
 gcc/testsuite/gcc.target/riscv/attribute-9.c |  7 +
 18 files changed, 185 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/attribute-9.c

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index e412acb..3f938ba 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -17,6 +17,8 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#include <sstream>
+
 #define INCLUDE_STRING
 #include "config.h"
 #include "system.h"
@@ -78,6 +80,8 @@ public:
  int major_version = RISCV_DONT_CARE_VERSION,
  int minor_version = RISCV_DONT_CARE_VERSION) const;
 
+  std::string to_string() const;
+
   unsigned xlen() const {return m_xlen;};
 
   static riscv_subset_list *parse (const char *, location_t);
@@ -134,6 +138,32 @@ void riscv_subset_list::add (const char *subset,
   m_tail = s;
 }
 
+/* Convert subset info to string with explicit version info.  */
+
+std::string riscv_subset_list::to_string() const
+{
+  std::ostringstream oss;
+  oss << "rv" << m_xlen;
+
+  bool first = true;
+  riscv_subset_t *subset = m_head;
+
+  while (subset != NULL)
+{
+  if (!first)
+   oss << '_';
+  first = false;
+
+  oss << subset->name
+ << subset->major_version
+ << 'p'
+ << subset->minor_version;
+  subset = subset->next;
+}
+
+  return oss.str();
+}
+
 /* Find subset in list without version checking, return NULL if not found.  */
 
 riscv_subset_t *riscv_subset_list::lookup (const char *subset,
@@ -492,6 +522,14 @@ fail:
   return NULL;
 }
 
+/* Return the current arch string.  */
+
+std::string riscv_arch_str ()
+{
+  gcc_assert (current_subset_list);
+  return 

[PATCH 1/2] RISC-V: Accept version, supervisor ext and more than one NSE for -march.

2019-02-12 Thread Kito Cheng
From: Kito Cheng 

Kito Cheng 
Monk Chiang  

ChangeLog:
gcc:
* common/config/riscv/riscv-common.c:
Include config/riscv/riscv-protos.h.
(INCLUDE_STRING): Defined.
(RISCV_DONT_CARE_VERSION): Defined.
(riscv_subset_t): Declare.
(riscv_subset_list): Declare.
(riscv_subset_list::riscv_subset_list): New.
(riscv_subset_list::~riscv_subset_list): Likewise.
(riscv_subset_list::parsing_subset_version): Likewise.
(riscv_subset_list::parse_std_ext): Likewise.
(riscv_subset_list::parse_sv_or_non_std_ext): Likewise.
(riscv_subset_list::add): Likewise.
(riscv_subset_list::lookup): Likewise.
(riscv_subset_list::xlen): Likewise.
(riscv_subset_list::parse): Likewise.
(riscv_supported_std_ext): Likewise.
(current_subset_list): Likewise.
(riscv_parse_arch_string): Using riscv_subset_list::parse to
parse.

gcc/testsuite:
* gcc.target/riscv/arch-1.c: New.
* gcc.target/riscv/arch-2.c: Likewise.
* gcc.target/riscv/arch-3.c: Likewise.
* gcc.target/riscv/arch-4.c: Likewise.
---
 gcc/common/config/riscv/riscv-common.c  | 530 
 gcc/testsuite/gcc.target/riscv/arch-1.c |   7 +
 gcc/testsuite/gcc.target/riscv/arch-2.c |   6 +
 gcc/testsuite/gcc.target/riscv/arch-3.c |   6 +
 gcc/testsuite/gcc.target/riscv/arch-4.c |   6 +
 5 files changed, 496 insertions(+), 59 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-4.c

diff --git a/gcc/common/config/riscv/riscv-common.c b/gcc/common/config/riscv/riscv-common.c
index cb5bb7f..e412acb 100644
--- a/gcc/common/config/riscv/riscv-common.c
+++ b/gcc/common/config/riscv/riscv-common.c
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#define INCLUDE_STRING
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -26,99 +27,510 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 #include "diagnostic-core.h"
+#include "config/riscv/riscv-protos.h"
 
-/* Parse a RISC-V ISA string into an option mask.  Must clear or set all arch
-   dependent mask bits, in case more than one -march string is passed.  */
+#define RISCV_DONT_CARE_VERSION -1
 
-static void
-riscv_parse_arch_string (const char *isa, int *flags, location_t loc)
+/* Subset info.  */
+struct riscv_subset_t {
+  riscv_subset_t ();
+
+  std::string name;
+  int major_version;
+  int minor_version;
+  struct riscv_subset_t *next;
+};
+
+/* Subset list.  */
+class riscv_subset_list {
+private:
+  /* Original arch string.  */
+  const char *m_arch;
+
+  /* Location of arch string, used for error reporting.  */
+  location_t m_loc;
+
+  /* Head of subset info list.  */
+  riscv_subset_t *m_head;
+
+  /* Tail of subset info list.  */
+  riscv_subset_t *m_tail;
+
+  /* X-len of m_arch. */
+  unsigned m_xlen;
+
+  riscv_subset_list (const char *, location_t);
+
+  const char *parsing_subset_version (const char *, unsigned *, unsigned *,
+ unsigned, unsigned, bool);
+
+  const char *parse_std_ext (const char *);
+
+  const char *parse_sv_or_non_std_ext (const char *, const char *,
+  const char *);
+
+public:
+  ~riscv_subset_list ();
+
+  void add (const char *, int, int);
+
+  riscv_subset_t *lookup (const char *,
+ int major_version = RISCV_DONT_CARE_VERSION,
+ int minor_version = RISCV_DONT_CARE_VERSION) const;
+
+  unsigned xlen() const {return m_xlen;};
+
+  static riscv_subset_list *parse (const char *, location_t);
+
+};
+
+static const char *riscv_supported_std_ext (void);
+
+static riscv_subset_list *current_subset_list = NULL;
+
+riscv_subset_t::riscv_subset_t()
+: name (), major_version (0), minor_version (0), next (NULL)
 {
-  const char *p = isa;
+}
 
-  if (strncmp (p, "rv32", 4) == 0)
-*flags &= ~MASK_64BIT, p += 4;
-  else if (strncmp (p, "rv64", 4) == 0)
-*flags |= MASK_64BIT, p += 4;
-  else
+riscv_subset_list::riscv_subset_list (const char *arch, location_t loc)
+: m_arch (arch), m_loc(loc), m_head (NULL), m_tail (NULL), m_xlen(0)
+{
+}
+
+riscv_subset_list::~riscv_subset_list()
+{
+  if (!m_head)
+return;
+
+  riscv_subset_t *item = this->m_head->next;
+  while (item != NULL)
 {
-  error_at (loc, "-march=%s: ISA string must begin with rv32 or rv64", isa);
-  return;
+  riscv_subset_t *next = item->next;
+  delete item;
+  item = next;
 }
+}
 
-  if (*p == 'g')
-{
-  p++;
+/* Add new subset to list.  */
 
-  *flags &= ~MASK_RVE;
+void riscv_subset_list::add (const char 

[PATCH 0/2] RISC-V: Support ELF attribute for GCC.

2019-02-12 Thread Kito Cheng
This patch series implements the RISC-V ELF attribute[1].  It consists of
two parts: the first improves the -march string parser to support arch
strings with versions and all kinds of extensions in the RISC-V ISA spec
v2.2[2]; the second generates the attribute directive, including configure-time
and run-time options to control it.

[1] https://github.com/riscv/riscv-elf-psabi-doc/pull/71
[2] https://github.com/riscv/riscv-isa-manual/blob/master/release/riscv-spec-v2.2.pdf
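
As an illustration of an arch string with explicit versions, here is a minimal sketch of how a parsed extension list could be rendered back into text. It follows the "name, major, 'p', minor, joined by '_'" layout that riscv_subset_list::to_string uses in patch 2/2. The `Subset` struct and `arch_to_string` are hypothetical stand-ins, not the types from the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed extension and its version.
struct Subset {
  std::string name;
  int major_version;
  int minor_version;
};

// Render XLEN plus a list of subsets, e.g. "rv64i2p0_m2p0".
std::string arch_to_string(unsigned xlen, const std::vector<Subset>& subsets) {
  std::ostringstream oss;
  oss << "rv" << xlen;
  bool first = true;
  for (const Subset& s : subsets) {
    if (!first)
      oss << '_';
    first = false;
    oss << s.name << s.major_version << 'p' << s.minor_version;
  }
  return oss.str();
}
```

For example, XLEN 64 with subsets i 2.0 and m 2.0 yields "rv64i2p0_m2p0".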




[PATCH, libphobos] Committed fallback UnwindBacktrace if LibBacktrace unfound

2019-02-12 Thread Iain Buclaw
Hi,

In the gcc.backtrace module, either one of LibBacktrace or
UnwindBacktrace will always be defined.  This patch gives
UnwindBacktrace a higher precedence over the libc backtrace as the
default backtrace handler as the latter depends on a rt.backtrace
module that is not compiled in.

This is only useful when building with --without-libbacktrace or when
libbacktrace is unsupported for whatever reason.

Bootstrapped and regression tested on x86_64-linux-gnu.

Committed to trunk as r268836.

-- 
Iain
---
libphobos/ChangeLog:

* libdruntime/core/runtime.d (defaultTraceHandler): Give
UnwindBacktrace handler precedence over backtrace.
---
diff --git a/libphobos/libdruntime/core/runtime.d b/libphobos/libdruntime/core/runtime.d
index a78363cf477..0ead04752e4 100644
--- a/libphobos/libdruntime/core/runtime.d
+++ b/libphobos/libdruntime/core/runtime.d
@@ -619,6 +619,22 @@ Throwable.TraceInfo defaultTraceHandler( void* ptr = null )
 }
 return new LibBacktrace(FIRSTFRAME);
 }
+else static if ( __traits( compiles, new UnwindBacktrace(0) ) )
+{
+version (Posix)
+{
+static enum FIRSTFRAME = 5;
+}
+else version (Win64)
+{
+static enum FIRSTFRAME = 4;
+}
+else
+{
+static enum FIRSTFRAME = 0;
+}
+return new UnwindBacktrace(FIRSTFRAME);
+}
 else static if ( __traits( compiles, backtrace ) )
 {
 import core.demangle;
@@ -885,22 +901,6 @@ Throwable.TraceInfo defaultTraceHandler( void* ptr = null )
 auto s = new StackTrace(FIRSTFRAME, cast(CONTEXT*)ptr);
 return s;
 }
-else static if ( __traits( compiles, new UnwindBacktrace(0) ) )
-{
-version (Posix)
-{
-static enum FIRSTFRAME = 5;
-}
-else version (Win64)
-{
-static enum FIRSTFRAME = 4;
-}
-else
-{
-static enum FIRSTFRAME = 0;
-}
-return new UnwindBacktrace(FIRSTFRAME);
-}
 else
 {
 return null;


PR87689, PowerPC64 ELFv2 function parameter passing violation

2019-02-12 Thread Alan Modra
Covers for a generic fortran bug.  The effect is that we'll needlessly
waste 64 bytes of stack space on some calls, but I don't see any
simple and fully correct patch in generic code.  Bootstrapped and
regression tested powerpc64le-linux.  OK mainline and branches?
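
A minimal sketch of the early-exit logic the workaround adds, assuming a hypothetical `CallFacts` summary in place of the real tree and language-hook queries. A true result means the parameter save area is reserved without examining individual arguments, which is where the 64 wasted bytes come from.

```cpp
#include <cassert>

// Hypothetical summary of the facts the early checks look at.
struct CallFacts {
  bool incoming;    // true when laying out our own incoming parameters
  bool prototyped;  // what prototype_p would report for the function type
  bool is_stdarg;   // variadic function type
  bool is_fortran;  // the front end is Fortran
};

// Returns true when the save area must be reserved without scanning the
// arguments. A false result only means "undecided": the real function then
// goes on to examine each argument.
bool needs_stack_without_arg_scan(const CallFacts& f) {
  if ((!f.incoming && !f.prototyped) || f.is_stdarg)
    return true;
  // Workaround: for outgoing Fortran calls the argument list may be
  // incomplete even though the type looks prototyped.
  if (!f.incoming && f.is_fortran)
    return true;
  return false;
}
```

Incoming parameters are unaffected, so only outgoing calls from Fortran code pay the extra stack cost.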

PR target/87689
* config/rs6000/rs6000.c (rs6000_function_parms_need_stack): Cope
with fortran function decls that lack all args.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 31256a4da8d..288b7606b5e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -12325,6 +12325,13 @@ rs6000_function_parms_need_stack (tree fun, bool incoming)
   if ((!incoming && !prototype_p (fntype)) || stdarg_p (fntype))
 return true;
 
+  /* FIXME: Fortran arg lists can contain hidden parms, fooling
+ prototype_p into saying the function is prototyped when in fact
+ the number and type of args is unknown.  See PR 87689.  */
+  if (!incoming && (strcmp (lang_hooks.name, "GNU F77") == 0
+   || lang_GNU_Fortran ()))
+return true;
+
   INIT_CUMULATIVE_INCOMING_ARGS (args_so_far_v, fntype, NULL_RTX);
   args_so_far = pack_cumulative_args (&args_so_far_v);
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix -fdec simplification (PR fortran/88649).

2019-02-12 Thread Martin Liška
PING^1.

On 2/4/19 1:46 PM, Martin Liška wrote:
> On 2/4/19 10:56 AM, Martin Liška wrote:
>> Hi.
>>
>> Starting from r266926 'switch (e->value.op.op)' is reached when
>> one is using -fdec. That's wrong, as -fdec causes an e->value.function
>> to be created. I hope the proper fix is to skip the mentioned patch and
>> allow simplification at the end of the function?
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>> gcc/fortran/ChangeLog:
>>
>> 2019-01-25  Martin Liska  
>>
>>  PR fortran/88649
>>  * resolve.c (resolve_operator): Initialize 't' right
>>  after function entry.  Skip switch (e->value.op.op)
>>  for -fdec operands that become function calls.
>> ---
>>  gcc/fortran/resolve.c | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>>
> 
> I forgot to include fortran ML.
> 
> Martin
> 



[PATCH] Bump LTO_minor_version on GCC-8 branch (PR lto/89260).

2019-02-12 Thread Martin Liška
As seen in the PR, a bump is needed due to r268698.

Ready for GCC-8 branch?
Thanks,
Martin

gcc/ChangeLog:

2019-02-13  Martin Liska  

PR lto/89260
* lto-streamer.h (LTO_minor_version): Bump version due
to r268698.
---
 gcc/lto-streamer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index d5873f7dabf..11d9888dafb 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -121,7 +121,7 @@ along with GCC; see the file COPYING3.  If not see
  form followed by the data for the string.  */
 
 #define LTO_major_version 7
-#define LTO_minor_version 0
+#define LTO_minor_version 1
 
 typedef unsigned char	lto_decl_flags_t;
 



Re: [PATCH] Remove a barrier when EDGE_CROSSING is removed (PR lto/88858).

2019-02-12 Thread Martin Liška
On 2/11/19 10:00 AM, Jan Hubicka wrote:
> Aha, yes, fundament of the patch is obvious - the barrier has to go :)
> There is same hunk of code in cfgrtl.c:1061, so please just merge it
> Note that I am not rtl reviewer. But as author of the code I would say
> that the updated patch can go in as obvious.
> 
> Honza

Thank you, Honza, for the review. I'll install it after testing and
verifying that it helps Firefox with PGO.

Martin
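
For readers unfamiliar with the insn chain, here is a minimal sketch of the unlinking that the factored-out helper performs, written against a hypothetical `Insn` node rather than GCC's rtx_insn. The names `Kind`, `Insn`, `link_insns` and `remove_barriers` are illustrative only.

```cpp
#include <cassert>

enum class Kind { Barrier, Label, Other };

// Hypothetical doubly linked instruction node.
struct Insn {
  Kind kind;
  Insn* prev;
  Insn* next;
  explicit Insn(Kind k) : kind(k), prev(nullptr), next(nullptr) {}
};

// Helper for building a chain: a <-> b.
void link_insns(Insn& a, Insn& b) {
  a.next = &b;
  b.prev = &a;
}

// Unlink every barrier in the chain starting at `footer`, stopping at the
// first label (assumed to start a jump table that must be kept).
// Returns the new head of the chain. Nodes are only unlinked, not freed.
Insn* remove_barriers(Insn* footer) {
  Insn* head = footer;
  for (Insn* insn = footer; insn != nullptr; insn = insn->next) {
    if (insn->kind == Kind::Barrier) {
      if (insn->prev)
        insn->prev->next = insn->next;
      else
        head = insn->next;
      if (insn->next)
        insn->next->prev = insn->prev;
    }
    if (insn->kind == Kind::Label)
      break;
  }
  return head;
}
```

An unlinked node keeps its own next pointer, so the walk can continue from it, just as in the loop in the patch.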
From 67aec11ebc560b8ff85bfead32a7caadf8ba7fd4 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 23 Jan 2019 08:11:15 +0100
Subject: [PATCH] Remove a barrier when EDGE_CROSSING is removed (PR
 lto/88858).

gcc/ChangeLog:

2019-02-12  Martin Liska  

	PR lto/88858
	* cfgrtl.c (remove_barriers_from_footer): New function.
	(try_redirect_by_replacing_jump): Use it.
	(cfg_layout_redirect_edge_and_branch): Likewise.
---
 gcc/cfgrtl.c | 46 +++---
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 172bdf585d0..56564c2fda7 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -988,6 +988,31 @@ block_label (basic_block block)
   return as_a <rtx_code_label *> (BB_HEAD (block));
 }
 
+/* Remove all barriers from BB_FOOTER of a BB.  */
+
+static void
+remove_barriers_from_footer (basic_block bb)
+{
+  rtx_insn *insn = BB_FOOTER (bb);
+
+  /* Remove barriers but keep jumptables.  */
+  while (insn)
+{
+  if (BARRIER_P (insn))
+	{
+	  if (PREV_INSN (insn))
+	SET_NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
+	  else
+	BB_FOOTER (bb) = NEXT_INSN (insn);
+	  if (NEXT_INSN (insn))
+	SET_PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
+	}
+  if (LABEL_P (insn))
+	return;
+  insn = NEXT_INSN (insn);
+}
+}
+
 /* Attempt to perform edge redirection by replacing possibly complex jump
instruction by unconditional jump or removing jump completely.  This can
apply only if all edges now point to the same block.  The parameters and
@@ -1051,26 +1076,8 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
   /* Selectively unlink whole insn chain.  */
   if (in_cfglayout)
 	{
-	  rtx_insn *insn = BB_FOOTER (src);
-
 	  delete_insn_chain (kill_from, BB_END (src), false);
-
-	  /* Remove barriers but keep jumptables.  */
-	  while (insn)
-	{
-	  if (BARRIER_P (insn))
-		{
-		  if (PREV_INSN (insn))
-		SET_NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
-		  else
-		BB_FOOTER (src) = NEXT_INSN (insn);
-		  if (NEXT_INSN (insn))
-		SET_PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
-		}
-	  if (LABEL_P (insn))
-		break;
-	  insn = NEXT_INSN (insn);
-	}
+	  remove_barriers_from_footer (src);
 	}
   else
 	delete_insn_chain (kill_from, PREV_INSN (BB_HEAD (target)),
@@ -4396,6 +4403,7 @@ cfg_layout_redirect_edge_and_branch (edge e, basic_block dest)
 	  	 "Removing crossing jump while redirecting edge form %i to %i\n",
 		 e->src->index, dest->index);
   delete_insn (BB_END (src));
+  remove_barriers_from_footer (src);
   e->flags |= EDGE_FALLTHRU;
 }
 
-- 
2.20.1



Re: [PATCH] Construct ipa_reduced_postorder always for overwritable (PR ipa/89009).

2019-02-12 Thread Martin Liška
Hi.

This is the patch candidate I created and tested. It does not add
filtering based on opt_for_fn, which I would defer to the next
stage1.
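
A minimal sketch of the two edge filters involved, assuming a hypothetical `Edge` summary in place of cgraph_edge. The enum mirrors the ordering of GCC's availability levels, and the function names here are illustrative.

```cpp
#include <cassert>

// Simplified copy of the availability ordering (lowest to highest).
enum Availability {
  AVAIL_UNSET,
  AVAIL_NOT_AVAILABLE,
  AVAIL_INTERPOSABLE,
  AVAIL_AVAILABLE,
  AVAIL_LOCAL
};

// Hypothetical summary of what the filters inspect on a call edge.
struct Edge {
  Availability callee_avail;
  bool callee_is_leaf;        // callee carries ECF_LEAF
  bool caller_ipa_reference;  // flag_ipa_reference enabled for the caller
  bool callee_ipa_reference;  // flag_ipa_reference enabled for the callee
};

// Generic filter: only propagate across edges with a non-interposable callee.
bool ignore_if_not_available(const Edge& e) {
  return e.callee_avail <= AVAIL_INTERPOSABLE;
}

// ipa-reference variant: an interposable callee is still followed when it is
// a leaf, and both ends must have ipa-reference enabled.
bool ignore_for_ipa_reference(const Edge& e) {
  return e.callee_avail < AVAIL_INTERPOSABLE
         || (e.callee_avail == AVAIL_INTERPOSABLE && !e.callee_is_leaf)
         || !e.caller_ipa_reference
         || !e.callee_ipa_reference;
}
```

Edges for which a filter returns true are left out when the strongly connected components are built, which makes the resulting SCCs smaller.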

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From d036f75a880bc91f67a5473767b35ba2f8a4ffe3 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 11 Feb 2019 16:47:06 +0100
Subject: [PATCH] Reduce SCCs in IPA postorder.

gcc/ChangeLog:

2019-02-13  Martin Liska  

	* ipa-cp.c (build_toporder_info): Use
	ignore_edge_if_not_available as edge filter.
	* ipa-inline.c (inline_small_functions): Likewise.
	* ipa-pure-const.c (ignore_edge_for_pure_const):
	Move to ipa-utils.h and rename to ignore_edge_if_not_available.
	(propagate_pure_const): Use ignore_edge_if_not_available
	as edge filter.
	* ipa-reference.c (ignore_edge_p): Make SCCs more fine-grained
	based on availability and ECF_LEAF attribute.
	* ipa-utils.c (searchc): Refactor code.
	* ipa-utils.h (ignore_edge_if_not_available): New.
---
 gcc/ipa-cp.c |  3 ++-
 gcc/ipa-inline.c |  2 +-
 gcc/ipa-pure-const.c | 13 +
 gcc/ipa-reference.c  | 13 ++---
 gcc/ipa-utils.c  |  3 +--
 gcc/ipa-utils.h  | 10 ++
 6 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 442d5c63eff..2253b0cef63 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -815,7 +815,8 @@ build_toporder_info (struct ipa_topo_info *topo)
   topo->stack = XCNEWVEC (struct cgraph_node *, symtab->cgraph_count);
 
   gcc_checking_assert (topo->stack_top == 0);
-  topo->nnodes = ipa_reduced_postorder (topo->order, true, NULL);
+  topo->nnodes = ipa_reduced_postorder (topo->order, true,
+	ignore_edge_if_not_available);
 }
 
 /* Free information about strongly connected components and the arrays in
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 360c3de3289..c7e68a73706 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1778,7 +1778,7 @@ inline_small_functions (void)
  metrics.  */
 
   max_count = profile_count::uninitialized ();
-  ipa_reduced_postorder (order, true, NULL);
+  ipa_reduced_postorder (order, true, ignore_edge_if_not_available);
   free (order);
 
   FOR_EACH_DEFINED_FUNCTION (node)
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index a8a3956d2d5..e61d279289e 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1395,17 +1395,6 @@ cdtor_p (cgraph_node *n, void *)
   return false;
 }
 
-/* We only propagate across edges with non-interposable callee.  */
-
-static bool
-ignore_edge_for_pure_const (struct cgraph_edge *e)
-{
-  enum availability avail;
-  e->callee->function_or_virtual_thunk_symbol (&avail, e->caller);
-  return (avail <= AVAIL_INTERPOSABLE);
-}
-
-
 /* Produce transitive closure over the callgraph and compute pure/const
attributes.  */
 
@@ -1423,7 +1412,7 @@ propagate_pure_const (void)
   bool has_cdtor;
 
   order_pos = ipa_reduced_postorder (order, true,
- ignore_edge_for_pure_const);
+ ignore_edge_if_not_available);
   if (dump_file)
 {
   cgraph_node::dump_cgraph (dump_file);
diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index d1759a374bc..16cc4cf44f9 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -677,14 +677,21 @@ get_read_write_all_from_node (struct cgraph_node *node,
   }
 }
 
-/* Skip edges from and to nodes without ipa_reference enables.  This leave
-   them out of strongy connected coponents and makes them easyto skip in the
+/* Skip edges from and to nodes without ipa_reference enabled.
+   Ignore not available symbols.  This leave
+   them out of strongly connected components and makes them easy to skip in the
propagation loop bellow.  */
 
 static bool
 ignore_edge_p (cgraph_edge *e)
 {
-  return (!opt_for_fn (e->caller->decl, flag_ipa_reference)
+  enum availability avail;
+  e->callee->function_or_virtual_thunk_symbol (&avail, e->caller);
+
+  return (avail < AVAIL_INTERPOSABLE
+	  || (avail == AVAIL_INTERPOSABLE
+	  && !(flags_from_decl_or_type (e->callee->decl) & ECF_LEAF))
+	  || !opt_for_fn (e->caller->decl, flag_ipa_reference)
   || !opt_for_fn (e->callee->function_symbol ()->decl,
 			  flag_ipa_reference));
 }
diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
index 79b250c3943..25c2e2cf789 100644
--- a/gcc/ipa-utils.c
+++ b/gcc/ipa-utils.c
@@ -103,8 +103,7 @@ searchc (struct searchc_env* env, struct cgraph_node *v,
 continue;
 
   if (w->aux
-	  && (avail > AVAIL_INTERPOSABLE
-	  || avail == AVAIL_INTERPOSABLE))
+	  && (avail >= AVAIL_INTERPOSABLE))
 	{
 	  w_info = (struct ipa_dfs_info *) w->aux;
 	  if (w_info->new_node)
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index b70e8c57108..aad08148348 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -262,6 +262,16 @@ odr_type_p (const_tree t)
   return false;
 }
 
+/* We only propagate across edges with non-interposable callee.  */
+
+inline bool
+ignore_edge_if_not_available (struct cgraph_edge *e)
+{
+  enum 

[PATCH] Call free_dominance_info when transformed in DCE (PR rtl-optimization/89242).

2019-02-12 Thread Martin Liška
Hi.

The patch is very similar to r236460 where we should release dominance info
when the CFG is modified.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-02-12  Martin Liska  

PR rtl-optimization/89242
* dce.c (delete_unmarked_insns): Call free_dominance_info when we
process a transformation.

gcc/testsuite/ChangeLog:

2019-02-12  Martin Liska  

PR rtl-optimization/89242
* g++.dg/pr89242.C: New test.
---
 gcc/dce.c  |  1 +
 gcc/testsuite/g++.dg/pr89242.C | 15 +++
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pr89242.C


diff --git a/gcc/dce.c b/gcc/dce.c
index cb18e81592a..8fb109c7388 100644
--- a/gcc/dce.c
+++ b/gcc/dce.c
@@ -652,6 +652,7 @@ delete_unmarked_insns (void)
 {
   gcc_assert (can_alter_cfg);
   delete_unreachable_blocks ();
+  free_dominance_info (CDI_DOMINATORS);
 }
 }
 
diff --git a/gcc/testsuite/g++.dg/pr89242.C b/gcc/testsuite/g++.dg/pr89242.C
new file mode 100644
index 000..a702fef4f31
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr89242.C
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-options "-fno-rerun-cse-after-loop -ftrapv -fno-tree-loop-optimize -fdelete-dead-exceptions -fno-forward-propagate -fnon-call-exceptions -O2" }
+
+void bar (int n, char *p)
+{
+  try
+{
+  n++;
+  for (int i = 0; i < n - 1; i++)
+	p[i];
+}
+  catch (...)
+{}
+}
+



[PATCH] Clean up another MPX-related stuff.

2019-02-12 Thread Martin Liška
Hi.

As Honza noticed, there's still some leftover from MPX removal.
May I remove another bunch of fields now, or should I wait
for next stage1?

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

gcc/ChangeLog:

2019-02-13  Martin Liska  

* builtins.h (expand_builtin_with_bounds): Remove declaration.
* calls.c (struct arg_data): Remove special_slot, pointer_arg
and pointer_offset fields.
(initialize_argument_information): Remove usage of dead
fields.
* cgraph.h (struct cgraph_thunk_info): Remove
add_pointer_bounds_args.
* cgraphunit.c (cgraph_node::expand_thunk): Remove usage of dead
fields.
(cgraph_node::assemble_thunks_and_aliases): Remove usage of dead
fields.
* config/i386/i386.c (ix86_function_arg_advance): Remove
unrelated comment.
(struct builtin_isa): Remove leaf_p and nothrow_p fields.
(def_builtin):  Remove usage of dead
fields.
(ix86_add_new_builtins): Likewise.
* ipa-fnsummary.c (compute_fn_summary): Likewise.
* ipa-icf.c (sem_function::equals_wpa): Likewise.
(sem_function::init): Likewise.
(sem_variable::merge): Likewise.
* ipa-visibility.c (function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
* lto-cgraph.c (lto_output_node): Likewise.
(lto_output_varpool_node): Likewise.
(input_node): Likewise.
(input_varpool_node): Likewise.
* lto-streamer-out.c (lto_output): Likewise.
* tree-inline.c (expand_call_inline): Remove usage of
assign_stmts.
* tree-inline.h (struct copy_body_data): Likewise.
* varpool.c (varpool_node::dump): Likewise.
---
 gcc/builtins.h |  1 -
 gcc/calls.c| 14 +-
 gcc/cgraph.h   |  7 ---
 gcc/cgraphunit.c   |  8 +---
 gcc/config/i386/i386.c | 13 -
 gcc/ipa-fnsummary.c| 18 +-
 gcc/ipa-icf.c  |  5 -
 gcc/ipa-visibility.c   |  1 -
 gcc/ipa.c  |  6 --
 gcc/lto-cgraph.c   |  6 +-
 gcc/lto-streamer-out.c |  3 +--
 gcc/tree-inline.c  |  2 --
 gcc/tree-inline.h  |  3 ---
 gcc/varpool.c  |  2 --
 14 files changed, 5 insertions(+), 84 deletions(-)


diff --git a/gcc/builtins.h b/gcc/builtins.h
index 3ec4ba09b66..599c96e72e1 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -119,7 +119,6 @@ extern void expand_builtin_trap (void);
 extern void expand_ifn_atomic_bit_test_and (gcall *);
 extern void expand_ifn_atomic_compare_exchange (gcall *);
 extern rtx expand_builtin (tree, rtx, rtx, machine_mode, int);
-extern rtx expand_builtin_with_bounds (tree, rtx, rtx, machine_mode, int);
 extern enum built_in_function builtin_mathfn_code (const_tree);
 extern tree fold_builtin_expect (location_t, tree, tree, tree, tree);
 extern bool avoid_folding_inline_builtin (tree);
diff --git a/gcc/calls.c b/gcc/calls.c
index e11977e98df..63c1bc52077 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -82,15 +82,6 @@ struct arg_data
   /* If REG is a PARALLEL, this is a copy of VALUE pulled into the correct
  form for emit_group_move.  */
   rtx parallel_value;
-  /* If value is passed in neither reg nor stack, this field holds a number
- of a special slot to be used.  */
-  rtx special_slot;
-  /* For pointer bounds hold an index of parm bounds are bound to.  -1 if
- there is no such pointer.  */
-  int pointer_arg;
-  /* If pointer_arg refers a structure, then pointer_offset holds an offset
- of a pointer in this structure.  */
-  int pointer_offset;
   /* If REG was promoted from the actual mode of the argument expression,
  indicates whether the promotion is sign- or zero-extended.  */
   int unsignedp;
@@ -2129,10 +2120,7 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
 		argpos < n_named_args);
 
   if (args[i].reg && CONST_INT_P (args[i].reg))
-	{
-	  args[i].special_slot = args[i].reg;
-	  args[i].reg = NULL;
-	}
+	args[i].reg = NULL;
 
   /* If this is a sibling call and the machine has register windows, the
 	 register window has to be unwinded before calling the routine, so
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 75d4cec0ba8..2f6daa75a24 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -690,9 +690,6 @@ struct GTY(()) cgraph_thunk_info {
the virtual one.  */
   bool virtual_offset_p;
 
-  /* ??? True for special kind of thunks, seems related to instrumentation.  */
-  bool add_pointer_bounds_args;
-
   /* Set to true when alias node (the cgraph_node to which this struct belong)
  is a thunk.  Access to any other fields is invalid if this is false.  */
   bool thunk_p;
@@ -1939,10 +1936,6 @@ public:
   /* Set when variable is scheduled to be assembled.  */
   unsigned output : 1;
 
-  /* Set when variable has statically initialized pointer
- or is a static 

C++ PATCH for c++/89297 - ICE with OVERLOAD in template

2019-02-12 Thread Marek Polacek
Here we ICE because we're in a template and the constructor contains an
OVERLOAD, so calling check_narrowing -> maybe_constant_value crashes.

check_narrowing deliberately calls maybe_constant_value and not
fold_non_dependent_expr so as to avoid instantiating expressions twice.

So let's use instantiate_non_dependent_expr_sfinae to deal with the OVERLOAD;
fold_non_dependent_expr always calls maybe_constant_value and we can avoid
that call.
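
As a rough illustration of the situation described above (a sketch, not the committed test): inside a template, the call id(v) names an overload set at the point where the braced conversion to int is checked for narrowing.  The template parameter T below is an assumption, since the parameter list did not survive in this archive copy of the test case.

```cpp
// Two overloads, so the name `id` denotes an overload set.
int id(int v) { return v; }
float id(float v) { return v; }

// T is a hypothetical, unused template parameter; it only serves to put
// the braced conversion inside a template body.
template <typename T>
int foo(int v)
{
    // Overload resolution selects id(int), so int{...} involves no
    // narrowing conversion and the function is well-formed.
    return int{id(v)};
}
```

With a fixed compiler, an instantiation such as foo<void>(42) should simply return its argument.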

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-02-12  Marek Polacek  

PR c++/89297 - ICE with OVERLOAD in template.
* semantics.c (finish_compound_literal): Call
instantiate_non_dependent_expr_sfinae.

* g++.dg/cpp0x/initlist113.C: New test.

diff --git gcc/cp/semantics.c gcc/cp/semantics.c
index 786f18ab0c8..e89a38d3cba 100644
--- gcc/cp/semantics.c
+++ gcc/cp/semantics.c
@@ -2826,9 +2826,13 @@ finish_compound_literal (tree type, tree compound_literal,
 return error_mark_node;
   compound_literal = reshape_init (type, compound_literal, complain);
   if (SCALAR_TYPE_P (type)
-  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal)
-  && !check_narrowing (type, compound_literal, complain))
-return error_mark_node;
+  && !BRACE_ENCLOSED_INITIALIZER_P (compound_literal))
+{
+  compound_literal
+   = instantiate_non_dependent_expr_sfinae (compound_literal, complain);
+  if (!check_narrowing (type, compound_literal, complain))
+   return error_mark_node;
+}
   if (TREE_CODE (type) == ARRAY_TYPE
   && TYPE_DOMAIN (type) == NULL_TREE)
 {
diff --git gcc/testsuite/g++.dg/cpp0x/initlist113.C gcc/testsuite/g++.dg/cpp0x/initlist113.C
new file mode 100644
index 000..0b7e7ff606a
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/initlist113.C
@@ -0,0 +1,11 @@
+// PR c++/89297
+// { dg-do compile { target c++11 } }
+
+int id(int v) { return v; }
+float id(float v) { return v; }
+
+template 
+int foo(int v)
+{
+return int{id(v)};
+}


Re: PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread H.J. Lu
On Tue, Feb 12, 2019 at 4:43 PM Joseph Myers  wrote:
>
> On Wed, 13 Feb 2019, Jakub Jelinek wrote:
>
> > On Tue, Feb 12, 2019 at 11:21:04PM +, Joseph Myers wrote:
> > > I think this is changing architecture-independent code in a way that is
> > > not clearly safe based on the architecture-independent options design, in
> > > order to address an architecture-specific problem.  The exclusion of
> >
> > Actually, I think it is a problem common to many backends, in particular
> > those where *_host_detect_local_cpu emits for -m*=native sometimes more than
> > one option, so at least i386, s390, rs6000, maybe also those that emit just
> > one option because it likely ends up at a different spot on the command line
> > from where -m{arch,cpu,tune}=native was originally present (that would be
> > aarch64, alpha, arm, mips and sparc).  I guess the user expectations is that
> > -march=native -march=foobar will be handled as
> > -march=foobar, rather than -march=native -march=foobar -march=my_great_cpu 
> > -mfoo -mbar
>
> It seems right in the march= case to handle that combination as
> -march=foobar - but it's less clear if that must always be the case for
> Joined options with negative versions (at least, the semantics would need
> defining more carefully in options.texi, with an analysis of existing
> affected options).
>

A backend must explicitly specify "Negative(march=)" to get a negative Joined
option; it does not happen automatically.  What is wrong with having a negative
Joined option when a backend specifically asks for it?

-- 
H.J.


Re: PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread Joseph Myers
On Wed, 13 Feb 2019, Jakub Jelinek wrote:

> On Tue, Feb 12, 2019 at 11:21:04PM +, Joseph Myers wrote:
> > I think this is changing architecture-independent code in a way that is 
> > not clearly safe based on the architecture-independent options design, in 
> > order to address an architecture-specific problem.  The exclusion of 
> 
> Actually, I think it is a problem common to many backends, in particular
> those where *_host_detect_local_cpu emits for -m*=native sometimes more than
> one option, so at least i386, s390, rs6000, maybe also those that emit just
> one option because it likely ends up at a different spot on the command line
> from where -m{arch,cpu,tune}=native was originally present (that would be
> aarch64, alpha, arm, mips and sparc).  I guess the user expectations is that
> -march=native -march=foobar will be handled as
> -march=foobar, rather than -march=native -march=foobar -march=my_great_cpu 
> -mfoo -mbar

It seems right in the march= case to handle that combination as 
-march=foobar - but it's less clear if that must always be the case for 
Joined options with negative versions (at least, the semantics would need 
defining more carefully in options.texi, with an analysis of existing 
affected options).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix UB in prepare_cmp_insn (PR middle-end/89281)

2019-02-12 Thread Jakub Jelinek
On Wed, Feb 13, 2019 at 12:51:26AM +0100, Eric Botcazou wrote:
> > The following hunk of code results in UB on the recently added testcase,
> > because if cmp_mode is SImode or DImode, then 1 << 32 or 1 << 64 is
> > undefined.  Fixed by using GET_MODE_MASK, plus UINTVAL because size is
> > really unsigned (code later on uses unsignedp=1 too).
> 
> Doesn't the current check make sure that the RTL constant is valid for the 
> mode though (since RTL constants are sign-extended for their mode)?  See 
> emit_block_move_via_movmem for an equivalent check with GET_MODE_MASK >> 1.

The code will do:
  size = convert_to_mode (cmp_mode, size, 1);
i.e. convert size from whatever mode it had before to cmp_mode and the
test is whether it can do so without changing the behavior.  If size is
non-constant, then that can be obviously (without using range info etc.)
done only if the original mode is narrower or at most as wide as cmp_mode.
We could do the same for CONST_INT_P too, but as we know the constant,
it wants to make sure that the size can be expressed in cmp_mode.
As it is unsigned quantity, that can be checked by checking if the value is
<= GET_MODE_MASK.

Jakub


gotools patch committed: Remove test directories in mostlyclean

2019-02-12 Thread Ian Lance Taylor
This patch fixes the gotools Makefile to remove some more test
directories when running `make mostlyclean`.  It also cleans up the
chmod to avoid an error message when the directory does not exist.
This fixes PR 89193.  Ran gotools tests, ran various make clean
targets on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2019-02-12  Ian Lance Taylor  

PR go/89193
* Makefile.am (mostlyclean-local): Avoid getting an error from
chmod.  Remove check-vet-dir and gocache-test.
* Makefile.in: Regenerate.
Index: Makefile.am
===
--- Makefile.am (revision 268829)
+++ Makefile.am (working copy)
@@ -100,8 +100,9 @@ MOSTLYCLEANFILES = \
*.sent
 
 mostlyclean-local:
-   -chmod -R u+w check-go-dir
-   rm -rf check-go-dir check-runtime-dir cgo-test-dir carchive-test-dir
+   if test -d check-go-dir; then chmod -R u+w check-go-dir; fi
+   rm -rf check-go-dir check-runtime-dir cgo-test-dir carchive-test-dir \
+   check-vet-dir gocache-test
 
 if NATIVE
 


Re: [PATCH] Fix UB in prepare_cmp_insn (PR middle-end/89281)

2019-02-12 Thread Eric Botcazou
> The following hunk of code results in UB on the recently added testcase,
> because if cmp_mode is SImode or DImode, then 1 << 32 or 1 << 64 is
> undefined.  Fixed by using GET_MODE_MASK, plus UINTVAL because size is
> really unsigned (code later on uses unsignedp=1 too).

Doesn't the current check make sure that the RTL constant is valid for the 
mode though (since RTL constants are sign-extended for their mode)?  See 
emit_block_move_via_movmem for an equivalent check with GET_MODE_MASK >> 1.

-- 
Eric Botcazou


Re: [PATCH] Avoid assuming valid_constant_size_p argument is a constant expression (PR 89294)

2019-02-12 Thread Rainer Orth
Hi Martin,

> The attached patch removes the assumption introduced earlier today
> in my fix for bug 87996 that the valid_constant_size_p argument is
> a constant expression.  I couldn't come up with a C/C++ test case
> where this isn't true but apparently it can happen in Ada which I
> inadvertently didn't build.  I still haven't figured out what
> I have to do to build it on my Fedora 29 machine so I tested
> this change by hand (besides bootstrapping w/o Ada).

I've just completed a i386-pc-solaris2.11 bootstrap with your patch and
the failures are gone.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread Jakub Jelinek
On Tue, Feb 12, 2019 at 11:21:04PM +, Joseph Myers wrote:
> I think this is changing architecture-independent code in a way that is 
> not clearly safe based on the architecture-independent options design, in 
> order to address an architecture-specific problem.  The exclusion of 

Actually, I think it is a problem common to many backends, in particular
those where *_host_detect_local_cpu emits for -m*=native sometimes more than
one option, so at least i386, s390, rs6000, maybe also those that emit just
one option because it likely ends up at a different spot on the command line
from where -m{arch,cpu,tune}=native was originally present (that would be
aarch64, alpha, arm, mips and sparc).  I guess the user expectation is that
-march=native -march=foobar will be handled as
-march=foobar, rather than -march=native -march=foobar -march=my_great_cpu 
-mfoo -mbar
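
The "last one wins" expectation described above can be sketched roughly as follows.  This is a toy model and not GCC's prune_options: the function name prune_joined and the list of self-negating prefixes are hypothetical, standing in for options that would carry something like Negative(march=).  Options whose prefix is not listed (for example -I, which is meaningful when repeated) are left alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: drop every argument that starts with one of the
// given prefixes when a later argument starts with the same prefix.
std::vector<std::string>
prune_joined(const std::vector<std::string>& args,
             const std::vector<std::string>& self_negating_prefixes)
{
    std::vector<std::string> kept;
    for (std::size_t i = 0; i < args.size(); ++i) {
        bool overridden = false;
        for (const std::string& prefix : self_negating_prefixes) {
            // Skip prefixes that this argument does not start with.
            if (args[i].compare(0, prefix.size(), prefix) != 0)
                continue;
            // A later option with the same prefix overrides this one.
            for (std::size_t j = i + 1; j < args.size(); ++j) {
                if (args[j].compare(0, prefix.size(), prefix) == 0) {
                    overridden = true;
                    break;
                }
            }
        }
        if (!overridden)
            kept.push_back(args[i]);
    }
    return kept;
}
```

Under these assumptions, a command line of -march=native -Ifoo -march=skylake-avx512 -Ibar with the single prefix -march= keeps both -I options and only the last -march=.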

Jakub


Re: PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread Joseph Myers
On Tue, 12 Feb 2019, H.J. Lu wrote:

> > > Prune joined switches with negation to allow -march=skylake-avx512 to
> > > override previous -march=native on command-line.
> > >
> > > PR driver/69471
> > > * opts-common.c (prune_options): Also prune joined switches
> > > with negation.
> > > * config/i386/i386.opt (march=): Add Negative(march=).
> > > (mtune=): Add Negative(mtune=).
> >
> > Here is the updated patch.
> >
> 
> PING:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00492.html

I think this is changing architecture-independent code in a way that is 
not clearly safe based on the architecture-independent options design, in 
order to address an architecture-specific problem.  The exclusion of 
joined switches is presumably aimed at such switches that can be used 
multiple times with different arguments, with different semantics to just 
using them once (e.g. -I).  Is there any reason such an option should not 
have a negative form?

I think anything like this (would not be suitable for the current 
development stage and) would need a more detailed analysis of what 
existing options might be affected by the change and why it's OK for them, 
as well as updates to options.texi to discuss this issue with the 
semantics of Negative and kinds of options it cannot be used with.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Fix ICE in strlen () > 0 folding (PR tree-optimization/89314)

2019-02-12 Thread Jakub Jelinek
Hi!

fold_binary_loc verifies that strlen argument is a pointer, but doesn't
verify what the pointee is.
The following patch just always converts it to the right pointer type
(const char *) and dereferences only that.
Another option would be to punt if the pointee (TYPE_MAIN_VARIANT) is not
char_type_node, but then e.g. unsigned_char_type_node or
signed_char_type_node (or maybe char8_t) wouldn't be that bad.
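
At the source level, the folding in question amounts to something like the sketch below (the function names are hypothetical and this is not the fold-const.c code): a comparison of strlen against zero only needs the first byte, and that byte is read through a const char * regardless of the declared pointee type.

```cpp
#include <cstring>

// What the source expression computes.
bool nonempty_via_strlen(const char *s)
{
    return std::strlen(s) > 0;
}

// The folded form: convert the pointer to const char * first, then
// inspect only the first byte.
bool nonempty_via_first_byte(const void *p)
{
    const char *s = static_cast<const char *>(p);
    return *s != 0;
}
```

Both functions should agree for any valid NUL-terminated string.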

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-12  Jakub Jelinek  

PR tree-optimization/89314
* fold-const.c (fold_binary_loc): Cast strlen argument to
const char * before dereferencing it.  Formatting fixes.

* gcc.dg/pr89314.c: New test.

--- gcc/fold-const.c.jj 2019-02-11 18:04:18.0 +0100
+++ gcc/fold-const.c 2019-02-12 21:11:21.491388038 +0100
@@ -10740,20 +10740,24 @@ fold_binary_loc (location_t loc, enum tr
strlen(ptr) != 0   =>  *ptr != 0
 Other cases should reduce to one of these two (or a constant)
 due to the return value of strlen being unsigned.  */
-  if (TREE_CODE (arg0) == CALL_EXPR
- && integer_zerop (arg1))
+  if (TREE_CODE (arg0) == CALL_EXPR && integer_zerop (arg1))
{
  tree fndecl = get_callee_fndecl (arg0);
 
  if (fndecl
  && fndecl_built_in_p (fndecl, BUILT_IN_STRLEN)
  && call_expr_nargs (arg0) == 1
- && TREE_CODE (TREE_TYPE (CALL_EXPR_ARG (arg0, 0))) == POINTER_TYPE)
+ && (TREE_CODE (TREE_TYPE (CALL_EXPR_ARG (arg0, 0)))
+ == POINTER_TYPE))
{
- tree iref = build_fold_indirect_ref_loc (loc,
-  CALL_EXPR_ARG (arg0, 0));
+ tree ptrtype
+   = build_pointer_type (build_qualified_type (char_type_node,
+   TYPE_QUAL_CONST));
+ tree ptr = fold_convert_loc (loc, ptrtype,
+  CALL_EXPR_ARG (arg0, 0));
+ tree iref = build_fold_indirect_ref_loc (loc, ptr);
  return fold_build2_loc (loc, code, type, iref,
- build_int_cst (TREE_TYPE (iref), 0));
+ build_int_cst (TREE_TYPE (iref), 0));
}
}
 
--- gcc/testsuite/gcc.dg/pr89314.c.jj   2019-02-12 21:15:11.624589045 +0100
+++ gcc/testsuite/gcc.dg/pr89314.c  2019-02-12 21:14:49.138960233 +0100
@@ -0,0 +1,13 @@
+/* PR tree-optimization/89314 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wbuiltin-declaration-mismatch -Wextra" } */
+
+extern __SIZE_TYPE__ strlen (const float *);   /* { dg-warning "mismatch in argument 1 type of built-in function" } */
+void bar (void);
+
+void
+foo (float *s)
+{
+  if (strlen (s) > 0)
+bar ();
+}

Jakub


[PATCH] Fix up norm2 simplification (PR middle-end/88074)

2019-02-12 Thread Jakub Jelinek
Hi!

As discussed recently on the mailing list, the norm2 simplification doesn't
work if we limit mpfr emin/emax to some values derived from maximum floating
exponents (and precision for denormals).

The following patch adjusts the computation so that it is scaled down if
needed.  In particular, if any value in the array is so large that squaring
it would overflow or come very close to overflowing, the scale is set to
2**(max_exponent/2+4), and if the running sum gets close to overflowing
during the computation, it is scaled down a little further.  The scaling
always uses powers of two: operands are divided by the scale and the sum by
the square of the scale, and at the end the sqrt result is multiplied back
by the scale.  I had to change simplify_transformation_to_array so that
post_op is run immediately after the ops it corresponds to have finished,
so that a single global variable can hold the scale.  From my understanding
of e.g. the libgfortran norm2 code, where sqrt is done at basically this
spot, I hope it isn't possible for the same *dest to be updated multiple
times with dest increments/decrements in between.
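
The idea of scaling by a power of two can be sketched with plain doubles as below.  Note that this is a simplified variant and not the patch's algorithm: instead of rescaling incrementally with mpfr as the sum grows, it makes one pass to find the largest magnitude and prescales every operand by that element's binary exponent.  The function name scaled_norm2 is hypothetical.

```cpp
#include <cmath>
#include <vector>

// Two-norm computed with power-of-two prescaling so that squaring the
// operands cannot overflow.
double scaled_norm2(const std::vector<double>& xs)
{
    double max_abs = 0.0;
    for (double x : xs)
        max_abs = std::fmax(max_abs, std::fabs(x));
    // Nothing to scale for an all-zero (or empty) input; an infinite
    // element makes the norm infinite.
    if (max_abs == 0.0 || std::isinf(max_abs))
        return max_abs;

    int exp = 0;
    std::frexp(max_abs, &exp);   // max_abs == m * 2**exp with m in [0.5, 1)

    double sum = 0.0;
    for (double x : xs) {
        // Dividing by 2**exp keeps every scaled magnitude below 1.
        double scaled = std::ldexp(x, -exp);
        sum += scaled * scaled;
    }
    // Undo the scaling: sqrt(sum of (x / s)**2) * s.
    return std::ldexp(std::sqrt(sum), exp);
}
```

For an input such as {1e200, 1e200}, summing the squares directly would overflow to infinity, whereas the scaled version should stay finite.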

Bootstrapped/regtested on x86_64-linux and i686-linux (together with
Richard's patch), ok for trunk?

2019-02-12  Jakub Jelinek  

PR middle-end/88074
* simplify.c (simplify_transformation_to_array): Run post_op
immediately after processing corresponding row, rather than at the
end.
(norm2_scale): New variable.
(add_squared): Rename to ...
(norm2_add_squared): ... this.  Scale down operand and/or result
if needed.
(do_sqrt): Rename to ...
(norm2_do_sqrt): ... this.  Handle the result == e case.  Scale up
result and clear norm2_scale.
(gfc_simplify_norm2): Clear norm2_scale.  Change add_squared to
norm2_add_squared and do_sqrt to norm2_do_sqrt.  Scale up result
and clear norm2_scale again.

--- gcc/fortran/simplify.c.jj   2019-01-10 11:43:12.452409482 +0100
+++ gcc/fortran/simplify.c  2019-02-12 19:54:03.726526824 +0100
@@ -636,6 +636,9 @@ simplify_transformation_to_array (gfc_ex
if (*src)
  *dest = op (*dest, gfc_copy_expr (*src));
 
+  if (post_op)
+   *dest = post_op (*dest, *dest);
+
   count[0]++;
   base += sstride[0];
   dest += dstride[0];
@@ -671,10 +674,7 @@ simplify_transformation_to_array (gfc_ex
   result_ctor = gfc_constructor_first (result->value.constructor);
   for (i = 0; i < resultsize; ++i)
 {
-  if (post_op)
-   result_ctor->expr = post_op (result_ctor->expr, resultvec[i]);
-  else
-   result_ctor->expr = resultvec[i];
+  result_ctor->expr = resultvec[i];
   result_ctor = gfc_constructor_next (result_ctor);
 }
 
@@ -6048,9 +6048,10 @@ gfc_simplify_idnint (gfc_expr *e)
   return simplify_nint ("IDNINT", e, NULL);
 }
 
+static int norm2_scale;
 
 static gfc_expr *
-add_squared (gfc_expr *result, gfc_expr *e)
+norm2_add_squared (gfc_expr *result, gfc_expr *e)
 {
   mpfr_t tmp;
 
@@ -6059,8 +6060,45 @@ add_squared (gfc_expr *result, gfc_expr
  && result->expr_type == EXPR_CONSTANT);
 
   gfc_set_model_kind (result->ts.kind);
+  int index = gfc_validate_kind (BT_REAL, result->ts.kind, false);
+  mpfr_exp_t exp;
+  if (mpfr_regular_p (result->value.real))
+{
+  exp = mpfr_get_exp (result->value.real);
+  /* If result is getting close to overflowing, scale down.  */
+  if (exp >= gfc_real_kinds[index].max_exponent - 4
+ && norm2_scale <= gfc_real_kinds[index].max_exponent - 2)
+   {
+ norm2_scale += 2;
+ mpfr_div_ui (result->value.real, result->value.real, 16,
+  GFC_RND_MODE);
+   }
+}
+
   mpfr_init (tmp);
-  mpfr_pow_ui (tmp, e->value.real, 2, GFC_RND_MODE);
+  if (mpfr_regular_p (e->value.real))
+{
+  exp = mpfr_get_exp (e->value.real);
+  /* If e**2 would overflow or close to overflowing, scale down.  */
+  if (exp - norm2_scale >= gfc_real_kinds[index].max_exponent / 2 - 2)
+   {
+ int new_scale = gfc_real_kinds[index].max_exponent / 2 + 4;
+ mpfr_set_ui (tmp, 1, GFC_RND_MODE);
+ mpfr_set_exp (tmp, new_scale - norm2_scale);
+ mpfr_div (result->value.real, result->value.real, tmp, GFC_RND_MODE);
+ mpfr_div (result->value.real, result->value.real, tmp, GFC_RND_MODE);
+ norm2_scale = new_scale;
+   }
+}
+  if (norm2_scale)
+{
+  mpfr_set_ui (tmp, 1, GFC_RND_MODE);
+  mpfr_set_exp (tmp, norm2_scale);
+  mpfr_div (tmp, e->value.real, tmp, GFC_RND_MODE);
+}
+  else
+mpfr_set (tmp, e->value.real, GFC_RND_MODE);
+  mpfr_pow_ui (tmp, tmp, 2, GFC_RND_MODE);
   mpfr_add (result->value.real, result->value.real, tmp,
GFC_RND_MODE);
   mpfr_clear (tmp);
@@ -6070,14 +6108,26 @@ add_squared (gfc_expr *result, gfc_expr
 
 
 static gfc_expr *
-do_sqrt (gfc_expr *result, gfc_expr *e)
+norm2_do_sqrt (gfc_expr *result, gfc_expr *e)
 {
   gcc_assert (e->ts.type == BT_REAL 

[C++ PATCH] preview: Fix braces around scalar initializer (C++/88572)

2019-02-12 Thread will wray
A proposed patch for Bug 88572 is attached to the bug report along
with a short description and Change Log (a link there gives a pretty
diff of the patch):

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88572#c15

I'd appreciate any review of this patch, as well as testing on more
platforms. The patch with updated tests passes for me on x86_64.

There's also test code in bug comment #1 that demonstrates SFINAE
based on the nesting of braces. It could also be added to the
testsuite - I'm not sure how to do that or if it is needed.


braces_patch
Description: Binary data


[PATCH] Fix TLS ICE with -mcmodel=large (PR target/89290)

2019-02-12 Thread Jakub Jelinek
Hi!

The following patch fixes ICE, when we try to split a double-word TLS load
or store.  The problem is that x86_64_immediate_operand disallows CONST_INT
offsets with -mcmodel=large.  I guess it is intentional for SYMBOL_REFs,
LABEL_REFs etc., but for the TLS LE UNSPECs it should be fine, there are no
64-bit relocations for these, we don't really support > 2GB thread local
segments.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-12  Jakub Jelinek  

PR target/89290
* config/i386/predicates.md (x86_64_immediate_operand): Allow
TLS UNSPECs offsetted by signed 32-bit CONST_INT even with
-mcmodel=large.

* gcc.target/i386/pr89290.c: New test.

--- gcc/config/i386/predicates.md.jj 2019-01-01 12:37:32.267727037 +0100
+++ gcc/config/i386/predicates.md   2019-02-12 17:07:15.937097266 +0100
@@ -182,7 +182,7 @@ (define_predicate "x86_64_immediate_oper
  rtx op1 = XEXP (XEXP (op, 0), 0);
  rtx op2 = XEXP (XEXP (op, 0), 1);
 
- if (ix86_cmodel == CM_LARGE)
+ if (ix86_cmodel == CM_LARGE && GET_CODE (op1) != UNSPEC)
return false;
  if (!CONST_INT_P (op2))
return false;
--- gcc/testsuite/gcc.target/i386/pr89290.c.jj  2019-02-12 17:32:49.291750588 +0100
+++ gcc/testsuite/gcc.target/i386/pr89290.c 2019-02-12 17:34:00.847568163 +0100
@@ -0,0 +1,19 @@
+/* PR target/89290 */
+/* { dg-do compile { target { tls && lp64 } } } */
+/* { dg-options "-O0 -mcmodel=large" } */
+
+struct S { long int a, b; } e;
+__thread struct S s;
+__thread struct S t[2];
+
+void
+foo (void)
+{
+  s = e;
+}
+
+void
+bar (void)
+{
+  t[1] = e;
+}

Jakub


[PATCH] Fix UB in prepare_cmp_insn (PR middle-end/89281)

2019-02-12 Thread Jakub Jelinek
Hi!

The following hunk of code results in UB on the recently added testcase,
because if cmp_mode is SImode or DImode, then 1 << 32 or 1 << 64 is
undefined.  Fixed by using GET_MODE_MASK, plus UINTVAL because size is
really unsigned (code later on uses unsignedp=1 too).
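
To make the problem concrete: shifting a 32-bit int by 32 (or a 64-bit value by 64) is undefined because the shift count is not smaller than the operand width.  The sketch below shows one way to build an all-ones mask without such a shift and to compare an unsigned size against it.  The names mode_mask and size_fits are hypothetical stand-ins and not GCC's GET_MODE_MASK implementation.

```cpp
#include <cstdint>

// All-ones mask for a mode of `bits` bits (1..64).  The full-width case
// is handled separately so the shift count is always below 64.
constexpr std::uint64_t mode_mask(unsigned bits)
{
    return bits >= 64 ? ~std::uint64_t{0}
                      : (std::uint64_t{1} << bits) - 1;
}

// An unsigned size fits the mode exactly when it does not exceed the mask.
constexpr bool size_fits(std::uint64_t size, unsigned bits)
{
    return size <= mode_mask(bits);
}
```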

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-12  Jakub Jelinek  

PR middle-end/89281
* optabs.c (prepare_cmp_insn): Use UINTVAL (size) instead of
INTVAL (size), compare it to GET_MODE_MASK instead of
1 << GET_MODE_BITSIZE.

--- gcc/optabs.c.jj 2019-02-05 10:16:34.533743051 +0100
+++ gcc/optabs.c 2019-02-11 09:48:15.514432541 +0100
@@ -3898,7 +3898,7 @@ prepare_cmp_insn (rtx x, rtx y, enum rtx
 
  /* Must make sure the size fits the insn's mode.  */
  if (CONST_INT_P (size)
- ? INTVAL (size) >= (1 << GET_MODE_BITSIZE (cmp_mode))
+ ? UINTVAL (size) > GET_MODE_MASK (cmp_mode)
  : (GET_MODE_BITSIZE (as_a  (GET_MODE (size)))
 > GET_MODE_BITSIZE (cmp_mode)))
continue;

Jakub


Re: [PATCH] Updated patches for the port of gccgo to GNU/Hurd

2019-02-12 Thread Ian Lance Taylor
On Mon, Feb 11, 2019 at 1:38 PM Svante Signell  wrote:
>
> On Mon, 2019-02-11 at 10:27 -0800, Ian Lance Taylor wrote:
>
> > It sounds like the right fix is to use #ifdef WIFCONTINUED in
> > syscall/wait.c.  If WIFCONTINUED is not defined, the Continued
> > function should always return 0.

> I can also easily submit a patch for WIFCONTINUED returning 0. Problem is I'll
> be AFK for the next week. Maybe this can wait, or you find a solution? 
> Regarding a comm option for ps, Samuel is the best source.

I've committed this patch that should fix this problem.  Bootstrapped
and tested on x86_64-pc-linux-gnu.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 268785)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-fc8aa5a46433d6ecba9fd1cd0bee4290c314ca06
+6d03c4c8ca320042bd550d44c0f25575c5311ac2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/syscall/wait.c
===
--- libgo/go/syscall/wait.c (revision 268369)
+++ libgo/go/syscall/wait.c (working copy)
@@ -16,6 +16,10 @@
 #define WCOREDUMP(status) (((status) & 0200) != 0)
 #endif
 
+#ifndef WIFCONTINUED
+#define WIFCONTINUED(x) 0
+#endif
+
 extern _Bool Exited (uint32_t *w)
   __asm__ (GOSYM_PREFIX "syscall.WaitStatus.Exited");
 


Re: [PATCH] Add target-zlib to top-level configure, use zlib from libphobos

2019-02-12 Thread Iain Buclaw
On Tue, 12 Feb 2019 at 10:40, Richard Biener  wrote:
>
> On Sat, Feb 9, 2019 at 10:37 AM Iain Buclaw  wrote:
> >
> > On Mon, 28 Jan 2019 at 13:10, Richard Biener  
> > wrote:
> > >
> > > On Mon, Jan 21, 2019 at 7:35 PM Iain Buclaw  
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > Following on from the last, this adds target-zlib to target_libraries
> > > > and updates libphobos build scripts to link to libz_convenience.a.
> > > > The D front-end already has target-zlib in d/config-lang.in.
> > > >
> > > > Is the top-level part OK?  I considered disabling target-zlib if
> > > > libphobos is not being built, but decided against unless it's
> > > > requested.
> > >
> > > Hmm, you overload --with-system-zlib to apply to both host and target
> > > (I guess it already applied to build), not sure if that's really desired?
> > > I suppose libphobos is the first target library linking against zlib?
> > >
> >
> > Originally, libgcj linked to zlib.
> >
> > > You are also falling back to in-tree zlib if --with-system-zlib was
> > > specified but no zlib was found - I guess for cross builds that
> > > will easily get not noticed...  The toplevel --with-system-zlib makes
> > > it much harder and simply fails.
> > >
> >
> > OK, so keep --with-target-system-zlib to distinguish between the two?
>
> Yes, and fail if specified but not found.
>

Updated patch.  Checked that it correctly fails when
--with-target-system-zlib is given and zlib is missing.

-- 
Iain

---
ChangeLog:

2019-02-12  Iain Buclaw  

* configure.ac: Add target-zlib to target_libraries.
* configure: Regenerate.

libphobos/ChangeLog:

2019-02-12  Iain Buclaw  

* m4/druntime/libraries.m4 (DRUNTIME_LIBRARIES_ZLIB): Use
libz_convenience.a if not using system zlib.
* Makefile.in: Regenerate.
* configure: Regenerate.
* libdruntime/Makefile.in: Regenerate.
* src/Makefile.am: Remove ZLIB_CSOURCES and AM_CFLAGS.
* src/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
---
diff --git a/configure b/configure
index adf4fda0f69..1c5f9b502a8 100755
--- a/configure
+++ b/configure
@@ -2813,7 +2813,8 @@ target_libraries="target-libgcc \
 		target-libobjc \
 		target-libada \
 		target-libgo \
-		target-libphobos"
+		target-libphobos \
+		target-zlib"
 
 # these tools are built using the target libraries, and are intended to
 # run only in the target environment
diff --git a/configure.ac b/configure.ac
index 87f2aee0500..cffccd37805 100644
--- a/configure.ac
+++ b/configure.ac
@@ -163,7 +163,8 @@ target_libraries="target-libgcc \
 		target-libobjc \
 		target-libada \
 		target-libgo \
-		target-libphobos"
+		target-libphobos \
+		target-zlib"
 
 # these tools are built using the target libraries, and are intended to
 # run only in the target environment
diff --git a/libphobos/Makefile.in b/libphobos/Makefile.in
index 87eaf28aba7..6a7793a75e8 100644
--- a/libphobos/Makefile.in
+++ b/libphobos/Makefile.in
@@ -240,6 +240,7 @@ LIBBACKTRACE = @LIBBACKTRACE@
 LIBOBJS = @LIBOBJS@
 LIBS = @LIBS@
 LIBTOOL = @LIBTOOL@
+LIBZ = @LIBZ@
 LIPO = @LIPO@
 LN_S = @LN_S@
 LTLIBOBJS = @LTLIBOBJS@
diff --git a/libphobos/configure b/libphobos/configure
index 9f96ad5d190..252ec6ad718 100755
--- a/libphobos/configure
+++ b/libphobos/configure
@@ -3,7 +3,7 @@
 # Generated by GNU Autoconf 2.69 for package-unused version-unused.
 #
 #
-# Copyright (C) 1992-2019 Free Software Foundation, Inc.
+# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
 #
 #
 # This configure script is free software; the Free Software Foundation
@@ -640,8 +640,7 @@ gdc_include_dir
 libphobos_toolexeclibdir
 libphobos_toolexecdir
 gcc_version
-DRUNTIME_ZLIB_SYSTEM_FALSE
-DRUNTIME_ZLIB_SYSTEM_TRUE
+LIBZ
 BACKTRACE_SUPPORTS_THREADS
 BACKTRACE_USES_MALLOC
 BACKTRACE_SUPPORTED
@@ -1568,7 +1567,7 @@ if $ac_init_version; then
 package-unused configure version-unused
 generated by GNU Autoconf 2.69
 
-Copyright (C) 2012-2019 Free Software Foundation, Inc.
+Copyright (C) 2012 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.
 _ACEOF
@@ -14717,79 +14716,95 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 fi
 
 
+  ac_ext=c
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_c_compiler_gnu
+
+  LIBZ=""
+
 
 # Check whether --with-target-system-zlib was given.
 if test "${with_target_system_zlib+set}" = set; then :
-  withval=$with_target_system_zlib;
+  withval=$with_target_system_zlib; system_zlib=yes
+else
+  system_zlib=no
 fi
 
 
-  system_zlib=false
-  if test "x$with_target_system_zlib" = "xyes"; then :
-
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for deflate in -lz" >&5
-$as_echo_n "checking for deflate in -lz... " >&6; }
-if ${ac_cv_lib_z_deflate+:} false; then :
-  $as_echo_n "(cached) " >&6

[C++ PATCH] PR c++/89144 - link error with constexpr initializer_list.

2019-02-12 Thread Jason Merrill
In this PR, we were unnecessarily rejecting a constexpr initializer_list
with no elements.  This seems like a fairly useless degenerate case, but it
makes sense to avoid allocating an underlying array at all if there are no
elements and instead use a null pointer, like the initializer_list default
constructor.

If the (automatic storage duration) list does have initializer elements, we
continue to reject the declaration, because the initializer_list ends up
referring to an automatic storage duration temporary array, which is not a
suitable constant initializer.  If we make it static, it should be OK
because we refer to a static array.  The second hunk fixes that case.  It
also means we won't diagnose some real errors in templates, but those
diagnostics aren't required, and we'll get them when the template is
instantiated.
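
The two accepted cases described above look roughly like this at the source level.  This is a non-template sketch rather than the committed constexpr-initlist11.C test, and the function name list_sizes is hypothetical.

```cpp
#include <cstddef>
#include <initializer_list>

std::size_t list_sizes()
{
    // Static storage duration: the backing array is static as well, so
    // the list is a suitable constant initializer.
    static constexpr std::initializer_list<int> with_elements{1, 2, 3};

    // Automatic storage duration with no elements: no backing array is
    // needed, so this can be a constant expression too.
    constexpr std::initializer_list<int> empty{};

    return with_elements.size() + empty.size();
}
```

An automatic constexpr list with elements would still be rejected, because it would refer to a temporary array with automatic storage duration.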

Tested x86_64-pc-linux-gnu, applying to trunk.

* call.c (convert_like_real) [ck_list]: Don't allocate a temporary
array for an empty list.
* typeck2.c (store_init_value): Don't use cxx_constant_init in a
template.
---
 gcc/cp/call.c | 61 +++
 gcc/cp/typeck2.c  |  5 ++
 .../g++.dg/cpp0x/constexpr-initlist11.C   | 11 
 gcc/cp/ChangeLog  |  8 +++
 4 files changed, 59 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-initlist11.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index e9c131dd66b..c53eb582aac 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7085,34 +7085,42 @@ convert_like_real (conversion *convs, tree expr, tree fn, int argnum,
   {
/* Conversion to std::initializer_list.  */
tree elttype = TREE_VEC_ELT (CLASSTYPE_TI_ARGS (totype), 0);
-   tree new_ctor = build_constructor (init_list_type_node, NULL);
unsigned len = CONSTRUCTOR_NELTS (expr);
-   tree array, val, field;
-   vec *vec = NULL;
-   unsigned ix;
+   tree array;
 
-   /* Convert all the elements.  */
-   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (expr), ix, val)
+   if (len)
  {
-   tree sub = convert_like_real (convs->u.list[ix], val, fn, argnum,
- false, false, complain);
-   if (sub == error_mark_node)
- return sub;
-   if (!BRACE_ENCLOSED_INITIALIZER_P (val)
-   && !check_narrowing (TREE_TYPE (sub), val, complain))
- return error_mark_node;
-   CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (new_ctor), NULL_TREE, sub);
-   if (!TREE_CONSTANT (sub))
- TREE_CONSTANT (new_ctor) = false;
+   tree val; unsigned ix;
+
+   tree new_ctor = build_constructor (init_list_type_node, NULL);
+
+   /* Convert all the elements.  */
+   FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (expr), ix, val)
+ {
+   tree sub = convert_like_real (convs->u.list[ix], val, fn,
+ argnum, false, false, complain);
+   if (sub == error_mark_node)
+ return sub;
+   if (!BRACE_ENCLOSED_INITIALIZER_P (val)
+   && !check_narrowing (TREE_TYPE (sub), val, complain))
+ return error_mark_node;
+   CONSTRUCTOR_APPEND_ELT (CONSTRUCTOR_ELTS (new_ctor),
+   NULL_TREE, sub);
+   if (!TREE_CONSTANT (sub))
+ TREE_CONSTANT (new_ctor) = false;
+ }
+   /* Build up the array.  */
+   elttype = cp_build_qualified_type
+ (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
+   array = build_array_of_n_type (elttype, len);
+   array = finish_compound_literal (array, new_ctor, complain);
+   /* Take the address explicitly rather than via decay_conversion
+  to avoid the error about taking the address of a temporary.  */
+   array = cp_build_addr_expr (array, complain);
  }
-   /* Build up the array.  */
-   elttype = cp_build_qualified_type
- (elttype, cp_type_quals (elttype) | TYPE_QUAL_CONST);
-   array = build_array_of_n_type (elttype, len);
-   array = finish_compound_literal (array, new_ctor, complain);
-   /* Take the address explicitly rather than via decay_conversion
-  to avoid the error about taking the address of a temporary.  */
-   array = cp_build_addr_expr (array, complain);
+   else
+ array = nullptr_node;
+
array = cp_convert (build_pointer_type (elttype), array, complain);
if (array == error_mark_node)
  return error_mark_node;
@@ -7123,11 +7131,12 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
totype = complete_type_or_maybe_complain (totype, NULL_TREE, complain);
if (!totype)
  return error_mark_node;
-   field = 

Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread H.J. Lu
On Tue, Feb 12, 2019 at 12:24 PM Uros Bizjak  wrote:
>
> On 2/12/19, H.J. Lu  wrote:
> > On Tue, Feb 12, 2019 at 11:44 AM Uros Bizjak  wrote:
> >>
> >> On Tue, Feb 12, 2019 at 8:35 PM H.J. Lu  wrote:
> >> >
> >> > On Tue, Feb 12, 2019 at 5:43 AM Uros Bizjak  wrote:
> >> > >
> >> > > On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
> >> > > >
> >> > > > PR target/89021
> >> > > > * config/i386/i386.c (ix86_expand_vector_init_duplicate):
> >> > > > Set
> >> > > > mmx_ok to true if TARGET_MMX_WITH_SSE is true.
> >> > > > (ix86_expand_vector_init_one_nonzero): Likewise.
> >> > > > (ix86_expand_vector_init_one_var): Likewise.
> >> > > > (ix86_expand_vector_init_general): Likewise.
> >> > > > (ix86_expand_vector_init): Likewise.
> >> > > > (ix86_expand_vector_set): Likewise.
> >> > > > (ix86_expand_vector_extract): Likewise.
> >> > > > * config/i386/mmx.md (*vec_dupv2sf): Changed to
> >> > > > define_insn_and_split to support SSE emulation.
> >> > > > (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
> >> > > > (vec_extractv2sf_1 splitter): Likewise.
> >> > > > (vec_extractv2sfsf): Likewise.
> >> > > > (vec_setv2si): Likewise.
> >> > > > (vec_extractv2si_1 splitter): Likewise.
> >> > > > (vec_extractv2sisi): Likewise.
> >> > > > (vec_setv4hi): Likewise.
> >> > > > (vec_extractv4hihi): Likewise.
> >> > > > (vec_setv8qi): Likewise.
> >> > > > (vec_extractv8qiqi): Likewise.
> >> > > > (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
> >> > > > (*vec_extractv2sf_1): Likewise.
> >> > > > (*vec_extractv2si_0): Likewise.
> >> > > > (*vec_extractv2si_1): Likewise.
> >> > > > (*vec_extractv2sf_0_sse): New.
> >> > > > (*vec_extractv2sf_1_sse): Likewise.
> >> > > > (*vec_extractv2si_0_sse): Likewise.
> >> > > > (*vec_extractv2si_1_sse): Likewise.
> >> > >
> >> > > Please do not introduce new *_sse patterns, use mmx_isa attribute to
> >> > > disable unwanted alternatives.
> >> >
> >> > Will do.
> >> >
> >> > > >  (define_insn_and_split "*vec_extractv2si_zext_mem"
> >> > > > -  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> >> > > > +  [(set (match_operand:DI 0 "register_operand" "=x,r")
> >> > > > (zero_extend:DI
> >> > > >   (vec_select:SI
> >> > > > -   (match_operand:V2SI 1 "memory_operand" "o,o,o")
> >> > > > +   (match_operand:V2SI 1 "memory_operand" "o,o")
> >> > > > (parallel [(match_operand:SI 2
> >> > > > "const_0_to_1_operand")]]
> >> > > > -  "TARGET_64BIT && TARGET_MMX"
> >> > > > +  "TARGET_64BIT"
> >> > >
> >> > > Here you need TARGET_64BIT && (TARGET_MMX || TARGET_MMX_WITH_SSE) and
> >> > > mmx_isa attribute.
> >> > >
> >> >
> >> > Why is && (TARGET_MMX || TARGET_MMX_WITH_SSE) needed?  The 3rd
> >> > alternative doesn't need MMX nor SSE2:
> >>
> >> Ah, I didn't notice that. LGTM then.
> >>
> >> > (define_insn_and_split "*vec_extractv2si_zext_mem"
> >> >   [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> >> > (zero_extend:DI
> >> >   (vec_select:SI
> >> > (match_operand:V2SI 1 "memory_operand" "o,o,o")
> >> > (parallel [(match_operand:SI 2
> >> > "const_0_to_1_operand")]]
> >> >   "TARGET_64BIT"
> >> >   "#"
> >> >   "&& reload_completed"
> >> >   [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
> >> > {
> >> >   operands[1] = adjust_address (operands[1], SImode, INTVAL
> >> > (operands[2]) * 4);
> >> > }
> >> >   [(set_attr "mmx_isa" "native,sse2,base")])
> >>
> >> Please write this as "native,*,*".
> >
> > Did you mean "native,sse2,*"?  The second alternative is SSE2 MOVD:
>
> No, my proposed definition is OK, see below.
>
> > MOVD (when destination operand is XMM register)
> > DEST[31:0] ← SRC;
> > DEST[127:32] ← H;
> > DEST[MAXVL-1:128] (Unmodified)
>
> You should also add "isa" attribute with "*,sse2,*", which should be
> there from the beginning.

Can we have both isa and mmx_isa attributes in the same pattern?

> BTW: sse2 is not a member of the mmx_isa attribute.

If not, I can add sse2 to  mmx_isa.

> Uros.
>
> >> This way, it is clear that we enable alternative 0 only for native
> >> mmx. It looks to me that we need to add similar treatment to a couple
> >> of other patterns in sse.md, where we allow "y" constraint, e.g.
> >> *vec_concatv2sf_sse, *vec_concatv2si_sse4_1, etc.
> >>
> >
> > I will take a look.
> >
> > Thanks.
> >
> > --
> > H.J.
> >



-- 
H.J.


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread Uros Bizjak
On 2/12/19, H.J. Lu  wrote:

>> This way, it is clear that we enable alternative 0 only for native
>> mmx. It looks to me that we need to add similar treatment to a couple
>> of other patterns in sse.md, where we allow "y" constraint, e.g.
>> *vec_concatv2sf_sse, *vec_concatv2si_sse4_1, etc.
>>
>
> I will take a look.

From a quick look at the mentioned patterns, I see that a couple of
movd with XMM reg are wrongly marked as sse2 isa. movd/movq with MMX
regs are base MMX instructions.

Uros.


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread Uros Bizjak
On 2/12/19, H.J. Lu  wrote:
> On Tue, Feb 12, 2019 at 11:44 AM Uros Bizjak  wrote:
>>
>> On Tue, Feb 12, 2019 at 8:35 PM H.J. Lu  wrote:
>> >
>> > On Tue, Feb 12, 2019 at 5:43 AM Uros Bizjak  wrote:
>> > >
>> > > On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>> > > >
>> > > > PR target/89021
>> > > > * config/i386/i386.c (ix86_expand_vector_init_duplicate):
>> > > > Set
>> > > > mmx_ok to true if TARGET_MMX_WITH_SSE is true.
>> > > > (ix86_expand_vector_init_one_nonzero): Likewise.
>> > > > (ix86_expand_vector_init_one_var): Likewise.
>> > > > (ix86_expand_vector_init_general): Likewise.
>> > > > (ix86_expand_vector_init): Likewise.
>> > > > (ix86_expand_vector_set): Likewise.
>> > > > (ix86_expand_vector_extract): Likewise.
>> > > > * config/i386/mmx.md (*vec_dupv2sf): Changed to
>> > > > define_insn_and_split to support SSE emulation.
>> > > > (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
>> > > > (vec_extractv2sf_1 splitter): Likewise.
>> > > > (vec_extractv2sfsf): Likewise.
>> > > > (vec_setv2si): Likewise.
>> > > > (vec_extractv2si_1 splitter): Likewise.
>> > > > (vec_extractv2sisi): Likewise.
>> > > > (vec_setv4hi): Likewise.
>> > > > (vec_extractv4hihi): Likewise.
>> > > > (vec_setv8qi): Likewise.
>> > > > (vec_extractv8qiqi): Likewise.
>> > > > (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
>> > > > (*vec_extractv2sf_1): Likewise.
>> > > > (*vec_extractv2si_0): Likewise.
>> > > > (*vec_extractv2si_1): Likewise.
>> > > > (*vec_extractv2sf_0_sse): New.
>> > > > (*vec_extractv2sf_1_sse): Likewise.
>> > > > (*vec_extractv2si_0_sse): Likewise.
>> > > > (*vec_extractv2si_1_sse): Likewise.
>> > >
>> > > Please do not introduce new *_sse patterns, use mmx_isa attribute to
>> > > disable unwanted alternatives.
>> >
>> > Will do.
>> >
>> > > >  (define_insn_and_split "*vec_extractv2si_zext_mem"
>> > > > -  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
>> > > > +  [(set (match_operand:DI 0 "register_operand" "=x,r")
>> > > > (zero_extend:DI
>> > > >   (vec_select:SI
>> > > > -   (match_operand:V2SI 1 "memory_operand" "o,o,o")
>> > > > +   (match_operand:V2SI 1 "memory_operand" "o,o")
>> > > > (parallel [(match_operand:SI 2
>> > > > "const_0_to_1_operand")]]
>> > > > -  "TARGET_64BIT && TARGET_MMX"
>> > > > +  "TARGET_64BIT"
>> > >
>> > > Here you need TARGET_64BIT && (TARGET_MMX || TARGET_MMX_WITH_SSE) and
>> > > mmx_isa attribute.
>> > >
>> >
>> > Why is && (TARGET_MMX || TARGET_MMX_WITH_SSE) needed?  The 3rd
>> > alternative doesn't need MMX nor SSE2:
>>
>> Ah, I didn't notice that. LGTM then.
>>
>> > (define_insn_and_split "*vec_extractv2si_zext_mem"
>> >   [(set (match_operand:DI 0 "register_operand" "=y,x,r")
>> > (zero_extend:DI
>> >   (vec_select:SI
>> > (match_operand:V2SI 1 "memory_operand" "o,o,o")
>> > (parallel [(match_operand:SI 2
>> > "const_0_to_1_operand")]]
>> >   "TARGET_64BIT"
>> >   "#"
>> >   "&& reload_completed"
>> >   [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
>> > {
>> >   operands[1] = adjust_address (operands[1], SImode, INTVAL
>> > (operands[2]) * 4);
>> > }
>> >   [(set_attr "mmx_isa" "native,sse2,base")])
>>
>> Please write this as "native,*,*".
>
> Did you mean "native,sse2,*"?  The second alternative is SSE2 MOVD:

No, my proposed definition is OK, see below.

> MOVD (when destination operand is XMM register)
> DEST[31:0] ← SRC;
> DEST[127:32] ← H;
> DEST[MAXVL-1:128] (Unmodified)

You should also add "isa" attribute with "*,sse2,*", which should be
there from the beginning.

BTW: sse2 is not a member of the mmx_isa attribute.

Uros.

>> This way, it is clear that we enable alternative 0 only for native
>> mmx. It looks to me that we need to add similar treatment to a couple
>> of other patterns in sse.md, where we allow "y" constraint, e.g.
>> *vec_concatv2sf_sse, *vec_concatv2si_sse4_1, etc.
>>
>
> I will take a look.
>
> Thanks.
>
> --
> H.J.
>


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread H.J. Lu
On Tue, Feb 12, 2019 at 11:44 AM Uros Bizjak  wrote:
>
> On Tue, Feb 12, 2019 at 8:35 PM H.J. Lu  wrote:
> >
> > On Tue, Feb 12, 2019 at 5:43 AM Uros Bizjak  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
> > > >
> > > > PR target/89021
> > > > * config/i386/i386.c (ix86_expand_vector_init_duplicate): Set
> > > > mmx_ok to true if TARGET_MMX_WITH_SSE is true.
> > > > (ix86_expand_vector_init_one_nonzero): Likewise.
> > > > (ix86_expand_vector_init_one_var): Likewise.
> > > > (ix86_expand_vector_init_general): Likewise.
> > > > (ix86_expand_vector_init): Likewise.
> > > > (ix86_expand_vector_set): Likewise.
> > > > (ix86_expand_vector_extract): Likewise.
> > > > * config/i386/mmx.md (*vec_dupv2sf): Changed to
> > > > define_insn_and_split to support SSE emulation.
> > > > (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
> > > > (vec_extractv2sf_1 splitter): Likewise.
> > > > (vec_extractv2sfsf): Likewise.
> > > > (vec_setv2si): Likewise.
> > > > (vec_extractv2si_1 splitter): Likewise.
> > > > (vec_extractv2sisi): Likewise.
> > > > (vec_setv4hi): Likewise.
> > > > (vec_extractv4hihi): Likewise.
> > > > (vec_setv8qi): Likewise.
> > > > (vec_extractv8qiqi): Likewise.
> > > > (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
> > > > (*vec_extractv2sf_1): Likewise.
> > > > (*vec_extractv2si_0): Likewise.
> > > > (*vec_extractv2si_1): Likewise.
> > > > (*vec_extractv2sf_0_sse): New.
> > > > (*vec_extractv2sf_1_sse): Likewise.
> > > > (*vec_extractv2si_0_sse): Likewise.
> > > > (*vec_extractv2si_1_sse): Likewise.
> > >
> > > Please do not introduce new *_sse patterns, use mmx_isa attribute to
> > > disable unwanted alternatives.
> >
> > Will do.
> >
> > > >  (define_insn_and_split "*vec_extractv2si_zext_mem"
> > > > -  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> > > > +  [(set (match_operand:DI 0 "register_operand" "=x,r")
> > > > (zero_extend:DI
> > > >   (vec_select:SI
> > > > -   (match_operand:V2SI 1 "memory_operand" "o,o,o")
> > > > +   (match_operand:V2SI 1 "memory_operand" "o,o")
> > > > (parallel [(match_operand:SI 2 "const_0_to_1_operand")]]
> > > > -  "TARGET_64BIT && TARGET_MMX"
> > > > +  "TARGET_64BIT"
> > >
> > > Here you need TARGET_64BIT && (TARGET_MMX || TARGET_MMX_WITH_SSE) and
> > > mmx_isa attribute.
> > >
> >
> > Why is && (TARGET_MMX || TARGET_MMX_WITH_SSE) needed?  The 3rd
> > alternative doesn't need MMX nor SSE2:
>
> Ah, I didn't notice that. LGTM then.
>
> > (define_insn_and_split "*vec_extractv2si_zext_mem"
> >   [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> > (zero_extend:DI
> >   (vec_select:SI
> > (match_operand:V2SI 1 "memory_operand" "o,o,o")
> > (parallel [(match_operand:SI 2 "const_0_to_1_operand")]]
> >   "TARGET_64BIT"
> >   "#"
> >   "&& reload_completed"
> >   [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
> > {
> >   operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 
> > 4);
> > }
> >   [(set_attr "mmx_isa" "native,sse2,base")])
>
> Please write this as "native,*,*".

Did you mean "native,sse2,*"?  The second alternative is SSE2 MOVD:

MOVD (when destination operand is XMM register)
DEST[31:0] ← SRC;
DEST[127:32] ← 000000000000000000000000H;
DEST[MAXVL-1:128] (Unmodified)

> This way, it is clear that we enable alternative 0 only for native
> mmx. It looks to me that we need to add similar treatment to a couple
> of other patterns in sse.md, where we allow "y" constraint, e.g.
> *vec_concatv2sf_sse, *vec_concatv2si_sse4_1, etc.
>

I will take a look.

Thanks.

-- 
H.J.
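
For readers following the MOVD pseudo-code quoted in this message, here is a minimal C sketch of that behaviour: the low 32 bits of the XMM destination receive the source and bits 127:32 are cleared. The type xmm_image and the function model_movd_to_xmm are hypothetical names made up for illustration; they are not part of GCC or of any intrinsics header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical image of a 128-bit register; bytes[0] is the least
   significant byte.  */
typedef struct { uint8_t bytes[16]; } xmm_image;

/* Model of MOVD with an XMM destination: DEST[31:0] gets SRC and
   DEST[127:32] is cleared.  Bits above 127 are not modelled.  */
static xmm_image
model_movd_to_xmm (uint32_t src)
{
  xmm_image dest;
  memset (dest.bytes, 0, sizeof dest.bytes);
  for (int i = 0; i < 4; i++)
    dest.bytes[i] = (uint8_t) (src >> (8 * i));
  return dest;
}
```

Calling model_movd_to_xmm (0x11223344u) leaves 0x44 in byte 0, 0x11 in byte 3 and zeros in bytes 4 to 15.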


Re: [PATCH doc] Remove documentation for PowerPC -maltivec={be,le}

2019-02-12 Thread Segher Boessenkool
Hi Pat,

On Tue, Feb 12, 2019 at 12:15:56PM -0600, Pat Haugen wrote:
> The options were removed in May 2018 (r260109), but documentation was not 
> updated.

> 2019-02-12  Pat Haugen  
> 
> * doc/invoke.texi (RS/6000 and PowerPC Options): Remove duplicate
> -maltivec. Delete -maltivec=be and -maltivec=le documentation.

Yes please.  Thanks!


Segher


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread Uros Bizjak
On Tue, Feb 12, 2019 at 8:35 PM H.J. Lu  wrote:
>
> On Tue, Feb 12, 2019 at 5:43 AM Uros Bizjak  wrote:
> >
> > On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
> > >
> > > PR target/89021
> > > * config/i386/i386.c (ix86_expand_vector_init_duplicate): Set
> > > mmx_ok to true if TARGET_MMX_WITH_SSE is true.
> > > (ix86_expand_vector_init_one_nonzero): Likewise.
> > > (ix86_expand_vector_init_one_var): Likewise.
> > > (ix86_expand_vector_init_general): Likewise.
> > > (ix86_expand_vector_init): Likewise.
> > > (ix86_expand_vector_set): Likewise.
> > > (ix86_expand_vector_extract): Likewise.
> > > * config/i386/mmx.md (*vec_dupv2sf): Changed to
> > > define_insn_and_split to support SSE emulation.
> > > (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
> > > (vec_extractv2sf_1 splitter): Likewise.
> > > (vec_extractv2sfsf): Likewise.
> > > (vec_setv2si): Likewise.
> > > (vec_extractv2si_1 splitter): Likewise.
> > > (vec_extractv2sisi): Likewise.
> > > (vec_setv4hi): Likewise.
> > > (vec_extractv4hihi): Likewise.
> > > (vec_setv8qi): Likewise.
> > > (vec_extractv8qiqi): Likewise.
> > > (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
> > > (*vec_extractv2sf_1): Likewise.
> > > (*vec_extractv2si_0): Likewise.
> > > (*vec_extractv2si_1): Likewise.
> > > (*vec_extractv2sf_0_sse): New.
> > > (*vec_extractv2sf_1_sse): Likewise.
> > > (*vec_extractv2si_0_sse): Likewise.
> > > (*vec_extractv2si_1_sse): Likewise.
> >
> > Please do not introduce new *_sse patterns, use mmx_isa attribute to
> > disable unwanted alternatives.
>
> Will do.
>
> > >  (define_insn_and_split "*vec_extractv2si_zext_mem"
> > > -  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> > > +  [(set (match_operand:DI 0 "register_operand" "=x,r")
> > > (zero_extend:DI
> > >   (vec_select:SI
> > > -   (match_operand:V2SI 1 "memory_operand" "o,o,o")
> > > +   (match_operand:V2SI 1 "memory_operand" "o,o")
> > > (parallel [(match_operand:SI 2 "const_0_to_1_operand")]]
> > > -  "TARGET_64BIT && TARGET_MMX"
> > > +  "TARGET_64BIT"
> >
> > Here you need TARGET_64BIT && (TARGET_MMX || TARGET_MMX_WITH_SSE) and
> > mmx_isa attribute.
> >
>
> Why is && (TARGET_MMX || TARGET_MMX_WITH_SSE) needed?  The 3rd
> alternative doesn't need MMX nor SSE2:

Ah, I didn't notice that. LGTM then.

> (define_insn_and_split "*vec_extractv2si_zext_mem"
>   [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> (zero_extend:DI
>   (vec_select:SI
> (match_operand:V2SI 1 "memory_operand" "o,o,o")
> (parallel [(match_operand:SI 2 "const_0_to_1_operand")]]
>   "TARGET_64BIT"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
> {
>   operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 
> 4);
> }
>   [(set_attr "mmx_isa" "native,sse2,base")])

Please write this as "native,*,*".

This way, it is clear that we enable alternative 0 only for native
mmx. It looks to me that we need to add similar treatment to a couple
of other patterns in sse.md, where we allow "y" constraint, e.g.
*vec_concatv2sf_sse, *vec_concatv2si_sse4_1, etc.

Uros.
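
A rough sketch of the "native,*,*" idea, assuming a simplified model in which each alternative carries one tag and a "native" tag requires real MMX registers. The functions alternative_enabled and count_enabled are hypothetical; GCC itself derives this from attributes in the machine description, not from C code like this.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model: a "native" alternative is usable only with real
   MMX registers, a "*" alternative is unrestricted.  Unknown tags are
   treated as disabled.  */
static bool
alternative_enabled (const char *tag, bool have_native_mmx)
{
  if (strcmp (tag, "native") == 0)
    return have_native_mmx;
  return strcmp (tag, "*") == 0;
}

/* Count how many of the N per-alternative TAGS are enabled.  */
static int
count_enabled (const char *const *tags, int n, bool have_native_mmx)
{
  int enabled = 0;
  for (int i = 0; i < n; i++)
    if (alternative_enabled (tags[i], have_native_mmx))
      enabled++;
  return enabled;
}
```

With the tags "native", "*", "*", only alternative 0 drops out when native MMX is unavailable, which is the property the proposed spelling is meant to make obvious.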


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread H.J. Lu
On Tue, Feb 12, 2019 at 5:43 AM Uros Bizjak  wrote:
>
> On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
> >
> > PR target/89021
> > * config/i386/i386.c (ix86_expand_vector_init_duplicate): Set
> > mmx_ok to true if TARGET_MMX_WITH_SSE is true.
> > (ix86_expand_vector_init_one_nonzero): Likewise.
> > (ix86_expand_vector_init_one_var): Likewise.
> > (ix86_expand_vector_init_general): Likewise.
> > (ix86_expand_vector_init): Likewise.
> > (ix86_expand_vector_set): Likewise.
> > (ix86_expand_vector_extract): Likewise.
> > * config/i386/mmx.md (*vec_dupv2sf): Changed to
> > define_insn_and_split to support SSE emulation.
> > (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
> > (vec_extractv2sf_1 splitter): Likewise.
> > (vec_extractv2sfsf): Likewise.
> > (vec_setv2si): Likewise.
> > (vec_extractv2si_1 splitter): Likewise.
> > (vec_extractv2sisi): Likewise.
> > (vec_setv4hi): Likewise.
> > (vec_extractv4hihi): Likewise.
> > (vec_setv8qi): Likewise.
> > (vec_extractv8qiqi): Likewise.
> > (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
> > (*vec_extractv2sf_1): Likewise.
> > (*vec_extractv2si_0): Likewise.
> > (*vec_extractv2si_1): Likewise.
> > (*vec_extractv2sf_0_sse): New.
> > (*vec_extractv2sf_1_sse): Likewise.
> > (*vec_extractv2si_0_sse): Likewise.
> > (*vec_extractv2si_1_sse): Likewise.
>
> Please do not introduce new *_sse patterns, use mmx_isa attribute to
> disable unwanted alternatives.

Will do.

> >  (define_insn_and_split "*vec_extractv2si_zext_mem"
> > -  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
> > +  [(set (match_operand:DI 0 "register_operand" "=x,r")
> > (zero_extend:DI
> >   (vec_select:SI
> > -   (match_operand:V2SI 1 "memory_operand" "o,o,o")
> > +   (match_operand:V2SI 1 "memory_operand" "o,o")
> > (parallel [(match_operand:SI 2 "const_0_to_1_operand")]]
> > -  "TARGET_64BIT && TARGET_MMX"
> > +  "TARGET_64BIT"
>
> Here you need TARGET_64BIT && (TARGET_MMX || TARGET_MMX_WITH_SSE) and
> mmx_isa attribute.
>

Why is && (TARGET_MMX || TARGET_MMX_WITH_SSE) needed?  The 3rd
alternative doesn't need MMX nor SSE2:

(define_insn_and_split "*vec_extractv2si_zext_mem"
  [(set (match_operand:DI 0 "register_operand" "=y,x,r")
(zero_extend:DI
  (vec_select:SI
(match_operand:V2SI 1 "memory_operand" "o,o,o")
(parallel [(match_operand:SI 2 "const_0_to_1_operand")]))))]
  "TARGET_64BIT"
  "#"
  "&& reload_completed"
  [(set (match_dup 0) (zero_extend:DI (match_dup 1)))]
{
  operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 4);
}
  [(set_attr "mmx_isa" "native,sse2,base")])

-- 
H.J.
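
As an illustration of what the split in the pattern above computes, here is a small C sketch: the selected element lives at byte offset index * 4 in memory, and the 32-bit value is zero-extended to 64 bits. The function extract_v2si_zext is a hypothetical name used only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Load the 32-bit element IDX (0 or 1) of a two-element vector stored
   at MEM and zero-extend it to 64 bits.  The byte offset IDX * 4
   mirrors the adjust_address call in the pattern.  */
static uint64_t
extract_v2si_zext (const void *mem, int idx)
{
  uint32_t elt;
  memcpy (&elt, (const unsigned char *) mem + (size_t) idx * 4, sizeof elt);
  return (uint64_t) elt;
}
```

No vector register is involved in this model, which matches the observation that the third alternative needs neither MMX nor SSE2.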


[PATCH, OpenACC, og8] OpenACC kernels control flow analysis bug fix

2019-02-12 Thread Gergö Barany

Hi all,

The attached patch fixes a bug in recent work on OpenACC "kernels" 
regions. Jumps within nested binds or try statements were not analyzed 
correctly and could lead to ICEs.


Tested on x86_64 with offloading to NVPTX.

Thanks,
Gergö
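
To illustrate the recursive indexing described above, here is a minimal C sketch over a made-up statement type. struct stmt and visit_seq are hypothetical and much simpler than the GIMPLE data structures; in particular, whether the real code assigns an index to the enclosing bind or try statement itself is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical statement node: a sequence is a singly linked list, and
   a statement may carry a nested sequence (like a bind or try body).  */
struct stmt
{
  struct stmt *next;   /* next statement in the same sequence */
  struct stmt *body;   /* nested sequence, or NULL */
  size_t index;        /* assigned by visit_seq */
};

/* Number the statements of SEQ starting at IDX, descending into nested
   bodies, and return the next free index after the last statement.  */
static size_t
visit_seq (struct stmt *seq, size_t idx)
{
  for (struct stmt *s = seq; s != NULL; s = s->next)
    {
      s->index = idx++;
      if (s->body != NULL)
        idx = visit_seq (s->body, idx);
    }
  return idx;
}
```

Returning the next index lets the caller continue numbering after a nested sequence without the indices of inner and outer statements colliding.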


Correctly handle nested bind and try statements in the OpenACC kernels
conversion control-flow region analysis.

gcc/
* omp-oacc-kernels.c (control_flow_regions::compute_regions): Factored
out...
(control_flow_regions::visit_gimple_seq): ... this new method, now also
handling bind and try statements.

gcc/testsuite/
* gcc/testsuite/c-c++-common/goacc/kernels-decompose-1.c: Add tests.
From 6db1f381f344b4482dcca6b82fc6316d172840be Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gerg=C3=B6=20Barany?= 
Date: Mon, 11 Feb 2019 08:23:31 -0800
Subject: [PATCH] OpenACC kernels control flow analysis bug fix

Correctly handle nested bind and try statements in the OpenACC kernels
conversion control-flow region analysis.

gcc/
* omp-oacc-kernels.c (control_flow_regions::compute_regions): Factored
out...
(control_flow_regions::visit_gimple_seq): ... this new method, now also
handling bind and try statements.

gcc/testsuite/
* gcc/testsuite/c-c++-common/goacc/kernels-decompose-1.c: Add tests.
---
 gcc/omp-oacc-kernels.c | 94 +++---
 .../c-c++-common/goacc/kernels-decompose-1.c   | 44 ++
 2 files changed, 107 insertions(+), 31 deletions(-)

diff --git a/gcc/omp-oacc-kernels.c b/gcc/omp-oacc-kernels.c
index d1db492..1fa2647 100644
--- a/gcc/omp-oacc-kernels.c
+++ b/gcc/omp-oacc-kernels.c
@@ -935,12 +935,24 @@ class control_flow_regions
control-flow regions in the statement sequence SEQ.  */
 void compute_regions (gimple_seq seq);
 
+/* Helper for compute_regions, scanning a single statement sequence SEQ
+   starting at index IDX and returning the next index after the last
+   statement in the sequence.  */
+size_t visit_gimple_seq (gimple_seq seq, size_t idx);
+
 /* The mapping from statement indices to region representatives.  */
 vec  representatives;
 
 /* A cache mapping statement indices to a flag indicating whether the
statement is a top level OpenACC for loop.  */
 vec  omp_for_loops;
+
+/* A mapping of control flow statements (goto, switch, cond) to their
+   representatives.  */
+hash_map  control_flow_reps;
+
+/* A mapping of labels to their representatives.  */
+hash_map  label_reps;
 };
 
 control_flow_regions::control_flow_regions (gimple_seq seq)
@@ -1008,41 +1020,12 @@ control_flow_regions::union_reps (size_t a, size_t b)
 void
 control_flow_regions::compute_regions (gimple_seq seq)
 {
-  hash_map  control_flow_reps;
-  hash_map  label_reps;
-  size_t current_region = 0, idx = 0;
+  size_t idx = 0;
 
   /* In a first pass, assign an initial region to each statement.  Except in
  the case of OpenACC loops, each statement simply gets the same region
  representative as its predecessor.  */
-  for (gimple_stmt_iterator gsi = gsi_start (seq);
-   !gsi_end_p (gsi);
-   gsi_next (&gsi))
-{
-  gimple *stmt = gsi_stmt (gsi);
-  gimple *omp_for = top_level_omp_for_in_stmt (stmt);
-  omp_for_loops.safe_push (omp_for != NULL);
-  if (omp_for != NULL)
-{
-  /* Assign a new region to this loop and to its successor.  */
-  current_region = idx;
-  representatives.safe_push (current_region);
-  current_region++;
-}
-  else
-{
-  representatives.safe_push (current_region);
-  /* Remember any jumps and labels for the second pass below.  */
-  if (gimple_code (stmt) == GIMPLE_COND
-  || gimple_code (stmt) == GIMPLE_SWITCH
-  || gimple_code (stmt) == GIMPLE_GOTO)
-control_flow_reps.put (stmt, current_region);
-  else if (gimple_code (stmt) == GIMPLE_LABEL)
-label_reps.put (gimple_label_label (as_a <glabel *> (stmt)),
-current_region);
-}
-  idx++;
-}
+  visit_gimple_seq (seq, idx);
   gcc_assert (representatives.length () == omp_for_loops.length ());
 
   /* Revisit all the control flow statements and union the region of each
@@ -1087,6 +1070,55 @@ control_flow_regions::compute_regions (gimple_seq seq)
 }
 }
 
+size_t
+control_flow_regions::visit_gimple_seq (gimple_seq seq, size_t idx)
+{
+  size_t current_region = idx;
+
+  for (gimple_stmt_iterator gsi = gsi_start (seq);
+   !gsi_end_p (gsi);
+   gsi_next (&gsi))
+{
+  gimple *stmt = gsi_stmt (gsi);
+  gimple *omp_for = top_level_omp_for_in_stmt (stmt);
+  omp_for_loops.safe_push (omp_for != NULL);
+  if (omp_for != NULL)
+{
+  /* Assign a new region to this loop and to its successor.  */
+  current_region = idx;
+  representatives.safe_push (current_region);
+  current_region++;
+

Re: [PATCH] i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL

2019-02-12 Thread H.J. Lu
On Tue, Feb 12, 2019 at 10:02 AM Uros Bizjak  wrote:
>
> On Fri, Feb 8, 2019 at 12:29 PM H.J. Lu  wrote:
> >
> > On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak  wrote:
> > >
> > > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu  wrote:
> > > >
> > > > OImode and TImode moves must be done in XImode to access upper 16
> > > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > > upper 16 vector registers in OImode and TImode.
> > > >
> > > > PR target/89229
> > > > * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > > > upper 16 vector registers without TARGET_AVX512VL.
> > > > (*movti_internal): Likewise.
> > >
> > > Please use (not (match_test "...")) instead of (match_test "!...") and
> > > put the new test as the first argument of the AND rtx.
> > >
> > > LGTM with the above change.
> >
> > This is the patch I am checking in.
>
> HJ,
>
> please revert two PR89229 patches as they introduce a regression.
>

This is what I checked in.

-- 
H.J.
From 8b572f6aae417645bb8caabc05d761474155d406 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 8 Feb 2019 16:20:49 -0800
Subject: [PATCH] i386: Revert revision 268678 and revision 268657

i386 backend has

INT_MODE (OI, 32);
INT_MODE (XI, 64);

So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation;
in the case of const_1, all 512 bits are set.

We can load zeros with a narrower instruction (e.g. 256 bits via the inherent
zeroing of the high part in the case of a 128-bit xor), so TImode in this case.

Some targets prefer V4SF mode, so they will emit float xorps for zeroing.

Then the introduction of AVX512F fubared everything by overloading the
meaning of insn mode.

How should we use the insn mode, MODE_XI, in standard_sse_constant_opcode
and in patterns which use standard_sse_constant_opcode?  Two options:

1. MODE_XI should only be used to check if EXT_REX_SSE_REG_P is true
in any register operand.  The operand size must be determined by the operand
itself, not by MODE_XI.  The operand encoding size should be determined
by the operand size, EXT_REX_SSE_REG_P and AVX512VL.
2. MODE_XI should be used to determine the operand encoding size.
EXT_REX_SSE_REG_P and AVX512VL should be checked for encoding
instructions.
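
A minimal sketch of the rule behind option 1, assuming only what was stated earlier in this thread: without AVX512VL the upper 16 vector registers can only be reached with a 512-bit encoding, and with AVX512VL the operand's own width can be used. The function encoding_bits is hypothetical and is not how i386.md or standard_sse_constant_opcode is written.

```c
#include <stdbool.h>

/* Return the vector width in bits used to encode a move whose operand
   is OPERAND_BITS wide (128 for TImode, 256 for OImode).  USES_EXT_REG
   says whether any register operand is one of the upper 16 vector
   registers.  */
static int
encoding_bits (int operand_bits, bool uses_ext_reg, bool have_avx512vl)
{
  if (uses_ext_reg && !have_avx512vl)
    return 512;
  return operand_bits;
}
```

Under this model the operand size alone decides the encoding unless an upper register forces the full 512-bit form.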

gcc/

	PR target/89229
	* config/i386/i386.md (*movoi_internal_avx): Revert revision
	268678 and revision 268657.
	(*movti_internal): Likewise.

gcc/testsuite/

	PR target/89229
	* gcc.target/i386/pr89229-1.c: New test.
---
 gcc/config/i386/i386.md   | 14 +++
 gcc/testsuite/gcc.target/i386/pr89229-1.c | 47 +++
 2 files changed, 53 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-1.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3d9141ae450..9948f77fca5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1933,13 +1933,12 @@
(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
(set_attr "prefix" "vex")
(set (attr "mode")
-	(cond [(and (not (match_test "TARGET_AVX512VL"))
-		(ior (match_operand 0 "ext_sse_reg_operand")
-			 (match_operand 1 "ext_sse_reg_operand")))
+	(cond [(ior (match_operand 0 "ext_sse_reg_operand")
+		(match_operand 1 "ext_sse_reg_operand"))
 		 (const_string "XI")
 	   (and (eq_attr "alternative" "1")
 		(match_test "TARGET_AVX512VL"))
-		 (const_string "OI")
+		 (const_string "XI")
 	   (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 		(and (eq_attr "alternative" "3")
 			 (match_test "TARGET_SSE_TYPELESS_STORES")))
@@ -2013,13 +2012,12 @@
(set (attr "mode")
 	(cond [(eq_attr "alternative" "0,1")
 		 (const_string "DI")
-	   (and (not (match_test "TARGET_AVX512VL"))
-		(ior (match_operand 0 "ext_sse_reg_operand")
-			 (match_operand 1 "ext_sse_reg_operand")))
+	   (ior (match_operand 0 "ext_sse_reg_operand")
+		(match_operand 1 "ext_sse_reg_operand"))
 		 (const_string "XI")
 	   (and (eq_attr "alternative" "3")
 		(match_test "TARGET_AVX512VL"))
-		 (const_string "TI")
+		 (const_string "XI")
 	   (ior (not (match_test "TARGET_SSE2"))
 		(ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 			 (and (eq_attr "alternative" "5")
diff --git a/gcc/testsuite/gcc.target/i386/pr89229-1.c b/gcc/testsuite/gcc.target/i386/pr89229-1.c
new file mode 100644
index 000..cce95350bf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr89229-1.c
@@ -0,0 +1,47 @@
+/* { dg-do assemble { target { avx512bw && avx512vl } } } */
+/* { dg-options "-O1 -mavx512bw -mavx512vl -mtune=skylake-avx512" } */
+
+extern void abort (void);
+extern void exit (int);
+struct s { unsigned char a[256]; };
+union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; };
+static union u v;
+static union u v0;
+static struct s *p = 
+static struct s *q = 
+
+static inline struct s rp (void) { return *p; }
+static inline struct s rq (void) { return *q; }
+static void pq (void) { *p = rq(); }
+static void qp (void) { *q = 

[PATCH, libphobos] Committed add hppa version in std.experimental.allocator

2019-02-12 Thread Iain Buclaw
Hi,

This is a backport from phobos 2.084, the hppa changes that were
applied missed adding this one change in
allocator/building_blocks/region.d.

Bootstrapped and regression tested on x86_64-linux-gnu.  As this is not
the target being fixed, that only validates that the scoping is
correct.

Committed to trunk as r268810.

-- 
Iain
---
diff --git a/libphobos/src/MERGE b/libphobos/src/MERGE
index aef240e0722..61c42525d44 100644
--- a/libphobos/src/MERGE
+++ b/libphobos/src/MERGE
@@ -1,4 +1,4 @@
-6c9fb28b0f8813d41798202a9d19c6b37ba5da5f
+791c5d2407e500bb4e777d6a90fc96cf250ba2f6
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/phobos repository.
diff --git a/libphobos/src/std/experimental/allocator/building_blocks/region.d b/libphobos/src/std/experimental/allocator/building_blocks/region.d
index 80157aee7e6..dfcecce72bd 100644
--- a/libphobos/src/std/experimental/allocator/building_blocks/region.d
+++ b/libphobos/src/std/experimental/allocator/building_blocks/region.d
@@ -387,6 +387,7 @@ struct InSituRegion(size_t size, size_t minAlign = platformAlignment)
 else version (X86_64) enum growDownwards = Yes.growDownwards;
 else version (ARM) enum growDownwards = Yes.growDownwards;
 else version (AArch64) enum growDownwards = Yes.growDownwards;
+else version (HPPA) enum growDownwards = No.growDownwards;
 else version (PPC) enum growDownwards = Yes.growDownwards;
 else version (PPC64) enum growDownwards = Yes.growDownwards;
 else version (MIPS32) enum growDownwards = Yes.growDownwards;


[PATCH doc] Remove documentation for PowerPC -maltivec={be,le}

2019-02-12 Thread Pat Haugen
The options were removed in May 2018 (r260109), but documentation was not 
updated.

Bootstrapped on powerpc64le. OK for trunk?

-Pat


2019-02-12  Pat Haugen  

* doc/invoke.texi (RS/6000 and PowerPC Options): Remove duplicate
-maltivec. Delete -maltivec=be and -maltivec=le documentation.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 268784)
+++ gcc/doc/invoke.texi (working copy)
@@ -1085,7 +1085,7 @@ See RS/6000 and PowerPC Options.
 -mstrict-align  -mno-strict-align  -mrelocatable @gol
 -mno-relocatable  -mrelocatable-lib  -mno-relocatable-lib @gol
 -mtoc  -mno-toc  -mlittle  -mlittle-endian  -mbig  -mbig-endian @gol
--mdynamic-no-pic  -maltivec  -mswdiv  -msingle-pic-base @gol
+-mdynamic-no-pic  -mswdiv  -msingle-pic-base @gol
 -mprioritize-restricted-insns=@var{priority} @gol
 -msched-costly-dep=@var{dependence_type} @gol
 -minsert-sched-nops=@var{scheme} @gol
@@ -24088,40 +24088,14 @@ the AltiVec instruction set.  You may al
 @option{-mabi=altivec} to adjust the current ABI with AltiVec ABI
 enhancements.

-When @option{-maltivec} is used, rather than @option{-maltivec=le} or
-@option{-maltivec=be}, the element order for AltiVec intrinsics such
-as @code{vec_splat}, @code{vec_extract}, and @code{vec_insert}
+When @option{-maltivec} is used, the element order for AltiVec intrinsics
+such as @code{vec_splat}, @code{vec_extract}, and @code{vec_insert}
 match array element order corresponding to the endianness of the
 target.  That is, element zero identifies the leftmost element in a
 vector register when targeting a big-endian platform, and identifies
 the rightmost element in a vector register when targeting a
 little-endian platform.

-@item -maltivec=be
-@opindex maltivec=be
-Generate AltiVec instructions using big-endian element order,
-regardless of whether the target is big- or little-endian.  This is
-the default when targeting a big-endian platform.  Using this option
-is currently deprecated.  Support for this feature will be removed in
-GCC 9.
-
-The element order is used to interpret element numbers in AltiVec
-intrinsics such as @code{vec_splat}, @code{vec_extract}, and
-@code{vec_insert}.  By default, these match array element order
-corresponding to the endianness for the target.
-
-@item -maltivec=le
-@opindex maltivec=le
-Generate AltiVec instructions using little-endian element order,
-regardless of whether the target is big- or little-endian.  This is
-the default when targeting a little-endian platform.  This option is
-currently ignored when targeting a big-endian platform.
-
-The element order is used to interpret element numbers in AltiVec
-intrinsics such as @code{vec_splat}, @code{vec_extract}, and
-@code{vec_insert}.  By default, these match array element order
-corresponding to the endianness for the target.
-
 @item -mvrsave
 @itemx -mno-vrsave
 @opindex mvrsave



Re: [PATCH] i386: Use OI/TImode in *mov[ot]i_internal_avx with AVX512VL

2019-02-12 Thread Uros Bizjak
On Fri, Feb 8, 2019 at 12:29 PM H.J. Lu  wrote:
>
> On Fri, Feb 8, 2019 at 1:51 AM Uros Bizjak  wrote:
> >
> > On Thu, Feb 7, 2019 at 10:11 PM H.J. Lu  wrote:
> > >
> > > OImode and TImode moves must be done in XImode to access upper 16
> > > vector registers without AVX512VL.  With AVX512VL, we can access
> > > upper 16 vector registers in OImode and TImode.
> > >
> > > PR target/89229
> > > * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> > > upper 16 vector registers without TARGET_AVX512VL.
> > > (*movti_internal): Likewise.
> >
> > Please use (not (match_test "...")) instead of (match_test "!...") and
> > put the new test as the first argument of the AND rtx.
> >
> > LGTM with the above change.
>
> This is the patch I am checking in.

HJ,

please revert two PR89229 patches as they introduce a regression.

Uros.

> Thanks.
>
> H.J.
> ---
> OImode and TImode moves must be done in XImode to access upper 16
> vector registers without AVX512VL.  With AVX512VL, we can access
> upper 16 vector registers in OImode and TImode.
>
> PR target/89229
> * config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
> upper 16 vector registers without TARGET_AVX512VL.
> (*movti_internal): Likewise.
> ---
>  gcc/config/i386/i386.md | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c1492363bca..3d9141ae450 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1933,8 +1933,9 @@
> (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
> (set_attr "prefix" "vex")
> (set (attr "mode")
> - (cond [(ior (match_operand 0 "ext_sse_reg_operand")
> - (match_operand 1 "ext_sse_reg_operand"))
> + (cond [(and (not (match_test "TARGET_AVX512VL"))
> + (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>  (and (eq_attr "alternative" "1")
>   (match_test "TARGET_AVX512VL"))
> @@ -2012,8 +2013,9 @@
> (set (attr "mode")
>   (cond [(eq_attr "alternative" "0,1")
>   (const_string "DI")
> -(ior (match_operand 0 "ext_sse_reg_operand")
> - (match_operand 1 "ext_sse_reg_operand"))
> +(and (not (match_test "TARGET_AVX512VL"))
> + (ior (match_operand 0 "ext_sse_reg_operand")
> + (match_operand 1 "ext_sse_reg_operand")))
>   (const_string "XI")
>  (and (eq_attr "alternative" "3")
>   (match_test "TARGET_AVX512VL"))
> --


PING^1: [PATCH] driver: Also prune joined switches with negation

2019-02-12 Thread H.J. Lu
On Fri, Feb 8, 2019 at 3:09 PM H.J. Lu  wrote:
>
> On Fri, Feb 8, 2019 at 3:02 PM H.J. Lu  wrote:
> >
> > When -march=native is passed through host_detect_local_cpu to the
> > backend, it overrides all command-line options after it.  That means
> >
> > $ gcc -march=native -march=skylake-avx512
> >
> > is treated as
> >
> > $ gcc -march=skylake-avx512 -march=native
> >
> > Prune joined switches with negation to allow -march=skylake-avx512 to
> > override previous -march=native on command-line.
> >
> > PR driver/69471
> > * opts-common.c (prune_options): Also prune joined switches
> > with negation.
> > * config/i386/i386.opt (march=): Add Negative(march=).
> > (mtune=): Add Negative(mtune=).
>
> Here is the updated patch.
>

PING:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00492.html
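
For readers skimming the thread, the "last one wins" behaviour that the quoted patch asks for can be sketched in a few lines of C.  This is only an illustration under stated assumptions: prune_joined is a hypothetical helper, not the prune_options code in opts-common.c, and it ignores how -march=native is expanded by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: drop every argument that starts
   with PREFIX except the last such argument, keeping the order of all
   other arguments.  Returns the new argument count.  */
static size_t
prune_joined (const char **args, size_t n, const char *prefix)
{
  assert (args != NULL || n == 0);

  size_t plen = strlen (prefix);
  size_t last = n;		/* Index of the last match; n means none.  */

  for (size_t i = 0; i < n; i++)
    if (strncmp (args[i], prefix, plen) == 0)
      last = i;

  size_t out = 0;
  for (size_t i = 0; i < n; i++)
    if (i == last || strncmp (args[i], prefix, plen) != 0)
      args[out++] = args[i];

  return out;
}
```

Applied to "-O2 -march=native -g -march=skylake-avx512" with the prefix "-march=", the sketch keeps "-O2 -g -march=skylake-avx512".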


-- 
H.J.


Re: [PATCH 0/7, OpenACC, libgomp, v5, stage1] Async re-work

2019-02-12 Thread Thomas Schwinge
Hi Chung-Lin!

Happy New Year now to you, too!  :-)


On Tue, 22 Jan 2019 22:52:09 +0800, Chung-Lin Tang  
wrote:
> Hi, this is a rebase to current trunk and re-submission of the OpenACC Async
> re-organization work, aiming to commit when stage1 re-opens.

Thanks!

> This is technically
> the 2nd time I'm sending this whole patch series, but because I've named
> partial revisions up to v4 by now, for clarity I will just call this entire 
> set "v5".

As far as I'm concerned, these patches should all (with a few exceptions
to be split out, see below) be merged into one patch, because they
logically all belong together, as one piece: "async re-work".


> Thomas, I hope I resolved all discussed issues in this current patch set.
> Please kindly remind me if I missed anything, as there were so many emails
> to re-check :)

I'm still waiting for you to commit the PR87924 "OpenACC wait clauses
without async-arguments" changes, as a prerequisite to this re-work.


If we agree that we actually need such a thing (I'll have to re-read
Jakub's comments), please submit the 'GOMP_PLUGIN_IF_VERSION' changes
separately, with 'GOMP_PLUGIN_IF_VERSION' equal to 'GOMP_VERSION'
(initially).  As this then is only a kind of documentation update, this
might then go into trunk right now -- and even if not right now, should
still be done separately as a prerequisite patch to this re-work, which
will then just increment 'GOMP_PLUGIN_IF_VERSION'.

Maybe rename 'GOMP_PLUGIN_IF_VERSION' to 'GOMP_PLUGIN_VERSION', for
similarity with 'GOMP_VERSION'?

And, it's then a bit confusing that 'GOMP_PLUGIN_VERSION' is returned
from 'GOMP_OFFLOAD_version' functions (plus 'host_version'); we there got
"plugin" vs. "offload".  But I suppose we'll just live with that?

The 'GOMP_OFFLOAD_version' functions should then also get their source
code comments updated: "libgomp [plugin] version"?


Now, back to the actual async re-work.

I see you've incorporated some of the incremental patches I provided
(thanks!), but not all of them.  I don't know if you just missed (some
of) these, or actually object?


I had requested that the OpenACC 2.5 'default_async' changes be discussed
separately, after this re-work has gone in, so please remove these
changes from this patch series.  I've again attached "into async re-work:
revert default_async changes".


I had provided changes, "into async re-work: don't create an asyncqueue
just to then test/synchronize with it", again attached.  I had asked that
you 'Please especially review the "libgomp/oacc-parallel.c:goacc_wait"
change, and confirm no corresponding "libgomp/oacc-parallel.c:GOACC_wait"
change to be done, because that code is structured differently'.


I had requested that we maintain the current behavior, that
"acc_async_noval" stays in its own, separate asyncqueue, instead of
aliasing it to 'async(0)'.  I had proposed "into async re-work:
libgomp/oacc-async.c:async2id", again attached.

You said you don't like the 'async2id' function I'm adding there (I still
don't understand why), so I assume you'd then implement this
async-argument to queue ID translation in 'lookup_goacc_asyncqueue'
proper?


I had provided "[WIP] into async re-work: documentation", again attached,
as 'A little bit of documentation starter update for you to include.
Please make sure that all relevant functions have such comments added'.


I'm again attaching my changes 'into async re-work: replicate
"[PR88407] [OpenACC] Correctly handle unseen async-arguments"', which --
I suppose -- are necessary to maintain the current GCC trunk behavior
(that is, avoid testsuite regressions).


I'm again attaching my changes 'into async re-work: replicate "[PR88370]
acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync,
acc_async_noval"', which -- I suppose -- are necessary to maintain the
current GCC trunk behavior (that is, avoid testsuite regressions).


I'm again attaching my changes 'into async re-work: adjust for test case
added in "[PR88484] OpenACC wait directive without wait argument but with
async clause"', which -- I suppose -- are necessary to maintain the
current GCC trunk behavior (that is, avoid testsuite regressions).

You suggested that "Instead of fixing it here, will it make more sense to
have the serialize_func hook to accommodate the NULL asyncqueue?", to
which I said "Sure, that may make sense, yes.  Right: if there's no
asyncqueue to serialize with, then serialize/synchronize with the local
(host) thread", but this has not yet been implemented, as far as I can
tell.


I'm again attaching my changes 'into async re-work: don't synchronize
with the local thread unless actually necessary', which is the behavior
that makes most sense to me, and I had asked 'Would you please review the
"TODO" comments, and again also especially review the
"libgomp/oacc-parallel.c:goacc_wait" change, and confirm no corresponding
"libgomp/oacc-parallel.c:GOACC_wait" change to be done, 

Re: [PATCH] rs6000: new vec-s*d-modulo.c tests should require p8vector_hw

2019-02-12 Thread Segher Boessenkool
On Mon, Feb 11, 2019 at 08:56:38PM -0600, Bill Schmidt wrote:
> It turns out that the new tests added today actually require POWER8 hardware
> at a minimum, since the vec_vsrad interface requires it.  (Note that requiring
> P8 hardware obviates the need to specify -mvsx, so that is now removed.)
> 
> Tested on powerpc64le (P9, P8) and powerpc64 (P7) with correct behavior.  Is
> this okay for trunk?

Okay if you add -mpower8-vector for these tests.  Thanks,


Segher


Re: [PATCH][libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc

2019-02-12 Thread Ian Lance Taylor via gcc-patches
On Tue, Feb 12, 2019 at 12:36 AM Tom de Vries  wrote:
>
> The call to bsearch in dwarf_lookup_pc can have NULL as base argument when
> the nmemb argument is 0.  The base argument is required to be pointing to the
> initial member of an array of nmemb objects.  It is not specified what
> constitutes a valid pointer to an array of 0 objects, but glibc declares base
> with attribute non-null, so the NULL will trigger a sanitizer runtime error.
>
> Fix this by only calling bsearch if nmemb != 0.
>
> OK for trunk?
>
> Thanks,
> - Tom
>
> [libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc
>
> 2019-02-12  Tom de Vries  
>
> PR libbacktrace/81983
> * dwarf.c (dwarf_lookup_pc): Don't call bsearch if nmemb == 0.

This is OK.

Thanks.

Ian
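
The guard described in the quoted patch can be illustrated with a small, self-contained C sketch.  The names here (struct addr_range, range_compare, lookup_range) are hypothetical stand-ins chosen for the example; the real data structures in libbacktrace's dwarf.c differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table entry; not the dwarf.c type.  */
struct addr_range
{
  unsigned long low;
  unsigned long high;		/* Exclusive upper bound.  */
};

/* bsearch callback: KEY points to the PC being looked up.  */
static int
range_compare (const void *key, const void *elem)
{
  unsigned long pc = *(const unsigned long *) key;
  const struct addr_range *r = (const struct addr_range *) elem;

  if (pc < r->low)
    return -1;
  if (pc >= r->high)
    return 1;
  return 0;
}

/* Look up PC in RANGES.  An empty table is handled before calling
   bsearch, so a NULL base pointer is never passed to it.  */
static const struct addr_range *
lookup_range (const struct addr_range *ranges, size_t count,
	      unsigned long pc)
{
  assert (ranges != NULL || count == 0);

  if (count == 0)
    return NULL;

  return (const struct addr_range *)
    bsearch (&pc, ranges, count, sizeof ranges[0], range_compare);
}
```

With this shape, a caller whose table is (NULL, 0) gets "not found" without ever reaching the non-null-annotated libc entry point.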


Re: [PATCH 36/40] i386: Allow MMX vector expanders with TARGET_MMX_WITH_SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> PR target/89021
> * config/i386/i386.c (ix86_expand_vector_init_duplicate): Set
> mmx_ok to true if TARGET_MMX_WITH_SSE is true.
> (ix86_expand_vector_init_one_nonzero): Likewise.
> (ix86_expand_vector_init_one_var): Likewise.
> (ix86_expand_vector_init_general): Likewise.
> (ix86_expand_vector_init): Likewise.
> (ix86_expand_vector_set): Likewise.
> (ix86_expand_vector_extract): Likewise.
> * config/i386/mmx.md (*vec_dupv2sf): Changed to
> define_insn_and_split to support SSE emulation.
> (vec_setv2sf): Also allow TARGET_MMX_WITH_SSE.
> (vec_extractv2sf_1 splitter): Likewise.
> (vec_extractv2sfsf): Likewise.
> (vec_setv2si): Likewise.
> (vec_extractv2si_1 splitter): Likewise.
> (vec_extractv2sisi): Likewise.
> (vec_setv4hi): Likewise.
> (vec_extractv4hihi): Likewise.
> (vec_setv8qi): Likewise.
> (vec_extractv8qiqi): Likewise.
> (*vec_extractv2sf_0): Don't allow TARGET_MMX_WITH_SSE.
> (*vec_extractv2sf_1): Likewise.
> (*vec_extractv2si_0): Likewise.
> (*vec_extractv2si_1): Likewise.
> (*vec_extractv2sf_0_sse): New.
> (*vec_extractv2sf_1_sse): Likewise.
> (*vec_extractv2si_0_sse): Likewise.
> (*vec_extractv2si_1_sse): Likewise.

Please do not introduce new *_sse patterns, use mmx_isa attribute to
disable unwanted alternatives.

> ---
>  gcc/config/i386/i386.c |   8 +++
>  gcc/config/i386/mmx.md | 129 +
>  2 files changed, 113 insertions(+), 24 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 7d65192c1cd..4e776b8c3ea 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -42365,6 +42365,7 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, 
> machine_mode mode,
>  {
>bool ok;
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2SImode:
> @@ -42524,6 +42525,7 @@ ix86_expand_vector_init_one_nonzero (bool mmx_ok, 
> machine_mode mode,
>bool use_vector_set = false;
>rtx (*gen_vec_set_0) (rtx, rtx, rtx) = NULL;
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2DImode:
> @@ -42717,6 +42719,7 @@ ix86_expand_vector_init_one_var (bool mmx_ok, 
> machine_mode mode,
>XVECEXP (const_vec, 0, one_var) = CONST0_RTX (GET_MODE_INNER (mode));
>const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (const_vec, 0));
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2DFmode:
> @@ -43102,6 +43105,7 @@ ix86_expand_vector_init_general (bool mmx_ok, 
> machine_mode mode,
>machine_mode quarter_mode = VOIDmode;
>int n, i;
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2SFmode:
> @@ -43301,6 +43305,8 @@ ix86_expand_vector_init (bool mmx_ok, rtx target, rtx 
> vals)
>int i;
>rtx x;
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
> +
>/* Handle first initialization from vector elts.  */
>if (n_elts != XVECLEN (vals, 0))
>  {
> @@ -43400,6 +43406,7 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx 
> val, int elt)
>machine_mode mmode = VOIDmode;
>rtx (*gen_blendm) (rtx, rtx, rtx, rtx);
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2SFmode:
> @@ -43755,6 +43762,7 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
> rtx vec, int elt)
>bool use_vec_extr = false;
>rtx tmp;
>
> +  mmx_ok |= TARGET_MMX_WITH_SSE;
>switch (mode)
>  {
>  case E_V2SImode:
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index c8bd544dc9e..4e8b6e54b4c 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -591,14 +591,23 @@
> (set_attr "prefix_extra" "1")
> (set_attr "mode" "V2SF")])
>
> -(define_insn "*vec_dupv2sf"
> -  [(set (match_operand:V2SF 0 "register_operand" "=y")
> +(define_insn_and_split "*vec_dupv2sf"
> +  [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv")
> (vec_duplicate:V2SF
> - (match_operand:SF 1 "register_operand" "0")))]
> -  "TARGET_MMX"
> -  "punpckldq\t%0, %0"
> -  [(set_attr "type" "mmxcvt")
> -   (set_attr "mode" "DI")])
> + (match_operand:SF 1 "register_operand" "0,0,Yv")))]
> +  "TARGET_MMX || TARGET_MMX_WITH_SSE"
> +  "@
> +   punpckldq\t%0, %0
> +   #
> +   #"
> +  "TARGET_MMX_WITH_SSE && reload_completed"
> +  [(set (match_dup 0)
> +   (vec_duplicate:V4SF (match_dup 1)))]
> +  "operands[0] = lowpart_subreg (V4SFmode, operands[0],
> +GET_MODE (operands[0]));"
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "mmxcvt,ssemov,ssemov")
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "*mmx_concatv2sf"
>[(set (match_operand:V2SF 0 "register_operand" "=y,y")
> @@ -616,7 +625,7 @@
>

Define __sparcv8 in 64-bit-default Solaris/SPARC gcc with -m32

2019-02-12 Thread Rainer Orth
I noticed that a 64-bit-default Solaris/SPARC gcc with -m32 doesn't
predefine __sparcv8, unlike the corresponding 32-bit-default compiler.

Since those defines happen in CPP_CPU_SPEC for any explicit -mcpu option
of v8, supersparc, v9 and beyond, this must be due to -mcpu being set in
the driver of the 32-bit compiler, but not in the 64-bit one.

Indeed, I find

-mcpu=v9

in COLLECT_GCC_OPTIONS of the 32-bit compiler, but only

-m32

for the 64-bit one.

AFAICS this happens due to OPTION_DEFAULT_SPECS: it includes

  {"cpu", "%{!m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \

for 32-bit-default and for 64-bit-default:

  {"cpu", "%{!m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \

Without an explicit --with-cpu configure option, config.gcc sets
with_cpu=v9 for both 32 and 64-bit sparc compilers.  For the sparcv9 gcc
with -m32 however, the passing of -mcpu=v9 is suppressed by the second
spec above.

In line with the handling of --with-tune/-mtune below, -mcpu should be
the same when no explicit -mcpu is given, irrespective of -m32 or -m64.
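
To make the effect of the guards concrete, here is a rough C model of how the "cpu" spec decides whether to add the default -mcpu.  It is a sketch only: adds_default_cpu and its helpers are hypothetical names for this example, and the real spec processing in the driver is far more general.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if ARGS contains an argument equal to NAME.  */
static bool
has_switch (const char *const *args, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (args[i], name) == 0)
      return true;
  return false;
}

/* True if ARGS contains an argument starting with PREFIX.  */
static bool
has_joined (const char *const *args, size_t n, const char *prefix)
{
  size_t len = strlen (prefix);

  for (size_t i = 0; i < n; i++)
    if (strncmp (args[i], prefix, len) == 0)
      return true;
  return false;
}

/* Model of "%{!GUARD:%{!mcpu=*:-mcpu=%(VALUE)}}".  GUARD is the switch
   whose presence suppresses the default ("-m32" or "-m64"), or NULL for
   the unguarded form "%{!mcpu=*:-mcpu=%(VALUE)}".  Returns true when the
   default -mcpu would be added.  */
static bool
adds_default_cpu (const char *const *args, size_t n, const char *guard)
{
  assert (args != NULL || n == 0);

  if (guard != NULL && has_switch (args, n, guard))
    return false;
  return !has_joined (args, n, "-mcpu=");
}
```

In this model, a 64-bit-default compiler invoked with only -m32 gets no default -mcpu under the "-m32" guard, but does get one once the guard is dropped; an explicit -mcpu=... still suppresses the default in both cases.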

The following patch just removes those guards.  Bootstrapped without
regressions on sparcv9-sun-solaris2.11 and sparc-sun-solaris2.11.
Besides, I've compared the output of

gcc -m32/-m64 -g3 -E -x c /dev/null

Before the patch, -D__sparcv8 was missing in the sparcv9 gcc's -m32
case; now the -m32 and -m64 output is identical between sparc and sparcv9
compilers.

Eric, could you have another look if I'm missing something?  Otherwise,
I'm going to commit the patch (it's necessary to enable
SANITIZER_CAN_FAST_UNWIND on Solaris/SPARC and make use of
libsanitizer/sanitizer_common/sanitizer_stacktrace_sparc.cc).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2019-02-11  Rainer Orth  

* config/sparc/sol2.h (OPTION_DEFAULT_SPECS) [DEFAULT_ARCH32_P]:
Remove !m64 guard from cpu entry.
[!DEFAULT_ARCH32_P]: Remove !m32 guard from cpu entry.

# HG changeset patch
# Parent  c72a287bd21743e4d6759e58a6adc3ad5e30d52c
Define __sparcv8 in 64-bit-default Solaris/SPARC gcc with -m32

diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h
--- a/gcc/config/sparc/sol2.h
+++ b/gcc/config/sparc/sol2.h
@@ -273,7 +273,7 @@ extern const char *host_detect_local_cpu
 #define OPTION_DEFAULT_SPECS \
   {"cpu_32", "%{!m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
   {"cpu_64", "%{m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
-  {"cpu", "%{!m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"cpu", "%{!mcpu=*:-mcpu=%(VALUE)}" }, \
   {"tune_32", "%{!m64:%{!mtune=*:-mtune=%(VALUE)}}" }, \
   {"tune_64", "%{m64:%{!mtune=*:-mtune=%(VALUE)}}" }, \
   {"tune", "%{!mtune=*:-mtune=%(VALUE)}" }, \
@@ -282,7 +282,7 @@ extern const char *host_detect_local_cpu
 #define OPTION_DEFAULT_SPECS \
   {"cpu_32", "%{m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
   {"cpu_64", "%{!m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
-  {"cpu", "%{!m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"cpu", "%{!mcpu=*:-mcpu=%(VALUE)}" }, \
   {"tune_32", "%{m32:%{!mtune=*:-mtune=%(VALUE)}}" },	\
   {"tune_64", "%{!m32:%{!mtune=*:-mtune=%(VALUE)}}" },	\
   {"tune", "%{!mtune=*:-mtune=%(VALUE)}" }, \


Re: [PATCH][ARM] Fix PR89222

2019-02-12 Thread Wilco Dijkstra
Hi Alexander,

> It seems odd to me that the spec requires '(S+A) | T' instead of the (imho
> more intuitive) '(S|T) + A', but apart from the missing diagnostic from the
> linkers, it seems they do as they must and GCC was at fault.

Doing (S+A) | T means bit zero always correctly encodes the Thumb state,
otherwise the +A could change Thumb into Arm and vice versa.
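
A tiny worked example of the two formulas, as a sketch: S is the symbol address, A the addend and T the Thumb bit, following the notation used in this thread.  The function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The form required by the spec: the Thumb bit is OR-ed in last.  */
static uint32_t
reloc_spec (uint32_t s, uint32_t a, uint32_t t)
{
  assert (t <= 1);
  return (s + a) | t;
}

/* The alternative form: the addend is applied after the Thumb bit.  */
static uint32_t
reloc_alt (uint32_t s, uint32_t a, uint32_t t)
{
  assert (t <= 1);
  return (s | t) + a;
}
```

For a Thumb symbol at 0x8000 with an addend of 1, the first form gives 0x8001 (bit zero still set), while the second gives 0x8002, which would read as an Arm address.  With an even addend such as 2, both forms give 0x8003.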

> (perhaps it's okay to allow addends with low bit zero though, instead of
> allowing only zero addends as your patch does?)

Maybe, but there aren't many cases where an addend is useful. One really
shouldn't be doing arbitrary arithmetic with function pointers.

Cheers,
Wilco


Re: [rs6000] 64-bit integer loads/stores and FP instructions

2019-02-12 Thread Segher Boessenkool
On Tue, Feb 12, 2019 at 11:55:24AM +0100, Eric Botcazou wrote:
> > No, we should allow both integer and floating point insns for integer stores
> > always.  We just get the cost estimates slightly wrong now, apparently.
> 
> Note that my proof of concept patch doesn't disallow them either...  So what 
> do you suggest?  Just putting back the '*' modifiers in the DI patterns?

Yeah, something like that.  It will need some serious testing, to make
sure we don't regress (including not regressing what that patch that took
them away was meant to do).  I can arrange some testing, will you do the
patch though?

> As a matter of fact they are still present in the SI pattern.

Yeah.  It might not hurt at all to put them back in the DI as well.
Here's hoping.

Thanks,


Segher


[PING] [PATCH] Fix not 8-byte aligned ldrd/strd on ARMv5

2019-02-12 Thread Bernd Edlinger
Hi!

I'd like to ping for this patch:
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00248.html

Thanks
Bernd.


On 2/5/19 4:07 PM, Bernd Edlinger wrote:
> Hi,
> 
> due to the AAPCS parameter passing of 8-byte aligned structures, which happen
> to be 8-byte aligned or only 4-byte aligned in the test case, ldrd instructions
> are generated that may access 4-byte aligned stack slots, which will trap on
> ARMv5 and ARMv6 according to the following document:
> 
> 
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473m/dom1361290002364.html
> says:
> 
> "In ARMv5TE, or in ARMv6 when SCTLR.U is 0, LDRD and STRD doubleword data
> transfers must be eight-byte aligned.  Use ALIGN 8 before memory allocation
> directives such as DCQ if the data is to be accessed using LDRD or STRD.  This
> is not required in ARMv6 when SCTLR.U is 1, or in ARMv7, because in these
> versions, doubleword data transfers can be word-aligned."
> 
> 
> The reason why the ldrd instruction is generated seems to be a missing
> alignment check in the function output_move_double.  But when that is fixed,
> it turns out that if the parameter happens to be 8-byte aligned by chance, it
> still has MEM_ALIGN = 4, which prevents the ldrd completely.
> 
> The reason for that is in function.c (assign_parm_find_stack_rtl), where
> values that happen to be aligned to STACK_BOUNDARY are only aligned to
> PARM_BOUNDARY.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf
> with all languages.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 


Re: [PATCH] S/390: Reject invalid Q/R/S/T addresses after LRA

2019-02-12 Thread Ulrich Weigand
Ilya Leoshkevich wrote:

> gcc/ChangeLog:
> 
> 2019-02-11  Ilya Leoshkevich  
> 
>   PR target/89233
>   * config/s390/s390.c (s390_decompose_address): Update comment.
>   (s390_check_qrst_address): Reject invalid address forms after
>   LRA.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-02-11  Ilya Leoshkevich  
> 
>   PR target/89233
>   * gcc.target/s390/pr89233.c: New test.

This is OK.

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[PATCH] [LIBPHOBOS] Fix unaligned access in murmurhash.d (PR d/89177)

2019-02-12 Thread Bernd Edlinger
Hi,

in the function MurmurHash3::put an unaligned access fault happens on an ARM
system.  The Linux kernel is able to emulate the unaligned LDM instruction,
but it is still not desirable to generate code that depends on that.


Fixed by the attached patch.
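
The same idea, copying each chunk into an aligned local before using it instead of casting the byte pointer, looks roughly like this in C.  This is a sketch of the general technique only; sum_words is a made-up example function and has nothing to do with the actual hash computation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: add up the whole 32-bit words found in DATA
   (host byte order), ignoring any trailing bytes.  DATA may have any
   alignment, because each word is copied into an aligned local with
   memcpy rather than loaded through a cast pointer.  */
static uint32_t
sum_words (const unsigned char *data, size_t len)
{
  assert (data != NULL || len == 0);

  uint32_t sum = 0;
  size_t whole = len - len % sizeof (uint32_t);

  for (size_t start = 0; start < whole; start += sizeof (uint32_t))
    {
      uint32_t block;
      memcpy (&block, data + start, sizeof block);
      sum += block;
    }
  return sum;
}
```

Compilers typically turn the fixed-size memcpy into whatever load sequence is safe for the target, so the copy usually costs little where unaligned loads are allowed.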

Bootstrapped and reg-tested on arm-linux-gnueabihf
Is it OK for trunk?


Thanks
Bernd.

2019-02-08  Bernd Edlinger  

	PR d/89177
	* src/std/digest/murmurhash.d (MurmurHash3::put): Avoid unaligned
	access traps.

Index: libphobos/src/std/digest/murmurhash.d
===
--- libphobos/src/std/digest/murmurhash.d	(revision 268614)
+++ libphobos/src/std/digest/murmurhash.d	(working copy)
@@ -516,9 +516,11 @@ struct MurmurHash3(uint size /* 32 or 128 */ , uin
 // Do main work: process chunks of `Element.sizeof` bytes.
 const numElements = data.length / Element.sizeof;
 const remainderStart = numElements * Element.sizeof;
-foreach (ref const Element block; cast(const(Element[]))(data[0 .. remainderStart]))
+for (auto start = 0; start < remainderStart; start += Element.sizeof)
 {
-putElement(block);
+BufferUnion buffer;
+buffer.data[0 .. $] = data[start .. start + Element.sizeof];
+putElement(buffer.block);
 }
 // +1 for bufferLeeway Element.
 element_count += (numElements + 1) * Element.sizeof;


Re: [PATCH 37/40] i386: Allow MMX intrinsic emulation with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Allow MMX intrinsic emulation with SSE/SSE2/SSSE3.  Don't enable MMX ISA
> by default with TARGET_MMX_WITH_SSE.
>
> For pr82483-1.c and pr82483-2.c, "-mssse3 -mno-mmx" compiles in 64-bit
> mode since MMX intrinsics can be emulated with SSE.
>
> gcc/
>
> PR target/89021
> * config/i386/i386-builtin.def: Enable MMX intrinsics with
> SSE/SSE2/SSSE3.
> * config/i386/i386.c (ix86_option_override_internal): Don't
> enable MMX ISA with TARGET_MMX_WITH_SSE by default.
> (bdesc_tm): Enable MMX intrinsics with SSE/SSE2/SSSE3.
> (ix86_init_mmx_sse_builtins): Likewise.
> (ix86_expand_builtin): Allow SSE/SSE2/SSSE3 to emulate MMX
> intrinsics with TARGET_MMX_WITH_SSE.
> * config/i386/mmintrin.h: Don't require MMX in 64-bit mode.
>
> gcc/testsuite/
>
> PR target/89021
> * gcc.target/i386/pr82483-1.c: Error only on ia32.
> * gcc.target/i386/pr82483-2.c: Likewise.
> ---
>  gcc/config/i386/i386-builtin.def  | 126 +++---
>  gcc/config/i386/i386.c|  62 +++
>  gcc/config/i386/mmintrin.h|  10 +-
>  gcc/testsuite/gcc.target/i386/pr82483-1.c |   2 +-
>  gcc/testsuite/gcc.target/i386/pr82483-2.c |   2 +-
>  5 files changed, 118 insertions(+), 84 deletions(-)
>
> diff --git a/gcc/config/i386/i386-builtin.def 
> b/gcc/config/i386/i386-builtin.def
> index 88005f4687f..10a9d631f29 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -100,7 +100,7 @@ BDESC (0, 0, CODE_FOR_fnstsw, "__builtin_ia32_fnstsw", 
> IX86_BUILTIN_FNSTSW, UNKN
>  BDESC (0, 0, CODE_FOR_fnclex, "__builtin_ia32_fnclex", IX86_BUILTIN_FNCLEX, 
> UNKNOWN, (int) VOID_FTYPE_VOID)
>
>  /* MMX */
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_emms, "__builtin_ia32_emms", 
> IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
> +BDESC (OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE2, 0, CODE_FOR_mmx_emms, 
> "__builtin_ia32_emms", IX86_BUILTIN_EMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
>
>  /* 3DNow! */
>  BDESC (OPTION_MASK_ISA_3DNOW, 0, CODE_FOR_mmx_femms, "__builtin_ia32_femms", 
> IX86_BUILTIN_FEMMS, UNKNOWN, (int) VOID_FTYPE_VOID)
> @@ -442,68 +442,68 @@ BDESC (0, 0, CODE_FOR_rotrqi3, "__builtin_ia32_rorqi", 
> IX86_BUILTIN_RORQI, UNKNO
>  BDESC (0, 0, CODE_FOR_rotrhi3, "__builtin_ia32_rorhi", IX86_BUILTIN_RORHI, 
> UNKNOWN, (int) UINT16_FTYPE_UINT16_INT)
>
>  /* MMX */
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv8qi3, 
> "__builtin_ia32_paddb", IX86_BUILTIN_PADDB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv4hi3, 
> "__builtin_ia32_paddw", IX86_BUILTIN_PADDW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_addv2si3, 
> "__builtin_ia32_paddd", IX86_BUILTIN_PADDD, UNKNOWN, (int) 
> V2SI_FTYPE_V2SI_V2SI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv8qi3, 
> "__builtin_ia32_psubb", IX86_BUILTIN_PSUBB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv4hi3, 
> "__builtin_ia32_psubw", IX86_BUILTIN_PSUBW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_subv2si3, 
> "__builtin_ia32_psubd", IX86_BUILTIN_PSUBD, UNKNOWN, (int) 
> V2SI_FTYPE_V2SI_V2SI)
> -
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv8qi3, 
> "__builtin_ia32_paddsb", IX86_BUILTIN_PADDSB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ssaddv4hi3, 
> "__builtin_ia32_paddsw", IX86_BUILTIN_PADDSW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv8qi3, 
> "__builtin_ia32_psubsb", IX86_BUILTIN_PSUBSB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_sssubv4hi3, 
> "__builtin_ia32_psubsw", IX86_BUILTIN_PSUBSW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv8qi3, 
> "__builtin_ia32_paddusb", IX86_BUILTIN_PADDUSB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_usaddv4hi3, 
> "__builtin_ia32_paddusw", IX86_BUILTIN_PADDUSW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv8qi3, 
> "__builtin_ia32_psubusb", IX86_BUILTIN_PSUBUSB, UNKNOWN, (int) 
> V8QI_FTYPE_V8QI_V8QI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_ussubv4hi3, 
> "__builtin_ia32_psubusw", IX86_BUILTIN_PSUBUSW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_mulv4hi3, 
> "__builtin_ia32_pmullw", IX86_BUILTIN_PMULLW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_smulv4hi3_highpart, 
> "__builtin_ia32_pmulhw", IX86_BUILTIN_PMULHW, UNKNOWN, (int) 
> V4HI_FTYPE_V4HI_V4HI)
> -
> -BDESC (OPTION_MASK_ISA_MMX, 0, CODE_FOR_mmx_andv2si3, "__builtin_ia32_pand", 
> IX86_BUILTIN_PAND, UNKNOWN, (int) 

Re: [patch] Disable store merging in asan_expand_mark_ifn

2019-02-12 Thread Jakub Jelinek
On Tue, Feb 12, 2019 at 12:41:47PM +0100, Eric Botcazou wrote:
> > Ok, stand corrected on that, 128-bit indeed, but even that is nothing not
> > really used.
> 
> The irony is that I'm doing this for 32-bit SPARC (we cannot get ASAN to work 
> in 64-bit mode for the time being) and the maximum alignment on 32-bit SPARC 
> is 64-bit (even long doubles) so this will be totally unused. ;-)
> 
> > For STRICT_ALIGNMENT targets store merging pass obviously can't do anything
> > with those, because unlike asan.c it can't figure out the alignment.
> 
> OK, revised patch attached.  I have manually verified that it yields the 
> expected result for an array of long doubles on 64-bit SPARC.
> 
> 
> 2019-02-12  Eric Botcazou  
> 
>   * asan.c (asan_expand_mark_ifn): Take into account the alignment of
>   the object to pick the size of stores on strict-alignment platforms.

Ok, thanks.

> Index: asan.c
> ===
> --- asan.c(revision 268508)
> +++ asan.c(working copy)
> @@ -3218,7 +3218,10 @@ asan_expand_mark_ifn (gimple_stmt_iterat
>/* Generate direct emission if size_in_bytes is small.  */
>if (size_in_bytes <= ASAN_PARAM_USE_AFTER_SCOPE_DIRECT_EMISSION_THRESHOLD)
>  {
> -  unsigned HOST_WIDE_INT shadow_size = shadow_mem_size (size_in_bytes);
> +  const unsigned HOST_WIDE_INT shadow_size
> + = shadow_mem_size (size_in_bytes);
> +  const unsigned int shadow_align
> + = (get_pointer_alignment (base) / BITS_PER_UNIT) >> ASAN_SHADOW_SHIFT;
>  
>tree shadow = build_shadow_mem_access (iter, loc, base_addr,
>shadow_ptr_types[0], true);
> @@ -3226,9 +3229,11 @@ asan_expand_mark_ifn (gimple_stmt_iterat
>for (unsigned HOST_WIDE_INT offset = 0; offset < shadow_size;)
>   {
> unsigned size = 1;
> -   if (shadow_size - offset >= 4)
> +   if (shadow_size - offset >= 4
> +   && (!STRICT_ALIGNMENT || shadow_align >= 4))
>   size = 4;
> -   else if (shadow_size - offset >= 2)
> +   else if (shadow_size - offset >= 2
> +&& (!STRICT_ALIGNMENT || shadow_align >= 2))
>   size = 2;
>  
> unsigned HOST_WIDE_INT last_chunk_size = 0;


Jakub


Re: [patch] Disable store merging in asan_expand_mark_ifn

2019-02-12 Thread Eric Botcazou
> Ok, stand corrected on that, 128-bit indeed, but even that is nothing not
> really used.

The irony is that I'm doing this for 32-bit SPARC (we cannot get ASAN to work 
in 64-bit mode for the time being) and the maximum alignment on 32-bit SPARC 
is 64-bit (even long doubles) so this will be totally unused. ;-)

> For STRICT_ALIGNMENT targets store merging pass obviously can't do anything
> with those, because unlike asan.c it can't figure out the alignment.

OK, revised patch attached.  I have manually verified that it yields the 
expected result for an array of long doubles on 64-bit SPARC.


2019-02-12  Eric Botcazou  

* asan.c (asan_expand_mark_ifn): Take into account the alignment of
the object to pick the size of stores on strict-alignment platforms.

-- 
Eric Botcazou
Index: asan.c
===
--- asan.c	(revision 268508)
+++ asan.c	(working copy)
@@ -3218,7 +3218,10 @@ asan_expand_mark_ifn (gimple_stmt_iterat
   /* Generate direct emission if size_in_bytes is small.  */
   if (size_in_bytes <= ASAN_PARAM_USE_AFTER_SCOPE_DIRECT_EMISSION_THRESHOLD)
 {
-  unsigned HOST_WIDE_INT shadow_size = shadow_mem_size (size_in_bytes);
+  const unsigned HOST_WIDE_INT shadow_size
+	= shadow_mem_size (size_in_bytes);
+  const unsigned int shadow_align
+	= (get_pointer_alignment (base) / BITS_PER_UNIT) >> ASAN_SHADOW_SHIFT;
 
   tree shadow = build_shadow_mem_access (iter, loc, base_addr,
 	 shadow_ptr_types[0], true);
@@ -3226,9 +3229,11 @@ asan_expand_mark_ifn (gimple_stmt_iterat
   for (unsigned HOST_WIDE_INT offset = 0; offset < shadow_size;)
 	{
 	  unsigned size = 1;
-	  if (shadow_size - offset >= 4)
+	  if (shadow_size - offset >= 4
+	  && (!STRICT_ALIGNMENT || shadow_align >= 4))
 	size = 4;
-	  else if (shadow_size - offset >= 2)
+	  else if (shadow_size - offset >= 2
+		   && (!STRICT_ALIGNMENT || shadow_align >= 2))
 	size = 2;
 
 	  unsigned HOST_WIDE_INT last_chunk_size = 0;
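
The store-size choice made by the patch above can be modeled outside GCC.
What follows is only a minimal sketch: the helper names (shadow_align_for,
pick_store_size, count_stores) are hypothetical, BITS_PER_UNIT is assumed
to be 8 and ASAN_SHADOW_SHIFT is assumed to be 3, and the loop ignores the
last-chunk handling of the real function.

```c
#include <assert.h>

/* Assumed values for this sketch: 8 bits per byte and a shadow scale of
   8 application bytes per shadow byte.  */
#define BITS_PER_UNIT 8
#define ASAN_SHADOW_SHIFT 3

/* Shadow alignment in bytes for an object aligned to ALIGN_BITS bits.  */
static unsigned
shadow_align_for (unsigned align_bits)
{
  return (align_bits / BITS_PER_UNIT) >> ASAN_SHADOW_SHIFT;
}

/* Size of the next shadow store, mirroring the two patched conditions.  */
static unsigned
pick_store_size (unsigned long remaining, unsigned shadow_align,
                 int strict_alignment)
{
  assert (remaining > 0);
  if (remaining >= 4 && (!strict_alignment || shadow_align >= 4))
    return 4;
  if (remaining >= 2 && (!strict_alignment || shadow_align >= 2))
    return 2;
  return 1;
}

/* Number of stores needed to cover SHADOW_SIZE shadow bytes.  */
static unsigned
count_stores (unsigned long shadow_size, unsigned shadow_align,
              int strict_alignment)
{
  unsigned n = 0;
  for (unsigned long offset = 0; offset < shadow_size; n++)
    offset += pick_store_size (shadow_size - offset, shadow_align,
                               strict_alignment);
  return n;
}
```

Under these assumptions a 128-bit aligned object gives a shadow alignment
of 2, so a strict-alignment target gets 2-byte stores, while a 64-bit
aligned object gives 1 and falls back to byte stores.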


Re: [PATCH 33/40] i386: Emulate MMX ssse3_palignrdi with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX version of palignrq with SSE version by concatenating 2
> 64-bit MMX operands into a single 128-bit SSE operand, followed by
> SSE psrldq.  Only SSE register source operand is allowed.
>
> PR target/89021
> * config/i386/sse.md (ssse3_palignrdi): Changed to
> define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/sse.md | 54 ++
>  1 file changed, 44 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 15c187f7f5c..a1d43204344 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15977,23 +15977,57 @@
> (set_attr "prefix" "orig,vex,evex")
> (set_attr "mode" "")])
>
> -(define_insn "ssse3_palignrdi"
> -  [(set (match_operand:DI 0 "register_operand" "=y")
> -   (unspec:DI [(match_operand:DI 1 "register_operand" "0")
> -   (match_operand:DI 2 "nonimmediate_operand" "ym")
> -   (match_operand:SI 3 "const_0_to_255_mul_8_operand" "n")]
> +(define_insn_and_split "ssse3_palignrdi"
> +  [(set (match_operand:DI 0 "register_operand" "=y,x,Yv")
> +   (unspec:DI [(match_operand:DI 1 "register_operand" "0,0,Yv")
> +   (match_operand:DI 2 "nonimmediate_operand" "ym,x,Yv")
> +   (match_operand:SI 3 "const_0_to_255_mul_8_operand" 
> "n,n,n")]
>UNSPEC_PALIGNR))]
> -  "TARGET_SSSE3"
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
>  {
> -  operands[3] = GEN_INT (INTVAL (operands[3]) / 8);
> -  return "palignr\t{%3, %2, %0|%0, %2, %3}";
> +  if (TARGET_MMX_WITH_SSE)
> +return "#";
> +  else
> +{
> +  operands[3] = GEN_INT (INTVAL (operands[3]) / 8);
> +  return "palignr\t{%3, %2, %0|%0, %2, %3}";
> +}

Use switch with "which_alternative" instead.

Uros.

>  }
> -  [(set_attr "type" "sseishft")
> +  "TARGET_MMX_WITH_SSE && reload_completed"
> +  [(set (match_dup 0)
> +   (lshiftrt:V1TI (match_dup 0) (match_dup 3)))]
> +{
> +  /* Emulate MMX palignrdi with SSE psrldq.  */
> +  rtx op0 = lowpart_subreg (V2DImode, operands[0],
> +   GET_MODE (operands[0]));
> +  rtx insn;
> +  if (TARGET_AVX)
> +insn = gen_vec_concatv2di (op0, operands[2], operands[1]);
> +  else
> +{
> +  /* NB: SSE can only concatenate OP0 and OP1 to OP0.  */
> +  insn = gen_vec_concatv2di (op0, operands[1], operands[2]);
> +  emit_insn (insn);
> +  /* Swap bits 0:63 with bits 64:127.  */
> +  rtx mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (2),
> + GEN_INT (3),
> + GEN_INT (0),
> + GEN_INT (1)));
> +  rtx op1 = gen_rtx_REG (V4SImode, REGNO (op0));

lowpart_subreg.

Uros.
> +  rtx op2 = gen_rtx_VEC_SELECT (V4SImode, op1, mask);
> +  insn = gen_rtx_SET (op1, op2);
> +}
> +  emit_insn (insn);
> +  operands[0] = lowpart_subreg (V1TImode, op0, GET_MODE (op0));
> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseishft")
> (set_attr "atom_unit" "sishuf")
> (set_attr "prefix_extra" "1")
> (set_attr "length_immediate" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p 
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  ;; Mode iterator to handle singularity w/ absence of V2DI and V4DI
>  ;; modes for abs instruction on pre AVX-512 targets.
> --
> 2.20.1
>
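
The idea in the commit message (concatenate the two 64-bit operands into
one 128-bit value, then shift right by whole bytes) can be checked with a
scalar model.  This is a sketch only: the function names are hypothetical,
the shift count is given in bytes rather than in bits as in the RTL
pattern, and it models the architectural result, not the exact operand
order used by the splitter.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of 64-bit palignr: DST:SRC is treated as one
   128-bit value with DST in the high half, shifted right by N bytes,
   and the low 64 bits are kept.  */
static uint64_t
palignr64_ref (uint64_t dst, uint64_t src, unsigned n)
{
  if (n == 0)
    return src;
  if (n < 8)
    return (src >> (8 * n)) | (dst << (8 * (8 - n)));
  if (n < 16)
    return dst >> (8 * (n - 8));
  return 0;
}

/* Emulation: build a 16-byte register, shift the whole register right
   by N bytes (what psrldq does) and keep bytes 0..7.  */
static uint64_t
palignr64_emulated (uint64_t dst, uint64_t src, unsigned n)
{
  uint8_t reg[16];
  uint64_t result = 0;

  /* The pattern's immediate is 0..255 bits in steps of 8.  */
  assert (n < 32);
  for (unsigned i = 0; i < 8; i++)
    {
      reg[i] = (uint8_t) (src >> (8 * i));      /* bits 0:63 */
      reg[8 + i] = (uint8_t) (dst >> (8 * i));  /* bits 64:127 */
    }
  for (unsigned i = 0; i < 8; i++)
    if (i + n < 16)
      result |= (uint64_t) reg[i + n] << (8 * i);
  return result;
}

/* Returns 1 if both models agree for every byte count 0..31.  */
static int
palignr64_models_agree (uint64_t dst, uint64_t src)
{
  for (unsigned n = 0; n < 32; n++)
    if (palignr64_ref (dst, src, n) != palignr64_emulated (dst, src, n))
      return 0;
  return 1;
}
```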


Re: [PATCH 27/40] i386: Emulate MMX ssse3_phwv4hi3 with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX ssse3_phwv4hi3 with SSE by moving bits
> 64:95 to bits 32:63 in SSE register.  Only SSE register source operand
> is allowed.
>
> PR target/89021
> * config/i386/sse.md (ssse3_phwv4hi3):
> Changed to define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/sse.md | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 75e711624ce..e3b63b0e890 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15358,13 +15358,13 @@
> (set_attr "prefix" "orig,vex")
> (set_attr "mode" "TI")])
>
> -(define_insn "ssse3_phwv4hi3"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y")
> +(define_insn_and_split "ssse3_phwv4hi3"
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yv")
> (vec_concat:V4HI
>   (vec_concat:V2HI
> (ssse3_plusminus:HI
>   (vec_select:HI
> -   (match_operand:V4HI 1 "register_operand" "0")
> +   (match_operand:V4HI 1 "register_operand" "0,0,Yv")
> (parallel [(const_int 0)]))
>   (vec_select:HI (match_dup 1) (parallel [(const_int 1)])))
> (ssse3_plusminus:HI
> @@ -15373,19 +15373,35 @@
>   (vec_concat:V2HI
> (ssse3_plusminus:HI
>   (vec_select:HI
> -   (match_operand:V4HI 2 "nonimmediate_operand" "ym")
> +   (match_operand:V4HI 2 "nonimmediate_operand" "ym,x,Yv")
> (parallel [(const_int 0)]))
>   (vec_select:HI (match_dup 2) (parallel [(const_int 1)])))
> (ssse3_plusminus:HI
>   (vec_select:HI (match_dup 2) (parallel [(const_int 2)]))
>   (vec_select:HI (match_dup 2) (parallel [(const_int 3)]))]
> -  "TARGET_SSSE3"
> -  "phw\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   phw\t{%2, %0|%0, %2}
> +   #
> +   #"
> +  "TARGET_MMX_WITH_SSE && reload_completed"
> +  [(const_int 0)]
> +{
> +  /* Generate SSE version of the operation.  */
> +  rtx op0 = gen_rtx_REG (V8HImode, REGNO (operands[0]));
> +  rtx op1 = gen_rtx_REG (V8HImode, REGNO (operands[1]));
> +  rtx op2 = gen_rtx_REG (V8HImode, REGNO (operands[2]));

lowpart_subreg

> +  rtx insn = gen_ssse3_phwv8hi3 (op0, op1, op2);
> +  emit_insn (insn);

emit_insn (gen_ssse3_ph...)

Uros.

> +  ix86_move_vector_high_sse_to_mmx (op0);
> +  DONE;
> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "complex")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p 
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "avx2_phdv8si3"
>[(set (match_operand:V8SI 0 "register_operand" "=x")
> --
> 2.20.1
>
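
Why bits 64:95 have to be moved to bits 32:63 can be seen in a scalar
model of the horizontal add.  This is a hedged sketch with hypothetical
function names: it treats the add variant only, and it assumes the MMX
operands live in the low 64 bits of the SSE registers while the upper
halves hold unrelated data (0xdead here).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 128-bit horizontal add of 16-bit words: words 0..3 of the result come
   from pairs of A, words 4..7 from pairs of B.  */
static void
phaddw128 (uint16_t dst[8], const uint16_t a[8], const uint16_t b[8])
{
  uint16_t tmp[8];
  for (int i = 0; i < 4; i++)
    {
      tmp[i] = (uint16_t) (a[2 * i] + a[2 * i + 1]);
      tmp[4 + i] = (uint16_t) (b[2 * i] + b[2 * i + 1]);
    }
  memcpy (dst, tmp, sizeof tmp);
}

/* 64-bit horizontal add via the 128-bit operation plus the fixup.  */
static void
phaddw64_emulated (uint16_t dst[4], const uint16_t a[4], const uint16_t b[4])
{
  uint16_t ra[8], rb[8], r[8];

  for (int i = 0; i < 8; i++)
    ra[i] = rb[i] = 0xdead;             /* stand-in for the upper halves */
  memcpy (ra, a, 4 * sizeof (uint16_t));
  memcpy (rb, b, 4 * sizeof (uint16_t));

  phaddw128 (r, ra, rb);
  /* The sums coming from B start at bit 64 before the fixup.  */
  assert (r[4] == (uint16_t) (b[0] + b[1]));

  /* Move bits 64:95 to bits 32:63, i.e. words 4..5 to words 2..3.  */
  r[2] = r[4];
  r[3] = r[5];
  memcpy (dst, r, 4 * sizeof (uint16_t));
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40} the low 64 bits end up as
{3, 7, 30, 70}, which is what the 64-bit instruction would produce.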


Re: [PATCH 28/40] i386: Emulate MMX ssse3_phdv2si3 with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX ssse3_phdv2si3 with SSE by moving bits
> 64:95 to bits 32:63 in SSE register.  Only SSE register source operand
> is allowed.
>
> PR target/89021
> * config/i386/sse.md (ssse3_phdv2si3):
> Changed to define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/sse.md | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index e3b63b0e890..3fe41b772c2 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15480,26 +15480,42 @@
> (set_attr "prefix" "orig,vex")
> (set_attr "mode" "TI")])
>
> -(define_insn "ssse3_phdv2si3"
> -  [(set (match_operand:V2SI 0 "register_operand" "=y")
> +(define_insn_and_split "ssse3_phdv2si3"
> +  [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
> (vec_concat:V2SI
>   (plusminus:SI
> (vec_select:SI
> - (match_operand:V2SI 1 "register_operand" "0")
> + (match_operand:V2SI 1 "register_operand" "0,0,Yv")
>   (parallel [(const_int 0)]))
> (vec_select:SI (match_dup 1) (parallel [(const_int 1)])))
>   (plusminus:SI
> (vec_select:SI
> - (match_operand:V2SI 2 "nonimmediate_operand" "ym")
> + (match_operand:V2SI 2 "nonimmediate_operand" "ym,x,Yv")
>   (parallel [(const_int 0)]))
> (vec_select:SI (match_dup 2) (parallel [(const_int 1)])]
> -  "TARGET_SSSE3"
> -  "phd\t{%2, %0|%0, %2}"
> -  [(set_attr "type" "sseiadd")
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +  "@
> +   phd\t{%2, %0|%0, %2}
> +   #
> +   #"
> +  "TARGET_MMX_WITH_SSE && reload_completed"
> +  [(const_int 0)]
> +{
> +  /* Generate SSE version of the operation.  */
> +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> +  rtx op1 = gen_rtx_REG (V4SImode, REGNO (operands[1]));
> +  rtx op2 = gen_rtx_REG (V4SImode, REGNO (operands[2]));

lowpart_subreg.

> +  rtx insn = gen_ssse3_phdv4si3 (op0, op1, op2);
> +  emit_insn (insn);

No need for a variable:

emit_insn (gen_ssse3_ph...)

Uros.

> +  ix86_move_vector_high_sse_to_mmx (op0);
> +  DONE;
> +}
> +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> +   (set_attr "type" "sseiadd")
> (set_attr "atom_unit" "complex")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p 
> (insn)"))
> -   (set_attr "mode" "DI")])
> +   (set_attr "mode" "DI,TI,TI")])
>
>  (define_insn "avx2_pmaddubsw256"
>[(set (match_operand:V16HI 0 "register_operand" "=x,v")
> --
> 2.20.1
>


Re: [PATCH 31/40] i386: Emulate MMX pshufb with SSE version

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX version of pshufb with SSE version by masking out the bit 3
> of the shuffle control byte.  Only SSE register source operand is allowed.
>
> PR target/89021
> * config/i386/sse.md (ssse3_pshufbv8qi3): Renamed to ...
> (ssse3_pshufbv8qi3_mmx): This.
> (ssse3_pshufbv8qi3): New.
> (ssse3_pshufbv8qi3_sse): Likewise.
> ---
>  gcc/config/i386/sse.md | 63 --
>  1 file changed, 61 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index dc35fcfd34a..6e748d0543c 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -15819,18 +15819,77 @@
> (set_attr "btver2_decode" "vector")
> (set_attr "mode" "")])
>
> -(define_insn "ssse3_pshufbv8qi3"
> +(define_expand "ssse3_pshufbv8qi3"
> +  [(set (match_operand:V8QI 0 "register_operand")
> +   (unspec:V8QI [(match_operand:V8QI 1 "register_operand")
> + (match_operand:V8QI 2 "nonimmediate_operand")]
> +UNSPEC_PSHUFB))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
> +{
> +  if (TARGET_MMX_WITH_SSE)
> +{
> +  /* Emulate MMX version of pshufb with SSE version by masking
> +out the bit 3 of the shuffle control byte.  */
> +  rtvec par = gen_rtvec (4, GEN_INT (0xf7f7f7f7),
> +GEN_INT (0xf7f7f7f7),
> +GEN_INT (0xf7f7f7f7),
> +GEN_INT (0xf7f7f7f7));
> +  rtx vec_const = gen_rtx_CONST_VECTOR (V4SImode, par);
> +  vec_const = force_const_mem (V4SImode, vec_const);
> +  rtx op3 = gen_reg_rtx (V4SImode);
> +  rtx op4 = gen_reg_rtx (V4SImode);
> +  rtx insn = gen_rtx_SET (op4, vec_const);
> +  emit_insn (insn);
> +  rtx op2 = force_reg (V8QImode, operands[2]);
> +  insn = gen_ssse3_pshufbv8qi3_sse (operands[0], operands[1],
> +   op2, op3, op4);
> +  emit_insn (insn);
> +  DONE;
> +}
> +})
> +
> +(define_insn "ssse3_pshufbv8qi3_mmx"
>[(set (match_operand:V8QI 0 "register_operand" "=y")
> (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0")
>   (match_operand:V8QI 2 "nonimmediate_operand" "ym")]
>  UNSPEC_PSHUFB))]
> -  "TARGET_SSSE3"
> +  "TARGET_SSSE3 && !TARGET_MMX_WITH_SSE"
>"pshufb\t{%2, %0|%0, %2}";
>[(set_attr "type" "sselog1")
> (set_attr "prefix_extra" "1")
> (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p 
> (insn)"))
> (set_attr "mode" "DI")])
>
> +(define_insn_and_split "ssse3_pshufbv8qi3_sse"
> +  [(set (match_operand:V8QI 0 "register_operand" "=x,Yv")
> +   (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,Yv")
> + (match_operand:V8QI 2 "register_operand" "x,Yv")]
> +UNSPEC_PSHUFB))
> +   (set (match_operand:V4SI 3 "register_operand" "=x,Yv")
> +   (unspec:V4SI [(match_operand:V4SI 4 "register_operand" "3,3")]
> +UNSPEC_PSHUFB))]

Another suspicious pattern form. match_scratch?

Uros.

> +  "TARGET_SSSE3 && TARGET_MMX_WITH_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  /* Mask out the bit 3 of the shuffle control byte.  */
> +  rtx op2 = gen_rtx_REG (V4SImode, REGNO (operands[2]));
> +  rtx op3 = operands[3];
> +  rtx insn = gen_andv4si3 (op3, op3, op2);
> +  emit_insn (insn);
> +  /* Generate SSE version of pshufb.  */
> +  rtx op0 = gen_rtx_REG (V16QImode, REGNO (operands[0]));
> +  rtx op1 = gen_rtx_REG (V16QImode, REGNO (operands[1]));
> +  op3 = gen_rtx_REG (V16QImode, REGNO (op3));

lowpart_subreg.

> +  insn = gen_ssse3_pshufbv16qi3 (op0, op1, op3);
> +  emit_insn (insn);

Emit these two instructions directly from the splitter.

Uros.

> +  DONE;
> +}
> +  [(set_attr "mmx_isa" "x64_noavx,x64_avx")
> +   (set_attr "type" "sselog1")
> +   (set_attr "mode" "TI,TI")])
> +
>  (define_insn "_psign3"
>[(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x")
> (unspec:VI124_AVX2
> --
> 2.20.1
>
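
The effect of clearing bit 3 of each shuffle control byte can be checked
with a scalar model.  A minimal sketch, assuming the usual pshufb
semantics (bit 7 zeroes the lane, the low bits select a source byte); the
function names are hypothetical and the 0xee fill merely stands in for
whatever the upper 64 bits of the SSE register contain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 64-bit pshufb: bit 7 of a control byte zeroes the lane, otherwise
   bits 0:2 select one of 8 source bytes.  */
static void
pshufb64_ref (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t tmp[8];
  for (int i = 0; i < 8; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 7];
  memcpy (dst, tmp, 8);
}

/* 128-bit pshufb: same, but bits 0:3 select one of 16 source bytes.  */
static void
pshufb128 (uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t tmp[16];
  for (int i = 0; i < 16; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 15];
  memcpy (dst, tmp, 16);
}

/* 64-bit pshufb via the 128-bit one, with bit 3 of the control masked.  */
static void
pshufb64_emulated (uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t s[16], c[16], d[16];

  memcpy (s, src, 8);
  memset (s + 8, 0xee, 8);              /* unrelated upper bits */
  memset (c, 0, sizeof c);
  for (int i = 0; i < 8; i++)
    {
      c[i] = ctl[i] & 0xf7;             /* clear bit 3 */
      assert ((c[i] & 15) < 8);         /* selector stays in bytes 0..7 */
    }
  pshufb128 (d, s, c);
  memcpy (dst, d, 8);
}

/* Returns 1 if the emulation matches the reference for every control
   byte value placed in every lane.  */
static int
pshufb64_emulation_matches (const uint8_t src[8])
{
  for (int lane = 0; lane < 8; lane++)
    for (int v = 0; v < 256; v++)
      {
        uint8_t ctl[8] = { 0 }, want[8], got[8];
        ctl[lane] = (uint8_t) v;
        pshufb64_ref (want, src, ctl);
        pshufb64_emulated (got, src, ctl);
        if (memcmp (want, got, 8) != 0)
          return 0;
      }
  return 1;
}
```

Without the mask, a control byte such as 0x0f would select byte 15 of the
128-bit register, which lies outside the 64-bit value.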


Re: [PATCH 19/40] i386: Emulate MMX mmx_pmovmskb with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX mmx_pmovmskb with SSE by zero-extending result of SSE pmovmskb
> from QImode to SImode.  Only SSE register source operand is allowed.
>
> PR target/89021
> * config/i386/mmx.md (mmx_pmovmskb): Changed to
> define_insn_and_split to support SSE emulation.
> ---
>  gcc/config/i386/mmx.md | 30 +++---
>  1 file changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 4cf008e99c7..d9ff70884bd 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -1799,14 +1799,30 @@
>[(set_attr "type" "mmxshft")
> (set_attr "mode" "DI")])
>
> -(define_insn "mmx_pmovmskb"
> -  [(set (match_operand:SI 0 "register_operand" "=r")
> -   (unspec:SI [(match_operand:V8QI 1 "register_operand" "y")]
> +(define_insn_and_split "mmx_pmovmskb"
> +  [(set (match_operand:SI 0 "register_operand" "=r,r")
> +   (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")]
>UNSPEC_MOVMSK))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> -  "pmovmskb\t{%1, %0|%0, %1}"
> -  [(set_attr "type" "mmxcvt")
> -   (set_attr "mode" "DI")])
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && (TARGET_SSE || TARGET_3DNOW_A)"
> +  "@
> +   pmovmskb\t{%1, %0|%0, %1}
> +   #"
> +  "TARGET_MMX_WITH_SSE && reload_completed"
> +  [(set (match_dup 0)
> +   (zero_extend:SI (match_dup 1)))]
> +{
> +  /* Generate SSE pmovmskb and zero-extend from QImode to SImode.  */
> +  rtx op1 = lowpart_subreg (V16QImode, operands[1],
> +   GET_MODE (operands[1]));
> +  rtx insn = gen_sse2_pmovmskb (operands[0], op1);
> +  emit_insn (insn);

This should be emitted explicitly from the splitter.

Uros.

> +  operands[1] = lowpart_subreg (QImode, operands[0],
> +   GET_MODE (operands[0]));
> +}
> +  [(set_attr "mmx_isa" "native,x64")
> +   (set_attr "type" "mmxcvt,ssemov")
> +   (set_attr "mode" "DI,TI")])
>
>  (define_expand "mmx_maskmovq"
>[(set (match_operand:V8QI 0 "memory_operand")
> --
> 2.20.1
>
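
The zero-extension from QImode to SImode described in the commit message
amounts to keeping only the low 8 bits of the 16-bit SSE mask.  A minimal
sketch with hypothetical function names, assuming that bits 64:127 of the
SSE register are unrelated to the 64-bit operand (modeled as all ones so
that they are clearly visible in the full mask):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 128-bit pmovmskb: bit I of the result is the sign bit of byte I.  */
static unsigned
pmovmskb128 (const uint8_t v[16])
{
  unsigned mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (unsigned) (v[i] >> 7) << i;
  return mask;
}

/* 64-bit pmovmskb via the 128-bit one.  */
static uint32_t
pmovmskb64_emulated (const uint8_t v[8])
{
  uint8_t reg[16];

  memcpy (reg, v, 8);
  memset (reg + 8, 0xff, 8);            /* unrelated upper bits */

  unsigned full = pmovmskb128 (reg);
  /* The upper half leaks into bits 8:15 of the full mask.  */
  assert ((full >> 8) == 0xff);

  /* Keep the low byte and zero-extend it.  */
  return (uint32_t) (uint8_t) full;
}
```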


Re: [PATCH 03/40] i386: Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX punpcklXX/punpckhXX with SSE punpcklXX.  For MMX punpckhXX,
> move bits 64:127 to bits 0:63 in SSE register.  Only SSE register source
> operand is allowed.
>
> PR target/89021
> * config/i386/i386-protos.h (ix86_split_mmx_punpck): New
> prototype.
> * config/i386/i386.c (ix86_split_mmx_punpck): New function.
> * config/i386/mmx.md (mmx_punpckhbw): Changed to
> define_insn_and_split to support SSE emulation.
> (mmx_punpcklbw): Likewise.
> (mmx_punpckhwd): Likewise.
> (mmx_punpcklwd): Likewise.
> (mmx_punpckhdq): Likewise.
> (mmx_punpckldq): Likewise.
> ---
>  gcc/config/i386/i386-protos.h |   1 +
>  gcc/config/i386/i386.c|  77 +++
>  gcc/config/i386/mmx.md| 138 ++
>  3 files changed, 168 insertions(+), 48 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index bb96a420a85..dc7fc38d8e4 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -202,6 +202,7 @@ extern rtx ix86_split_stack_guard (void);
>
>  extern void ix86_move_vector_high_sse_to_mmx (rtx);
>  extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
> +extern void ix86_split_mmx_punpck (rtx[], bool);
>
>  #ifdef TREE_CODE
>  extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index b8d5ba7f28f..7d65192c1cd 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -20009,6 +20009,83 @@ ix86_split_mmx_pack (rtx operands[], enum rtx_code 
> code)
>ix86_move_vector_high_sse_to_mmx (op0);
>  }
>
> +/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  */
> +
> +void
> +ix86_split_mmx_punpck (rtx operands[], bool high_p)
> +{
> +  rtx op0 = operands[0];
> +  rtx op1 = operands[1];
> +  rtx op2 = operands[2];
> +  machine_mode mode = GET_MODE (op0);
> +  rtx mask;
> +  /* The corresponding SSE mode.  */
> +  machine_mode sse_mode, double_sse_mode;
> +
> +  switch (mode)
> +{
> +case E_V8QImode:
> +  sse_mode = V16QImode;
> +  double_sse_mode = V32QImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (16,
> + GEN_INT (0), GEN_INT (16),
> + GEN_INT (1), GEN_INT (17),
> + GEN_INT (2), GEN_INT (18),
> + GEN_INT (3), GEN_INT (19),
> + GEN_INT (4), GEN_INT (20),
> + GEN_INT (5), GEN_INT (21),
> + GEN_INT (6), GEN_INT (22),
> + GEN_INT (7), GEN_INT (23)));
> +  break;
> +
> +case E_V4HImode:
> +  sse_mode = V8HImode;
> +  double_sse_mode = V16HImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (8,
> + GEN_INT (0), GEN_INT (8),
> + GEN_INT (1), GEN_INT (9),
> + GEN_INT (2), GEN_INT (10),
> + GEN_INT (3), GEN_INT (11)));
> +  break;
> +
> +case E_V2SImode:
> +  sse_mode = V4SImode;
> +  double_sse_mode = V8SImode;
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4,
> + GEN_INT (0), GEN_INT (4),
> + GEN_INT (1), GEN_INT (5)));
> +  break;
> +
> +default:
> +  gcc_unreachable ();
> +}
> +
> +  /* Generate SSE punpcklXX.  */
> +  rtx dest = gen_rtx_REG (sse_mode, REGNO (op0));
> +  op1 = gen_rtx_REG (sse_mode, REGNO (op1));
> +  op2 = gen_rtx_REG (sse_mode, REGNO (op2));

lowpart_subreg here.

Uros.

> +
> +  op1 = gen_rtx_VEC_CONCAT (double_sse_mode, op1, op2);
> +  op2 = gen_rtx_VEC_SELECT (sse_mode, op1, mask);
> +  rtx insn = gen_rtx_SET (dest, op2);
> +  emit_insn (insn);
> +
> +  if (high_p)
> +{
> +  /* Move bits 64:127 to bits 0:63.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> + GEN_INT (0), GEN_INT (0)));
> +  dest = gen_rtx_REG (V4SImode, REGNO (dest));
> +  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  insn = gen_rtx_SET (dest, op1);
> +  emit_insn (insn);
> +}
> +}
> +
>  /* Helper function of ix86_fixup_binary_operands to canonicalize
> operand order.  Returns true if the operands should be swapped.  */
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index 840d369ab02..034c6a855e0 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> 
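
The reason a single SSE punpcklXX can serve both the low and the high MMX
variant is visible in a byte-level model.  This is a sketch under stated
assumptions: the function names are hypothetical, only the byte (bw)
flavour is modeled, and the 64-bit operands are assumed to sit in the low
halves of the SSE registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 128-bit punpcklbw: interleave the low 8 bytes of A and B.  */
static void
punpcklbw128 (uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
  uint8_t tmp[16];
  for (int i = 0; i < 8; i++)
    {
      tmp[2 * i] = a[i];
      tmp[2 * i + 1] = b[i];
    }
  memcpy (dst, tmp, 16);
}

/* 64-bit punpcklbw (HIGH_P == 0) or punpckhbw (HIGH_P != 0) via the
   128-bit low interleave.  */
static void
punpckbw64_emulated (uint8_t dst[8], const uint8_t a[8], const uint8_t b[8],
                     int high_p)
{
  uint8_t ra[16] = { 0 }, rb[16] = { 0 }, r[16];

  memcpy (ra, a, 8);
  memcpy (rb, b, 8);
  punpcklbw128 (r, ra, rb);
  /* The "high" interleave of the 64-bit operands sits in bits 64:127.  */
  assert (r[8] == a[4] && r[9] == b[4]);

  /* For the high variant, move bits 64:127 to bits 0:63.  */
  memcpy (dst, high_p ? r + 8 : r, 8);
}
```

The low 64 bits of the interleave are already the punpckl result, and the
upper 64 bits are the punpckh result, hence the extra move in that case.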

Re: [PATCH 02/40] i386: Emulate MMX packsswb/packssdw/packuswb with SSE2

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX packsswb/packssdw/packuswb with SSE packsswb/packssdw/packuswb
> plus moving bits 64:95 to bits 32:63 in SSE register.  Only SSE register
> source operand is allowed.
>
> 2019-02-08  H.J. Lu  
> Uros Bizjak  
>
> PR target/89021
> * config/i386/i386-protos.h (ix86_move_vector_high_sse_to_mmx):
> New prototype.
> (ix86_split_mmx_pack): Likewise.
> * config/i386/i386.c (ix86_move_vector_high_sse_to_mmx): New
> function.
> (ix86_split_mmx_pack): Likewise.
> * config/i386/i386.md (mmx_isa): New.
> (enabled): Also check mmx_isa.
> * config/i386/mmx.md (any_s_truncate): New code iterator.
> (s_trunsuffix): New code attr.
> (mmx_packsswb): Removed.
> (mmx_packssdw): Likewise.
> (mmx_packuswb): Likewise.
> (mmx_packswb): New define_insn_and_split to emulate
> MMX packsswb/packuswb with SSE2.
> (mmx_packssdw): Likewise.
> ---
>  gcc/config/i386/i386-protos.h |  3 ++
>  gcc/config/i386/i386.c| 54 
>  gcc/config/i386/i386.md   | 12 +++
>  gcc/config/i386/mmx.md| 67 +++
>  4 files changed, 106 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 2d600173917..bb96a420a85 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -200,6 +200,9 @@ extern void ix86_expand_vecop_qihi (enum rtx_code, rtx, 
> rtx, rtx);
>
>  extern rtx ix86_split_stack_guard (void);
>
> +extern void ix86_move_vector_high_sse_to_mmx (rtx);
> +extern void ix86_split_mmx_pack (rtx[], enum rtx_code);
> +
>  #ifdef TREE_CODE
>  extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
>  #endif /* TREE_CODE  */
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 61e602bdb38..b8d5ba7f28f 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -19955,6 +19955,60 @@ ix86_expand_vector_move_misalign (machine_mode mode, 
> rtx operands[])
>  gcc_unreachable ();
>  }
>
> +/* Move bits 64:95 to bits 32:63.  */
> +
> +void
> +ix86_move_vector_high_sse_to_mmx (rtx op)
> +{
> +  rtx mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (0), GEN_INT (2),
> + GEN_INT (0), GEN_INT (0)));
> +  rtx dest = gen_rtx_REG (V4SImode, REGNO (op));
> +  op = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  rtx insn = gen_rtx_SET (dest, op);
> +  emit_insn (insn);
> +}
> +
> +/* Split MMX pack with signed/unsigned saturation with SSE/SSE2.  */
> +
> +void
> +ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
> +{
> +  rtx op0 = operands[0];
> +  rtx op1 = operands[1];
> +  rtx op2 = operands[2];
> +
> +  machine_mode dmode = GET_MODE (op0);
> +  machine_mode smode = GET_MODE (op1);
> +  machine_mode inner_dmode = GET_MODE_INNER (dmode);
> +  machine_mode inner_smode = GET_MODE_INNER (smode);
> +
> +  /* Get the corresponding SSE mode for destination.  */
> +  int nunits = 16 / GET_MODE_SIZE (inner_dmode);
> +  machine_mode sse_dmode = mode_for_vector (GET_MODE_INNER (dmode),
> +   nunits).require ();
> +  machine_mode sse_half_dmode = mode_for_vector (GET_MODE_INNER (dmode),
> +nunits / 2).require ();
> +
> +  /* Get the corresponding SSE mode for source.  */
> +  nunits = 16 / GET_MODE_SIZE (inner_smode);
> +  machine_mode sse_smode = mode_for_vector (GET_MODE_INNER (smode),
> +   nunits).require ();
> +
> +  /* Generate SSE pack with signed/unsigned saturation.  */
> +  rtx dest = gen_rtx_REG (sse_dmode, REGNO (op0));
> +  op1 = gen_rtx_REG (sse_smode, REGNO (op1));
> +  op2 = gen_rtx_REG (sse_smode, REGNO (op2));

Please use lowpart_subreg.

Uros.

> +
> +  op1 = gen_rtx_fmt_e (code, sse_half_dmode, op1);
> +  op2 = gen_rtx_fmt_e (code, sse_half_dmode, op2);
> +  rtx insn = gen_rtx_SET (dest, gen_rtx_VEC_CONCAT (sse_dmode,
> +   op1, op2));
> +  emit_insn (insn);
> +
> +  ix86_move_vector_high_sse_to_mmx (op0);
> +}
> +
>  /* Helper function of ix86_fixup_binary_operands to canonicalize
> operand order.  Returns true if the operands should be swapped.  */
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 5b89e52493e..633b1dab523 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -792,6 +792,9 @@
> avx512vl,noavx512vl,x64_avx512dq,x64_avx512bw"
>(const_string "base"))
>
> +;; Define instruction set of MMX instructions
> +(define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx" (const_string 
> "base"))
> +
>  (define_attr "enabled" ""
>(cond [(eq_attr "isa" "x64") (symbol_ref "TARGET_64BIT")
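
The pack emulation follows the same layout argument: the 128-bit pack puts
the results for the second operand at bit 64, so they have to be moved
down next to the results for the first operand.  A minimal sketch for the
signed word-to-byte case; the function names are hypothetical and the
0x5555 fill stands in for unrelated upper register contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Signed saturation of a 16-bit value to 8 bits.  */
static int8_t
sat_s8 (int16_t x)
{
  return x > 127 ? 127 : x < -128 ? -128 : (int8_t) x;
}

/* 128-bit packsswb: bytes 0..7 come from A, bytes 8..15 from B.  */
static void
packsswb128 (int8_t dst[16], const int16_t a[8], const int16_t b[8])
{
  for (int i = 0; i < 8; i++)
    {
      dst[i] = sat_s8 (a[i]);
      dst[8 + i] = sat_s8 (b[i]);
    }
}

/* 64-bit packsswb via the 128-bit one plus the fixup.  */
static void
packsswb64_emulated (int8_t dst[8], const int16_t a[4], const int16_t b[4])
{
  int16_t ra[8], rb[8];
  int8_t r[16];

  for (int i = 0; i < 8; i++)
    ra[i] = rb[i] = 0x5555;             /* unrelated upper bits */
  memcpy (ra, a, 4 * sizeof (int16_t));
  memcpy (rb, b, 4 * sizeof (int16_t));

  packsswb128 (r, ra, rb);
  /* The results for B start at bit 64 before the fixup.  */
  assert (r[8] == sat_s8 (b[0]));

  /* Move bits 64:95 to bits 32:63, i.e. bytes 8..11 to bytes 4..7.  */
  memcpy (r + 4, r + 8, 4);
  memcpy (dst, r, 8);
}
```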

Re: [PATCH 25/40] i386: Emulate MMX movntq with SSE2 movntidi

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX movntq with SSE2 movntidi.  Only SSE register source operand
> is allowed.

Actually, it allows general register source operand.
>
> PR target/89021
> * config/i386/mmx.md (sse_movntq): Renamed to ...
> (*sse_movntq): This.  Require TARGET_MMX and disallow
> TARGET_MMX_WITH_SSE.
> (sse_movntq): New.  Emulate MMX movntq with SSE2 movntidi.

No need to complicate that much. Just add movnti alternative to the
existing pattern.

Uros.

> ---
>  gcc/config/i386/mmx.md | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index b3048a6a3b8..2efa663b3e2 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -238,11 +238,26 @@
>DONE;
>  })
>
> -(define_insn "sse_movntq"
> +(define_expand "sse_movntq"
> +  [(set (match_operand:DI 0 "memory_operand")
> +   (unspec:DI [(match_operand:DI 1 "register_operand")]
> +  UNSPEC_MOVNTQ))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> +   && (TARGET_SSE || TARGET_3DNOW_A)"
> +{
> +  if (TARGET_MMX_WITH_SSE)
> +{
> +  rtx insn = gen_sse2_movntidi (operands[0], operands[1]);
> +  emit_insn (insn);
> +  DONE;
> +}
> +})
> +
> +(define_insn "*sse_movntq"
>[(set (match_operand:DI 0 "memory_operand" "=m")
> (unspec:DI [(match_operand:DI 1 "register_operand" "y")]
>UNSPEC_MOVNTQ))]
> -  "TARGET_SSE || TARGET_3DNOW_A"
> +  "TARGET_MMX && !TARGET_MMX_WITH_SSE && (TARGET_SSE || TARGET_3DNOW_A)"
>"movntq\t{%1, %0|%0, %1}"
>[(set_attr "type" "mmxmov")
> (set_attr "mode" "DI")])
> --
> 2.20.1
>


Re: [rs6000] 64-bit integer loads/stores and FP instructions

2019-02-12 Thread Eric Botcazou
> No, we should allow both integer and floating point insns for integer stores
> always.  We just get the cost estimates slightly wrong now, apparently.

Note that my proof of concept patch doesn't disallow them either...  So what 
do you suggest?  Just putting back the '*' modifiers in the DI patterns?  As a 
matter of fact they are still present in the SI pattern.

-- 
Eric Botcazou


[SVE ACLE] Implement svlsl

2019-02-12 Thread Prathamesh Kulkarni
Committed attached patch to aarch64/sve-acle-branch.

Thanks,
Prathamesh
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.c b/gcc/config/aarch64/aarch64-sve-builtins.c
index ed06db9b7c6..598411fb834 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.c
+++ b/gcc/config/aarch64/aarch64-sve-builtins.c
@@ -126,7 +126,11 @@ enum function_shape {
   SHAPE_shift_right_imm,
 
   /* sv_t svfoo_wide[_t0](sv_t, svuint64_t).  */
-  SHAPE_binary_wide
+  SHAPE_binary_wide,
+
+  /* sv_t svfoo[_t0](sv_t, sv_t)
+ sv_t svfoo[_t0](sv_t, uint64_t).  */
+  SHAPE_shift_opt_n
 };
 
 /* Classifies an operation into "modes"; for example, to distinguish
@@ -172,6 +176,7 @@ enum function {
   FUNC_svdup,
   FUNC_sveor,
   FUNC_svindex,
+  FUNC_svlsl,
   FUNC_svlsl_wide,
   FUNC_svmax,
   FUNC_svmad,
@@ -479,6 +484,7 @@ private:
   rtx expand_dup ();
   rtx expand_eor ();
   rtx expand_index ();
+  rtx expand_lsl ();
   rtx expand_lsl_wide ();
   rtx expand_max ();
   rtx expand_min ();
@@ -912,6 +918,12 @@ arm_sve_h_builder::build (const function_group &group)
   add_overloaded_functions (group, MODE_none);
   build_all (&arm_sve_h_builder::sig_00i, group, MODE_none);
   break;
+
+case SHAPE_shift_opt_n:
+  add_overloaded_functions (group, MODE_none);
+  build_all (&arm_sve_h_builder::sig_000, group, MODE_none);
+  build_all (&arm_sve_h_builder::sig_n_00i, group, MODE_n);
+  break;
 }
 }
 
@@ -1222,6 +1234,7 @@ arm_sve_h_builder::get_attributes (const function_instance )
 case FUNC_svdup:
 case FUNC_sveor:
 case FUNC_svindex:
+case FUNC_svlsl:
 case FUNC_svlsl_wide:
 case FUNC_svmax:
 case FUNC_svmad:
@@ -1280,6 +1293,7 @@ arm_sve_h_builder::get_explicit_types (function_shape shape)
 case SHAPE_ternary_qq_opt_n:
 case SHAPE_shift_right_imm:
 case SHAPE_binary_wide:
+case SHAPE_shift_opt_n:
   return 0;
 }
   gcc_unreachable ();
@@ -1347,6 +1361,7 @@ function_resolver::resolve ()
 case SHAPE_unary:
   return resolve_uniform (1);
 case SHAPE_binary_opt_n:
+case SHAPE_shift_opt_n:
   return resolve_uniform (2);
 case SHAPE_ternary_opt_n:
   return resolve_uniform (3);
@@ -1706,6 +1721,7 @@ function_checker::check ()
 case SHAPE_ternary_opt_n:
 case SHAPE_ternary_qq_opt_n:
 case SHAPE_binary_wide:
+case SHAPE_shift_opt_n:
   return true;
 }
   gcc_unreachable ();
@@ -1895,6 +1911,7 @@ gimple_folder::fold ()
 case FUNC_svdup:
 case FUNC_sveor:
 case FUNC_svindex:
+case FUNC_svlsl:
 case FUNC_svlsl_wide:
 case FUNC_svmax:
 case FUNC_svmad:
@@ -2001,6 +2018,9 @@ function_expander::expand ()
 case FUNC_svindex:
   return expand_index ();
 
+case FUNC_svlsl:
+  return expand_lsl ();
+
 case FUNC_svlsl_wide:
   return expand_lsl_wide ();
 
@@ -2175,6 +2195,30 @@ function_expander::expand_index ()
   return expand_via_unpred_direct_optab (vec_series_optab);
 }
 
+/* Expand a call to svlsl.  */
+rtx
+function_expander::expand_lsl ()
+{
+  machine_mode mode = get_mode (0);
+  machine_mode elem_mode = GET_MODE_INNER (mode);
+
+  if (m_fi.mode == MODE_n
+  && mode != VNx2DImode
+  && !aarch64_simd_shift_imm_p (m_args[2], elem_mode, true))
+return expand_lsl_wide ();
+
+  if (m_fi.pred == PRED_x)
+{
+  insn_code icode = code_for_aarch64_pred (ASHIFT, mode);
+  return expand_via_pred_x_insn (icode);
+}
+  else
+{
+  insn_code icode = code_for_cond (ASHIFT, mode);
+  return expand_via_pred_insn (icode);
+}
+}
+
 /* Expand a call to svlsl_wide.  */
 rtx
 function_expander::expand_lsl_wide ()
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def
index 8322c4bb349..5513598c64e 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins.def
@@ -71,6 +71,7 @@ DEF_SVE_FUNCTION (svdot, ternary_qq_opt_n, sdi, none)
 DEF_SVE_FUNCTION (svdup, unary_n, all_data, mxznone)
 DEF_SVE_FUNCTION (sveor, binary_opt_n, all_integer, mxz)
 DEF_SVE_FUNCTION (svindex, binary_scalar, all_data, none)
+DEF_SVE_FUNCTION (svlsl, shift_opt_n, all_integer, mxz)
 DEF_SVE_FUNCTION (svlsl_wide, binary_wide, all_bhsi, mxz)
 DEF_SVE_FUNCTION (svmax, binary_opt_n, all_data, mxz)
 DEF_SVE_FUNCTION (svmin, binary_opt_n, all_data, mxz)
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index bd635645050..9076f83449f 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1510,20 +1510,21 @@
 ;; actually need the predicate for the first alternative, but using Upa
 ;; or X isn't likely to gain much and would make the instruction seem
 ;; less uniform to the register allocator.
-(define_insn "*v3"
-  [(set (match_operand:SVE_I 0 "register_operand" "=w, w, ?")
+(define_insn "@aarch64_pred_"
+  [(set (match_operand:SVE_I 0 "register_operand" "=w, w, w, ?")
 	(unspec:SVE_I
-	  [(match_operand: 1 "register_operand" 

Re: [PATCH 21/40] i386: Emulate MMX maskmovq with SSE2 maskmovdqu

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX maskmovq with SSE2 maskmovdqu in 64-bit mode by zero-extending
> source and mask operands to 128 bits.  Handle unmapped bits 64:127 at
> memory address by adjusting source and mask operands together with memory
> address.
>
> PR target/89021
> * config/i386/xmmintrin.h: Emulate MMX maskmovq with SSE2
> maskmovdqu in 64-bit mode.
> ---
>  gcc/config/i386/xmmintrin.h | 61 +
>  1 file changed, 61 insertions(+)
>
> diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
> index 58284378514..e797795f127 100644
> --- a/gcc/config/i386/xmmintrin.h
> +++ b/gcc/config/i386/xmmintrin.h
> @@ -1165,7 +1165,68 @@ _m_pshufw (__m64 __A, int const __N)
>  extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_maskmove_si64 (__m64 __A, __m64 __N, char *__P)
>  {
> +#ifdef __x86_64__

We need __MMX_WITH_SSE__ target macro defined from the compiler here.

Uros.

> +  /* Emulate MMX maskmovq with SSE2 maskmovdqu and handle unmapped bits
> + 64:127 at address __P.  */
> +  typedef long long __v2di __attribute__ ((__vector_size__ (16)));
> +  typedef char __v16qi __attribute__ ((__vector_size__ (16)));
> +  /* Zero-extend __A and __N to 128 bits.  */
> +  __v2di __A128 = __extension__ (__v2di) { ((__v1di) __A)[0], 0 };
> +  __v2di __N128 = __extension__ (__v2di) { ((__v1di) __N)[0], 0 };
> +
> +  /* Check the alignment of __P.  */
> +  __SIZE_TYPE__ offset = ((__SIZE_TYPE__) __P) & 0xf;
> +  if (offset)
> +{
> +  /* If the misalignment of __P > 8, subtract __P by 8 bytes.
> +Otherwise, subtract __P by the misalignment.  */
> +  if (offset > 8)
> +   offset = 8;
> +  __P = (char *) (((__SIZE_TYPE__) __P) - offset);
> +
> +  /* Shift __A128 and __N128 to the left by the adjustment.  */
> +  switch (offset)
> +   {
> +   case 1:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 8);
> + break;
> +   case 2:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 2 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 2 * 8);
> + break;
> +   case 3:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 3 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 3 * 8);
> + break;
> +   case 4:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 4 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 4 * 8);
> + break;
> +   case 5:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 5 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 5 * 8);
> + break;
> +   case 6:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 6 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 6 * 8);
> + break;
> +   case 7:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 7 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 7 * 8);
> + break;
> +   case 8:
> + __A128 = __builtin_ia32_pslldqi128 (__A128, 8 * 8);
> + __N128 = __builtin_ia32_pslldqi128 (__N128, 8 * 8);
> + break;
> +   default:
> + break;
> +   }
> +}
> +  __builtin_ia32_maskmovdqu ((__v16qi)__A128, (__v16qi)__N128, __P);
> +#else
>__builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
> +#endif
>  }
>
>  extern __inline void __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
> --
> 2.20.1
>
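
For readers following the address adjustment in the quoted patch, here is a minimal sketch in portable C that models it on a plain byte buffer.  It is not the intrinsic implementation: masked_store16, maskmove_adjust and emulate_maskmove8 are hypothetical names, and a buffer index stands in for a real address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a 16-byte byte-masked store: byte I is written only when
   the top bit of MASK[I] is set.  */
static void
masked_store16 (uint8_t *dst, const uint8_t src[16], const uint8_t mask[16])
{
  for (size_t i = 0; i < 16; i++)
    if (mask[i] & 0x80)
      dst[i] = src[i];
}

/* Adjustment described in the quoted patch: the misalignment of ADDR
   within a 16-byte block, capped at 8.  */
static size_t
maskmove_adjust (size_t addr)
{
  size_t offset = addr & 0xf;
  return offset > 8 ? 8 : offset;
}

/* Hypothetical model of the emulation.  MEM stands in for memory and
   ADDR is an index into it, treated as if it were the address.  */
static void
emulate_maskmove8 (uint8_t *mem, size_t addr,
                   const uint8_t src[8], const uint8_t mask[8])
{
  uint8_t src128[16] = { 0 };
  uint8_t mask128[16] = { 0 };
  size_t offset = maskmove_adjust (addr);

  /* Zero-extend to 128 bits and shift left by OFFSET bytes, so the
     payload occupies byte positions OFFSET .. OFFSET + 7.  */
  memcpy (src128 + offset, src, 8);
  memcpy (mask128 + offset, mask, 8);

  /* Store from the lowered address; masked-off bytes are untouched.  */
  masked_store16 (mem + addr - offset, src128, mask128);
}
```

In this model, when the misalignment is at most 8 the 16-byte store covers exactly the aligned 16-byte block that contains ADDR; otherwise it ends at the same byte as the original 8-byte store would.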


Re: [PATCH 15/40] i386: Emulate MMX sse_cvtpi2ps with SSE

2019-02-12 Thread Uros Bizjak
On Mon, Feb 11, 2019 at 11:55 PM H.J. Lu  wrote:
>
> Emulate MMX sse_cvtpi2ps with SSE2 cvtdq2ps, preserving upper 64 bits of
> destination XMM register.  Only SSE register source operand is allowed.
>
> PR target/89021
> * config/i386/mmx.md (UNSPEC_CVTPI2PS): New.
> (sse_cvtpi2ps): Renamed to ...
> (*mmx_cvtpi2ps): This.  Disabled for TARGET_MMX_WITH_SSE.
> (sse_cvtpi2ps): New.
> (mmx_cvtpi2ps_sse): Likewise.
> ---
>  gcc/config/i386/sse.md | 83 +-
>  1 file changed, 81 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 80bb4cb935d..75e711624ce 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -18,6 +18,9 @@
>  ;; .
>
>  (define_c_enum "unspec" [
> +  ;; MMX with SSE
> +  UNSPEC_CVTPI2PS
> +
>;; SSE
>UNSPEC_MOVNT
>
> @@ -4655,14 +4658,90 @@
>  ;;
>  ;
>
> -(define_insn "sse_cvtpi2ps"
> +(define_expand "sse_cvtpi2ps"
> +  [(set (match_operand:V4SF 0 "register_operand")
> +   (vec_merge:V4SF
> + (vec_duplicate:V4SF
> +   (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand")))
> + (match_operand:V4SF 1 "register_operand")
> + (const_int 3)))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSE"
> +{
> +  if (TARGET_MMX_WITH_SSE)
> +{
> +  rtx op2 = force_reg (V2SImode, operands[2]);
> +  rtx op3 = gen_reg_rtx (V4SFmode);
> +  rtx op4 = gen_reg_rtx (V4SFmode);
> +  rtx insn = gen_mmx_cvtpi2ps_sse (operands[0], operands[1], op2,
> +  op3, op4);
> +  emit_insn (insn);
> +  DONE;
> +}
> +})
> +
> +(define_insn_and_split "mmx_cvtpi2ps_sse"
> +  [(set (match_operand:V4SF 0 "register_operand" "=x,Yv")
> +   (unspec:V4SF [(match_operand:V2SI 2 "register_operand" "x,Yv")
> + (match_operand:V4SF 1 "register_operand" "0,Yv")]
> +UNSPEC_CVTPI2PS))
> +   (set (match_operand:V4SF 3 "register_operand" "=x,Yv")
> +   (unspec:V4SF [(match_operand:V4SF 4 "register_operand" "3,3")]
> +UNSPEC_CVTPI2PS))]

This is indeed one strange pattern. Can you please elaborate on why it
should be written this way? Do you need a scratch register
(match_scratch) here?

Uros.

> +  "TARGET_MMX_WITH_SSE"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx op2 = gen_rtx_REG (V4SImode, REGNO (operands[2]));
> +  /* Generate SSE2 cvtdq2ps.  */
> +  rtx insn = gen_floatv4siv4sf2 (operands[3], op2);
> +  emit_insn (insn);
> +
> +  /* Merge operands[3] with operands[0].  */
> +  rtx mask, op1;
> +  if (TARGET_AVX)
> +{
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (0), GEN_INT (1),
> + GEN_INT (6), GEN_INT (7)));
> +  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[3], operands[1]);
> +  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
> +  insn = gen_rtx_SET (operands[0], op2);
> +}
> +  else
> +{
> +  /* NB: SSE can only concatenate OP0 and OP3 to OP0.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> + GEN_INT (4), GEN_INT (5)));
> +  op1 = gen_rtx_VEC_CONCAT (V8SFmode, operands[0], operands[3]);
> +  op2 = gen_rtx_VEC_SELECT (V4SFmode, op1, mask);
> +  insn = gen_rtx_SET (operands[0], op2);
> +  emit_insn (insn);
> +
> +  /* Swap bits 0:63 with bits 64:127.  */
> +  mask = gen_rtx_PARALLEL (VOIDmode,
> +  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
> + GEN_INT (0), GEN_INT (1)));
> +  rtx dest = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> +  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
> +  insn = gen_rtx_SET (dest, op1);
> +}
> +  emit_insn (insn);
> +  DONE;
> +}
> +  [(set_attr "isa" "noavx,avx")
> +   (set_attr "type" "ssecvt")
> +   (set_attr "mode" "V4SF")])
> +
> +(define_insn "*mmx_cvtpi2ps"
>[(set (match_operand:V4SF 0 "register_operand" "=x")
> (vec_merge:V4SF
>   (vec_duplicate:V4SF
> (float:V2SF (match_operand:V2SI 2 "nonimmediate_operand" "ym")))
>   (match_operand:V4SF 1 "register_operand" "0")
>   (const_int 3)))]
> -  "TARGET_SSE"
> +  "TARGET_SSE && !TARGET_MMX_WITH_SSE"
>"cvtpi2ps\t{%2, %0|%0, %2}"
>[(set_attr "type" "ssecvt")
> (set_attr "mode" "V4SF")])
> --
> 2.20.1
>
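
As a reading aid for the quoted description, here is a minimal scalar sketch in C of the intended result: convert the integer lanes with a four-lane conversion, then merge only the low two lanes into the destination.  The function name and the array-based lanes are assumptions for illustration; this is not the RTL expansion.

```c
#include <stdint.h>

/* Hypothetical scalar model.  SRC128 holds the 64-bit source in lanes
   0 and 1; lanes 2 and 3 are whatever the wider register contains.  */
static void
model_cvtpi2ps_via_cvtdq2ps (float dst[4], const int32_t src128[4])
{
  float tmp[4];

  /* The four-lane conversion (cvtdq2ps in the quoted patch) converts
     every lane, including the two that are not wanted.  */
  for (int i = 0; i < 4; i++)
    tmp[i] = (float) src128[i];

  /* Merge step: lanes 0 and 1 come from the conversion, lanes 2 and 3
     of the destination are preserved.  */
  dst[0] = tmp[0];
  dst[1] = tmp[1];
}
```

This mirrors the vec_merge with (const_int 3) in the original pattern, where the two low mask bits select the converted elements.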


Re: Fix odr ICE on Ada LTO

2019-02-12 Thread Richard Biener
On Sun, Feb 10, 2019 at 11:07 PM Jan Hubicka  wrote:
>
> Hi,
> I am attaching correct patch.
> The option is new only in a relative sense - it was added 5 years ago
> with the original ODR warning infrastructure.
> We have -Wodr-type-merging that controls streaming data needed for -Wodr
> to work and -fno-devirtualize that controls streaming of BINFOs.
>
> I was concerned at that time about extra overhead this streaming causes,
> but with all the optimizations this overhead is quite small now (i.e.
> the mangled type names and there are "only" about 4k types in Firefox)
>
> What is annoying about -Wno-odr-type-merging is that we lose mangled
> names that are also used by devirtualization. ipa-devirt still has two
> implementations of the main hash - one based on mangled names and the
> original one based on virtual table names, but combining both hashes
> results in incomplete type inheritance graphs.

Ah, I see.  I guess we can clean this up for GCC 10 then.

Richard.

> Honza
>
>
> PR lto/89272
> * tree.c (fld_simplified_type_name): Also keep TYPE_DECL for
> polymorphic types.
>
>
> --- trunk/gcc/tree.c 2019/02/10 09:45:55 268741
> +++ trunk/gcc/tree.c 2019/02/10 10:46:43 268742
> @@ -5153,7 +5153,10 @@
>   TYPE_DECL if the type doesn't have linkage.
>   this must match fld_  */
>if (type != TYPE_MAIN_VARIANT (type)
> -  || !DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type)))
> +  || (!DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (type))
> + && (TREE_CODE (type) != RECORD_TYPE
> + || !TYPE_BINFO (type)
> + || !BINFO_VTABLE (TYPE_BINFO (type)))))
>  return DECL_NAME (TYPE_NAME (type));
>return TYPE_NAME (type);
>  }
>


Re: Do not use TYPE_NEED_CONSTRUCTING in may_be_aliased

2019-02-12 Thread Richard Biener
On Sun, 10 Feb 2019, Jan Hubicka wrote:

> > Hi,
> > this patch drops test for TYPE_NEEDS_CONSTRUCTING in tree.h and instead
> > sets TREE_READONLY to 0 for external vars of this type. For vars
> > declared locally we drop TREE_READONLY while expanding constructor.
> > Note that I have tried to drop TREE_READONLY always (not only for
> > DECL_EXTERNAL) and it breaks a testcase where constructor is constexpr.
> > So perhaps this is unnecessarily conservative for external vars having
> > a constexpr ctor and perhaps it is better done by the frontend.
> > 
> > Curiously enough, this does not fix the actual testcase in PR88677.
> This turned out to be a bug in my patch: I cleared the flag too late so
> free_lang_data caused very much the same effect as the may_be_aliased flag.
> Here is updated patch, bootstrapped/regtested x86_64-linux. It also
> fixes the testcase though I am not quite sure how to add it to
> testsuite.

OK.

Richard.

> > 
> > Bootstrapped/regtested x86_64-linux, makes sense?
> > 
>   PR lto/88777
>   * cgraphunit.c (analyze_functions): Clear READONLY flag for external
>   types that need construction.
>   * tree.h (may_be_aliased): Do not check TYPE_NEEDS_CONSTRUCTING.
> Index: cgraphunit.c
> ===
> --- cgraphunit.c  (revision 268741)
> +++ cgraphunit.c  (working copy)
> @@ -1226,6 +1226,15 @@ analyze_functions (bool first_time)
> && node != first_handled_var; node = next)
>  {
>next = node->next;
> +  /* For symbols declared locally we clear TREE_READONLY when emitting
> +  the constructor (if one is needed).  For external declarations we can
> +  not safely assume that the type is readonly because we may be called
> +  during its construction.  */
> +  if (TREE_CODE (node->decl) == VAR_DECL
> +   && TYPE_P (TREE_TYPE (node->decl))
> +   && TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (node->decl))
> +   && DECL_EXTERNAL (node->decl))
> + TREE_READONLY (node->decl) = 0;
>if (!node->aux && !node->referred_to_p ())
>   {
> if (symtab->dump_file)
> Index: tree.h
> ===
> --- tree.h(revision 268741)
> +++ tree.h(working copy)
> @@ -5371,8 +5371,7 @@ may_be_aliased (const_tree var)
> || DECL_EXTERNAL (var)
> || TREE_ADDRESSABLE (var))
> && !((TREE_STATIC (var) || TREE_PUBLIC (var) || DECL_EXTERNAL (var))
> -&& ((TREE_READONLY (var)
> - && !TYPE_NEEDS_CONSTRUCTING (TREE_TYPE (var)))
> +&& (TREE_READONLY (var)
>  || (TREE_CODE (var) == VAR_DECL
>  && DECL_NONALIASED (var)))));
>  }
> 
> 
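
The relationship between the two quoted hunks can be illustrated with a deliberately simplified model in C.  This is not GCC code: struct var_model and the three functions are hypothetical, only the readonly term of may_be_aliased is modelled, and only external variables are considered.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few flags involved.  */
struct var_model
{
  bool readonly;            /* Models TREE_READONLY.  */
  bool needs_constructing;  /* Models TYPE_NEEDS_CONSTRUCTING.  */
  bool external;            /* Models DECL_EXTERNAL.  */
};

/* The readonly term before the patch: the type check is made every
   time the predicate is evaluated.  */
static bool
old_readonly_term (const struct var_model *v)
{
  return v->readonly && !v->needs_constructing;
}

/* The step added to analyze_functions: clear the flag once, up front,
   for external variables whose type needs constructing.  */
static void
model_analyze (struct var_model *v)
{
  if (v->needs_constructing && v->external)
    v->readonly = false;
}

/* The readonly term after the patch: the flag alone is enough.  */
static bool
new_readonly_term (const struct var_model *v)
{
  return v->readonly;
}
```

For external variables in this model, calling model_analyze first makes new_readonly_term agree with what old_readonly_term returned beforehand.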

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PR fortran/89286, patch] Intrinsic sign and GNU Extension for review

2019-02-12 Thread Mark Eggleston

For review.

The attached patch and change logs treat SIGN in the same way as 
DIM, MOD and MODULO with regard to the GNU extension, i.e. when -std=gnu is in effect.
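
To make the intended semantics concrete, here is a rough C analogue, assuming 32-bit and 64-bit integers stand in for two INTEGER kinds.  The name sign_mixed_kind is hypothetical and this is not gfortran code; it only mirrors the documented rule that the result has the larger kind and equals ABS(A) when B >= 0, and -ABS(A) otherwise.

```c
#include <stdint.h>

/* Hypothetical analogue of SIGN(A, B) with arguments of different
   kinds: A is 32-bit, B is 64-bit, the result uses the larger kind.  */
static int64_t
sign_mixed_kind (int32_t a, int64_t b)
{
  /* Convert A to the larger kind first, as the resolver does for the
     argument with the smaller kind.  */
  int64_t wide_a = a;
  int64_t abs_a = wide_a < 0 ? -wide_a : wide_a;

  return b >= 0 ? abs_a : -abs_a;
}
```

Widening before taking the absolute value also means the most negative 32-bit value has a representable magnitude in the result.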


The change logs have no dates; they can be added when the patch is 
committed, provided this patch is accepted.


Note: I do not have write access to svn.

regards,

Mark

--
https://www.codethink.co.uk/privacy.html

From f722d946230613894f7f91103494b0078319fe29 Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Thu, 31 Jan 2019 13:36:48 +
Subject: [PATCH 1/3] Intrinsic sign and GNU extension.

The intrinsic sign has the same parameters as other intrinsics such as
dim and mod. This support is part of the GNU extension enabled by using
-std=gnu (the default).
---
 gcc/fortran/check.c|  14 ---
 gcc/fortran/intrinsic.c|   2 +-
 gcc/fortran/intrinsic.texi |   6 +-
 gcc/fortran/iresolve.c |  13 +++
 gcc/fortran/simplify.c |   4 +-
 gcc/testsuite/gfortran.dg/pr78619.f90  |   2 +-
 gcc/testsuite/gfortran.dg/sign-gnu-extension_1.f90 | 103 +
 gcc/testsuite/gfortran.dg/sign-gnu-extension_2.f90 |  60 
 8 files changed, 185 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/sign-gnu-extension_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/sign-gnu-extension_2.f90

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index c60de6b5e4d..f2f6e9b6869 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -4484,20 +4484,6 @@ gfc_check_shift (gfc_expr *i, gfc_expr *shift)
   return true;
 }
 
-
-bool
-gfc_check_sign (gfc_expr *a, gfc_expr *b)
-{
-  if (!int_or_real_check (a, 0))
-return false;
-
-  if (!same_type_check (a, 0, b, 1))
-return false;
-
-  return true;
-}
-
-
 bool
 gfc_check_size (gfc_expr *array, gfc_expr *dim, gfc_expr *kind)
 {
diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index f8d3a69d46d..2fdf41c007d 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -2930,7 +2930,7 @@ add_functions (void)
   make_generic ("shiftr", GFC_ISYM_SHIFTR, GFC_STD_F2008);
 
   add_sym_2 ("sign", GFC_ISYM_SIGN, CLASS_ELEMENTAL, ACTUAL_YES, BT_REAL, dr, GFC_STD_F77,
-	 gfc_check_sign, gfc_simplify_sign, gfc_resolve_sign,
+	 gfc_check_a_p, gfc_simplify_sign, gfc_resolve_sign,
 	 a, BT_REAL, dr, REQUIRED, b, BT_REAL, dr, REQUIRED);
 
   add_sym_2 ("isign", GFC_ISYM_SIGN, CLASS_ELEMENTAL, ACTUAL_YES, BT_INTEGER, di, GFC_STD_F77,
diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 941c2e39374..97032994f20 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -12911,11 +12911,13 @@ Elemental function
 @item @emph{Arguments}:
 @multitable @columnfractions .15 .70
 @item @var{A} @tab Shall be of type @code{INTEGER} or @code{REAL}
-@item @var{B} @tab Shall be of the same type and kind as @var{A}
+@item @var{B} @tab Shall be of the same type and kind as @var{A}.  (As a GNU
+extension, arguments of different kinds are permitted.)
 @end multitable
 
 @item @emph{Return value}:
-The kind of the return value is that of @var{A} and @var{B}.
+The kind of the return value is that of @var{A} and @var{B}.  (As a GNU
+extension, kind is the largest kind of the actual arguments.)
 If @math{B\ge 0} then the result is @code{ABS(A)}, else
 it is @code{-ABS(A)}.
 
diff --git a/gcc/fortran/iresolve.c b/gcc/fortran/iresolve.c
index 135e6bc6920..77d074c8e3c 100644
--- a/gcc/fortran/iresolve.c
+++ b/gcc/fortran/iresolve.c
@@ -2576,6 +2576,19 @@ void
 gfc_resolve_sign (gfc_expr *f, gfc_expr *a, gfc_expr *b ATTRIBUTE_UNUSED)
 {
   f->ts = a->ts;
+  if (b != NULL)
+    {
+      f->ts.kind = gfc_kind_max (a,b);
+
+      if (a->ts.kind != b->ts.kind)
+	{
+	  if (a->ts.kind == f->ts.kind)
+	    gfc_convert_type (b, &f->ts, 2);
+	  else
+	    gfc_convert_type (a, &f->ts, 2);
+	}
+    }
+
   f->value.function.name
 = gfc_get_string ("__sign_%c%d", gfc_type_letter (a->ts.type), a->ts.kind);
 }
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 06b0b87d8eb..3b215b3d864 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -7372,11 +7372,13 @@ gfc_expr *
 gfc_simplify_sign (gfc_expr *x, gfc_expr *y)
 {
   gfc_expr *result;
+  int kind;
 
   if (x->expr_type != EXPR_CONSTANT || y->expr_type != EXPR_CONSTANT)
 return NULL;
 
-  result = gfc_get_constant_expr (x->ts.type, x->ts.kind, &x->where);
+  kind = x->ts.kind > y->ts.kind ? x->ts.kind : y->ts.kind;
+  result = gfc_get_constant_expr (x->ts.type, kind, &x->where);
 
   switch (x->ts.type)
 {
diff --git a/gcc/testsuite/gfortran.dg/pr78619.f90 b/gcc/testsuite/gfortran.dg/pr78619.f90
index 5fbe185cfab..8b8619fea64 100644
--- a/gcc/testsuite/gfortran.dg/pr78619.f90
+++ b/gcc/testsuite/gfortran.dg/pr78619.f90
@@ -10,7 +10,7 @@
 contains
   function f(x) result(z)
 real :: x, z
-z = sign(1.0, f) 

Re: [PATCH] Add target-zlib to top-level configure, use zlib from libphobos

2019-02-12 Thread Richard Biener
On Sat, Feb 9, 2019 at 10:37 AM Iain Buclaw  wrote:
>
> On Mon, 28 Jan 2019 at 13:10, Richard Biener  
> wrote:
> >
> > On Mon, Jan 21, 2019 at 7:35 PM Iain Buclaw  wrote:
> > >
> > > Hi,
> > >
> > > Following on from the last, this adds target-zlib to target_libraries
> > > and updates libphobos build scripts to link to libz_convenience.a.
> > > The D front-end already has target-zlib in d/config-lang.in.
> > >
> > > Is the top-level part OK?  I considered disabling target-zlib if
> > > libphobos is not being built, but decided against unless it's
> > > requested.
> >
> > Hmm, you overload --with-system-zlib to apply to both host and target
> > (I guess it already applied to build), not sure if that's really desired?
> > I suppose libphobos is the first target library linking against zlib?
> >
>
> Originally, libgcj linked to zlib.
>
> > You are also falling back to in-tree zlib if --with-system-zlib was
> > specified but no zlib was found - I guess for cross builds that
> > will easily get not noticed...  The toplevel --with-system-zlib makes
> > it much harder and simply fails.
> >
>
> OK, so keep --with-target-system-zlib to distinguish between the two?

Yes, and fail if specified but not found.

Richard.

> --
> Iain


[PATCH][libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc

2019-02-12 Thread Tom de Vries
Hi,

The call to bsearch in dwarf_lookup_pc can have NULL as base argument when
the nmemb argument is 0.  The base argument is required to be pointing to the
initial member of an array of nmemb objects.  It is not specified what
constitutes a valid pointer to an array of 0 objects, but glibc declares base
with attribute non-null, so the NULL will trigger a sanitizer runtime error.

Fix this by only calling bsearch if nmemb != 0.
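
As a standalone illustration of the guard, here is a minimal sketch in C.  struct range, range_search and lookup_range are hypothetical stand-ins rather than the libbacktrace types; only the bsearch call itself is the standard library function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an address-range table entry.  */
struct range
{
  unsigned long low;   /* First address in the range.  */
  unsigned long high;  /* One past the last address in the range.  */
};

/* Comparison callback for bsearch: KEY points to the address being
   looked up, ELEM to a candidate range.  */
static int
range_search (const void *key, const void *elem)
{
  unsigned long pc = *(const unsigned long *) key;
  const struct range *r = (const struct range *) elem;

  if (pc < r->low)
    return -1;
  if (pc >= r->high)
    return 1;
  return 0;
}

/* Look up PC in the sorted table RANGES of COUNT entries.  RANGES may
   be NULL when COUNT is 0, so bsearch is only called for a non-empty
   table.  */
static const struct range *
lookup_range (const struct range *ranges, size_t count, unsigned long pc)
{
  if (count == 0)
    return NULL;
  return bsearch (&pc, ranges, count, sizeof *ranges, range_search);
}
```

With this shape, an empty table gives the same "not found" result as before, and the NULL base pointer never reaches bsearch.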

OK for trunk?

Thanks,
- Tom

[libbacktrace] Handle bsearch with NULL base in dwarf_lookup_pc

2019-02-12  Tom de Vries  

PR libbacktrace/81983
* dwarf.c (dwarf_lookup_pc): Don't call bsearch if nmemb == 0.

---
 libbacktrace/dwarf.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index d7dacf3ef32..f338489fe44 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2821,8 +2821,10 @@ dwarf_lookup_pc (struct backtrace_state *state, struct dwarf_data *ddata,
   *found = 1;
 
   /* Find an address range that includes PC.  */
-  entry = bsearch (&pc, ddata->addrs, ddata->addrs_count,
-                   sizeof (struct unit_addrs), unit_addrs_search);
+  entry = (ddata->addrs_count == 0
+           ? NULL
+           : bsearch (&pc, ddata->addrs, ddata->addrs_count,
+                      sizeof (struct unit_addrs), unit_addrs_search));
 
   if (entry == NULL)
 {