Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-15 Thread Robin Dapp
> A C++ style nit/question: instead of adding a new overload 
> 
>   priority (rtx_insn *, bool)
> 
> you can add a parameter with a default value in the existing
> static function
> 
>   priority (rtx_insn *insn, bool force_recompute = false)

Sometimes I'm still stuck in C land with GCC :). Thanks, I will change this
if the rest of the patch is OK.

Regards
 Robin
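
For readers following along, the suggestion above can be sketched as below. This is only an illustration: the `rtx_insn` struct, its fields, and the placeholder value 42 are hypothetical stand-ins, not the scheduler's real data structures or priority computation.

```cpp
// Hypothetical stand-in for GCC's rtx_insn; only what the sketch needs.
struct rtx_insn
{
  int cached_priority = -1;  // -1 means "not computed yet"
  int recompute_count = 0;   // how often the priority was (re)computed
};

// One function with a defaulted flag instead of a second overload.
// Existing callers keep writing priority (insn); only the caller that
// needs a reset passes true.
static int
priority (rtx_insn *insn, bool force_recompute = false)
{
  if (force_recompute || insn->cached_priority < 0)
    {
      insn->cached_priority = 42;  // placeholder for the real computation
      ++insn->recompute_count;
    }
  return insn->cached_priority;
}
```

A call such as `priority (insn, true)` then forces a recomputation, while plain `priority (insn)` keeps returning the cached value.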



Re: [PATCH] __debug::list use C++11 direct initialization

2018-10-15 Thread François Dumont

On 10/15/2018 12:07 PM, Jonathan Wakely wrote:

On 09/10/18 07:11 +0200, François Dumont wrote:
Here is the message for yesterday's patch, which I thought svn had
failed to commit (I had to interrupt it).


Similarly to what I did for the associative containers, here is a
cleanup of the std::__debug::list implementation that makes more use of
C++11 direct initialization.


I also made sure we use consistent comparisons between
iterator/const_iterator in the erase and emplace methods.


2018-10-08  François Dumont 

    * include/debug/list (list<>::cbegin()): Use C++11 direct
    initialization.
    (list<>::cend()): Likewise.
    (list<>::emplace<>(const_iterator, _Args&&...)): Likewise.
    (list<>::insert(const_iterator, initializer_list<>)): Likewise.
    (list<>::insert(const_iterator, size_type, const _Tp&)): Likewise.
    (list<>::erase(const_iterator, const_iterator)): Ensure consistent
    iterator comparisons.
    (list<>::splice(const_iterator, list&&, const_iterator,
    const_iterator)): Likewise.

Tested under Linux x86_64 Debug mode and committed.

François



diff --git a/libstdc++-v3/include/debug/list b/libstdc++-v3/include/debug/list
index 8add1d596e0..879e1177497 100644
--- a/libstdc++-v3/include/debug/list
+++ b/libstdc++-v3/include/debug/list
@@ -244,11 +244,11 @@ namespace __debug
#if __cplusplus >= 201103L
  const_iterator
  cbegin() const noexcept
-  { return const_iterator(_Base::begin(), this); }
+  { return { _Base::begin(), this }; }

  const_iterator
  cend() const noexcept
-  { return const_iterator(_Base::end(), this); }
+  { return { _Base::end(), this }; }


For functions like emplace (which are C++11-only) and for forward_list
(also C++11-only) using this syntax makes it clearer.

But for these functions it just makes cbegin() and cend() look
different to the C++98 begin() and end() functions, for no obvious
benefit.

Simply using { return end(); } would have been another option.


Personally, I was tempted to write:

{ return { _Base::cbegin(), this }; }

because I prefer it when you can clearly see that the Debug implementation
forwards calls to the Normal implementation.


I thought that using C++11 direct initialization was more likely to let gcc
elide the copy constructor, so I used it everywhere possible.
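
The two spellings under discussion can be compared in a small sketch. The `safe_iterator` and `debug_list` types below are hypothetical simplifications, not the libstdc++ debug-mode classes. Strictly speaking, `return { ... };` copy-list-initializes the return value, so it compiles only when the selected constructor is not explicit. Both forms normally avoid a copy: returning a prvalue has long been eligible for copy elision, and that elision is guaranteed from C++17 on.

```cpp
#include <list>

// Hypothetical, simplified stand-in for a debug-mode safe iterator:
// it pairs a base iterator with a pointer to its owning container.
template<typename Iter, typename Seq>
struct safe_iterator
{
  Iter base;
  const Seq *owner;

  // Deliberately not explicit: `return { ... };` is
  // copy-list-initialization and would be rejected otherwise.
  safe_iterator (Iter i, const Seq *s) : base (i), owner (s) { }
};

struct debug_list : private std::list<int>
{
  typedef std::list<int> Base;
  typedef safe_iterator<Base::const_iterator, debug_list> const_iterator;

  using Base::push_back;

  // C++98 spelling: name the return type again.
  const_iterator
  cbegin_named () const noexcept
  { return const_iterator (Base::begin (), this); }

  // C++11 spelling: initialize the return value from a braced list.
  const_iterator
  cbegin_braced () const noexcept
  { return { Base::begin (), this }; }
};
```

Both members produce the same iterator; the difference is purely in how the return value is spelled.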




[PATCH v3 5/6] [MIPS] Add Loongson 3A2000/3A3000 processor support

2018-10-15 Thread Paul Hua

From 55047aa22e40de2637fbab4b5e246dfc4ca191f8 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Mon, 3 Sep 2018 19:45:15 +0800
Subject: [PATCH 5/6] Add support for Loongson 3A2000/3A3000 processor.

gcc/
	* config/mips/gs464e.md: New.
	* config/mips/mips-cpus.def: Define gs464e.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_rtx_cost_data): Add DEFAULT_COSTS for
	gs464e.
	(mips_issue_rate): Add support for gs464e.
	(mips_multipass_dfa_lookahead): Likewise.
	(mips_option_override): Enable MMI, EXT and EXT2 for gs464e.
	* config/mips/mips.h: Define TARGET_GS464E and TUNE_GS464E.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs464e.
	(ISA_HAS_FUSED_MADD4): Enable for TARGET_GS464E.
	(ISA_HAS_UNFUSED_MADD4): Exclude TARGET_GS464E.
	* config/mips/mips.md: Include gs464e.md.
	(processor): Add gs464e.
	* doc/invoke.texi: Add gs464e to supported architectures.
---
 gcc/config/mips/gs464e.md   | 137 
 gcc/config/mips/mips-cpus.def   |   1 +
 gcc/config/mips/mips-tables.opt |  19 +++---
 gcc/config/mips/mips.c  |  22 +--
 gcc/config/mips/mips.h  |  10 ++-
 gcc/config/mips/mips.md |   2 +
 gcc/doc/invoke.texi |   2 +-
 7 files changed, 176 insertions(+), 17 deletions(-)
 create mode 100644 gcc/config/mips/gs464e.md

diff --git a/gcc/config/mips/gs464e.md b/gcc/config/mips/gs464e.md
new file mode 100644
index 00000000000..60e0e6b0463
--- /dev/null
+++ b/gcc/config/mips/gs464e.md
@@ -0,0 +1,137 @@
+;; Pipeline model for Loongson gs464e cores.
+
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs464e_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs464e_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs464e_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs464e_alu1" "gs464e_a_alu")
+(define_cpu_unit "gs464e_alu2" "gs464e_a_alu")
+(define_cpu_unit "gs464e_mem1" "gs464e_a_mem")
+(define_cpu_unit "gs464e_mem2" "gs464e_a_mem")
+(define_cpu_unit "gs464e_falu1" "gs464e_a_falu")
+(define_cpu_unit "gs464e_falu2" "gs464e_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs464e_arith" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_branch" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_mfhilo" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs464e_alu1 | gs464e_alu2")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs464e_imul3nc" 5
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "imul3nc"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_imul" 7
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "imul,imadd"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_idiv_si" 12
+  (and (eq_attr "cpu" "gs464e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_idiv_di" 25
+  (and (eq_attr "cpu" "gs464e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_load" 4
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "load"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_fpload" 5
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_prefetch" 0
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_store" 0
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_fadd" 4
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "fadd,fmul,fmadd"))
+  "gs464e_falu1 | gs464e_falu2")
+
+(define_

[PATCH v3 6/6] [MIPS] Add Loongson 2K1000 processor support

2018-10-15 Thread Paul Hua

From 0df9c46bea628086ca2c4b5db24c28cec912d319 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Mon, 3 Sep 2018 20:01:54 +0800
Subject: [PATCH 6/6] Add support for Loongson 2K1000 processor.

gcc/
	* config/mips/gs264e.md: New.
	* config/mips/mips-cpus.def: Define gs264e.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_rtx_cost_data): Add DEFAULT_COSTS for
	gs264e.
	(mips_issue_rate): Add support for gs264e.
	(mips_multipass_dfa_lookahead): Likewise.
	(mips_option_override): Enable MMI, EXT, EXT2 and MSA for gs264e.
	* config/mips/mips.h: Define TARGET_GS264E and TUNE_GS264E.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs264e.
	(ISA_HAS_FUSED_MADD4): Enable for TARGET_GS264E.
	(ISA_HAS_UNFUSED_MADD4): Exclude TARGET_GS264E.
	* config/mips/mips.md: Include gs264e.md.
	(processor): Add gs264e.
	* config/mips/mips.opt (MSA): Use Mask instead of Var.
	* doc/invoke.texi: Add gs264e to supported architectures.
---
 gcc/config/mips/gs264e.md   | 133 
 gcc/config/mips/mips-cpus.def   |   1 +
 gcc/config/mips/mips-tables.opt |  19 +++---
 gcc/config/mips/mips.c  |  29 ++---
 gcc/config/mips/mips.h  |  12 ++--
 gcc/config/mips/mips.md |   2 +
 gcc/config/mips/mips.opt|   2 +-
 gcc/doc/invoke.texi |   1 +
 8 files changed, 178 insertions(+), 21 deletions(-)
 create mode 100644 gcc/config/mips/gs264e.md

diff --git a/gcc/config/mips/gs264e.md b/gcc/config/mips/gs264e.md
new file mode 100644
index 00000000000..8f1f9e17e08
--- /dev/null
+++ b/gcc/config/mips/gs264e.md
@@ -0,0 +1,133 @@
+;; Pipeline model for Loongson gs264e cores.
+
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs264e_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs264e_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs264e_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs264e_alu1" "gs264e_a_alu")
+(define_cpu_unit "gs264e_mem1" "gs264e_a_mem")
+(define_cpu_unit "gs264e_falu1" "gs264e_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs264e_arith" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_branch" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_mfhilo" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs264e_alu1")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs264e_imul3nc" 7
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "imul3nc"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_imul" 7
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "imul,imadd"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_idiv_si" 12
+  (and (eq_attr "cpu" "gs264e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_idiv_di" 25
+  (and (eq_attr "cpu" "gs264e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_load" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "load"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_fpload" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_prefetch" 0
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_store" 0
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_fadd" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "fadd,fmul,fmadd"))
+  "gs264e_falu1")
+
+(define_insn_reservation "gs264e_fcmp" 2
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "fabs,fcmp,fmove,fneg"))
+  "gs264e_falu1")
+
+(define_insn_reservation "gs264e_fcvt" 4
+  (and (eq_attr "cpu" "gs264e")
+ 

[PATCH v3 3/6] [MIPS] Add Loongson EXTensions R2 (EXT2) instructions support

2018-10-15 Thread Paul Hua

From 14eabf990f187631cacd47e02342941ddb1b04a0 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Fri, 31 Aug 2018 11:55:48 +0800
Subject: [PATCH 3/6] Add support for Loongson EXT2 instructions.

gcc/
	* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Define
	__mips_loongson_ext2, __mips_loongson_ext_rev=2.
	(ISA_HAS_CTZ_CTO): New, true if TARGET_LOONGSON_EXT2.
	(ASM_SPEC): Add mloongson-ext2 and mno-loongson-ext2.
	* config/mips/mips.md: Add ctz to "define_attr "type"".
	(define_insn "ctz<mode>2"): New insn pattern.
	(define_insn "prefetch"): Include TARGET_LOONGSON_EXT2.
	* config/mips/mips.opt (-mloongson-ext2): Add option.
	* gcc/doc/invoke.texi (-mloongson-ext2): Document.

gcc/testsuite/
	* gcc.target/mips/loongson-ctz.c: New test.
	* gcc.target/mips/loongson-dctz.c: Likewise.
	* gcc.target/mips/mips.exp (mips_option_groups): Add
	-mloongson-ext2 option.
---
 gcc/config/mips/mips.h| 12 +++
 gcc/config/mips/mips.md   | 31 ++-
 gcc/config/mips/mips.opt  |  4 
 gcc/doc/invoke.texi   |  7 ++
 gcc/testsuite/gcc.target/mips/loongson-ctz.c  | 11 ++
 gcc/testsuite/gcc.target/mips/loongson-dctz.c | 11 ++
 gcc/testsuite/gcc.target/mips/mips.exp|  1 +
 7 files changed, 72 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/loongson-ctz.c
 create mode 100644 gcc/testsuite/gcc.target/mips/loongson-dctz.c

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index e0e78ba610e..b75646d66ce 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -600,8 +600,16 @@ struct mips_cpu_info {
   if (TARGET_LOONGSON_EXT)		\
 	{\
 	  builtin_define ("__mips_loongson_ext");			\
+	  if (TARGET_LOONGSON_EXT2)	\
+	{\
+	  builtin_define ("__mips_loongson_ext2");			\
+	  builtin_define ("__mips_loongson_ext_rev=2");		\
+	}\
+	  else\
+	  builtin_define ("__mips_loongson_ext_rev=1");		\
 	}\
 	\
+	\
   /* Historical Octeon macro.  */	\
   if (TARGET_OCTEON)		\
 	builtin_define ("__OCTEON__");	\
@@ -1117,6 +1125,9 @@ struct mips_cpu_info {
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO		(mips_isa_rev >= 1 && !TARGET_MIPS16)
 
+/* ISA has count trailing zeroes/ones instruction (not implemented).  */
+#define ISA_HAS_CTZ_CTO		(TARGET_LOONGSON_EXT2)
+
 /* ISA has three operand multiply instructions that put
the high part in an accumulator: mulhi or mulhiu.  */
 #define ISA_HAS_MULHI		((TARGET_MIPS5400			 \
@@ -1362,6 +1373,7 @@ struct mips_cpu_info {
 %{mmsa} %{mno-msa} \
 %{mloongson-mmi} %{mno-loongson-mmi} \
 %{mloongson-ext} %{mno-loongson-ext} \
+%{mloongson-ext2} %{mno-loongson-ext2} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 4b7a627b7a6..c8128d4d530 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -335,6 +335,7 @@
 ;; slt		set less than instructions
 ;; signext  sign extend instructions
 ;; clz		the clz and clo instructions
+;; ctz		the ctz and cto instructions
 ;; pop		the pop instruction
 ;; trap		trap if instructions
 ;; imul		integer multiply 2 operands
@@ -375,7 +376,7 @@
 (define_attr "type"
   "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical,
-   shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
+   shift,slt,signext,clz,ctz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
frsqrt,frsqrt1,frsqrt2,dspmac,dspmacsat,accext,accmod,dspalu,dspalusat,
multi,atomic,syncloop,nop,ghost,multimem,
@@ -3149,6 +3150,23 @@
 ;;
 ;;  ...
 ;;
+;;  Count trailing zeroes.
+;;
+;;  ...
+;;
+
+(define_insn "ctz<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=d")
+	(ctz:GPR (match_operand:GPR 1 "register_operand" "d")))]
+  "ISA_HAS_CTZ_CTO"
+  "<d>ctz\t%0,%1"
+  [(set_attr "type" "ctz")
+   (set_attr "mode" "<MODE>")])
+
+
+;;
+;;  ...
+;;
 ;;  Count number of set bits.
 ;;
 ;;  ...
@@ -7136,13 +7154,16 @@
 	 (match_operand 2 "const_int_operand" "n"))]
   "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
 {
-  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
+  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || TARGET_LOONGSON_EXT2)
 {
-  /* Loongson 2[ef] and Loongson 3a use load to $0 for prefetching.  */
+  /* Loongson EXT2 implements the pref instruction.  */
+  if (TARGET_LOONGSON_EXT2)
+	return "pref\t%1, %a0";
+  /* Loongson 2[ef] and Loongson ext use load to $0 for prefetching.  */
   if (TARGET_64BIT)
-return "ld\t$0,%a0";
+	return "ld\t$0,%a0";
   

[PATCH v3 4/6] [MIPS] Add Loongson 3A1000 processor support

2018-10-15 Thread Paul Hua

From ce950df0f918eb02d15c4287d21e3aecb43bf351 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Fri, 31 Aug 2018 14:08:01 +0800
Subject: [PATCH 4/6] Add support for Loongson 3A1000 processor.

gcc/
	* config/mips/loongson3a.md: Rename to ...
	* config/mips/gs464.md: ... here.
	* config/mips/mips-cpus.def: Define gs464; Add loongson3a
	as an alias of gs464 processor.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_issue_rate): Use PROCESSOR_GS464
	instead of PROCESSOR_LOONGSON_3A.
	(mips_multipass_dfa_lookahead): Use TUNE_GS464 instead of
	TUNE_LOONGSON_3A.
	(mips_option_override): Enable MMI and EXT for gs464.
	* config/mips/mips.h: Rename TARGET_LOONGSON_3A to TARGET_GS464;
	Rename TUNE_LOONGSON_3A to TUNE_GS464.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs464.
	(ISA_HAS_ODD_SPREG, ISA_AVOID_DIV_HILO, ISA_HAS_FUSED_MADD4,
	ISA_HAS_UNFUSED_MADD4): Use TARGET_GS464 instead of
	TARGET_LOONGSON_3A.
	* config/mips/mips.md: Include gs464.md instead of loongson3a.md.
	(processor): Add gs464.
	* doc/invoke.texi: Add gs464 to supported architectures.
---
 gcc/config/mips/gs464.md| 137 
 gcc/config/mips/loongson3a.md   | 137 
 gcc/config/mips/mips-cpus.def   |   3 +-
 gcc/config/mips/mips-tables.opt |  19 +++---
 gcc/config/mips/mips.c  |  16 +++--
 gcc/config/mips/mips.h  |  15 +++--
 gcc/config/mips/mips.md |   4 +-
 gcc/doc/invoke.texi |   2 +-
 8 files changed, 170 insertions(+), 163 deletions(-)
 create mode 100644 gcc/config/mips/gs464.md
 delete mode 100644 gcc/config/mips/loongson3a.md

diff --git a/gcc/config/mips/gs464.md b/gcc/config/mips/gs464.md
new file mode 100644
index 00000000000..82efb66786f
--- /dev/null
+++ b/gcc/config/mips/gs464.md
@@ -0,0 +1,137 @@
+;; Pipeline model for Loongson gs464 cores.
+
+;; Copyright (C) 2011-2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs464_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs464_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs464_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs464_alu1" "gs464_a_alu")
+(define_cpu_unit "gs464_alu2" "gs464_a_alu")
+(define_cpu_unit "gs464_mem" "gs464_a_mem")
+(define_cpu_unit "gs464_falu1" "gs464_a_falu")
+(define_cpu_unit "gs464_falu2" "gs464_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs464_arith" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs464_alu1 | gs464_alu2")
+
+(define_insn_reservation "gs464_branch" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs464_alu1")
+
+(define_insn_reservation "gs464_mfhilo" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs464_alu2")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs464_imul3nc" 5
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "imul3nc"))
+  "gs464_alu2")
+
+(define_insn_reservation "gs464_imul" 7
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "imul,imadd"))
+  "gs464_alu2 * 7")
+
+(define_insn_reservation "gs464_idiv_si" 12
+  (and (eq_attr "cpu" "gs464")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs464_alu2 * 12")
+
+(define_insn_reservation "gs464_idiv_di" 25
+  (and (eq_attr "cpu" "gs464")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs464_alu2 * 25")
+
+(define_insn_reservation "gs464_load" 3
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "load"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_fpload" 4
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_prefetch" 0
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_store" 0
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs464_mem")
+
+;; All the fp operations can be executed in FALU1.  Only fp

[PATCH v3 1/6] [MIPS] Split Loongson (MMI) from loongson3a

2018-10-15 Thread Paul Hua

From e9d36eb4d4a841486ac82037497a2671481f8a27 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Sun, 14 Oct 2018 11:11:00 +0800
Subject: [PATCH 1/6] Add support for loongson mmi instructions.

gcc/
* config.gcc (extra_headers): Add loongson-mmiintrin.h.
* config/mips/loongson.md: Move to ...
* config/mips/loongson-mmi.md: ... here; Adjust.
* config/mips/loongson.h: Move to ...
* config/mips/loongson-mmiintrin.h: ... here.
* config/mips/loongson.h: Mark as deprecated.  Include
loongson-mmiintrin.h for backward compatibility and warn.
* config/mips/mips.c (mips_hard_regno_mode_ok_uncached,
mips_vector_mode_supported_p, AVAIL_NON_MIPS16): Use
TARGET_LOONGSON_MMI instead of TARGET_LOONGSON_VECTORS.
(mips_option_override): Make sure MMI uses hard float; enable MMI
by default on Loongson 2e/2f/3a.
(mips_shift_truncation_mask, mips_expand_vpc_loongson_even_odd,
mips_expand_vpc_loongson_pshufh, mips_expand_vpc_loongson_bcast,
mips_expand_vector_init): Use TARGET_LOONGSON_MMI instead of
TARGET_LOONGSON_VECTORS.
* gcc/config/mips/mips.h (TARGET_LOONGSON_VECTORS): Delete.
(TARGET_CPU_CPP_BUILTINS): Add __mips_loongson_mmi.
(SHIFT_COUNT_TRUNCATED): Use TARGET_LOONGSON_MMI instead of
TARGET_LOONGSON_VECTORS.
* gcc/config/mips/mips.md (MOVE64, MOVE128): Use
TARGET_LOONGSON_MMI instead of TARGET_LOONGSON_VECTORS.
(Loongson MMI patterns): Include loongson-mmi.md instead of
loongson.md.
* gcc/config/mips/mips.opt (-mloongson-mmi): New option.
* gcc/doc/invoke.texi (-mloongson-mmi): Document.

gcc/testsuite/
* gcc.target/mips/loongson-shift-count-truncated-1.c
(dg-options): Run under -mloongson-mmi option.
Include loongson-mmiintrin.h instead of loongson.h.
* gcc.target/mips/loongson-simd.c: Likewise.
* gcc.target/mips/mips.exp (mips_option_groups): Add
-mloongson-mmi option.
(mips-dg-init): Add -mloongson-mmi option.
* gcc.target/mips/umips-store16-1.c (dg-options): Add
forbid_cpu=loongson3a.
* lib/target-supports.exp: Rename check_mips_loongson_hw_available
to check_mips_loongson_mmi_hw_available.
Rename check_effective_target_mips_loongson_runtime to
check_effective_target_mips_loongson_mmi_runtime.
(check_effective_target_vect_int): Use mips_loongson_mmi instead
of mips_loongson when checking et-is-effective-target.
(add_options_for_mips_loongson_mmi): New proc.
Rename check_effective_target_mips_loongson to
check_effective_target_mips_loongson_mmi.
(check_effective_target_vect_shift,
check_effective_target_whole_vector_shift,
check_effective_target_vect_no_int_min_max,
check_effective_target_vect_no_align,
check_effective_target_vect_short_mult,
check_vect_support_and_set_flags): Use mips_loongson_mmi instead
of mips_loongson when checking et-is-effective-target.
---
 gcc/config.gcc |   2 +-
 gcc/config/mips/{loongson.md => loongson-mmi.md}   | 241 ---
 gcc/config/mips/loongson-mmiintrin.h   | 691 +
 gcc/config/mips/loongson.h | 669 +---
 gcc/config/mips/mips.c |  34 +-
 gcc/config/mips/mips.h |  21 +-
 gcc/config/mips/mips.md|  16 +-
 gcc/config/mips/mips.opt   |   4 +
 gcc/doc/invoke.texi|   7 +
 .../mips/loongson-shift-count-truncated-1.c|   6 +-
 gcc/testsuite/gcc.target/mips/loongson-simd.c  |   4 +-
 gcc/testsuite/gcc.target/mips/mips.exp |   7 +
 gcc/testsuite/gcc.target/mips/umips-store16-1.c|   2 +-
 gcc/testsuite/lib/target-supports.exp  |  47 +-
 14 files changed, 913 insertions(+), 838 deletions(-)
 rename gcc/config/mips/{loongson.md => loongson-mmi.md} (79%)
 create mode 100644 gcc/config/mips/loongson-mmiintrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8521f7d556e..7871700db13 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -441,7 +441,7 @@ microblaze*-*-*)
 ;;
 mips*-*-*)
 	cpu_type=mips
-	extra_headers="loongson.h msa.h"
+	extra_headers="loongson.h loongson-mmiintrin.h msa.h"
 	extra_objs="frame-header-opt.o"
 	extra_options="${extra_options} g.opt fused-madd.opt mips/mips-tables.opt"
 	;;
diff --git a/gcc/config/mips/loongson.md b/gcc/config/mips/loongson-mmi.md
similarity index 79%
rename from gcc/config/mips/loongson.md
rename to gcc/config/mips/loongson-mmi.md
index 14794d3671f..ad23f676581 100644
--- a/gcc/config/mips/loongson.md
+++ b/gcc/config/mips/loongson-mmi.md
@@ -1,5 +1,4 @@
-;; Machine description for Loongson-specific patterns, such as
-;; ST Microelectronics Loongson-2E/2F etc.
+;; Machine description for Loong

[PATCH v3 2/6] [MIPS] Split Loongson EXTensions (EXT) instructions from loongson3a

2018-10-15 Thread Paul Hua

From 2e053c832497892c6b8b1b685aaf871d8fc4da76 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Fri, 31 Aug 2018 11:52:33 +0800
Subject: [PATCH 2/6] Add support for Loongson EXT instructions.

gcc/
	* config/mips/mips.c (mips_option_override): Default enable
	Loongson EXT on Loongson 3a target.
	* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Add
	__mips_loongson_ext.
	(ASM_SPEC): Add mloongson-ext and mno-loongson-ext.
	* config/mips/mips.md (mul<mode>3, mul<mode>3_mul3_nohilo,
	<u>div<mode>3, <u>mod<mode>3, prefetch): Use TARGET_LOONGSON_EXT
	instead of TARGET_LOONGSON_3A.
	* config/mips/mips.opt (-mloongson-ext): Add option.
	* gcc/doc/invoke.texi (-mloongson-ext): Document.

gcc/testsuite/
	* gcc.target/mips/mips.exp (mips_option_groups): Add
	-mloongson-ext option.
---
 gcc/config/mips/mips.c |  5 +
 gcc/config/mips/mips.h |  7 +++
 gcc/config/mips/mips.md| 16 
 gcc/config/mips/mips.opt   |  4 
 gcc/doc/invoke.texi|  7 +++
 gcc/testsuite/gcc.target/mips/mips.exp |  1 +
 6 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index a804f7030db..019a6dca752 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -20178,6 +20178,11 @@ mips_option_override (void)
 	  || (strcmp (mips_arch_info->name, "loongson3a") == 0)))
 target_flags |= MASK_LOONGSON_MMI;
 
+  /* Default to enable Loongson EXT on Loongson 3a target.  */
+  if ((target_flags_explicit & MASK_LOONGSON_EXT) == 0
+  && (strcmp (mips_arch_info->name, "loongson3a") == 0))
+target_flags |= MASK_LOONGSON_EXT;
+
   /* .eh_frame addresses should be the same width as a C pointer.
  Most MIPS ABIs support only one pointer size, so the assembler
  will usually know exactly how big an .eh_frame address is.
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 3563c1d78fe..e0e78ba610e 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -596,6 +596,12 @@ struct mips_cpu_info {
 	  builtin_define ("__mips_loongson_mmi");			\
 	}\
 	\
+  /* Whether Loongson EXT modes are enabled.  */			\
+  if (TARGET_LOONGSON_EXT)		\
+	{\
+	  builtin_define ("__mips_loongson_ext");			\
+	}\
+	\
   /* Historical Octeon macro.  */	\
   if (TARGET_OCTEON)		\
 	builtin_define ("__OCTEON__");	\
@@ -1355,6 +1361,7 @@ struct mips_cpu_info {
 %{mginv} %{mno-ginv} \
 %{mmsa} %{mno-msa} \
 %{mloongson-mmi} %{mno-loongson-mmi} \
+%{mloongson-ext} %{mno-loongson-ext} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index a88c1c53134..4b7a627b7a6 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -1599,7 +1599,7 @@
 {
   rtx lo;
 
-  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6MUL)
+  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6MUL)
 emit_insn (gen_mul3_mul3_nohilo (operands[0], operands[1],
 	   operands[2]));
   else if (ISA_HAS_MUL3)
@@ -1622,11 +1622,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=d")
 (mult:GPR (match_operand:GPR 1 "register_operand" "d")
   (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6MUL"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6MUL"
 {
   if (TARGET_LOONGSON_2EF)
 return "multu.g\t%0,%1,%2";
-  else if (TARGET_LOONGSON_3A)
+  else if (TARGET_LOONGSON_EXT)
 return "gsmultu\t%0,%1,%2";
   else
 return "mul\t%0,%1,%2";
@@ -3016,11 +3016,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=&d")
 	(any_div:GPR (match_operand:GPR 1 "register_operand" "d")
 		 (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6DIV"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6DIV"
   {
 if (TARGET_LOONGSON_2EF)
   return mips_output_division ("div.g\t%0,%1,%2", operands);
-else if (TARGET_LOONGSON_3A)
+else if (TARGET_LOONGSON_EXT)
   return mips_output_division ("gsdiv\t%0,%1,%2", operands);
 else
   return mips_output_division ("div\t%0,%1,%2", operands);
@@ -3032,11 +3032,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=&d")
 	(any_mod:GPR (match_operand:GPR 1 "register_operand" "d")
 		 (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6DIV"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6DIV"
   {
 if (TARGET_LOONGSON_2EF)
   return mips_output_division ("mod.g\t%0,%1,%2", operands);
-else if (TARGET_LOONGSON_3A)
+else if (TARGET_LOONGSON_EXT)
   return mips_output_division ("gsmod\t%0,%1,%2", operands);
 else
   return mips_output_division ("mod\t%0,%1,%2", operands);
@@ -7136,7 +7136,7 @@
 	 (match_operand 2 "co

[PATCH v3 0/6] [MIPS] Reorganize the loongson march and extensions instructions set

2018-10-15 Thread Paul Hua
Hi:

The original version of these patches is here:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00099.html

This is an updated version.  Please review, thanks.

This patch series reorganizes the Loongson -march=xxx options and the
Loongson extension instruction sets.  For a long time, all of the
Loongson extension instruction sets have been tied to the
-march=loongson3a option, so none of them could be disabled individually.

Patch (1) splits the Loongson MultiMedia extensions Instructions (MMI)
from loongson3a and adds the -mloongson-mmi/-mno-loongson-mmi options to
enable/disable them.

The patch (2) split Loongson EXTensions (EXT) instructions from
loongson3a, add -mloongson-ext/-mno-loongson-ext option for
enable/disable them.

The patch (3) add Loongson EXTensions R2 (EXT2) instructions support,
add -mloongson-ext2/-mno-loongson-ext2 option for enable/disable them.

The patch (4) add Loongson 3A1000 processor support.  The gs464 is a
codename of 3A1000 microarchitecture.  Rename -march=loongson3a to
-march=gs464, Keep -march=loongson3a as an alias of -march=gs464 for
compatibility.

The patch (5) add Loongson 3A2000/3A3000 processor support.  Include
Loongson MMI, EXT, EXT2 instructions set.

The patch (6) add Loongson 2K1000 processor support. Include Loongson
MMI, EXT, EXT2 and MSA instructions set.

The binutils patch has been upstreamed.

There are six patches in this set, as follows.
1) 0001-MIPS-Add-support-for-loongson-mmi-instructions.patch
2) 0002-MIPS-Add-support-for-Loongson-EXT-istructions.patch
3) 0003-MIPS-Add-support-for-Loongson-EXT2-istructions.patch
4) 0004-MIPS-Add-support-for-Loongson-3A1000-proccessor.patch
5) 0005-MIPS-Add-support-for-Loongson-3A2000-3A3000-proccess.patch
6) 0006-MIPS-Add-support-for-Loongson-2K1000-proccessor.patch

All patches were tested on mips64el-linux-gnu with no new regressions.

OK to commit?

Thanks,
Paul Hua


Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-15 Thread Peter Bergner
On 10/11/18 10:40 PM, Jeff Law wrote:
> On 10/11/18 1:23 PM, Peter Bergner wrote:
>>  * ira-lives (non_conflicting_reg_copy_p): Disable for non LRA targets.
> So this helped the alpha & hppa and sh4.
> 
> I'm still seeing failures on the aarch64, s390x.  No surprise on these
> since they use LRA by default and would be unaffected by this patch.

Ok, I was able to reduce the aarch64 test case down to the minimum test case
that still kept the kernel's __cmpxchg_double() function intact.  I tested
the patch you're currently running on your builders which changed some
of the "... == OP_OUT" to "... != OP_IN", etc and it doesn't fix the
following test case, like it seems to fix the s390 issue and segher's
small test case (both aarch64 and ppc64).

It's late here, so I'll start digging into this one in the morning.

Peter



bergner@pike:~/gcc/BUGS/PR87507/$ cat slub-min.c
long
__cmpxchg_double (unsigned long arg)
{
  unsigned long old1 = 0;
  unsigned long old2 = arg;
  unsigned long new1 = 0;
  unsigned long new2 = 0;
  volatile void *ptr = 0;

  unsigned long oldval1 = old1;
  unsigned long oldval2 = old2;
  register unsigned long x0 asm ("x0") = old1;
  register unsigned long x1 asm ("x1") = old2;
  register unsigned long x2 asm ("x2") = new1;
  register unsigned long x3 asm ("x3") = new2;
  register unsigned long x4 asm ("x4") = (unsigned long) ptr;
  asm volatile ("   casp %[old1], %[old2], %[new1], %[new2], %[v]\n"
"   eor %[old1], %[old1], %[oldval1]\n"
"   eor %[old2], %[old2], %[oldval2]\n"
"   orr %[old1], %[old1], %[old2]\n"
: [old1] "+&r" (x0), [old2] "+&r" (x1), [v] "+Q" (* (unsigned long *) ptr)
: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), [oldval1] "r" (oldval1),[oldval2] "r" (oldval2)
: "x16", "x17", "x30");
  return x0;
}
bergner@pike:~/gcc/BUGS/PR87507/$ /home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-aarch64-r264897/gcc -O2 -march=armv8.1-a -c slub-min.c
/tmp/ccQCkiSG.s: Assembler messages:
/tmp/ccQCkiSG.s:24: Error: reg pair must be contiguous at operand 2 -- `casp x0,x6,x2,x3,[x5]'





[committed] Work around ft32 port issue

2018-10-15 Thread Jeff Law


I sent mail to James about a month ago, but never heard back.  So...

This patch works around a problem in the ft32 port that shows up when
building newlib.

 The assembler was complaining about a line like this:

ldi.b  $r1,_ctype_-0x80+1($r0)

That's certainly an odd looking address computation.

It corresponds to this in the .final dump:



(insn 8 7 9 (set (reg:QI 3 $r1 [orig:53 *_3 ] [53])
(mem:QI (plus:SI (reg/v:SI 2 $r0 [orig:49 c ] [49])
(const:SI (plus:SI (symbol_ref:SI ("_ctype_") [flags
0x1040]  )
(const_int 1 [0x1] [0 *_3+0 S1 A8])) "j.c":7
31 {*movqi}
 (expr_list:REG_EQUIV (mem:QI (plus:SI (reg/v:SI 2 $r0 [orig:49 c ]
[49])
(const:SI (plus:SI (symbol_ref:SI ("_ctype_") [flags
0x1040]  )
(const_int 1 [0x1] [0 *_3+0 S1 A8])
(nil)))


Presumably the -0x800+1 is to deal with some oddity in the ft32
port, but there are no comments where this happens:

#define ASM_OUTPUT_SYMBOL_REF(stream, sym) \
  do { \
assemble_name (stream, XSTR (sym, 0)); \
int section_debug = in_section && \
  (SECTION_STYLE (in_section) == SECTION_NAMED) && \
  (in_section->named.common.flags & SECTION_DEBUG); \
if (!section_debug && SYMBOL_REF_FLAGS (sym) & 0x1000) \
  asm_fprintf (stream, "-0x80"); \
  } while (0)


Essentially it's some section encoding and ultimately I don't think it's
terribly important.  The assembler simply won't accept the insn in question.

Arguably it could/should, since it's really just a reg+disp address
where the disp isn't known until link time.  If I simplify the line to:

ldi.b $r1,_ctype($r0)

The assembler still complains.  After a bit more poking around I think
we're just formatting the address totally wrong.

AFAICT the ft32 wants:

ldi.b $r1,$r0,

So it would seem that just fixing ft32_print_operand_address to emit it
in that format would be sufficient.  However, I'm not familiar enough
with this port to know if there are other contexts where it wants the
(register) style formatting.

Can you take a look and take appropriate action?

Testcase, compile with -O2.

extern const char _ctype_[];
int
tolower (int c)
{
  return
(((((_ctype_) + sizeof (""[c]))[(int) (c)]) & (01 | 02)) == 01)
? (c) - 'A' + 'a' : c;
}


My patch is a hack, plain and simple.  It disables addressing modes such
as reg + sym +- const_int.  I doubt it matters in any significant way.
With the maintainer unresponsive, this patch seems like a reasonable
minimal effort workaround.

Jeff
commit 73262eafbf11463163d9cd805648ee2d704a677a
Author: law 
Date:   Mon Oct 15 23:22:05 2018 +

* config/ft32/ft32.md (ft32_general_movsrc_operand): Disable
reg + sym +- const_int addressing modes.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@265179 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 789e43b2388..0f4e293d06f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-10-12  Jeff Law  
+
+   * config/ft32/ft32.md (ft32_general_movsrc_operand): Disable
+   reg + sym +- const_int addressing modes.
+
 2018-10-15  David Malcolm  
 
* common.opt (fdiagnostics-minimum-margin-width=): New option.
diff --git a/gcc/config/ft32/predicates.md b/gcc/config/ft32/predicates.md
index bac2e8ef5aa..0c147ec1aab 100644
--- a/gcc/config/ft32/predicates.md
+++ b/gcc/config/ft32/predicates.md
@@ -23,6 +23,11 @@
 ;; -
 
 ;; Nonzero if OP can be source of a simple move operation.
+;;
+;; The CONST_INT could really be CONST if we were to fix
+;; ft32_print_operand_address to format the address correctly.
+;; It might require assembler/linker work as well to ensure
+;; the right relocation is emitted.
 
 (define_predicate "ft32_general_movsrc_operand"
   (match_code "mem,const_int,reg,subreg,symbol_ref,label_ref,const")
@@ -34,7 +39,7 @@
   if (MEM_P (op)
   && GET_CODE (XEXP (op, 0)) == PLUS
   && GET_CODE (XEXP (XEXP (op, 0), 0)) == REG
-  && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST)
+  && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST_INT)
 return 1;
 
   return general_operand (op, mode);


[committed] diagnostics: add minimum width to left margin for line numbers

2018-10-15 Thread David Malcolm
This patch adds a minimum width to the left margin used for printing
line numbers.   I set the default to 6.  Hence rather than:

some-filename:9:1: some message
9 | some source text
  | ^~~~
some-filename:10:1: another message
10 | more source text
   | ^~~~

we now print:

some-filename:9:42: some message
    9 | some source text
      | ^~~~
some-filename:10:42: another message
   10 | more source text
      | ^~~~

This implicitly fixes issues with margins failing to line up due
to different lengths of the number when we haven't read the full
file yet and so don't know the highest possible line number, for
line numbers up to 99999.
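
As a rough illustration of the padding rule described above, here is a minimal C++ sketch.  It is not the code from diagnostic-show-locus.c; the helper names (num_digits, margin_for) are hypothetical, and it assumes that the minimum margin width counts the line number plus the single space that follows it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Number of decimal digits needed to print a positive line number.
static int num_digits(int n) {
  int digits = 1;
  while (n >= 10) {
    n /= 10;
    ++digits;
  }
  return digits;
}

// Hypothetical helper (not GCC's implementation): build the left margin for
// one source line.  ASSUMPTION: min_margin_width covers the number plus the
// space after it, so the number column is at least min_margin_width - 1 wide.
static std::string margin_for(int line, int highest_known_line,
                              int min_margin_width) {
  assert(line > 0 && highest_known_line > 0);
  int width = std::max(num_digits(highest_known_line), min_margin_width - 1);
  std::string num = std::to_string(line);
  int pad = std::max(0, width - static_cast<int>(num.size()));
  return std::string(pad, ' ') + num + " | ";
}
```

With a minimum of 6, one-digit and two-digit line numbers end up in the same column even if the highest line number is not yet known; with a minimum of 0 the margin falls back to the width of the largest known number.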

Doing so adds some whitespace on the left-hand side, for non-huge
files, at least.  I believe that this makes it easier to see where each
diagnostic starts, by visually breaking things up at the leftmost
column; my hope is to make it easier for the eye to see the different
diagnostics as if they were different "paragraphs".

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Committed to trunk as r265178.

gcc/ChangeLog:
* common.opt (fdiagnostics-minimum-margin-width=): New option.
* diagnostic-show-locus.c (layout::layout): Apply the minimum
margin width.
(layout::start_annotation_line): Only print up to 3 of the
margin character, to avoid touching the left-hand side.
(selftest::test_diagnostic_show_locus_fixit_lines): Update for
minimum margin width, as set by test_diagnostic_context's ctor.
(selftest::test_fixit_insert_containing_newline): Likewise.
(selftest::test_fixit_insert_containing_newline_2): Likewise.
(selftest::test_line_numbers_multiline_range): Clear
dc.min_margin_width.
* diagnostic.c (diagnostic_initialize): Initialize
min_margin_width.
* diagnostic.h (struct diagnostic_context): Add field
"min_margin_width".
* doc/invoke.texi: Add -fdiagnostics-minimum-margin-width=.
* opts.c (common_handle_option): Handle
OPT_fdiagnostics_minimum_margin_width_.
* selftest-diagnostic.c
(selftest::test_diagnostic_context::test_diagnostic_context):
Initialize min_margin_width to 6.
* toplev.c (general_init): Initialize global_dc->min_margin_width.

gcc/testsuite/ChangeLog:
* gcc.dg/missing-header-fixit-3.c: Update expected indentation
to reflect minimum margin width.
* gcc.dg/missing-header-fixit-4.c: Likewise.
* gcc.dg/plugin/diagnostic-test-show-locus-bw-line-numbers.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-show-locus-color-line-numbers.c:
Likewise.
* gcc.dg/plugin/diagnostic-test-show-locus-bw-line-numbers-2.c:
New test.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add it.
---
 gcc/common.opt |  4 ++
 gcc/diagnostic-show-locus.c| 41 +--
 gcc/diagnostic.c   |  1 +
 gcc/diagnostic.h   |  4 ++
 gcc/doc/invoke.texi|  6 +++
 gcc/opts.c |  4 ++
 gcc/selftest-diagnostic.c  |  1 +
 gcc/testsuite/gcc.dg/missing-header-fixit-3.c  |  8 +--
 gcc/testsuite/gcc.dg/missing-header-fixit-4.c  | 10 ++--
 .../diagnostic-test-show-locus-bw-line-numbers-2.c | 22 
 .../diagnostic-test-show-locus-bw-line-numbers.c   | 58 +++---
 ...diagnostic-test-show-locus-color-line-numbers.c | 12 ++---
 gcc/testsuite/gcc.dg/plugin/plugin.exp |  1 +
 gcc/toplev.c   |  2 +
 14 files changed, 114 insertions(+), 60 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw-line-numbers-2.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 53aac19..2971dc2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1281,6 +1281,10 @@ fdiagnostics-show-option
 Common Var(flag_diagnostics_show_option) Init(1)
 Amend appropriate diagnostic messages with the command line option that controls them.
 
+fdiagnostics-minimum-margin-width=
+Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
+Set minimum width of left margin of source code when showing source
+
 fdisable-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -fdisable-[tree|rtl|ipa]-=range1+range2 disables an optimization pass.
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 43a49ea..a42ff81 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -930,6 +930,9 @@ layout::layout (diagnostic_context * context,
   /* If we're showing jumps in the line-numbering, allow at least 3 chars.  */
   if (m_line_spans.length () > 1)
 m_linenum_width = MAX (m_linenum_width, 3);
+  /* If there's a minimum margin width, apply it (subtr

[committed] Remove stray reference to error_at_rich_loc

2018-10-15 Thread David Malcolm
"error_at_rich_loc" went away in r254280 (in favor of overloading
"error_at"), but there was a stray reference in a comment.

Remove it.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Committed to trunk as r265177.

gcc/ChangeLog:
* gcc-rich-location.h (gcc_rich_location::add_location_if_nearby):
Fix usage of "error_at_rich_loc" in the comment.
 ---
 gcc/gcc-rich-location.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
index e55dd76..d282fd4 100644
--- a/gcc/gcc-rich-location.h
+++ b/gcc/gcc-rich-location.h
@@ -56,7 +56,7 @@ class gcc_rich_location : public rich_location
 
gcc_rich_location richloc (primary_loc);
bool added secondary = richloc.add_location_if_nearby (secondary_loc);
-   error_at_rich_loc (&richloc, "main message");
+   error_at (&richloc, "main message");
if (!added secondary)
  inform (secondary_loc, "message for secondary");
 
-- 
1.8.5.3



Re: Extend usage of C++11 direct init in __debug::vector

2018-10-15 Thread François Dumont

On 10/15/2018 12:10 PM, Jonathan Wakely wrote:

On 15/10/18 07:23 +0200, François Dumont wrote:
This patch extends the use of C++11 direct initialization in
__debug::vector and makes some calls to operator- more consistent.


Note that I also rewrote the following expression in the erase method:

-      return begin() + (__first.base() - cbegin().base());
+      return { _Base::begin() + (__first.base() - _Base::cbegin()), this };


The former version built 2 safe iterators and incremented 1 of them,
with the additional debug check inherent to such an operation, whereas
the new version builds just 1 safe iterator directly at the expected
offset.
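
To make the cost difference in the erase rewrite concrete, here is a toy C++ sketch.  ToySafeIter and ToyDebugVector are hypothetical stand-ins, not the libstdc++ debug classes; the sketch only counts how many wrapper objects each shape of the expression builds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts every wrapper construction, standing in for the bookkeeping a
// debug-mode safe iterator does when it attaches to its container.
static int g_safe_iters_built = 0;

// Toy stand-in for a safe iterator (illustrative only).
struct ToySafeIter {
  std::vector<int>::const_iterator base;
  const void* owner;
  ToySafeIter(std::vector<int>::const_iterator b, const void* o)
      : base(b), owner(o) { ++g_safe_iters_built; }
  ToySafeIter(const ToySafeIter& other)
      : base(other.base), owner(other.owner) { ++g_safe_iters_built; }
  ToySafeIter operator+(std::ptrdiff_t n) const { return {base + n, owner}; }
};

struct ToyDebugVector {
  std::vector<int> data;
  ToySafeIter begin() const { return {data.begin(), this}; }
  ToySafeIter cbegin() const { return {data.cbegin(), this}; }

  // Old shape: wrap begin() and cbegin(), then offset a wrapper.
  ToySafeIter position_old(const ToySafeIter& first) const {
    return begin() + (first.base - cbegin().base);
  }
  // New shape: do the arithmetic on raw iterators and wrap once.
  ToySafeIter position_new(const ToySafeIter& first) const {
    return {data.begin() + (first.base - data.cbegin()), this};
  }
};

// Returns how many wrappers one call builds for the same request.
static int wrappers_built(bool use_new_shape) {
  ToyDebugVector v{{10, 20, 30}};
  ToySafeIter first = v.cbegin() + 1;
  int before = g_safe_iters_built;
  if (use_new_shape) {
    const ToySafeIter& r = v.position_new(first);
    assert(*r.base == 20);
  } else {
    const ToySafeIter& r = v.position_old(first);
    assert(*r.base == 20);
  }
  return g_safe_iters_built - before;
}
```

In this toy model the new shape builds exactly one wrapper, while the old shape builds several temporaries to reach the same position.  The real safe iterators do more work per construction, so the saving there is likely larger than a simple count suggests.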


Makes sense.


2018-10-15  François Dumont 

    * include/debug/vector (vector<>::cbegin()): Use C++11 direct
    initialization.
    (vector<>::cend()): Likewise.
    (vector<>::emplace(const_iterator, _Args&&...)): Likewise and use
    consistent iterator comparison.
    (vector<>::insert(const_iterator, size_type, const _Tp&)): Likewise.
    (vector<>::insert(const_iterator, _InputIterator, _InputIterator)):
    Likewise.
    (vector<>::erase(const_iterator)): Likewise.
    (vector<>::erase(const_iterator, const_iterator)): Likewise.

Tested under Linux x86_64 Debug mode and committed.

François




@@ -542,7 +542,8 @@ namespace __debug
  {
__glibcxx_check_insert(__position);
bool __realloc = this->_M_requires_reallocation(this->size() + 1);
-    difference_type __offset = __position.base() - _Base::begin();
+    difference_type __offset
+  = __position.base() - __position._M_get_sequence()->_M_base().begin();


What's the reason for this change?

Doesn't __glibcxx_check_insert(__position) already ensure that
__position is attached to *this, and so _Base::begin() returns the
same thing as __position._M_get_sequence()->_M_base().begin() ?

If they're equivalent, the original code seems more readable.



This is the consistent iterator comparison part. Depending on the C++ mode,
__position can be an iterator or a const_iterator.


As the return type of _M_get_sequence() depends on the iterator type, we
always compute iterator - iterator or const_iterator - const_iterator, so no
conversion is needed.




Hashtable Small size optimization

2018-10-15 Thread François Dumont

I started considering PR libstdc++/68303.

The first step was to write a dedicated performance test case: the
unordered_small_size.cc that I'd like to add with this patch.


The first runs show a major difference between the tr1 and std
implementations, tr1 being much better:


std::tr1::unordered_set without hash code cached: 1st insert       9r    9u    1s  14725920mem    0pf
std::tr1::unordered_set with hash code cached: 1st insert          8r    9u    0s  14719680mem    0pf
std::unordered_set without hash code cached: 1st insert           17r   17u    0s  16640080mem    0pf
std::unordered_set with hash code cached: 1st insert              14r   14u    0s  16638656mem    0pf


I had a look in gdb to find out why and the answer was quite obvious.
For 20 insertions the tr1 implementation's bucket count goes through [11, 23]
whereas for std it goes through [2, 5, 11, 23], so std performs 2 more
expensive rehashes.
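
The bucket-count progression can also be observed without gdb.  The following is a small sketch (the helper name bucket_count_steps is made up for illustration) that records each distinct bucket count seen while inserting; the values it reports depend entirely on the library version in use, so none are assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Records each distinct bucket count observed while inserting n keys.
// Every change after the first entry corresponds to one rehash.
static std::vector<std::size_t> bucket_count_steps(int n) {
  std::unordered_set<int> s;
  std::vector<std::size_t> steps;
  for (int i = 0; i < n; ++i) {
    s.insert(i);
    if (steps.empty() || steps.back() != s.bucket_count())
      steps.push_back(s.bucket_count());
    // The container must never exceed its maximum load factor.
    assert(s.load_factor() <= s.max_load_factor());
  }
  return steps;
}
```

Calling bucket_count_steps(20) and printing the result gives the sequence to compare before and after a change to the rehash policy; a shorter sequence means fewer rehashes for the same number of insertions.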


As unordered containers are meant for a rather large number of
elements, I propose to revise the rehash policy with this patch so that
std also starts at 11 on the 1st insertion. After the patch the figures are:


std::tr1::unordered_set without hash code cached: 1st insert       9r    9u    0s  14725920mem    0pf
std::tr1::unordered_set with hash code cached: 1st insert          8r    8u    0s  14719680mem    0pf
std::unordered_set without hash code cached: 1st insert           15r   15u    0s  16640128mem    0pf
std::unordered_set with hash code cached: 1st insert              12r   12u    0s  16638688mem    0pf


Moreover, I noticed that the performance tests are built with -O2; is that
intentional? The std implementation uses more abstractions than tr1, and
it looks like building with -O3 optimizes away most of those abstractions,
bringing the tr1 and std implementations much closer:


std::tr1::unordered_set without hash code cached: 1st insert       2r    1u    1s  14725952mem    0pf
std::tr1::unordered_set with hash code cached: 1st insert          2r    1u    0s  14719536mem    0pf
std::unordered_set without hash code cached: 1st insert            2r    2u    0s  16640064mem    0pf
std::unordered_set with hash code cached: 1st insert               2r    2u    0s  16638608mem    0pf


Note that this patch also reworks the alternative rehash policy based on
powers of 2 so that it also starts with a larger number of buckets (16)
and respects LWG2156.


Lastly, I had to widen the memory column so that alignment is preserved
even when the memory diff is negative.


Tested under Linux x86_64.

Ok to commit ?

François


diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 66fbfbe5f21..70d3c5c8194 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -536,6 +536,13 @@ namespace __detail
 std::size_t
 _M_next_bkt(std::size_t __n) noexcept
 {
+  if (__n < 0x11)
+	{
+	  _M_next_resize =
+	__builtin_floor(0x10 * (long double)_M_max_load_factor);
+	  return 0x10;
+	}
+
   const auto __max_width = std::min(sizeof(size_t), 8);
   const auto __max_bkt = size_t(1) << (__max_width * __CHAR_BIT__ - 1);
   std::size_t __res = __clp2(__n);
@@ -553,7 +560,7 @@ namespace __detail
 	_M_next_resize = std::size_t(-1);
   else
 	_M_next_resize
-	  = __builtin_ceil(__res * (long double)_M_max_load_factor);
+	  = __builtin_floor(__res * (long double)_M_max_load_factor);
 
   return __res;
 }
@@ -571,7 +578,7 @@ namespace __detail
 _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
 		   std::size_t __n_ins) noexcept
 {
-  if (__n_elt + __n_ins >= _M_next_resize)
+  if (__n_elt + __n_ins > _M_next_resize)
 	{
 	  long double __min_bkts = (__n_elt + __n_ins)
 	/ (long double)_M_max_load_factor;
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index b9b11ff4385..6df9775b954 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,14 +46,11 @@ namespace __detail
   {
 // Optimize lookups involving the first elements of __prime_list.
 // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
-
-if (__n < sizeof(__fast_bkt))
+if (__n < 12)
   {
 	_M_next_resize =
-	  __builtin_floor(__fast_bkt[__n] * (long double)_M_max_load_factor);
-	return __fast_bkt[__n];
+	  __builtin_floor(11 * (long double)_M_max_load_factor);
+	return 11;
   }
 
 // Number of primes (without sentinel).
@@ -66,7 +63,7 @@ namespace __detail
 constexpr auto __last_prime = __prime_list + __n_primes - 1;
 
 const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 6, __last_prime, __n);
+  std::lower_bound(__prime_list + 5, __last_prime, __n);
 
 if (__next_bkt == __last_prime)
   // Set next resize to the max value so that we never try to rehash again
diff --git a/libstdc++-v3/testsuite/performance/23_

Re: Make std::vector iterator operators friend inline

2018-10-15 Thread François Dumont

On 10/15/2018 11:36 AM, Jonathan Wakely wrote:

On 12/10/18 18:25 +0200, François Dumont wrote:

Here is the patch for _Bit_iterator and _Bit_const_iterator operators.

I noticed that _Bit_reference == and < operators could be made inline 
friend too. Do you want me to include this change in the patch ?



    * include/bits/stl_bvector.h (_Bit_iterator_base::operator==): Replace
    member method with inline friend.
    (_Bit_iterator_base::operator<): Likewise.
    (_Bit_iterator_base::operator!=): Likewise.
    (_Bit_iterator_base::operator>): Likewise.
    (_Bit_iterator_base::operator<=): Likewise.
    (_Bit_iterator_base::operator>=): Likewise.
    (operator-(const _Bit_iterator_base&, const _Bit_iterator_base&)): Make
    inline friend.
    (_Bit_iterator::operator+(difference_type)): Replace member method with
    inline friend.
    (_Bit_iterator::operator-(difference_type)): Likewise.
    (operator+(ptrdiff_t, const _Bit_iterator&)): Make inline friend.
    (_Bit_const_iterator::operator+(difference_type)): Replace member method
    with inline friend.
    (_Bit_const_iterator::operator-(difference_type)): Likewise.
    (operator+(ptrdiff_t, const _Bit_const_iterator&)): Make inline
    friend.

Tested under Linux x86_64.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 19c16839cfa..8fbef7a1a3a 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -182,40 +182,40 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_offset = static_cast(__n);
     }

-    bool
-    operator==(const _Bit_iterator_base& __i) const
-    { return _M_p == __i._M_p && _M_offset == __i._M_offset; }
+    friend bool
+    operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+    { return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }

-    bool
-    operator<(const _Bit_iterator_base& __i) const
+    friend bool
+    operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
     {
-  return _M_p < __i._M_p
-    || (_M_p == __i._M_p && _M_offset < __i._M_offset);
+  return __x._M_p < __y._M_p
+    || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
     }

-    bool
-    operator!=(const _Bit_iterator_base& __i) const
-    { return !(*this == __i); }
+    friend bool
+    operator!=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+    { return !(__x == __y); }

-    bool
-    operator>(const _Bit_iterator_base& __i) const
-    { return __i < *this; }
+    friend bool
+    operator>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+    { return __y < __x; }

-    bool
-    operator<=(const _Bit_iterator_base& __i) const
-    { return !(__i < *this); }
+    friend bool
+    operator<=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+    { return !(__y < __x); }

-    bool
-    operator>=(const _Bit_iterator_base& __i) const
-    { return !(*this < __i); }
-  };
+    friend bool
+    operator>=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+    { return !(__x < __y); }

-  inline ptrdiff_t
-  operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
-  {
-    return (int(_S_word_bit) * (__x._M_p - __y._M_p)
-    + __x._M_offset - __y._M_offset);
-  }


For the non-member operator- and operator+ this change makes sense,
but what is the benefit of changing all the others? As members they're
already not considered as candidates for unrelated types, or am I
missing something?


Well, I remember a pretty old conversation about making all operators
consistent, that is to say not sometimes members and sometimes non-members.


So I am trying to make all operators non-members as much as possible.
That's indeed the only reason.
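
For readers weighing member against friend operators, here is a generic C++ sketch of the one observable difference I am sure of: which operand may undergo an implicit conversion.  MemberCmp and FriendCmp are made-up types for illustration; this is not a claim about how stl_bvector.h behaves, where both operands already share a base class.

```cpp
#include <cassert>

// Comparison as a member: the left operand must already be a MemberCmp.
struct MemberCmp {
  int pos;
  MemberCmp(int p) : pos(p) {}  // implicit conversion from int
  bool operator==(const MemberCmp& other) const { return pos == other.pos; }
};

// Comparison as an inline friend: found by argument-dependent lookup,
// and either operand may be converted.
struct FriendCmp {
  int pos;
  FriendCmp(int p) : pos(p) {}  // implicit conversion from int
  friend bool operator==(const FriendCmp& a, const FriendCmp& b) {
    return a.pos == b.pos;
  }
};

static bool symmetric_comparisons_work() {
  MemberCmp m(3);
  FriendCmp f(3);
  bool member_right = (m == 3);  // right operand converts
  // (3 == m) is rejected before C++20, which added reversed candidates.
  bool friend_right = (f == 3);
  bool friend_left = (3 == f);   // left operand converts too
  assert(member_right && friend_right && friend_left);
  return member_right && friend_right && friend_left;
}
```

When no conversion is involved on the left-hand side, the two forms behave the same, which matches the point that the change is mainly about consistency.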




Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Segher Boessenkool
On Mon, Oct 15, 2018 at 10:45:08PM +0300, Alexander Monakov wrote:
> On Mon, 15 Oct 2018, Segher Boessenkool wrote:
> > On Sun, Oct 14, 2018 at 11:07:20PM +0300, Alexander Monakov wrote:
> > > For Basic asms, no similar mechanism is necessary since they are 
> > > antithetical
> > > to efficiency in the first place.
> > 
> > I missed this part.
> > 
> >   asm("bla");
> > 
> > means almost the same as
> > 
> >   asm("bla" : );
> > 
> > and there is nothing in there that is bad for optimisation.
> 
> The extended asm does not clobber all memory, unlike its basic counterpart.

Yeah, that's new in GCC 7, and I keep forgetting.  I'm still in the
denial phase for this.


Segher


Re: Use C++11 direct init in __debug::forward_list

2018-10-15 Thread François Dumont

On 10/15/2018 11:58 AM, Jonathan Wakely wrote:

On 11/10/18 22:46 +0200, François Dumont wrote:
This patch makes extensive use of C++11 direct init in
__debug::forward_list.


While doing so I also tried to spot useless creation of safe iterators in
the debug implementation. In __debug::forward_list there are several, but
I wonder if it is worth fixing those. Most of them are like this:


  void
  splice_after(const_iterator __pos, forward_list& __list)
  { splice_after(__pos, std::move(__list)); }

__pos is copied.

Do you think I shouldn't care, gcc will optimize it ?


I think the _Safe_iterator construction/destruction is too complex to
be optimised away (it locks a mutex, doesn't it?).


Yes it does, I also would be surprised if gcc was able to optimize it away.



Normally I'd say you could use std::move(__pos) but IIRC that's even
more expensive than a copy, as it locks two mutexes.

I wonder if it would be ok in debug implementation to use this kind 
of signature:


void splice_after(const const_iterator& __pos, forward_list& __list)

Iterator taken as rvalue reference ?

I guess it is not Standard conformant so not correct but maybe I 
could add a private _M_splice_after with this signature.


It doesn't seem worthwhile to me.



Ok, I'll leave it this way.

Thanks,

François



Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Alexander Monakov
On Mon, 15 Oct 2018, Segher Boessenkool wrote:
> On Sun, Oct 14, 2018 at 11:07:20PM +0300, Alexander Monakov wrote:
> > For Basic asms, no similar mechanism is necessary since they are 
> > antithetical
> > to efficiency in the first place.
> 
> I missed this part.
> 
>   asm("bla");
> 
> means almost the same as
> 
>   asm("bla" : );
> 
> and there is nothing in there that is bad for optimisation.

The extended asm does not clobber all memory, unlike its basic counterpart.

Alexander
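
For reference, the three spellings under discussion can be written side by side.  This is a minimal sketch for GCC-compatible compilers; the function names are made up, and the comments about memory clobbering restate what is said in this thread rather than something the code can demonstrate, since all three functions return the same values.

```cpp
#include <cassert>

static int g_counter = 0;

// Basic asm (no colon).  Per this thread, GCC 7 and later treat it as
// clobbering all memory.
static int with_basic_asm() {
  g_counter = 1;
  asm("");
  return g_counter;
}

// Extended asm with an empty operand list: no implicit memory clobber,
// so the compiler may keep g_counter's known value across it.
static int with_extended_asm() {
  g_counter = 2;
  asm("" : );
  return g_counter;
}

// Extended asm with an explicit "memory" clobber: the spelling that asks
// for the barrier regardless of compiler version.
static int with_explicit_clobber() {
  g_counter = 3;
  asm volatile("" : : : "memory");
  return g_counter;
}

static void check_observable_results() {
  assert(with_basic_asm() == 1);
  assert(with_extended_asm() == 2);
  assert(with_explicit_clobber() == 3);
}
```

The difference only shows up in the generated code: comparing the assembly at -O2 should show whether g_counter is reloaded after each asm statement.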


[Patch, Fortran] PR87556 – for FORM TEAM also use argse.pre/post

2018-10-15 Thread Tobias Burnus
As the subject states, FORM TEAM was only using the resulting tree
expression, ignoring code which was generated before (or afterward).


I am not sure how to best convert it to a test-suite test case. For

   form team (team(this_image()), my_team2)

the old dump was:

    integer(kind=4) D.3829;
…
    _gfortran_caf_form_team (team (&D.3829), &my_team2, 0);

the new one is:

  {
    integer(kind=4) D.3822;

    D.3822 = _gfortran_caf_this_image (0);
    _gfortran_caf_form_team (team (&D.3822), &my_team2, 0);
  }
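
The difference between the two dumps can be modelled with a toy C++ sketch.  ToySe and the helper functions are hypothetical and far simpler than gfortran's gfc_se and statement blocks; the sketch only shows why dropping the "pre" statements leaves the temporary unset.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (not gfortran's data structures): converting an expression
// yields setup statements, the expression text, and cleanup statements.
struct ToySe {
  std::vector<std::string> pre;
  std::string expr;
  std::vector<std::string> post;
};

// Pretend conversion of "team(this_image())": the argument needs a
// temporary that is assigned in the pre block.
static ToySe convert_team_arg() {
  ToySe se;
  se.pre.push_back("D.1 = _gfortran_caf_this_image (0);");
  se.expr = "team (&D.1)";
  return se;
}

// Old shape: only the expression is used, so D.1 is never assigned.
static std::vector<std::string> emit_call_expr_only(const ToySe& arg) {
  return {"_gfortran_caf_form_team (" + arg.expr + ", &my_team2, 0);"};
}

// New shape: pre statements, then the call, then post statements.
static std::vector<std::string> emit_call_with_blocks(const ToySe& arg) {
  assert(!arg.expr.empty());
  std::vector<std::string> out(arg.pre);
  out.push_back("_gfortran_caf_form_team (" + arg.expr + ", &my_team2, 0);");
  out.insert(out.end(), arg.post.begin(), arg.post.end());
  return out;
}
```

In the toy model the old shape emits one statement and the new shape emits two, with the assignment to the temporary first, mirroring the old and new dumps shown above.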

[Does it make sense to check for 5 "this_image (0)" calls? or for 4 
"D.\[0-9\]+ = _gfortran_caf_this_image (0);" calls?]



Built on x86-64-gnu-linux; regtesting is ongoing.

OK for the trunk?

Tobias

2018-10-15  Tobias Burnus  

	PR fortran/87556
	* trans-stmt.c (form_team, change_team, sync_team):
	Don't ignore argse.pre/argse.post.

diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 6256e3fa805..130e67ba1e4 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -707,19 +707,30 @@ gfc_trans_form_team (gfc_code *code)
 {
   if (flag_coarray == GFC_FCOARRAY_LIB)
 {
-  gfc_se argse;
-  tree team_id,team_type;
-  gfc_init_se (&argse, NULL);
-  gfc_conv_expr_val (&argse, code->expr1);
-  team_id = fold_convert (integer_type_node, argse.expr);
-  gfc_init_se (&argse, NULL);
-  gfc_conv_expr_val (&argse, code->expr2);
-  team_type = gfc_build_addr_expr (ppvoid_type_node, argse.expr);
+  gfc_se se;
+  gfc_se argse1, argse2;
+  tree team_id, team_type, tmp;
 
-  return build_call_expr_loc (input_location,
-  gfor_fndecl_caf_form_team, 3,
-  team_id, team_type,
-  build_int_cst (integer_type_node, 0));
+  gfc_init_se (&se, NULL);
+  gfc_init_se (&argse1, NULL);
+  gfc_init_se (&argse2, NULL);
+  gfc_start_block (&se.pre);
+
+  gfc_conv_expr_val (&argse1, code->expr1);
+  gfc_conv_expr_val (&argse2, code->expr2);
+  team_id = fold_convert (integer_type_node, argse1.expr);
+  team_type = gfc_build_addr_expr (ppvoid_type_node, argse2.expr);
+
+  gfc_add_block_to_block (&se.pre, &argse1.pre);
+  gfc_add_block_to_block (&se.pre, &argse2.pre);
+  tmp = build_call_expr_loc (input_location,
+ gfor_fndecl_caf_form_team, 3,
+ team_id, team_type,
+ build_int_cst (integer_type_node, 0));
+  gfc_add_expr_to_block (&se.pre, tmp);
+  gfc_add_block_to_block (&se.pre, &argse1.post);
+  gfc_add_block_to_block (&se.pre, &argse2.post);
+  return gfc_finish_block (&se.pre);
 }
   else
 {
@@ -738,15 +749,18 @@ gfc_trans_change_team (gfc_code *code)
   if (flag_coarray == GFC_FCOARRAY_LIB)
 {
   gfc_se argse;
-  tree team_type;
+  tree team_type, tmp;
 
   gfc_init_se (&argse, NULL);
   gfc_conv_expr_val (&argse, code->expr1);
   team_type = gfc_build_addr_expr (ppvoid_type_node, argse.expr);
 
-  return build_call_expr_loc (input_location,
-  gfor_fndecl_caf_change_team, 2, team_type,
-  build_int_cst (integer_type_node, 0));
+  tmp = build_call_expr_loc (input_location,
+ gfor_fndecl_caf_change_team, 2, team_type,
+ build_int_cst (integer_type_node, 0));
+  gfc_add_expr_to_block (&argse.pre, tmp);
+  gfc_add_block_to_block (&argse.pre, &argse.post);
+  return gfc_finish_block (&argse.pre);
 }
   else
 {
@@ -785,16 +799,19 @@ gfc_trans_sync_team (gfc_code *code)
   if (flag_coarray == GFC_FCOARRAY_LIB)
 {
   gfc_se argse;
-  tree team_type;
+  tree team_type, tmp;
 
   gfc_init_se (&argse, NULL);
   gfc_conv_expr_val (&argse, code->expr1);
   team_type = gfc_build_addr_expr (ppvoid_type_node, argse.expr);
 
-  return build_call_expr_loc (input_location,
-  gfor_fndecl_caf_sync_team, 2,
-  team_type,
-  build_int_cst (integer_type_node, 0));
+  tmp = build_call_expr_loc (input_location,
+ gfor_fndecl_caf_sync_team, 2,
+ team_type,
+ build_int_cst (integer_type_node, 0));
+  gfc_add_expr_to_block (&argse.pre, tmp);
+  gfc_add_block_to_block (&argse.pre, &argse.post);
+  return gfc_finish_block (&argse.pre);
 }
   else
 {



Re: [Patch, Fortran] PR87597 - fix off-by-one issue with inline matmul

2018-10-15 Thread Tobias Burnus
Right commit revision, wrong attached file (original patch, not the 
follow-up one).

Now hopefully the correct one.

Tobias

On 15.10.18 at 21:02, Tobias Burnus wrote:

Fixed with commit Rev. 265175 as attached.

Cheers

Tobias


Dominique d'Humières wrote:


On 14 Oct 2018 at 00:43, Tobias Burnus wrote:


Dominique d'Humières wrote:
UNRESOLVED: gfortran.dg/inline_matmul_24.f90   -O0 
scan-tree-dump-times optimized "gamma5[__var_1_do * 4 + 
__var_2_do]" 1


! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ 
__var_2_do\]" 1 "optimized" } }


Shouldn’t -fdump-tree-original be -fdump-tree-optimized?
As it is a front-end optimization (which is explicitly enabled), 
-fdump-tree-original should work (and avoids issues with further 
later optimizations).


What do you get on your system? Seemingly something else than I do. 
Can you look for the gamma5 line in your dump?


Tobias

Then

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }


should be

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "original" } }


isn’t it?

see https://gcc.gnu.org/ml/gcc-testresults/2018-10/msg01721.html for 
non darwin log.


Dominique
Index: gcc/testsuite/gfortran.dg/inline_matmul_24.f90
===
--- gcc/testsuite/gfortran.dg/inline_matmul_24.f90	(Revision 265174)
+++ gcc/testsuite/gfortran.dg/inline_matmul_24.f90	(Revision 265175)
@@ -39,4 +39,4 @@
   call abort()
 end if
 end program testMATMUL
-! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
+! { dg-final { scan-tree-dump-times "gamma5\\\[__var_1_do \\* 4 \\+ __var_2_do\\\]|gamma5\\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\\]" 1 "original" } }
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 265174)
+++ gcc/testsuite/ChangeLog	(Revision 265175)
@@ -1,3 +1,8 @@
+2018-10-15  Tobias Burnus  
+
+	PR fortran/87597
+	* gfortran.dg/inline_matmul_24.f90: Tweak scan-tree.
+
 2018-10-15  Renlin Li  
 
 	PR target/87563
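
The escaping change in the dg-final line above can be tried outside DejaGnu. What follows is only a minimal sketch, assuming std::regex (ECMAScript grammar) as a stand-in for the Tcl regex engine behind scan-tree-dump-times; the two engines may differ in detail, and the helper name dump_line_matches is hypothetical. It shows that an unescaped "[...]" is read as a character class rather than as a literal array subscript, and that the alternation with NON_LVALUE_EXPR accepts both spellings of the index expression.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` is found anywhere in `line`.
bool dump_line_matches(const std::string& pattern, const std::string& line)
{
  return std::regex_search(line, std::regex(pattern));
}

// Pattern as a regex engine sees it when the brackets are NOT escaped:
// "[...]" is a character class, not a literal subscript.
const std::string unescaped =
  R"(gamma5[__var_1_do \* 4 \+ __var_2_do])";

// Pattern with literal brackets and both spellings of the index expression.
const std::string escaped =
  R"(gamma5\[__var_1_do \* 4 \+ __var_2_do\])"
  R"(|gamma5\[NON_LVALUE_EXPR <__var_1_do> \* 4 \+ NON_LVALUE_EXPR <__var_2_do>\])";
```

With these definitions, the unescaped pattern does not find the literal text "gamma5[__var_1_do * 4 + __var_2_do]", while the escaped pattern finds both that text and the NON_LVALUE_EXPR variant.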


Re: [Patch, Fortran] PR87597 - fix off-by-one issue with inline matmul

2018-10-15 Thread Tobias Burnus

Fixed with commit Rev. 265175 as attached.

Cheers

Tobias


Dominique d'Humières wrote:


On 14 Oct 2018 at 00:43, Tobias Burnus wrote:


Dominique d'Humières wrote:

UNRESOLVED: gfortran.dg/inline_matmul_24.f90   -O0   scan-tree-dump-times optimized "gamma5[__var_1_do * 4 + __var_2_do]" 1

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }

Shouldn’t -fdump-tree-original be -fdump-tree-optimized?

As it is a front-end optimization (which is explicitly enabled), 
-fdump-tree-original should work (and avoids issues with further later 
optimizations).

What do you get on your system? Seemingly something else than I do. Can you 
look for the gamma5 line in your dump?

Tobias

Then

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }

should be

! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "original" } }

isn’t it?

see https://gcc.gnu.org/ml/gcc-testresults/2018-10/msg01721.html for non darwin 
log.

Dominique
2018-10-12  Tobias Burnus  

PR fortran/87597
	* expr.c (gfc_simplify_expr): Avoid simplifying
	the 'array' argument to lbound/ubound/lcobound/
	ucobound.

PR fortran/87597
	* gfortran.dg/inline_matmul_24.f90: New.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 1cfda5fbfed..ca6f95d9d8e 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -1937,7 +1937,20 @@ gfc_simplify_expr (gfc_expr *p, int type)
   break;
 
 case EXPR_FUNCTION:
-  for (ap = p->value.function.actual; ap; ap = ap->next)
+  // For array-bound functions, we don't need to optimize
+  // the 'array' argument. In particular, if the argument
+  // is a PARAMETER, simplifying might convert an EXPR_VARIABLE
+  // into an EXPR_ARRAY; the latter has lbound = 1, the former
+  // can have any lbound.
+  ap = p->value.function.actual;
+  if (p->value.function.isym &&
+	  (p->value.function.isym->id == GFC_ISYM_LBOUND
+	   || p->value.function.isym->id == GFC_ISYM_UBOUND
+	   || p->value.function.isym->id == GFC_ISYM_LCOBOUND
+	   || p->value.function.isym->id == GFC_ISYM_UCOBOUND))
+	ap = ap->next;
+
+  for ( ; ap; ap = ap->next)
 	if (!gfc_simplify_expr (ap->expr, type))
 	  return false;
 
diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90
new file mode 100644
index 000..067f6daf200
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90
@@ -0,0 +1,42 @@
+! { dg-do run }
+! { dg-options "-ffrontend-optimize -fdump-tree-original" }
+!
+! PR fortran/87597
+!
+! Contributed by gallmeister
+!
+! Before, for the inlined matmul,
+! gamma5 was converted to an EXPR_ARRAY with lbound = 1
+! instead of the lbound = 0 as declared; leading to
+! an off-by-one problem.
+!
+program testMATMUL
+  implicit none
+complex, dimension(0:3,0:3), parameter :: gamma5 = reshape((/ 0., 0., 1., 0., &
+  0., 0., 0., 1., &
+  1., 0., 0., 0., &
+  0., 1., 0., 0. /),(/4,4/))
+complex, dimension(0:3,0:3) :: A, B, D
+integer :: i
+
+A = 0.0
+do i=0,3
+   A(i,i) = i*1.0
+end do
+
+B = cmplx(7,-9)
+B = matmul(A,gamma5)
+
+D = reshape([0, 0, 2, 0, &
+ 0, 0, 0, 3, &
+ 0, 0, 0, 0, &
+ 0, 1, 0, 0], [4, 4])
+write(*,*) B(0,:)
+write(*,*) B(1,:)
+write(*,*) B(2,:)
+write(*,*) B(3,:)
+if (any(B /= D)) then
+  call abort()
+end if
+end program testMATMUL
+! { dg-final { scan-tree-dump-times "gamma5\[__var_1_do \\* 4 \\+ __var_2_do\]" 1 "optimized" } }
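
The off-by-one described in the test's header comment can be illustrated with plain index arithmetic. This is a minimal sketch, not gfortran's actual code: the function name column_major_offset and its parameters are hypothetical, and it only assumes Fortran's column-major storage order. The same subscript pair maps to a different storage slot depending on which lower bound is assumed.

```cpp
#include <cstddef>

// Zero-based storage offset of element (i, j) in a column-major 2-D array
// whose two dimensions both start at `lbound` and have `extent` elements.
std::size_t column_major_offset(int i, int j, int lbound, int extent)
{
  return static_cast<std::size_t>(i - lbound)
         + static_cast<std::size_t>(j - lbound) * static_cast<std::size_t>(extent);
}
```

For a 4x4 array declared as (0:3,0:3), subscript (1,1) is at offset 5. If the array is treated as starting at 1, the same subscript is taken to be the first element (offset 0), so every access is shifted by one row and one column.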


Re: [PATCH] Adjust test to pass with latest glibc

2018-10-15 Thread Jonathan Wakely

On 15/10/18 14:55 +0100, Jonathan Wakely wrote:

Glibc changed the it_IT locales to use thousands separators,
invalidating this test. Use nl_NL instead, as Dutch only uses grouping
for money not numbers.

* testsuite/22_locale/numpunct/members/char/3.cc: Adjust test to
account for change to glibc it_IT localedata (glibc bz#10797).

Tested x86_64-linux, committed to trunk.


Backported to 6, 7 and 8 too.


commit e4a550b85fec6440e4fe70817e96f496874f36d8
Author: Jonathan Wakely 
Date:   Mon Oct 15 14:48:20 2018 +0100

   Adjust test to pass with latest glibc

   Glibc changed the it_IT locales to use thousands separators,
   invalidating this test. Use nl_NL instead, as Dutch only uses grouping
   for money not numbers.

   * testsuite/22_locale/numpunct/members/char/3.cc: Adjust test to
   account for change to glibc it_IT localedata (glibc bz#10797).

diff --git a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
index f314502461a..a55cf89b294 100644
--- a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
+++ b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
@@ -1,4 +1,4 @@
-// { dg-require-namedlocale "it_IT.ISO8859-15" }
+// { dg-require-namedlocale "nl_NL.ISO8859-15" }

// 2001-01-24 Benjamin Kosnik  

@@ -28,12 +28,14 @@ void test02()
{
  using namespace std;

-  locale loc_it = locale(ISO_8859(15,it_IT));
+  // nl_NL chosen because it has no thousands separator (at this time).
+  locale loc_it = locale(ISO_8859(15,nl_NL));

  const numpunct<char>& nump_it = use_facet<numpunct<char> >(loc_it);

  string g = nump_it.grouping();

+  // Ensure that grouping is empty for locales with empty thousands separator.
  VERIFY( g == "" );
}
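
The property the adjusted test relies on, an empty grouping string for a locale that does not group digits, can be checked with a small standalone helper. This is a sketch only; the function name numeric_grouping is hypothetical and is not part of the testsuite.

```cpp
#include <locale>
#include <string>

// Return the grouping string of the numpunct<char> facet of `loc`.
// An empty string means digits are not grouped.
std::string numeric_grouping(const std::locale& loc)
{
  const std::numpunct<char>& np = std::use_facet<std::numpunct<char> >(loc);
  return np.grouping();
}
```

Calling it with std::locale::classic() yields an empty string. Calling it with a named locale such as std::locale("nl_NL.ISO8859-15") only works where that locale is installed (the constructor throws std::runtime_error otherwise), and the result then depends on the installed localedata, which is exactly what changed for it_IT.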





Re: [PATCH/RFC] Add "User Experience Guidelines" to gccint.texi

2018-10-15 Thread David Malcolm
On Sun, 2018-10-14 at 11:01 -0600, Martin Sebor wrote:
> On 10/12/2018 09:43 AM, David Malcolm wrote:
> > Here's a proposed "User Experience Guidelines" section for our
> > internals manual
> > 
> > It's a mixture of proposed policy, together with notes on how to
> > implement the recommendations.
> > 
> > Thoughts?
> 
> To improve consistency among diagnostic messages it's important
> to have a set of guidelines in place.  Thank you for taking
> the first step!

I'm not as interested in "consistency" here as I am in what I'm calling
"actionable" diagnostics ("actionableness"?).

For a reductio ad absurdum, consider if we replaced all of our messages
with the empty string, so that we merely issued "error" or "warning" to
the user.  That would achieve "consistency" for all diagnostic
messages, but wouldn't be helpful to the user.

"Being an assistant to the user rather than an adversary" is the high-
level goal I want to emphasize. Consistency is good, and will usually
be in service to the primary goal (as it helps the user learn the
program and be able to figure things out from past experience), but if
we can help the user by special-casing certain things, that can take
precedence over consistency of output.

Various comments inline below.

> > 
> > gcc/ChangeLog:
> > * Makefile.in (TEXI_GCCINT_FILES): Add ux.texi.
> > * doc/gccint.texi: Include ux.texi and use it in top-level
> > menu.
> > * doc/ux.texi: New file.
> > ---
> >  gcc/Makefile.in |   2 +-
> >  gcc/doc/gccint.texi |   2 +
> >  gcc/doc/ux.texi | 455
> > 
> >  3 files changed, 458 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/doc/ux.texi
> > 
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index 70efab7..3f05e95 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -3176,7 +3176,7 @@ TEXI_GCCINT_FILES = gccint.texi gcc-common.texi gcc-vers.texi \
> >  gnu.texi gpl_v3.texi fdl.texi contrib.texi languages.texi \
> >  sourcebuild.texi gty.texi libgcc.texi cfg.texi tree-ssa.texi \
> >  loop.texi generic.texi gimple.texi plugins.texi optinfo.texi \
> > -match-and-simplify.texi poly-int.texi
> > +match-and-simplify.texi ux.texi poly-int.texi
> > 
> >  TEXI_GCCINSTALL_FILES = install.texi install-old.texi fdl.texi \
> >  gcc-common.texi gcc-vers.texi
> > diff --git a/gcc/doc/gccint.texi b/gcc/doc/gccint.texi
> > index 1a1af41..2554b31 100644
> > --- a/gcc/doc/gccint.texi
> > +++ b/gcc/doc/gccint.texi
> > @@ -125,6 +125,7 @@ Additional tutorial information is linked to from
> >  * LTO:: Using Link-Time Optimization.
> > 
> >  * Match and Simplify:: How to write expression simplification patterns for GIMPLE and GENERIC
> > +* User Experience Guidelines:: Guidelines for implementing diagnostics and options.
> >  * Funding:: How to help assure funding for free software.
> >  * GNU Project:: The GNU Project and GNU/Linux.
> > 
> > @@ -162,6 +163,7 @@ Additional tutorial information is linked to from
> >  @include plugins.texi
> >  @include lto.texi
> >  @include match-and-simplify.texi
> > +@include ux.texi
> > 
> >  @include funding.texi
> >  @include gnu.texi
> > diff --git a/gcc/doc/ux.texi b/gcc/doc/ux.texi
> > new file mode 100644
> > index 000..87ff599
> > --- /dev/null
> > +++ b/gcc/doc/ux.texi
> > @@ -0,0 +1,455 @@
> > +@c Copyright (C) 2018 Free Software Foundation, Inc.
> > +@c Free Software Foundation, Inc.
> > +@c This is part of the GCC manual.
> > +@c For copying conditions, see the file gcc.texi.
> > +
> > +@node User Experience Guidelines
> > +@chapter User Experience Guidelines
> > +@cindex User Experience Guidelines
> > +@cindex Guidelines, User Experience
> > +
> > +To borrow a slogan from
> > + @uref{https://elm-lang.org/blog/compilers-as-assistants, Elm},
> > +
> > +@quotation
> > +@strong{Compilers should be assistants, not adversaries.}  A compiler should
> > +not just detect bugs, it should then help you understand why there is a bug.
> > +It should not berate you in a robot voice, it should give you specific hints
> > +that help you write better code. Ultimately, a compiler should make
> > +programming faster and more fun!
> > +@author Evan Czaplicki
> > +@end quotation
> > +
> > +This chapter provides guidelines on how to implement diagnostics and
> > +command-line options in ways that we hope achieve the above ideal.
> > +
> > +@menu
> > +* Guidelines for Diagnostics::   How to implement diagnostics.
> > +* Guidelines for Options::   Guidelines for command-line options.
> > +@end menu
> > +
> > +
> > +@node Guidelines for Diagnostics
> > +@cindex Guidelines for Diagnostics
> > +@cindex Diagnostics, Guidelines for
> > +
> > +@section Guidelines for Diagnostics
> > +
> > +@subsection Talk in terms of the user's code
> > +
> > +Diagnostics should be worded in terms of the user's sou

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Uecker, Martin

Hi Richard,

as Joseph pointed out, there are some related discussions
on the WG14 reflector. How about moving the discussion
there?

I find your approach very interesting, and that it already
comes with an implementation is of course very useful.

But I don't really understand the reasons why this is not based
on (2). These types are not "sizeless" at all; their size
just isn't known at compile time. So "sizeless" seems to me
a misnomer.

In fact, these types *do* seem very similar to VLAs, as VLAs
are also complete types which do not have a known size at
compile time.

That arrays decay to pointers doesn't mean that we
couldn't have similar vector types which don't decay.
This is hardly a fundamental problem.

I also don't understand the problem about the array
size. If I understand this correctly, the size is somehow
known at run-time and implicitly passed along with the
values. So these new types do not need to have a
size expression (as in your proposal). 

Assignment, the possibility to return the type from
functions, and something like __sizeless_structs would
make sense for VLAs too.

So creating a new category "variable-length types" for
both VLAs and variable-length vector types seems to make
much more sense to me. As I see it, this would be mainly
a change in terminology and not so much of the underlying
approach.


Best,
Martin

On Monday, 15.10.2018, at 15:30 +0100, Richard Sandiford wrote:
> The C standard says:
> 
> At various points within a translation unit an object type may be
> "incomplete" (lacking sufficient information to determine the size of
> objects of that type) or "complete" (having sufficient information).
> 
> For AArch64 SVE, we'd like to split this into two concepts:
> 
>   * has the type been fully defined?
>   * would fully-defining the type determine its size?
> 
> This is because we'd like to be able to represent SVE vectors as C and C++
> types.  Since SVE is a "vector-length agnostic" architecture, the size
> of the vectors is determined by the runtime environment rather than the
> programmer or compiler.  In that sense, defining an SVE vector type does
> not determine its size.  It's nevertheless possible to use SVE vector types
> in meaningful ways, such as having automatic vector variables and passing
> vectors between functions.
> 
> The main questions in the RFC are:
> 
>   1) is splitting the definition like this OK in principle?
>   2) are the specific rules described below OK?
>   3) coding-wise, how should the split be represented in GCC?
> 
> Terminology
> ---
> 
> Going back to the second bullet above:
> 
>   * would fully-defining the type determine its size?
> 
> the rest of the RFC calls a type "sized" if fully defining it would
> determine its size.  The type is "sizeless" otherwise.
> 
> Contents
> 
> 
> The RFC is organised as follows.  I've erred on the side of including
> detail rather than leaving it out, but each section is meant to be
> self-contained and skippable:
> 
>   - An earlier RFC
>   - Quick overview of SVE
>   - Why we need SVE types in C and C++
>   - How we ended up with this definition
>   - The SVE types in more detail
>   - Outline of the type system changes
>   - Sizeless structures (and testing on non-SVE targets)
>   - Other variable-length vector architectures
>   - Edits to the C standard
> - Base changes
> - Updates for consistency
> - Sizeless structures
>   - Edits to the C++ standard
>   - GCC implementation questions
> 
> I'll follow up with patches that implement the split.
> 
> 
> 
> An earlier RFC
> ==
> 
> For the record (in case this sounds familiar) I sent an RFC about the
> sizeless type extension a while ago:
> 
> https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html
> 
> The rules haven't changed since then, but this version includes more
> information and includes support for sizeless structures.
> 
> 
> Quick overview of SVE
> =
> 
> SVE is a vector extension to AArch64.  A detailed description is
> available here:
> 
> https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf
> 
> but the only feature that really matters for this RFC is that SVE has no
> fixed or preferred vector length.  Implementations can instead choose
> from a range of possible vector lengths, with 128 bits being the minimum
> and 2048 bits being the maximum.  Privileged software can further
> constrain the vector length within the range offered by the implementation;
> e.g. Linux currently provides per-thread control of the vector length.
> 
> 
> Why we need SVE types in C and C++
> ==
> 
> SVE was designed to be an easy target for autovectorising normal scalar
> code.  There are also various language extensions that support explicit
> data parallelism or that make explicit vector chunking easier to do in
> an architecture-neutral way (e.g. C++ P0214).  This means that many users
> won't need to do 

Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Segher Boessenkool
On Sun, Oct 14, 2018 at 11:07:20PM +0300, Alexander Monakov wrote:
> For Basic asms, no similar mechanism is necessary since they are antithetical
> to efficiency in the first place.

I missed this part.

  asm("bla");

means almost the same as

  asm("bla" : );

and there is nothing in there that is bad for optimisation.  It's not like
adding some garbage input arg will help.

Also, very many inline asm "in the field" are written as basic inline asm.
Including many of the problem cases.


Segher


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Segher Boessenkool
On Sun, Oct 14, 2018 at 11:07:20PM +0300, Alexander Monakov wrote:
> impacts inlining decisions badly, since GCC assumes cost of the asm to be
> high, even though it emits just one instruction to the text section. I'd
> like to point out that branch range optimization is also negatively affected.

The "length" of an asm is currently calculated as a conservatively correct
number (which you can trick of course, but that aside).  This is on purpose.
This is also documented.  And you are destroying this.

> Kernel developers can then use this extension via
> 
> [if gcc-9 or compatible]
> #define ASM_NONTEXT_BEGIN "%`\n"
> [else]
> #define ASM_NONTEXT_BEGIN "\n"
> [endif]
> 
> #define ASM_NONTEXT_END ASM_NONTEXT_BEGIN

Have you tested this?  This counts everything as being longer than you
supposedly want.

> +@item %`
> +Signifies a boundary of a region where instruction separators are not
> +counted towards its cost (@pxref{Size of an asm}). Must be followed by
> +a whitespace character.
>  @end table

It's not cost.

> +Likewise, it is possible for GCC to significantly over-estimate the
> +number of instructions in an @code{asm}, resulting in suboptimal decisions
> +when the estimate is used during inlining and branch range optimization.

GCC estimates the size of an asm conservatively.  Without this you get
ICEs all over the place.  Any proposed "adjust inlining cost of inline asm"
extension that adjusts instruction length instead (and can change it to too
low) will cause us too much trouble, and users a bad experience using the
compiler.  IMO of course.

> + /* Leave room for future extensions.  */
> + if (*p && !ISSPACE (*p))
> +   output_operand_lossage ("%%` must be followed by whitespace");

What does that mean?  And, why?

This isn't documented either.

> +  if (in_backticks)
> +output_operand_lossage ("missing closing %%`");

Why?


Segher


[C++ Patch] PR 84644 ("internal compiler error: in warn_misplaced_attr_for_class_type, at cp/decl.c:4718")

2018-10-15 Thread Paolo Carlini

Hi,

here we ICE when, at the end of check_tag_decl, we pass a DECLTYPE_TYPE
to warn_misplaced_attr_for_class_type. I think the right fix is
to reject earlier a decltype with no declarator, as a declaration that
doesn't declare anything (note: all the compilers I have at hand agree).
Tested x86_64-linux.


Thanks, Paolo.



/cp
2018-10-15  Paolo Carlini  

PR c++/84644
* decl.c (check_tag_decl): A decltype with no declarator
doesn't declare anything.

/testsuite
2018-10-15  Paolo Carlini  

PR c++/84644
* g++.dg/cpp0x/decltype68.C: New.
* g++.dg/cpp0x/decltype-33838.C: Adjust.

Index: cp/decl.c
===
--- cp/decl.c   (revision 265158)
+++ cp/decl.c   (working copy)
@@ -4793,6 +4793,7 @@ check_tag_decl (cp_decl_specifier_seq *declspecs,
   if (declspecs->type
   && TYPE_P (declspecs->type)
   && ((TREE_CODE (declspecs->type) != TYPENAME_TYPE
+  && TREE_CODE (declspecs->type) != DECLTYPE_TYPE
   && MAYBE_CLASS_TYPE_P (declspecs->type))
  || TREE_CODE (declspecs->type) == ENUMERAL_TYPE))
 declared_type = declspecs->type;
Index: testsuite/g++.dg/cpp0x/decltype-33838.C
===
--- testsuite/g++.dg/cpp0x/decltype-33838.C (revision 265158)
+++ testsuite/g++.dg/cpp0x/decltype-33838.C (working copy)
@@ -2,5 +2,5 @@
 // PR c++/33838
 template struct A
 {
-  __decltype (T* foo()); // { dg-error "expected|no arguments|accept" }
+  __decltype (T* foo()); // { dg-error "expected|no arguments|declaration" }
 };
Index: testsuite/g++.dg/cpp0x/decltype68.C
===
--- testsuite/g++.dg/cpp0x/decltype68.C (nonexistent)
+++ testsuite/g++.dg/cpp0x/decltype68.C (working copy)
@@ -0,0 +1,7 @@
+// PR c++/84644
+// { dg-do compile { target c++11 } }
+
+template
+struct b {
+  decltype(a) __attribute__((break));  // { dg-error "declaration does not declare anything" }
+};


Re: [Patch, fortran] PR87566 - ICE with class(*) and select

2018-10-15 Thread Paul Richard Thomas
Committed as revision 265171.

Thanks to you, Dominique and, of course, Tobias.

Paul

On Mon, 15 Oct 2018 at 10:15, Thomas Koenig  wrote:
>
> Hi Paul,
>
> > Bootstrapped and regtested on FC28/x86_64 - OK for trunk?
>
> Looks good. Thanks!
>
> Regards
>
> Thomas



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH 02/14] Add D frontend (GDC) implementation.

2018-10-15 Thread Iain Buclaw
On Mon, 15 Oct 2018 at 16:19, David Malcolm  wrote:
>
> On Tue, 2018-09-18 at 02:33 +0200, Iain Buclaw wrote:
> > This patch adds the D front-end implementation, the only part of the
> > compiler that interacts with GCC directly, and being the parts that I
> > maintain, is something that I can talk about more directly.
> >
> > For the actual code generation pass, which converts the front-end AST
> > to GCC trees, most parts use separate Visitor interfaces to do a
> > certain kind of lowering; for instance, types.cc builds *_TYPE trees
> > from AST Types.  The Visitor class is part of the DMD front-end, and
> > is defined in dfrontend/visitor.h.
> >
> > There are also a few interfaces which have their headers in the DMD
> > frontend, but are implemented here because they do something that
> > requires knowledge of the GCC backend (d-target.cc), do something
> > that may not be portable or may differ between D compilers
> > (d-frontend.cc), or are a thin wrapper around something that is
> > managed by GCC (d-diagnostic.cc).
> >
> > Many high level operations result in generation of calls to D runtime
> > library functions (runtime.def), all of which require some kind of
> > runtime type information (typeinfo.cc).  The compiler also generates
> > functions for registering/deregistering compiled modules with the D
> > runtime library (modules.cc).
> >
> > As well as the D language having its own built-in functions
> > (intrinsics.cc), we also expose GCC builtins to D code via a
> > `gcc.builtins' module (d-builtins.cc), and give special treatment to
> > a number of UDAs that could be applied to functions (d-attribs.cc).
> >
> >
> > That is roughly the high level gist of how things are currently
> > organized.
> >
> > ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch
>
> Hi Iain.  I took a look at this patch, focusing on the diagnostics
> side of things.
>
> These are more suggestions than hard review blockers.
>
> diff --git a/gcc/d/d-attribs.c b/gcc/d/d-attribs.c
> new file mode 100644
> index 000..6c65b8cad9e
> --- /dev/null
> +++ b/gcc/d/d-attribs.c
>
> I believe all new C++ source files are meant to be .cc, rather than .c,
> so this should be d-attribs.cc, rather than d-attribs.c.
>
> [...snip...]
>
> diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
> new file mode 100644
> index 000..c698890ba07
> --- /dev/null
> +++ b/gcc/d/d-codegen.cc
>
> [...snip...]
>
> +/* Return the GCC location for the D frontend location LOC.   */
> +
> +location_t
> +get_linemap (const Loc& loc)
> +{
>
> I don't like the name "get_linemap", as it suggests to me that it's
> getting the "struct line_map *" for LOC, rather than a location_t.
>
> How about "get_location_t" instead, or "make_location_t"?  The latter
> feels more appropriate, as it's doing non-trivial work.
>

OK.

> +/* Rewrite the format string FORMAT to deal with any format extensions not
> +   supported by pp_format().  The result should be freed by the caller.  */
> +static char *
> +expand_format (const char *format)
>
> Am I right in thinking this is to handle FORMAT strings coming from the
> upstream D frontend, and this has its own formatting conventions?
> Please can the leading comment have example(s) of the format, and what
> it becomes (e.g. the backticks thing).
>
> Maybe adopt a naming convention in the file, to distinguish d format
> strings from pp format strings?  Maybe "d_format" vs "gcc_format"?
> (Though given the verbatim vs !verbatim below, I'm not sure how
> feasible that is.)
>
> Maybe rename this function to expand_d_format??
>
> (Might be nice to add some unittesting of this function via "selftest",
> but that's definitely not a requirement; it's awkward to add right now)
>

Yes, you are right, it's handling format specifiers coming from the
upstream front-end.

The only peculiar convention is the use of backticks to quote parts of
the message (dmd actually does syntax highlighting).

Others in this function just rewrite format specifiers that GCC
diagnostics don't understand.
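
The backtick convention mentioned above can be sketched as a small standalone function. This is an illustration only, not the actual expand_format from the patch: the name backticks_to_quotes is hypothetical, and it assumes the rewrite maps each backtick pair to GCC's %< and %> quoting directives, leaving everything else untouched.

```cpp
#include <string>

// Replace each pair of backticks with GCC's %< and %> quoting directives.
// Odd-numbered backticks open a quote, even-numbered ones close it.
std::string backticks_to_quotes(const std::string& d_format)
{
  std::string out;
  bool in_backticks = false;
  for (char c : d_format)
    {
      if (c == '`')
        {
          out += in_backticks ? "%>" : "%<";
          in_backticks = !in_backticks;
        }
      else
        out += c;
    }
  return out;
}
```

For example, "cannot call `foo` here" becomes "cannot call %<foo%> here". A real implementation would also have to decide what to do with an unmatched backtick and with the other format specifiers, which this sketch ignores.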

> +
> +void ATTRIBUTE_GCC_DIAG(2,0)
> +verror (const Loc& loc, const char *format, va_list ap,
> +   const char *p1, const char *p2, const char *)
>
> This one needs a leading comment: what's the deal with P1 and P2?
>

I'll rename them to prefix1 and prefix2 with appropriate comments.

> +
> +void ATTRIBUTE_GCC_DIAG(2,0)
> +vwarning (const Loc& loc, const char *format, va_list ap)
> +{
> +  if (global.params.warnings && !global.gag)
> +{
> +  /* Warnings don't count if gagged.  */
> +  if (global.params.warnings == 1)
>
> What's the magic "1" value above?  Can it be an enum?  (or is this
> something from the upstream parsing code?)
>

This is from upstream, there are a few places that use numbers instead
of enums.  This is one such place, I've not really given myself any
time to go through them and correct them.  If you feel strongly about
this, I can set aside some time to get this fixed in upstream.

> +

Re: [PATCH] Fix recent i386 regressions (was Re: [PATCH] i386: Also disable AVX512IFMA/AVX5124FMAPS/AVX5124VNNIW)

2018-10-15 Thread Jakub Jelinek
On Mon, Oct 15, 2018 at 05:57:18PM +0200, Uros Bizjak wrote:
> On Mon, Oct 15, 2018 at 5:49 PM Uros Bizjak  wrote:
> 
> > > Plus, I wonder if we shouldn't make it harder to run into these issues, by
> > > changing
> > > Target Report Mask(ISA_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> > > etc. to
> > > Target Report Mask(ISA2_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> > > so that we'll have OPTION_MASK_ISA2_AVX5124FMAPS macros instead of
> > > OPTION_MASK_ISA_AVX5124FMAPS and adjust all i386-common.c etc. uses from ISA
> > > to ISA2 for the ix86_isa_flags2 options.  Perhaps we could have
> > > #define TARGET_ISA_AVX5124FMAPS TARGET_ISA2_AVX5124FMAPS
> > > compatibility macro, because unlike the OPTION_MASK_* and TARGET_*_P macros
> > > where you need to specify the right flags the TARGET_* macros already have
> > > that in implicitly.  Uros, thoughts on this?
> >
> > I was looking for a mail where we discussed ix86_isa_flags2 as a
> > temporary solution, with the expectation that some other extensible
> > mechanism gets invented to handle ISA flags. Now we are in C++, and I
> > guess there should be a more elegant way to deal with the issue.
> 
> Maybe wide-int-bitmask.h can be used here, similar to how PTA_*
> defines are handled in i386.h?

Maybe, though I'm worried a lot about compile time performance,
we use TARGET_* macros everywhere.

Jakub


Re: [PATCH] Fix recent i386 regressions (was Re: [PATCH] i386: Also disable AVX512IFMA/AVX5124FMAPS/AVX5124VNNIW)

2018-10-15 Thread Uros Bizjak
On Mon, Oct 15, 2018 at 5:49 PM Uros Bizjak  wrote:

> > Plus, I wonder if we shouldn't make it harder to run into these issues, by
> > changing
> > Target Report Mask(ISA_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> > etc. to
> > Target Report Mask(ISA2_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> > so that we'll have OPTION_MASK_ISA2_AVX5124FMAPS macros instead of
> > OPTION_MASK_ISA_AVX5124FMAPS and adjust all i386-common.c etc. uses from ISA
> > to ISA2 for the ix86_isa_flags2 options.  Perhaps we could have
> > #define TARGET_ISA_AVX5124FMAPS TARGET_ISA2_AVX5124FMAPS
> > compatibility macro, because unlike the OPTION_MASK_* and TARGET_*_P macros
> > where you need to specify the right flags the TARGET_* macros already have
> > that in implicitly.  Uros, thoughts on this?
>
> I was looking for a mail where we discussed ix86_isa_flags2 as a
> temporary solution, with the expectation that some other extensible
> mechanism gets invented to handle ISA flags. Now we are in C++, and I
> guess there should be a more elegant way to deal with the issue.

Maybe wide-int-bitmask.h can be used here, similar to how PTA_*
defines are handled in i386.h?

Uros.


Re: [PATCH] Fix recent i386 regressions (was Re: [PATCH] i386: Also disable AVX512IFMA/AVX5124FMAPS/AVX5124VNNIW)

2018-10-15 Thread Uros Bizjak
On Mon, Oct 15, 2018 at 4:50 PM Jakub Jelinek  wrote:
>
> On Mon, Oct 15, 2018 at 04:22:04PM +0200, Richard Biener wrote:
> > On Sun, Oct 14, 2018 at 9:29 PM Uros Bizjak  wrote:
> > >
> > > On Sat, Oct 13, 2018 at 11:54 PM H.J. Lu  wrote:
> > > >
> > > > Also disable AVX512IFMA, AVX5124FMAPS and AVX5124VNNIW when disabling
> > > > AVX512F.
> > > >
> > > > gcc/
> > > >
> > > > PR target/87572
> > > > * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512F_UNSET):
> > > > Add OPTION_MASK_ISA_AVX512IFMA_UNSET,
> > > > OPTION_MASK_ISA_AVX5124FMAPS_UNSET and
> > > > OPTION_MASK_ISA_AVX5124VNNIW_UNSET.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR target/87572
> > > > * gcc.target/i386/pr87572.c: New test.
> > >
> > > LGTM.
> >
> > This caused a gazillion testsuite FAILs like
> >
> > FAIL: gcc.target/i386/isa-11.c (test for excess errors)
> > Excess errors:
> > /tmp/ccyurT91.s:8: Error: invalid instruction suffix for `push'
> > /tmp/ccyurT91.s:14: Error: invalid instruction suffix for `pop'
> >
> > where we now emit pushl in 64bit mode.
>
> That change was incorrect, avx5124fmaps and avx5124vnniw flags are
> isa_flags2, rather than isa_flags, and are handled already properly:
> #define OPTION_MASK_ISA2_AVX512F_UNSET \
>   (OPTION_MASK_ISA_AVX5124FMAPS_UNSET | OPTION_MASK_ISA_AVX5124VNNIW_UNSET)
>
> So I think we need at least following patch, ok for trunk?
>
> Plus, I wonder if we shouldn't make it harder to run into these issues, by
> changing
> Target Report Mask(ISA_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> etc. to
> Target Report Mask(ISA2_AVX5124FMAPS) Var(ix86_isa_flags2) Save
> so that we'll have OPTION_MASK_ISA2_AVX5124FMAPS macros instead of
> OPTION_MASK_ISA_AVX5124FMAPS and adjust all i386-common.c etc. uses from ISA
> to ISA2 for the ix86_isa_flags2 options.  Perhaps we could have
> #define TARGET_ISA_AVX5124FMAPS TARGET_ISA2_AVX5124FMAPS
> compatibility macro, because unlike the OPTION_MASK_* and TARGET_*_P macros
> where you need to specify the right flags the TARGET_* macros already have
> that in implicitly.  Uros, thoughts on this?

I was looking for a mail where we discussed ix86_isa_flags2 as a
temporary solution, with the expectation that some other extensible
mechanism gets invented to handle ISA flags. Now we are in C++, and I
guess there should be a more elegant way to deal with the issue.
>
> 2018-10-15  Jakub Jelinek  
>
> PR target/87572
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512F_UNSET):
> Remove OPTION_MASK_ISA_AVX5124FMAPS_UNSET and
> OPTION_MASK_ISA_AVX5124VNNIW_UNSET.

Yes, please go ahead with this patch.

Thanks,
Uros.

> --- gcc/common/config/i386/i386-common.c.jj 2018-10-15 16:27:59.214107805 +0200
> +++ gcc/common/config/i386/i386-common.c 2018-10-15 16:30:30.750564097 +0200
> @@ -195,8 +195,6 @@ along with GCC; see the file COPYING3.
> | OPTION_MASK_ISA_AVX512PF_UNSET | OPTION_MASK_ISA_AVX512ER_UNSET \
> | OPTION_MASK_ISA_AVX512DQ_UNSET | OPTION_MASK_ISA_AVX512BW_UNSET \
> | OPTION_MASK_ISA_AVX512VL_UNSET | OPTION_MASK_ISA_AVX512IFMA_UNSET \
> -   | OPTION_MASK_ISA_AVX5124FMAPS_UNSET \
> -   | OPTION_MASK_ISA_AVX5124VNNIW_UNSET \
> | OPTION_MASK_ISA_AVX512VBMI2_UNSET \
> | OPTION_MASK_ISA_AVX512VNNI_UNSET \
> | OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET \
>
>
> Jakub


Re: [PATCH] Add option to control warnings added through attribute "warning"

2018-10-15 Thread Nikolai Merinov

Hi Martin,

On 10/15/18 6:20 PM, Martin Sebor wrote:

On 10/15/2018 01:55 AM, Nikolai Merinov wrote:

Hi Martin,

On 10/12/18 9:58 PM, Martin Sebor wrote:

On 10/12/2018 04:14 AM, Nikolai Merinov wrote:

Hello,

In https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html I
suggested a patch that makes it possible to control the behavior of
"__attribute__((warning))" when the "-Werror" option is enabled. Usage
example:


#include 
int a() __attribute__((warning("Warning: `a' was used")));
int a() { return 1; }
int main () { return a(); }



$ gcc -Werror test.c
test.c: In function ‘main’:
test.c:4:22: error: call to ‘a’ declared with attribute warning:
Warning: `a' was used [-Werror]
 int main () { return a(); }
  ^
cc1: all warnings being treated as errors
$ gcc -Werror -Wno-error=warning-attribute test.c
test.c: In function ‘main’:
test.c:4:22: warning: call to ‘a’ declared with attribute warning:
Warning: `a' was used
 int main () { return a(); }
  ^

Can you provide any feedback on suggested changes?


It seems like a useful feature and in line with the philosophy
that distinct warnings should be controlled by their own options.

I would only suggest to consider changing the name to
-Wattribute-warning, because it applies specifically to that
attribute (as opposed to warnings about attributes in general).

There are many attributes in GCC and diagnosing problems that
are unique to each, under the same -Wattributes option, is
becoming too coarse and overly limiting.  To make it more
flexible, I expect new options will need to be introduced,
such as -Wattribute-alias (to control aspects of the alias
attribute and others related to it), or -Wattribute-const
(to control diagnostics about functions declared with
attribute const that violate the attribute's constraints).

An alternative might be to introduce a single -Wattribute=
 option where the  gives
the names of all the distinct attributes whose unique
diagnostics one might need to control.

Martin


Currently there are several styles already in use:

-Wattribute-alias, where the word "attribute" is used as a prefix for the attribute name,
-Wsuggest-attribute=[pure|const|noreturn|format|malloc], where the attribute name 
is passed as an argument,
-Wmissing-format-attribute, where the word "attribute" is used as a suffix,
-Wdeprecated-declarations, where the word "attribute" is not used at all, even though this 
warning option was created specifically for the "deprecated" attribute.

I changed the name to "-Wattribute-warning" as you suggested, but unifying the style 
of all attribute-related warnings looks like a separate activity. Please check the new 
patch in the attachments.



Thanks for the survey!  I agree that making the existing options
consistent (if that's what we want) should be done separately.

Martin

PS It doesn't look like your latest attachments made it to
the list.


Thank you for mentioning it. That was my mistake; the patch is attached now.



Updated changelog:

gcc/Changelog

2018-10-14  Nikolai Merinov 

 * gcc/common.opt: Add -Wattribute-warning.
 * gcc/doc/invoke.texi: Add documentation for -Wno-attribute-warning.
 * gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
 * gcc/expr.c (expand_expr_real_1): Add new attribute to warning_at
 call to allow user configure behavior of "warning" attribute


Index: gcc/common.opt
===
--- gcc/common.opt	(revision 265156)
+++ gcc/common.opt	(working copy)
@@ -571,6 +571,10 @@ Wcpp
 Common Var(warn_cpp) Init(1) Warning
 Warn when a #warning directive is encountered.
 
+Wattribute-warning
+Common Var(warn_attribute_warning) Init(1) Warning
+Warn about uses of __attribute__((warning)) declarations.
+
 Wdeprecated-declarations
 Common Var(warn_deprecated_decl) Init(1) Warning
 Warn about uses of __attribute__((deprecated)) declarations.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 265156)
+++ gcc/doc/invoke.texi	(working copy)
@@ -291,6 +291,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wclobbered  -Wcomment  -Wconditionally-supported @gol
 -Wconversion  -Wcoverage-mismatch  -Wno-cpp  -Wdangling-else  -Wdate-time @gol
 -Wdelete-incomplete @gol
+-Wno-attribute-warning @gol
 -Wno-deprecated  -Wno-deprecated-declarations  -Wno-designated-init @gol
 -Wdisabled-optimization @gol
 -Wno-discarded-qualifiers  -Wno-discarded-array-qualifiers @gol
@@ -6950,6 +6951,15 @@ confused with the digit 0, and so is not the defau
 useful as a local coding convention if the programming environment 
 cannot be fixed to display these characters distinctly.
 
+@item -Wno-attribute-warning
+@opindex Wno-attribute-warning
+@opindex Wattribute-warning
+Do not warn about usage of functions (@pxref{Function Attributes})
+declared with @code{warning} attribute. By default, this warning is
+enabled.  @option{-Wno-attribute-warning} can be used to disable the
+warning or @op

Re: [PATCH v2] fixincludes: vxworks: regs.h: Fix includes in regs.h wrapper

2018-10-15 Thread Olivier Hainque
Hi Rasmus,

> On 8 Oct 2018, at 15:03, Rasmus Villemoes  wrote:
> 
> fixincludes/
> 
>   * inclhack.def (AAB_vxworks_regs_vxtypes): Add unconditional
>   include of vxCpu.h, guard include of vxTypesOld.h by
>   !_ASMLANGUAGE.
>   * fixincl.x: Regenerate.

Good for me, thanks.




Re: [PATCH, GCC/ARM] Fix PR87374: ICE with -mslow-flash-data and -mword-relocations

2018-10-15 Thread Thomas Preudhomme
Ping?

Best regards,

Thomas
On Fri, 5 Oct 2018 at 17:50, Thomas Preudhomme
 wrote:
>
> Hi Ramana and Kyrill,
>
> I've reworked the patch to add some documentation of the option
> conflict and reworked the -mword-relocation logic slightly to set the
> variable explicitly in PIC mode rather than test for PIC and word
> relocation everywhere.
>
> ChangeLog entries are now as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-10-02  Thomas Preud'homme  
>
> PR target/87374
> * config/arm/arm.c (arm_option_check_internal): Disable the combined
> use of -mslow-flash-data and -mword-relocations.
> (arm_option_override): Enable -mword-relocations if -fpic or -fPIC.
> * config/arm/arm.md (SYMBOL_REF MOVT splitter): Stop checking for
> flag_pic.
> * doc/invoke.texi (-mword-relocations): Mention conflict with
> -mslow-flash-data.
> (-mslow-flash-data): Reciprocally.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-09-25  Thomas Preud'homme  
>
> PR target/87374
> * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data and
> -mword-relocations would be passed when compiling the test.
> * gcc.target/arm/movsi_movt.c: Likewise.
> * gcc.target/arm/pr81863.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas
>
> On Tue, 2 Oct 2018 at 13:39, Ramana Radhakrishnan
>  wrote:
> >
> > On 02/10/2018 11:42, Thomas Preudhomme wrote:
> > > Hi Ramana,
> > >
> > > On Thu, 27 Sep 2018 at 11:14, Ramana Radhakrishnan
> > >  wrote:
> > >>
> > >> On 27/09/2018 09:26, Kyrill Tkachov wrote:
> > >>> Hi Thomas,
> > >>>
> > >>> On 26/09/18 18:39, Thomas Preudhomme wrote:
> >  Hi,
> > 
> >  GCC ICEs under -mslow-flash-data and -mword-relocations because there
> >  is no way to load an address, both literal pools and MOVW/MOVT being
> >  forbidden. This patch gives an error message when both options are
> >  specified by the user and adds the according dg-skip-if directives for
> >  tests that use either of these options.
> > 
> >  ChangeLog entries are as follows:
> > 
> >  *** gcc/ChangeLog ***
> > 
> >  2018-09-25  Thomas Preud'homme  
> > 
> > PR target/87374
> > * config/arm/arm.c (arm_option_check_internal): Disable the 
> >  combined
> > use of -mslow-flash-data and -mword-relocations.
> > 
> >  *** gcc/testsuite/ChangeLog ***
> > 
> >  2018-09-25  Thomas Preud'homme  
> > 
> > PR target/87374
> > * gcc.target/arm/movdi_movt.c: Skip if both -mslow-flash-data 
> >  and
> > -mword-relocations would be passed when compiling the test.
> > * gcc.target/arm/movsi_movt.c: Likewise.
> > * gcc.target/arm/pr81863.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-1.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-2.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> > * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> > * gcc.target/arm/tls-disable-literal-pool.c: Likewise.
> > 
> > 
> >  Testing: Bootstrapped in Thumb-2 mode. No testsuite regression when
> >  targeting arm-none-eabi. Modified tests get skipped as expected when
> >  running the testsuite with -mslow-flash-data (pr81863.c) or
> >  -mword-relocations (all the others).
> > 
> > 
> >  Is this ok for trunk? I'd also appreciate guidance on whether this is
> >  worth a backport. It's a simple patch but on the other hand it only
> >  prevents some option combination, it does not fix anything so I have
> >  mixed feelings.
> > >>>
> > >>> In my opinion -mslow-flash-data is more of a tuning option rather than 
> > >>> a security/ABI feature
> > >>> and therefore erroring out on its combination with -mword-relocations 
> > >>> feels odd.
> > >>> I'm leaning more towards making -mword-relocations or any other option 
> > >>> that really requires constant pools
> > >>> to bypass/disable the effects of -mslow-flash-data instead.
> > >>
> > >> -mslow-flash-data and -mword-relocations are contradictory in their
> > >> expectations. mslow-flash-data is for not putting anything in the
> > >> literal pool whereas mword-relocations is purely around the use of movw
> > >> / movt instructions for word sized values. I wish we had called
> > >> -mslow-flash-data something else (probably -mno-literal-pools).
> > >> -mslow-flash-data is used primarily by M-profile users and
> > >> -mword-

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Joseph Myers
On Mon, 15 Oct 2018, Richard Sandiford wrote:

> The patches therefore add a new "__sizeless_struct" keyword to denote
> structures that are sizeless rather than sized.  Unlike normal
> structures, these structures can have members of sizeless type in
> addition to members of sized type.  On the other hand, they have all
> the same limitations as other sizeless types (described in earlier
> sections).

I don't see anything here disallowing offsetof on such structures.

> Edits to the C standard
> ===
> 
> This section specifies the behaviour for sizeless types as an edit to N1570.

That's a very old standard version.

I'm not in Pittsburgh this week, but I don't see anything to do with these 
ideas on the agenda.  I haven't seen any contributions from Arm to the 
ongoing discussions on the WG14 reflector that include issues relating to 
possibly runtime sized types (vectors, bignums, types representing 
information about another type, for example), unless they're using 
not-obviously-Arm email addresses.  Is Arm going to be engaging in those 
discussions and working with people interested in these areas to produce 
proposals that take account of the different ideas people have for use of 
non-VLA types that may not have a compile-time-constant size (some of 
which may not end up in the C standard, of course)?  (It might of course 
require multiple papers, e.g. starting with fixed-width vector types which 
as a widely-implemented feature are something it might be natural to 
consider for C2x.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Fix recent i386 regressions (was Re: [PATCH] i386: Also disable AVX512IFMA/AVX5124FMAPS/AVX5124VNNIW)

2018-10-15 Thread Jakub Jelinek
On Mon, Oct 15, 2018 at 04:22:04PM +0200, Richard Biener wrote:
> On Sun, Oct 14, 2018 at 9:29 PM Uros Bizjak  wrote:
> >
> > On Sat, Oct 13, 2018 at 11:54 PM H.J. Lu  wrote:
> > >
> > > Also disable AVX512IFMA, AVX5124FMAPS and AVX5124VNNIW when disabling
> > > AVX512F.
> > >
> > > gcc/
> > >
> > > PR target/87572
> > > * common/config/i386/i386-common.c 
> > > (OPTION_MASK_ISA_AVX512F_UNSET):
> > > Add OPTION_MASK_ISA_AVX512IFMA_UNSET,
> > > OPTION_MASK_ISA_AVX5124FMAPS_UNSET and
> > > OPTION_MASK_ISA_AVX5124VNNIW_UNSET.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/87572
> > > * gcc.target/i386/pr87572.c: New test.
> >
> > LGTM.
> 
> This caused a gazillion testsuite FAILs like
> 
> FAIL: gcc.target/i386/isa-11.c (test for excess errors)
> Excess errors:
> /tmp/ccyurT91.s:8: Error: invalid instruction suffix for `push'
> /tmp/ccyurT91.s:14: Error: invalid instruction suffix for `pop'
> 
> where we now emit pushl in 64bit mode.

That change was incorrect, avx5124fmaps and avx5124vnniw flags are
isa_flags2, rather than isa_flags, and are handled already properly:
#define OPTION_MASK_ISA2_AVX512F_UNSET \
  (OPTION_MASK_ISA_AVX5124FMAPS_UNSET | OPTION_MASK_ISA_AVX5124VNNIW_UNSET)

So I think we need at least following patch, ok for trunk?

Plus, I wonder if we shouldn't make it harder to run into these issues, by
changing
Target Report Mask(ISA_AVX5124FMAPS) Var(ix86_isa_flags2) Save
etc. to
Target Report Mask(ISA2_AVX5124FMAPS) Var(ix86_isa_flags2) Save
so that we'll have OPTION_MASK_ISA2_AVX5124FMAPS macros instead of
OPTION_MASK_ISA_AVX5124FMAPS and adjust all i386-common.c etc. uses from ISA
to ISA2 for the ix86_isa_flags2 options.  Perhaps we could have
#define TARGET_ISA_AVX5124FMAPS TARGET_ISA2_AVX5124FMAPS
compatibility macro, because unlike the OPTION_MASK_* and TARGET_*_P macros
where you need to specify the right flags the TARGET_* macros already have
that in implicitly.  Uros, thoughts on this?
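
The mix-up described above can be sketched in a few lines of C.  This
is only an illustration under stated assumptions: the bit positions are
made up (the real OPTION_MASK_* values are generated from i386.opt and
are not shown in this thread), and the names isa_state, MASK_* and
disable_avx512f_* are hypothetical.  The point is that the two flag
words number their bits independently, so a mask that belongs to
ix86_isa_flags2 can share its numeric value with an unrelated bit of
ix86_isa_flags.  The collision with a "64-bit" bit is chosen here only
because it would be consistent with the pushl symptom; the thread does
not say which bit was actually hit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, NOT the real GCC values.  */
#define MASK_ISA_64BIT          (UINT64_C (1) << 1)  /* belongs to flags  */
#define MASK_ISA_AVX512F        (UINT64_C (1) << 5)  /* belongs to flags  */
#define MASK_ISA2_AVX5124FMAPS  (UINT64_C (1) << 1)  /* belongs to flags2 */

/* Toy stand-in for the pair ix86_isa_flags / ix86_isa_flags2.  */
struct isa_state
{
  uint64_t flags;
  uint64_t flags2;
};

/* Buggy variant: a flags2 mask is folded into the mask applied to
   flags, so the bit with the same numeric value in flags is cleared.  */
static struct isa_state
disable_avx512f_buggy (struct isa_state s)
{
  s.flags &= ~(MASK_ISA_AVX512F | MASK_ISA2_AVX5124FMAPS);
  return s;
}

/* Fixed variant: each mask is applied only to the word it belongs to.  */
static struct isa_state
disable_avx512f_fixed (struct isa_state s)
{
  s.flags &= ~MASK_ISA_AVX512F;
  s.flags2 &= ~MASK_ISA2_AVX5124FMAPS;
  return s;
}
```

Giving the second word's masks a distinct ISA2 prefix, as suggested
above, would make the buggy variant stand out when reading the code.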

2018-10-15  Jakub Jelinek  

PR target/87572
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512F_UNSET):
Remove OPTION_MASK_ISA_AVX5124FMAPS_UNSET and
OPTION_MASK_ISA_AVX5124VNNIW_UNSET.

--- gcc/common/config/i386/i386-common.c.jj 2018-10-15 16:27:59.214107805 +0200
+++ gcc/common/config/i386/i386-common.c 2018-10-15 16:30:30.750564097 +0200
@@ -195,8 +195,6 @@ along with GCC; see the file COPYING3.
| OPTION_MASK_ISA_AVX512PF_UNSET | OPTION_MASK_ISA_AVX512ER_UNSET \
| OPTION_MASK_ISA_AVX512DQ_UNSET | OPTION_MASK_ISA_AVX512BW_UNSET \
| OPTION_MASK_ISA_AVX512VL_UNSET | OPTION_MASK_ISA_AVX512IFMA_UNSET \
-   | OPTION_MASK_ISA_AVX5124FMAPS_UNSET \
-   | OPTION_MASK_ISA_AVX5124VNNIW_UNSET \
| OPTION_MASK_ISA_AVX512VBMI2_UNSET \
| OPTION_MASK_ISA_AVX512VNNI_UNSET \
| OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET \


Jakub


[09/10] C support for sizeless types

2018-10-15 Thread Richard Sandiford
This patch adds support for sizeless types to C, along the lines
described in the covering RFC.  The patch is actually a squash
of 26 patches that I've attached as a tarball, with each patch
building up the support piece-by-piece.  The individual patches
say which part of the standard they relate to and add associated
tests to gcc.dg/sizeless-1.c.

2018-10-15  Richard Sandiford  

gcc/c-family/
* c-common.h (RID_SIZELESS_STRUCT): New rid enum.
(DEFINITE_OR_VOID_TYPE_P): New macro.
(DEFINITE_OR_UNBOUND_ARRAY_TYPE_P): Likewise.
* c-common.c (c_common_reswords): Add __sizeless_struct.
(complete_size_in_bytes): New function.
(pointer_int_sum): Use it instead of size_in_bytes_loc.
(c_alignof_expr): Pass sizeless types directly to c_alignof.

gcc/c/
* c-tree.h (C_TYPE_INCOMPLETE_VARS): Rename to...
(C_TYPE_INDEFINITE_VARS): ...this.
(start_struct, parser_xref_tag): Add a bool parameter.
(require_complete_type): Add a default false parameter.
(require_definite_type): New function.
* c-decl.c (pushdecl): Use DEFINITE_TYPE_P instead of
COMPLETE_TYPE_P.  Update after above name change.
(lookup_tag): Add a sizeless_p parameter.  Check that it
matches the TYPE_SIZELESS_P field of any existing type with
the same name.
(shadow_tag_warned): Update call accordingly.
(start_decl): Require variables to have definite rather than
complete type.  Don't reject initializers for variable-sized
objects with sizeless type.
(finish_decl): Reject sizeless objects with static or
thread-local storage duration.
(build_compound_literal): Require compound literals to have
definite rather than complete type.
(grokdeclarator): Likewise fields.
(grokparms): Likewise function parameters.
(parser_xref_tag): Add a sizeless_p parameter.  Update call to
lookup_tag.  Initialize TYPE_SIZELESS_P when creating a new type.
(start_struct): Likewise.
(xref_tag): Update call to parser_xref_tag.
(finish_struct): Reject flexible array members in sizeless
structures.  Reject sizeless fields in sized aggregates.
Update after above name change.
(start_enum): Update call to lookup_tag.
(start_function): Require the return type to be definite rather
than complete.
(store_parm_decls_oldstyle): Likewise function parameters.
* c-parser.c (c_parser_declspecs): Handle RID_SIZELESS_STRUCT.
(c_parser_struct_or_union_specifier): Likewise.  Update calls
to start_struct and parser_xref_tag.
(c_parser_enum_specifier): Update call to parser_xref_tag.
(c_parser_generic_selection): Require the type in a _Generic
association to be definite rather than complete.
(c_parser_objc_selector): Handle RID_SIZELESS_STRUCT (but
commented out).
(c_parser_omp_threadprivate): Add a comment.
* c-typeck.c (require_complete_type): Add an allow_sizeless_p
parameter.
(default_conversion): Require the type to be definite rather than
complete when processing the expression being converted.
(build_component_ref): Likewise when processing member accesses.
(build_indirect_ref): Likewise when processing pointer dereferences.
(build_function_call_vec): Likewise when processing the return
type of a function call.
(convert_arguments): Likewise when processing the types of the
formal and actual parameters.
(build_unary_op): Likewise when processing the operand of a
unary operation.
(build_c_cast): Likewise when processing the value being cast.
(build_modify_expr): Likewise when processing the lhs of an
assignment.
(convert_for_assignment): Likewise when processing the rhs
of an assignment.
(c_process_expr_stmt): Likewise when processing the type of
a statement expression.
(c_build_va_arg): Likewise when processing the second argument
of a va_arg call.
(c_build_qualified_type): Update after above name change.

gcc/objc/
* objc-runtime-shared-support.c (objc_start_struct): Update call
to start_struct.

gcc/objcp/
* objcp-decl.h (start_struct): Add a sizeless_p parameter.

gcc/testsuite/
* gcc.dg/sizeless-1.c: New test.

Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:28.592266684 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:32.988230244 +0100
@@ -104,6 +104,7 @@ enum rid
   RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_TGMATH,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
+  RID_SIZELESS_STRUCT,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
   RID_F

[08/10] Add a TYPE_SIZELESS_P property to types

2018-10-15 Thread Richard Sandiford
This patch adds a bit to tree_type_common (which still has plenty
of bits spare) to indicate whether the type has a measurable size
at the language level once fully-defined.

2018-10-15  Richard Sandiford  

gcc/
* tree-core.h (tree_type_common::sizeless): New bitfield.
(tree_type_common::spare): Reduce size by one bit.
* tree.h (TYPE_SIZELESS_P): New macro.

gcc/c-family/
* c-common.h (COMPLETE_TYPE_P): Check TYPE_SIZELESS_P.

Index: gcc/tree-core.h
===
--- gcc/tree-core.h 2018-10-05 13:46:11.195787561 +0100
+++ gcc/tree-core.h 2018-10-15 14:13:28.592266684 +0100
@@ -1534,7 +1534,8 @@ struct GTY(()) tree_type_common {
   unsigned warn_if_not_align : 6;
   unsigned typeless_storage : 1;
   unsigned empty_flag : 1;
-  unsigned spare : 17;
+  unsigned sizeless : 1;
+  unsigned spare : 16;
 
   alias_set_type alias_set;
   tree pointer_to;
Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-15 14:13:18.280352163 +0100
+++ gcc/tree.h  2018-10-15 14:13:28.596266651 +0100
@@ -688,6 +688,13 @@ #define TRANSLATION_UNIT_WARN_EMPTY_P(NO
 /* Nonzero if this type is "empty" according to the particular psABI.  */
 #define TYPE_EMPTY_P(NODE) (TYPE_CHECK (NODE)->type_common.empty_flag)
 
+/* True if this type is "sizeless" according to the SVE extensions to
+   C and C++.  Sizeless types have no measurable size or alignment at
+   the language level, even when the types have been fully defined
+   (are "definite").  Size, alignment and layout are instead decided
+   by the ABI.  */
+#define TYPE_SIZELESS_P(NODE) (TYPE_CHECK (NODE)->type_common.sizeless)
+
 /* Used to indicate that this TYPE represents a compiler-generated entity.  */
 #define TYPE_ARTIFICIAL(NODE) (TYPE_CHECK (NODE)->base.nowarning_flag)
 
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:22.880314033 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:28.592266684 +0100
@@ -727,7 +727,8 @@ enum cxx_dialect {
 extern bool done_lexing;
 
 /* Nonzero if this type is a complete type.  */
-#define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
+#define COMPLETE_TYPE_P(NODE) \
+  (TYPE_SIZE (NODE) != NULL_TREE && !TYPE_SIZELESS_P (NODE))
 
 /* C types are partitioned into three subsets: object, function, and
incomplete types.  */
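
As a rough illustration of the distinction this patch draws, here is a
minimal C sketch using a toy type node rather than GCC's real
tree_type_common.  The names toy_type_node, toy_definite_type_p and
toy_complete_type_p are hypothetical; the two predicates mirror the
DEFINITE_TYPE_P definition from earlier in the series and the
COMPLETE_TYPE_P definition from this patch.  One bit is carved out of
the spare bitfield, and a type counts as complete only if it has a
size and is not marked sizeless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a type node.  */
struct toy_type_node
{
  const void *size;        /* NULL until the type is fully defined */
  unsigned sizeless : 1;   /* new flag, taken from the spare bits */
  unsigned spare : 16;     /* one bit fewer than before */
};

/* "Definite": fully defined, whether sized or sizeless.  */
static bool
toy_definite_type_p (const struct toy_type_node *t)
{
  return t->size != NULL;
}

/* "Complete" in the C/C++ sense: definite and not sizeless.  */
static bool
toy_complete_type_p (const struct toy_type_node *t)
{
  return t->size != NULL && !t->sizeless;
}
```

With these definitions an ordinary defined type is both definite and
complete, a defined sizeless type is definite but not complete, and a
type that has not been defined yet is neither.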


[07/10] Use COMPLETE_TYPE_P instead of TYPE_SIZE

2018-10-15 Thread Richard Sandiford
This patch makes a couple of c-family macros use COMPLETE_TYPE_P instead
of TYPE_SIZE, so that the definitions more clearly correspond to the
names of the macros.

2018-10-15  Richard Sandiford  

gcc/c-family/
* c-common.h (C_TYPE_OBJECT_P, C_TYPE_INCOMPLETE_P): Test
COMPLETE_TYPE_P instead of TYPE_SIZE.

Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:18.280352163 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:22.880314033 +0100
@@ -732,10 +732,10 @@ #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE
 /* C types are partitioned into three subsets: object, function, and
incomplete types.  */
 #define C_TYPE_OBJECT_P(type) \
-  (TREE_CODE (type) != FUNCTION_TYPE && TYPE_SIZE (type))
+  (TREE_CODE (type) != FUNCTION_TYPE && COMPLETE_TYPE_P (type))
 
 #define C_TYPE_INCOMPLETE_P(type) \
-  (TREE_CODE (type) != FUNCTION_TYPE && TYPE_SIZE (type) == 0)
+  (TREE_CODE (type) != FUNCTION_TYPE && !COMPLETE_TYPE_P (type))
 
 #define C_TYPE_FUNCTION_P(type) \
   (TREE_CODE (type) == FUNCTION_TYPE)


[06/10] Move COMPLETE_TYPE_P to the C and C++ frontends

2018-10-15 Thread Richard Sandiford
After previous patches there are no more uses of COMPLETE_TYPE_P outside
the frontends.  This patch moves the definition to c-common.h.

2018-10-15  Richard Sandiford  

gcc/
* tree.h (COMPLETE_TYPE_P): Move to c-common.h.

gcc/c-family/
* c-common.h (COMPLETE_TYPE_P): Moved from tree.h.
* c-ada-spec.c: Include c-common.h.

Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-15 14:13:13.584391090 +0100
+++ gcc/tree.h  2018-10-15 14:13:18.280352163 +0100
@@ -596,9 +596,6 @@ #define FUNCTION_POINTER_TYPE_P(TYPE) \
to create objects of that type.  The type might be sized or sizeless.  */
 #define DEFINITE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
 
-/* Nonzero if this type is a complete type.  */
-#define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
-
 /* Nonzero if this type is the (possibly qualified) void type.  */
 #define VOID_TYPE_P(NODE) (TREE_CODE (NODE) == VOID_TYPE)
 
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:13.584391090 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:18.280352163 +0100
@@ -726,6 +726,9 @@ enum cxx_dialect {
 
 extern bool done_lexing;
 
+/* Nonzero if this type is a complete type.  */
+#define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
+
 /* C types are partitioned into three subsets: object, function, and
incomplete types.  */
 #define C_TYPE_OBJECT_P(type) \
Index: gcc/c-family/c-ada-spec.c
===
--- gcc/c-family/c-ada-spec.c   2018-10-05 13:46:08.28787 +0100
+++ gcc/c-family/c-ada-spec.c   2018-10-15 14:13:18.280352163 +0100
@@ -27,6 +27,7 @@ Software Foundation; either version 3, o
 #include "c-ada-spec.h"
 #include "fold-const.h"
 #include "c-pragma.h"
+#include "c-common.h"
 #include "diagnostic.h"
 #include "stringpool.h"
 #include "attribs.h"


[05/10] Move complete_or_array_type_p to the C and C++ frontends

2018-10-15 Thread Richard Sandiford
complete_or_array_type_p was defined in tree.h but unused outside
the frontends.  This patch moves it to c-common.h.

2018-10-15  Richard Sandiford  

gcc/
* tree.h (complete_or_array_type_p): Move to c-common.h.

gcc/c-family/
* c-common.h (complete_or_array_type_p): Moved from tree.h.

Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-15 14:13:08.520433065 +0100
+++ gcc/tree.h  2018-10-15 14:13:13.584391090 +0100
@@ -4859,17 +4859,6 @@ ptrofftype_p (tree type)
  && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (sizetype));
 }
 
-/* Return true if the argument is a complete type or an array
-   of unknown bound (whose type is incomplete but) whose elements
-   have complete type.  */
-static inline bool
-complete_or_array_type_p (const_tree type)
-{
-  return COMPLETE_TYPE_P (type)
- || (TREE_CODE (type) == ARRAY_TYPE
-&& COMPLETE_TYPE_P (TREE_TYPE (type)));
-}
-
 /* Return true if the value of T could be represented as a poly_widest_int.  */
 
 inline bool
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:08.516433099 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:13.584391090 +0100
@@ -750,6 +750,17 @@ #define COMPLETE_OR_VOID_TYPE_P(NODE) \
 #define COMPLETE_OR_UNBOUND_ARRAY_TYPE_P(NODE) \
   (COMPLETE_TYPE_P (TREE_CODE (NODE) == ARRAY_TYPE ? TREE_TYPE (NODE) : (NODE)))
 
+/* Return true if the argument is a complete type or an array
+   of unknown bound (whose type is incomplete but) whose elements
+   have complete type.  */
+static inline bool
+complete_or_array_type_p (const_tree type)
+{
+  return COMPLETE_TYPE_P (type)
+ || (TREE_CODE (type) == ARRAY_TYPE
+&& COMPLETE_TYPE_P (TREE_TYPE (type)));
+}
+
 struct visibility_flags
 {
   unsigned inpragma : 1;   /* True when in #pragma GCC visibility.  */


[04/10] Move COMPLETE_OR_UNBOUND_ARRAY_TYPE_P to the C and C++ frontends

2018-10-15 Thread Richard Sandiford
There was only one use of COMPLETE_OR_UNBOUND_ARRAY_TYPE_P outside the
frontends, in expr.c.  This patch expands the macro there and moves the
macro's definition to c-common.h.

It feels a bit odd that we still have decls with no layout at
this late stage, but that's a separate issue...
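
For readers unfamiliar with the macro, the check it performs can be
sketched in plain C as follows.  This is a toy model, not GCC code:
toy_node, toy_code and toy_complete_or_unbound_array_p are hypothetical
names, and a single has_size flag stands in for TYPE_SIZE.  The idea is
that an array of unknown bound has no size of its own, so the test
looks through one array level and asks the question of the element
type instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_code { TOY_SCALAR, TOY_RECORD, TOY_ARRAY };

/* Toy stand-in for a type node.  */
struct toy_node
{
  enum toy_code code;
  const struct toy_node *elem;  /* element type, for arrays only */
  bool has_size;                /* stands in for TYPE_SIZE != NULL */
};

/* True if T has a size, or if T is an array whose element type has
   a size (for example an array with unspecified bound).  */
static bool
toy_complete_or_unbound_array_p (const struct toy_node *t)
{
  const struct toy_node *probe = (t->code == TOY_ARRAY) ? t->elem : t;
  return probe->has_size;
}
```

An "int[]" declaration would pass this test because int has a size,
while an array of a forward-declared struct would not.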

2018-10-15  Richard Sandiford  

gcc/
* tree.h (COMPLETE_OR_UNBOUND_ARRAY_TYPE_P): Move to c-common.h.
* expr.c (expand_expr_real_1): Expand use of
COMPLETE_OR_UNBOUND_ARRAY_TYPE_P here.

gcc/c-family/
* c-common.h (COMPLETE_OR_UNBOUND_ARRAY_TYPE_P): New macro,
moved from tree.h.

Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-15 14:13:04.148469305 +0100
+++ gcc/tree.h  2018-10-15 14:13:08.520433065 +0100
@@ -602,10 +602,6 @@ #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE
 /* Nonzero if this type is the (possibly qualified) void type.  */
 #define VOID_TYPE_P(NODE) (TREE_CODE (NODE) == VOID_TYPE)
 
-/* Nonzero if this type is complete or is an array with unspecified bound.  */
-#define COMPLETE_OR_UNBOUND_ARRAY_TYPE_P(NODE) \
-  (COMPLETE_TYPE_P (TREE_CODE (NODE) == ARRAY_TYPE ? TREE_TYPE (NODE) : (NODE)))
-
 #define FUNC_OR_METHOD_TYPE_P(NODE) \
   (TREE_CODE (NODE) == FUNCTION_TYPE || TREE_CODE (NODE) == METHOD_TYPE)
 
Index: gcc/expr.c
===
--- gcc/expr.c  2018-10-15 14:12:54.040553089 +0100
+++ gcc/expr.c  2018-10-15 14:13:08.520433065 +0100
@@ -9884,7 +9884,8 @@ expand_expr_real_1 (tree exp, rtx target
   /* If a static var's type was incomplete when the decl was written,
 but the type is complete now, lay out the decl now.  */
   if (DECL_SIZE (exp) == 0
- && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
+ && DEFINITE_TYPE_P (TREE_CODE (type) == ARRAY_TYPE
+ ? TREE_TYPE (type) : type)
  && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
layout_decl (exp, 0);
 
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:13:04.148469305 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:08.516433099 +0100
@@ -746,6 +746,10 @@ #define C_TYPE_OBJECT_OR_INCOMPLETE_P(ty
 #define COMPLETE_OR_VOID_TYPE_P(NODE) \
   (COMPLETE_TYPE_P (NODE) || VOID_TYPE_P (NODE))
 
+/* Nonzero if this type is complete or is an array with unspecified bound.  */
+#define COMPLETE_OR_UNBOUND_ARRAY_TYPE_P(NODE) \
+  (COMPLETE_TYPE_P (TREE_CODE (NODE) == ARRAY_TYPE ? TREE_TYPE (NODE) : (NODE)))
+
 struct visibility_flags
 {
   unsigned inpragma : 1;   /* True when in #pragma GCC visibility.  */


[03/10] Move COMPLETE_OR_VOID_TYPE_P to the C and C++ frontends

2018-10-15 Thread Richard Sandiford
There was only one use of this macro outside the frontends, in dbxout.c.
This patch expands that use and moves the macro's definition to c-common.h.

There's no expectation that dbx will support sizeless types,
so keeping the current definition should be fine.

2018-10-15  Richard Sandiford  

gcc/
* tree.h (COMPLETE_OR_VOID_TYPE_P): Move to c-common.h.
* dbxout.c (dbxout_typedefs): Expand definition of
COMPLETE_OR_VOID_TYPE_P.

gcc/c-family/
* c-common.h (COMPLETE_OR_VOID_TYPE_P): Moved from tree.h.

Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-15 14:12:59.036511679 +0100
+++ gcc/tree.h  2018-10-15 14:13:04.148469305 +0100
@@ -602,10 +602,6 @@ #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE
 /* Nonzero if this type is the (possibly qualified) void type.  */
 #define VOID_TYPE_P(NODE) (TREE_CODE (NODE) == VOID_TYPE)
 
-/* Nonzero if this type is complete or is cv void.  */
-#define COMPLETE_OR_VOID_TYPE_P(NODE) \
-  (COMPLETE_TYPE_P (NODE) || VOID_TYPE_P (NODE))
-
 /* Nonzero if this type is complete or is an array with unspecified bound.  */
 #define COMPLETE_OR_UNBOUND_ARRAY_TYPE_P(NODE) \
   (COMPLETE_TYPE_P (TREE_CODE (NODE) == ARRAY_TYPE ? TREE_TYPE (NODE) : (NODE)))
Index: gcc/dbxout.c
===
--- gcc/dbxout.c 2018-10-15 14:12:59.024511777 +0100
+++ gcc/dbxout.c 2018-10-15 14:13:04.148469305 +0100
@@ -1093,7 +1093,7 @@ dbxout_typedefs (tree syms)
  tree type = TREE_TYPE (syms);
  if (TYPE_NAME (type)
  && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
- && COMPLETE_OR_VOID_TYPE_P (type)
+ && (DEFINITE_TYPE_P (type) || VOID_TYPE_P (type))
  && ! TREE_ASM_WRITTEN (TYPE_NAME (type)))
dbxout_symbol (TYPE_NAME (type), 0);
}
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h 2018-10-15 14:08:44.946617301 +0100
+++ gcc/c-family/c-common.h 2018-10-15 14:13:04.148469305 +0100
@@ -742,6 +742,10 @@ #define C_TYPE_FUNCTION_P(type) \
 #define C_TYPE_OBJECT_OR_INCOMPLETE_P(type) \
   (!C_TYPE_FUNCTION_P (type))
 
+/* Nonzero if this type is complete or is cv void.  */
+#define COMPLETE_OR_VOID_TYPE_P(NODE) \
+  (COMPLETE_TYPE_P (NODE) || VOID_TYPE_P (NODE))
+
 struct visibility_flags
 {
   unsigned inpragma : 1;   /* True when in #pragma GCC visibility.  */


[02/10] Replace most uses of COMPLETE_TYPE_P outside the frontends

2018-10-15 Thread Richard Sandiford
This patch adds a DEFINITE_TYPE_P macro for testing whether a
type has been fully-defined.  The definition is the same as the
current definition of COMPLETE_TYPE_P, but later patches redefine
COMPLETE_TYPE_P and make it local to the C and C++ frontends.
The name "definite type" comes from the SVE ACLE specification.

The patch also replaces all *.c uses of COMPLETE_TYPE_P outside
the frontends (along with a couple of *.h uses).
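
As a rough illustration of the intent (a toy model, not GCC code), the
sketch below shows that DEFINITE_TYPE_P is currently the same test as
COMPLETE_TYPE_P.  The toy_type struct and the TOY_* macros are
hypothetical stand-ins for GCC's tree nodes and tree.h macros.

```cpp
#include <cassert>

// Hypothetical stand-in for a GCC type node; real code uses `tree`.
struct toy_type
{
  const long *size;   // Models TYPE_SIZE: null until the type is defined.
};

// Toy versions of the accessor and of the two predicates in tree.h.
#define TOY_TYPE_SIZE(NODE) ((NODE)->size)
#define TOY_DEFINITE_TYPE_P(NODE) (TOY_TYPE_SIZE (NODE) != nullptr)
#define TOY_COMPLETE_TYPE_P(NODE) (TOY_TYPE_SIZE (NODE) != nullptr)

// Checks that, at this point in the series, both predicates give the
// same answer for a defined type and for a merely declared one.
inline void
check_predicates_agree ()
{
  static const long size_in_bits = 32;
  toy_type defined = { &size_in_bits };
  toy_type declared_only = { nullptr };

  assert (TOY_DEFINITE_TYPE_P (&defined));
  assert (!TOY_DEFINITE_TYPE_P (&declared_only));
  assert (TOY_DEFINITE_TYPE_P (&defined) == TOY_COMPLETE_TYPE_P (&defined));
  assert (TOY_DEFINITE_TYPE_P (&declared_only)
          == TOY_COMPLETE_TYPE_P (&declared_only));
}
```

Only the name changes here; the two predicates diverge only once a later
patch redefines COMPLETE_TYPE_P in the frontends.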

2018-10-15  Richard Sandiford  

gcc/
* tree.h (DEFINITE_TYPE_P): New macro.
(type_with_alias_set_p): Use it instead of COMPLETE_TYPE_P.
* alias.c (get_alias_set): Likewise.
* calls.c (initialize_argument_information): Likewise.
* config/i386/winnt.c (gen_stdcall_or_fastcall_suffix): Likewise.
* convert.c (convert_to_integer_1): Likewise.
* dbxout.c (dbxout_type, dbxout_symbol): Likewise.
* dwarf2out.c (add_pubtype, gen_generic_params_dies)
(add_subscript_info, gen_scheduled_generic_parms_dies): Likewise.
* function.c (assign_temp): Likewise.
* gimple-expr.c (create_tmp_var): Likewise.
* gimplify.c (gimplify_expr): Likewise.
* ipa-devirt.c (set_type_binfo, warn_types_mismatch)
(odr_types_equivalent_p, add_type_duplicate, get_odr_type): Likewise.
* ipa-icf.c (sem_item::add_type): Likewise.
* langhooks.c (lhd_omp_mappable_type): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
* tree-ssa-sccvn.c (fully_constant_vn_reference_p): Likewise.
* tree.c (type_cache_hasher::equal, build_function_type)
(build_method_type_directly, build_offset_type, build_complex_type)
(verify_type_variant, gimple_canonical_types_compatible_p)
(verify_type): Likewise.

gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity): Likewise.
* gcc-interface/utils2.c (build_simple_component_ref): Likewise.

gcc/fortran/
* trans-decl.c (gfc_build_qualified_array): Likewise.
(create_function_arglist): Likewise.

gcc/lto/
* lto-symtab.c (lto_symtab_merge_decls_1): Likewise.

Index: gcc/tree.h
===
--- gcc/tree.h  2018-10-05 13:46:08.863806452 +0100
+++ gcc/tree.h  2018-10-15 14:12:59.036511679 +0100
@@ -592,6 +592,10 @@ #define POINTER_TYPE_P(TYPE) \
 #define FUNCTION_POINTER_TYPE_P(TYPE) \
   (POINTER_TYPE_P (TYPE) && TREE_CODE (TREE_TYPE (TYPE)) == FUNCTION_TYPE)
 
+/* Nonzero if this type is "definite"; that is, if we have enough information
+   to create objects of that type.  The type might be sized or sizeless.  */
+#define DEFINITE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
+
 /* Nonzero if this type is a complete type.  */
 #define COMPLETE_TYPE_P(NODE) (TYPE_SIZE (NODE) != NULL_TREE)
 
@@ -5796,12 +5800,12 @@ type_with_alias_set_p (const_tree t)
   if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE)
 return false;
 
-  if (COMPLETE_TYPE_P (t))
+  if (DEFINITE_TYPE_P (t))
 return true;
 
   /* Incomplete types can not be accessed in general except for arrays
  where we can fetch its element despite we have no array bounds.  */
-  if (TREE_CODE (t) == ARRAY_TYPE && COMPLETE_TYPE_P (TREE_TYPE (t)))
+  if (TREE_CODE (t) == ARRAY_TYPE && DEFINITE_TYPE_P (TREE_TYPE (t)))
 return true;
 
   return false;
Index: gcc/alias.c
===
--- gcc/alias.c 2018-10-05 13:46:11.111788242 +0100
+++ gcc/alias.c 2018-10-15 14:12:59.020511811 +0100
@@ -922,15 +922,15 @@ get_alias_set (tree t)
   if (TYPE_ALIAS_SET_KNOWN_P (t))
 return TYPE_ALIAS_SET (t);
 
-  /* We don't want to set TYPE_ALIAS_SET for incomplete types.  */
-  if (!COMPLETE_TYPE_P (t))
+  /* We don't want to set TYPE_ALIAS_SET for indefinite types.  */
+  if (!DEFINITE_TYPE_P (t))
 {
   /* For arrays with unknown size the conservative answer is the
 alias set of the element type.  */
   if (TREE_CODE (t) == ARRAY_TYPE)
return get_alias_set (TREE_TYPE (t));
 
-  /* But return zero as a conservative answer for incomplete types.  */
+  /* But return zero as a conservative answer for indefinite types.  */
   return 0;
 }
 
@@ -1006,7 +1006,7 @@ get_alias_set (tree t)
   for (p = t; POINTER_TYPE_P (p)
   || (TREE_CODE (p) == ARRAY_TYPE
   && (!TYPE_NONALIASED_COMPONENT (p)
-  || !COMPLETE_TYPE_P (p)
+  || !DEFINITE_TYPE_P (p)
   || TYPE_STRUCTURAL_EQUALITY_P (p)))
   || TREE_CODE (p) == VECTOR_TYPE;
   p = TREE_TYPE (p))
Index: gcc/calls.c
===
--- gcc/calls.c 2018-10-15 14:12:54.016553288 +0100
+++ gcc/calls.c 2018-10-15 14:12:59.024511777 +0100
@@ -1950,7 +1950,7 @@ initialize_argument_information (int num
   machine_mode mode;
 
   /* Replace erroneous argument with constant zero.  */
-  if (t

[01/10] Expand COMPLETE_TYPE_P in obvious checks for null

2018-10-15 Thread Richard Sandiford
Some tests for COMPLETE_TYPE_P are just protecting against a null
TYPE_SIZE or TYPE_SIZE_UNIT.  Rather than replace them with a new
macro, it seemed clearer to write out the underlying test.
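
For illustration only, here is a minimal sketch of the pattern being
rewritten: a null check on the size followed by a test that the size is
an integer constant.  The toy_* names are hypothetical stand-ins for
tree nodes, TYPE_SIZE and TREE_CODE; this is not GCC code.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC tree codes and nodes.
enum toy_code { TOY_INTEGER_CST, TOY_OTHER_EXPR };

struct toy_size
{
  toy_code code;          // Models TREE_CODE (TYPE_SIZE (type)).
};

struct toy_type
{
  const toy_size *size;   // Models TYPE_SIZE (type); null if not laid out.
};

// Mirrors the rewritten condition: test the size for null directly,
// then require it to be a compile-time integer constant.
inline bool
toy_fixed_size_p (const toy_type *type)
{
  return type->size && type->size->code == TOY_INTEGER_CST;
}

// Exercises the three cases the original checks distinguish.
inline void
check_fixed_size ()
{
  toy_size constant = { TOY_INTEGER_CST };
  toy_size variable = { TOY_OTHER_EXPR };
  toy_type laid_out = { &constant };
  toy_type variable_sized = { &variable };
  toy_type not_laid_out = { nullptr };

  assert (toy_fixed_size_p (&laid_out));
  assert (!toy_fixed_size_p (&variable_sized));
  assert (!toy_fixed_size_p (&not_laid_out));
}
```

Writing the null test out makes it obvious that the first condition only
guards the dereference in the second.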

2018-10-15  Richard Sandiford  

gcc/
* calls.c (initialize_argument_information): Replace COMPLETE_TYPE_P
with checks for null.
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise.
* config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_aggregate_candidate):
Likewise.
* config/riscv/riscv.c (riscv_flatten_aggregate_field): Likewise.
* config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise.
* expr.c (expand_assignment, safe_from_p): Likewise.
(expand_expr_real_1): Likewise.
* tree-data-ref.c (initialize_data_dependence_relation): Likewise.
* tree-sra.c (maybe_add_sra_candidate): Likewise.
(find_param_candidates): Likewise.
* tree-ssa-alias.c (indirect_ref_may_alias_decl_p): Likewise.
* tree-vrp.c (vrp_prop::check_mem_ref): Likewise.

gcc/lto/
* lto-symtab.c (warn_type_compatibility_p): Likewise.

Index: gcc/calls.c
===
--- gcc/calls.c 2018-10-05 13:46:11.115788209 +0100
+++ gcc/calls.c 2018-10-15 14:12:54.016553288 +0100
@@ -2039,7 +2039,7 @@ initialize_argument_information (int num
 function being called.  */
  rtx copy;
 
- if (!COMPLETE_TYPE_P (type)
+ if (!TYPE_SIZE_UNIT (type)
  || TREE_CODE (TYPE_SIZE_UNIT (type)) != INTEGER_CST
  || (flag_stack_check == GENERIC_STACK_CHECK
  && compare_tree_int (TYPE_SIZE_UNIT (type),
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2018-10-15 14:08:45.970608817 +0100
+++ gcc/config/aarch64/aarch64.c2018-10-15 14:12:54.020553256 +0100
@@ -13000,7 +13000,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -13033,7 +13033,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -13066,7 +13066,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c2018-10-05 13:46:14.375761802 +0100
+++ gcc/config/arm/arm.c2018-10-15 14:12:54.024553222 +0100
@@ -5927,7 +5927,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -5960,7 +5960,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -5993,7 +5993,7 @@ aapcs_vfp_sub_candidate (const_tree type
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
Index: gcc/config/powerpcspe/powerpcspe.c
===
--- gcc/config/powerpcspe/powerpcspe.c  2018-10-05 13:46:13.855766014 +0100
+++ gcc/config/powerpcspe/powerpcspe.c  2018-10-15 14:12:54.032553156 +0100
@@ -11540,7 +11540,7 @@ rs6000_aggregate_candidate (const_tree t
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -11573,7 +11573,7 @@ rs6000_aggregate_candidate (const_tree t
 
/* Can't handle incomplete types nor sizes that are not
   fixed.  */
-   if (!COMPLETE_TYPE_P (type)
+   if (!TYPE_SIZE (type)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
@@ -11606,7 +11606,7 @@ rs6000_aggregate_candid

[00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Richard Sandiford
The C standard says:

At various points within a translation unit an object type may be
"incomplete" (lacking sufficient information to determine the size of
objects of that type) or "complete" (having sufficient information).

For AArch64 SVE, we'd like to split this into two concepts:

  * has the type been fully defined?
  * would fully-defining the type determine its size?

This is because we'd like to be able to represent SVE vectors as C and C++
types.  Since SVE is a "vector-length agnostic" architecture, the size
of the vectors is determined by the runtime environment rather than the
programmer or compiler.  In that sense, defining an SVE vector type does
not determine its size.  It's nevertheless possible to use SVE vector types
in meaningful ways, such as having automatic vector variables and passing
vectors between functions.

The main questions in the RFC are:

  1) is splitting the definition like this OK in principle?
  2) are the specific rules described below OK?
  3) coding-wise, how should the split be represented in GCC?

Terminology
-----------

Going back to the second bullet above:

  * would fully-defining the type determine its size?

the rest of the RFC calls a type "sized" if fully defining it would
determine its size.  The type is "sizeless" otherwise.
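
To make the two questions concrete, here is a toy model (an assumption
about how the pieces fit together, not GCC code).  The struct and
function names are hypothetical; in particular, treating "size known"
as "fully defined and sized" is my reading of the text above.

```cpp
#include <cassert>

// Hypothetical model of the two questions asked in the RFC.
struct toy_type
{
  bool fully_defined;   // Has the type been fully defined?
  bool sized;           // Would fully defining it determine its size?
};

// "Definite": there is enough information to create objects of the type.
inline bool
toy_definite_p (const toy_type &t)
{
  return t.fully_defined;
}

// The size is known only if the type is both fully defined and sized.
inline bool
toy_size_known_p (const toy_type &t)
{
  return t.fully_defined && t.sized;
}

// Three representative cases: int, a forward-declared struct and an
// SVE vector type.
inline void
check_type_model ()
{
  toy_type plain_int = { true, true };
  toy_type forward_declared_struct = { false, true };
  toy_type sve_vector = { true, false };

  assert (toy_definite_p (plain_int) && toy_size_known_p (plain_int));
  assert (!toy_definite_p (forward_declared_struct));
  assert (toy_definite_p (sve_vector) && !toy_size_known_p (sve_vector));
}
```

The SVE vector case is the new one: objects can be created, but no
compile-time size is available.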

Contents
--------

The RFC is organised as follows.  I've erred on the side of including
detail rather than leaving it out, but each section is meant to be
self-contained and skippable:

  - An earlier RFC
  - Quick overview of SVE
  - Why we need SVE types in C and C++
  - How we ended up with this definition
  - The SVE types in more detail
  - Outline of the type system changes
  - Sizeless structures (and testing on non-SVE targets)
  - Other variable-length vector architectures
  - Edits to the C standard
      - Base changes
      - Updates for consistency
      - Sizeless structures
  - Edits to the C++ standard
  - GCC implementation questions

I'll follow up with patches that implement the split.



An earlier RFC
==============

For the record (in case this sounds familiar) I sent an RFC about the
sizeless type extension a while ago:

https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html

The rules haven't changed since then, but this version includes more
information and includes support for sizeless structures.


Quick overview of SVE
=====================

SVE is a vector extension to AArch64.  A detailed description is
available here:

https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has no
fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.  Privileged software can further
constrain the vector length within the range offered by the implementation;
e.g. Linux currently provides per-thread control of the vector length.
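
As a small worked example of what "no fixed vector length" means for
code, the sketch below computes how many elements fit in a vector at
either end of the range.  The names are hypothetical, and the rule that
lengths are multiples of 128 bits is an assumption not stated above.

```cpp
#include <cassert>

// Range quoted in the overview above.
constexpr unsigned toy_sve_min_bits = 128;
constexpr unsigned toy_sve_max_bits = 2048;

// Assumption: valid lengths are multiples of 128 bits within the range.
constexpr bool
toy_valid_sve_length_p (unsigned bits)
{
  return (bits >= toy_sve_min_bits
          && bits <= toy_sve_max_bits
          && bits % 128 == 0);
}

// Number of elements of ELEMENT_BITS that fit in a VECTOR_BITS vector.
constexpr unsigned
toy_lanes (unsigned vector_bits, unsigned element_bits)
{
  return vector_bits / element_bits;
}

// The same source code may see anything from 4 to 64 32-bit lanes.
inline void
check_lanes ()
{
  assert (toy_valid_sve_length_p (128));
  assert (toy_valid_sve_length_p (2048));
  assert (!toy_valid_sve_length_p (4096));
  assert (toy_lanes (toy_sve_min_bits, 32) == 4);
  assert (toy_lanes (toy_sve_max_bits, 32) == 64);
}
```

Since the lane count is a runtime property, it cannot be baked into the
type's size at compile time.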


Why we need SVE types in C and C++
==================================

SVE was designed to be an easy target for autovectorising normal scalar
code.  There are also various language extensions that support explicit
data parallelism or that make explicit vector chunking easier to do in
an architecture-neutral way (e.g. C++ P0214).  This means that many users
won't need to do anything SVE-specific.

Even so, there's always going to be a place for writing SVE-specific
optimisations, with full access to the underlying ISA.  As for other
vector architectures, we'd like users to be able to write such routines
in C and C++ rather than force them to go all the way to assembly.

We'd also like C and C++ functions to be able to take SVE vector
parameters and return SVE vector results, which is particularly useful
when implementing things like vector math routines.  In this case in
particular, the types need to map directly to something that fits in
an SVE register, so that passing and returning vectors has minimal
overhead.


How we ended up with this definition
====================================

Requirements
------------

We need the SVE vector types to define and use SVE intrinsic functions
and to write SVE vector library routines.  The key requirements when
defining the types were:

  * They must be available in both C and C++ (because we want to be able
    to add SVE optimisations to C-only codebases).

  * They must fit in an SVE vector register (so there can be no on-the-side
information).

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
(since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
(so that _Generic can be used to provide tgmath.h-style overloads).

  * It must be possible to use pointers or references to the types
(for passing or returning by pointer or reference, and because not
allowin

Re: [PATCH][GCC][AARCH64]Introduce aarch64 atomic_{load,store}ti patterns

2018-10-15 Thread Matthew Malcomson

ping


On 27/09/18 14:43, Matthew Malcomson wrote:

[PATCH][GCC][AARCH64] Introduce aarch64 atomic_{load,store}ti patterns

In Armv8.4-a these patterns use the LDP/STP instructions, which are guaranteed
to be single-copy atomic, and ensure correct memory ordering semantics by using
the DMB instruction.

We put the use of these inline expansions behind a command line flag since they
do not satisfy the libatomic ABI and hence can't be used together with code
already compiled using 16 byte atomics.
This command line flag is -matomic-128bit-instructions.

Given the introduction of a flag specified to break ABI compatibility with
libatomic, it seems reasonable to introduce the load-exclusive/store-exclusive
read-modify-write loop emulation of 128-bit atomic loads and stores for older
architectures behind this flag.

We introduce the usual extension macros for the "at" extension marking the
LDP/STP atomicity guarantees introduced in Armv8.4-a and use these to decide
which to use when -matomic-128bit-instructions is provided on the command line.
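
The selection described above can be sketched roughly as follows.  This
is a toy model, not the patch's code: the enum, the function and the
fallback to a libatomic call when the flag is absent are assumptions;
only the bit positions are taken from the aarch64.h hunk in this patch.

```cpp
#include <cassert>

// Bit positions as in the aarch64.h hunk of this patch.
constexpr unsigned long toy_fl_at = 1UL << 21;        // Armv8.4-a "at".
constexpr unsigned long toy_fl_profile = 1UL << 22;   // Moved up by one.

// Hypothetical names for the three ways of doing a 128-bit access.
enum toy_ti_strategy
{
  TOY_LIBATOMIC_CALL,    // Assumed default: keep the libatomic ABI.
  TOY_LDP_STP_WITH_DMB,  // Armv8.4-a: single-copy atomic LDP/STP.
  TOY_EXCLUSIVE_LOOP     // Older cores: load/store-exclusive loop.
};

// Chooses a strategy from the ISA flags and the command-line option.
inline toy_ti_strategy
toy_choose_ti_strategy (unsigned long isa_flags, bool inline_128bit)
{
  if (!inline_128bit)
    return TOY_LIBATOMIC_CALL;
  if (isa_flags & toy_fl_at)
    return TOY_LDP_STP_WITH_DMB;
  return TOY_EXCLUSIVE_LOOP;
}

// The flag gates both inline expansions; the ISA bit picks between them.
inline void
check_ti_strategy ()
{
  assert (toy_choose_ti_strategy (toy_fl_at, false) == TOY_LIBATOMIC_CALL);
  assert (toy_choose_ti_strategy (toy_fl_at, true) == TOY_LDP_STP_WITH_DMB);
  assert (toy_choose_ti_strategy (toy_fl_profile, true)
          == TOY_EXCLUSIVE_LOOP);
}
```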

Tested with full bootstrap and make check on aarch64-none-linux-gnu.
Ok for trunk?

gcc/ChangeLog:

2018-09-27  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_split_atomic_ti_access): New
prototype.
* config/aarch64/aarch64.c (aarch64_split_atomic_ti_access): New.
* config/aarch64/aarch64.h (AARCH64_FL_AT): New flag.
(AARCH64_FL_PROFILE): Flag moved to accommodate above.
(AARCH64_FL_FOR_ARCH8_4): Include AARCH64_FL_AT.
(AARCH64_ISA_AT): New ISA flag.
* config/aarch64/aarch64.opt (-matomic-128bit-instructions): New.
* config/aarch64/atomics.md (atomic_load, atomic_store,
@aarch64_load_exclusive {smaller registers},
@aarch64_load_exclusive {GPI registers},
@aarch64_store_exclusive): Use aarch_mm_needs_{acquire,release}
instead of three part check.
(atomic_loadti, aarch64_atomic_loadti_ldp, aarch64_atomic_loadti_basic,
atomic_storeti, aarch64_atomic_storeti_stp,
aarch64_atomic_storeti_basic): New.
* config/aarch64/iterators.md (GPI_TI): New.
* config/aarch64/predicates.md (aarch64_atomic_TImode_operand,
aarch64_TImode_pair_operand): New.
* doc/invoke.texi (-matomic-128bit-instructions): Document option.

gcc/testsuite/ChangeLog:

2018-09-27  Matthew Malcomson  

* gcc.target/aarch64/atomic-load128.c: New test.
* gcc.target/aarch64/atomic-store.x: Shared macro for below tests.
* gcc.target/aarch64/atomic-store.c: Use atomic-store.x.
* gcc.target/aarch64/atomic-store128.c: New test using atomic-store.x.


### Attachment also inlined for ease of reply ###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
caf1d2041f0cac8e3f975f8384a167a90dc638e5..578ea925fac9a7237af3a53e7ec642d0ba8e7b93
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -560,6 +560,8 @@ machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx);
  rtx aarch64_gen_compare_reg (RTX_CODE, rtx, rtx);
  rtx aarch64_load_tp (rtx);
  
+void aarch64_split_atomic_ti_access (rtx op[], bool);

+
  void aarch64_expand_compare_and_swap (rtx op[]);
  void aarch64_split_compare_and_swap (rtx op[]);
  void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
e5cdb1d54f4ee96140202ea21a9478438d208f45..c1e407b5a3f27aa7eea9c35e749fe597e79f3e65
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -158,9 +158,10 @@ extern unsigned aarch64_architecture_version;
  #define AARCH64_FL_SHA3 (1 << 18)  /* Has ARMv8.4-a SHA3 and 
SHA512.  */
  #define AARCH64_FL_F16FML (1 << 19)  /* Has ARMv8.4-a FP16 extensions.  */
  #define AARCH64_FL_RCPC8_4(1 << 20)  /* Has ARMv8.4-a RCPC extensions.  */
+#define AARCH64_FL_AT (1 << 21)  /* Has ARMv8.4-a AT extensions.  */
  
  /* Statistical Profiling extensions.  */

-#define AARCH64_FL_PROFILE(1 << 21)
+#define AARCH64_FL_PROFILE(1 << 22)
  
  /* Has FP and SIMD.  */

  #define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
@@ -179,7 +180,7 @@ extern unsigned aarch64_architecture_version;
(AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_V8_3)
  #define AARCH64_FL_FOR_ARCH8_4\
(AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_V8_4 | AARCH64_FL_F16FML \
-   | AARCH64_FL_DOTPROD | AARCH64_FL_RCPC8_4)
+   | AARCH64_FL_DOTPROD | AARCH64_FL_RCPC8_4 | AARCH64_FL_AT)
  
  /* Macros to test ISA flags.  */
  
@@ -201,6 +202,7 @@ extern unsigned aarch64_architecture_version;

  #define AARCH64_ISA_SHA3 (aarch64_isa_flags & AARCH64_FL_SHA3)
  #define AARCH64_ISA_F16FML   (aarch64_isa_flags & AARCH64_FL_F16FML)
  #define AARCH64_ISA_RCPC8_4  (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
+#define AARCH64_ISA_AT(aarch64_isa_

Re: [ARM/FDPIC v3 07/21] [ARM] FDPIC: Avoid saving/restoring r9 on stack since it is RO

2018-10-15 Thread Christophe Lyon
On Fri, 12 Oct 2018 at 13:45, Richard Earnshaw (lists)
 wrote:
>
> On 11/10/18 14:34, Christophe Lyon wrote:
> > 2018-XX-XX  Christophe Lyon  
> >   Mickaël Guêné 
> >
> >   gcc/
> >   * config/arm/arm.c (arm_compute_save_reg0_reg12_mask): Handle
> >   FDPIC.
> >   (thumb1_compute_save_core_reg_mask): Likewise.
>
> The hunk for this bit is missing.
>
Sigh, I forgot to remove it when I removed the code from v2.

>
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 92ae24b..a6dce36 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -19470,7 +19470,7 @@ arm_compute_save_reg0_reg12_mask (void)
> >
> >/* Also save the pic base register if necessary.  */
> >if (flag_pic
> > -   && !TARGET_SINGLE_PIC_BASE
> > +   && !TARGET_SINGLE_PIC_BASE && !TARGET_FDPIC
> > && arm_pic_register != INVALID_REGNUM
> > && crtl->uses_pic_offset_table)
> >   save_reg_mask |= 1 << PIC_OFFSET_TABLE_REGNUM;
> > @@ -19504,7 +19504,7 @@ arm_compute_save_reg0_reg12_mask (void)
> >/* If we aren't loading the PIC register,
> >don't stack it even though it may be live.  */
> >if (flag_pic
> > -   && !TARGET_SINGLE_PIC_BASE
> > +   && !TARGET_SINGLE_PIC_BASE && !TARGET_FDPIC
> > && arm_pic_register != INVALID_REGNUM
> > && (df_regs_ever_live_p (PIC_OFFSET_TABLE_REGNUM)
> > || crtl->uses_pic_offset_table))
> >
>
> flag_pic
>   && !TARGET_SINGLE_PIC_BASE && !TARGET_FDPIC
>   && arm_pic_register != INVALID_REGNUM
>
> Might be worth lifting this out into a macro.
>
> R.
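
A minimal sketch of what lifting the shared condition into a macro could
look like.  The macro name, the toy_state struct and the use of -1 for
an invalid register are all hypothetical; real code would test the GCC
globals (flag_pic, TARGET_SINGLE_PIC_BASE, TARGET_FDPIC,
arm_pic_register) directly.

```cpp
#include <cassert>

// Hypothetical stand-in for INVALID_REGNUM.
constexpr int TOY_INVALID_REGNUM = -1;

// Hypothetical bundle of the state the two conditions look at.
struct toy_state
{
  bool flag_pic;
  bool single_pic_base;
  bool fdpic;
  int pic_register;
  bool uses_pic_offset_table;
  bool pic_reg_ever_live;
};

// The three tests shared by both hunks, lifted into one place.
#define TOY_PIC_REG_SAVE_CANDIDATE_P(S) \
  ((S).flag_pic \
   && !(S).single_pic_base \
   && !(S).fdpic \
   && (S).pic_register != TOY_INVALID_REGNUM)

// Condition from the first hunk.
inline bool
toy_first_condition_p (const toy_state &s)
{
  return TOY_PIC_REG_SAVE_CANDIDATE_P (s) && s.uses_pic_offset_table;
}

// Condition from the second hunk.
inline bool
toy_second_condition_p (const toy_state &s)
{
  return (TOY_PIC_REG_SAVE_CANDIDATE_P (s)
          && (s.pic_reg_ever_live || s.uses_pic_offset_table));
}

// With FDPIC enabled neither condition holds, whatever else is set.
inline void
check_pic_conditions ()
{
  toy_state pic = { true, false, false, 9, true, false };
  toy_state fdpic = { true, false, true, 9, true, true };

  assert (toy_first_condition_p (pic));
  assert (toy_second_condition_p (pic));
  assert (!toy_first_condition_p (fdpic));
  assert (!toy_second_condition_p (fdpic));
}
```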


Re: [PATCH] i386: Also disable AVX512IFMA/AVX5124FMAPS/AVX5124VNNIW

2018-10-15 Thread Richard Biener
On Sun, Oct 14, 2018 at 9:29 PM Uros Bizjak  wrote:
>
> On Sat, Oct 13, 2018 at 11:54 PM H.J. Lu  wrote:
> >
> > Also disable AVX512IFMA, AVX5124FMAPS and AVX5124VNNIW when disabling
> > AVX512F.
> >
> > gcc/
> >
> > PR target/87572
> > * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512F_UNSET):
> > Add OPTION_MASK_ISA_AVX512IFMA_UNSET,
> > OPTION_MASK_ISA_AVX5124FMAPS_UNSET and
> > OPTION_MASK_ISA_AVX5124VNNIW_UNSET.
> >
> > gcc/testsuite/
> >
> > PR target/87572
> > * gcc.target/i386/pr87572.c: New test.
>
> LGTM.

This caused a gazillion testsuite FAILs like

FAIL: gcc.target/i386/isa-11.c (test for excess errors)
Excess errors:
/tmp/ccyurT91.s:8: Error: invalid instruction suffix for `push'
/tmp/ccyurT91.s:14: Error: invalid instruction suffix for `pop'

where we now emit pushl in 64bit mode.

Richard.


> Thanks,
> Uros.
>
> >  gcc/common/config/i386/i386-common.c|  8 ++--
> >  gcc/testsuite/gcc.target/i386/pr87572.c | 10 ++
> >  2 files changed, 16 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr87572.c
> >
> > diff --git a/gcc/common/config/i386/i386-common.c 
> > b/gcc/common/config/i386/i386-common.c
> > index 3b5312d7250..36ef999df83 100644
> > --- a/gcc/common/config/i386/i386-common.c
> > +++ b/gcc/common/config/i386/i386-common.c
> > @@ -194,8 +194,12 @@ along with GCC; see the file COPYING3.  If not see
> >(OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX512CD_UNSET \
> > | OPTION_MASK_ISA_AVX512PF_UNSET | OPTION_MASK_ISA_AVX512ER_UNSET \
> > | OPTION_MASK_ISA_AVX512DQ_UNSET | OPTION_MASK_ISA_AVX512BW_UNSET \
> > -   | OPTION_MASK_ISA_AVX512VL_UNSET | OPTION_MASK_ISA_AVX512VBMI2_UNSET \
> > -   | OPTION_MASK_ISA_AVX512VNNI_UNSET | 
> > OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET \
> > +   | OPTION_MASK_ISA_AVX512VL_UNSET | OPTION_MASK_ISA_AVX512IFMA_UNSET \
> > +   | OPTION_MASK_ISA_AVX5124FMAPS_UNSET \
> > +   | OPTION_MASK_ISA_AVX5124VNNIW_UNSET \
> > +   | OPTION_MASK_ISA_AVX512VBMI2_UNSET \
> > +   | OPTION_MASK_ISA_AVX512VNNI_UNSET \
> > +   | OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET \
> > | OPTION_MASK_ISA_AVX512BITALG_UNSET)
> >  #define OPTION_MASK_ISA_AVX512CD_UNSET OPTION_MASK_ISA_AVX512CD
> >  #define OPTION_MASK_ISA_AVX512PF_UNSET OPTION_MASK_ISA_AVX512PF
> > diff --git a/gcc/testsuite/gcc.target/i386/pr87572.c 
> > b/gcc/testsuite/gcc.target/i386/pr87572.c
> > new file mode 100644
> > index 000..ea1beb78f5c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr87572.c
> > @@ -0,0 +1,10 @@
> > +/* PR target/82483 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512ifma -mno-sse2 -w -Wno-psabi" } */
> > +
> > +typedef long long __m512i __attribute__((__vector_size__(64)));
> > +__m512i
> > +foo (__m512i c, __m512i d, __m512i e, int b)
> > +{
> > +  return __builtin_ia32_vpmadd52huq512_maskz (c, d, e, b); /* { dg-error 
> > "incompatible types" } */
> > +}
> > --
> > 2.17.2
> >


Re: [PATCH 02/14] Add D frontend (GDC) implementation.

2018-10-15 Thread David Malcolm
On Tue, 2018-09-18 at 02:33 +0200, Iain Buclaw wrote:
> This patch adds the D front-end implementation, the only part of the
> compiler that interacts with GCC directly, and being the parts that I
> maintain, is something that I can talk about more directly.
> 
> For the actual code generation pass, which converts the front-end AST
> to GCC trees, most parts use separate Visitor interfaces to do a
> certain kind of lowering; for instance, types.cc builds *_TYPE trees
> from AST Type's.  The Visitor class is part of the DMD front-end, and
> is defined in dfrontend/visitor.h.
> 
> There are also a few interfaces which have their headers in the DMD
> frontend, but are implemented here because they do something that
> requires knowledge of the GCC backend (d-target.cc), does something
> that may not be portable, or differ between D compilers
> (d-frontend.cc) or are a thin wrapper around something that is
> managed by GCC (d-diagnostic.cc).
> 
> Many high level operations result in generation of calls to D runtime
> library functions (runtime.def), all of which require some kind of
> runtime type information (typeinfo.cc).  The compiler also generates
> functions for registering/deregistering compiled modules with the D
> runtime library (modules.cc).
> 
> As well as the D language having its own built-in functions
> (intrinsics.cc), we also expose GCC builtins to D code via a
> `gcc.builtins' module (d-builtins.cc), and give special treatment to
> a number of UDAs that could be applied to functions (d-attribs.cc).
> 
> 
> That is roughly the high level gist of how things are currently
> organized.
> 
> ftp://ftp.gdcproject.org/patches/v4/02-v4-d-frontend-gdc.patch

Hi Iain.  I took a look at this patch, focusing on the diagnostics
side of things.

These are more suggestions than hard review blockers.

diff --git a/gcc/d/d-attribs.c b/gcc/d/d-attribs.c
new file mode 100644
index 000..6c65b8cad9e
--- /dev/null
+++ b/gcc/d/d-attribs.c

I believe all new C++ source files are meant to be .cc, rather than .c,
so this should be d-attribs.cc, rather than d-attribs.c.

[...snip...]

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
new file mode 100644
index 000..c698890ba07
--- /dev/null
+++ b/gcc/d/d-codegen.cc

[...snip...]

+/* Return the GCC location for the D frontend location LOC.   */
+
+location_t
+get_linemap (const Loc& loc)
+{

I don't like the name "get_linemap", as it suggests to me that it's
getting the "struct line_map *" for LOC, rather than a location_t.

How about "get_location_t" instead, or "make_location_t"?  The latter
feels more appropriate, as it's doing non-trivial work.

+  location_t gcc_location = input_location;
+
+  if (loc.filename)
+{
+  linemap_add (line_table, LC_ENTER, 0, loc.filename, loc.linnum);
+  linemap_line_start (line_table, loc.linnum, 0);
+  gcc_location = linemap_position_for_column (line_table, loc.charnum);
+  linemap_add (line_table, LC_LEAVE, 0, NULL, 0);
+}
+
+  return gcc_location;
+}

[...snip...]

diff --git a/gcc/d/d-diagnostic.cc b/gcc/d/d-diagnostic.cc
new file mode 100644
index 000..8b8a21e31b6
--- /dev/null
+++ b/gcc/d/d-diagnostic.cc
@@ -0,0 +1,347 @@
+/* d-diagnostics.cc -- D frontend interface to gcc diagnostics.
+   Copyright (C) 2017-2018 Free Software Foundation, Inc.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+
+#include "dmd/globals.h"
+#include "dmd/errors.h"
+
+#include "tree.h"
+#include "options.h"
+#include "diagnostic.h"
+
+#include "d-tree.h"
+
+
+/* Rewrite the format string FORMAT to deal with any format extensions not
+   supported by pp_format().  The result should be freed by the caller.  */
+static char *
+expand_format (const char *format)

Am I right in thinking this is to handle FORMAT strings coming from the
upstream D frontend, and this has its own formatting conventions?
Please can the leading comment have example(s) of the format, and what
it becomes (e.g. the backticks thing).

Maybe adopt a naming convention in the file, to distinguish d format
strings from pp format strings?  Maybe "d_format" vs "gcc_format"?
(Though given the verbatim vs !verbatim below, I'm not sure how
feasible that is.)

Maybe rename this function to expand_d_format?
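
To show the kind of example the leading comment could carry, here is a
rough sketch of a backtick rewrite.  It assumes that text between
backticks in a D format string is meant to become pp_format's %< and %>
quoting directives; I haven't checked that this is what the patch does,
and toy_expand_d_format is a hypothetical name using std::string rather
than the patch's char * interface.

```cpp
#include <cassert>
#include <string>

// Rewrites `text` in D_FORMAT as %<text%>, leaving everything else
// untouched.  Assumed behaviour, for illustration only.
inline std::string
toy_expand_d_format (const std::string &d_format)
{
  std::string gcc_format;
  bool in_backticks = false;

  for (char c : d_format)
    {
      if (c == '`')
        {
          // An opening backtick becomes %<, a closing one becomes %>.
          gcc_format += in_backticks ? "%>" : "%<";
          in_backticks = !in_backticks;
        }
      else
        gcc_format += c;
    }

  return gcc_format;
}

// A before/after pair of the sort that could go in the comment.
inline void
check_expand_d_format ()
{
  assert (toy_expand_d_format ("cannot cast `%s` to `%s`")
          == "cannot cast %<%s%> to %<%s%>");
  assert (toy_expand_d_format ("no backticks here") == "no backticks here");
}
```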

(Might be nice to add some unittesting of this function via "selftest",
but that's definitely not a requirement

Re: [ARM/FDPIC v3 06/21] [ARM] FDPIC: Add support for c++ exceptions

2018-10-15 Thread Christophe Lyon
On Fri, 12 Oct 2018 at 13:37, Richard Earnshaw (lists) <
richard.earns...@arm.com> wrote:

> On 11/10/18 14:34, Christophe Lyon wrote:
> > The main difference with existing support is that function addresses
> > are function descriptor addresses instead. This means that all code
> > dealing with function pointers now has to cope with function
> > descriptors instead.
> >
> > For the same reason, Linux kernel helpers can no longer be called by
> > dereferencing their address, so we implement the same functionality as
> > a regular function here.
> >
> > When restoring a function address, we also have to restore the FDPIC
> > register value (r9).
> >
> > 2018-XX-XX  Christophe Lyon  
> >   Mickaël Guêné 
> >
> >   gcc/
> >   * ginclude/unwind-arm-common.h (unwinder_cache): Add reserved5
> >   field.
> >   (FDPIC_REGNUM): New define.
> >
> >   libgcc/
> >   * config/arm/linux-atomic.c (__kernel_cmpxchg): Add FDPIC support.
> >   (__kernel_dmb): Likewise.
> >   (__fdpic_cmpxchg): New function.
> >   (__fdpic_dmb): New function.
> >   * config/arm/unwind-arm.h (gnu_Unwind_Find_got): New function.
> >   (_Unwind_decode_typeinfo_ptr): Add FDPIC support.
> >   * unwind-arm-common.inc (UCB_PR_GOT): New.
> >   (funcdesc_t): New struct.
> >   (get_eit_entry): Add FDPIC support.
> >   (unwind_phase2): Likewise.
> >   (unwind_phase2_forced): Likewise.
> >   (__gnu_Unwind_RaiseException): Likewise.
> >   (__gnu_Unwind_Resume): Likewise.
> >   (__gnu_Unwind_Backtrace): Likewise.
> >   * unwind-pe.h (read_encoded_value_with_base): Likewise.
> >
> >   libstdc++/
> >   * libsupc++/eh_personality.cc (get_ttype_entry): Add FDPIC
> >   support.
> >
> > Change-Id: I517a49ff18fae21c686cd1c6008ea7974515b347
> >
> > diff --git a/gcc/ginclude/unwind-arm-common.h
> b/gcc/ginclude/unwind-arm-common.h
> > index 8a1a919..f663891 100644
> > --- a/gcc/ginclude/unwind-arm-common.h
> > +++ b/gcc/ginclude/unwind-arm-common.h
> > @@ -91,7 +91,7 @@ extern "C" {
> > _uw reserved2;  /* Personality routine address */
> > _uw reserved3;  /* Saved callsite address */
> > _uw reserved4;  /* Forced unwind stop arg */
> > -   _uw reserved5;
> > +   _uw reserved5;  /* Personality routine GOT value in FDPIC mode.
> */
> >   }
> >unwinder_cache;
> >/* Propagation barrier cache (valid after phase 1): */
> > @@ -247,4 +247,6 @@ typedef unsigned long _uleb128_t;
> >  }   /* extern "C" */
> >  #endif
> >
> > +#define FDPIC_REGNUM 9
>
> Looking at the rest of this file, I think it can end up being included
> in user code.  So you have to put predefines into the reserved
> namespace.  Why do you need this here anyway?
>
That was to address a comment I got on v2 of this patch: I was requested
to avoid hardcoding "r9" in unwind-arm.h, and to use FDPIC_REGNUM.
I needed the definition to be visible in both unwind-arm.h and
unwind-arm-common.inc, but I moved it to too high a level. I'll fix this.

> > +
> >  #endif /* defined UNWIND_ARM_COMMON_H */
> > diff --git a/libgcc/config/arm/linux-atomic.c
> b/libgcc/config/arm/linux-atomic.c
> > index d334c58..161d1ce 100644
> > --- a/libgcc/config/arm/linux-atomic.c
> > +++ b/libgcc/config/arm/linux-atomic.c
> > @@ -25,11 +25,49 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
> >
> >  /* Kernel helper for compare-and-exchange.  */
> >  typedef int (__kernel_cmpxchg_t) (int oldval, int newval, int *ptr);
> > +#if __FDPIC__
> > +/* Non-FDPIC ABIs call __kernel_cmpxchg directly by dereferencing its
> > +   address, but under FDPIC we would generate a broken call
> > +   sequence. That's why we have to implement __kernel_cmpxchg and
> > +   __kernel_dmb here: this way, the FDPIC call sequence works.  */
> > +#define __kernel_cmpxchg __fdpic_cmpxchg
> > +#else
> >  #define __kernel_cmpxchg (*(__kernel_cmpxchg_t *) 0x0fc0)
> > +#endif
> >
> >  /* Kernel helper for memory barrier.  */
> >  typedef void (__kernel_dmb_t) (void);
> > +#if __FDPIC__
> > +#define __kernel_dmb __fdpic_dmb
> > +#else
> >  #define __kernel_dmb (*(__kernel_dmb_t *) 0x0fa0)
> > +#endif
> > +
> > +#if __FDPIC__
> > +static int __fdpic_cmpxchg (int oldval, int newval, int *ptr)
> > +{
> > +  int result;
> > +
> > +  asm volatile ("1: ldrex r3, [%[ptr]]\n\t"
> > + "subs  r3, r3, %[oldval]\n\t"
> > + "itt eq\n\t"
> > + "strexeq r3, %[newval], [%[ptr]]\n\t"
> > + "teqeq r3, #1\n\t"
> > + "it eq\n\t"
> > + "beq 1b\n\t"
> > + "rsbs  %[result], r3, #0\n\t"
> > + : [result] "=r" (result)
> > + : [oldval] "r" (oldval) , [newval] "r" (newval), [ptr] "r"
> (ptr)
> > + : "r3");
> > +return result;
> > +}
> > +
> > +static void __fdpic_dmb ()
> > +{
> > +  asm volatile ("dmb\n\t");
> > +}
> > +
>
> The whole point of __kernel_dmb and __kernel_cmpxchg is that the ker

Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-15 Thread Alexander Monakov
On Mon, 15 Oct 2018, Robin Dapp wrote:
>   * haifa-sched.c (priority): Add force_recompute parameter.
>   (apply_replacement):
>   Call priority () with force_recompute = true.
>   (restore_pattern): Likewise.

A C++ style nit/question: instead of adding a new overload 

  priority (rtx_insn *, bool)

you can add a parameter with a default value in the existing
static function

  priority (rtx_insn *insn, bool force_recompute = false)

unless I'm missing something and the new overload is on purpose?

Alexander
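
To illustrate the suggestion, here is a minimal sketch of one function with
a defaulted flag taking the place of a two-overload set.  The toy_insn type,
its fields and the doubling computation are hypothetical placeholders and are
not taken from haifa-sched.c.

```cpp
// Hypothetical stand-in for an insn; not GCC's rtx_insn.
struct toy_insn
{
  int weight;              // input the priority is derived from
  int cached_priority;     // last computed value
  bool priority_known;     // whether cached_priority is valid
};

// A single function with a defaulted flag: existing one-argument callers
// keep compiling, and callers that need a fresh value pass true.
static int
priority (toy_insn &insn, bool force_recompute = false)
{
  if (force_recompute || !insn.priority_known)
    {
      insn.cached_priority = insn.weight * 2;  // placeholder computation
      insn.priority_known = true;
    }
  return insn.cached_priority;
}
```

With this shape a call such as priority (insn) behaves as before, while
priority (insn, true) refreshes the cached value, so no forwarding overload
is needed.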


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Alexander Monakov
> > I think only double quote, backslash, backtick remain unclaimed. And of 
> > course
> > ASCII \0 through \040 and \177 ;)
> 
> I see.  Apart from using some of the traditional begin-end sequences we
> could use %; or similar on each line to "comment" it?

I guess in theory we could define percent-backslash-separator to not count,
but wouldn't that go a bit into micro-management territory?  In the
kernel usecase the main goal would be to "comment" one block of lines, not
meticulously mark up each and every non-insn line.

Alexander


[PATCH] Adjust test to pass with latest glibc

2018-10-15 Thread Jonathan Wakely

Glibc changed the it_IT locales to use thousands separators,
invalidating this test. Use nl_NL instead, as Dutch only uses grouping
for money not numbers.

* testsuite/22_locale/numpunct/members/char/3.cc: Adjust test to
account for change to glibc it_IT localedata (glibc bz#10797).

Tested x86_64-linux, committed to trunk.
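
For readers unfamiliar with the facet involved, the following is a small
sketch of how a locale's digit grouping can be queried.  The helper name
grouping_of is an assumption made for this example; the test itself uses
the testsuite's own macros instead.

```cpp
#include <locale>
#include <string>

// Return the digit-grouping string of LOC's numpunct<char> facet.
// An empty string means the locale does not group digits.
static std::string
grouping_of (const std::locale &loc)
{
  return std::use_facet<std::numpunct<char> > (loc).grouping ();
}
```

The classic "C" locale reports an empty grouping.  Constructing a named
locale such as std::locale ("nl_NL.ISO8859-15") throws std::runtime_error
when that locale is not installed, which is why the test is guarded by
dg-require-namedlocale.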

commit e4a550b85fec6440e4fe70817e96f496874f36d8
Author: Jonathan Wakely 
Date:   Mon Oct 15 14:48:20 2018 +0100

Adjust test to pass with latest glibc

Glibc changed the it_IT locales to use thousands separators,
invalidating this test. Use nl_NL instead, as Dutch only uses grouping
for money not numbers.

* testsuite/22_locale/numpunct/members/char/3.cc: Adjust test to
account for change to glibc it_IT localedata (glibc bz#10797).

diff --git a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc 
b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
index f314502461a..a55cf89b294 100644
--- a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
+++ b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
@@ -1,4 +1,4 @@
-// { dg-require-namedlocale "it_IT.ISO8859-15" }
+// { dg-require-namedlocale "nl_NL.ISO8859-15" }
 
 // 2001-01-24 Benjamin Kosnik  
 
@@ -28,12 +28,14 @@ void test02()
 {
   using namespace std;
 
-  locale loc_it = locale(ISO_8859(15,it_IT));
+  // nl_NL chosen because it has no thousands separator (at this time).
+  locale loc_it = locale(ISO_8859(15,nl_NL));
 
   const numpunct<char>& nump_it = use_facet<numpunct<char> >(loc_it); 
 
   string g = nump_it.grouping();
 
+  // Ensure that grouping is empty for locales with empty thousands separator.
   VERIFY( g == "" );
 }
 


Re: [PATCH][i386] Fix vec_construct cost, remove unused ix86_vec_cost arg

2018-10-15 Thread Richard Biener
On Thu, 11 Oct 2018, Richard Biener wrote:

> 
> The following fixes vec_construct cost calculation to properly consider
> that the inserts will happen to SSE regs thus forgo the multiplication
> done in ix86_vec_cost which is passed the wrong mode.  This gets rid of
> the only call passing false to ix86_vec_cost (so consider the patch
> amended to remove the arg if approved).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> OK for trunk?

PING.

> I am considering to make the factor we apply in ix86_vec_cost
> which currently depends on X86_TUNE_AVX128_OPTIMAL and
> X86_TUNE_SSE_SPLIT_REGS part of the actual cost tables since
> the reason we apply them are underlying CPU architecture details.
> Was the original reason of doing the multiplication based on
> those tunings to be able to "share" the same basic cost table
> across architectures that differ in this important detail?
> I see X86_TUNE_SSE_SPLIT_REGS is only used for m_ATHLON_K8
> and X86_TUNE_AVX128_OPTIMAL is used for m_BDVER, m_BTVER2
> and m_ZNVER1.  Those all have (multiple) exclusive processor_cost_table
> entries.
> 
> As a first step I'd like to remove the use of ix86_vec_cost for
> the entries that already have entries for multiple modes
> (loads and stores) and apply the factor there.  For example
> Zen can do two 128bit loads per cycle but only one 128bit store.
> With multiplying AVX256 costs by two we seem to cost sth like
> # instructions to dispatch * instruction latency which is an
> odd thing.  I'd have expected # instructions to dispatch / instruction 
> throughput * instruction latency - so a AVX256 add would cost
> the same as a AVX128 add, likewise for loads but stores would be
> more expensive because of the throughput issue.  This all
> ignores resource utilization across multiple insns but that's
> how the cost model works ...
> 
> Thanks,
> Richard.
> 
> 2018-10-11  Richard Biener  
> 
>   * config/i386/i386.c (ix86_vec_cost): Remove !parallel path
>   and argument.
>   (ix86_builtin_vectorization_cost): For vec_construct properly
>   cost insertion into SSE regs.
>   (...): Adjust calls to ix86_vec_cost.
> 
> Index: gcc/config/i386/i386.c
> ===
> --- gcc/config/i386/i386.c(revision 265022)
> +++ gcc/config/i386/i386.c(working copy)
> @@ -39846,11 +39846,10 @@ ix86_set_reg_reg_cost (machine_mode mode
>  static int
>  ix86_vec_cost (machine_mode mode, int cost, bool parallel)
>  {
> +  gcc_assert (parallel);
>if (!VECTOR_MODE_P (mode))
>  return cost;
> - 
> -  if (!parallel)
> -return cost * GET_MODE_NUNITS (mode);
> +
>if (GET_MODE_BITSIZE (mode) == 128
>&& TARGET_SSE_SPLIT_REGS)
>  return cost * 2;
> @@ -45190,8 +45189,9 @@ ix86_builtin_vectorization_cost (enum ve
>  
>case vec_construct:
>   {
> -   /* N element inserts.  */
> -   int cost = ix86_vec_cost (mode, ix86_cost->sse_op, false);
> +   gcc_assert (VECTOR_MODE_P (mode));
> +   /* N element inserts into SSE vectors.  */
> +   int cost = GET_MODE_NUNITS (mode) * ix86_cost->sse_op;
> /* One vinserti128 for combining two SSE vectors for AVX256.  */
> if (GET_MODE_BITSIZE (mode) == 256)
>   cost += ix86_vec_cost (mode, ix86_cost->addss, true);
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
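
As a rough illustration of the vec_construct arithmetic in the hunk above,
here is a simplified sketch.  The toy_costs struct and the function name are
hypothetical, and the sketch adds the combining cost directly, whereas the
patch routes it through ix86_vec_cost, which may scale it by tuning.

```cpp
// Hypothetical stand-in for the relevant cost-table entries.
struct toy_costs
{
  int sse_op;   // cost of one element insert
  int addss;    // cost charged for combining two 128-bit halves
};

// Cost of building a vector of NUNITS elements that is BITSIZE bits wide.
static int
vec_construct_cost (int nunits, int bitsize, const toy_costs &c)
{
  int cost = nunits * c.sse_op;   // N element inserts into SSE vectors
  if (bitsize == 256)
    cost += c.addss;              // one extra insert to combine both halves
  return cost;
}
```

With sse_op = 1 and addss = 3, a four-element 128-bit vector costs 4 and an
eight-element 256-bit vector costs 11 in this model.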


Re: [PATCH][i386] Fix vec_construct cost, remove unused ix86_vec_cost arg

2018-10-15 Thread Richard Biener
On Thu, 11 Oct 2018, Richard Biener wrote:

> On Thu, 11 Oct 2018, Richard Biener wrote:
> 
> > 
> > The following fixes vec_construct cost calculation to properly consider
> > that the inserts will happen to SSE regs thus forgo the multiplication
> > done in ix86_vec_cost which is passed the wrong mode.  This gets rid of
> > the only call passing false to ix86_vec_cost (so consider the patch
> > amended to remove the arg if approved).
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > OK for trunk?
> > 
> > I am considering to make the factor we apply in ix86_vec_cost
> > which currently depends on X86_TUNE_AVX128_OPTIMAL and
> > X86_TUNE_SSE_SPLIT_REGS part of the actual cost tables since
> > the reason we apply them are underlying CPU architecture details.
> > Was the original reason of doing the multiplication based on
> > those tunings to be able to "share" the same basic cost table
> > across architectures that differ in this important detail?
> > I see X86_TUNE_SSE_SPLIT_REGS is only used for m_ATHLON_K8
> > and X86_TUNE_AVX128_OPTIMAL is used for m_BDVER, m_BTVER2
> > and m_ZNVER1.  Those all have (multiple) exclusive processor_cost_table
> > entries.
> > 
> > As a first step I'd like to remove the use of ix86_vec_cost for
> > the entries that already have entries for multiple modes
> > (loads and stores) and apply the factor there.  For example
> > Zen can do two 128bit loads per cycle but only one 128bit store.
> > With multiplying AVX256 costs by two we seem to cost sth like
> > # instructions to dispatch * instruction latency which is an
> > odd thing.  I'd have expected # instructions to dispatch / instruction 
> > throughput * instruction latency - so a AVX256 add would cost
> > the same as a AVX128 add, likewise for loads but stores would be
> > more expensive because of the throughput issue.  This all
> > ignores resource utilization across multiple insns but that's
> > how the cost model works ...
> 
> So like the following which removes the use of ix86_vec_cost
> for SSE loads and stores since we have per-mode costs already.
> I've applied the relevant factor to the individual cost tables
> (noting that for X86_TUNE_SSE_SPLIT_REGS we only apply the
> multiplication for size == 128, not size >= 128 ...)
> 
> There's a ??? hunk in inline_memory_move_cost where we
> failed to apply the scaling thus in that place we'd now have
> a behavior change.  Alternatively I could leave the cost
> tables unaltered if that costing part is more critical than
> the vectorizer one.
> 
> I've also spotted, when reviewing ix86_vec_cost uses, a bug
> in ix86_rtx_cost which keys on SFmode which doesn't work
> for SSE modes, thus use GET_MODE_INNER.
> 
> Also I've changed X86_TUNE_AVX128_OPTIMAL to also apply
> to BTVER1 - everywhere else we glob BTVER1 and BTVER2 so
> this must surely be an omission.
> 
> Honza - is a patch like this OK?

PING.  I dropped the config/i386/x86-tune.def hunk since btver1
doesn't have AVX.

Otherwise bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.
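
The difference between the two costing schemes discussed in the quoted text
can be sketched as follows.  This is only one possible reading: the function
names are hypothetical, and rounding the dispatch time up to whole cycles is
an assumption of this sketch, not something stated in the thread.

```cpp
// Costing as described for the current scheme: number of instructions
// to dispatch times instruction latency.
static int
dispatch_latency_cost (int insns, int latency)
{
  return insns * latency;
}

// Throughput-aware variant: instructions that can issue in the same cycle
// are charged once.  PER_CYCLE is how many such instructions the core can
// start per cycle.
static int
throughput_aware_cost (int insns, int per_cycle, int latency)
{
  int cycles = (insns + per_cycle - 1) / per_cycle;  // round up
  return cycles * latency;
}
```

Taking the Zen figures from the thread (two 128-bit loads but only one
128-bit store per cycle) and a made-up latency of 4, a 256-bit load split
into two halves costs the same as a 128-bit load (4), while a 256-bit store
costs twice as much as a 128-bit store (8 versus 4).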
 
> Should I split out individual fixes to make bisection possible?
> 
> Should I update the cost tables or instead change the vectorizer
> costing when considering the inline_memory_move_cost "issue"?
> 
> Thanks,
> Richard.
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 0cf4152acb2..f5392232f61 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -39432,6 +39432,7 @@ inline_memory_move_cost (machine_mode mode, enum 
> reg_class regclass,
>int index = sse_store_index (mode);
>if (index == -1)
>   return 100;
> +  /* ??? */
>if (in == 2)
>  return MAX (ix86_cost->sse_load [index], ix86_cost->sse_store 
> [index]);
>return in ? ix86_cost->sse_load [index] : ix86_cost->sse_store [index];
> @@ -40183,7 +40181,8 @@ ix86_rtx_costs (rtx x, machine_mode mode, int 
> outer_code_i, int opno,
>  gcc_assert (TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F);
>  
>  *total = ix86_vec_cost (mode,
> - mode == SFmode ? cost->fmass : cost->fmasd,
> + GET_MODE_INNER (mode) == SFmode
> + ? cost->fmass : cost->fmasd,
>   true);
>   *total += rtx_cost (XEXP (x, 1), mode, FMA, 1, speed);
>  
> @@ -45122,18 +45121,14 @@ ix86_builtin_vectorization_cost (enum 
> vect_cost_for_stmt type_of_cost,
>   /* See PR82713 - we may end up being called on non-vector type.  */
>   if (index < 0)
> index = 2;
> -return ix86_vec_cost (mode,
> -   COSTS_N_INSNS (ix86_cost->sse_load[index]) / 2,
> -   true);
> +return COSTS_N_INSNS (ix86_cost->sse_load[index]) / 2;
>  
>case vector_store:
>   index = sse_store_index (mode);
>   /* See PR82713 - we may end up being called on non-vector type.  */
>   if (index < 0)
> inde

Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Richard Biener
On Mon, Oct 15, 2018 at 12:36 PM Alexander Monakov  wrote:
>
> On Mon, 15 Oct 2018, Richard Biener wrote:
> >
> > Oh, and I personally find %` ugly ;)  What non-alnum chars
> > are taken by backends?
>
> I think only double quote, backslash, backtick remain unclaimed. And of course
> ASCII \0 through \040 and \177 ;)

I see.  Apart from using some of the traditional begin-end sequences we
could use %; or similar on each line to "comment" it?

Richard.

> Alexander


Re: [PATCH] Add option to control warnings added through attribute "warning"

2018-10-15 Thread Martin Sebor

On 10/15/2018 01:55 AM, Nikolai Merinov wrote:

Hi Martin,

On 10/12/18 9:58 PM, Martin Sebor wrote:

On 10/12/2018 04:14 AM, Nikolai Merinov wrote:

Hello,

In https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html I
suggested a patch that adds the ability to control the behavior of
"__attribute__((warning))" when the "-Werror" option is enabled. Usage
example:


#include 
int a() __attribute__((warning("Warning: `a' was used")));
int a() { return 1; }
int main () { return a(); }



$ gcc -Werror test.c
test.c: In function ‘main’:
test.c:4:22: error: call to ‘a’ declared with attribute warning:
Warning: `a' was used [-Werror]
 int main () { return a(); }
  ^
cc1: all warnings being treated as errors
$ gcc -Werror -Wno-error=warning-attribute test.c
test.c: In function ‘main’:
test.c:4:22: warning: call to ‘a’ declared with attribute warning:
Warning: `a' was used
 int main () { return a(); }
  ^

Can you provide any feedback on suggested changes?


It seems like a useful feature and in line with the philosophy
that distinct warnings should be controlled by their own options.

I would only suggest to consider changing the name to
-Wattribute-warning, because it applies specifically to that
attribute (as opposed to warnings about attributes in general).

There are many attributes in GCC and diagnosing problems that
are unique to each, under the same -Wattributes option, is
becoming too coarse and overly limiting.  To make it more
flexible, I expect new options will need to be introduced,
such as -Wattribute-alias (to control aspects of the alias
attribute and others related to it), or -Wattribute-const
(to control diagnostics about functions declared with
attribute const that violate the attribute's constraints).

An alternative might be to introduce a single -Wattribute=
 option where the  gives
the names of all the distinct attributes whose unique
diagnostics one might need to control.

Martin


Currently there are several styles already in use:

-Wattribute-alias, where the word "attribute" is used as a prefix for the
name of the attribute,
-Wsuggest-attribute=[pure|const|noreturn|format|malloc], where the name of
the attribute is passed as an argument,

-Wmissing-format-attribute, where the word "attribute" is used as a suffix,
-Wdeprecated-declarations, where the word "attribute" is not used at all, even
though this warning option was created specifically for the "deprecated"
attribute.


I changed the name to "-Wattribute-warning" as you suggested, but unifying the
style of all attribute-related warnings looks like a separate activity.
Please check the new patch in the attachments.




Thanks for the survey!  I agree that making the existing options
consistent (if that's what we want) should be done separately.

Martin

PS It doesn't look like your latest attachments made it to
the list.



Updated changelog:

gcc/Changelog

2018-10-14  Nikolai Merinov 

     * gcc/common.opt: Add -Wattribute-warning.
     * gcc/doc/invoke.texi: Add documentation for 
-Wno-attribute-warning.

     * gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
     * gcc/expr.c (expand_expr_real_1): Add new attribute to warning_at
     call to allow the user to configure the behavior of the "warning"
     attribute.


[PATCH][RFC] Fix PR87609 - dependence info copying

2018-10-15 Thread Richard Biener


During CFG transforms like loop unrolling or peeling (or jump threading
or loop header copying performing essentially the latter) we need
to remap dependence cliques similar to how we need to do during inlining
to avoid false non-dependences across iterations.

We've talked about this a bit in the context of optimization passes
wanting to transfer the knowledge they impose on parts of the IL
via runtime checks to followup passes (in which case also hoisting
across such checks is problematic).

PR87609 now shows a case where this is important for code that
was brought in via inlining as well.

The patch does this by hooking into copy_bbs() as the primitive
(hopefully exclusively) used by such transforms.  I've not yet
touched the RTL parts which need to remap cliques as they appear
in MEM_EXPRs of MEM_ATTRs of MEMs copied.  As said above both
jump threading and unrolling are affected.

The patch as-is may end up pessimizing jump threaded code that is
_not_ performing peeling.  To avoid this the interface I chose
might not be optimal (there's no way to disable copy_bbs behavior
at the moment).

Similarly, the remapping is not strictly needed for loop versioning.

One option might be to pass down the copy_bb_info instance to
copy_bbs from the callers which should know whether it's necessary
to remap dependence info or not.

I've also added a checker that we do not end up with dependence
info on addresses (but at the moment I do remap that in the
patch anyhow).

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Comments appreciated - Bin, you had sth similar IIRC?

Thanks,
Richard.
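
The remapping idea described above can be pictured with a small toy model.
Everything in it (toy_mem_ref, toy_copy_data, copy_refs, the last_clique
counter) is a hypothetical stand-in, and std::unordered_map takes the place
of GCC's hash_map; it is a sketch of the technique under those assumptions,
not the patch's implementation.

```cpp
#include <unordered_map>
#include <vector>

// Toy memory reference: a dependence clique id (0 means "no info")
// plus a base id within that clique.
struct toy_mem_ref
{
  unsigned short clique;
  unsigned short base;
};

// Per-copy state, created once for a whole group of copied blocks so that
// every reference sharing an old clique ends up sharing the same new one.
struct toy_copy_data
{
  std::unordered_map<unsigned short, unsigned short> dependence_map;
};

// Hypothetical per-function counter of the highest clique id in use.
static unsigned short last_clique = 0;

// Copy REFS.  If ID is non-null, give each clique seen in the copy a fresh
// id so the copy is not treated as independent of the original by mistake.
static std::vector<toy_mem_ref>
copy_refs (const std::vector<toy_mem_ref> &refs, toy_copy_data *id)
{
  std::vector<toy_mem_ref> copy = refs;
  if (!id)
    return copy;
  for (toy_mem_ref &r : copy)
    {
      if (r.clique == 0)
        continue;                             // nothing to remap
      auto res = id->dependence_map.emplace (r.clique, 0);
      if (res.second)
        res.first->second = ++last_clique;    // first time: allocate new id
      r.clique = res.first->second;
    }
  return copy;
}
```

Passing a null pointer leaves the cliques untouched, which corresponds to
callers that know the remapping is unnecessary.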

2018-10-15  Richard Biener  

* cfghooks.h (dependence_hash): New typedef.
(struct copy_bb_data): New type.
(cfg_hooks::duplicate_block): Adjust to take a copy_bb_data argument.
(duplicate_block): Likewise.
* cfghooks.c (duplicate_block): Pass down copy_bb_data.
(copy_bbs): Create and pass down copy_bb_data.
* cfgrtl.c (cfg_layout_duplicate_bb): Adjust.
(rtl_duplicate_bb): Likewise.
* tree-cfg.c (verify_address): Verify addresses do not have
dependence info.
(gimple_duplicate_bb): If the copy_bb_data arg is not NULL
remap dependence info.

* gcc.dg/torture/restrict-7.c: New testcase.

Index: gcc/cfghooks.h
===
--- gcc/cfghooks.h  (revision 265162)
+++ gcc/cfghooks.h  (working copy)
@@ -54,6 +54,19 @@ struct profile_record
   bool run;
 };
 
+typedef int_hash  dependence_hash;
+
+/* Optional data for duplicate_block.   */
+
+struct copy_bb_data
+{
+  copy_bb_data() : dependence_map (NULL) {}
+  ~copy_bb_data () { delete dependence_map; }
+
+  /* A map from the copied BBs dependence info cliques to
+ equivalents in the BBs duplicated to.  */
+  hash_map *dependence_map;
+};
 
 struct cfg_hooks
 {
@@ -112,7 +125,7 @@ struct cfg_hooks
   bool (*can_duplicate_block_p) (const_basic_block a);
 
   /* Duplicate block A.  */
-  basic_block (*duplicate_block) (basic_block a);
+  basic_block (*duplicate_block) (basic_block a, copy_bb_data *);
 
   /* Higher level functions representable by primitive operations above if
  we didn't have some oddities in RTL and Tree representations.  */
@@ -227,7 +240,8 @@ extern void tidy_fallthru_edges (void);
 extern void predict_edge (edge e, enum br_predictor predictor, int 
probability);
 extern bool predicted_by_p (const_basic_block bb, enum br_predictor predictor);
 extern bool can_duplicate_block_p (const_basic_block);
-extern basic_block duplicate_block (basic_block, edge, basic_block);
+extern basic_block duplicate_block (basic_block, edge, basic_block,
+   copy_bb_data * = NULL);
 extern bool block_ends_with_call_p (basic_block bb);
 extern bool empty_block_p (basic_block);
 extern basic_block split_block_before_cond_jump (basic_block);
Index: gcc/cfghooks.c
===
--- gcc/cfghooks.c  (revision 265162)
+++ gcc/cfghooks.c  (working copy)
@@ -1066,7 +1066,7 @@ can_duplicate_block_p (const_basic_block
AFTER.  */
 
 basic_block
-duplicate_block (basic_block bb, edge e, basic_block after)
+duplicate_block (basic_block bb, edge e, basic_block after, copy_bb_data *id)
 {
   edge s, n;
   basic_block new_bb;
@@ -1082,7 +1082,7 @@ duplicate_block (basic_block bb, edge e,
 
   gcc_checking_assert (can_duplicate_block_p (bb));
 
-  new_bb = cfg_hooks->duplicate_block (bb);
+  new_bb = cfg_hooks->duplicate_block (bb, id);
   if (after)
 move_block_after (new_bb, after);
 
@@ -1337,6 +1337,7 @@ copy_bbs (basic_block *bbs, unsigned n,
   unsigned i, j;
   basic_block bb, new_bb, dom_bb;
   edge e;
+  copy_bb_data id;
 
   /* Mark the blocks to be copied.  This is used by edge creation hooks
  to decide whether to reallocate PHI nodes capacity to avoid reallocating
@@ -1349,7 +1350,7 @@ copy_bbs (basic_block *bbs, u

[PATCH] PR libstdc++/87587 prevent -Wabi warnings

2018-10-15 Thread Jonathan Wakely

The warnings about changes to empty struct parameter passing can be
ignored because the callers are all internal to the library, and so
compiled with the same -fabi-version as the function definitions.

It would be preferable to use #pragma GCC diagnostic warning "-Wabi=12"
to get warnings about any other ABI changes in future versions, but
until PR c++/87611 is fixed the warnings must be completely disabled
with #pragma GCC diagnostic ignored "-Wabi".

PR libstdc++/87587
* src/c++11/cxx11-shim_facets.cc: Suppress -Wabi warnings.

Tested x86_64-linux, committed to trunk.
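
As a standalone illustration of the suppression pattern, the sketch below
brackets a function that takes an empty struct by value with diagnostic
pragmas.  The names toy_empty and takes_empty are made up for this example
and do not appear in cxx11-shim_facets.cc.

```cpp
// An empty class passed by value, the kind of parameter the -Wabi
// warnings in question are about.
struct toy_empty {};

#pragma GCC diagnostic push
// Silence -Wabi for the declarations up to the matching pop only.
#pragma GCC diagnostic ignored "-Wabi"

static int
takes_empty (toy_empty, int value)
{
  return value;
}

#pragma GCC diagnostic pop
```

The push/pop pair restores the previous diagnostic state afterwards, so code
following the region is still checked with whatever -Wabi setting was given
on the command line.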

commit 4f4aa94bc56613ff8f3079f471bce14bdd4b2407
Author: Jonathan Wakely 
Date:   Mon Oct 15 13:16:26 2018 +0100

PR libstdc++/87587 prevent -Wabi warnings

The warnings about changes to empty struct parameter passing can be
ignored because the callers are all internal to the library, and so
compiled with the same -fabi-version as the function definitions.

It would be preferable to use #pragma GCC diagnostic warning "-Wabi=12"
to get warnings about any other ABI changes in future versions, but
until PR c++/87611 is fixed the warnings must be completely disabled
with #pragma GCC diagnostic ignored "-Wabi".

PR libstdc++/87587
* src/c++11/cxx11-shim_facets.cc: Suppress -Wabi warnings.

diff --git a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc 
b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc
index 017b0a0fdb6..78537bd152f 100644
--- a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc
+++ b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc
@@ -224,6 +224,11 @@ namespace __facet_shims
 void
 __messages_close(other_abi, const facet*, messages_base::catalog);
 
+#pragma GCC diagnostic push
+// Suppress -Wabi=2 warnings due to empty struct argument passing changes.
+// TODO This should use -Wabi=12 but that currently fails (PR c++/87611).
+#pragma GCC diagnostic ignored "-Wabi"
+
   namespace // unnamed
   {
 struct __shim_accessor : facet
@@ -767,6 +772,8 @@ namespace __facet_shims
return m->put(s, intl, io, fill, units);
 }
 
+#pragma GCC diagnostic pop
+
   template ostreambuf_iterator
   __money_put(current_abi, const facet*, ostreambuf_iterator,
bool, ios_base&, char, long double, const __any_string*);


Re: [PATCH/RFC] Add "User Experience Guidelines" to gccint.texi

2018-10-15 Thread Richard Sandiford
Another thanks for doing this.

Martin Sebor  writes:
> On 10/12/2018 09:43 AM, David Malcolm wrote:
>> +Avoid using the @code{input_location} global, and the diagnostic functions
>> +that implicitly use it - use @code{error_at} and @code{warning_at} rather
>> +than @code{error} and @code{warning}.
>> +
>> +@c TODO labelling of ranges
>> +
>> +@subsection Coding Conventions
>> +
>> +See the @uref{https://gcc.gnu.org/codingconventions.html#Diagnostics,
>> +diagnostics section} of the GCC coding conventions.
>> +
>> +In the C++ frontend, when comparing two types in a message, use @code{%H}
>> +and @code{%I} rather than @code{%T}, as this allows the diagnostics
>> +subsystem to highlight differences between template-based types.
>> +
>> +Use @code{auto_diagnostic_group} when issuing multiple related
>> +diagnostics (seen in various examples on this page).  This informs the
>> +diagnostic subsystem that all diagnostics issued within the lifetime
>> +of the @code{auto_diagnostic_group} are related.  (Currently it doesn't
>> +do anything with this information, but we may implement that in the
>> +future).
>
> Same here.  I have never used this and even though I saw the %H
> and %I patches go in I probably wouldn't have thought of using
> these codes.  Having a little tutorial (or an example showing
> the effect of these directives) might help.

Yeah, agree a bad %T vs. good %H/%I example would help here.

>> +
>> +@subsection Spelling and Terminology
>> +
>> +See the @uref{https://gcc.gnu.org/codingconventions.html#Spelling
>> +Spelling, terminology and markup} section of the GCC coding conventions.
>> +
>> +@subsection Tense of messages
>> +Syntax errors occur in the present tense e.g.
>> +
>> +@smallexample
>> +error_at (loc, "cannot convert %qH to %qI");
>> +@end smallexample
>> +
>> +and thus
>> +
>> +@smallexample
>> +// CORRECT: usage of present tense:
>> +error: cannot convert 'int' to 'void *'
>> +@end smallexample
>
> This sounds fine.  It would be nice to change the few instances
> of the past tense.  I think past tense is appropriate in runtime
> messages like "could not open file" so it might be worth mentioning.

Yeah.  Is the conversion example above really a case of fixing the tense,
or of fixing the error to be in terms of the user's code?  The implicit
subject in "could not convert ..." is clearly "I"/"the compiler",
making the past tense appropriate.  If the compiler failed to find
something, it wouldn't make sense to do a straight tense swap from
"failed to find..." to "fail to find...".

In "cannot convert ..." I think the subject becomes "you"/"the code" and
the message becomes a prohibition.  That's a good thing because like you
say the error should be in terms of the user's code (although I guess,
going back to your quote at the start of the page, that also means it
has a slightly more lecturing tone -- "hey, you can't do that!")

I guess using the present tense also means that we shouldn't treat the
code as a narrative, so maybe the examples should include diagnostics
that correctly talk in terms of the user's code but use the wrong tense.
E.g. if we don't want to use the past tense:

  error ("redefinition of %q#D", value);
  inform (DECL_SOURCE_LOCATION (value),
  "%q#D previously defined here", value);

should presumably be something like:

  error ("redefinition of %q#D", value);
  inform (DECL_SOURCE_LOCATION (value),
  "previous definition of %q#D is here", value);

(Although TBH the original or:

  "previous definition of %q#D was here"

sound more natural to me.)

>> +rather than
>> +
>> +@smallexample
>> +// BAD: usage of past tense:
>> +error: could not convert 'int' to 'void *'
>> +@end smallexample
>> +
>> +Predictions about run-time behavior should be described in the future tense 
>> e.g.
>> +
>> +@smallexample
>> +warning_at (loc, OPT_some_warning,
>> +"this code will read past the end of the array");
>> +@end smallexample
>
> I don't think this will always achieve a good result.  Consider
> warnings like -Wuninitialized:
>
>  'x' is used uninitialized in this function
>
> A slightly different example is -Walloca, -Warray-bounds, and
> -Wvla: it makes more sense (to me) to say "value is too large"
> or "index is out of bounds" than "it will be too large" etc.
>
> It would be fine for warnings like -Wformat-truncation but all
> it will do there is add a word without making things any clearer:
> "output truncated" vs "output will be truncated."

FWIW, this would also be the opposite of the GCC convention for
documentation, where predictions about compiler behaviour in particular
situations should be given in the present tense rather than the future
tense.

Thanks,
Richard


Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-15 Thread Robin Dapp
Hi,

> See my last message.  I find myself wondering if we need to reset
> INSN_PRIORITY_STATUS in update_insn_after_change and/or calling
> update_insn_after_change on INSN in addition to calling it on DESC->insn.

I tried calling update_insn_after_change even before sending my message
but it seems to modify variables that are assumed to be set at this
stage of the algorithm (resetting INSN_TICK (insn) and INSN_COST (insn)
causes ICEs in other places).  It suffices, however, to call priority
like in the attached patch to achieve the same result.

Test suite is still running.

Regards
 Robin
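
To show why a cached priority can go stale, here is a toy model in which an
insn's priority is the longest cost path through its consumers.  The
toy_insn type and the path computation are simplified assumptions and do not
mirror the scheduler's data structures.

```cpp
#include <algorithm>
#include <vector>

// Toy insn: its own cost, the indices of insns that consume its result,
// and a cached priority.
struct toy_insn
{
  int cost = 0;
  std::vector<int> consumers;
  int priority = 0;
  bool priority_known = false;
};

// Longest cost path starting at insn I, cached unless FORCE_RECOMPUTE is set.
static int
priority (std::vector<toy_insn> &insns, int i, bool force_recompute = false)
{
  toy_insn &insn = insns[i];
  if (force_recompute || !insn.priority_known)
    {
      int best = insn.cost;
      for (int c : insn.consumers)
        best = std::max (best, insn.cost + priority (insns, c));
      insn.priority = best;
      insn.priority_known = true;
    }
  return insn.priority;
}
```

If a dependence is removed after the first call, a plain call still returns
the old cached value; only a call with force_recompute set reflects the
shorter path.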

gcc/ChangeLog:

2018-10-15  Robin Dapp  

* haifa-sched.c (priority): Add force_recompute parameter.
(apply_replacement):
Call priority () with force_recompute = true.
(restore_pattern): Likewise.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb2..8aab37a2ba8 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -1590,7 +1590,7 @@ bool sched_fusion;
 
 /* Compute the priority number for INSN.  */
 static int
-priority (rtx_insn *insn)
+priority (rtx_insn *insn, bool force_recompute)
 {
   if (! INSN_P (insn))
 return 0;
@@ -1598,7 +1598,7 @@ priority (rtx_insn *insn)
   /* We should not be interested in priority of an already scheduled insn.  */
   gcc_assert (QUEUE_INDEX (insn) != QUEUE_SCHEDULED);
 
-  if (!INSN_PRIORITY_KNOWN (insn))
+  if (force_recompute || !INSN_PRIORITY_KNOWN (insn))
 {
   int this_priority = -1;
 
@@ -1695,6 +1695,12 @@ priority (rtx_insn *insn)
 
   return INSN_PRIORITY (insn);
 }
+
+static int
+priority (rtx_insn *insn)
+{
+  return priority (insn, false);
+}
 
 /* Macros and functions for keeping the priority queue sorted, and
dealing with queuing and dequeuing of instructions.  */
@@ -4713,7 +4719,12 @@ apply_replacement (dep_t dep, bool immediately)
   success = validate_change (desc->insn, desc->loc, desc->newval, 0);
   gcc_assert (success);
 
+  rtx_insn *insn = DEP_PRO (dep);
+
+  /* Recompute priority since dependent priorities have changed.  */
+  priority (insn, true);
   update_insn_after_change (desc->insn);
+
   if ((TODO_SPEC (desc->insn) & (HARD_DEP | DEP_POSTPONED)) == 0)
 	fix_tick_ready (desc->insn);
 
@@ -4767,7 +4778,13 @@ restore_pattern (dep_t dep, bool immediately)
 
   success = validate_change (desc->insn, desc->loc, desc->orig, 0);
   gcc_assert (success);
+
+  rtx_insn *insn = DEP_PRO (dep);
+  /* Recompute priority since dependent priorities have changed.  */
+  priority (insn, true);
+
   update_insn_after_change (desc->insn);
+
   if (backtrack_queue != NULL)
 	{
 	  backtrack_queue->replacement_deps.safe_push (dep);


Re: [PATCH] PR libstdc++/86751 default assignment operators for std::pair

2018-10-15 Thread Jonathan Wakely

On 31/07/18 23:34 +0100, Jonathan Wakely wrote:

On 31/07/18 18:40 +0100, Jonathan Wakely wrote:

On 31/07/18 20:14 +0300, Ville Voutilainen wrote:

On 31 July 2018 at 20:07, Jonathan Wakely  wrote:

The solution for PR 77537 causes ambiguities due to the extra copy
assignment operator taking a __nonesuch_no_braces parameter. The copy
and move assignment operators can be defined as defaulted to meet the
semantics required by the standard.

In order to preserve ABI compatibility (specifically argument passing
conventions for pair) we need a new base class that makes the
assignment operators non-trivial.

  PR libstdc++/86751
  * include/bits/stl_pair.h (__nonesuch_no_braces): Remove.
  (__pair_base): New class with non-trivial copy assignment operator.
  (pair): Derive from __pair_base. Define copy assignment and move
  assignment operators as defaulted.
  * testsuite/20_util/pair/86751.cc: New test.


Ville, this passes all our tests, but am I forgetting something that
means this isn't right?


Pairs of references?


I knew there was a reason.

We need better tests, since nothing failed when I made this change.

OK, let me rework the patch ...


Here's the patch I've committed. It adds a test for pairs of
references, so I don't try to define the assignment ops as defaulted
again :-)  Thanks for the quick feedback for these patches.

Tested powerpc64le-linux, committed to trunk.

This is a regression on all branches, but I'd like to leave it on
trunk for a short while before backporting it.


Now backported to all active branches.
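
The mechanism can be tried out with a reduced model.  The names toy_pair,
toy_pair_base and toy_nonesuch are hypothetical and the class is far simpler
than std::pair; it only sketches how a base class with a deleted copy
assignment operator interacts with a conditionally typed operator=.

```cpp
#include <type_traits>

template<typename T1, typename T2> struct toy_pair;

// Placeholder type that nobody can create, used as a "never matches"
// parameter type.
struct toy_nonesuch
{
  toy_nonesuch () = delete;
};

// Base class whose only job is to make the derived class's implicit copy
// assignment operator deleted.
class toy_pair_base
{
  template<typename T1, typename T2> friend struct toy_pair;
  toy_pair_base () = default;
  ~toy_pair_base () = default;
  toy_pair_base (const toy_pair_base&) = default;
  toy_pair_base& operator= (const toy_pair_base&) = delete;
};

template<typename T1, typename T2>
struct toy_pair : private toy_pair_base
{
  T1 first;
  T2 second;

  toy_pair (T1 a, T2 b) : first (a), second (b) {}

  // When both members are assignable this is the copy assignment operator.
  // Otherwise its parameter is toy_nonesuch, it is not a copy assignment
  // operator, and the implicit one is deleted because of the base class.
  toy_pair&
  operator= (typename std::conditional<
               std::is_copy_assignable<T1>::value
               && std::is_copy_assignable<T2>::value,
               const toy_pair&, const toy_nonesuch&>::type p)
  {
    first = p.first;
    second = p.second;
    return *this;
  }
};
```

In this model a toy_pair<int&, int&> assigns through its references, which
a defaulted assignment operator could not do, while a toy_pair<const int,
int> is reported as not copy-assignable.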


commit 988a9158fd074353621f4f216270109c767a4725
Author: Jonathan Wakely 
Date:   Tue Jul 31 17:26:04 2018 +0100

   PR libstdc++/86751 default assignment operators for std::pair
   
   The solution for PR 77537 causes ambiguities due to the extra copy
   assignment operator taking a __nonesuch_no_braces parameter. By making
   the base class non-assignable we don't need the extra deleted overload
   in std::pair. The copy assignment operator will be implicitly deleted
   (and the move assignment operator not declared) as needed. Without the
   additional user-provided operator in std::pair the ambiguity is avoided.
   
   PR libstdc++/86751
   * include/bits/stl_pair.h (__pair_base): New class with deleted copy
   assignment operator.
   (pair): Derive from __pair_base.
   (pair::operator=): Remove deleted overload.
   * python/libstdcxx/v6/printers.py (StdPairPrinter): New pretty printer
   so that new base class isn't shown in GDB.
   * testsuite/20_util/pair/86751.cc: New test.
   * testsuite/20_util/pair/ref_assign.cc: New test.
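
The effect described in the commit message can be sketched outside the
library. The names no_assign_base, plain_derived and custom_derived are
hypothetical and only mimic the shape of __pair_base and pair: a base
with a deleted copy assignment operator removes the implicit assignment
operators of a derived class, while a user-provided operator= in the
derived class keeps working.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a base class that cannot be assigned.
class no_assign_base
{
protected:
  no_assign_base() = default;
  ~no_assign_base() = default;
  no_assign_base(const no_assign_base&) = default;
  no_assign_base& operator=(const no_assign_base&) = delete;
};

// No user-provided operator=: the implicit ones are unusable.
struct plain_derived : private no_assign_base
{
  int value = 0;
};

// A user-provided operator= does not need to assign the base.
struct custom_derived : private no_assign_base
{
  int value = 0;
  custom_derived& operator=(const custom_derived& other)
  {
    value = other.value;
    return *this;
  }
};

static_assert(!std::is_copy_assignable<plain_derived>::value,
              "implicit copy assignment is deleted");
static_assert(!std::is_move_assignable<plain_derived>::value,
              "no usable move assignment either");
static_assert(std::is_copy_assignable<custom_derived>::value,
              "user-provided assignment still works");

inline void check_user_provided_assignment()
{
  custom_derived a, b;
  b.value = 7;
  a = b;
  assert(a.value == 7);
}
```
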

diff --git a/libstdc++-v3/include/bits/stl_pair.h b/libstdc++-v3/include/bits/stl_pair.h
index a2486ba8244..ea8bd981559 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -185,8 +185,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  struct __nonesuch_no_braces : std::__nonesuch {
explicit __nonesuch_no_braces(const __nonesuch&) = delete;
  };
+#endif // C++11

-#endif
+  class __pair_base
+  {
+#if __cplusplus >= 201103L
+template<typename _T1, typename _T2> friend struct pair;
+__pair_base() = default;
+~__pair_base() = default;
+__pair_base(const __pair_base&) = default;
+__pair_base& operator=(const __pair_base&) = delete;
+#endif // C++11
+  };

 /**
   *  @brief Struct holding two objects of arbitrary type.
@@ -196,6 +206,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
  template<typename _T1, typename _T2>
struct pair
+: private __pair_base
{
  typedef _T1 first_type;/// @c first_type is the first bound type
  typedef _T2 second_type;   /// @c second_type is the second bound type
@@ -374,19 +385,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *this;
  }

-  pair&
-  operator=(typename conditional<
-   __not_<__and_<is_copy_assignable<_T1>,
- is_copy_assignable<_T2>>>::value,
-   const pair&, const __nonesuch_no_braces&>::type __p) = delete;
-
  pair&
  operator=(typename conditional<
__and_<is_move_assignable<_T1>,
   is_move_assignable<_T2>>::value,
pair&&, __nonesuch_no_braces&&>::type __p)
  noexcept(__and_<is_nothrow_move_assignable<_T1>,
- is_nothrow_move_assignable<_T2>>::value)
+ is_nothrow_move_assignable<_T2>>::value)
  {
first = std::forward<first_type>(__p.first);
second = std::forward<second_type>(__p.second);
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 34d8b4e6606..43d459ec8ec 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1229,6 +1229,39 @@ class StdExpPathPrinter:
return self._iterator(self.val['_M_cmpts'])


+class StdPairPrinter:
+"Print a std::pair object, with 'first' and 'second' as children"
+
+def __init__(self, typename, val):
+self.val = val
+
+class

Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Richard Sandiford
Jakub Jelinek  writes:

> On Mon, Oct 15, 2018 at 01:53:09PM +0300, Alexander Monakov wrote:
>> On Mon, 15 Oct 2018, Jakub Jelinek wrote:
>> 
>> > On Mon, Oct 15, 2018 at 01:36:36PM +0300, Alexander Monakov wrote:
>> > > On Mon, 15 Oct 2018, Richard Biener wrote:
>> > > > 
>> > > > Oh, and I personally find %` ugly ;)  What non-alnum chars
>> > > > are taken by backends?
>> > > 
>> > > I think only double quote, backslash, backtick remain
>> > > unclaimed. And of course
>> > > ASCII \0 through \040 and \177 ;)
>> > 
>> > As has been said, the way microblaze claims non-alnum characters it doesn't
>> > support is just bogus, so we shouldn't consider them to be taken.
>> 
>> I understand - I've made an effort to manually go through the backends and
>> find characters they meaningfully handle in their print_operand hooks. In
>> particular MIPS handles all of []()<> (but %[ is special anyway, for
>> %[name] substitution).
>
> Ugh.  Wonder how %[name] then works on mips or if its %[ something %] works.

It's only provided for .md patterns (and probably predates the
named operands in inline asms), so in practice there's no problem.

Richard


[PATCH] Fix PR87610

2018-10-15 Thread Richard Biener


This fixes an issue with restrict noted by N2260.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2018-10-15  Richard Biener  

PR middle-end/87610
* tree-ssa-structalias.c (struct vls_data): Add escaped_p member.
(visit_loadstore): When a used restrict tag escaped verify that
the points-to solution of "other" pointers do not include
escaped.
(compute_dependence_clique): If a used restrict tag escaped
communicate that down to visit_loadstore.

* gcc.dg/torture/restrict-6.c: New testcase.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 265038)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -7397,6 +7397,7 @@ delete_points_to_sets (void)
 struct vls_data
 {
   unsigned short clique;
+  bool escaped_p;
   bitmap rvars;
 };
 
@@ -7408,6 +7409,7 @@ visit_loadstore (gimple *, tree base, tr
 {
   unsigned short clique = ((vls_data *) data)->clique;
   bitmap rvars = ((vls_data *) data)->rvars;
+  bool escaped_p = ((vls_data *) data)->escaped_p;
   if (TREE_CODE (base) == MEM_REF
   || TREE_CODE (base) == TARGET_MEM_REF)
 {
@@ -7428,7 +7430,8 @@ visit_loadstore (gimple *, tree base, tr
return false;
 
  vi = get_varinfo (find (vi->id));
- if (bitmap_intersect_p (rvars, vi->solution))
+ if (bitmap_intersect_p (rvars, vi->solution)
+ || (escaped_p && bitmap_bit_p (vi->solution, escaped_id)))
return false;
}
 
@@ -7505,6 +7508,7 @@ compute_dependence_clique (void)
   unsigned short clique = 0;
   unsigned short last_ruid = 0;
   bitmap rvars = BITMAP_ALLOC (NULL);
+  bool escaped_p = false;
   for (unsigned i = 0; i < num_ssa_names; ++i)
 {
   tree ptr = ssa_name (i);
@@ -7574,7 +7578,12 @@ compute_dependence_clique (void)
 last_ruid);
}
  if (used)
-   bitmap_set_bit (rvars, restrict_var->id);
+   {
+ bitmap_set_bit (rvars, restrict_var->id);
+ varinfo_t escaped = get_varinfo (find (escaped_id));
+ if (bitmap_bit_p (escaped->solution, restrict_var->id))
+   escaped_p = true;
+   }
}
 }
 
@@ -7587,7 +7596,7 @@ compute_dependence_clique (void)
 parameters) we can't restrict scoping properly thus the following
 is too aggressive there.  For now we have excluded those globals from
 getting into the MR_DEPENDENCE machinery.  */
-  vls_data data = { clique, rvars };
+  vls_data data = { clique, escaped_p, rvars };
   basic_block bb;
   FOR_EACH_BB_FN (bb, cfun)
for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
Index: gcc/testsuite/gcc.dg/torture/restrict-6.c
===
--- gcc/testsuite/gcc.dg/torture/restrict-6.c   (revision 0)
+++ gcc/testsuite/gcc.dg/torture/restrict-6.c   (working copy)
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+
+extern void abort (void);
+
+void __attribute__((noinline)) g(int **a, int *b)
+{
+  *a = b;
+}
+
+int foo(int * restrict p, int *q)
+{
+  g(&q, p);
+  *p = 1;
+  *q = 2;
+  return *p + *q;
+}
+
+int main()
+{
+  int x, y;
+  if (foo(&x, &y) != 4)
+abort ();
+  return 0;
+}


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Jakub Jelinek
On Mon, Oct 15, 2018 at 01:53:09PM +0300, Alexander Monakov wrote:
> On Mon, 15 Oct 2018, Jakub Jelinek wrote:
> 
> > On Mon, Oct 15, 2018 at 01:36:36PM +0300, Alexander Monakov wrote:
> > > On Mon, 15 Oct 2018, Richard Biener wrote:
> > > > 
> > > > Oh, and I personally find %` ugly ;)  What non-alnum chars
> > > > are taken by backends?
> > > 
> > > I think only double quote, backslash, backtick remain unclaimed. And of
> > > course ASCII \0 through \040 and \177 ;)
> > 
> > As has been said, the way microblaze claims non-alnum characters it doesn't
> > support is just bogus, so we shouldn't consider them to be taken.
> 
> I understand - I've made an effort to manually go through the backends and
> find characters they meaningfully handle in their print_operand hooks. In
> particular MIPS handles all of []()<> (but %[ is special anyway, for
> %[name] substitution).

Ugh.  Wonder how %[name] then works on mips or if its %[ something %] works.

Jakub


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Alexander Monakov
On Mon, 15 Oct 2018, Jakub Jelinek wrote:

> On Mon, Oct 15, 2018 at 01:36:36PM +0300, Alexander Monakov wrote:
> > On Mon, 15 Oct 2018, Richard Biener wrote:
> > > 
> > > Oh, and I personally find %` ugly ;)  What non-alnum chars
> > > are taken by backends?
> > 
> > I think only double quote, backslash, backtick remain unclaimed. And of
> > course ASCII \0 through \040 and \177 ;)
> 
> As has been said, the way microblaze claims non-alnum characters it doesn't
> support is just bogus, so we shouldn't consider them to be taken.

I understand - I've made an effort to manually go through the backends and
find characters they meaningfully handle in their print_operand hooks. In
particular MIPS handles all of []()<> (but %[ is special anyway, for
%[name] substitution).

Alexander


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Jakub Jelinek
On Mon, Oct 15, 2018 at 01:36:36PM +0300, Alexander Monakov wrote:
> On Mon, 15 Oct 2018, Richard Biener wrote:
> > 
> > Oh, and I personally find %` ugly ;)  What non-alnum chars
> > are taken by backends?
> 
> I think only double quote, backslash, backtick remain unclaimed. And of course
> ASCII \0 through \040 and \177 ;)

As has been said, the way microblaze claims non-alnum characters it doesn't
support is just bogus, so we shouldn't consider them to be taken.

Jakub


Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Alexander Monakov
On Mon, 15 Oct 2018, Richard Biener wrote:
> 
> Oh, and I personally find %` ugly ;)  What non-alnum chars
> are taken by backends?

I think only double quote, backslash, backtick remain unclaimed. And of course
ASCII \0 through \040 and \177 ;)

Alexander


[SVE ACLE] Implements ACLE svdup, svindex, svqad/qsub, svabd and svmul

2018-10-15 Thread Kugan Vivekanandarajah
Hi,
The attached patch implements the ACLE svdup, svindex, svqadd/svqsub, svabd and
svmul built-ins.
Committed to the ACLE branch.
Thanks,
Kugan


0001-svdup-svindex-svqad-qsub-svabd-and-svmul.patch.gz
Description: application/gzip


Re: [ARM/FDPIC v3 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2018-10-15 Thread Christophe Lyon
On Fri, 12 Oct 2018 at 12:01, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote:

> On 11/10/18 14:34, Christophe Lyon wrote:
> > In FDPIC mode, we set -fPIE unless the user provides -fno-PIE, -fpie,
> > -fPIC or -fpic: indeed FDPIC code is PIC, but we want to generate code
> > for executables rather than shared libraries by default.
> >
> > We also make sure to use the --fdpic assembler option, and select the
> > appropriate linker emulation.
> >
> > At link time, we also default to -pie, unless we are generating a
> > shared library or a relocatable file (-r). Note that even for static
> > link, we must specify the dynamic linker because the executable still
> > has to relocate itself at startup.
> >
> > We also force 'now' binding since lazy binding is not supported.
> >
> > We should also apply the same behavior for -Wl,-Ur as for -r, but I
> > couldn't find how to describe that in the specs fragment.
> >
> > 2018-XX-XX  Christophe Lyon  
> >   Mickaël Guêné 
> >
> >   gcc/
> >   * config.gcc: Handle arm*-*-uclinuxfdpiceabi.
> >   * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): New.
> >   (SUBTARGET_EXTRA_ASM_SPEC): Use TARGET_FDPIC_ASM_SPEC.
> >   * config/arm/linux-eabi.h (FDPIC_CC1_SPEC): New.
> >   (CC1_SPEC): Use FDPIC_CC1_SPEC.
> >   * config/arm/uclinuxfdpiceabi.h: New file.
> >
> >   libsanitizer/
> >   * configure.tgt (arm*-*-uclinuxfdpiceabi): Sanitizers are
> >   unsupported in this configuration.
>
> The documentation (in patch 1) seems to imply that -mfdpic is available
> in all configurations and has certain effects (such as enabling -fPIE),
> but this patch set suggests that such behaviours are only available when
> the compiler is configured explicitly for an fdpic target.
>
> I think this needs to be resolved.  Either -mfdpic works everywhere, or
> the option should only be available when configured for -mfdpic.
>
>
You are right, this is not clear. I tried to follow what other fdpic
targets do, but it's not consistent either, it seems.

So, at present, -mfdpic alone is in general not sufficient, and the user has
to use -fpic/-fPIC/-fpie/-fPIE as needed. When GCC is configured for
arm-uclinuxfdpiceabi, this is done implicitly (thanks to this patch).

One possibility is to rephrase the doc, and say that -fPIE is only implied
when GCC is configured for arm-uclinuxfdpiceabi.

Do you mean to also make -mfdpic non-existent/rejected when GCC is not
configured for arm-uclinuxfdpiceabi? How would that be achieved?


> R.
>
> >
> > Change-Id: If369e0a10bb916fd72e38f71498d3c640fa85c4c
> >
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 793fc69..a4f4331 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -1144,6 +1144,11 @@ arm*-*-linux-* | arm*-*-uclinuxfdpiceabi)   # ARM GNU/Linux with ELF
> >   esac
> >   tmake_file="${tmake_file} arm/t-arm arm/t-arm-elf arm/t-bpabi arm/t-linux-eabi"
> >   tm_file="$tm_file arm/bpabi.h arm/linux-eabi.h arm/aout.h arm/arm.h"
> > + case $target in
> > + arm*-*-uclinuxfdpiceabi)
> > + tm_file="$tm_file arm/uclinuxfdpiceabi.h"
> > + ;;
> > + esac
> >   # Generation of floating-point instructions requires at least ARMv5te.
> >   if [ "$with_float" = "hard" -o "$with_float" = "softfp" ] ; then
> >   target_cpu_cname="arm10e"
> > diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
> > index 1e3ecfb..5901154 100644
> > --- a/gcc/config/arm/bpabi.h
> > +++ b/gcc/config/arm/bpabi.h
> > @@ -55,6 +55,8 @@
> >  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
> >"|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
> >
> > +#define TARGET_FDPIC_ASM_SPEC  ""
> > +
> >  #define BE8_LINK_SPEC  \
> >"%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}" \
> >" %{mbig-endian:big}"  \
> > @@ -64,7 +66,7 @@
> >  /* Tell the assembler to build BPABI binaries.  */
> >  #undef  SUBTARGET_EXTRA_ASM_SPEC
> >  #define SUBTARGET_EXTRA_ASM_SPEC \
> > -  "%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=5}" TARGET_FIX_V4BX_SPEC
> > +  "%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=5}" TARGET_FIX_V4BX_SPEC TARGET_FDPIC_ASM_SPEC
> >
> >  #ifndef SUBTARGET_EXTRA_LINK_SPEC
> >  #define SUBTARGET_EXTRA_LINK_SPEC ""
> > diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
> > index 8585fde..4cee958 100644
> > --- a/gcc/config/arm/linux-eabi.h
> > +++ b/gcc/config/arm/linux-eabi.h
> > @@ -98,11 +98,14 @@
> >  #undef  ASAN_CC1_SPEC
> >  #define ASAN_CC1_SPEC "%{%:sanitize(address):-funwind-tables}"
> >
> > +#define FDPIC_CC1_SPEC ""
> > +
> >  #undef  CC1_SPEC
> >  #define CC1_SPEC \
> > -  LINUX_OR_ANDROID_CC (GNU_USER_TARGET_CC1_SPEC " " ASAN_CC1_SPEC,   \
> > +  LINUX_OR_ANDROID_CC (GNU_USER_TARGET_CC1_SPEC " " ASAN_CC1_SPEC " "
>   \
> > +FDPIC_CC1_SPEC, 

Re: [PATCH] __debug::list use C++11 direct initialization

2018-10-15 Thread Jonathan Wakely

On 09/10/18 07:11 +0200, François Dumont wrote:
Here is the communication for yesterday's patch, which I thought svn
had failed to commit (I had to interrupt it).

Similarly to what I've done for associative containers, here is a
cleanup of the std::__debug::list implementation making more use of
C++11 direct initialization.

I also made sure we use consistent comparison between
iterator/const_iterator in erase and emplace methods.


2018-10-08  François Dumont 

    * include/debug/list (list<>::cbegin()): Use C++11 direct
    initialization.
    (list<>::cend()): Likewise.
    (list<>::emplace<>(const_iterator, _Args&&...)): Likewise.
    (list<>::insert(const_iterator, initializer_list<>)): Likewise.
    (list<>::insert(const_iterator, size_type, const _Tp&)): Likewise.
    (list<>::erase(const_iterator, const_iterator)): Ensure consistent
    iterator comparisons.
    (list<>::splice(const_iterator, list&&, const_iterator,
    const_iterator)): Likewise.

Tested under Linux x86_64 Debug mode and committed.

François




diff --git a/libstdc++-v3/include/debug/list b/libstdc++-v3/include/debug/list
index 8add1d596e0..879e1177497 100644
--- a/libstdc++-v3/include/debug/list
+++ b/libstdc++-v3/include/debug/list
@@ -244,11 +244,11 @@ namespace __debug
#if __cplusplus >= 201103L
  const_iterator
  cbegin() const noexcept
-  { return const_iterator(_Base::begin(), this); }
+  { return { _Base::begin(), this }; }

  const_iterator
  cend() const noexcept
-  { return const_iterator(_Base::end(), this); }
+  { return { _Base::end(), this }; }


For functions like emplace (which are C++11-only) and for forward_list
(also C++11-only) using this syntax makes it clearer.

But for these functions it just makes cbegin() and cend() look
different to the C++98 begin() and end() functions, for no obvious
benefit.

Simply using { return end(); } would have been another option.
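
The two spellings under discussion can be compared in a minimal sketch.
The types checked_iter and checked_seq are hypothetical and much simpler
than the real debug-mode iterators; the sketch only assumes a
non-explicit two-argument constructor, which is what the braced return
relies on.

```cpp
#include <cassert>

// Hypothetical miniature of a debug iterator: a position plus its owner.
struct checked_iter
{
  const int* pos;
  const void* owner;
  checked_iter(const int* p, const void* o) : pos(p), owner(o) { }
};

struct checked_seq
{
  int data[3];

  // C++98-compatible spelling: name the type explicitly.
  checked_iter begin_explicit() const
  { return checked_iter(data, this); }

  // C++11 spelling: the braces list-initialize the return value.
  checked_iter begin_braced() const
  { return { data, this }; }
};

inline void check_braced_return()
{
  checked_seq s = { { 1, 2, 3 } };
  assert(s.begin_explicit().pos == s.begin_braced().pos);
  assert(s.begin_braced().owner == &s);
}
```

Both functions produce the same object; the difference is purely one of
spelling, which is the readability point made above.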




Re: Extend usage of C++11 direct init in __debug::vector

2018-10-15 Thread Jonathan Wakely

On 15/10/18 07:23 +0200, François Dumont wrote:
This patch extends the usage of C++11 direct initialization in
__debug::vector and makes some calls to operator- more consistent.


Note that I also rewrote the following expression in the erase method:

-      return begin() + (__first.base() - cbegin().base());
+      return { _Base::begin() + (__first.base() - _Base::cbegin()), this };

The former version was building 2 safe iterators and incrementing 1,
with the additional debug check inherent to such an operation, whereas
the new version builds just 1 safe iterator directly at the expected
offset.


Makes sense.
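
The saving can be made visible with a toy model. tracked_iter and
tracked_seq are hypothetical types that merely count how many "checked"
iterators get built; they are not the libstdc++ _Safe_iterator machinery
and ignore its locking and attachment costs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct tracked_seq;

// Hypothetical wrapper that counts how often it is built from a raw iterator.
struct tracked_iter
{
  static int checked_constructions;
  std::vector<int>::const_iterator base;
  const tracked_seq* owner;

  tracked_iter(std::vector<int>::const_iterator b, const tracked_seq* o)
  : base(b), owner(o)
  { ++checked_constructions; }

  // Arithmetic on the wrapper builds another wrapper.
  tracked_iter operator+(std::ptrdiff_t n) const
  { return tracked_iter(base + n, owner); }
};

int tracked_iter::checked_constructions = 0;

struct tracked_seq
{
  std::vector<int> items;

  tracked_iter begin() const
  { return tracked_iter(items.begin(), this); }

  // Old shape: wrap begin(), then advance the wrapper.
  tracked_iter nth_via_wrapper(std::ptrdiff_t n) const
  { return begin() + n; }

  // New shape: advance the raw iterator, wrap once.
  tracked_iter nth_via_base(std::ptrdiff_t n) const
  { return { items.begin() + n, this }; }
};

inline void check_construction_counts()
{
  tracked_seq s;
  s.items = { 10, 20, 30 };

  tracked_iter::checked_constructions = 0;
  tracked_iter a = s.nth_via_wrapper(2);
  assert(tracked_iter::checked_constructions == 2);

  tracked_iter::checked_constructions = 0;
  tracked_iter b = s.nth_via_base(2);
  assert(tracked_iter::checked_constructions == 1);

  assert(*a.base == 30 && *b.base == 30);
}
```
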


2018-10-15  François Dumont  

    * include/debug/vector (vector<>::cbegin()): Use C++11 direct
    initialization.
    (vector<>::cend()): Likewise.
    (vector<>::emplace(const_iterator, _Args&&...)): Likewise and use
    consistent iterator comparison.
    (vector<>::insert(const_iterator, size_type, const _Tp&)): Likewise.
    (vector<>::insert(const_iterator, _InputIterator, _InputIterator)):
    Likewise.
    (vector<>::erase(const_iterator)): Likewise.
    (vector<>::erase(const_iterator, const_iterator)): Likewise.

Tested under Linux x86_64 Debug mode and committed.

François




@@ -542,7 +542,8 @@ namespace __debug
  {
__glibcxx_check_insert(__position);
bool __realloc = this->_M_requires_reallocation(this->size() + 1);
-   difference_type __offset = __position.base() - _Base::begin();
+   difference_type __offset
+ = __position.base() - __position._M_get_sequence()->_M_base().begin();


What's the reason for this change?

Doesn't __glibcxx_check_insert(__position) already ensure that
__position is attached to *this, and so _Base::begin() returns the
same thing as __position._M_get_sequence()->_M_base().begin() ?

If they're equivalent, the original code seems more readable.




Re: Use C++11 direct init in __debug::forward_list

2018-10-15 Thread Jonathan Wakely

On 11/10/18 22:46 +0200, François Dumont wrote:
This patch makes extensive use of C++11 direct init in
__debug::forward_list.


While doing so I also tried to detect useless creation of safe iterators
in the debug implementation. In __debug::forward_list there are several,
but I wonder if it is worth fixing them. Most of them are like this:


  void
  splice_after(const_iterator __pos, forward_list& __list)
  { splice_after(__pos, std::move(__list)); }

__pos is copied.

Do you think I shouldn't care because gcc will optimize it?


I think the _Safe_iterator construction/destruction is too complex to
be optimised away (it locks a mutex, doesn't it?).

Normally I'd say you could use std::move(__pos) but IIRC that's even
more expensive than a copy, as it locks two mutexes.

I wonder if it would be OK in the debug implementation to use this kind of
signature:


void splice_after(const const_iterator& __pos, forward_list& __list)

Iterator taken by const reference?

I guess it is not Standard conformant, so not correct, but maybe I could
add a private _M_splice_after with this signature.

It doesn't seem worthwhile to me.



Re: Make std::vector iterator operators friend inline

2018-10-15 Thread Jonathan Wakely

On 12/10/18 18:25 +0200, François Dumont wrote:

Here is the patch for _Bit_iterator and _Bit_const_iterator operators.

I noticed that _Bit_reference == and < operators could be made inline 
friend too. Do you want me to include this change in the patch ?



    * include/bits/stl_bvector.h (_Bit_iterator_base::operator==): Replace
    member method with inline friend.
    (_Bit_iterator_base::operator<): Likewise.
    (_Bit_iterator_base::operator!=): Likewise.
    (_Bit_iterator_base::operator>): Likewise.
    (_Bit_iterator_base::operator<=): Likewise.
    (_Bit_iterator_base::operator>=): Likewise.
    (operator-(const _Bit_iterator_base&, const _Bit_iterator_base&)): Make
    inline friend.
    (_Bit_iterator::operator+(difference_type)): Replace member method with
    inline friend.
    (_Bit_iterator::operator-(difference_type)): Likewise.
    (operator+(ptrdiff_t, const _Bit_iterator&)): Make inline friend.
    (_Bit_const_iterator::operator+(difference_type)): Replace member method
    with inline friend.
    (_Bit_const_iterator::operator-(difference_type)): Likewise.
    (operator+(ptrdiff_t, const _Bit_const_iterator&)): Make inline
    friend.

Tested under Linux x86_64.

Ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 19c16839cfa..8fbef7a1a3a 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -182,40 +182,40 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  _M_offset = static_cast<unsigned int>(__n);
}

-bool
-operator==(const _Bit_iterator_base& __i) const
-{ return _M_p == __i._M_p && _M_offset == __i._M_offset; }
+friend bool
+operator==(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+{ return __x._M_p == __y._M_p && __x._M_offset == __y._M_offset; }

-bool
-operator<(const _Bit_iterator_base& __i) const
+friend bool
+operator<(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
{
-  return _M_p < __i._M_p
-   || (_M_p == __i._M_p && _M_offset < __i._M_offset);
+  return __x._M_p < __y._M_p
+   || (__x._M_p == __y._M_p && __x._M_offset < __y._M_offset);
}

-bool
-operator!=(const _Bit_iterator_base& __i) const
-{ return !(*this == __i); }
+friend bool
+operator!=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+{ return !(__x == __y); }

-bool
-operator>(const _Bit_iterator_base& __i) const
-{ return __i < *this; }
+friend bool
+operator>(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+{ return __y < __x; }

-bool
-operator<=(const _Bit_iterator_base& __i) const
-{ return !(__i < *this); }
+friend bool
+operator<=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+{ return !(__y < __x); }

-bool
-operator>=(const _Bit_iterator_base& __i) const
-{ return !(*this < __i); }
-  };
+friend bool
+operator>=(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
+{ return !(__x < __y); }

-  inline ptrdiff_t
-  operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
-  {
-return (int(_S_word_bit) * (__x._M_p - __y._M_p)
-   + __x._M_offset - __y._M_offset);
-  }


For the non-member operator- and operator+ this change makes sense,
but what is the benefit of changing all the others? As members they're
already not considered as candidates for unrelated types, or am I
missing something?
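
For reference, one general difference between the two forms is how
implicit conversions apply to the left operand. The sketch below uses
hypothetical types (word_pos_member, word_pos_friend) with a converting
constructor from int; it is not taken from stl_bvector.h and may not be
the motivation for the patch.

```cpp
#include <cassert>

// Comparison as a member: only the right operand can be converted.
struct word_pos_member
{
  int offset;
  word_pos_member(int o) : offset(o) { }   // implicit conversion from int
  bool operator<(const word_pos_member& rhs) const
  { return offset < rhs.offset; }
};

// Comparison as an inline friend: both operands can be converted.
struct word_pos_friend
{
  int offset;
  word_pos_friend(int o) : offset(o) { }   // implicit conversion from int
  friend bool
  operator<(const word_pos_friend& lhs, const word_pos_friend& rhs)
  { return lhs.offset < rhs.offset; }
};

inline void check_friend_operator()
{
  assert(word_pos_member(1) < 2);     // right operand converts
  assert(1 < word_pos_friend(2));     // left operand converts too
  assert(!(word_pos_friend(2) < 1));
  // `1 < word_pos_member(2)` would not compile: a member operator
  // never converts its left operand.
}
```
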



Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-10-15 Thread Richard Biener
On Sun, Oct 14, 2018 at 10:07 PM Alexander Monakov  wrote:
>
> Hello,
>
> This is an alternative proposal to the "asm inline" feature.
>
> Kernel developers have reported suboptimal optimization where use of asm
> statements such as
>
>   asm("ud2\n"
>   ".pushsection foo\n"
>   ...
>   ".popsection\n" : : ...)
>
> impacts inlining decisions badly, since GCC assumes cost of the asm to be
> high, even though it emits just one instruction to the text section. I'd
> like to point out that branch range optimization is also negatively affected.
>
> I suggest we give asm writers a way to mark portions of the asm template
> that should be ignored in cost estimation. This is a more fine-grained
> mechanism compared to 'asm inline', and it also helps branch range 
> optimization.
>
> Specifically, I propose that in Extended asms, percent-backtick (%`) is
> recognized as such region boundary. Percent sign is of course always special
> in Extended asms, and backtick sign is not claimed by any backend.
>
> For Basic asms, no similar mechanism is necessary since they are antithetical
> to efficiency in the first place.
>
> Kernels developers can then use this extension via
>
> [if gcc-9 or compatible]
> #define ASM_NONTEXT_BEGIN "%`\n"
> [else]
> #define ASM_NONTEXT_BEGIN "\n"
> [endif]
>
> #define ASM_NONTEXT_END ASM_NONTEXT_BEGIN
>
>   asm("ud2\n"
>   ASM_NONTEXT_BEGIN
>   ".pushsection foo\n"
>   ...
>   ".popsection\n"
>   ASM_NONTEXT_END : : ...)
>
> How does this look?

I think it's sound, but note that I think it is logically independent of
asm inline ().  While it may work for the inlining issue in some kernel
examples, asm inline () is something similar to always_inline for functions:
that is, even though an asm _does_ have non-negligible .text size,
we do want to ignore that for the purpose of inlining (but not for the
purpose of branch size estimation).

Your idea is good for making convoluted asms more precise.

Note that in your docs you refer to "non-code" sections, but it should
equally apply to .text sections that are not the section of the asm
context (.text.cold, for example).  So better wording there would
be appreciated.

Note that I'm concerned about ignoring %` regions for inlining
purposes, since if .text.cold or .data stuff is ignored those sections
might explode in size.  In this context inlining is _not_ free.
So the question is whether we should add an argument to
asm_insn_count () for whether to ignore non-"code" parts
or not, and I'd say we should _not_ ignore them for inlining.

Then there's the question if we want people to start writing

 "%`.1:\n"
 "%`jne .1\n"

thus, make GCC ignore lines with just labels or other
asm directives.  Or if we should add some (target / assembler)
specific magic to ignore those that are free.

Oh, and I personally find %` ugly ;)  What non-alnum chars
are taken by backends?

Richard.

>
> * doc/extend.texi (Extended Asm): Document %` in template.
> (Size of an Asm): Document intended use of %`.
> * final.c (asm_insn_count): Adjust.
> (asm_str_count): Add argument to distinguish basic and extended asms.
> In extended asms, ignore separators inside of %` ... %`.
> (output_asm_insn): Handle %`.
> * rtl.h (asm_str_count): Adjust prototype.
> * tree-inline.c (estimate_num_insns): Adjust.
> * config/arm/arm.c (arm_rtx_costs_internal): Adjust.
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index cfe6a8e5bb8..798d310061c 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -8613,6 +8613,11 @@ generates multiple assembler instructions.
>  Outputs @samp{@{}, @samp{|}, and @samp{@}} characters (respectively)
>  into the assembler code.  When unescaped, these characters have special
>  meaning to indicate multiple assembler dialects, as described below.
> +
> +@item %`
> +Signifies a boundary of a region where instruction separators are not
> +counted towards its cost (@pxref{Size of an asm}). Must be followed by
> +a whitespace character.
>  @end table
>
>  @subsubheading Multiple assembler dialects in @code{asm} templates
> @@ -9821,7 +9826,7 @@ does this by counting the number of instructions in the pattern of the
>  @code{asm} and multiplying that by the length of the longest
>  instruction supported by that processor.  (When working out the number
>  of instructions, it assumes that any occurrence of a newline or of
> -whatever statement separator character is supported by the assembler --
> +whatever statement separator character is supported by the assembler ---
>  typically @samp{;} --- indicates the end of an instruction.)
>
>  Normally, GCC's estimate is adequate to ensure that correct
> @@ -9832,6 +9837,15 @@ space in the object file than is needed for a single instruction.
>  If this happens then the assembler may produce a diagnostic saying that
>  a label is unreachable.
>
> +Likewise, it is possible for GCC to signif

Re: [PATCH PR87022]Check all bits in dist-vector rather than the first in loop distribution

2018-10-15 Thread Richard Biener
On Sun, Oct 14, 2018 at 1:10 PM bin.cheng  wrote:
>
> Hi,
> This patch fixes PR87022.  The root cause is that the original code checks
> the first bit in the dist vector for zero, and we still do that after
> enabling loop nest distribution.  For the test case, the first bit is for
> the outer loop while the dependence happens in the inner loop; as a result,
> the direction of dependence is not correctly reverted.
> This patch fixes the issue by checking all bits in the dist vector.
>
> Bootstrap and test on x86_64, is it OK?

OK.

Thanks and welcome back.
Richard.

> Thanks,
> bin
>
> 2018-10-14  Bin Cheng  
>
> PR tree-optimization/87022
> * tree-loop-distribution.c (pg_add_dependence_edges): Check all
> bits in dist vector rather than the first one.
>
> 2018-10-14  Bin Cheng  
>
> PR tree-optimization/87022
> * gcc.dg/tree-ssa/pr87022.c: New test.
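
The difference between testing only the first entry and testing all
entries of a distance vector can be shown with a small sketch. The
helper leading_entries_zero is hypothetical and uses std::vector in
place of GCC's own vector type; it is not the code from
tree-loop-distribution.c.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when the first `n` entries of `dist` are zero.
inline bool
leading_entries_zero(const std::vector<int>& dist, std::size_t n)
{
  for (std::size_t i = 0; i < n && i < dist.size(); ++i)
    if (dist[i] != 0)
      return false;
  return true;
}

inline void check_dist_vector()
{
  // Outer loop distance 0, inner loop distance 1.
  const std::vector<int> dist = { 0, 1 };

  // Looking at the outer loop only misses the inner-loop dependence.
  assert(leading_entries_zero(dist, 1));

  // Looking at every loop in the nest sees it.
  assert(!leading_entries_zero(dist, dist.size()));
}
```
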


Re: [Patch, fortran] PR87566 - ICE with class(*) and select

2018-10-15 Thread Thomas Koenig

Hi Paul,


Bootstrapped and regtested on FC28/x86_64 - OK for trunk?


Looks good. Thanks!

Regards

Thomas


Re: [Patch, fortran] PR87566 - ICE with class(*) and select

2018-10-15 Thread Dominique d'Humières
Hi Paul,

The ICEs for the following PRs: 58906, a variant of 77385, 80260, and 82077,
have been fixed between revision r264941 + patches and r265126 + same patches +
this patch + patch for pr56386.

Cheers,

Dominique



Re: [PATCH] Fix PR87473 (SLSR ICE on hidden basis)

2018-10-15 Thread Richard Biener
On Fri, Oct 12, 2018 at 10:01 PM Bill Schmidt  wrote:
>
> Hi,
>
> This patch addresses SLSR bug PR87473.  The underlying problem here is that
> create_add_on_incoming_edge contains code to handle a phi_arg that is equal
> to the base expression of the PHI candidate, where the increment assigned to
> the incoming arc should be zero minus the index expression of the hidden
> basis; but several other places in SLSR processing need to handle the same
> case, and fail to do so.  As a result, the code to replace the PHI basis
> attempts to use an initializing statement that was never created in the first
> place, and we ICE.  This patch adds the necessary logic in four parts of the
> code to ensure we handle this consistently throughout.
>
> This error survived this long because the typical case when accessing the
> hidden basis is for the index of the hidden basis to be zero.  For such a
> case we don't need an initializing statement, and the ICE doesn't trigger.
> The test case provided with the PR is a counter-example where the hidden
> basis has an index of 2.
>
> For the four functions fixed here, each identified the case of interest,
> but just didn't do anything when that case arose.  I've reorganized the
> code in each case to always execute the relevant logic, but change what's
> done for the specific situation of the "pass-through" PHI argument.  This
> makes the diffs a little hard to read, unfortunately.
>
> During the investigation I noticed that we are dealing with a degenerate PHI,
> introduced by the loopinit pass, that we would be better off optimizing away
> sooner:
>
>   [local count: 14598063]:
>   # qz_1 = PHI 
>   # jl_22 = PHI 
>   _8 = (unsigned int) jl_22;
>   _13 = _8 * _15;
>   qz_11 = (int) _13;
>
> The assignment to _8 should just use jl_6 in place of jl_22.  This would
> greatly simplify SLSR's job, since PHI-free code is handled much more
> straightforwardly than code that involves conditional updates.  We go through
> at least 30 passes without having this cleaned up, and I expect other passes
> than SLSR would perhaps be hamstrung by this as well.  Any recommendations?

Without more context these are likely loop-closed PHIs.  It's probably DOM
after loop that gets rid of them currently but a cheaper way would be to
propagate them out in pass_tree_loop_done.  Note that IIRC there are some
other passes rewriting things into loop-closed SSA form that might expose
such degenerate PHIs as well (a quick look shows invariant motion, both
VRP and EVRP should eventually propagate them out during their propagation
step and EVRP shouldn't even need loop-closed SSA?).

So some

  FOR_EACH_LOOP ()
    exits = get_loop_exit_edges ();
    for-each-edge (exits)
      if (single_pred_p (exit->dest))
        for-each-phi (exit->dest)
          propagate ()

in tree-loop-done should do the trick.
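
To make the idea concrete, here is a minimal, self-contained sketch of
degenerate-PHI propagation.  It is a toy model only: the Phi struct and the
functions is_degenerate, propagate_degenerate_phis and rewrite_use are
hypothetical names invented for illustration and are not GCC internals; the
real pass would walk loop exit edges and GIMPLE PHIs as outlined above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a PHI node: result = PHI <args...>.
// (Hypothetical type, not GCC's gphi.)
struct Phi {
  std::string result;
  std::vector<std::string> args;
};

// A PHI is treated as degenerate when it has at least one argument
// and every argument names the same value.
bool is_degenerate(const Phi &phi) {
  if (phi.args.empty())
    return false;
  for (const std::string &arg : phi.args)
    if (arg != phi.args.front())
      return false;
  return true;
}

// Drop degenerate PHIs from PHIS and return a map from each removed
// PHI result to the value that should be used in its place.
std::map<std::string, std::string>
propagate_degenerate_phis(std::vector<Phi> &phis) {
  std::map<std::string, std::string> replacement;
  std::vector<Phi> kept;
  for (const Phi &phi : phis) {
    assert(!phi.result.empty());  // every PHI must define a name
    if (is_degenerate(phi))
      replacement[phi.result] = phi.args.front();
    else
      kept.push_back(phi);
  }
  phis = kept;
  return replacement;
}

// Rewrite a single use.  Chains of replacements are followed, with the
// number of steps bounded by the map size so a cycle cannot loop forever.
std::string rewrite_use(const std::map<std::string, std::string> &replacement,
                        std::string name) {
  for (std::size_t i = 0; i < replacement.size(); ++i) {
    auto it = replacement.find(name);
    if (it == replacement.end())
      break;
    name = it->second;
  }
  return name;
}
```

With a single-argument PHI defining jl_22 from jl_6, as in Bill's description,
rewrite_use maps a use of jl_22 to jl_6, which is the substitution suggested
for the assignment to _8.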

> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  I've
> added the test case from the bugzilla to the torture tests.  Is this okay for
> trunk, and after a suitable period, to GCC 7 and 8 also?

OK for trunk and branches after a while.

Richard.

> Thanks!
> Bill
>
>
> [gcc]
>
> 2018-10-12  Bill Schmidt  
>
> PR tree-optimization/87473
> * gimple-ssa-strength-reduction.c (record_phi_increments_1): For
> phi arguments identical to the base expression of the phi
> candidate, record a phi-adjust increment of zero minus the index
> expression of the hidden basis.
> (phi_incr_cost_1): For phi arguments identical to the base
> expression of the phi candidate, the difference to compare against
> the increment is zero minus the index expression of the hidden
> basis, and there is no potential savings from replacing the (phi)
> statement.
> (ncd_with_phi): For phi arguments identical to the base expression
> of the phi candidate, the difference to compare against the
> increment is zero minus the index expression of the hidden basis.
> (all_phi_incrs_profitable_1): For phi arguments identical to the
> base expression of the phi candidate, the increment to be checked
> for profitability is zero minus the index expression of the hidden
> basis.
>
> [gcc/testsuite]
>
> 2018-10-12  Bill Schmidt  
>
> PR tree-optimization/87473
> * gcc.c-torture/compile/pr87473.c: New file.
>
>
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 265112)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -2779,17 +2779,23 @@ record_phi_increments_1 (slsr_cand_t basis, gimple
>for (i = 0; i < gimple_phi_num_args (phi); i++)
>  {
>tree arg = gimple_phi_arg_def (phi, i);
> +  gimple *arg_def = SSA_NAME_DEF_STMT (arg);
>
> -  if (!operand_equal_p (arg, phi_cand->base_expr, 0))
> +  if (gimple_code (arg_def) == GI

Re: [PR87563][AARCH64-SVE]: Don't keep ifcvt loop when COND_ ifn could not be vectorized.

2018-10-15 Thread Richard Biener
On Fri, Oct 12, 2018 at 6:36 PM Renlin Li  wrote:
>
> Hi all,
>
> ifcvt will create a versioned loop and it will permissively generate
> scalar COND_ ifn.
>
> If in the loop vectorize pass, COND_ could not get vectorized,
> the if-converted loop should be abandoned when the target doesn't support
> such ifn.
>
> As currently, COND_ is only used by aarch64 sve extension,
> I only run the aarch64-sve testsuites, no change to the result.
>
> Okay to commit?

OK.

Richard.

> Regards,
> Renlin
>
>
> gcc/ChangeLog:
>
> 2018-10-12  Renlin Li  
>
> PR target/87563
> * tree-vectorizer.c (try_vectorize_loop_1): Don't use
> if-conversioned loop when it contains ifn with types not
> supported by backend.
> * internal-fn.c (expand_direct_optab_fn): Add an assert.
> (direct_internal_fn_supported_p): New helper function.
> * internal-fn.h (direct_internal_fn_supported_p): Declare.
>
> gcc/testsuite/ChangeLog:
>
> 2018-10-12  Renlin Li  
>
> PR target/87563
> * gcc.target/aarch64/sve/pr87563.c: New.


[PATCH,FORTRAN] Fix memory leak in finalization wrappers

2018-10-15 Thread Bernhard Reutner-Fischer
If a finalization is not required we created a namespace containing
formal arguments for an internal interface definition but never used
any of these. So the whole sub_ns namespace was not wired up to the
program and consequently was never freed. The fix is to simply not
generate any finalization wrappers if we know that it will be unused.
Note that this reverts back to the original r190869
(8a96d64282ac534cb597f446f02ac5d0b13249cc) handling for this case
by reverting this specific part of r194075
(f1ee56b4be7cc3892e6ccc75d73033c129098e87) for PR fortran/37336.

Regtests cleanly, installed to the fortran-fe-stringpool branch, sent
here for reference and later inclusion.
I might plug a few more leaks in preparation of switching to hash-maps.
I fear that the leaks around interfaces are another candidate ;)

I should probably add a tag for the compile-time leak PR68800, shouldn't I?

valgrind summary for e.g.
gfortran.dg/abstract_type_3.f03 and gfortran.dg/abstract_type_4.f03
where ".orig" is pristine trunk and ".mine" contains this fix:

at3.orig.vg:LEAK SUMMARY:
at3.orig.vg-   definitely lost: 8,460 bytes in 11 blocks
at3.orig.vg-   indirectly lost: 13,288 bytes in 55 blocks
at3.orig.vg- possibly lost: 0 bytes in 0 blocks
at3.orig.vg-   still reachable: 572,278 bytes in 2,142 blocks
at3.orig.vg-suppressed: 0 bytes in 0 blocks
at3.orig.vg-
at3.orig.vg-Use --track-origins=yes to see where uninitialised values come from
at3.orig.vg-ERROR SUMMARY: 38 errors from 33 contexts (suppressed: 0 from 0)
--
at3.mine.vg:LEAK SUMMARY:
at3.mine.vg-   definitely lost: 344 bytes in 1 blocks
at3.mine.vg-   indirectly lost: 7,192 bytes in 18 blocks
at3.mine.vg- possibly lost: 0 bytes in 0 blocks
at3.mine.vg-   still reachable: 572,278 bytes in 2,142 blocks
at3.mine.vg-suppressed: 0 bytes in 0 blocks
at3.mine.vg-
at3.mine.vg-ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
at3.mine.vg-ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
at4.orig.vg:LEAK SUMMARY:
at4.orig.vg-   definitely lost: 13,751 bytes in 12 blocks
at4.orig.vg-   indirectly lost: 11,976 bytes in 60 blocks
at4.orig.vg- possibly lost: 0 bytes in 0 blocks
at4.orig.vg-   still reachable: 572,278 bytes in 2,142 blocks
at4.orig.vg-suppressed: 0 bytes in 0 blocks
at4.orig.vg-
at4.orig.vg-Use --track-origins=yes to see where uninitialised values come from
at4.orig.vg-ERROR SUMMARY: 18 errors from 16 contexts (suppressed: 0 from 0)
--
at4.mine.vg:LEAK SUMMARY:
at4.mine.vg-   definitely lost: 3,008 bytes in 3 blocks
at4.mine.vg-   indirectly lost: 4,056 bytes in 11 blocks
at4.mine.vg- possibly lost: 0 bytes in 0 blocks
at4.mine.vg-   still reachable: 572,278 bytes in 2,142 blocks
at4.mine.vg-suppressed: 0 bytes in 0 blocks
at4.mine.vg-
at4.mine.vg-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
at4.mine.vg-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

gcc/fortran/ChangeLog:

2018-10-12  Bernhard Reutner-Fischer  

* class.c (generate_finalization_wrapper): Do not leak finalization
wrappers if they will not be used.
* expr.c (gfc_free_actual_arglist): Formatting fix.
* gfortran.h (gfc_free_symbol): Pass argument by reference.
(gfc_release_symbol): Likewise.
(gfc_free_namespace): Likewise.
* symbol.c (gfc_release_symbol): Adjust accordingly.
(free_components): Set procedure pointer components
of derived types to NULL after freeing.
(free_tb_tree): Likewise.
(gfc_free_symbol): Set sym to NULL after freeing.
(gfc_free_namespace): Set namespace to NULL after freeing.
---
 gcc/fortran/class.c| 25 +
 gcc/fortran/expr.c |  2 +-
 gcc/fortran/gfortran.h |  6 +++---
 gcc/fortran/symbol.c   | 19 ++-
 4 files changed, 23 insertions(+), 29 deletions(-)

diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 69c95fc5dfa..e0bb381a55f 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -1533,7 +1533,6 @@ generate_finalization_wrapper (gfc_symbol *derived, gfc_namespace *ns,
   gfc_code *last_code, *block;
   const char *name;
   bool finalizable_comp = false;
-  bool expr_null_wrapper = false;
   gfc_expr *ancestor_wrapper = NULL, *rank;
   gfc_iterator *iter;
 
@@ -1561,13 +1560,17 @@ generate_finalization_wrapper (gfc_symbol *derived, gfc_namespace *ns,
 }
 
   /* No wrapper of the ancestor and no own FINAL subroutines and allocatable
- components: Return a NULL() expression; we defer this a bit to have have
+ components: Return a NULL() expression; we defer this a bit to have
  an interface declaration.  */
   if ((!ancestor_wrapper || ancestor_wrapper->expr_type == EXPR_NULL)
   && !derived->attr.alloc_comp
   && (!derived->f2k_derived || !derived->f2k_derived->finalizers)
   && !has_finalizer_component (derived))
-expr_null_wrapper = true;
+{
+  vtab_final->initializer = gfc_get_null_

Re: [ARM/FDPIC v3 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2018-10-15 Thread Christophe Lyon
On Fri, 12 Oct 2018 at 11:54, Richard Earnshaw (lists) <
richard.earns...@arm.com> wrote:

> On 11/10/18 14:34, Christophe Lyon wrote:
> > The new arm-uclinuxfdpiceabi target behaves pretty much like
> > arm-linux-gnueabi. In order the enable the same set of features, we
> > have to update several configure scripts that generally match targets
> > like *-*-linux*: in most places, we add *-uclinux* where there is
> > already *-linux*, or uclinux* when there is already linux*.
> >
> > In gcc/config.gcc and libgcc/config.host we use *-*-uclinuxfdpiceabi
> > because there is already a different behaviour for *-*uclinux* target.
> >
> > In libtool.m4, we use uclinuxfdpiceabi in cases where ELF shared
> > libraries support is required, as uclinux does not guarantee that.
> >
> > 2018-XX-XX  Christophe Lyon  
> >
> >   config/
> >   * futex.m4: Handle *-uclinux*.
> >   * tls.m4 (GCC_CHECK_TLS): Likewise.
> >
> >   gcc/
> >   * config.gcc: Handle *-*-uclinuxfdpiceabi.
> >
> >   libatomic/
> >   * configure.tgt: Handle arm*-*-uclinux*.
> >   * configure: Regenerate.
> >
> >   libgcc/
> >   * config.host: Handle *-*-uclinuxfdpiceabi.
> >
> >   libitm/
> >   * configure.tgt: Handle *-*-uclinux*.
> >   * configure: Regenerate.
> >
> >   libstdc++-v3/
> >   * acinclude.m4: Handle uclinux*.
> >   * configure: Regenerate.
> >   * configure.host: Handle uclinux*
> >
> >   * libtool.m4: Handle uclinux*.
>
> What testing have you done to ensure that these new uclinux* changes
> have not affected existing uclinux configurations?
>
> Also, do you really need to use uclinuxfdpiceabi (which is quite
> Arm-specific) everywhere, or would uclinuxfdpic* be better and ease work
> for other fdpic targets?
>
>
This patch became necessary when I was asked to change the target name
from  arm-linux-uclibceabi to arm-uclinuxfdpiceabi.
Changing it implied that many features were disabled and tests regressed
because the new target name didn't match the regexps in configure scripts.

I iterated over the regressions to see which features were now missing, and
I updated the configure scripts accordingly.

When the feature being tested was generic, I used the general *-*-uclinux*
form, because it seemed reasonable that it was OK for other targets.
When the test was arm-related, I used the stricter uclinuxfdpiceabi form,
and specifically not uclinuxfdpic* to avoid breaking other fdpic targets.
When in doubt, I preferred to stay on the safe side.

To answer your first question, I do not have the setup to test other uclinux
configs, I'm not even sure of the list.
So I tested arm-uclinuxfdpiceabi, made sure I got the same results as with
our previous target name, and that
the whole series didn't regress on arm-linux-gnueabi*.

I hope other uclinux/fdpic target maintainers comment on this patch if
something looks wrong to them.

Christophe

> R.
> >
> > Change-Id: I6a1fdcd9847d8a82179a214612a3474c1f492916
> >
> > diff --git a/config/futex.m4 b/config/futex.m4
> > index e95144d..4dffe15 100644
> > --- a/config/futex.m4
> > +++ b/config/futex.m4
> > @@ -9,7 +9,7 @@ AC_DEFUN([GCC_LINUX_FUTEX],[dnl
> >  GCC_ENABLE(linux-futex,default, ,[use the Linux futex system call],
> >  permit yes|no|default)
> >  case "$target" in
> > -  *-linux*)
> > +  *-linux* | *-uclinux*)
> >  case "$enable_linux_futex" in
> >default)
> >   # If headers don't have gettid/futex syscalls definition, then
> > diff --git a/config/tls.m4 b/config/tls.m4
> > index 4e170c8..5a8676e 100644
> > --- a/config/tls.m4
> > +++ b/config/tls.m4
> > @@ -76,7 +76,7 @@ AC_DEFUN([GCC_CHECK_TLS], [
> > dnl Shared library options may depend on the host; this check
> > dnl is only known to be needed for GNU/Linux.
> > case $host in
> > - *-*-linux*)
> > + *-*-linux* | *-*-uclinux*)
> > LDFLAGS="-shared -Wl,--no-undefined $LDFLAGS"
> > ;;
> > esac
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 0c579d1..793fc69 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -753,7 +753,7 @@ case ${target} in
> >  *-*-fuchsia*)
> >native_system_header_dir=/include
> >;;
> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-gnu* |
> *-*-kopensolaris*-gnu)
> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-gnu* |
> *-*-kopensolaris*-gnu | *-*-uclinuxfdpiceabi)
> >extra_options="$extra_options gnu-user.opt"
> >gas=yes
> >gnu_ld=yes
> > @@ -762,7 +762,7 @@ case ${target} in
> >esac
> >tmake_file="t-slibgcc"
> >case $target in
> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu |
> *-*-kopensolaris*-gnu)
> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu |
> *-*-kopensolaris*-gnu  | *-*-uclinuxfdpiceabi)
> >:;;
> >  *-*-gnu*)
> >native_system_header_dir=/include
> > @@ -782,7 +782,7 @@ case ${target} in
> >  *-*-*android*)
> >tm_defines="$tm_defines D

[Committed] S/390: Fix problem with vec_init expander

2018-10-15 Thread Andreas Krebbel
gcc/ChangeLog:

2018-10-15  Andreas Krebbel  

* config/s390/s390.c (s390_expand_vec_init): Force vector element
into reg if it isn't a general operand.

gcc/testsuite/ChangeLog:

2018-10-15  Andreas Krebbel  

* g++.dg/vec-init-1.C: New test.
---
 gcc/config/s390/s390.c| 11 ---
 gcc/testsuite/g++.dg/vec-init-1.C | 26 ++
 2 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vec-init-1.C

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 71039fe..ab22c2c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6627,11 +6627,16 @@ s390_expand_vec_init (rtx target, rtx vals)
   return;
 }
 
+  /* Use vector replicate instructions.  vlrep/vrepi/vrep  */
   if (all_same)
 {
-  emit_insn (gen_rtx_SET (target,
- gen_rtx_VEC_DUPLICATE (mode,
-XVECEXP (vals, 0, 0))));
+  rtx elem = XVECEXP (vals, 0, 0);
+
+  /* vec_splats accepts general_operand as source.  */
+  if (!general_operand (elem, GET_MODE (elem)))
+   elem = force_reg (inner_mode, elem);
+
+  emit_insn (gen_rtx_SET (target, gen_rtx_VEC_DUPLICATE (mode, elem)));
   return;
 }
 
diff --git a/gcc/testsuite/g++.dg/vec-init-1.C b/gcc/testsuite/g++.dg/vec-init-1.C
new file mode 100644
index 0000000..f35d39c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vec-init-1.C
@@ -0,0 +1,26 @@
+/* On S/390 this ends up calling the vec_init RTL expander with a
+   parallel of two symbol_refs.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -fPIC" } */
+
+
+struct test
+{
+struct base
+{
+   int key;
+};
+struct derived : public base
+{
+   int key;
+};
+
+derived core;
+derived &dRef;
+base &bRef;
+
+test() : dRef (core), bRef (core) {}
+};
+
+test test;
-- 
2.7.4



Re: [PATCH] Add option to control warnings added through attribute "warning"

2018-10-15 Thread Nikolai Merinov

Hi Martin,

On 10/12/18 9:58 PM, Martin Sebor wrote:

On 10/12/2018 04:14 AM, Nikolai Merinov wrote:

Hello,

In https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html mail I
suggested patch to have ability to control behavior of
"__attribute__((warning))" in case when option "-Werror" enabled. Usage
example:


#include 
int a() __attribute__((warning("Warning: `a' was used")));
int a() { return 1; }
int main () { return a(); }



$ gcc -Werror test.c
test.c: In function ‘main’:
test.c:4:22: error: call to ‘a’ declared with attribute warning:
Warning: `a' was used [-Werror]
 int main () { return a(); }
  ^
cc1: all warnings being treated as errors
$ gcc -Werror -Wno-error=warning-attribute test.c
test.c: In function ‘main’:
test.c:4:22: warning: call to ‘a’ declared with attribute warning:
Warning: `a' was used
 int main () { return a(); }
  ^

Can you provide any feedback on suggested changes?


It seems like a useful feature and in line with the philosophy
that distinct warnings should be controlled by their own options.

I would only suggest to consider changing the name to
-Wattribute-warning, because it applies specifically to that
attribute (as opposed to warnings about attributes in general).

There are many attributes in GCC and diagnosing problems that
are unique to each, under the same -Wattributes option, is
becoming too coarse and overly limiting.  To make it more
flexible, I expect new options will need to be introduced,
such as -Wattribute-alias (to control aspects of the alias
attribute and others related to it), or -Wattribute-const
(to control diagnostics about functions declared with
attribute const that violate the attribute's constraints).

An alternative might be to introduce a single -Wattribute=
 option where the  gives
the names of all the distinct attributes whose unique
diagnostics one might need to control.

Martin


Currently there are several styles already in use:

-Wattribute-alias, where the word "attribute" is used as a prefix for the
attribute name,
-Wsuggest-attribute=[pure|const|noreturn|format|malloc], where the attribute
name is passed as an argument,
-Wmissing-format-attribute, where the word "attribute" is used as a suffix,
-Wdeprecated-declarations, where the word "attribute" is not used at all, even
though this warning option was created specifically for the "deprecated"
attribute.

I changed the name to "-Wattribute-warning" as you suggested, but unifying the
style for all attribute-related warnings looks like a separate activity. Please
check the new patch in the attachments.

Updated changelog:

gcc/Changelog

2018-10-14  Nikolai Merinov 

* gcc/common.opt: Add -Wattribute-warning.
* gcc/doc/invoke.texi: Add documentation for -Wno-attribute-warning.
* gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
* gcc/expr.c (expand_expr_real_1): Add new attribute to warning_at
call to allow user configure behavior of "warning" attribute


[RFC PATCH] libgcc: apply LIB2FUNCS_EXCLUDE logic to LIB2FUNCS_ST

2018-10-15 Thread Rasmus Villemoes
One target file (config/c6x/t-elf) lists _eprintf and _gcc_bcmp in
LIB2FUNCS_EXCLUDE, but that does not have any effect, since those are
not filtered away from LIB2FUNCS_ST. Another option is to do as in
config/rl78/t-rl78, which explicitly sets LIB2FUNCS_ST

# Remove __gcc_bcmp from LIB2FUNCS_ST
LIB2FUNCS_ST = _eprintf

but honouring LIB2FUNCS_EXCLUDE also for LIB2FUNCS_ST seems more
natural.

==changelog==

libgcc/

* Makefile.in: Filter out LIB2FUNCS_EXCLUDE from LIB2FUNCS_ST.
---
AFAICT, this will only affect the c6x port, to do what I assume was
always intended, but I don't have a way of testing that. My only
motivation for this is that I have an out-of-tree VxWorks patch that is
more natural on top of this.

 libgcc/Makefile.in | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index 0766de58500..aeb96c475e2 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -472,6 +472,8 @@ lib2funcs := $(filter-out $(LIB2FUNCS_EXCLUDE) $(LIB1ASMFUNCS),$(lib2funcs))
 LIB2_DIVMOD_FUNCS := $(filter-out $(LIB2FUNCS_EXCLUDE) $(LIB1ASMFUNCS), \
   $(LIB2_DIVMOD_FUNCS))
 
+LIB2FUNCS_ST := $(filter-out $(LIB2FUNCS_EXCLUDE),$(LIB2FUNCS_ST))
+
 # Build "libgcc1" (assembly) components.
 
 lib1asmfuncs-o = $(patsubst %,%$(objext),$(LIB1ASMFUNCS))
-- 
2.19.1.6.g084f1d7761