[Bug gcov-profile/91601] [GCOV]gcov: internal compiler error: in handle_cycle, at gcov.c:699 happen which get code coverage with lcov.

2019-08-29 Thread ammy.yi at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91601

--- Comment #2 from ammy.yi  ---
gcc version 9.1.1 20190503 gcc-9-branch@270849 does not have this issue, but the
latest gcc version 9.2.1 20190816 gcc-9-branch@274554 does.

[Bug gcov-profile/91601] [GCOV]gcov: internal compiler error: in handle_cycle, at gcov.c:699 happen which get code coverage with lcov.

2019-08-29 Thread ammy.yi at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91601

--- Comment #1 from ammy.yi  ---
gcc version 9.2.1 20190816 gcc-9-branch@274554 (Clear Linux OS for Intel
Architecture)

[Bug gcov-profile/91601] New: [GCOV]gcov: internal compiler error: in handle_cycle, at gcov.c:699 happen which get code coverage with lcov.

2019-08-29 Thread ammy.yi at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91601

Bug ID: 91601
   Summary: [GCOV]gcov: internal compiler error: in handle_cycle,
at gcov.c:699 happen which get code coverage with
lcov.
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ammy.yi at intel dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Processing fs/namespace.gcda
Processing fs/fsopen.gcda
Processing fs/pipe.gcda
Processing fs/stack.gcda
Processing fs/exec.gcda
gcov: internal compiler error: in handle_cycle, at gcov.c:699
0x4037d0 handle_cycle
../../gcc-9.1.0/gcc/gcov.c:699
0x4037d0 circuit
../../gcc-9.1.0/gcc/gcov.c:765
0x45aa6c circuit
../../gcc-9.1.0/gcc/gcov.c:770
0x45aba1 get_cycles_count
../../gcc-9.1.0/gcc/gcov.c:817
0x45cd5f accumulate_line_info
../../gcc-9.1.0/gcc/gcov.c:2694
0x45cd5f accumulate_line_counts
../../gcc-9.1.0/gcc/gcov.c:2734
0x45cd5f generate_results
../../gcc-9.1.0/gcc/gcov.c:1446
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug d/91600] New: "Architecture not supported" reported for MinGW-W64

2019-08-29 Thread ray_linn at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91600

Bug ID: 91600
   Summary: "Architecture not supported" reported for MinGW-W64
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: d
  Assignee: ibuclaw at gdcproject dot org
  Reporter: ray_linn at hotmail dot com
  Target Milestone: ---

I am enabling D with its runtime on MinGW-W64 + Windows 10; the TargetTDM is
patched to enable version (Windows). I now notice that many files in the runtime
support neither version (Windows) nor version (MinGW), so the build finally
reports "Architecture not supported" and breaks.

Affected files include stdio.d, assert_.d and thread.d; more will be identified
as I continue.


I think this is a good chance to enable Dlang on MinGW-W64, and I hope to
work with you to check all the gating conditions.

Thanks a lot

[Bug tree-optimization/91457] FAIL: g++.dg/warn/Warray-bounds-4.C -std=gnu++98 (test for warnings, line 25)

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91457

--- Comment #9 from Martin Sebor  ---
The Glibc warning is being discussed on libc-alpha:
https://sourceware.org/ml/libc-alpha/2019-08/msg00774.html

[Bug middle-end/91599] GCC does not say where warning is happening

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91599

Martin Sebor  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #2 from Martin Sebor  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2019-08/msg02025.html

[PATCH] use fallback location for warning (PR 91599)

2019-08-29 Thread Martin Sebor

warning_at() calls with the %G directive rely on the gimple statement
for both their location and the inlining context.  When the statement
is not associated with a location, the warning doesn't point at any
line even if the location argument passed to the call was valid.
The attached patch changes the percent_G_percent handler to fall back
on the provided location in that case, and the recently added warning
for char assignments to pass to the function a fallback location if
the statement doesn't have one.
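
A minimal sketch of the fallback described above, using hypothetical stand-in
names (`loc_t`, `LOC_UNKNOWN`, `pick_location`) rather than GCC's real
`location_t` machinery:

```c
/* Hypothetical stand-ins for illustration; GCC itself uses location_t
   and UNKNOWN_LOCATION.  */
typedef unsigned int loc_t;
#define LOC_UNKNOWN 0u

/* Prefer the statement's own location.  If the statement has none,
   fall back on the location the caller passed to the warning call.  */
loc_t
pick_location (loc_t stmt_loc, loc_t fallback_loc)
{
  return stmt_loc != LOC_UNKNOWN ? stmt_loc : fallback_loc;
}
```

The actual patch applies the same idea in two places: in the %G handler and at
the call site in handle_store.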

Tested on x86_64-linux.

Martin
PR middle-end/91599 - GCC does not say where warning is happening

gcc/ChangeLog:

	PR middle-end/91599
	* tree-ssa-strlen.c (handle_store): Use a fallback location if
	the statement doesn't have one.
	* gimple-pretty-print.c (percent_G_format): Same.

gcc/testsuite/ChangeLog:

	PR middle-end/91599
	* gcc.dg/Wstringop-overflow-16.c: New test.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 275047)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -4036,7 +4036,12 @@ handle_store (gimple_stmt_iterator *gsi)
 	  if (tree dstsize = compute_objsize (lhs, 1, ))
 	if (compare_tree_int (dstsize, lenrange[2]) < 0)
 	  {
+		/* Fall back on the LHS location if the statement
+		   doesn't have one.  */
 		location_t loc = gimple_nonartificial_location (stmt);
+		if (loc == UNKNOWN_LOCATION)
+		  loc = tree_nonartificial_location (lhs);
+		loc = expansion_point_location_if_in_system_header (loc);
 		if (warning_n (loc, OPT_Wstringop_overflow_,
 			   lenrange[2],
 			   "%Gwriting %u byte into a region of size %E",
Index: gcc/gimple-pretty-print.c
===
--- gcc/gimple-pretty-print.c	(revision 275047)
+++ gcc/gimple-pretty-print.c	(working copy)
@@ -3034,8 +3034,12 @@ percent_G_format (text_info *text)
 {
   gimple *stmt = va_arg (*text->args_ptr, gimple*);
 
+  /* Fall back on the rich location if the statement doesn't have one.  */
+  location_t loc = gimple_location (stmt);
+  if (loc == UNKNOWN_LOCATION)
+    loc = text->m_richloc->get_loc ();
   tree block = gimple_block (stmt);
-  percent_K_format (text, gimple_location (stmt), block);
+  percent_K_format (text, loc, block);
 }
 
 #if __GNUC__ >= 10
Index: gcc/testsuite/gcc.dg/Wstringop-overflow-16.c
===
--- gcc/testsuite/gcc.dg/Wstringop-overflow-16.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/Wstringop-overflow-16.c	(working copy)
@@ -0,0 +1,21 @@
+/* PR middle-end/91599 - GCC does not say where warning is happening
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+struct charseq {
+  unsigned char bytes[0]; // { dg-message "object declared here" }
+};
+
+struct locale_ctype_t {
+  struct charseq *mboutdigits[10];
+};
+
+void ctype_finish (struct locale_ctype_t *ctype)
+{
+  long unsigned int cnt;
+  for (cnt = 0; cnt < 20; ++cnt) {
+    static struct charseq replace[2];
+    replace[0].bytes[1] = '\0';   // { dg-warning "\\\[-Wstringop-overflow" }
+    ctype->mboutdigits[cnt] = &replace[0];
+  }
+}


Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-29 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 8:10 AM Hongtao Liu  wrote:
>
> On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak  wrote:
> >
> > 2019-08-28  Uroš Bizjak  
> >
> > * config/i386/i386.c (ix86_register_move_cost): Do not
> > limit the cost of moves to/from XMM register to minimum 8.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Actually committed as r274994 with the wrong ChangeLog.
> >
> > Uros.
>
> There is an 11% regression in 548.exchange_r of SPEC2017.
>
> Reason for the regression:
> For 548.exchange_r, a lot of moves between GPR and XMM registers are
> generated, as expected, and this reduced clockticks by 3%.
> However, perhaps because too many XMM registers are used, a frequency
> reduction issue is triggered (average frequency reduced by 13%),
> so overall the benchmark takes more time.
>
>
>
> --
> BR,
> Hongtao

Tested on skylake workstation.

-- 
BR,
Hongtao


Re: [PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-29 Thread Hongtao Liu
On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak  wrote:
>
> 2019-08-28  Uroš Bizjak  
>
> * config/i386/i386.c (ix86_register_move_cost): Do not
> limit the cost of moves to/from XMM register to minimum 8.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Actually committed as r274994 with the wrong ChangeLog.
>
> Uros.

There is an 11% regression in 548.exchange_r of SPEC2017.

Reason for the regression:
For 548.exchange_r, a lot of moves between GPR and XMM registers are
generated, as expected, and this reduced clockticks by 3%.
However, perhaps because too many XMM registers are used, a frequency
reduction issue is triggered (average frequency reduced by 13%),
so overall the benchmark takes more time.
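
As a rough sanity check of those numbers, assuming run time scales as
clockticks divided by average frequency (a simplification; `relative_runtime`
is a hypothetical helper, not part of any benchmark harness):

```c
/* Relative run time after a change, given the ratio of clockticks and
   the ratio of average frequency (both expressed as new / old).
   Assumes time = clockticks / frequency.  */
double
relative_runtime (double tick_ratio, double freq_ratio)
{
  return tick_ratio / freq_ratio;
}
```

With 3% fewer clockticks and 13% lower frequency, relative_runtime (0.97, 0.87)
is about 1.115, which is in line with the observed regression of roughly 11%.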



-- 
BR,
Hongtao


[Bug middle-end/91599] GCC does not say where warning is happening

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91599

Martin Sebor  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
Version|unknown |10.0
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot gnu.org
   Target Milestone|--- |10.0

[Bug middle-end/91599] GCC does not say where warning is happening

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91599

Martin Sebor  changed:

   What|Removed |Added

   Keywords||diagnostic
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-08-29
 CC||msebor at gcc dot gnu.org
 Blocks||88443
 Ever confirmed|0   |1
   Severity|enhancement |normal

--- Comment #1 from Martin Sebor  ---
Confirmed.  gimple_location(stmt) returns zero for the statement but
EXPR_LOCATION(lhs) has the right location so it can be used as a fallback. 
Unfortunately, that alone isn't enough.  The %G directive in the
warning_at(loc, "%G...", stmt) call seems to insist on using the stmt location
in preference to loc even when it's invalid/unknown, and so the warning still
doesn't point in the right place.  I could hack around it at the call site but
a better fix is in the warning machinery.  (Ideally, of course, every statement
would have location and this wouldn't be a problem at all. But what fun would
that be?)


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88443
[Bug 88443] [meta-bug] bogus/missing -Wstringop-overflow warnings

Re: [PATCH] Couple of debug dump improvements to scheduler (no code-gen changes)

2019-08-29 Thread Jeff Law
On 8/29/19 9:44 AM, Maxim Kuvyrkov wrote:
> Hi,
> 
> The first patch adds ranking statistics for autoprefetcher heuristic.
> 
> The second one makes it easier to diff scheduler debug dumps by adding more
> context lines for diff at clock increments.
> 
> OK to commit?
OK for both.
jeff


Re: [PATCH, V3, #3 of 10], Add prefixed RTL insn attribute

2019-08-29 Thread Segher Boessenkool
Hi Mike,

On Mon, Aug 26, 2019 at 04:31:02PM -0400, Michael Meissner wrote:
>   (rs6000_asm_output_opcode): New function for prifixed memory.

Typo.  Just say "New." or "New function." please.

> --- gcc/config/rs6000/rs6000.c(revision 274871)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -13827,23 +13827,23 @@ addr_mask_to_trad_insn (machine_mode mod
>   early RTL stages before register allocation has been done.  */
>if ((addr_mask & flags) == RELOAD_REG_MULTIPLE)
>  {
> -  machine_mode inner = word_mode;
> +  machine_mode mode2 = mode;

So what is "mode2" for?  A meaningful name and/or some comments would help.

> +   if ((reg_addr[E_DFmode].default_addr_mask & RELOAD_REG_OFFSET) != 0)
> + mode = DFmode;

(Don't use E_ if you do not need it -- i.e. most of the time).

> +/* Helper function to take a REG and a MODE and turn it into the traditional
> +   instruction format (D/DS/DQ) used for offset memory.  */

Is this the form of the preferred insn to do this?  Or the minimum required
to do it at all?  Something else?

> +  /* If it isn't a register, use the defaults.  */
> +  if (!REG_P (reg) && !SUBREG_P (reg))
> +addr_mask = reg_addr[mode].default_addr_mask;
> +
> +  else
> +{
> +  unsigned int r = reg_or_subregno (reg);

This ICEs if it is a subreg of something else than a reg.

You can just start with

  if (SUBREG_P (reg))
reg = SUBREG_REG (reg);

  if (REG_P (reg))
   ... etc.
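
A self-contained sketch of that control flow, using toy stand-ins
(`toy_rtx`, `toy_regno`) that are hypothetical and only mimic the shape of
REG_P / SUBREG_P / SUBREG_REG:

```c
/* Toy stand-ins for GCC's rtx codes and accessors; hypothetical, for
   illustration only.  */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *inner;	/* Operand of a TOY_SUBREG.  */
  int regno;			/* Register number of a TOY_REG.  */
};

/* Return the register number behind X, or -1 if X is neither a register
   nor a subreg of a register.  */
int
toy_regno (const struct toy_rtx *x)
{
  /* Strip one SUBREG first, then test only for a plain register.  */
  if (x->code == TOY_SUBREG)
    x = x->inner;

  if (x->code == TOY_REG)
    return x->regno;

  return -1;
}
```

A subreg of something that is not a register then takes the default path
instead of reaching code that assumes a register.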

> +/* Whether a load instruction is a prefixed instruction.  This is called from
> +   the prefixed attribute processing.  */
> +
> +bool
> +prefixed_load_p (rtx_insn *insn)
> +{
> +  /* Validate the insn to make sure it is a normal load insn.  */
> +  extract_insn_cached (insn);
> +  if (recog_data.n_operands < 2)
> +return false;

Why don't you handle this the same way "indexed" and "update" are already
handled?  That is *easy* and it *works*, it trivially verifiably works.
It also doesn't care whether something is a load or a store.  You hardcode
the few exceptions (okay, twenty or whatever update insns -- but all are
similar, so that is easy), and everything else just works.

The way you code it you just hope to exclude all of the exceptions,
instead of handling them directly.

> +void
> +rs6000_asm_output_opcode (FILE *stream)
> +{
> +  if (next_insn_prefixed_p)
> +fputc ('p', stream);
> +
> +  return;
> +}

You can just write fprintf fwiw, the compiler can optimise it for you
just fine.

> +#define ASM_OUTPUT_OPCODE(STREAM, OPCODE)\
> +  do \
> +{ \
> + if (TARGET_PREFIXED_ADDR) \
> +   rs6000_asm_output_opcode (STREAM);\
> +} \
> +  while (0)

(Indentation of the "if" is weird?)

> +;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
> +;; before the instruction.  A prefixed instruction has a prefix instruction

Whether it is a prefixed insn, period.

> +;; word that extends the immediate value of the instructions from 12-16 bits to
> +;; 34 bits.  The macro ASM_OUTPUT_OPCODE emits a leading 'p' for prefixed
> +;; insns.  The default "length" attribute will also be adjusted by default to
> +;; be 12 bytes.

Don't say all the effects here, say that where you make it happen?

> +;; Length in bytes of instructions that use prefixed addressing and length in
> +;; bytes of instructions that does not use prefixed addressing.  This allows
> +;; both lengths to be defined as constants, and the length attribute can pick
> +;; the size as appropriate.
> +(define_attr "prefixed_length" "" (const_int 12))
> +(define_attr "non_prefixed_length" "" (const_int 4))

Do you mean a define_insn can override either to something else?  Then
say that, please?

> +;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
> +;; assembler might issue need to issue a NOP so that the prefixed instruction
> +;; does not cross a cache boundary, which makes them possibly 12 bytes.

s/issue //
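
To illustrate the 8-versus-12-byte case in that comment, here is a minimal
sketch. The 64-byte boundary and the 4-byte alignment of instruction addresses
are assumptions made for the example, and `prefixed_insn_bytes` is a
hypothetical helper, not a function from the patch:

```c
/* Assumed boundary that a prefixed instruction must not cross.  */
#define BOUNDARY_BYTES 64ul

/* Bytes occupied by an 8-byte prefixed instruction that would start at
   ADDR (assumed 4-byte aligned): 8 normally, or 12 when a 4-byte NOP
   has to be placed first so the instruction does not straddle the
   boundary.  */
unsigned int
prefixed_insn_bytes (unsigned long addr)
{
  unsigned long offset = addr % BOUNDARY_BYTES;
  return offset + 8 > BOUNDARY_BYTES ? 12u : 8u;
}
```

Since the compiler does not know final addresses, a length attribute has to
assume the larger value.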

> @@ -9883,8 +9926,8 @@ (define_insn "pcrel_local_addr"
>[(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
>   (match_operand:DI 1 "pcrel_local_address"))]
>"TARGET_PCREL"
> -  "pla %0,%a1"
> -  [(set_attr "length" "12")])
> +  "la %0,%a1"
> +  [(set_attr "prefixed" "yes")])

And just like this you can set the few insns that do not have operands 0
and 1 as source and dest to "no", exactly like is already done for "update"
and "indexed".


Segher


gcc-7-20190829 is now available

2019-08-29 Thread gccadmin
Snapshot gcc-7-20190829 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/7-20190829/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-7-branch 
revision 275055

You'll find:

 gcc-7-20190829.tar.xz    Complete GCC

  SHA256=3940927900cf61fefe41b00cc3786c900fc17e9dcdcd1b6087b58e21ec06f1bb
  SHA1=7a62088ab2dfb9dee6b8ffaea5ebb68b3abf013e

Diffs from 7-20190822 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-7
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[Bug middle-end/91584] [9 Regression] Bogus warning from -Warray-bounds during string assignment

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91584

Martin Sebor  changed:

   What|Removed |Added

   Target Milestone|--- |10.0
Summary|Bogus warning from  |[9 Regression] Bogus
   |-Warray-bounds during   |warning from -Warray-bounds
   |string assignment   |during string assignment
   Severity|enhancement |normal

[Bug middle-end/91584] Bogus warning from -Warray-bounds during string assignment

2019-08-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91584

Martin Sebor  changed:

   What|Removed |Added

   Keywords||patch
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |msebor at gcc dot gnu.org

--- Comment #3 from Martin Sebor  ---
Yes, the array domain checking wasn't correct for languages where the first
element of an array is at a nonzero index.  I posted the following patch:
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg02020.html

[PATCH] correct MEM_REF bounds checking of arrays (PR 91584)

2019-08-29 Thread Martin Sebor

The -Warray-bounds enhancement I added to GCC 9 causes false
positives in languages like Fortran whose first array element
is at a non-zero index.  The attached patch has the function
responsible for the warning normalize the array bounds to
always start at zero to avoid these false positives.
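
A minimal sketch of the normalization, with plain `long` values standing in
for GCC's `offset_int` and hypothetical names (`byte_range`,
`normalized_bounds`):

```c
/* Byte range [lo, hi] that an offset into the array may cover.  */
struct byte_range
{
  long lo;
  long hi;
};

/* Given the array domain [DOM_MIN, DOM_MAX] and the element size in
   bytes, return bounds that always start at zero, whatever the index
   of the first element is.  */
struct byte_range
normalized_bounds (long dom_min, long dom_max, long eltsize)
{
  struct byte_range r;
  r.lo = 0;
  r.hi = (dom_max - dom_min + 1) * eltsize;
  return r;
}
```

For a Fortran character(256) variable viewed as an array with domain 1..256
and one-byte elements, this yields 0..256 rather than the 1..257 range the
unnormalized computation would produce.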

Tested on x86_64-linux.

Martin
PR middle-end/91584 - Bogus warning from -Warray-bounds during string assignment

gcc/ChangeLog:

	PR middle-end/91584
	* tree-vrp.c (vrp_prop::check_mem_ref): Normalize type domain bounds
	before using them to validate MEM_REF offset.

gcc/testsuite/ChangeLog:
	* gfortran.dg/char_array_constructor_4.f90: New test.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c	(revision 275047)
+++ gcc/tree-vrp.c	(working copy)
@@ -4703,31 +4703,23 @@ vrp_prop::check_mem_ref (location_t location, tree
   || RECORD_OR_UNION_TYPE_P (reftype))
 return false;
 
+  arrbounds[0] = 0;
+
   offset_int eltsize;
   if (TREE_CODE (reftype) == ARRAY_TYPE)
 {
   eltsize = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (reftype)));
-
   if (tree dom = TYPE_DOMAIN (reftype))
 	{
 	  tree bnds[] = { TYPE_MIN_VALUE (dom), TYPE_MAX_VALUE (dom) };
-	  if (array_at_struct_end_p (arg)
-	  || !bnds[0] || !bnds[1])
-	{
-	  arrbounds[0] = 0;
-	  arrbounds[1] = wi::lrshift (maxobjsize, wi::floor_log2 (eltsize));
-	}
+	  if (array_at_struct_end_p (arg) || !bnds[0] || !bnds[1])
+	arrbounds[1] = wi::lrshift (maxobjsize, wi::floor_log2 (eltsize));
 	  else
-	{
-	  arrbounds[0] = wi::to_offset (bnds[0]) * eltsize;
-	  arrbounds[1] = (wi::to_offset (bnds[1]) + 1) * eltsize;
-	}
+	arrbounds[1] = (wi::to_offset (bnds[1]) - wi::to_offset (bnds[0])
+			+ 1) * eltsize;
 	}
   else
-	{
-	  arrbounds[0] = 0;
-	  arrbounds[1] = wi::lrshift (maxobjsize, wi::floor_log2 (eltsize));
-	}
+	arrbounds[1] = wi::lrshift (maxobjsize, wi::floor_log2 (eltsize));
 
   if (TREE_CODE (ref) == MEM_REF)
 	{
@@ -4742,7 +4734,6 @@ vrp_prop::check_mem_ref (location_t location, tree
   else
 {
   eltsize = 1;
-  arrbounds[0] = 0;
   arrbounds[1] = wi::to_offset (TYPE_SIZE_UNIT (reftype));
 }
 
Index: gcc/testsuite/gfortran.dg/char_array_constructor_4.f90
===
--- gcc/testsuite/gfortran.dg/char_array_constructor_4.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/char_array_constructor_4.f90	(working copy)
@@ -0,0 +1,13 @@
+! PR 30319 - Bogus warning from -Warray-bounds during string assignment
+! { dg-do compile }
+! { dg-options "-O2 -Warray-bounds" }
+
+program test_bounds
+
+  character(256) :: foo
+
+  foo = '1234' ! { dg-bogus "\\\[-Warray-bounds" }
+
+  print *, foo
+
+end program test_bounds


[Bug middle-end/91599] New: GCC does not say where warning is happening

2019-08-29 Thread sje at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91599

Bug ID: 91599
   Summary: GCC does not say where warning is happening
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sje at gcc dot gnu.org
  Target Milestone: ---

When compiling the following source file, GCC gives a warning.  The warning
notes that the declaration is on line 2 but it does not say what line the
actual write is on (line 12).  This message started showing up with Martin
Sebor's patch for PR c++/83431 though I don't know if he added it or if he just
made it show up in places where it wasn't happening before.

% cat x.c
struct charseq {
   unsigned char bytes[0];
};
struct locale_ctype_t {
   struct charseq *mboutdigits[10];
};
void ctype_finish (struct locale_ctype_t *ctype)
{
   long unsigned int cnt;
   for (cnt = 0; cnt < 20; ++cnt) {
      static struct charseq replace[2];
      replace[0].bytes[1] = '\0';
      ctype->mboutdigits[cnt] = &replace[0];
   }
}


% install/bin/gcc -O2 -c x.c
x.c: In function ‘ctype_finish’:
cc1: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
x.c:2:18: note: destination object declared here
2 |unsigned char bytes[0];
  |


It would be nice if the warning said the write was on line 12 as well as saying
that the declaration is on line 2.  This test case is cutdown from code in
glibc where the code doing the write was less easy to find.

[Bug fortran/91556] Problems with better interface checking

2019-08-29 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

--- Comment #18 from Thomas Koenig  ---
(In reply to anlauf from comment #14)
> The current solution is a bit annoying for implicitly-derived interfaces.
> 
> Consider a code like:
> 
> module foo
>   implicit none
>   type t1
>  integer :: i = 1
>   end type t1
>   type t2
>  integer :: j = 2
>   end type t2
> contains
>   subroutine s1 (x)
> type(t1) :: x
> call my_mpi_bcast_wrapper (x, storage_size (x)/8)
>   end subroutine s1
>   subroutine s2 (y)
> type(t2) :: y
> call my_mpi_bcast_wrapper (y, storage_size (y)/8)
>   end subroutine s2
> end module foo
> 
> That's perfectly legal,

This is illegal, as far as I know. The type names are different,
which makes them different types.

To quote 7.5.2.4  Determination of derived types

Two data entities have the same type if they are declared with reference to the
same derived-type definition. Data entities also have the same type if they are
declared with reference to different derived-type definitions that specify the
same type name, all have the SEQUENCE attribute or all have the BIND attribute,
have no components with PRIVATE accessibility, and have components that agree in
order, name, and attributes. Otherwise, they are of different derived types.[...]

Re: [PATCH V3, #1 of 10], Add basic pc-relative support

2019-08-29 Thread Segher Boessenkool
On Wed, Aug 28, 2019 at 05:26:55PM -0400, Michael Meissner wrote:
> On Wed, Aug 28, 2019 at 12:14:58PM -0500, Segher Boessenkool wrote:
> > > +/* Enumeration giving the type of traditional addressing that would be used to
> > > +   decide whether an instruction uses prefixed memory or not.  If the
> > > +   traditional instruction uses the DS instruction format, and the bottom 2
> > > +   bits of the offset are not 0, the traditional instruction cannot be used,
> > > +   but a prefixed instruction can be used.  */
> > 
> > "Traditional" is a bad word for documentation.  What you mean is what was
> > supported before.  Before you know it "new" will be old as well.
> 
> Yeah, yeah, yeah.  I recall in Amsterdam there is the "Oude Kerk" (old church)
> built in the 1200's and the "De Nieuwe Kerk" in Amsterdam (built in the 1500's)
> and thinking then of the problems of calling something "new" and "old".

:-)
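
For reference, the DS-form restriction described in the comment quoted at the
top of this message can be sketched as below. The helper names are
hypothetical, and the signed 16-bit and 34-bit ranges are the usual
displacement widths, stated here as assumptions of the sketch:

```c
#include <stdbool.h>

/* Signed 16-bit displacement whose bottom 2 bits are zero (DS form).  */
bool
ds_form_offset_ok (long long offset)
{
  return offset >= -32768 && offset <= 32767 && (offset & 3) == 0;
}

/* Signed 34-bit displacement, as used by a prefixed instruction.  */
bool
prefixed_offset_ok (long long offset)
{
  return offset >= -(1LL << 33) && offset < (1LL << 33);
}
```

An offset such as 6 fails the first check but passes the second, which is the
case where only the prefixed instruction can be used.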

> > Can you fix this struct / arrays / whatever, instead of adding more to it?

> > And these "address masks" are bitmaps of random flags, one for each
> > "register class" (which is not related to the core GCC concept of "register
> > class", and the bits are called "RELOAD_REG_*" although this isn't for
> > reload at all?
> 
> Actually no, they were created explicitly for the secondary reload handler when
> I wrote this interface to add VSX support.

This is not just for reload anymore, so please don't name it that.  Renaming
things isn't hard, this isn't a public API or anything :-)

> > > +   if ((addr_mask & quad_flags) == RELOAD_REG_OFFSET
> > > +   && ((rc == RELOAD_REG_GPR && msize >= 8 && TARGET_POWERPC64)
> > > +   || (rc == RELOAD_REG_VMX)))
> > > + addr_mask |= RELOAD_REG_DS_OFFSET;
> > > +
> > > reg_addr[m].addr_mask[rc] = addr_mask;
> > > -   any_addr_mask |= addr_mask;
> > > +   any_addr_mask |= (addr_mask & ~RELOAD_REG_AND_M16);
> > 
> > Why do you need this last line?  Why was that flag set at all?  What does
> > "any mask" mean if it is not?
> 
> The flag is set to say this register class allows the funky (reg + reg) & -16
> addressing used with the original Altivec instructions.

No, I understand that, but why was it set in some individual mask if you
need to clean it in the "any" mask?

> > > @@ -10770,11 +10855,10 @@ rs6000_secondary_reload_memory (rtx addr
> > >& ~RELOAD_REG_AND_M16);
> > >  
> > >/* If the register allocator hasn't made up its mind yet on the register
> > > - class to use, settle on defaults to use.  */
> > > + class to use, use the default address mask bits.  */
> > >else if (rclass == NO_REGS)
> > 
> > And this *does* mean register class.
> 
> No, in the context of the code, it means reload register class.

rclass is a register class.  NO_REGS is a register class.  "rc" isn't.

> The whole
> point is to reduce all of the normal register classes just to the 3 hardware
> register types.

Yes, so don't call it register class.  Don't use the same word for two
different things, esp. when one is used all over the place already.

> > I think this would all be much simpler with just a few lines of code instead
> > of all these tables, fwiw.

That's the core of most of this.  All this precomputation is indirection
that makes things really hard to understand.

And a lot of the more problematic code is the *older* code.  If you improve
that first -- *first*, that is what the earlier patches in a series are
for -- then this all will be much much easier to read and understand and
review and comment on and accept.

> > > +;; Load up a pc-relative address.  Print_operand_address will append a @pcrel
> > > +;; to the symbol or label.
> > > +(define_insn "pcrel_local_addr"
> > 
> > This isn't used anywhere?  Not by name, that is?
> 
> Yes it is used in rs6000_emit_move.

Not in this patch though?

> > > +  [(set (match_operand:DI 0 "gpc_reg_operand" "=b*r")
> > > + (match_operand:DI 1 "pcrel_local_address"))]
> > > +  "TARGET_PCREL"
> > > +  "pla %0,%a1"
> > > +  [(set_attr "length" "12")])
> > 
> > I wonder if that whole "b*r" thing is useful at all these days, btw.
> 
> Yep.

You mean it is useful?  Or you question it too?

> > This patch changes a whole bunch of things.  You probably can split it
> > into smaller, self-contained pieces.
> 
> Not really, but I will try.  However, then of course you have the issue that a
> particular patch creates a function that isn't used for a few patches, and you
> have to look at several patches all at once.

No, not if you divide things properly.  You *never* need to introduce more
than one thing at once, if they all are unused!

Multiple concepts in one patch is a LOT of work to review.  It is MUCH
less work to review 50 focused patches than to review just 5 doing the
same, even if those 50 make up twice as many lines of patch total.


Segher


Re: [PATCH] Sanitizing the middle-end interface to the back-end for strict alignment

2019-08-29 Thread Bernd Edlinger
On 8/29/19 11:08 AM, Christophe Lyon wrote:
> On Thu, 29 Aug 2019 at 10:58, Kyrill Tkachov
>  wrote:
>>
>> Hi Bernd,
>>
>> On 8/28/19 10:36 PM, Bernd Edlinger wrote:
>>> On 8/28/19 2:07 PM, Christophe Lyon wrote:
 Hi,

 This patch causes an ICE when building libgcc's unwind-arm.o
 when configuring GCC:
 --target  arm-none-linux-gnueabihf --with-mode thumb --with-cpu
 cortex-a15 --with-fpu neon-vfpv4:

 The build works for the same target, but --with-mode arm --with-cpu
 cortex a9 --with-fpu vfp

 In file included from
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/config/arm/unwind-arm.c:144:
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
 In function 'get_eit_entry':
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:245:29:
 warning: cast discards 'const' qualifier from pointer target type
 [-Wcast-qual]
245 |   ucbp->pr_cache.ehtp = (_Unwind_EHT_Header *)>content;
| ^
 during RTL pass: expand
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:
 In function 'unwind_phase2_forced':
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/unwind-arm-common.inc:319:18:
 internal compiler error: in gen_movdi, at config/arm/arm.md:5235
319 |   saved_vrs.core = entry_vrs->core;
|   ~~~^
 0x126530f gen_movdi(rtx_def*, rtx_def*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:5235
 0x896d92 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.h:318
 0x896d92 emit_move_insn_1(rtx_def*, rtx_def*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3694
 0x897083 emit_move_insn(rtx_def*, rtx_def*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:3790
 0xfc25d6 gen_cpymem_ldrd_strd(rtx_def**)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:14582
 0x126a1f1 gen_cpymemqi(rtx_def*, rtx_def*, rtx_def*, rtx_def*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.md:6688
 0xb0bc08 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/optabs.c:7440
 0x89ba1e emit_block_move_via_cpymem
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1808
 0x89ba1e emit_block_move_hints(rtx_def*, rtx_def*, rtx_def*,
 block_op_methods, unsigned int, long, unsigned long, unsigned long,
 unsigned long, bool, bool*)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1627
 0x89c383 emit_block_move(rtx_def*, rtx_def*, rtx_def*, block_op_methods)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1667
 0x89fb4e store_expr(tree_node*, rtx_def*, int, bool, bool)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5845
 0x88c1f9 store_field
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:7149
 0x8a0c22 expand_assignment(tree_node*, tree_node*, bool)
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5304
 0x761964 expand_gimple_stmt_1
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3779
 0x761964 expand_gimple_stmt
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3875
 0x768583 expand_gimple_basic_block
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5915
 0x76abc6 execute
  
 /tmp/6852788_4.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6538

 Christophe

>>> Okay, sorry for the breakage.
>>>
>>> What is happening in gen_cpymem_ldrd_strd is of course against the rules:
>>>
>>> It uses emit_move_insn on only 4-byte aligned DI-mode memory operands.
>>>
>>> I have a patch for this, which is able to fix the libgcc build on a cross, but have no
>>> possibility to bootstrap the affected target.
>>>
>>> Could you please help?
>>
>> Well it's good that the sanitisation is catching the bugs!
>>

Yes, more than expected, though ;)

>> Bootstrapping this patch I get another assert with the backtrace:
> 
> Thanks for the additional testing, Kyrill!
> 
> FWIW, my original report was with a failure to just build GCC for
> cortex-a15. I later got the reports of testing cross-toolchains, and
> saw other problems on cortex-a9 for instance.
> But I guess, you have noticed them with your 

Re: C++ PATCH for P1152R4: Deprecating some uses of volatile (PR c++/91361)

2019-08-29 Thread Martin Sebor

On 8/28/19 5:56 PM, Marek Polacek wrote:

--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -3516,6 +3516,19 @@ result in a call to @code{terminate}.
  Disable the warning about the case when a conversion function converts an
  object to the same type, to a base class of that type, or to void; such
  a conversion function will never be called.
+
+@item -Wvolatile @r{(C++ and Objective-C++ only)}
+@opindex Wvolatile
+@opindex Wno-volatile
+Warn about deprecated uses of the volatile qualifier.  This includes postfix
+and prefix @code{++} and @code{--} expressions of volatile-qualified types,
+using simple assignments where the left operand is a volatile-qualified
+non-class type for their value, compound assignments where the left operand
+is a volatile-qualified non-class type, volatile-qualified function return
+type, volatile-qualified parameter type, and structured bindings of a
+volatile-qualified type.  This usage was deprecated in C++20.


Just a minor thing: Since the text uses volatile as a keyword
(as opposed to an adjective) it should be @code{volatile},
analogously how it's quoted in warnings.

Martin
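
For readers who have not followed P1152R4, the sketch below lists the kinds of expression that the quoted documentation text describes. The deprecated forms are left in comments, and a non-deprecated spelling is shown as code. The names `counter`, `bump`, and `add_to` are made up for illustration and do not come from the patch.

```cpp
#include <cassert>

// Hypothetical shared object; stands in for a device register or flag.
volatile int counter = 0;

// Forms the quoted text lists as deprecated in C++20:
//   ++counter;               // prefix/postfix ++ and -- on a volatile object
//   counter += 2;            // compound assignment to a volatile object
//   int x = (counter = 5);   // using the value of a simple volatile assignment
//   volatile int f();        // volatile-qualified return type
//   void g(volatile int);    // volatile-qualified parameter type

// Non-deprecated spelling: one explicit read, then one explicit write.
inline int bump()
{
  int tmp = counter;   // single volatile read
  counter = tmp + 1;   // single volatile write; its value is not used
  return tmp + 1;
}

inline int add_to(int n)
{
  int tmp = counter;
  counter = tmp + n;
  return tmp + n;
}
```

Splitting the read and the write makes the number of volatile accesses visible in the source, which is the ambiguity the deprecation is meant to remove.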


[Bug c++/91592] `__is_assignable` fails for private assignment operators in certain contexts

2019-08-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91592

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-08-29
                 CC|                            |mpolacek at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Marek Polacek  ---
The problem is that for the // FAIL case we never enforce the access of
operator= in B.

Re: Go patch committed: Provide index information on bounds check failure

2019-08-29 Thread Andreas Schwab
On Aug 28 2019, Ian Lance Taylor  wrote:

> This patch to the Go frontend and libgo changes the panic message
> reported for an out of bounds index or slice operation to include the
> invalid values.

This breaks aarch64/-mabi=ilp32.

aarch64-suse-linux/ilp32/libgo/archive/tar/check-testlog:

/usr/aarch64-suse-linux/bin/ld: _gotest_.o: in function `archive..z2ftar.Reader.next':
/opt/gcc/gcc-20190829/Build/aarch64-suse-linux/ilp32/libgo/gotest1086/test/reader.go:72: undefined reference to `runtime.goPanicExtendSliceAlen'

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[Bug tree-optimization/91568] internal compiler error: in vect_schedule_slp_instance, at tree-vect-slp.c:3922

2019-08-29 Thread wala1 at illinois dot edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91568

--- Comment #11 from Matt Wala  ---
Thanks for fixing this!

Re: [PATCH] Adding _Dependent_ptr type qualifier in C part 1/3

2019-08-29 Thread Joseph Myers
On Fri, 30 Aug 2019, Akshat Garg wrote:

> > The first question for any new thing that is syntactically a qualifier is:
> > is it intended generally to be counted as a qualifier where the standard
> > refers to qualified type, the unqualified version of a type, etc.?  Or is
> > it, like _Atomic, a qualifier only syntactically and generally excluded
> > from references to qualifiers?
> >
> Can you help me understand why _Atomic is excluded from the standard's
> references to qualifiers? I referred to the C standard draft N2310 (
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf) but I couldn't
> understand how it is excluded. What properties must a qualifier have to be
> in the standard's list of qualifiers?

In the case of _Atomic, it can affect the size and alignment of the type 
to which it is applied, which means it can't be considered a qualifier in 
the normal semantic sense.

In general you need to consider questions such as: is it safe to convert a 
pointer to unqualified type to a pointer to the corresponding 
_Dependent_ptr-qualified type?  Are conditional expressions between 
pointers whose target types differ in presence or absence of 
_Dependent_ptr safe?  If in general, in such places where the standard 
refers to qualifiers, the existing logic there is also appropriate for 
_Dependent_ptr, that suggests it should be a qualifier semantically.  If 
the standard logic often seems inappropriate for _Dependent_ptr, that 
indicates it's not a qualifier in standard terms.
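
The point about size and alignment can be observed from code. The sketch below uses C++ `std::atomic<T>` as a stand-in for C's `_Atomic T` (an assumption made so the example is self-contained; the two are not the same feature) and reports the layout of a type next to the layout of its atomic counterpart. `Rgb`, `Layout`, `layout_of`, and `atomic_changes_layout` are hypothetical names. The actual numbers are implementation-defined, so none are claimed here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical 3-byte payload; small odd-sized types are the ones an
// implementation is most likely to pad or over-align in atomic form.
struct Rgb { unsigned char r, g, b; };

struct Layout {
  std::size_t plain_size, plain_align;
  std::size_t atomic_size, atomic_align;
};

// Collect size and alignment of T and of its atomic counterpart.
template <typename T>
constexpr Layout layout_of()
{
  return { sizeof(T), alignof(T),
           sizeof(std::atomic<T>), alignof(std::atomic<T>) };
}

// True when the atomic form does not simply reuse the plain type's layout.
template <typename T>
constexpr bool atomic_changes_layout()
{
  constexpr Layout l = layout_of<T>();
  return l.atomic_size != l.plain_size || l.atomic_align != l.plain_align;
}
```

A const or volatile qualifier never changes `sizeof` or `alignof`, which is why a construct that can change them does not fit the usual notion of a qualifier.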

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Adding _Dependent_ptr type qualifier in C part 1/3

2019-08-29 Thread Akshat Garg
Hi Joseph,
Many thanks for giving us your feedback.

On Tue, Aug 20, 2019 at 3:57 AM Joseph Myers 
wrote:

> On Tue, 30 Jul 2019, Martin Sebor wrote:
>
> > On 7/30/19 1:13 AM, Akshat Garg wrote:
> > > Hi,
> > > This patch includes C front-end code for a type qualifier
> > > _Dependent_ptr.
> >
> > Just some very high-level comments/questions.  I only followed
> > the _Dependent_ptr discussion from a distance and I'm likely
> > missing some context so the first thing I looked for in this
> > patch is documentation of the new qualifier.  Unless it's
>
> The first question for any new thing that is syntactically a qualifier is:
> is it intended generally to be counted as a qualifier where the standard
> refers to qualified type, the unqualified version of a type, etc.?  Or is
> it, like _Atomic, a qualifier only syntactically and generally excluded
> from references to qualifiers?
>
Can you help me understand why _Atomic is excluded from the standard's
references to qualifiers? I referred to the C standard draft N2310 (
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf) but I couldn't
understand how it is excluded. What properties must a qualifier have to be
in the standard's list of qualifiers?

Thanks,
Akshat

>
> For the _Atomic implementation I had to go through all the references to
> qualifiers or TYPE_MAIN_VARIANT in the front end and consider in each case
> whether it handled _Atomic correctly, given that _Atomic is not counted as
> a qualifier in the standard (so the unqualified version of const _Atomic
> int is _Atomic int not int, and so can't be derived simply by using
> TYPE_MAIN_VARIANT, for example).  Some cases didn't need changing because
> the handling (e.g. diagnostic for different types) was still appropriate
> for _Atomic even though not formally a qualifier, but plenty did need
> changing and associated tests added.
>
> Such a check of front end code is probably unavoidable (before a change is
> ready for trunk, not necessarily for an initial rough RFC patch) for any
> new qualifier, whether it counts as a qualifier in standard terms or not
> (and the patch reviewer will need to do their own check of references to
> qualifiers or TYPE_MAIN_VARIANT that didn't get changed by the patch), but
> the answer to that question helps indicate whether the default is to
> expect code to need changing for the new qualifier or not.
>
> > you point to it?  (In that case, or if a proposal is planned,
> > the feature should probably either only be available with
> > -std=c2x and -std=gnu2x or a pedantic warning should be issued
>
> There should not be any -std=c2x (flag_isoc2x) conditionals simply based
> on "a proposal is planned".  flag_isoc2x conditionals (pedwarn_c11 calls,
> etc.) should be for cases where a feature is *accepted and committed into
> the C2x branch of Jens's git repository for the C standard*, not for
> something that might be proposed, or is proposed, but doesn't yet have
> specific text integrated into the text of the standard.
>
> If something is simply proposed *and we've concluded it's a good feature
> to have as an extension in any case* then you have a normal
> pedwarn-if-pedantic (no condition on standard version) as for any GNU
> extension (and flag_isoc2x conditions / changes to use pedwarn_c11 instead
> can be added later if the extension is added to the standard).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
>


[Bug testsuite/27221] g++.dg/ext/alignof2.C fails on powerpc-darwin (and powerpc-aix)

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27221

--- Comment #8 from Iain Sandoe  ---
fixed for 8.4

[Bug testsuite/67958] The tests changed by r223498 now FAILs on darwin

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67958

--- Comment #10 from Iain Sandoe  ---
fixed for 8.4

[Bug bootstrap/87030] GCC fails to build with Xcode 10, attempting an impossible multilib build

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87030

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.4                         |7.5

--- Comment #25 from Iain Sandoe  ---
fixed for 8.4 - the problem exists on earlier branches too, so will backport
for 7.5 if time permits.

Re: Go patch committed: Provide index information on bounds check failure

2019-08-29 Thread Rainer Orth
Hi Ian,

> This patch to the Go frontend and libgo changes the panic message
> reported for an out of bounds index or slice operation to include the
> invalid values.  This makes it easier for the user to see what the
> problem is.  This implements https://golang.org/cl/161477 in the
> gofrontend, for https://golang.org/issue/30116.  Bootstrapped and ran
> Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
>
> Unfortunately, GMail has once again blocked the patch attachment.  So
> if you want to see the patch, see
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=274998 .

this patch broke sparc-sun-solaris2.11 bootstrap: in gotools I get
several link failures like this:

Undefined                           first referenced
 symbol                                 in file
runtime.goPanicExtendSliceAcap      ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendSliceAlen      ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendIndex          ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendIndexU         ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendSliceB         ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendSliceAcapU     ../sparc-sun-solaris2.11/libgo/.libs/libgo.so
runtime.goPanicExtendSliceBU        ../sparc-sun-solaris2.11/libgo/libgotool.a(buildid.o)
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:829: buildid] Error 1

The attached patch fixes this and allows the links to succeed; tests
still to be run.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/libgo/go/runtime/panic32.go b/libgo/go/runtime/panic32.go
--- a/libgo/go/runtime/panic32.go
+++ b/libgo/go/runtime/panic32.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build 386 amd64p32 arm mips mipsle m68k nios2 sh shbe
+// +build 386 amd64p32 arm mips mipsle m68k nios2 sh shbe sparc
 
 package runtime
 


[PATCH] Fix unused malloc return value warning

2019-08-29 Thread François Dumont

Hi

    I am getting this warning:

/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/util/testsuite_performance.h:170:
warning: ignoring return value of 'void* malloc(size_t)' declared
with attribute 'warn_unused_result' [-Wunused-result]

  170 |   malloc(0); // Needed for some implementations.

    OK to fix it with the attached patch?

    It seems trivial, but I wonder whether I should instead keep the pointer
returned by malloc and free it properly.

    Or maybe just remove the malloc, because there is no clear comment
explaining why it's needed and I haven't found much in the SVN audit trail.


    * testsuite/util/testsuite_performance.h
    (resource_counter::start): Ignore unused malloc(0) result.

François

diff --git a/libstdc++-v3/testsuite/util/testsuite_performance.h b/libstdc++-v3/testsuite/util/testsuite_performance.h
index 556c78159be..8abc77cf31a 100644
--- a/libstdc++-v3/testsuite/util/testsuite_performance.h
+++ b/libstdc++-v3/testsuite/util/testsuite_performance.h
@@ -167,7 +167,7 @@ namespace __gnu_test
 {
   if (getrusage(who, &rusage_begin) != 0 )
 	memset(&rusage_begin, 0, sizeof(rusage_begin));
-  malloc(0); // Needed for some implementations.
+  void* p __attribute__((unused)) = malloc(0); // Needed for some implementations.
   allocation_begin = mallinfo();
 }
 



[Bug testsuite/27221] g++.dg/ext/alignof2.C fails on powerpc-darwin (and powerpc-aix)

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27221

--- Comment #7 from Iain Sandoe  ---
Author: iains
Date: Thu Aug 29 19:36:50 2019
New Revision: 275054

URL: https://gcc.gnu.org/viewcvs?rev=275054&root=gcc&view=rev
Log:
[Darwin, testsuite] Backport fix for PR27221.

2019-08-29  Iain Sandoe  

Backport from mainline.
2019-05-22  Iain Sandoe  

PR testsuite/27221
* g++.dg/ext/alignof2.C: XFAIL for 32bit Darwin.


Modified:
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/alignof2.C

[Bug fortran/91556] Problems with better interface checking

2019-08-29 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

--- Comment #17 from Steve Kargl  ---
On Thu, Aug 29, 2019 at 07:18:01PM +, anlauf at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556
> 
> --- Comment #16 from anlauf at gcc dot gnu.org ---
> (In reply to Steve Kargl from comment #15)
> > On Thu, Aug 29, 2019 at 06:49:15PM +, anlauf at gcc dot gnu.org wrote:
> > > 
> > > That's perfectly legal, but gets rejected unless -fallow-argument-mismatch
> > > is specified.  But then I still get a warning (or many if this appears in
> > > a large module).
> > 
> > You can get rid of the warning with -w.
> 
> That would get rid of all warnings.
> 

Yes, that is a down side to writing code with implicit
interfaces.  There is no excuse for not using proper
explicit interfaces in modern Fortran.

[Bug testsuite/67958] The tests changed by r223498 now FAILs on darwin

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67958

--- Comment #9 from Iain Sandoe  ---
Author: iains
Date: Thu Aug 29 19:32:25 2019
New Revision: 275053

URL: https://gcc.gnu.org/viewcvs?rev=275053&root=gcc&view=rev
Log:
[Darwin, testsuite] Backport fix for PR67958.

2019-08-29  Iain Sandoe  

Backport from mainline.
2019-05-21  Iain Sandoe  

PR testsuite/67958
* gcc.target/i386/pr32219-1.c: Adjust scan-asms for Darwin, comment
the differences.
* gcc.target/i386/pr32219-2.c: Likewise.
* gcc.target/i386/pr32219-3.c: Likewise.
* gcc.target/i386/pr32219-4.c: Likewise.
* gcc.target/i386/pr32219-5.c: Likewise.
* gcc.target/i386/pr32219-6.c: Likewise.
* gcc.target/i386/pr32219-7.c: Likewise.
* gcc.target/i386/pr32219-8.c: Likewise.


Modified:
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-1.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-2.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-3.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-4.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-5.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-6.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-7.c
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr32219-8.c

[Bug bootstrap/87030] GCC fails to build with Xcode 10, attempting an impossible multilib build

2019-08-29 Thread iains at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87030

--- Comment #24 from Iain Sandoe  ---
Author: iains
Date: Thu Aug 29 19:26:45 2019
New Revision: 275052

URL: https://gcc.gnu.org/viewcvs?rev=275052&root=gcc&view=rev
Log:
[Darwin] Fix PR87030 and tidy config fragments.

This is about 32/64b host and multilib support across the range of Darwin
systems.

Prior to Darwin8 (OS X 10.4), the toolchains support only PowerPC and only 32b.

On Darwin8 it is possible to target a 64b multilib, but with support limited
to a few of the main libraries on the system (not a recommended configuration).

From Darwin9 to Darwin17 (OSX 10.5 to 10.13) it is possible to have either
32 or 64b hosted toolchains, with support for a 64 or 32b multilib
respectively.

On Darwin9 the kernel is 32b, but with support for 64b executables, so it's
conventional to build a 32b host toolchain supporting a 64b multilib. However
this is not enforced (merely a convention).

There is also some platform hardware supporting Darwin10/11 which is only 32b
and for which the same situation applies. However, from Darwin10 to Darwin17,
the majority of platform hardware supports a 64b kernel and it's conventional
to build a 64b host toolchain with support for a 32b multilib.

On/from Darwin18 (OS X 10.14), the development headers (in the SDK) no longer
expose the interfaces for the 32b multilib support (although sufficient runtime
support remains installed that the testsuite can be run for a 32b multilib).

The PR is raised against this latter situation since the absence of exposed
interfaces causes a 'default' bootstrap fail regardless of the availability of
the runtimes. Given the number of permutations, I felt it warranted a general
solution, especially since the current scheme of target headers and t-make
fragments has become somewhat messy.

The changes here enforce the single 32b PowerPC multilib for Darwin < 8 and the
single X86 64b multilib for Darwin >= 18. This means that there is no longer
any need to configure Darwin18+ '--disable-multilib', but also that if you want
to use the ability to continue to test the compiler's 32b multilib there, you
need to make a configuration targeting an earlier OS version (and using the
SDK from that).

2019-08-29  Iain Sandoe  

Backport from mainline
2019-07-24  Iain Sandoe  

PR bootstrap/87030
* config/i386/darwin.h (REAL_LIBGCC_SPEC): Revert change from r273749.

PR bootstrap/87030
* config/i386/darwin.h (REAL_LIBGCC_SPEC): Move from here...
* config/i386/darwin32-biarch.h .. to here.
* config/i386/darwin64-biarch.h: Adjust comments.
* config/rs6000/darwin32-biarch.h: Likewise.
* config/rs6000/darwin64-biarch.h: Likewise.
* config.gcc: Missed commit from r273746
(*-*-darwin*): Don't include CPU t-darwin here.
(i[34567]86-*-darwin*): Adjust to use biarch files. Produce
an error message if i686-darwin configuration is attempted for
Darwin >= 18.

Backport from mainline
2019-07-23  Iain Sandoe  

PR bootstrap/87030
* config.gcc (*-*-darwin*): Don't include CPU t-darwin here.
(i[34567]86-*-darwin*): Adjust to use biarch files. Produce
an error message if i686-darwin configuration is attempted for
Darwin >= 18.
(x86_64-*-darwin*): Switch to single multilib for Darwin >= 18.
(powerpc-*-darwin*): Use biarch files where needed.
(powerpc64-*-darwin*): Likewise.
* config/i386/darwin.h (REAL_LIBGCC_SPEC): Move to new biarch file.
(DARWIN_ARCH_SPEC, DARWIN_SUBARCH_SPEC): Revise for default single
arch case.
* config/i386/darwin32-biarch.h: New.
* config/i386/darwin64.h: Rename.
* gcc/config/i386/darwin64-biarch.h: To this.
* config/i386/t-darwin: Rename.
* gcc/config/i386/t-darwin32-biarch: To this.
* config/i386/t-darwin64: Rename.
* gcc/config/i386/t-darwin64-biarch: To this.
* config/rs6000/darwin32-biarch.h: New.
* config/rs6000/darwin64.h: Rename.
* config/rs6000/darwin64-biarch.h: To this.
(DARWIN_ARCH_SPEC, DARWIN_SUBARCH_SPEC): Revise for default single
arch case.
* config/rs6000/t-darwin8: Rename.
* config/rs6000/t-darwin32-biarch: To this.
* config/rs6000/t-darwin64 Rename.
* config/rs6000/t-darwin64-biarch: To this.


Added:
branches/gcc-8-branch/gcc/config/i386/darwin32-biarch.h
branches/gcc-8-branch/gcc/config/i386/darwin64-biarch.h
branches/gcc-8-branch/gcc/config/i386/t-darwin32-biarch
branches/gcc-8-branch/gcc/config/i386/t-darwin64-biarch
branches/gcc-8-branch/gcc/config/rs6000/darwin32-biarch.h
branches/gcc-8-branch/gcc/config/rs6000/darwin64-biarch.h
branches/gcc-8-branch/gcc/config/rs6000/t-darwin32-biarch
branches/gcc-8-branch/gcc/config/rs6000/t-darwin64-biarch
Removed:
branches/gcc-8-branch/gcc/config/i386/darwin64.h

[Bug fortran/91556] Problems with better interface checking

2019-08-29 Thread anlauf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

--- Comment #16 from anlauf at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #15)
> On Thu, Aug 29, 2019 at 06:49:15PM +, anlauf at gcc dot gnu.org wrote:
> > module foo
> >   implicit none
> >   type t1
> >  integer :: i = 1
> >   end type t1
> >   type t2
> >  integer :: j = 2
> >   end type t2
> > contains
> >   subroutine s1 (x)
> > type(t1) :: x
> > call my_mpi_bcast_wrapper (x, storage_size (x)/8)
> >   end subroutine s1
> >   subroutine s2 (y)
> > type(t2) :: y
> > call my_mpi_bcast_wrapper (y, storage_size (y)/8)
> >   end subroutine s2
> > end module foo
> > 
> > That's perfectly legal, but gets rejected unless -fallow-argument-mismatch
> > is specified.  But then I still get a warning (or many if this appears in
> > a large module).
> 
> You can get rid of the warning with -w.

That would get rid of all warnings.

[Bug fortran/91556] Problems with better interface checking

2019-08-29 Thread sgk at troutmask dot apl.washington.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

--- Comment #15 from Steve Kargl  ---
On Thu, Aug 29, 2019 at 06:49:15PM +, anlauf at gcc dot gnu.org wrote:
> module foo
>   implicit none
>   type t1
>  integer :: i = 1
>   end type t1
>   type t2
>  integer :: j = 2
>   end type t2
> contains
>   subroutine s1 (x)
> type(t1) :: x
> call my_mpi_bcast_wrapper (x, storage_size (x)/8)
>   end subroutine s1
>   subroutine s2 (y)
> type(t2) :: y
> call my_mpi_bcast_wrapper (y, storage_size (y)/8)
>   end subroutine s2
> end module foo
> 
> That's perfectly legal, but gets rejected unless -fallow-argument-mismatch
> is specified.  But then I still get a warning (or many if this appears in
> a large module).

You can get rid of the warning with -w.

[Bug fortran/91556] Problems with better interface checking

2019-08-29 Thread anlauf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |anlauf at gcc dot gnu.org

--- Comment #14 from anlauf at gcc dot gnu.org ---
The current solution is a bit annoying for implicitly-derived interfaces.

Consider a code like:

module foo
  implicit none
  type t1
 integer :: i = 1
  end type t1
  type t2
 integer :: j = 2
  end type t2
contains
  subroutine s1 (x)
type(t1) :: x
call my_mpi_bcast_wrapper (x, storage_size (x)/8)
  end subroutine s1
  subroutine s2 (y)
type(t2) :: y
call my_mpi_bcast_wrapper (y, storage_size (y)/8)
  end subroutine s2
end module foo

That's perfectly legal, but gets rejected unless -fallow-argument-mismatch
is specified.  But then I still get a warning (or many if this appears in
a large module).

I know that there is a (quite clumsy) solution to the above by providing
many dummy interfaces, just to defeat the checking.

I would like to see an error only for explicit interfaces.  But e.g. for
packages like MPI, where the mpi_* routines can handle different argument
types, and where by default one doesn't need (or want) an explicit interface,
I'd hope that the checks could be downgraded.

[Bug middle-end/81914] [7 Regression] gcc 7.1 generates branch for code which was branchless in earlier gcc version

2019-08-29 Thread jg at jguk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81914

Jonny Grant  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jg at jguk dot org

--- Comment #15 from Jonny Grant  ---
Hello

I noticed that the example below does not give a warning. I had expected one,
similar to the present -Wsign-conversion behaviour. My notes are as follows.

A) false+true is perhaps not treated as signed?

B) the return statement converts from size_t to int.

https://godbolt.org/z/KoVPqd


#include 
int main()
{
size_t j = false+true;

return j;
}
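
To make the two conversions in notes A and B explicit, here is a small sketch. It does not settle whether a warning should be expected; it only shows the types involved. `sum_as_size` and `size_as_int` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Both bool operands are promoted to int before the addition.
static_assert(std::is_same<decltype(false + true), int>::value,
              "false + true has type int");

// A) int -> size_t: the constant 1 is representable, so the value is kept.
inline std::size_t sum_as_size()
{
  std::size_t j = false + true;
  return j;
}

// B) size_t -> int, as happens in `return j;` from main.
//    Written as an explicit cast here so the narrowing is visible.
inline int size_as_int(std::size_t j)
{
  return static_cast<int>(j);
}
```

In A the initializer is a constant expression of type int, and in B the operand is a run-time size_t value, so the two cases are not equivalent from the point of view of a conversion diagnostic.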

[Bug tree-optimization/90134] ICE in duplicate_eh_regions_1, at except.c:557

2019-08-29 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90134

Arseny Solokha  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|9.0                         |10.0, 9.2.0

--- Comment #2 from Arseny Solokha  ---
Another one, just in case:

% x86_64-unknown-linux-gnu-g++-10.0.0-alpha20190825 -O1 -fnon-call-exceptions
-ftree-parallelize-loops=2 -fno-early-inlining -fno-lifetime-dse -c
libstdc++-v3/testsuite/25_algorithms/headers/algorithm/parallel_algorithm_mixed2.cc
during GIMPLE pass: ompexpssa   
In file included from
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/vector:66,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/parallel/multiway_mergesort.h:35,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/parallel/sort.h:44,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/parallel/algo.h:45,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/parallel/algorithm:37,
 from
libstdc++-v3/testsuite/25_algorithms/headers/algorithm/parallel_algorithm_mixed2.cc:27:
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/bits/stl_uninitialized.h:
In function
'__gnu_parallel__multiway_merge_exact_splitting_true__std__pair___gnu_cxxnormal_iterator_short_u.17':
/usr/lib/gcc/x86_64-unknown-linux-gnu/10.0.0-alpha20190825/include/g++-v10/bits/stl_uninitialized.h:546:19:
internal compiler error: in duplicate_eh_regions_1, at except.c:557
  546 |for (; __n > 0; --__n, (void) ++__cur)
  |   ^~~
0x6cd451 duplicate_eh_regions_1
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/except.c:557
0xc34e18 duplicate_eh_regions_1
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/except.c:599
0xc35040 duplicate_eh_regions(function*, eh_region_d*, int, tree_node* (*)(tree_node*, void*), void*)
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/except.c:633
0xfc75ce move_sese_region_to_fn(function*, basic_block_def*, basic_block_def*, tree_node*)
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/tree-cfg.c:7565
0xe4e4b6 expand_omp_taskreg
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/omp-expand.c:1439
0xe54e40 expand_omp_synch
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/omp-expand.c:6982
0xe54e40 expand_omp
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/omp-expand.c:8827
0xe56b80 execute_expand_omp
        /var/tmp/portage/sys-devel/gcc-10.0.0_alpha20190825/work/gcc-10-20190825/gcc/omp-expand.c:9019

[PATCH, i386]: Tighten inline_secondary_memory_needed to reject moves between (SSE,mask) and non-general regs

2019-08-29 Thread Uros Bizjak
2019-08-29  Uroš Bizjak  

* config/i386/i386.c (inline_secondary_memory_needed): Return true
for moves between SSE and non-general registers and between
mask and non-general registers.
(ix86_register_move_cost): Remove stale comment.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d2d84eb11663..1c9c719f22a3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18306,32 +18306,36 @@ inline_secondary_memory_needed (machine_mode mode, reg_class_t class1,
   if (FLOAT_CLASS_P (class1) != FLOAT_CLASS_P (class2))
 return true;
 
-  /* Between mask and general, we have moves no larger than word size.  */
-  if ((MASK_CLASS_P (class1) != MASK_CLASS_P (class2))
-  && (GET_MODE_SIZE (mode) > UNITS_PER_WORD))
-  return true;
-
   /* ??? This is a lie.  We do have moves between mmx/general, and for
  mmx/sse2.  But by saying we need secondary memory we discourage the
  register allocator from using the mmx registers unless needed.  */
   if (MMX_CLASS_P (class1) != MMX_CLASS_P (class2))
 return true;
 
+  /* Between mask and general, we have moves no larger than word size.  */
+  if (MASK_CLASS_P (class1) != MASK_CLASS_P (class2))
+{
+  if (!(INTEGER_CLASS_P (class1) || INTEGER_CLASS_P (class2))
+ || GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+   return true;
+}
+
   if (SSE_CLASS_P (class1) != SSE_CLASS_P (class2))
 {
   /* SSE1 doesn't have any direct moves from other classes.  */
   if (!TARGET_SSE2)
return true;
 
+  /* Between SSE and general, we have moves no larger than word size.  */
+  if (!(INTEGER_CLASS_P (class1) || INTEGER_CLASS_P (class2))
+ || GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+   return true;
+
   /* If the target says that inter-unit moves are more expensive
 than moving through memory, then don't generate them.  */
   if ((SSE_CLASS_P (class1) && !TARGET_INTER_UNIT_MOVES_FROM_VEC)
  || (SSE_CLASS_P (class2) && !TARGET_INTER_UNIT_MOVES_TO_VEC))
return true;
-
-  /* Between SSE and general, we have moves no larger than word size.  */
-  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
-   return true;
 }
 
   return false;
@@ -18608,15 +18612,7 @@ ix86_register_move_cost (machine_mode mode, reg_class_t class1_i,
   if (MMX_CLASS_P (class1) != MMX_CLASS_P (class2))
 gcc_unreachable ();
 
-  /* Moves between SSE and integer units are expensive.  */
   if (SSE_CLASS_P (class1) != SSE_CLASS_P (class2))
-
-/* ??? By keeping returned value relatively high, we limit the number
-   of moves between integer and SSE registers for all targets.
-   Additionally, high value prevents problem with x86_modes_tieable_p(),
-   where integer modes in SSE registers are not tieable
-   because of missing QImode and HImode moves to, from or between
-   MMX/SSE registers.  */
 return (SSE_CLASS_P (class1)
? ix86_cost->hard_register.sse_to_integer
: ix86_cost->hard_register.integer_to_sse);


Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Wilco Dijkstra
Hi Alexander,
 
> So essentially the main issue is not a hardware peculiarity, but rather the
> bad schedule being totally wrong (it could only make sense if loads had
> 1-cycle latency, which they do not).

The scheduling is only bad because the specific intrinsics used are mapped
onto asm statements, so they are ignored by the scheduler and modelled
with zero latencies.

> I think this highlights how implementing this autoprefetch heuristic via the
> dfa_lookahead_guard interface looks questionable in the first place, but the
> patch itself makes sense to me.

Yes, I'm still not sure what this autoprefetch heuristic is trying to
accomplish... We could try disabling it and see whether it actually helps.

Wilco

[PATCH, i386]: Do not limit the cost of moves to/from XMM register to minimum 8.

2019-08-29 Thread Uros Bizjak
2019-08-28  Uroš Bizjak  

* config/i386/i386.c (ix86_register_move_cost): Do not
limit the cost of moves to/from XMM register to minimum 8.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Actually committed as r274994 with the wrong ChangeLog.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 49ab50ea41bf..11c75be113e0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18601,9 +18601,9 @@ ix86_register_move_cost (machine_mode mode, reg_class_t class1_i,
where integer modes in SSE registers are not tieable
because of missing QImode and HImode moves to, from or between
MMX/SSE registers.  */
-return MAX (8, SSE_CLASS_P (class1)
-   ? ix86_cost->hard_register.sse_to_integer
-   : ix86_cost->hard_register.integer_to_sse);
+return (SSE_CLASS_P (class1)
+   ? ix86_cost->hard_register.sse_to_integer
+   : ix86_cost->hard_register.integer_to_sse);
 
   if (MAYBE_FLOAT_CLASS_P (class1))
 return ix86_cost->hard_register.fp_move;


[Bug c++/91129] [9/10 Regression] Implicit casts fail for modulo operator

2019-08-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91129

Marek Polacek  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|mpolacek at gcc dot gnu.org
   Target Milestone|---                          |9.3
            Summary|Implicit casts fail for      |[9/10 Regression] Implicit
                   |modulo operator              |casts fail for modulo
                   |                             |operator

--- Comment #2 from Marek Polacek  ---
https://gcc.gnu.org/ml/gcc-patches/2019-08/msg02008.html

C++ PATCH for c++/91129 - wrong error with binary op in template argument

2019-08-29 Thread Marek Polacek
We reject this test with errors like

nontype1.C:22:14: error: taking address of rvalue [-fpermissive]
   22 |   A{}> a2;
  |  ^~~

that happens because for converting "C{}" to int we generate
"C::operator int (_EXPR )".  The second
template argument is a binary operator, so while parsing the template
arguments in a function template foo we end up in cp_build_binary_op.

cp_build_binary_op calls, for certain binary ops, fold_non_dependent_expr.
Since  these
calls are no longer tf_none.  fold_non_dependent_expr, when in a template,
will instantiate, which gives the "taking address of rvalue" error.

In this particular case the fix seems to be using fold_for_warn instead,
which in a template is fold_non_dependent_expr with tf_none; all the
fold calls I'm changing in this patch are used for diagnostic purposes
only, and it fixes all the bogus errors.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-29  Marek Polacek  

PR c++/91129 - wrong error with binary op in template argument.
* typeck.c (warn_for_null_address): Use fold_for_warn instead of
fold_non_dependent_expr.
(cp_build_binary_op): Likewise.

* g++.dg/cpp1y/nontype1.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index c09bb309142..31414453524 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4305,7 +4305,7 @@ warn_for_null_address (location_t location, tree op, tsubst_flags_t complain)
   || TREE_NO_WARNING (op))
 return;
 
-  tree cop = fold_non_dependent_expr (op, complain);
+  tree cop = fold_for_warn (op);
 
   if (TREE_CODE (cop) == ADDR_EXPR
   && decl_with_nonnull_addr_p (TREE_OPERAND (cop, 0))
@@ -4628,9 +4628,8 @@ cp_build_binary_op (const op_location_t ,
  || code1 == COMPLEX_TYPE || code1 == VECTOR_TYPE))
{
  enum tree_code tcode0 = code0, tcode1 = code1;
- tree cop1 = fold_non_dependent_expr (op1, complain);
  doing_div_or_mod = true;
- warn_for_div_by_zero (location, cop1);
+ warn_for_div_by_zero (location, fold_for_warn (op1));
 
  if (tcode0 == COMPLEX_TYPE || tcode0 == VECTOR_TYPE)
tcode0 = TREE_CODE (TREE_TYPE (TREE_TYPE (op0)));
@@ -4669,11 +4668,8 @@ cp_build_binary_op (const op_location_t ,
 
 case TRUNC_MOD_EXPR:
 case FLOOR_MOD_EXPR:
-  {
-   tree cop1 = fold_non_dependent_expr (op1, complain);
-   doing_div_or_mod = true;
-   warn_for_div_by_zero (location, cop1);
-  }
+  doing_div_or_mod = true;
+  warn_for_div_by_zero (location, fold_for_warn (op1));
 
   if (code0 == VECTOR_TYPE && code1 == VECTOR_TYPE
  && TREE_CODE (TREE_TYPE (type0)) == INTEGER_TYPE
@@ -4766,7 +4762,7 @@ cp_build_binary_op (const op_location_t ,
}
   else if (code0 == INTEGER_TYPE && code1 == INTEGER_TYPE)
{
- tree const_op1 = fold_non_dependent_expr (op1, complain);
+ tree const_op1 = fold_for_warn (op1);
  if (TREE_CODE (const_op1) != INTEGER_CST)
const_op1 = op1;
  result_type = type0;
@@ -4812,10 +4808,10 @@ cp_build_binary_op (const op_location_t ,
}
   else if (code0 == INTEGER_TYPE && code1 == INTEGER_TYPE)
{
- tree const_op0 = fold_non_dependent_expr (op0, complain);
+ tree const_op0 = fold_for_warn (op0);
  if (TREE_CODE (const_op0) != INTEGER_CST)
const_op0 = op0;
- tree const_op1 = fold_non_dependent_expr (op1, complain);
+ tree const_op1 = fold_for_warn (op1);
  if (TREE_CODE (const_op1) != INTEGER_CST)
const_op1 = op1;
  result_type = type0;
diff --git gcc/testsuite/g++.dg/cpp1y/nontype1.C gcc/testsuite/g++.dg/cpp1y/nontype1.C
new file mode 100644
index 000..a37e996a3ff
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1y/nontype1.C
@@ -0,0 +1,42 @@
+// PR c++/91129 - wrong error with binary op in template argument.
+// { dg-do compile { target c++14 } }
+
+template
+struct C
+{
+  constexpr operator T() const { return v; }
+  constexpr auto operator()() const { return v; }
+};
+
+template
+struct A
+{
+};
+
+template
+void foo ()
+{
+  A{}> a0;
+  A{}> a1;
+  A{}> a2;
+  A{}> a3;
+  A{}> a4;
+  A{}> a5;
+  A{}> a6;
+  A{}> a7;
+  A{}> a8;
+  A{}> a9;
+  A{}> a10;
+  A> C{})> a11;
+  A{}> a12;
+  A{}> a13;
+  A{}> a14;
+  A{}> a15;
+  A{}> a16;
+  A{}> a17;
+}
+
+int main()
+{
+  foo<10>();
+}
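
The template arguments of the test above did not survive in this archive copy,
so the following is only a hypothetical reconstruction of the pattern being
tested, not the committed test.  It shows a binary operator inside a template
argument whose right operand needs a constexpr conversion to int; all names
and values here are assumptions.

```cpp
#include <cassert>

// A class whose objects convert to T at compile time.
template<typename T, T v>
struct C {
  constexpr operator T() const { return v; }
};

// A template taking a non-type argument, so the operand must be folded.
template<int N>
struct A {
  static constexpr int value = N;
};

// The binary op is parsed inside a function template, which is the
// situation described in the patch description.
template<int N>
int modulo_in_template_argument() {
  return A<N % C<int, 4>{}>::value;
}
```

A compiler with the fix should accept this; calling
modulo_in_template_argument<10>() then yields 10 % 4.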


Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Alexander Monakov
On Thu, 29 Aug 2019, Maxim Kuvyrkov wrote:

> >> r1 = [rb + 0]
> >> 
> >> r2 = [rb + 8]
> >> 
> >> r3 = [rb + 16]
> >> 
> >> 
> >> which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This
> >> schedule happens because r2= load gets lower priority than the
> >> "irrelevant"  due to the above patch.
> >> 
> >> If we think about it, the fact that "r1 = [rb + 0]" can be scheduled
> >> means that true dependencies of all similar base+offset loads are
> >> resolved.  Therefore, for autoprefetcher-friendly schedule we should
> >> prioritize memory reads before "irrelevant" instructions.
> > 
> > But isn't there also max number of load issues in a fetch window to 
> > consider? 
> > So interleaving arithmetic with loads might be profitable. 
> 
> It appears that cores with autoprefetcher hardware prefer loads and stores
> bundled together, not interspersed with other instructions to occupy the rest
> of CPU units.

Let me point out that the motivating example has a bigger effect in play:

(1) r1 = [rb + 0]
(2) 
(3) r2 = [rb + 8]
(4) 
(5) r3 = [rb + 16]
(6) 

here Cortex-A53, being an in-order core, cannot issue the load at (3) until
after the load at (1) has completed, because the use at (2) depends on it.
The good schedule allows the three loads to issue in a pipelined fashion.

So essentially the main issue is not a hardware peculiarity, but rather the
bad schedule being totally wrong (it could only make sense if loads had 1-cycle
latency, which they do not).
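
The in-order argument can be checked with a toy model.  The sketch below
assumes a single-issue in-order core, a 3-cycle load latency and 1-cycle
arithmetic; these numbers and all names are illustrative assumptions, not
Cortex-A53 data.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Insn {
  int dest;     // register written
  int src;      // register read (register 0 models rb, always ready)
  int latency;  // cycles until the result is available
};

const int kNumRegs = 7;

// Single-issue, in-order: an insn issues no earlier than one cycle after the
// previous one and no earlier than when its source register is ready.
// Returns the cycle at which the last result becomes available.
int simulate_in_order(const std::vector<Insn>& insns) {
  std::vector<int> ready(kNumRegs, 0);
  int next_issue = 0;
  int done = 0;
  for (const Insn& insn : insns) {
    int issue = std::max(next_issue, ready[insn.src]);
    int finish = issue + insn.latency;
    ready[insn.dest] = finish;
    done = std::max(done, finish);
    next_issue = issue + 1;
  }
  return done;
}

// load, use, load, use, load, use
std::vector<Insn> interleaved_schedule(int load_latency) {
  return {{1, 0, load_latency}, {4, 1, 1},
          {2, 0, load_latency}, {5, 2, 1},
          {3, 0, load_latency}, {6, 3, 1}};
}

// load, load, load, use, use, use
std::vector<Insn> grouped_schedule(int load_latency) {
  return {{1, 0, load_latency}, {2, 0, load_latency}, {3, 0, load_latency},
          {4, 1, 1}, {5, 2, 1}, {6, 3, 1}};
}
```

Under these assumptions the grouped schedule finishes in 6 cycles and the
interleaved one in 12; with a 1-cycle load latency both take 6, which is the
only case where the interleaved order costs nothing.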

I think this highlights how implementing this autoprefetch heuristic via the
dfa_lookahead_guard interface looks questionable in the first place, but the
patch itself makes sense to me.

Alexander


Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Wilco Dijkstra
Hi Maxim,
 
 > It appears that cores with autoprefetcher hardware prefer loads and stores
 > bundled together, not interspersed with other instructions to occupy the
 > rest of CPU units.
  
 I don't believe it is as simple as that - modern cores have multiple
 prefetchers but won't prefer bundling loads and stores in large blocks.
 That would result in terrible performance due to dispatch and issue stalls.
 Also the increased register pressure could cause extra spilling. If we group
 loads and stores, we'd definitely need to limit them to say 4 or so at most,
 and then interleave ALU operations.
 
 > Autoprefetching heuristic is enabled only for cores that support it, and
 > isn't active by default.
  
 It's enabled on most cores, including the default (generic). So we do have to
 be careful that this doesn't regress any other benchmarks or do worse on
 modern cores.
 
 Cheers,
 Wilco
  
 

[PATCH][ARM] Add logical DImode expanders

2019-08-29 Thread Wilco Dijkstra
We currently use default mid-end expanders for logical DImode operations.
These split operations without first splitting off complex immediates or
memory operands.  The resulting expansions are non-optimal and allow for
fewer LDRD/STRD opportunities.  So add back explicit expanders which ensure
memory operands and immediates are handled more efficiently.
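
As a rough illustration of what splitting a DImode logical operation means,
here is a plain C++ model that computes a 64-bit AND as two independent
32-bit ANDs on the low and high halves.  It is only a sketch of the idea; the
function name is an assumption and it does not model RTL or LDRD/STRD.

```cpp
#include <cassert>
#include <cstdint>

// Compute a 64-bit AND as two independent 32-bit operations, one on the
// low words and one on the high words, then recombine the halves.
std::uint64_t and64_split(std::uint64_t a, std::uint64_t b) {
  std::uint32_t lo = static_cast<std::uint32_t>(a) &
                     static_cast<std::uint32_t>(b);
  std::uint32_t hi = static_cast<std::uint32_t>(a >> 32) &
                     static_cast<std::uint32_t>(b >> 32);
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

When one half of a constant operand is trivial (all zeros or all ones for
AND), that half needs no real instruction, which is why splitting early can
pay off.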

Bootstrap OK on armhf, regress passes.

ChangeLog:
2019-08-29  Wilco Dijkstra  

* config/arm/arm.md (anddi3): Expand explicitly.
(iordi3): Likewise.
(xordi3): Likewise.
(one_cmpldi2): Likewise.
* config/arm/arm.c (const_ok_for_dimode_op): Return true if one
of the constant parts is simple.
* config/arm/predicates.md (arm_anddi_operand): Add predicate.
(arm_iordi_operand): Add predicate.
(arm_xordi_operand): Add predicate.

--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fb57880fe0568be96a04aee1b7d230e77121e3f5..1fec00baa2a5e510ef2c02d9766432cc7cd0a17b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4273,8 +4273,8 @@ const_ok_for_dimode_op (HOST_WIDE_INT i, enum rtx_code code)
 case AND:
 case IOR:
 case XOR:
-  return (const_ok_for_op (hi_val, code) || hi_val == 0x)
-  && (const_ok_for_op (lo_val, code) || lo_val == 0x);
+  return const_ok_for_op (hi_val, code) || hi_val == 0x
+|| const_ok_for_op (lo_val, code) || lo_val == 0x;
 case PLUS:
   return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ed49c4beda138633a84b58fe345cf5ba99103ab7..738d42fd164f117f1dec1108a824d984ccd70d09 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2176,6 +2176,89 @@ (define_expand "divdf3"
   "")
 
 
+; Expand logical operations.  The mid-end expander does not split off memory
+; operands or complex immediates, which leads to fewer LDRD/STRD instructions.
+; So an explicit expander is needed to generate better code.
+
+(define_expand "anddi3"
+  [(set (match_operand:DI0 "s_register_operand")
+   (and:DI (match_operand:DI 1 "s_register_operand")
+   (match_operand:DI 2 "arm_anddi_operand")))]
+  "TARGET_32BIT"
+  {
+  rtx low  = simplify_gen_binary (AND, SImode,
+ gen_lowpart (SImode, operands[1]),
+ gen_lowpart (SImode, operands[2]));
+  rtx high = simplify_gen_binary (AND, SImode,
+ gen_highpart (SImode, operands[1]),
+ gen_highpart_mode (SImode, DImode,
+operands[2]));
+
+  emit_insn (gen_rtx_SET (gen_lowpart (SImode, operands[0]), low));
+  emit_insn (gen_rtx_SET (gen_highpart (SImode, operands[0]), high));
+  DONE;
+  }
+)
+
+(define_expand "iordi3"
+  [(set (match_operand:DI0 "s_register_operand")
+   (ior:DI (match_operand:DI 1 "s_register_operand")
+   (match_operand:DI 2 "arm_iordi_operand")))]
+  "TARGET_32BIT"
+  {
+  rtx low  = simplify_gen_binary (IOR, SImode,
+ gen_lowpart (SImode, operands[1]),
+ gen_lowpart (SImode, operands[2]));
+  rtx high = simplify_gen_binary (IOR, SImode,
+ gen_highpart (SImode, operands[1]),
+ gen_highpart_mode (SImode, DImode,
+operands[2]));
+
+  emit_insn (gen_rtx_SET (gen_lowpart (SImode, operands[0]), low));
+  emit_insn (gen_rtx_SET (gen_highpart (SImode, operands[0]), high));
+  DONE;
+  }
+)
+
+(define_expand "xordi3"
+  [(set (match_operand:DI0 "s_register_operand")
+   (xor:DI (match_operand:DI 1 "s_register_operand")
+   (match_operand:DI 2 "arm_xordi_operand")))]
+  "TARGET_32BIT"
+  {
+   rtx low  = simplify_gen_binary (XOR, SImode,
+   gen_lowpart (SImode, operands[1]),
+   gen_lowpart (SImode, operands[2]));
+   rtx high = simplify_gen_binary (XOR, SImode,
+   gen_highpart (SImode, operands[1]),
+   gen_highpart_mode (SImode, DImode,
+  operands[2]));
+
+   emit_insn (gen_rtx_SET (gen_lowpart (SImode, operands[0]), low));
+   emit_insn (gen_rtx_SET (gen_highpart (SImode, operands[0]), high));
+   DONE;
+  }
+)
+
+(define_expand "one_cmpldi2"
+  [(set (match_operand:DI 0 "s_register_operand")
+   (not:DI (match_operand:DI 1 "s_register_operand")))]
+  "TARGET_32BIT"
+  {
+  rtx low  = simplify_gen_unary (NOT, SImode,
+gen_lowpart (SImode, operands[1]),
+  

Re: [PATCH], Fix V1TI in Altivec regs on old systems

2019-08-29 Thread Segher Boessenkool
Hi!

On Tue, Aug 20, 2019 at 02:00:31PM -0400, Michael Meissner wrote:
> I noticed on power5 that the V1TImode mode is allowed in Altivec registers,
> even though power5 doesn't have Altivec registers.
> 
> While it doesn't seem to affect anything (I couldn't create a test case that
> failed), it is a small nit that should be fixed.  The test for TARGET_VADDUQM
> matches a test earlier in the function where VSX registers are checked.

Yeah, but does that test make any sense?

Why p8 (or later) only?  Why vector only?  (Well that one is clear, and
what this patch is about).  Why -mpowerpc64 only?  Because it has __int128
maybe?  But then you should test for *that* (and why is it important?)

Where would V1TI go if not in vector regs?  Just in memory?

And, what happens on 970?  p5 doesn't have vector registers, but 970 does.

> --- gcc/config/rs6000/rs6000.c(revision 274635)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -1874,7 +1874,7 @@ rs6000_hard_regno_mode_ok_uncached (int
>/* AltiVec only in AldyVec registers.  */
>if (ALTIVEC_REGNO_P (regno))
>  return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
> - || mode == V1TImode);
> + || (TARGET_VADDUQM && mode == V1TImode));


Segher


Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Maxim Kuvyrkov
> On Aug 29, 2019, at 7:29 PM, Richard Biener wrote:
> 
> On August 29, 2019 5:40:47 PM GMT+02:00, Maxim Kuvyrkov wrote:
>> Hi,
>> 
>> This patch tweaks autoprefetcher heuristic in scheduler to better group
>> memory loads and stores together.
>> 
>> From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598:
>> 
>> There are two separate changes, both related to instruction scheduler,
>> that cause the regression.  The first change in r253235 is responsible
>> for 70% of the regression.
>> ===
>>   haifa-sched: fix autopref_rank_for_schedule qsort comparator
>> 
>> * haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant' insns
>>   first, always call autopref_rank_data otherwise.
>> 
>> 
>> 
>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253235
>> 138bc75d-0d04-0410-961f-82ee72b054a4
>> ===
>> 
>> After this change instead of
>> r1 = [rb + 0]
>> r2 = [rb + 8]
>> r3 = [rb + 16]
>> r4 = 
>> r5 = 
>> r6 = 
>> 
>> we get
>> r1 = [rb + 0]
>> 
>> r2 = [rb + 8]
>> 
>> r3 = [rb + 16]
>> 
>> 
>> which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This
>> schedule happens because r2= load gets lower priority than the
>> "irrelevant"  due to the above patch.
>> 
>> If we think about it, the fact that "r1 = [rb + 0]" can be scheduled
>> means that true dependencies of all similar base+offset loads are
>> resolved.  Therefore, for autoprefetcher-friendly schedule we should
>> prioritize memory reads before "irrelevant" instructions.
> 
> But isn't there also max number of load issues in a fetch window to consider? 
> So interleaving arithmetic with loads might be profitable. 

It appears that cores with autoprefetcher hardware prefer loads and stores
bundled together, not interspersed with other instructions to occupy the rest
of CPU units.

Autoprefetching heuristic is enabled only for cores that support it, and isn't
active by default.

> 
>> On the other hand, following similar logic, we want to delay memory
>> stores as much as possible to start scheduling them only after all
>> potential producers are scheduled.  I.e., for autoprefetcher-friendly
>> schedule we should prioritize "irrelevant" instructions before memory
>> writes.
>> 
>> Obvious patch to implement the above is attached.  It brings 70% of
>> regressed performance on this testcase back.
>> 
>> OK to commit?
>> 
>> Regards,
>> 
>> --
>> Maxim Kuvyrkov
>> www.linaro.org



[Bug c++/89585] GCC 8.3: asm volatile no longer accepted at file scope

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89585

Jakub Jelinek changed:

               What|Removed                       |Added

             Status|NEW                           |RESOLVED
         Resolution|---                           |FIXED

--- Comment #36 from Jakub Jelinek  ---
I think so.

Re: [PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Richard Biener
On August 29, 2019 5:40:47 PM GMT+02:00, Maxim Kuvyrkov wrote:
>Hi,
>
>This patch tweaks autoprefetcher heuristic in scheduler to better group
>memory loads and stores together.
>
>From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598:
>
>There are two separate changes, both related to instruction scheduler,
>that cause the regression.  The first change in r253235 is responsible
>for 70% of the regression.
>===
>haifa-sched: fix autopref_rank_for_schedule qsort comparator
>
> * haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant' insns
>first, always call autopref_rank_data otherwise.
>
>
>
>git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253235
>138bc75d-0d04-0410-961f-82ee72b054a4
>===
>
>After this change instead of
>r1 = [rb + 0]
>r2 = [rb + 8]
>r3 = [rb + 16]
>r4 = 
>r5 = 
>r6 = 
>
>we get
>r1 = [rb + 0]
>
>r2 = [rb + 8]
>
>r3 = [rb + 16]
>
>
>which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This
>schedule happens because r2= load gets lower priority than the
>"irrelevant"  due to the above patch.
>
>If we think about it, the fact that "r1 = [rb + 0]" can be scheduled
>means that true dependencies of all similar base+offset loads are
>resolved.  Therefore, for autoprefetcher-friendly schedule we should
>prioritize memory reads before "irrelevant" instructions.

But isn't there also max number of load issues in a fetch window to consider? 
So interleaving arithmetic with loads might be profitable. 

>On the other hand, following similar logic, we want to delay memory
>stores as much as possible to start scheduling them only after all
>potential producers are scheduled.  I.e., for autoprefetcher-friendly
>schedule we should prioritize "irrelevant" instructions before memory
>writes.
>
>Obvious patch to implement the above is attached.  It brings 70% of
>regressed performance on this testcase back.
>
>OK to commit?
>
>Regards,
>
>--
>Maxim Kuvyrkov
>www.linaro.org



Re: [PATCH] Generalized predicate/condition for parameter reference in IPA (PR ipa/91088)

2019-08-29 Thread Martin Jambor
Hi,

On Fri, Jul 12 2019, Feng Xue OS wrote:
> Current IPA-cp only generates cost-evaluating predicate for conditional 
> statement like
> "if (param cmp const_val)", it is too simple and conservative. This patch 
> generalizes the
> process to handle the form as T(param), a mathematical transformation on the 
> function
> parameter, in which the parameter occurs once, and other operands are 
> constant value.

thanks for working on this.  I cannot approve this, but I have had a
brief look and have the following comments:
>
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3d92250b520..0110446e09e 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,25 @@
> +2019-07-12  Feng Xue  
> +
> + PR ipa/91088
> + * ipa-predicat.h (struct expr_eval_op): New struct.
> + (expr_eval_ops): New typedef.
> + (struct condition): Add param_ops member.
> + (add_condition): Add param_ops parameter.
> + * ipa-predicat.c (expr_eval_ops_equal_p): New function.
> + (predicate::add_clause): Add param_ops comparison.
> + (dump_condition): Add debug dump for param_ops.
> + (remap_after_inlining): Add param_ops argument to call to
> + add_condition.
> + (add_condition): Add parameter param_ops.
> + * ipa-fnsummary.c (evaluate_conditions_for_known_args): Fold
> + parameter expressions using param_ops.
> + (decompose_param_expr):  New function.
> + (set_cond_stmt_execution_predicate): Use call to decompose_param_expr
> + to replace call to unmodified_parm_or_parm_agg_item.
> + (set_switch_stmt_execution_predicate): Likewise.
> + (inline_read_section): Read param_ops from summary stream.
> + (ipa_fn_summary_write): Write param_ops to summary stream.
> +

(It's a bad idea to make ChangeLog entries part of the patch; it won't
apply for anyone, not even for you nowadays.)


>  2019-07-11  Sunil K Pandey  
>  
>   PR target/90980
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 09986211a1d..faf8bd39090 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -301,9 +301,9 @@ set_hint_predicate (predicate **p, predicate 
> new_predicate)
>  }
>  
>  
> -/* Compute what conditions may or may not hold given invormation about
> +/* Compute what conditions may or may not hold given information about
> parameters.  RET_CLAUSE returns truths that may hold in a specialized 
> copy,
> -   whie RET_NONSPEC_CLAUSE returns truths that may hold in an nonspecialized
> +   while RET_NONSPEC_CLAUSE returns truths that may hold in an nonspecialized
> copy when called in a given context.  It is a bitmask of conditions. Bit
> 0 means that condition is known to be false, while bit 1 means that 
> condition
> may or may not be true.  These differs - for example NOT_INLINED condition
> @@ -336,6 +336,8 @@ evaluate_conditions_for_known_args (struct cgraph_node 
> *node,
>  {
>tree val;
>tree res;
> +  int j;
> +  struct expr_eval_op *op;
>  
>/* We allow call stmt to have fewer arguments than the callee function
>   (especially for K style programs).  So bound check here (we assume
> @@ -399,7 +401,18 @@ evaluate_conditions_for_known_args (struct cgraph_node 
> *node,
> continue;
>   }
>  
> -  val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c->val), val);
> +  for (j = 0; vec_safe_iterate (c->param_ops, j, ); j++)
> + {
> +   if (!op->val)
> + val = fold_unary (op->code, op->type, val);
> +   else if (op->val_is_rhs)
> + val = fold_binary_to_constant (op->code, op->type, val, op->val);
> +   else
> + val = fold_binary_to_constant (op->code, op->type, op->val, val);
> +   if (!val)
> + break;
> + }
> +
>res = val
>   ? fold_binary_to_constant (c->code, boolean_type_node, val, c->val)
>   : NULL;
> @@ -1177,6 +1190,105 @@ eliminated_by_inlining_prob (ipa_func_body_info *fbi, 
> gimple *stmt)
>  }
>  }
>  
> +/* Flatten a tree expression on parameter into a set of sequential 
> operations.
> +   we only handle expression that is a mathematical transformation on the
> +   parameter, and in the expression, parameter occurs only once, and other
> +   operands are IPA invariant.  */

I understand describing these things is difficult, but "flatten" is a
strange way to describe what the function does.  What about something
like the following?

Analyze EXPR if it represents a series of simple operations performed on
a function parameter and return true if so.  FBI, STMT, INDEX_P, SIZE_P
and AGGPOS have the same meaning as in
unmodified_parm_or_parm_agg_item.  Operations on the parameter are
recorded to PARAM_OPS_P if it is not NULL.
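
To illustrate what "a series of simple operations performed on a function
parameter" amounts to once the parameter value is known, here is a small
standalone sketch.  It is not the IPA code: the types and names below are
hypothetical analogues of the expr_eval_op idea, using plain integers instead
of trees.

```cpp
#include <cassert>
#include <vector>

enum class OpCode { Negate, Add, Sub, Mul, Div };

// Hypothetical analogue of one recorded step: a unary operation, or a
// binary operation with one constant operand on either side.
struct ParamOp {
  OpCode code;
  long val;         // constant operand (ignored for unary operations)
  bool val_is_rhs;  // true: param OP val; false: val OP param
};

// Applies the recorded steps in order to a known parameter value.
// Returns false if a step cannot be folded (here: division by zero),
// in which case *out is left untouched.
bool fold_param_ops(const std::vector<ParamOp>& ops, long param, long* out) {
  long val = param;
  for (const ParamOp& op : ops) {
    long lhs = op.val_is_rhs ? val : op.val;
    long rhs = op.val_is_rhs ? op.val : val;
    switch (op.code) {
      case OpCode::Negate: val = -val; break;
      case OpCode::Add: val = lhs + rhs; break;
      case OpCode::Sub: val = lhs - rhs; break;
      case OpCode::Mul: val = lhs * rhs; break;
      case OpCode::Div:
        if (rhs == 0) return false;
        val = lhs / rhs;
        break;
    }
  }
  *out = val;
  return true;
}
```

For example, the steps "multiply by 2" then "10 minus the result" model a
condition on 10 - param * 2; for param == 3 the chain folds to 4, which can
then be compared against the constant in the condition.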


> +
> +static bool
> +decompose_param_expr (struct ipa_func_body_info *fbi,
> +   gimple *stmt, tree expr,
> +   int *index_p, HOST_WIDE_INT *size_p,
> +   struct agg_position_info *aggpos,
> +   

[PATCH] Couple of debug dump improvements to scheduler (no code-gen changes)

2019-08-29 Thread Maxim Kuvyrkov
Hi,

The first patch adds ranking statistics for autoprefetcher heuristic.

The second one makes it easier to diff scheduler debug dumps by adding more 
context lines for diff at clock increments.

OK to commit?

--
Maxim Kuvyrkov
www.linaro.org




0002-Add-missing-entry-for-rank_for_schedule-stats.patch
Description: Binary data


0003-Improve-diff-ability-of-scheduler-logs.patch
Description: Binary data


[PR91598] Improve autoprefetcher heuristic in haifa-sched.c

2019-08-29 Thread Maxim Kuvyrkov
Hi,

This patch tweaks autoprefetcher heuristic in scheduler to better group memory 
loads and stores together.

From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598:

There are two separate changes, both related to instruction scheduler, that 
cause the regression.  The first change in r253235 is responsible for 70% of 
the regression.
===
haifa-sched: fix autopref_rank_for_schedule qsort comparator

* haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant' 
insns
first, always call autopref_rank_data otherwise.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253235 
138bc75d-0d04-0410-961f-82ee72b054a4
===

After this change instead of
r1 = [rb + 0]
r2 = [rb + 8]
r3 = [rb + 16]
r4 = 
r5 = 
r6 = 

we get
r1 = [rb + 0]

r2 = [rb + 8]

r3 = [rb + 16]


which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This schedule 
happens because r2= load gets lower priority than the "irrelevant"  due to the above patch.

If we think about it, the fact that "r1 = [rb + 0]" can be scheduled means that 
true dependencies of all similar base+offset loads are resolved.  Therefore, 
for autoprefetcher-friendly schedule we should prioritize memory reads before 
"irrelevant" instructions.

On the other hand, following similar logic, we want to delay memory stores as 
much as possible to start scheduling them only after all potential producers 
are scheduled.  I.e., for autoprefetcher-friendly schedule we should prioritize 
"irrelevant" instructions before memory writes.

Obvious patch to implement the above is attached.  It brings 70% of regressed 
performance on this testcase back.
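
The intended ordering (memory reads, then "irrelevant" instructions, then
memory writes) can be sketched as a three-way ranking.  This is only an
illustration under that assumption and not the haifa-sched.c comparator; the
enum and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class InsnKind { Load, Irrelevant, Store };

// Smaller rank means scheduled earlier: loads, then irrelevant insns,
// then stores.
int autopref_rank(InsnKind kind) {
  switch (kind) {
    case InsnKind::Load: return 0;
    case InsnKind::Irrelevant: return 1;
    case InsnKind::Store: return 2;
  }
  return 1;
}

// Orders a ready list by rank, keeping the original order within a rank.
std::vector<InsnKind> order_ready_list(std::vector<InsnKind> ready) {
  std::stable_sort(ready.begin(), ready.end(),
                   [](InsnKind a, InsnKind b) {
                     return autopref_rank(a) < autopref_rank(b);
                   });
  return ready;
}
```

A stable sort is used so that instructions of the same kind keep whatever
relative order earlier heuristics gave them.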

OK to commit?

Regards,

--
Maxim Kuvyrkov
www.linaro.org




0001-Improve-autoprefetcher-heuristic-partly-fix-regressi.patch
Description: Binary data


Re: [ARM/FDPIC v5 00/21] FDPIC ABI for ARM

2019-08-29 Thread Christophe Lyon

On 29/08/2019 15:57, Christophe Lyon wrote:

Hi,

On 15/05/2019 14:39, Christophe Lyon wrote:

Hello,

This patch series implements the GCC contribution of the FDPIC ABI for
ARM targets.

This ABI makes it possible to run Linux on MMU-less ARM cores and supports
shared libraries to reduce the memory footprint.

Without an MMU, the relative distance between the text and data segments
differs from one process to another, hence the need for a dedicated FDPIC
register holding the start address of the data segment. One of the
side effects is that function pointers require two words to be
represented: the address of the code, and the data segment start
address. These two words are designated as "Function Descriptor",
hence the "FD PIC" name.
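
As a rough mental model of the two-word function descriptor, consider the
sketch below.  It is plain C++ and does not reflect the actual ABI layout or
calling sequence; the global variable merely stands in for the FDPIC register
and every name is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the FDPIC register (illustrative model only).
static std::intptr_t fdpic_register_model = 0;

// Two words: the code address and the data segment start address.
struct FunctionDescriptor {
  int (*entry)(int);
  std::intptr_t data_segment_base;
};

// A callee that locates its data through the modelled register.
int add_data_base(int x) {
  return x + static_cast<int>(fdpic_register_model);
}

// Calling through a descriptor: install the callee's data segment base,
// call the code address, then restore the caller's value.
int call_via_descriptor(const FunctionDescriptor& fd, int arg) {
  std::intptr_t saved = fdpic_register_model;
  fdpic_register_model = fd.data_segment_base;
  int result = fd.entry(arg);
  fdpic_register_model = saved;
  return result;
}
```

The point of the model is that a bare code address is not enough to call a
function: the callee also needs to know where its own data segment lives.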

On ARM, the FDPIC register is r9 [1], and the target name is
arm-uclinuxfdpiceabi. Note that arm-uclinux exists, but uses another
ABI and the BFLAT file format; it does not support code sharing.
The -mfdpic option is enabled by default, and -mno-fdpic should be
used to build the Linux kernel.

This work was developed some time ago by STMicroelectronics, and was
presented during Linaro Connect SFO15 (September 2015). You can watch
the discussion and read the slides [2].
This presentation was related to the toolchain published on github [3],
which is based on binutils-2.22, gcc-4.7, uclibc-0.9.33.2, gdb-7.5.1
and qemu-2.3.0, and for which pre-built binaries are available [3].

The ABI itself is described in detail in [1].

Our Linux kernel patches have been updated and committed by Nicolas
Pitre (Linaro) in July 2017. They are required so that the loader is
able to handle this new file type. Indeed, the ELF files are tagged
with ELFOSABI_ARM_FDPIC. This new tag has been allocated by ARM, as
well as the new relocations involved.

The binutils, QEMU and uclibc-ng patch series were merged a few
months ago. [4][5][6]

This series provides support for architectures that support ARM and/or
Thumb-2 and has been tested on arm-linux-gnueabi without regression,
as well as arm-uclinuxfdpiceabi, using QEMU. arm-uclinuxfdpiceabi has
a few more failures than arm-linux-gnueabi, but is quite functional.

I have also booted an STM32 board (stm32f469) which uses a cortex-m4
with linux-4.20.17 and successfully ran several tools.

Are the GCC patches OK for inclusion in master?


I have addressed the comments I received on v5, and I am going to post updated 
versions of the patches that needed changes as follow-ups in this thread. I 
hope this will help reviewers as I will provide answers and updated patches 
next to their comments. After that, I will rebase the whole series and send it 
as v6 if that helps (several testsuite patches have already been approved 
as-is, but committing them now would change the patch numbering, thus possibly 
confusing reviewers).

However, note that several patches in the series haven't received feedback yet, 
so this is a ping for them :-)
[ARM/FDPIC v5 06/21] [ARM] FDPIC: Add support for c++ exceptions
[ARM/FDPIC v5 10/21] [ARM] FDPIC: Implement TLS support.
[ARM/FDPIC v5 11/21] [ARM] FDPIC: Add support to unwind FDPIC signal frame
[ARM/FDPIC v5 12/21] [ARM] FDPIC: Restore r9 after we call __aeabi_read_tp
[ARM/FDPIC v5 13/21] [ARM] FDPIC: Force LSB bit for PC in Cortex-M architecture



I forgot to mention that I found a problem in libitm's sjlj.S, which warrants
this additional patch.

Christophe



Thanks,

Christophe


Changes between v4 and v5:
- rebased on top of recent gcc-10 master (April 26th, 2019)
- fixed handling of stack-protector combined patterns in FDPIC mode

Changes between v3 and v4:

- improved documentation (patch 1)
- emit an error message (sorry) if the target architecture supports
   neither arm nor thumb-2 mode (patch 4)
- handle Richard's comments on patch 4 (comments, unspec)
- added .align directive (patch 5)
- fixed use of kernel helpers (__kernel_cmpxchg, __kernel_dmb) (patch 6)
- code factorization in patch 7
- typos/internal function name in patch 8
- improved patch 12
- dropped patch 16
- patch 20 introduces arm_arch*_thumb_ok effective targets to help
   skip some tests
- I tested patch 2 on xtensa-buildroot-uclinux-uclibc, it adds many
   new tests, but a few regressions
   (https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00713.html)
- I compiled and executed several LTP tests to exercise pthreads and signals
- I wrote and executed a simple testcase to change the interaction
   with __kernel_cmpxchg (ie. call the kernel helper rather than use an
   implementation in libgcc as requested by Richard)

Changes between v2 and v3:
- added doc entry for -mfdpic new option
- took Kyrill's comments into account (use "Armv7" instead of "7",
   code factorization, use preprocessor instead of hard-coding "r9",
   remove leftover code for thumb1 support, fixed comments)
- rebase over recent trunk
- patches with changes: 1, 2 (commit message), 3 (rebase), 4, 6, 7, 9,
   14 (rebase), 19 (rebase)

Changes between v1 and v2:
- fix GNU coding style
- exit with an error 

Re: [PATCH v2 0/9] S/390: Use signaling FP comparison instructions

2019-08-29 Thread Ilya Leoshkevich
> Am 22.08.2019 um 15:45 schrieb Ilya Leoshkevich :
> 
> Bootstrap and regtest running on x86_64-redhat-linux and
> s390x-redhat-linux.
> 
> This patch series adds signaling FP comparison support (both scalar and
> vector) to s390 backend.

I'm running into a problem on ppc64 with this patch, and it would be
great if someone could help me figure out the best way to resolve it.

vector36.C test is failing because gimplifier produces the following

  _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 };
  _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>;

from

  VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) ,
  { -1, -1, -1, -1 } ,
  { 0, 0, 0, 0 } >

Since the comparison tree code is now hidden behind a temporary, my code
does not have anything to pass to the backend.  The reason for creating
a temporary is that the comparison can trap, and so the following check
in gimplify_expr fails:

  if (gimple_seq_empty_p (internal_post) && (*gimple_test_f) (*expr_p))
goto out;

gimple_test_f is is_gimple_condexpr, and it eventually calls
operation_could_trap_p (GT).

My current solution is to simply state that backend does not support
SSA_NAME in vector comparisons, however, I don't like it, since it may
cause performance regressions due to having to fall back to scalar
comparisons.

I was thinking about two other possible solutions:

1. Change the gimplifier to allow trapping vector comparisons.  That's
   a bit complicated, because tree_could_throw_p checks not only for
   floating point traps, but also e.g. for array index out of bounds
   traps.  So I would have to create a tree_could_throw_p version which
   disregards specific kinds of traps.

2. Change expand_vector_condition to follow SSA_NAME_DEF_STMT and use
   its tree_code instead of SSA_NAME.  The potential problem I see with
   this is that there appears to be no guarantee that _5 will be inlined
   into _6 at a later point.  So if we say that we don't need to fall
   back to scalar comparisons based on availability of vector >
   instruction and inlining does not happen, then what will actually
   be required is vector selection (vsel on S/390), which might not be
   available in the general case.

What would be a better way to proceed here?



Re: [ARM/FDPIC v5 12/21] [ARM] FDPIC: Restore r9 after we call __aeabi_read_tp

2019-08-29 Thread Kyrill Tkachov

Hi Christophe,

On 5/15/19 1:39 PM, Christophe Lyon wrote:

We call __aeabi_read_tp() to get the thread pointer. Since this is a
function call, we have to restore the FDPIC register afterwards.

2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    gcc/
    * config/arm/arm.c (arm_load_tp): Add FDPIC support.
    * config/arm/arm.md (load_tp_soft_fdpic): New pattern.
    (load_tp_soft): Disable in FDPIC mode.

Change-Id: I1f6dfaee6260ecb453270f4971b3c5124317a186

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5fc7a20..26f29c7 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8732,7 +8732,25 @@ arm_load_tp (rtx target)

   rtx tmp;

-  emit_insn (gen_load_tp_soft ());
+  if (TARGET_FDPIC)
+   {
+ rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (3));
+ rtx fdpic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
+ rtx initial_fdpic_reg = get_hard_reg_initial_val (Pmode, FDPIC_REGNUM);

+
+ emit_insn (gen_load_tp_soft_fdpic ());
+
+ /* Restore r9.  */
+ XVECEXP (par, 0, 0) = gen_rtx_UNSPEC (VOIDmode,
+   gen_rtvec (2, fdpic_reg,
+ initial_fdpic_reg),
+ UNSPEC_PIC_RESTORE);
+ XVECEXP (par, 0, 1) = gen_rtx_USE (VOIDmode, initial_fdpic_reg);
+ XVECEXP (par, 0, 2) = gen_rtx_CLOBBER (VOIDmode, fdpic_reg);
+ emit_insn (par);
+   }
+  else
+   emit_insn (gen_load_tp_soft ());

   tmp = gen_rtx_REG (SImode, R0_REGNUM);
   emit_move_insn (target, tmp);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 9036255..0edcb1d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11759,12 +11759,25 @@
 )

 ;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
+(define_insn "load_tp_soft_fdpic"
+  [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
+   (clobber (reg:SI 9))


Use FDPIC_REGNUM here (does it need to be declared at the top of arm.md 
for it to work?)


Otherwise this is ok.

Thanks,

Kyrill




+   (clobber (reg:SI LR_REGNUM))
+   (clobber (reg:SI IP_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_SOFT_TP && TARGET_FDPIC"
+  "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
+  [(set_attr "conds" "clob")
+   (set_attr "type" "branch")]
+)
+
+;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
 (define_insn "load_tp_soft"
   [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
    (clobber (reg:SI LR_REGNUM))
    (clobber (reg:SI IP_REGNUM))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_SOFT_TP"
+  "TARGET_SOFT_TP && !TARGET_FDPIC"
   "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
   [(set_attr "conds" "clob")
    (set_attr "type" "branch")]
--
2.6.3



Re: [ARM/FDPIC v5 12/21] [ARM] FDPIC: Restore r9 after we call __aeabi_read_tp

2019-08-29 Thread Christophe Lyon

Here is an updated version that makes use of the helper 
gen_restore_pic_register_after_call

Christophe


On 15/05/2019 14:39, Christophe Lyon wrote:

We call __aeabi_read_tp() to get the thread pointer. Since this is a
function call, we have to restore the FDPIC register afterwards.

2019-XX-XX  Christophe Lyon  
Mickaël Guêné 

gcc/
* config/arm/arm.c (arm_load_tp): Add FDPIC support.
* config/arm/arm.md (load_tp_soft_fdpic): New pattern.
(load_tp_soft): Disable in FDPIC mode.

Change-Id: I1f6dfaee6260ecb453270f4971b3c5124317a186

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5fc7a20..26f29c7 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8732,7 +8732,25 @@ arm_load_tp (rtx target)
  
rtx tmp;
  
-  emit_insn (gen_load_tp_soft ());

+  if (TARGET_FDPIC)
+   {
+ rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (3));
+ rtx fdpic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
+ rtx initial_fdpic_reg = get_hard_reg_initial_val (Pmode, FDPIC_REGNUM);
+
+ emit_insn (gen_load_tp_soft_fdpic ());
+
+ /* Restore r9.  */
+ XVECEXP (par, 0, 0) = gen_rtx_UNSPEC (VOIDmode,
+   gen_rtvec (2, fdpic_reg,
+  initial_fdpic_reg),
+   UNSPEC_PIC_RESTORE);
+ XVECEXP (par, 0, 1) = gen_rtx_USE (VOIDmode, initial_fdpic_reg);
+ XVECEXP (par, 0, 2) = gen_rtx_CLOBBER (VOIDmode, fdpic_reg);
+ emit_insn (par);
+   }
+  else
+   emit_insn (gen_load_tp_soft ());
  
tmp = gen_rtx_REG (SImode, R0_REGNUM);

emit_move_insn (target, tmp);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 9036255..0edcb1d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11759,12 +11759,25 @@
  )
  
  ;; Doesn't clobber R1-R3.  Must use r0 for the first operand.

+(define_insn "load_tp_soft_fdpic"
+  [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
+   (clobber (reg:SI 9))
+   (clobber (reg:SI LR_REGNUM))
+   (clobber (reg:SI IP_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_SOFT_TP && TARGET_FDPIC"
+  "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
+  [(set_attr "conds" "clob")
+   (set_attr "type" "branch")]
+)
+
+;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
  (define_insn "load_tp_soft"
[(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
 (clobber (reg:SI LR_REGNUM))
 (clobber (reg:SI IP_REGNUM))
 (clobber (reg:CC CC_REGNUM))]
-  "TARGET_SOFT_TP"
+  "TARGET_SOFT_TP && !TARGET_FDPIC"
"bl\\t__aeabi_read_tp\\t@ load_tp_soft"
[(set_attr "conds" "clob")
 (set_attr "type" "branch")]



>From b27af6ffc5423679167b5862764d259598b3bf29 Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Thu, 8 Feb 2018 14:51:07 +0100
Subject: [ARM/FDPIC v6 12/24] [ARM] FDPIC: Restore r9 after we call
 __aeabi_read_tp
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We call __aeabi_read_tp() to get the thread pointer. Since this is a
function call, we have to restore the FDPIC register afterwards.

2019-XX-XX  Christophe Lyon  
	Mickaël Guêné 

	gcc/
	* config/arm/arm.c (arm_load_tp): Add FDPIC support.
	* config/arm/arm.md (load_tp_soft_fdpic): New pattern.
	(load_tp_soft): Disable in FDPIC mode.

Change-Id: I0811cc7c5df8f44dd8b8b1f4caf54c7d3609c414

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 43fe467..9501e8d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8729,7 +8729,18 @@ arm_load_tp (rtx target)
 
   rtx tmp;
 
-  emit_insn (gen_load_tp_soft ());
+  if (TARGET_FDPIC)
+	{
+	  rtx fdpic_reg = gen_rtx_REG (Pmode, FDPIC_REGNUM);
+	  rtx initial_fdpic_reg = get_hard_reg_initial_val (Pmode, FDPIC_REGNUM);
+
+	  emit_insn (gen_load_tp_soft_fdpic ());
+
+	  /* Restore r9.  */
+	  emit_insn (gen_restore_pic_register_after_call(fdpic_reg, initial_fdpic_reg));
+	}
+  else
+	emit_insn (gen_load_tp_soft ());
 
   tmp = gen_rtx_REG (SImode, R0_REGNUM);
   emit_move_insn (target, tmp);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 328d32d..ea015ed 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11700,12 +11700,25 @@
 )
 
 ;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
+(define_insn "load_tp_soft_fdpic"
+  [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
+   (clobber (reg:SI 9))
+   (clobber (reg:SI LR_REGNUM))
+   (clobber (reg:SI IP_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_SOFT_TP && TARGET_FDPIC"
+  "bl\\t__aeabi_read_tp\\t@ load_tp_soft"
+  [(set_attr "conds" "clob")
+   (set_attr "type" "branch")]
+)
+
+;; Doesn't clobber R1-R3.  Must use r0 for the first operand.
 (define_insn "load_tp_soft"
   [(set (reg:SI 0) (unspec:SI [(const_int 0)] UNSPEC_TLS))
(clobber (reg:SI LR_REGNUM))

Re: [ARM/FDPIC v5 09/21] [ARM] FDPIC: Add support for taking address of nested function

2019-08-29 Thread Christophe Lyon

On 31/07/2019 16:44, Christophe Lyon wrote:

On Tue, 16 Jul 2019 at 14:42, Kyrill Tkachov
 wrote:



On 7/16/19 12:18 PM, Kyrill Tkachov wrote:

Hi Christophe

On 5/15/19 1:39 PM, Christophe Lyon wrote:

In FDPIC mode, the trampoline generated to support pointers to nested
functions looks like:

.word trampoline address
.word trampoline GOT address
ldr   r12, [pc, #8]
ldr   r9, [pc, #8]
ldr   pc, [pc, #8]
.word static chain value
.word GOT address
.word function's address

because in FDPIC function pointers are actually pointers to function
descriptors, we have to actually generate a function descriptor for
the trampoline.

2019-XX-XX  Christophe Lyon 
 Mickaël Guêné 

 gcc/
 * config/arm/arm.c (arm_asm_trampoline_template): Add FDPIC
 support.
 (arm_trampoline_init): Likewise.
 (arm_trampoline_init): Likewise.
 * config/arm/arm.h (TRAMPOLINE_SIZE): Likewise.

Change-Id: Idc4d5f629ae4f8d79bdf9623517481d524a0c144

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 40e3f3b..99d13bf 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3976,13 +3976,50 @@ arm_warn_func_return (tree decl)
 .word static chain value
 .word function's address
 XXX FIXME: When the trampoline returns, r8 will be clobbered.  */
+/* In FDPIC mode, the trampoline looks like:
+  .word trampoline address
+  .word trampoline GOT address
+  ldr   r12, [pc, #8] ; #4 for Thumb2
+  ldr   r9,  [pc, #8] ; #4 for Thumb2
+  ldr   pc,  [pc, #8] ; #4 for Thumb2
+  .word static chain value
+  .word GOT address
+  .word function's address
+*/



I think this comment is not right for Thumb2.

These load instructions have 32-bit encodings, even in Thumb2 (they use
high registers).


Andre and Wilco pointed out to me offline that the offset should be #4
for Arm mode.

The Arm ARM at E1.2.3 says:

PC, the program counter

* When executing an A32 instruction, PC reads as the address of the
current instruction plus 8.

* When executing a T32 instruction, PC reads as the address of the
current instruction plus 4.



Yes, it looks like the code is right, and the comment is wrong:
- offset 8 for thumb2 mode
- offset 4 for arm mode
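
To double-check these offsets, here is a small sketch of the arithmetic in C.
It assumes the trampoline layout quoted above: two descriptor words, three
4-byte ldr instructions at byte offsets 8, 12 and 16, then the three data
words at byte offsets 20, 24 and 28.  It also assumes that a PC-relative load
uses the PC value rounded down to a word boundary.  The helper name
ldr_pc_offset is made up for this illustration and is not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: return the immediate needed by
   "ldr rX, [pc, #imm]" located at byte offset INSN so that it loads the
   word at byte offset DATA.  PC reads as INSN + 8 when executing an A32
   instruction and as INSN + 4 when executing a T32 instruction; the
   value is rounded down to a word boundary (a no-op in A32).  */
int32_t
ldr_pc_offset (uint32_t insn, uint32_t data, int is_thumb)
{
  uint32_t pc = (insn + (is_thumb ? 4u : 8u)) & ~3u;
  return (int32_t) (data - pc);
}
```

With that layout, all three loads need the same immediate in a given mode,
for example ldr_pc_offset (8, 20, 0) for the static chain load in Arm mode
and ldr_pc_offset (8, 20, 1) for the same load in Thumb-2 mode.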


Here is the updated version


Thanks,

Christophe


Thanks,

Kyrill




Also, please merge this comment with the one above (no separate /**/)



  static void
  arm_asm_trampoline_template (FILE *f)
  {
fprintf (f, "\t.syntax unified\n");

-  if (TARGET_ARM)
+  if (TARGET_FDPIC)
+{
+  /* The first two words are a function descriptor pointing to the
+trampoline code just below.  */
+  if (TARGET_ARM)
+   fprintf (f, "\t.arm\n");
+  else if (TARGET_THUMB2)
+   fprintf (f, "\t.thumb\n");
+  else
+   /* Only ARM and Thumb-2 are supported.  */
+   gcc_unreachable ();
+
+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+  /* Trampoline code which sets the static chain register but also
+PIC register before jumping into real code. */
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  STATIC_CHAIN_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PIC_OFFSET_TABLE_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);
+  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
+  PC_REGNUM, PC_REGNUM,
+  TARGET_THUMB2 ? 8 : 4);



As above, I think the offset should be 8 for both Arm and Thumb2.

Thanks,

Kyrill



+  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
+}
+  else if (TARGET_ARM)
  {
fprintf (f, "\t.arm\n");
asm_fprintf (f, "\tldr\t%r, [%r, #0]\n", STATIC_CHAIN_REGNUM,
PC_REGNUM);
@@ -4023,12 +4060,40 @@ arm_trampoline_init (rtx m_tramp, tree fndecl,
rtx chain_value)
emit_block_move (m_tramp, assemble_trampoline_template (),
 GEN_INT (TRAMPOLINE_SIZE), BLOCK_OP_NORMAL);

-  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 8 : 12);
-  emit_move_insn (mem, chain_value);
+  if (TARGET_FDPIC)
+{
+  rtx funcdesc = XEXP (DECL_RTL (fndecl), 0);
+  rtx fnaddr = gen_rtx_MEM (Pmode, funcdesc);
+  rtx gotaddr = gen_rtx_MEM (Pmode, plus_constant (Pmode,
funcdesc, 4));
+  /* The function start address is at offset 8, but in Thumb mode
+we want bit 0 set to 1 to indicate Thumb-ness, hence 9
+below.  */
+  rtx trampoline_code_start
+   = plus_constant (Pmode, XEXP (m_tramp, 0), TARGET_THUMB2 ? 9

: 8);

+
+  /* Write initial funcdesc which points to the trampoline.  */
+  mem = adjust_address (m_tramp, SImode, 0);
+  emit_move_insn (mem, 

Re: [ARM/FDPIC v5 05/21] [ARM] FDPIC: Fix __do_global_dtors_aux and frame_dummy generation

2019-08-29 Thread Christophe Lyon

On 12/07/2019 08:06, Richard Sandiford wrote:

Christophe Lyon  writes:

In FDPIC, we need to make sure __do_global_dtors_aux and frame_dummy
are referenced by their address, not by pointers to the function
descriptors.

2019-XX-XX  Christophe Lyon  
Mickaël Guêné 

* libgcc/crtstuff.c: Add support for FDPIC.

Change-Id: I0bc4b1232fbf3c69068fb23a1b9cafc895d141b1

diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
index 4927a9f..159b461 100644
--- a/libgcc/crtstuff.c
+++ b/libgcc/crtstuff.c
@@ -429,9 +429,18 @@ __do_global_dtors_aux (void)
  #ifdef FINI_SECTION_ASM_OP
  CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
  #elif defined (FINI_ARRAY_SECTION_ASM_OP)
+#if defined(__FDPIC__)
+__asm__(
+"   .section .fini_array\n"
+"   .align 2\n"
+"   .word __do_global_dtors_aux\n"
+);
+asm (TEXT_SECTION_ASM_OP);
+#else /* defined(__FDPIC__) */
  static func_ptr __do_global_dtors_aux_fini_array_entry[]
__attribute__ ((__used__, section(".fini_array"), aligned(sizeof(func_ptr))))
= { __do_global_dtors_aux };
+#endif /* defined(__FDPIC__) */
  #else /* !FINI_SECTION_ASM_OP && !FINI_ARRAY_SECTION_ASM_OP */
  static void __attribute__((used))
  __do_global_dtors_aux_1 (void)


It'd be good to avoid hard-coding the pointer size.  Would it work to do:

__asm__("\t.equ\t__do_global_dtors_aux_alias, __do_global_dtors_aux\n");
extern char __do_global_dtors_aux_alias;
static void *__do_global_dtors_aux_fini_array_entry[]
__attribute__ ((__used__, section(".fini_array"), aligned(sizeof(void *))))
= { &__do_global_dtors_aux_alias };

?  Similarly for the init_array.


OK, done.


AFAICT this and 02/21 are the only patches that aren't Arm-specific,
is that right?

Thanks,
Richard



>From ea0eee1ddeddef92277ae68eac4af28994c2902c Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Thu, 8 Feb 2018 11:12:52 +0100
Subject: [ARM/FDPIC v6 05/24] [ARM] FDPIC: Fix __do_global_dtors_aux and
 frame_dummy generation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In FDPIC, we need to make sure __do_global_dtors_aux and frame_dummy
are referenced by their address, not by pointers to the function
descriptors.

2019-XX-XX  Christophe Lyon  
	Mickaël Guêné 

	libgcc/
	* libgcc/crtstuff.c: Add support for FDPIC.

Change-Id: I0bc4b1232fbf3c69068fb23a1b9cafc895d141b1

diff --git a/libgcc/crtstuff.c b/libgcc/crtstuff.c
index 4927a9f..6659039 100644
--- a/libgcc/crtstuff.c
+++ b/libgcc/crtstuff.c
@@ -429,9 +429,17 @@ __do_global_dtors_aux (void)
 #ifdef FINI_SECTION_ASM_OP
 CRT_CALL_STATIC_FUNCTION (FINI_SECTION_ASM_OP, __do_global_dtors_aux)
 #elif defined (FINI_ARRAY_SECTION_ASM_OP)
+#if defined(__FDPIC__)
+__asm__("\t.equ\t__do_global_dtors_aux_alias, __do_global_dtors_aux\n");
+extern char __do_global_dtors_aux_alias;
+static void *__do_global_dtors_aux_fini_array_entry[]
+__attribute__ ((__used__, section(".fini_array"), aligned(sizeof(void *))))
+ = { &__do_global_dtors_aux_alias };
+#else /* defined(__FDPIC__) */
 static func_ptr __do_global_dtors_aux_fini_array_entry[]
   __attribute__ ((__used__, section(".fini_array"), aligned(sizeof(func_ptr))))
   = { __do_global_dtors_aux };
+#endif /* defined(__FDPIC__) */
 #else /* !FINI_SECTION_ASM_OP && !FINI_ARRAY_SECTION_ASM_OP */
 static void __attribute__((used))
 __do_global_dtors_aux_1 (void)
@@ -473,9 +481,17 @@ frame_dummy (void)
 #ifdef __LIBGCC_INIT_SECTION_ASM_OP__
 CRT_CALL_STATIC_FUNCTION (__LIBGCC_INIT_SECTION_ASM_OP__, frame_dummy)
 #else /* defined(__LIBGCC_INIT_SECTION_ASM_OP__) */
+#if defined(__FDPIC__)
+__asm__("\t.equ\t__frame_dummy_alias, frame_dummy\n");
+extern char __frame_dummy_alias;
+static void *__frame_dummy_init_array_entry[]
+__attribute__ ((__used__, section(".init_array"), aligned(sizeof(void *))))
+ = { &__frame_dummy_alias };
+#else /* defined(__FDPIC__) */
 static func_ptr __frame_dummy_init_array_entry[]
   __attribute__ ((__used__, section(".init_array"), aligned(sizeof(func_ptr))))
   = { frame_dummy };
+#endif /* defined(__FDPIC__) */
 #endif /* !defined(__LIBGCC_INIT_SECTION_ASM_OP__) */
 #endif /* USE_EH_FRAME_REGISTRY || USE_TM_CLONE_REGISTRY */
 
-- 
2.6.3



Re: [ARM/FDPIC v5 04/21] [ARM] FDPIC: Add support for FDPIC for arm architecture

2019-08-29 Thread Christophe Lyon

On 16/07/2019 13:58, Richard Sandiford wrote:

Christophe Lyon  writes:

The FDPIC register is hard-coded to r9, as defined in the ABI.

We have to disable tailcall optimizations if we don't know if the
target function is in the same module. If not, we have to set r9 to
the value associated with the target module.
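
As a rough illustration of the tailcall rule described above, the decision
can be sketched as a small C predicate.  This is not the GCC code: the name
fdpic_sibcall_ok and its two boolean parameters are hypothetical stand-ins
for the "decl == NULL" and "targetm.binds_local_p (decl)" checks that the
patch adds to arm_function_ok_for_sibcall.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustration only, not GCC code.  HAVE_DECL models "decl != NULL" and
   BINDS_LOCAL models "targetm.binds_local_p (decl)".  */
bool
fdpic_sibcall_ok (bool have_decl, bool binds_local)
{
  /* Without a decl, the callee could live in another module and need a
     different r9 value.  */
  if (!have_decl)
    return false;

  /* A call through the PLT corrupts r9, so do not tailcall.  */
  if (!binds_local)
    return false;

  return true;
}
```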

When generating a symbol address, we have to take into account whether
it is a pointer to data or to a function, because different
relocations are needed.

2019-XX-XX  Christophe Lyon  
Mickaël Guêné 

* config/arm/arm-c.c (__FDPIC__): Define new pre-processor macro
in FDPIC mode.
* config/arm/arm-protos.h (arm_load_function_descriptor): Declare
new function.
* config/arm/arm.c (arm_option_override): Define pic register to
FDPIC_REGNUM.
(arm_function_ok_for_sibcall): Disable sibcall optimization if we
have no decl or go through PLT.
(arm_load_pic_register): Handle TARGET_FDPIC.
(arm_is_segment_info_known): New function.
(arm_pic_static_addr): Add support for FDPIC.
(arm_load_function_descriptor): New function.
(arm_assemble_integer): Add support for FDPIC.
* config/arm/arm.h (PIC_OFFSET_TABLE_REG_CALL_CLOBBERED):
Define. (FDPIC_REGNUM): New define.
* config/arm/arm.md (call): Add support for FDPIC.
(call_value): Likewise.
(*restore_pic_register_after_call): New pattern.
(untyped_call): Disable if FDPIC.
(untyped_return): Likewise.
* config/arm/unspecs.md (UNSPEC_PIC_RESTORE): New.

Change-Id: I8fb1a6b85ace672184013568c5d28fbda2f7fda4

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 6e256ee..34695fa 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -203,6 +203,8 @@ arm_cpu_builtins (struct cpp_reader* pfile)
builtin_define ("__ARM_EABI__");
  }
  
+  def_or_undef_macro (pfile, "__FDPIC__", TARGET_FDPIC);

+
def_or_undef_macro (pfile, "__ARM_ARCH_EXT_IDIV__", TARGET_IDIV);
def_or_undef_macro (pfile, "__ARM_FEATURE_IDIV", TARGET_IDIV);
  
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h

index 485bc68..272968a 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -139,6 +139,7 @@ extern int arm_max_const_double_inline_cost (void);
  extern int arm_const_double_inline_cost (rtx);
  extern bool arm_const_double_by_parts (rtx);
  extern bool arm_const_double_by_immediates (rtx);
+extern rtx arm_load_function_descriptor (rtx funcdesc);
  extern void arm_emit_call_insn (rtx, rtx, bool);
  bool detect_cmse_nonsecure_call (tree);
  extern const char *output_call (rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 45abcd8..d9397b5 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3485,6 +3485,15 @@ arm_option_override (void)
if (flag_pic && TARGET_VXWORKS_RTP)
  arm_pic_register = 9;
  
+  /* If in FDPIC mode then force arm_pic_register to be r9.  */

+  if (TARGET_FDPIC)
+{
+  arm_pic_register = FDPIC_REGNUM;
+  if (! TARGET_ARM && ! TARGET_THUMB2)
+   sorry ("FDPIC mode is supported on architecture versions that "
+  "support ARM or Thumb-2 only.");
+}
+
if (arm_pic_register_string != NULL)
  {
int pic_register = decode_reg_name (arm_pic_register_string);


Isn't this equivalent to rejecting Thumb-1?  I think that would be
clearer in both the condition and the error message.


Right, fixed.


How does this interact with arm_pic_data_is_text_relative?  Are both
values supported?

It doesn't interact well... it only works with the default value.
Otherwise, there are compiler crashes.




@@ -7295,6 +7304,21 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
if (cfun->machine->sibcall_blocked)
  return false;
  
+  if (TARGET_FDPIC)

+{
+  /* In FDPIC, never tailcall something for which we have no decl:
+the target function could be in a different module, requiring
+a different FDPIC register value.  */
+  if (decl == NULL)
+   return false;
+
+  /* Don't tailcall if we go through the PLT since the FDPIC
+register is then corrupted and we don't restore it after
+static function calls.  */
+  if (!targetm.binds_local_p (decl))
+   return false;
+}
+
/* Never tailcall something if we are generating code for Thumb-1.  */
if (TARGET_THUMB1)
  return false;
@@ -7711,7 +7735,9 @@ arm_load_pic_register (unsigned long saved_regs 
ATTRIBUTE_UNUSED, rtx pic_reg)
  {
rtx l1, labelno, pic_tmp, pic_rtx;
  
-  if (crtl->uses_pic_offset_table == 0 || TARGET_SINGLE_PIC_BASE)

+  if (crtl->uses_pic_offset_table == 0
+  || TARGET_SINGLE_PIC_BASE
+  || TARGET_FDPIC)
  return;
  
gcc_assert (flag_pic);

@@ -7780,28 +7806,142 @@ arm_load_pic_register (unsigned long saved_regs 
ATTRIBUTE_UNUSED, rtx pic_reg)
emit_use (pic_reg);
  }
  
+/* Try to determine 

[Bug target/91598] [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

--- Comment #2 from Maxim Kuvyrkov  ---
Created attachment 46784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46784=edit
Patch for 70% of the regression

Re: [ARM/FDPIC v5 13/21] [ARM] FDPIC: Force LSB bit for PC in Cortex-M architecture

2019-08-29 Thread Kyrill Tkachov

Hi Christophe,

On 5/15/19 1:39 PM, Christophe Lyon wrote:

Without this, when we are unwinding across a signal frame we can jump
to an even address which leads to an exception.

This is needed in __gnu_personality_sigframe_fdpic() when restoring the
PC from the signal frame since the PC saved by the kernel has the LSB
bit set to zero.
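
A minimal sketch of the idea in C, separate from the real unwinder: the
function name restore_core_reg and the thumb_only flag are hypothetical, and
the flag stands in for whatever preprocessor condition the patch ends up
using.

```c
#include <assert.h>
#include <stdint.h>

#define PC_REGNO 15

/* Illustration only, not libgcc code: return the value to store into
   core register REGNO.  When the target only runs Thumb code, bit 0 of
   a restored PC is forced to 1 so that a later branch stays in Thumb
   state.  */
uint32_t
restore_core_reg (int regno, uint32_t value, int thumb_only)
{
  if (thumb_only && regno == PC_REGNO)
    value |= 1u;
  return value;
}
```

For example, a PC of 0x8000 saved by the kernel would be restored as 0x8001
on a Thumb-only target, while other registers are left untouched.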

2019-XX-XX  Christophe Lyon  
    Mickaël Guêné 

    libgcc/
    * config/arm/unwind-arm.c (_Unwind_VRS_Set): Handle v7m
    architecture.

Change-Id: Ie84de548226bcf1751e19a09e8f091fb3013ccea

diff --git a/libgcc/config/arm/unwind-arm.c 
b/libgcc/config/arm/unwind-arm.c

index 9ba73e7..ba47150 100644
--- a/libgcc/config/arm/unwind-arm.c
+++ b/libgcc/config/arm/unwind-arm.c
@@ -199,6 +199,11 @@ _Unwind_VRS_Result _Unwind_VRS_Set 
(_Unwind_Context *context,

 return _UVRSR_FAILED;

   vrs->core.r[regno] = *(_uw *) valuep;
+#if defined(__ARM_ARCH_7M__)
+  /* Force LSB bit since we always run thumb code.  */
+  if (regno == 15)
+   vrs->core.r[regno] |= 1;
+#endif


Hmm, this looks quite specific. There are other architectures that are 
thumb-only too (6-M, 7E-M etc).


Would checking for __thumb__ be better?

Thanks,

Kyrill



   return _UVRSR_OK;

 case _UVRSC_VFP:
--
2.6.3



Re: Question about make_extraction() in combine.c

2019-08-29 Thread Michael Eager

On 8/28/19 12:33 PM, Segher Boessenkool wrote:

Hi!

On Tue, Aug 27, 2019 at 09:37:59AM -0700, Michael Eager wrote:

Combine is complex, but I don't think that target descriptions should
conform to its behaviors;


But they have to, in some ways.  If combine writes something that can be
written in multiple ways in some way X, then your machine description has
to recognise X (perhaps in addition to other ways it can be written), or
you will not get as much optimisation as you might like: some combine
attempts will fail.


combine should adapt to the target.


How?


By not making arbitrary restrictions on the instructions which a target 
can implement, simply to avoid a bug in a different target.


The target has an instruction which can extract a bit (or bit field) 
from a word in memory.  The code in combine prevents that instruction 
from being used without creating awkward workarounds.





diff --git a/gcc/combine.c b/gcc/combine.c
index 93bd3da26d7..fdc79ab7d3e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -,17 +,7 @@ make_extraction (machine_mode mode, rtx inner, 
HOST_WIDE_INT pos,


This patch is against some older version of combine.c?  The line number is
off by 70 or so.


The patch was created last November.


&& partial_subreg_p (extraction_mode, mode))
  extraction_mode = mode;


And current trunk has here

   /* Punt if len is too large for extraction_mode.  */
   if (maybe_gt (len, GET_MODE_PRECISION (extraction_mode)))
 return NULL_RTX;

(See r268913, https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01140.html ).


This patch seems unrelated.


Does that fix your problem already?  Is more needed?  Is your patch
removing some now-dead code?


If you are asking, does my patch remove code that is no longer needed to
fix the decades-old problem which it fixed (or hid)?  I suspect it does.
But this cannot be verified.  There's no test case for the original
problem, nor is it clear which architecture had the problem.


--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306


[PATCH V6 05/11] bpf: new GCC port

2019-08-29 Thread Jose E. Marchesi


This patch adds a port for the Linux kernel eBPF architecture to GCC.

ChangeLog:

  * configure.ac: Support for bpf-*-* targets.
  * configure: Regenerate.

contrib/ChangeLog:

  * config-list.mk (LIST): Disable go in bpf-*-* targets.

gcc/ChangeLog:

  * config.gcc: Support for bpf-*-* targets.
  * common/config/bpf/bpf-common.c: New file.
  * config/bpf/t-bpf: Likewise.
  * config/bpf/predicates.md: Likewise.
  * config/bpf/constraints.md: Likewise.
  * config/bpf/bpf.opt: Likewise.
  * config/bpf/bpf.md: Likewise.
  * config/bpf/bpf.h: Likewise.
  * config/bpf/bpf.c: Likewise.
  * config/bpf/bpf-protos.h: Likewise.
  * config/bpf/bpf-opts.h: Likewise.
  * config/bpf/bpf-helpers.h: Likewise.
  * config/bpf/bpf-helpers.def: Likewise.
---
 ChangeLog  |   5 +
 configure  |  54 ++-
 configure.ac   |  54 ++-
 contrib/ChangeLog  |   4 +
 contrib/config-list.mk |   3 +-
 gcc/ChangeLog  |  16 +
 gcc/common/config/bpf/bpf-common.c |  55 +++
 gcc/config.gcc |   9 +
 gcc/config/bpf/bpf-helpers.def | 194 
 gcc/config/bpf/bpf-helpers.h   | 327 +
 gcc/config/bpf/bpf-opts.h  |  56 +++
 gcc/config/bpf/bpf-protos.h|  33 ++
 gcc/config/bpf/bpf.c   | 948 +
 gcc/config/bpf/bpf.h   | 539 +
 gcc/config/bpf/bpf.md  | 497 +++
 gcc/config/bpf/bpf.opt | 123 +
 gcc/config/bpf/constraints.md  |  32 ++
 gcc/config/bpf/predicates.md   |  64 +++
 gcc/config/bpf/t-bpf   |   0
 19 files changed, 3010 insertions(+), 3 deletions(-)
 create mode 100644 gcc/common/config/bpf/bpf-common.c
 create mode 100644 gcc/config/bpf/bpf-helpers.def
 create mode 100644 gcc/config/bpf/bpf-helpers.h
 create mode 100644 gcc/config/bpf/bpf-opts.h
 create mode 100644 gcc/config/bpf/bpf-protos.h
 create mode 100644 gcc/config/bpf/bpf.c
 create mode 100644 gcc/config/bpf/bpf.h
 create mode 100644 gcc/config/bpf/bpf.md
 create mode 100644 gcc/config/bpf/bpf.opt
 create mode 100644 gcc/config/bpf/constraints.md
 create mode 100644 gcc/config/bpf/predicates.md
 create mode 100644 gcc/config/bpf/t-bpf

diff --git a/configure.ac b/configure.ac
index 1fe97c001cc..b8ce2ad20b9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -638,6 +638,9 @@ case "${target}" in
 # No hosted I/O support.
 noconfigdirs="$noconfigdirs target-libssp"
 ;;
+  bpf-*-*)
+noconfigdirs="$noconfigdirs target-libssp"
+;;
   powerpc-*-aix* | rs6000-*-aix*)
 noconfigdirs="$noconfigdirs target-libssp"
 ;;
@@ -672,12 +675,43 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
 avr-*-*)
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   ;;
+bpf-*-*)
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
 ft32-*-*)
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   ;;
   esac
 fi
 
+# Disable C++ on systems where it is known to not work.
+# For testing, you can override this with --enable-languages=c++.
+case ,${enable_languages}, in
+  *,c++,*)
+;;
+  *)
+  case "${target}" in
+bpf-*-*)
+  unsupported_languages="$unsupported_languages c++"
+  ;;
+  esac
+  ;;
+esac
+
+# Disable Objc on systems where it is known to not work.
+# For testing, you can override this with --enable-languages=objc.
+case ,${enable_languages}, in
+  *,objc,*)
+;;
+  *)
+  case "${target}" in
+bpf-*-*)
+  unsupported_languages="$unsupported_languages objc"
+  ;;
+  esac
+  ;;
+esac
+
 # Disable D on systems where it is known to not work.
 # For testing, you can override this with --enable-languages=d.
 case ,${enable_languages}, in
@@ -687,6 +721,9 @@ case ,${enable_languages}, in
 case "${target}" in
   *-*-darwin*)
unsupported_languages="$unsupported_languages d"
+;;
+  bpf-*-*)
+   unsupported_languages="$unsupported_languages d"
;;
 esac
 ;;
@@ -715,6 +752,9 @@ case "${target}" in
 # See .
 unsupported_languages="$unsupported_languages fortran"
 ;;
+  bpf-*-*)
+unsupported_languages="$unsupported_languages fortran"
+;;
 esac
 
 # Disable libffi for some systems.
@@ -761,6 +801,9 @@ case "${target}" in
   arm*-*-symbianelf*)
 noconfigdirs="$noconfigdirs target-libffi"
 ;;
+  bpf-*-*)
+noconfigdirs="$noconfigdirs target-libffi"
+;;
   cris-*-* | crisv32-*-*)
 case "${target}" in
   *-*-linux*)
@@ -807,7 +850,7 @@ esac
 # Disable the go frontend on systems where it is known to not work. Please keep
 # this in sync with contrib/config-list.mk.
 case "${target}" in
-*-*-darwin* | *-*-cygwin* | *-*-mingw*)
+*-*-darwin* | *-*-cygwin* | *-*-mingw* | bpf-* )
 unsupported_languages="$unsupported_languages go"
 

[PATCH][GCC] Complex division improvements in GCC

2019-08-29 Thread Elen Kalda
Hi all,

Advice and help needed! 

This patch makes changes to the complex division in GCC. The algorithm
used is the same as in https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01629.html -
same problems, same improvement in robustness, same loss in accuracy.

Since Baudin adds another underflow check, two more basic blocks get added 
during the cplxlower1 pass. 

No problems with bootstrap on aarch64-none-linux-gnu. Unsurprisingly, there
are regressions in gcc/testsuite/gcc.dg/torture/builtin-math-7.c. As in the 
patch linked above, the regressions in that test are due to the loss in 
accuracy.

To evaluate the performance, the same test which generates 360 000 000 random 
numbers was used. Doing one less division results in a nice 11.32% 
improvement in performance:

        | CPU time
--------+-----------
smiths  | 7 290 996
b1div   | 6 465 590
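
For reference, the quoted improvement follows from the two CPU times above
(this is just the arithmetic, written out):

```latex
\[
\frac{7\,290\,996 - 6\,465\,590}{7\,290\,996}
  = \frac{825\,406}{7\,290\,996}
  \approx 0.1132 = 11.32\%
\]
```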

This implementation works (in the sense that it produces an expected result),
but it could be made more efficient and clean. As an example, the cplxlower1
pass assigns one variable to another variable, which seems redundant:

[...]

  [local count: 1063004407]:
  # i_19 = PHI <0(2), i_15(7)>
  _9 = REALPART_EXPR ;
  _7 = IMAGPART_EXPR ;
  _1 = COMPLEX_EXPR <_9, _7>;
  _18 = REALPART_EXPR ;
  _17 = IMAGPART_EXPR ;
  _2 = COMPLEX_EXPR <_18, _17>;
  _16 = ABS_EXPR <_18>;
  _21 = ABS_EXPR <_17>;
  _22 = _16 > _21;
  if (_22 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 531502204]:
  _23 = _17 / _18;
  _24 = _17 * _23;
  _25 = _18 + _24;
  _26 = 1.0e+0 / _25;
  _27 = _23 == 0.0;
  if (_27 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 265751102]:
  _28 = _7 / _18;
  _29 = _17 * _28;
  _30 = _9 + _29;
  _31 = _26 * _30;
  _32 = _9 / _18;
  _33 = _17 * _32;
  _34 = _7 - _33;
  _35 = _26 * _34;
  _83 = _31; <--- could these extra assignments be avoided?
  _84 = _35; <---|
  goto ; [100.00%]

   [local count: 265751102]:
  _36 = _7 * _23;
  _37 = _9 + _36;
  _38 = _26 * _37;
  _39 = _9 * _23;
  _40 = _7 - _39;
  _41 = _26 * _40;
  _81 = _38;
  _82 = _41;

   [local count: 531502204]:
  # _71 = PHI <_83(12), _81(13)>
  # _72 = PHI <_84(12), _82(13)>
  _85 = _71;
  _86 = _72;
  goto ; [100.00%]

   [local count: 531502204]:
  _42 = _18 / _17;
  _43 = _18 * _42;
  _44 = _17 + _43;
  _45 = 1.0e+0 / _44;
  _46 = _42 == 0.0;
  if (_46 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 265751102]:
  _47 = _9 / _17;
  _48 = _18 * _47;
  _49 = _7 + _48;
  _50 = _45 * _49;
  _51 = _7 / _17;
  _52 = _18 * _51;
  _53 = _9 - _52;
  _54 = _45 * _53;
  _77 = _50;
  _78 = _54;
  goto ; [100.00%]

   [local count: 265751102]:
  _55 = _9 * _42;
  _56 = _7 + _55;
  _57 = _45 * _56;
  _58 = _7 * _42;
  _59 = _9 - _58;
  _60 = _45 * _59;
  _75 = _57;
  _76 = _60;

   [local count: 531502204]:
  # _73 = PHI <_77(15), _75(16)>
  # _74 = PHI <_78(15), _76(16)>
  _61 = -_74;
  _79 = _73;
  _80 = _61;

[...]
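
For readers who prefer source to GIMPLE, here is a rough C sketch of the
arithmetic the dump above performs. It is reconstructed by hand from the dump
and is not the code in the patch: the function name complex_div_sketch and the
helper abs_d are made up for illustration, a and b stand for the real and
imaginary parts of the dividend (_9 and _7), and c and d for those of the
divisor (_18 and _17).

```c
/* Absolute value without pulling in libm.  */
static double
abs_d (double x)
{
  return x < 0.0 ? -x : x;
}

/* Compute (a + b*i) / (c + d*i).  The ratio r and the reciprocal t are
   the only divisions on the common path; everything else is a multiply.  */
void
complex_div_sketch (double a, double b, double c, double d,
                    double *re, double *im)
{
  if (abs_d (c) > abs_d (d))
    {
      double r = d / c;
      double t = 1.0 / (c + d * r);
      if (r == 0.0)
        {
          /* r is zero (d is zero or d / c underflowed): divide first so
             the products with d are not lost.  */
          *re = (a + d * (b / c)) * t;
          *im = (b - d * (a / c)) * t;
        }
      else
        {
          *re = (a + b * r) * t;
          *im = (b - a * r) * t;
        }
    }
  else
    {
      double r = c / d;
      double t = 1.0 / (d + c * r);
      if (r == 0.0)
        {
          *re = (b + c * (a / d)) * t;
          *im = -((a - c * (b / d)) * t);
        }
      else
        {
          *re = (b + a * r) * t;
          *im = -((a - b * r) * t);
        }
    }
}
```

Each of the four leaves ends in plain copies to the result, which is where the
extra assignments flagged in the dump come from.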

Best wishes,
Elen

gcc/ChangeLog:

2019-08-29  Elen Kalda  

* fold-const.c (fold_negate_const): Make the fold_negate_const function
non-static.
(const_binop): Implement Baudin's algorithm for complex division.
* fold-const.h (fold_negate_const): Add a fold_negate_const function
declaration.
* tree-complex.c (complex_div_internal_wide): New function to aid with the
wide complex division.
(expand_complex_div_wide): Implement Baudin's algorithm for complex
division.
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 54c850a3ee1f5db7c20fc8ab07ea504d634b55b8..71c1631881b693f973fa9ef94154abb02064e1c1 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -194,6 +194,7 @@ extern tree const_binop (enum tree_code, tree, tree, tree);
 extern bool negate_mathfn_p (combined_fn);
 extern const char *c_getstr (tree, unsigned HOST_WIDE_INT * = NULL);
 extern wide_int tree_nonzero_bits (const_tree);
+extern tree fold_negate_const (tree arg0, tree type);
 
 /* Return OFF converted to a pointer offset type suitable as offset for
POINTER_PLUS_EXPR.  Use location LOC for this conversion.  */
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0bd68b5e2d484d6f3be52b1d38be5a9f41637355..e4ea9046fbf010861b726dd742fd3834b35e80ec 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -133,7 +133,7 @@ static tree fold_binary_op_with_conditional_arg (location_t,
 		 enum tree_code, tree,
 		 tree, tree,
 		 tree, tree, int);
-static tree fold_negate_const (tree, tree);
+tree fold_negate_const (tree, tree);
 static tree fold_not_const (const_tree, tree);
 static tree fold_relational_const (enum tree_code, tree, tree, tree);
 static tree fold_convert_const (enum tree_code, tree, tree);
@@ -1387,7 +1387,9 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
   tree i1 = TREE_IMAGPART (arg1);
   tree r2 = TREE_REALPART (arg2);
   tree i2 = TREE_IMAGPART (arg2);
+
   tree real, imag;
+  imag = real = NULL_TREE;
 
   switch 

[PATCH V6 02/11] opt-functions.awk: fix comparison of limit, begin and end

2019-08-29 Thread Jose E. Marchesi
The function integer_range_info makes sure that, if provided, the
initial value falls within the specified range.  However, it is necessary
to convert the values to a numerical context before comparing, to make
sure awk is using arithmetical order and not lexicographical order.
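
To see why the ordering matters, here is a small C analogue of the two
comparisons. It is only an illustration of the pitfall, not a translation of
the awk code: the function names are invented, strcmp stands in for awk's
string comparison, and strtol plays the role of the "+ 0" conversion.

```c
#include <stdlib.h>
#include <string.h>

/* Range check on the textual form.  Strings are compared character by
   character, so "5" sorts after "10".  */
int
in_range_as_text (const char *init, const char *start, const char *end)
{
  return strcmp (init, start) >= 0 && strcmp (init, end) <= 0;
}

/* Range check after converting every operand to a number first.  */
int
in_range_as_number (const char *init, const char *start, const char *end)
{
  long ival = strtol (init, NULL, 10);

  return ival >= strtol (start, NULL, 10) && ival <= strtol (end, NULL, 10);
}
```

With an initial value of "5" and a range of "0" to "10", the textual check
rejects the value while the numeric one accepts it.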

gcc/ChangeLog:

* opt-functions.awk (integer_range_info): Make sure values are in
numeric context before operating with them.
---
 gcc/ChangeLog | 5 +
 gcc/opt-functions.awk | 7 ---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/opt-functions.awk b/gcc/opt-functions.awk
index 1190e6d6b66..c1da80c648c 100644
--- a/gcc/opt-functions.awk
+++ b/gcc/opt-functions.awk
@@ -346,9 +346,10 @@ function search_var_name(name, opt_numbers, opts, flags, 
n_opts)
 function integer_range_info(range_option, init, option)
 {
 if (range_option != "") {
-   start = nth_arg(0, range_option);
-   end = nth_arg(1, range_option);
-   if (init != "" && init != "-1" && (init < start || init > end))
+   ival = init + 0;
+   start = nth_arg(0, range_option) + 0;
+   end = nth_arg(1, range_option) + 0;
+   if (init != "" && init != "-1" && (ival < start || ival > end))
  print "#error initial value " init " of '" option "' must be in range 
[" start "," end "]"
return start ", " end
 }
-- 
2.11.0



[PATCH V6 00/11] eBPF support for GCC

2019-08-29 Thread Jose E. Marchesi
[Differences from V5:
. Use TARGET_BIG_ENDIAN instead of TARGET_LITTLE_ENDIAN, and make sure
  ASM_SPEC always passes an endianness selector argument to the
  assembler.
. De-obfuscate the usage of arg.type_size_in_bytes and
  arg_aggregate-type_p.
. Increase the cumulative argument unconditionally in
  bpf_function_arg_advance.
. Simplify conditional in bpf_print_operand_address.
. Remove predicates reg_or_memory_operand and mov_dst_operand, because
  they became the same as nonimmediate_operand.
. The kernel helper get_current_task() doesn't accept arguments.
  Adjust the helper definition accordingly, and the corresponding
  test.]

Hi people!

This patch series introduces a port of GCC to eBPF, which is a virtual
machine that resides in the Linux kernel.  Initially intended for
user-level packet capture and filtering, eBPF has since been
generalized into a general-purpose infrastructure that also serves
non-networking purposes.

The binutils support is already upstream.  See
https://sourceware.org/ml/binutils/2019-05/msg00306.html.

eBPF architecture and ABI
=========================
   
Documentation for eBPF can be found in the Linux kernel source tree,
file Documentation/networking/filter.txt.  It covers the instruction
set, the way the interpreter works and the many restrictions imposed
by the kernel verifier.
   
As for the ABI, at this moment compiled eBPF doesn't have very well
established conventions.  The details of what is expected to be in an
ELF file containing eBPF are determined, in practice, by what the llvm
BPF backend generates and what is expected by the two existing
kernel loaders: bpf_load.c and libbpf.

We hope that the addition of this port to the GNU toolchain will help
to mature this domain.

Overview of the patch series
============================
   
The first few patches are preparatory:

. The first patch updates config.guess and config.sub from the
  'config' upstream project, in order to recognize bpf-*-* triplets.

. The second patch fixes an integrity check in opt-functions.awk.

. The third patch annotates many tests in the gcc.c-torture/compile
  testsuite with their requirements in terms of stack frame size,
  using the existing dg-require-stack-size machinery.

. The fourth patch introduces a new effective target keyword called
  indirect_calls, and annotates the tests in gcc.c-torture/compile
  accordingly.
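
As an illustration of the stack-size annotation mentioned in the third item, a
test that needs a large frame could look like the sketch below. The file is
made up for this note (it is not one of the tests touched by the series), and
the 2048-byte figure is an arbitrary choice that simply exceeds the 512-byte
default frame limit of eBPF.

```c
/* { dg-do compile } */
/* { dg-require-stack-size "2048" } */

/* Hypothetical test body: the 2048-byte local buffer is what makes the
   stack-size requirement necessary.  */
int
sum_buffer (void)
{
  volatile unsigned char buf[2048];
  int i, sum = 0;

  for (i = 0; i < 2048; i++)
    buf[i] = (unsigned char) (i & 0xff);
  for (i = 0; i < 2048; i++)
    sum += buf[i];
  return sum;
}
```

Targets whose stack limit is below the stated size then report the test as
unsupported rather than as a failure.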

The rest of the patches are BPF specific:

The fifth patch adds the new GCC port proper.  Machine description,
implementation of target hooks and macros, command-line options and
the like.

The sixth patch adds a libgcc port for eBPF.  At the moment, it is
minimal and it basically addresses the limitations imposed by the
target, by excluding a few functions in libgcc2 (all of them related
to TImodes) whose default implementations exceed the eBPF stack limit.

The seventh, eighth and ninth patches deal with testing the new
port. The gcc.target testsuite is extended with eBPF-specific tests,
covering the backend-specific built-in functions and diagnostics.  The
check-effective-target functions are made aware of eBPF targets. Many
tests in the gcc.c-torture/compile testsuite are annotated to be
skipped in bpf-*-* targets, since they violate some restriction
imposed by the hardware (such as surpassing the stack limit.)  The
resulting testsuite doesn't have unexpected failures, and is currently
the principal way to check for regressions in the port.  Likewise,
many tests in the gcc.dg testsuite are annotated to be skipped in
bpf-*-* targets.

The tenth patch adds documentation updates to the GCC manual,
including information on the new command line options and compiler
built-ins.

Finally, the last patch adds myself as the maintainer of the BPF port.
I personally commit to evolving and maintaining the port for as long as
necessary, and to finding a suitable replacement in case I have to step
down for whatever reason.

Some notes on the port
======================

As a compilation target, eBPF is rather peculiar.  This is mainly due
to the quite hard restrictions imposed by the kernel verifier, and
also due to the security-driven design of the architecture itself.

To list a few examples:

. The stack is disjoint, and each stack frame corresponding to a
  function activation is isolated: it is not possible for a callee to
  access the stack frame of the caller, nor for a caller to access the
  stack frame of its callees.  The frame pointer register is
  read-only.

. Therefore it is not possible to pass arguments in the stack.

. Argument passing is restricted to 5 arguments.

. Each stack frame is limited to 512 bytes by default.

. The instruction set doesn't support indirect jumps.

. The instruction set doesn't support indirect calls.

. The architecture doesn't provide an explicit stack pointer.
  Instead, the eBPF "hardware" (in this case the kernel verifier)
  examines the compiled program and, by looking at the way the stack
  is accessed, estimates the size of the 

[PATCH V6 06/11] bpf: new libgcc port

2019-08-29 Thread Jose E. Marchesi
This patch adds an eBPF port to libgcc.

As of today, compiled eBPF programs do not support a single-entry
point schema.  Instead, a BPF "executable" is a relocatable ELF object
file containing multiple entry points, in certain named sections.

Also, the BPF loaders in the kernel do not execute .init/.fini
constructors/destructors.  Therefore, this patch provides empty crtn.S
and crti.S files.
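
To make the "multiple entry points in named sections" layout concrete, here
is a minimal sketch of what such a translation unit can look like. The macro
ENTRY_SECTION, the function names and the section names are all invented for
illustration; real section names are dictated by whichever loader is used.

```c
/* Place a function in a named section on ELF targets; expand to nothing
   elsewhere so the sketch still compiles.  */
#if defined (__ELF__)
# define ENTRY_SECTION(name) __attribute__ ((section (name), used))
#else
# define ENTRY_SECTION(name)
#endif

/* Two independent entry points.  There is no main and no constructor:
   a loader would pick each function up by its section name.  */
ENTRY_SECTION ("example_prog_a")
int
entry_a (int x)
{
  return x + 1;
}

ENTRY_SECTION ("example_prog_b")
int
entry_b (int x)
{
  return x * 2;
}
```

Compiling this with -c gives a relocatable object in which each function sits
in its own section, and nothing in it relies on startup code.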

libgcc/ChangeLog:

* config.host: Set cpu_type for bpf-*-* targets.
* config/bpf/t-bpf: New file.
* config/bpf/crtn.S: Likewise.
* config/bpf/crti.S: Likewise.
---
 libgcc/ChangeLog |  7 +++
 libgcc/config.host   |  7 +++
 libgcc/config/bpf/crti.S |  0
 libgcc/config/bpf/crtn.S |  0
 libgcc/config/bpf/t-bpf  | 23 +++
 5 files changed, 37 insertions(+)
 create mode 100644 libgcc/config/bpf/crti.S
 create mode 100644 libgcc/config/bpf/crtn.S
 create mode 100644 libgcc/config/bpf/t-bpf

diff --git a/libgcc/config.host b/libgcc/config.host
index 503ebb6be20..2e9fbc35482 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -107,6 +107,9 @@ avr-*-*)
 bfin*-*)
cpu_type=bfin
;;
+bpf-*-*)
+cpu_type=bpf
+;;
 cr16-*-*)
;;
 crisv32-*-*)
@@ -526,6 +529,10 @@ bfin*-*)
tmake_file="$tmake_file bfin/t-bfin t-fdpbit"
extra_parts="crtbegin.o crtend.o crti.o crtn.o"
 ;;
+bpf-*-*)
+tmake_file="$tmake_file ${cpu_type}/t-${cpu_type}"
+extra_parts="crti.o crtn.o"
+   ;;
 cr16-*-elf)
tmake_file="${tmake_file} cr16/t-cr16 cr16/t-crtlibid t-fdpbit"
extra_parts="$extra_parts crti.o crtn.o crtlibid.o"
diff --git a/libgcc/config/bpf/crti.S b/libgcc/config/bpf/crti.S
new file mode 100644
index 000..e69de29bb2d
diff --git a/libgcc/config/bpf/crtn.S b/libgcc/config/bpf/crtn.S
new file mode 100644
index 000..e69de29bb2d
diff --git a/libgcc/config/bpf/t-bpf b/libgcc/config/bpf/t-bpf
new file mode 100644
index 000..88129a78f61
--- /dev/null
+++ b/libgcc/config/bpf/t-bpf
@@ -0,0 +1,23 @@
+LIB2ADDEH = 
+
+crti.o: $(srcdir)/config/bpf/crti.S
+   $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $<
+
+crtn.o: $(srcdir)/config/bpf/crtn.S
+   $(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $<
+
+# Some of the functions defined in libgcc2 exceed the eBPF stack
+# limit, or other restrictions imposed by this peculiar target.
+# Therefore we have to exclude them here.
+#
+# Patterns in bpf.md must guarantee that no calls to the excluded
+# functions are ever generated, and compiler tests should make sure
+# this holds.
+#
+# Note that the modes in the function names below are misleading: di
+# means TImode.
+LIB2FUNCS_EXCLUDE = _mulvdi3 _divdi3 _moddi3 _divmoddi4 _udivdi3 _umoddi3 \
+_udivmoddi4
+
+# Prevent building "advanced" stuff (for example, gcov support).
+INHIBIT_LIBC_CFLAGS = -Dinhibit_libc
-- 
2.11.0



[PATCH V6 09/11] bpf: adjust GCC testsuite to eBPF limitations

2019-08-29 Thread Jose E. Marchesi
This patch makes many tests in gcc.dg and gcc.c-torture be skipped
on bpf-*-* targets.  This is due to the many limitations eBPF imposes
on what would otherwise be perfectly valid C code: no support for more
than 5 arguments to function calls, no support for indirect jumps, a
very limited range for direct jumps, etc.
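
For illustration, a skipped test might carry an annotation like the one in
the sketch below. The file and the comment string are made up for this note
and are not taken from the patch; dg-skip-if is the standard DejaGnu directive
for this purpose.

```c
/* { dg-do compile } */
/* { dg-skip-if "too many arguments in function call" { bpf-*-* } } */

/* Perfectly valid C, but it takes six arguments, one more than eBPF
   allows.  */
int
sum6 (int a, int b, int c, int d, int e, int f)
{
  return a + b + c + d + e + f;
}
```

On every other target the directive has no effect and the test runs as
before.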

Hopefully some of these restrictions will be relaxed in the future.
Also, as semantics associated with object linking get developed in
eBPF, it may be possible at some point to provide a set of standard
run-time libraries for eBPF programs.

gcc/testsuite/ChangeLog:

* gcc.dg/builtins-config.h: eBPF doesn't support C99 standard
functions.
* gcc.c-torture/compile/20101217-1.c: Add a function prototype for
printf.
* gcc.c-torture/compile/2211-1.c: Skip if target bpf-*-*.
* gcc.c-torture/compile/poor.c: Likewise.
* gcc.c-torture/compile/pr25311.c: Likewise.
* gcc.c-torture/compile/pr39928-1.c: Likewise.
* gcc.c-torture/compile/pr70061.c: Likewise.
* gcc.c-torture/compile/920501-7.c: Likewise.
* gcc.c-torture/compile/2403-1.c: Likewise.
* gcc.c-torture/compile/20001226-1.c: Likewise.
* gcc.c-torture/compile/20030903-1.c: Likewise.
* gcc.c-torture/compile/20031125-1.c: Likewise.
* gcc.c-torture/compile/20040101-1.c: Likewise.
* gcc.c-torture/compile/20040317-2.c: Likewise.
* gcc.c-torture/compile/20040726-1.c: Likewise.
* gcc.c-torture/compile/20051216-1.c: Likewise.
* gcc.c-torture/compile/900313-1.c: Likewise.
* gcc.c-torture/compile/920625-1.c: Likewise.
* gcc.c-torture/compile/930421-1.c: Likewise.
* gcc.c-torture/compile/930623-1.c: Likewise.
* gcc.c-torture/compile/961004-1.c: Likewise.
* gcc.c-torture/compile/980504-1.c: Likewise.
* gcc.c-torture/compile/980816-1.c: Likewise.
* gcc.c-torture/compile/990625-1.c: Likewise.
* gcc.c-torture/compile/DFcmp.c: Likewise.
* gcc.c-torture/compile/HIcmp.c: Likewise.
* gcc.c-torture/compile/HIset.c: Likewise.
* gcc.c-torture/compile/QIcmp.c: Likewise.
* gcc.c-torture/compile/QIset.c: Likewise.
* gcc.c-torture/compile/SFset.c: Likewise.
* gcc.c-torture/compile/SIcmp.c: Likewise.
* gcc.c-torture/compile/SIset.c: Likewise.
* gcc.c-torture/compile/UHIcmp.c: Likewise.
* gcc.c-torture/compile/UQIcmp.c: Likewise.
* gcc.c-torture/compile/USIcmp.c: Likewise.
* gcc.c-torture/compile/consec.c: Likewise.
* gcc.c-torture/compile/limits-fndefn.c: Likewise.
* gcc.c-torture/compile/lll.c: Likewise.
* gcc.c-torture/compile/parms.c: Likewise.
* gcc.c-torture/compile/pass.c: Likewise.
* gcc.c-torture/compile/pp.c: Likewise.
* gcc.c-torture/compile/pr32399.c: Likewise.
* gcc.c-torture/compile/pr34091.c: Likewise.
* gcc.c-torture/compile/pr34688.c: Likewise.
* gcc.c-torture/compile/pr37258.c: Likewise.
* gcc.c-torture/compile/pr37327.c: Likewise.
* gcc.c-torture/compile/pr37381.c: Likewise.
* gcc.c-torture/compile/pr37669-2.c: Likewise.
* gcc.c-torture/compile/pr37669.c: Likewise.
* gcc.c-torture/compile/pr37742-3.c: Likewise.
* gcc.c-torture/compile/pr44063.c: Likewise.
* gcc.c-torture/compile/pr48596.c: Likewise.
* gcc.c-torture/compile/pr51856.c: Likewise.
* gcc.c-torture/compile/pr54428.c: Likewise.
* gcc.c-torture/compile/pr54713-1.c: Likewise.
* gcc.c-torture/compile/pr54713-2.c: Likewise.
* gcc.c-torture/compile/pr54713-3.c: Likewise.
* gcc.c-torture/compile/pr55921.c: Likewise.
* gcc.c-torture/compile/pr70240.c: Likewise.
* gcc.c-torture/compile/pr70355.c: Likewise.
* gcc.c-torture/compile/pr82052.c: Likewise.
* gcc.c-torture/compile/pr83487.c: Likewise.
* gcc.c-torture/compile/pr86122.c: Likewise.
* gcc.c-torture/compile/pret-arg.c: Likewise.
* gcc.c-torture/compile/regs-arg-size.c: Likewise.
* gcc.c-torture/compile/structret.c: Likewise.
* gcc.c-torture/compile/uuarg.c: Likewise.
* gcc.dg/20001009-1.c: Likewise.
* gcc.dg/20020418-1.c: Likewise.
* gcc.dg/20020426-2.c: Likewise.
* gcc.dg/20020430-1.c: Likewise.
* gcc.dg/20040306-1.c: Likewise.
* gcc.dg/20040622-2.c: Likewise.
* gcc.dg/20050603-2.c: Likewise.
* gcc.dg/20050629-1.c: Likewise.
* gcc.dg/20061026.c: Likewise.
* gcc.dg/Warray-bounds-3.c: Likewise.
* gcc.dg/Warray-bounds-30.c: Likewise.
* gcc.dg/Wframe-larger-than-2.c: Likewise.
* gcc.dg/Wframe-larger-than.c: Likewise.
* gcc.dg/Wrestrict-11.c: Likewise.
* gcc.c-torture/compile/2804-1.c: Likewise.
---
 gcc/testsuite/ChangeLog| 87 ++
 

[PATCH V6 01/11] Update config.sub and config.guess.

2019-08-29 Thread Jose E. Marchesi
* config.sub: Import upstream version 2019-06-30.
* config.guess: Import upstream version 2019-07-24.
---
 ChangeLog|   5 ++
 config.guess | 264 +++
 config.sub   |  50 +--
 3 files changed, 240 insertions(+), 79 deletions(-)

diff --git a/config.guess b/config.guess
index 8e2a58b864f..97ad0733304 100755
--- a/config.guess
+++ b/config.guess
@@ -2,7 +2,7 @@
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2019 Free Software Foundation, Inc.
 
-timestamp='2019-01-03'
+timestamp='2019-07-24'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -262,6 +262,9 @@ case 
"$UNAME_MACHINE:$UNAME_SYSTEM:$UNAME_RELEASE:$UNAME_VERSION" in
 *:SolidBSD:*:*)
echo "$UNAME_MACHINE"-unknown-solidbsd"$UNAME_RELEASE"
exit ;;
+*:OS108:*:*)
+   echo "$UNAME_MACHINE"-unknown-os108_"$UNAME_RELEASE"
+   exit ;;
 macppc:MirBSD:*:*)
echo powerpc-unknown-mirbsd"$UNAME_RELEASE"
exit ;;
@@ -275,8 +278,8 @@ case 
"$UNAME_MACHINE:$UNAME_SYSTEM:$UNAME_RELEASE:$UNAME_VERSION" in
echo "$UNAME_MACHINE"-unknown-redox
exit ;;
 mips:OSF1:*.*)
-echo mips-dec-osf1
-exit ;;
+   echo mips-dec-osf1
+   exit ;;
 alpha:OSF1:*:*)
case $UNAME_RELEASE in
*4.0)
@@ -385,20 +388,7 @@ case 
"$UNAME_MACHINE:$UNAME_SYSTEM:$UNAME_RELEASE:$UNAME_VERSION" in
echo sparc-hal-solaris2"`echo "$UNAME_RELEASE"|sed -e 's/[^.]*//'`"
exit ;;
 sun4*:SunOS:5.*:* | tadpole*:SunOS:5.*:*)
-   set_cc_for_build
-   SUN_ARCH=sparc
-   # If there is a compiler, see if it is configured for 64-bit objects.
-   # Note that the Sun cc does not turn __LP64__ into 1 like gcc does.
-   # This test works for both compilers.
-   if [ "$CC_FOR_BUILD" != no_compiler_found ]; then
-   if (echo '#ifdef __sparcv9'; echo IS_64BIT_ARCH; echo '#endif') | \
-   (CCOPTS="" $CC_FOR_BUILD -E - 2>/dev/null) | \
-   grep IS_64BIT_ARCH >/dev/null
-   then
-   SUN_ARCH=sparcv9
-   fi
-   fi
-   echo "$SUN_ARCH"-sun-solaris2"`echo "$UNAME_RELEASE"|sed -e 
's/[^.]*//'`"
+   echo sparc-sun-solaris2"`echo "$UNAME_RELEASE" | sed -e 's/[^.]*//'`"
exit ;;
 i86pc:AuroraUX:5.*:* | i86xen:AuroraUX:5.*:*)
echo i386-pc-auroraux"$UNAME_RELEASE"
@@ -998,22 +988,50 @@ EOF
exit ;;
 mips:Linux:*:* | mips64:Linux:*:*)
set_cc_for_build
+   IS_GLIBC=0
+   test x"${LIBC}" = xgnu && IS_GLIBC=1
sed 's/^//' << EOF > "$dummy.c"
#undef CPU
-   #undef ${UNAME_MACHINE}
-   #undef ${UNAME_MACHINE}el
+   #undef mips
+   #undef mipsel
+   #undef mips64
+   #undef mips64el
+   #if ${IS_GLIBC} && defined(_ABI64)
+   LIBCABI=gnuabi64
+   #else
+   #if ${IS_GLIBC} && defined(_ABIN32)
+   LIBCABI=gnuabin32
+   #else
+   LIBCABI=${LIBC}
+   #endif
+   #endif
+
+   #if ${IS_GLIBC} && defined(__mips64) && defined(__mips_isa_rev) && 
__mips_isa_rev>=6
+   CPU=mipsisa64r6
+   #else
+   #if ${IS_GLIBC} && !defined(__mips64) && defined(__mips_isa_rev) && 
__mips_isa_rev>=6
+   CPU=mipsisa32r6
+   #else
+   #if defined(__mips64)
+   CPU=mips64
+   #else
+   CPU=mips
+   #endif
+   #endif
+   #endif
+
#if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL) || 
defined(MIPSEL)
-   CPU=${UNAME_MACHINE}el
+   MIPS_ENDIAN=el
#else
#if defined(__MIPSEB__) || defined(__MIPSEB) || defined(_MIPSEB) || 
defined(MIPSEB)
-   CPU=${UNAME_MACHINE}
+   MIPS_ENDIAN=
#else
-   CPU=
+   MIPS_ENDIAN=
#endif
#endif
 EOF
-   eval "`$CC_FOR_BUILD -E "$dummy.c" 2>/dev/null | grep '^CPU'`"
-   test "x$CPU" != x && { echo "$CPU-unknown-linux-$LIBC"; exit; }
+   eval "`$CC_FOR_BUILD -E "$dummy.c" 2>/dev/null | grep 
'^CPU\|^MIPS_ENDIAN\|^LIBCABI'`"
+   test "x$CPU" != x && { echo 
"$CPU${MIPS_ENDIAN}-unknown-linux-$LIBCABI"; exit; }
;;
 mips64el:Linux:*:*)
echo "$UNAME_MACHINE"-unknown-linux-"$LIBC"
@@ -1126,7 +1144,7 @@ EOF
*Pentium)UNAME_MACHINE=i586 ;;
*Pent*|*Celeron) UNAME_MACHINE=i686 ;;
esac
-   echo 
"$UNAME_MACHINE-unknown-sysv${UNAME_RELEASE}${UNAME_SYSTEM}{$UNAME_VERSION}"
+   echo 
"$UNAME_MACHINE-unknown-sysv${UNAME_RELEASE}${UNAME_SYSTEM}${UNAME_VERSION}"
exit ;;
 i*86:*:3.2:*)
if test -f /usr/options/cb.name; then
@@ -1310,38 +1328,39 @@ EOF
echo "$UNAME_MACHINE"-apple-rhapsody"$UNAME_RELEASE"
exit ;;
 *:Darwin:*:*)
-   UNAME_PROCESSOR=`uname -p` || UNAME_PROCESSOR=unknown
-   set_cc_for_build
-   if test "$UNAME_PROCESSOR" = unknown ; then
-   

[Bug target/91598] [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

--- Comment #1 from Maxim Kuvyrkov  ---
Created attachment 46783
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46783&action=edit
Testcase

Testcase reported on
https://lists.linaro.org/pipermail/linaro-toolchain/2019-August/006983.html

[PATCH V6 11/11] bpf: add myself as the maintainer for the eBPF port

2019-08-29 Thread Jose E. Marchesi
ChangeLog:

* MAINTAINERS: Add myself as the maintainer of the eBPF port.
Remove myself from Write After Approval section.
---
 ChangeLog   | 5 +
 MAINTAINERS | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5d8402949bc..5d69d696c2c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -57,6 +57,7 @@ arm port  Ramana Radhakrishnan

 arm port   Kyrylo Tkachov  
 avr port   Denis Chertykov 
 bfin port  Jie Zhang   
+bpf port   Jose E. Marchesi
 c6x port   Bernd Schmidt   
 cris port  Hans-Peter Nilsson  
 c-sky port Xianmiao Qu 
@@ -497,7 +498,6 @@ Luis Machado

 Ziga Mahkovec  
 Matthew Malcomson  
 Mikhail Maltsev
-Jose E. Marchesi   
 Patrick Marlier

 Simon Martin   
 Alejandro Martinez 

-- 
2.11.0



[PATCH V6 10/11] bpf: manual updates for eBPF

2019-08-29 Thread Jose E. Marchesi
gcc/ChangeLog:

* doc/invoke.texi (Option Summary): Cover eBPF.
(eBPF Options): New section.
* doc/extend.texi (BPF Built-in Functions): Likewise.
(BPF Kernel Helpers): Likewise.
---
 gcc/ChangeLog   |   7 +++
 gcc/doc/extend.texi | 171 
 gcc/doc/invoke.texi |  37 
 3 files changed, 215 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 4aea4d31761..e821cafff1e 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13604,6 +13604,8 @@ instructions, but allow the compiler to schedule those 
calls.
 * ARM ARMv8-M Security Extensions::
 * AVR Built-in Functions::
 * Blackfin Built-in Functions::
+* BPF Built-in Functions::
+* BPF Kernel Helpers::
 * FR-V Built-in Functions::
 * MIPS DSP Built-in Functions::
 * MIPS Paired-Single Support::
@@ -14601,6 +14603,175 @@ void __builtin_bfin_csync (void)
 void __builtin_bfin_ssync (void)
 @end smallexample
 
+@node BPF Built-in Functions
+@subsection BPF Built-in Functions
+
+The following built-in functions are available for eBPF targets.
+
+@deftypefn {Built-in Function} unsigned long long __builtin_bpf_load_byte 
(unsigned long long @var{offset})
+Load a byte from the @code{struct sk_buff} packet data pointed by the register 
@code{%r6} and return it.
+@end deftypefn
+
+@deftypefn {Built-in Function} unsigned long long __builtin_bpf_load_half 
(unsigned long long @var{offset})
+Load 16-bits from the @code{struct sk_buff} packet data pointed by the 
register @code{%r6} and return it.
+@end deftypefn
+
+@deftypefn {Built-in Function} unsigned long long __builtin_bpf_load_word 
(unsigned long long @var{offset})
+Load 32-bits from the @code{struct sk_buff} packet data pointed by the 
register @code{%r6} and return it.
+@end deftypefn
+
+@node BPF Kernel Helpers
+@subsection BPF Kernel Helpers
+
+These built-in functions are available for calling kernel helpers, and
+they are available depending on the kernel version selected as the
+CPU.
+
+Rather than using the built-ins directly, it is preferred for programs
+to include @file{bpf-helpers.h} and use the wrappers defined there.
+
+For a full description of what the helpers do, the arguments they
+take, and the returned value, see the
+@file{linux/include/uapi/linux/bpf.h} in a Linux source tree.
+
+@smallexample
+void *__builtin_bpf_helper_map_lookup_elem (void *map, void *key)
+int   __builtin_bpf_helper_map_update_elem (void *map, void *key,
+void *value,
+unsigned long long flags)
+int   __builtin_bpf_helper_map_delete_elem (void *map, const void *key)
+int   __builtin_bpf_helper_map_push_elem (void *map, const void *value,
+  unsigned long long flags)
+int   __builtin_bpf_helper_map_pop_elem (void *map, void *value)
+int   __builtin_bpf_helper_map_peek_elem (void *map, void *value)
+int __builtin_bpf_helper_clone_redirect (void *skb,
+ unsigned int ifindex,
+ unsigned long long flags)
+int __builtin_bpf_helper_skb_get_tunnel_key (void *ctx, void *key, int size, 
int flags)
+int __builtin_bpf_helper_skb_set_tunnel_key (void *ctx, void *key, int size, 
int flags)
+int __builtin_bpf_helper_skb_get_tunnel_opt (void *ctx, void *md, int size)
+int __builtin_bpf_helper_skb_set_tunnel_opt (void *ctx, void *md, int size)
+int __builtin_bpf_helper_skb_get_xfrm_state (void *ctx, int index, void *state,
+int size, int flags)
+static unsigned long long __builtin_bpf_helper_skb_cgroup_id (void *ctx)
+static unsigned long long __builtin_bpf_helper_skb_ancestor_cgroup_id
+ (void *ctx, int level)
+int __builtin_bpf_helper_skb_vlan_push (void *ctx, __be16 vlan_proto, __u16 
vlan_tci)
+int __builtin_bpf_helper_skb_vlan_pop (void *ctx)
+int __builtin_bpf_helper_skb_ecn_set_ce (void *ctx)
+
+int __builtin_bpf_helper_skb_load_bytes (void *ctx, int off, void *to, int len)
+int __builtin_bpf_helper_skb_load_bytes_relative (void *ctx, int off, void 
*to, int len, __u32 start_header)
+int __builtin_bpf_helper_skb_store_bytes (void *ctx, int off, void *from, int 
len, int flags)
+int __builtin_bpf_helper_skb_under_cgroup (void *ctx, void *map, int index)
+int __builtin_bpf_helper_skb_change_head (void *, int len, int flags)
+int __builtin_bpf_helper_skb_pull_data (void *, int len)
+int __builtin_bpf_helper_skb_change_proto (void *ctx, __be16 proto, __u64 
flags)
+int __builtin_bpf_helper_skb_change_type (void *ctx, __u32 type)
+int __builtin_bpf_helper_skb_change_tail (void *ctx, __u32 len, __u64 flags)
+int __builtin_bpf_helper_skb_adjust_room (void *ctx, __s32 len_diff, __u32 
mode,
+ unsigned long long flags)
+@end smallexample
+
+Other helpers:
+
+@smallexample
+int 

[PATCH V6 08/11] bpf: make target-supports.exp aware of eBPF

2019-08-29 Thread Jose E. Marchesi
This patch makes several effective-target checks in
target-supports.exp aware of eBPF targets.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_trampolines): Adapt to eBPF.
(check_effective_target_stack_size): Likewise.
(dg-effective-target-value): Likewise.
(check_effective_target_indirect_jumps): Likewise.
(check_effective_target_nonlocal_goto): Likewise.
(check_effective_target_global_constructor): Likewise.
(check_effective_target_return_address): Likewise.
---
 gcc/testsuite/ChangeLog   |  9 +
 gcc/testsuite/lib/target-supports.exp | 18 +++---
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index f457a46a02b..ce08a2f8421 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -526,7 +526,8 @@ proc check_effective_target_trampolines { } {
 || [istarget nvptx-*-*]
 || [istarget hppa2.0w-hp-hpux11.23]
 || [istarget hppa64-hp-hpux11.23]
-|| [istarget pru-*-*] } {
+|| [istarget pru-*-*]
+|| [istarget bpf-*-*] } {
return 0;
 }
 return 1
@@ -781,7 +782,7 @@ proc add_options_for_tls { flags } {
 # Return 1 if indirect jumps are supported, 0 otherwise.
 
 proc check_effective_target_indirect_jumps {} {
-if { [istarget nvptx-*-*] } {
+if { [istarget nvptx-*-*] || [istarget bpf-*-*] } {
return 0
 }
 return 1
@@ -790,7 +791,7 @@ proc check_effective_target_indirect_jumps {} {
 # Return 1 if nonlocal goto is supported, 0 otherwise.
 
 proc check_effective_target_nonlocal_goto {} {
-if { [istarget nvptx-*-*] } {
+if { [istarget nvptx-*-*] || [istarget bpf-*-*] } {
return 0
 }
 return 1
@@ -799,10 +800,9 @@ proc check_effective_target_nonlocal_goto {} {
 # Return 1 if global constructors are supported, 0 otherwise.
 
 proc check_effective_target_global_constructor {} {
-if { [istarget nvptx-*-*] } {
-   return 0
-}
-if { [istarget amdgcn-*-*] } {
+if { [istarget nvptx-*-*]
+|| [istarget amdgcn-*-*]
+|| [istarget bpf-*-*] } {
return 0
 }
 return 1
@@ -825,6 +825,10 @@ proc check_effective_target_return_address {} {
 if { [istarget nvptx-*-*] } {
return 0
 }
+# No notion of return address in eBPF.
+if { [istarget bpf-*-*] } {
+   return 0
+}
 # It could be supported on amdgcn, but isn't yet.
 if { [istarget amdgcn*-*-*] } {
return 0
-- 
2.11.0



[PATCH V6 07/11] bpf: gcc.target eBPF testsuite

2019-08-29 Thread Jose E. Marchesi
This patch adds a new testsuite to gcc.target, with eBPF-specific
tests.

Tests are included for:
- Target specific diagnostics.
- All built-in functions.

testsuite/ChangeLog:

* gcc.target/bpf/bpf.exp: New file.
* gcc.target/bpf/builtin-load.c: Likewise.
* gcc.target/bpf/constant-calls.c: Likewise.
* gcc.target/bpf/diag-funargs.c: Likewise.
* gcc.target/bpf/diag-indcalls.c: Likewise.
* gcc.target/bpf/helper-bind.c: Likewise.
* gcc.target/bpf/helper-bpf-redirect.c: Likewise.
* gcc.target/bpf/helper-clone-redirect.c: Likewise.
* gcc.target/bpf/helper-csum-diff.c: Likewise.
* gcc.target/bpf/helper-csum-update.c: Likewise.
* gcc.target/bpf/helper-current-task-under-cgroup.c: Likewise.
* gcc.target/bpf/helper-fib-lookup.c: Likewise.
* gcc.target/bpf/helper-get-cgroup-classid.c: Likewise.
* gcc.target/bpf/helper-get-current-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-get-current-comm.c: Likewise.
* gcc.target/bpf/helper-get-current-pid-tgid.c: Likewise.
* gcc.target/bpf/helper-get-current-task.c: Likewise.
* gcc.target/bpf/helper-get-current-uid-gid.c: Likewise.
* gcc.target/bpf/helper-get-hash-recalc.c: Likewise.
* gcc.target/bpf/helper-get-listener-sock.c: Likewise.
* gcc.target/bpf/helper-get-local-storage.c: Likewise.
* gcc.target/bpf/helper-get-numa-node-id.c: Likewise.
* gcc.target/bpf/helper-get-prandom-u32.c: Likewise.
* gcc.target/bpf/helper-get-route-realm.c: Likewise.
* gcc.target/bpf/helper-get-smp-processor-id.c: Likewise.
* gcc.target/bpf/helper-get-socket-cookie.c: Likewise.
* gcc.target/bpf/helper-get-socket-uid.c: Likewise.
* gcc.target/bpf/helper-getsockopt.c: Likewise.
* gcc.target/bpf/helper-get-stack.c: Likewise.
* gcc.target/bpf/helper-get-stackid.c: Likewise.
* gcc.target/bpf/helper-ktime-get-ns.c: Likewise.
* gcc.target/bpf/helper-l3-csum-replace.c: Likewise.
* gcc.target/bpf/helper-l4-csum-replace.c: Likewise.
* gcc.target/bpf/helper-lwt-push-encap.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-action.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-adjust-srh.c: Likewise.
* gcc.target/bpf/helper-lwt-seg6-store-bytes.c: Likewise.
* gcc.target/bpf/helper-map-delete-elem.c: Likewise.
* gcc.target/bpf/helper-map-lookup-elem.c: Likewise.
* gcc.target/bpf/helper-map-peek-elem.c: Likewise.
* gcc.target/bpf/helper-map-pop-elem.c: Likewise.
* gcc.target/bpf/helper-map-push-elem.c: Likewise.
* gcc.target/bpf/helper-map-update-elem.c: Likewise.
* gcc.target/bpf/helper-msg-apply-bytes.c: Likewise.
* gcc.target/bpf/helper-msg-cork-bytes.c: Likewise.
* gcc.target/bpf/helper-msg-pop-data.c: Likewise.
* gcc.target/bpf/helper-msg-pull-data.c: Likewise.
* gcc.target/bpf/helper-msg-push-data.c: Likewise.
* gcc.target/bpf/helper-msg-redirect-hash.c: Likewise.
* gcc.target/bpf/helper-msg-redirect-map.c: Likewise.
* gcc.target/bpf/helper-override-return.c: Likewise.
* gcc.target/bpf/helper-perf-event-output.c: Likewise.
* gcc.target/bpf/helper-perf-event-read.c: Likewise.
* gcc.target/bpf/helper-perf-event-read-value.c: Likewise.
* gcc.target/bpf/helper-perf-prog-read-value.c: Likewise.
* gcc.target/bpf/helper-probe-read.c: Likewise.
* gcc.target/bpf/helper-probe-read-str.c: Likewise.
* gcc.target/bpf/helper-probe-write-user.c: Likewise.
* gcc.target/bpf/helper-rc-keydown.c: Likewise.
* gcc.target/bpf/helper-rc-pointer-rel.c: Likewise.
* gcc.target/bpf/helper-rc-repeat.c: Likewise.
* gcc.target/bpf/helper-redirect-map.c: Likewise.
* gcc.target/bpf/helper-set-hash.c: Likewise.
* gcc.target/bpf/helper-set-hash-invalid.c: Likewise.
* gcc.target/bpf/helper-setsockopt.c: Likewise.
* gcc.target/bpf/helper-skb-adjust-room.c: Likewise.
* gcc.target/bpf/helper-skb-cgroup-id.c: Likewise.
* gcc.target/bpf/helper-skb-change-head.c: Likewise.
* gcc.target/bpf/helper-skb-change-proto.c: Likewise.
* gcc.target/bpf/helper-skb-change-tail.c: Likewise.
* gcc.target/bpf/helper-skb-change-type.c: Likewise.
* gcc.target/bpf/helper-skb-ecn-set-ce.c: Likewise.
* gcc.target/bpf/helper-skb-get-tunnel-key.c: Likewise.
* gcc.target/bpf/helper-skb-get-tunnel-opt.c: Likewise.
* gcc.target/bpf/helper-skb-get-xfrm-state.c: Likewise.
* gcc.target/bpf/helper-skb-load-bytes.c: Likewise.
* gcc.target/bpf/helper-skb-load-bytes-relative.c: Likewise.
* gcc.target/bpf/helper-skb-pull-data.c: Likewise.
* gcc.target/bpf/helper-skb-set-tunnel-key.c: Likewise.
* gcc.target/bpf/helper-skb-set-tunnel-opt.c: Likewise.

[PATCH V6 04/11] testsuite: new require effective target indirect_calls

2019-08-29 Thread Jose E. Marchesi
This patch adds a new dg_require_effective_target procedure to the
testsuite infrastructure: indirect_calls.  This new function tells
whether a target supports calls to non-constant call targets.

This patch also annotates the tests in the gcc.c-torture testsuite that
require support for indirect calls.
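
As an illustration only (this is not one of the files touched by the patch),
an annotated test might look like the sketch below.  The directive is the one
named in the ChangeLog; the function names and the test body are made up for
the example.

```c
/* { dg-require-effective-target indirect_calls } */

#include <assert.h>

/* Hypothetical test body: the callee is only known at run time, so the
   compiler has to emit a call to a non-constant target.  */
static int add_one (int x) { return x + 1; }
static int twice (int x) { return 2 * x; }

int
apply (int (*fn) (int), int x)
{
  assert (fn != 0);
  return fn (x);	/* Indirect call.  */
}

int
pick_and_apply (int selector, int x)
{
  int (*fn) (int) = selector ? add_one : twice;
  return apply (fn, x);
}
```

A target whose effective-target check for indirect_calls fails would then
report the test as unsupported instead of failing it.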

gcc/ChangeLog:

* doc/sourcebuild.texi (Effective-Target Keywords): Document
indirect_calls.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_indirect_calls):
New proc.
* gcc.c-torture/compile/20010102-1.c: Annotate with
dg-require-effective-target indirect_calls.
* gcc.c-torture/compile/20010107-1.c: Likewise.
* gcc.c-torture/compile/20011109-1.c: Likewise.
* gcc.c-torture/compile/20011218-1.c: Likewise.
* gcc.c-torture/compile/20011229-1.c: Likewise.
* gcc.c-torture/compile/20020129-1.c: Likewise.
* gcc.c-torture/compile/20020320-1.c: Likewise.
* gcc.c-torture/compile/20020706-1.c: Likewise.
* gcc.c-torture/compile/20020706-2.c: Likewise.
* gcc.c-torture/compile/20021205-1.c: Likewise.
* gcc.c-torture/compile/20030921-1.c: Likewise.
* gcc.c-torture/compile/20031023-1.c: Likewise.
* gcc.c-torture/compile/20031023-2.c: Likewise.
* gcc.c-torture/compile/20031023-3.c: Likewise.
* gcc.c-torture/compile/20031023-4.c: Likewise.
* gcc.c-torture/compile/20040614-1.c: Likewise.
* gcc.c-torture/compile/20040909-1.c: Likewise.
* gcc.c-torture/compile/20050122-1.c: Likewise.
* gcc.c-torture/compile/20050202-1.c: Likewise.
* gcc.c-torture/compile/20060208-1.c: Likewise.
* gcc.c-torture/compile/20081108-1.c: Likewise.
* gcc.c-torture/compile/20150327.c: Likewise.
* gcc.c-torture/compile/920428-2.c: Likewise.
* gcc.c-torture/compile/920928-5.c: Likewise.
* gcc.c-torture/compile/930117-1.c: Likewise.
* gcc.c-torture/compile/930607-1.c: Likewise.
* gcc.c-torture/compile/991213-2.c: Likewise.
* gcc.c-torture/compile/callind.c: Likewise.
* gcc.c-torture/compile/calls-void.c: Likewise.
* gcc.c-torture/compile/calls.c: Likewise.
* gcc.c-torture/compile/pr21840.c: Likewise.
* gcc.c-torture/compile/pr32139.c: Likewise.
* gcc.c-torture/compile/pr35607.c: Likewise.
* gcc.c-torture/compile/pr37433-1.c: Likewise.
* gcc.c-torture/compile/pr37433.c: Likewise.
* gcc.c-torture/compile/pr39941.c: Likewise.
* gcc.c-torture/compile/pr40080.c: Likewise.
* gcc.c-torture/compile/pr43635.c: Likewise.
* gcc.c-torture/compile/pr43791.c: Likewise.
* gcc.c-torture/compile/pr43845.c: Likewise.
* gcc.c-torture/compile/pr44043.c: Likewise.
* gcc.c-torture/compile/pr51694.c: Likewise.
* gcc.c-torture/compile/pr77754-2.c: Likewise.
* gcc.c-torture/compile/pr77754-3.c: Likewise.
* gcc.c-torture/compile/pr77754-4.c: Likewise.
* gcc.c-torture/compile/pr89663-2.c: Likewise.
* gcc.c-torture/compile/pta-1.c: Likewise.
* gcc.c-torture/compile/stack-check-1.c: Likewise.
* gcc.dg/Walloc-size-larger-than-18.c: Likewise.
---
 gcc/ChangeLog  |  5 ++
 gcc/doc/sourcebuild.texi   |  4 ++
 gcc/testsuite/ChangeLog| 55 ++
 gcc/testsuite/gcc.c-torture/compile/20010102-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20010107-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20011109-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20011218-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20011229-1.c   |  3 ++
 gcc/testsuite/gcc.c-torture/compile/20020129-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20020320-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20020706-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20020706-2.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20021205-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20030921-1.c   |  1 +
 gcc/testsuite/gcc.c-torture/compile/20031023-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20031023-2.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20031023-3.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20031023-4.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20040614-1.c   |  1 +
 gcc/testsuite/gcc.c-torture/compile/20040909-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20050122-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20050202-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20060208-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20081108-1.c   |  2 +
 gcc/testsuite/gcc.c-torture/compile/20150327.c |  2 +
 gcc/testsuite/gcc.c-torture/compile/920428-2.c |  2 +
 gcc/testsuite/gcc.c-torture/compile/920928-5.c |  3 ++
 gcc/testsuite/gcc.c-torture/compile/930117-1.c |  2 +
 

[Bug target/91598] New: [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

Bug ID: 91598
   Summary: [8/9/10 regression] 60% speed drop on neon intrinsic
loop
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Performance of the attached neon loop drops on Cortex-A53 by about 60% between
GCC 7 and GCC 8.  Performance of trunk is the same as GCC 8.

There are two separate changes, both related to the instruction scheduler, that
cause the regression.  The first change, r253235, is responsible for 70% of
the regression.
===
haifa-sched: fix autopref_rank_for_schedule qsort comparator

* haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant' insns
first, always call autopref_rank_data otherwise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253235
138bc75d-0d04-0410-961f-82ee72b054a4
===

After this change instead of
r1 = [rb + 0]
r2 = [rb + 8]
r3 = [rb + 16]
r4 = 
r5 = 
r6 = 

we got
r1 = [rb + 0]

r2 = [rb + 8]

r3 = [rb + 16]


which, apparently, cortex-a53 autoprefetcher doesn't recognize.  This schedule
happens because r2= load gets lower priority than the "irrelevant"  due to the above patch.

If we think about it, the fact that "r1 = [rb + 0]" can be scheduled means that
true dependencies of all similar base+offset loads are resolved.  Therefore,
for autoprefetcher-friendly schedule we should prioritize memory reads before
"irrelevant" instructions.

On the other hand, following similar logic, we want to delay memory stores as
much as possible to start scheduling them only after all potential producers
are scheduled.  I.e., for autoprefetcher-friendly schedule we should prioritize
"irrelevant" instructions before memory writes.

Obvious patch to implement the above is attached.  It brings 70% of regressed
performance on this testcase back.
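
The ordering argued for above (memory reads, then "irrelevant" instructions,
then memory writes) can be sketched as a standalone qsort comparator.  This is
not the attached patch and not code from haifa-sched.c: the types, names and
the ascending-offset tie-break are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical classification of a ready instruction.  */
enum insn_kind { INSN_MEM_READ, INSN_IRRELEVANT, INSN_MEM_WRITE };

struct insn
{
  enum insn_kind kind;
  int offset;	/* base+offset displacement; ignored for irrelevant insns.  */
};

/* Reads rank first, irrelevant insns second, writes last.  */
static int
kind_rank (enum insn_kind kind)
{
  switch (kind)
    {
    case INSN_MEM_READ:
      return 0;
    case INSN_IRRELEVANT:
      return 1;
    default:
      return 2;
    }
}

/* qsort comparator: order by rank, then by ascending offset for memory
   accesses so that loads from the same base come out in address order.  */
int
autopref_order (const void *pa, const void *pb)
{
  const struct insn *a = pa;
  const struct insn *b = pb;
  int ra = kind_rank (a->kind);
  int rb = kind_rank (b->kind);

  if (ra != rb)
    return ra - rb;
  if (a->kind == INSN_IRRELEVANT)
    return 0;
  return (a->offset > b->offset) - (a->offset < b->offset);
}

void
sort_ready (struct insn *ready, size_t n)
{
  assert (ready != NULL);
  qsort (ready, n, sizeof *ready, autopref_order);
}
```

The real scheduler weighs this rank against other priorities; the sketch only
shows the relative order of the three classes.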

The second part of the regression is due to compiler getting lucky with
scheduling inline-asms representing the intrinsics.  After 
===
Set default sched pressure algorithm

The Arm backend sets the default sched-pressure algorithm to
SCHED_PRESSURE_MODEL.  Benchmarking on AArch64 shows this speeds up floating
point performance on SPEC - eg. CactusBSSN improves by ~16%.  The gains are
mostly due to less spilling, so enable this on AArch64 by default.

gcc/
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254378
138bc75d-0d04-0410-961f-82ee72b054a4
===
the compiler no longer gets lucky on this testcase.

The solution here is to convert intrinsics in arm-neon.h to builtins/UNSPECs
and attach scheduler descriptions to the UNSPECs.

[PATCH V6 03/11] testsuite: annotate c-torture/compile tests with dg-require-stack-size

2019-08-29 Thread Jose E. Marchesi
This patch annotates tests that make use of a significant amount of
stack space.  Embedded and other restricted targets may have problems
compiling and running these tests.  Note that the annotations are in
many cases not exact.

testsuite/ChangeLog:

* gcc.c-torture/compile/2609-1.c: Annotate with
dg-require-stack-size.
* gcc/testsuite/gcc.c-torture/compile/2804-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20020304-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20020604-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20021015-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20050303-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20060421-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20071207-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20080903-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20121027-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/20151204.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/920501-12.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/920501-4.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/920723-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/921202-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/931003-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/931004-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/950719-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/951222-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/990517-1.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/bcopy.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr23929.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr25310.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr34458.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr39937.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr41181.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr41634.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr43415.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr43417.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/pr44788.c: Likewise.
* gcc/testsuite/gcc.c-torture/compile/sound.c: Likewise.
---
 gcc/testsuite/ChangeLog  | 35 
 gcc/testsuite/gcc.c-torture/compile/2609-1.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/2804-1.c |  1 +
 gcc/testsuite/gcc.c-torture/compile/20020304-1.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/20020604-1.c |  1 +
 gcc/testsuite/gcc.c-torture/compile/20021015-1.c |  1 +
 gcc/testsuite/gcc.c-torture/compile/20050303-1.c |  1 +
 gcc/testsuite/gcc.c-torture/compile/20060421-1.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/20071207-1.c |  1 +
 gcc/testsuite/gcc.c-torture/compile/20080903-1.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/20121027-1.c |  2 ++
 gcc/testsuite/gcc.c-torture/compile/20151204.c   |  1 +
 gcc/testsuite/gcc.c-torture/compile/920501-12.c  |  1 +
 gcc/testsuite/gcc.c-torture/compile/920501-4.c   |  1 +
 gcc/testsuite/gcc.c-torture/compile/920723-1.c   |  1 +
 gcc/testsuite/gcc.c-torture/compile/921202-1.c   |  2 ++
 gcc/testsuite/gcc.c-torture/compile/931003-1.c   |  2 ++
 gcc/testsuite/gcc.c-torture/compile/931004-1.c   |  2 ++
 gcc/testsuite/gcc.c-torture/compile/950719-1.c   |  2 ++
 gcc/testsuite/gcc.c-torture/compile/951222-1.c   |  2 ++
 gcc/testsuite/gcc.c-torture/compile/990517-1.c   |  3 ++
 gcc/testsuite/gcc.c-torture/compile/bcopy.c  |  1 +
 gcc/testsuite/gcc.c-torture/compile/pr23929.c|  1 +
 gcc/testsuite/gcc.c-torture/compile/pr25310.c|  1 +
 gcc/testsuite/gcc.c-torture/compile/pr34458.c|  1 +
 gcc/testsuite/gcc.c-torture/compile/pr39937.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr41181.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr41634.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr43415.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr43417.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr44788.c|  2 ++
 gcc/testsuite/gcc.c-torture/compile/sound.c  |  1 +
 32 files changed, 84 insertions(+)

diff --git a/gcc/testsuite/gcc.c-torture/compile/2609-1.c b/gcc/testsuite/gcc.c-torture/compile/2609-1.c
index f03aa35a7ac..e41701cc6d9 100644
--- a/gcc/testsuite/gcc.c-torture/compile/2609-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/2609-1.c
@@ -1,3 +1,5 @@
+/* { dg-require-stack-size "1024" } */
+
 int main ()
 {
   char temp[1024] = "tempfile";
diff --git a/gcc/testsuite/gcc.c-torture/compile/2804-1.c b/gcc/testsuite/gcc.c-torture/compile/2804-1.c
index 35464c212d2..550669b53a3 100644
--- a/gcc/testsuite/gcc.c-torture/compile/2804-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/2804-1.c
@@ -6,6 +6,7 @@
 /* { dg-skip-if "Not enough 64-bit registers" { pdp11-*-* 

Backports to 8.4

2019-08-29 Thread Jakub Jelinek
Hi!

I've backported following 12 commits from trunk to 8.4,
bootstrapped/regtested on x86_64-linux and i686-linux and committed
to gcc-8-branch.

Jakub
2019-08-29  Jakub Jelinek  

Backported from mainline
2019-04-19  Jakub Jelinek  

PR middle-end/90139
* tree-outof-ssa.c (get_temp_reg): If reg_mode is BLKmode, return
assign_temp instead of gen_reg_rtx.

* gcc.c-torture/compile/pr90139.c: New test.

--- gcc/tree-outof-ssa.c(revision 270456)
+++ gcc/tree-outof-ssa.c(revision 270457)
@@ -653,6 +653,8 @@ get_temp_reg (tree name)
   tree type = TREE_TYPE (name);
   int unsignedp;
   machine_mode reg_mode = promote_ssa_mode (name, &unsignedp);
+  if (reg_mode == BLKmode)
+return assign_temp (type, 0, 0);
   rtx x = gen_reg_rtx (reg_mode);
   if (POINTER_TYPE_P (type))
 mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
--- gcc/testsuite/gcc.c-torture/compile/pr90139.c   (nonexistent)
+++ gcc/testsuite/gcc.c-torture/compile/pr90139.c   (revision 270457)
@@ -0,0 +1,20 @@
+/* PR middle-end/90139 */
+
+typedef float __attribute__((vector_size (sizeof (float)))) V;
+void bar (int, V *);
+int l;
+
+void
+foo (void)
+{
+  V n, b, o;
+  while (1)
+switch (l)
+  {
+  case 0:
+   o = n;
+   n = b;
+   b = o;
+   bar (1, );
+  }
+}
2019-08-29  Jakub Jelinek  

Backported from mainline
2019-04-26  Jakub Jelinek  

PR debug/90197
* c-tree.h (c_finish_loop): Add 2 further location_t arguments.
* c-parser.c (c_parser_while_statement): Adjust c_finish_loop caller.
(c_parser_do_statement): Likewise.
(c_parser_for_statement): Likewise.  Formatting fixes.
* c-typeck.c (c_finish_loop): Add COND_LOCUS and INCR_LOCUS arguments,
emit DEBUG_BEGIN_STMTs if needed.

--- gcc/c/c-parser.c(revision 271347)
+++ gcc/c/c-parser.c(revision 271348)
@@ -6001,7 +6001,8 @@ c_parser_while_statement (c_parser *pars
   location_t loc_after_labels;
   bool open_brace = c_parser_next_token_is (parser, CPP_OPEN_BRACE);
   body = c_parser_c99_block_statement (parser, if_p, &loc_after_labels);
-  c_finish_loop (loc, cond, NULL, body, c_break_label, c_cont_label, true);
+  c_finish_loop (loc, loc, cond, UNKNOWN_LOCATION, NULL, body,
+c_break_label, c_cont_label, true);
   add_stmt (c_end_compound_stmt (loc, block, flag_isoc99));
   c_parser_maybe_reclassify_token (parser);
 
@@ -6046,6 +6047,7 @@ c_parser_do_statement (c_parser *parser,
   c_break_label = save_break;
   new_cont = c_cont_label;
   c_cont_label = save_cont;
+  location_t cond_loc = c_parser_peek_token (parser)->location;
   cond = c_parser_paren_condition (parser);
   if (ivdep && cond != error_mark_node)
 cond = build3 (ANNOTATE_EXPR, TREE_TYPE (cond), cond,
@@ -6059,7 +6061,8 @@ c_parser_do_statement (c_parser *parser,
   build_int_cst (integer_type_node, unroll));
   if (!c_parser_require (parser, CPP_SEMICOLON, "expected %<;%>"))
 c_parser_skip_to_end_of_block_or_statement (parser);
-  c_finish_loop (loc, cond, NULL, body, new_break, new_cont, false);
+  c_finish_loop (loc, cond_loc, cond, UNKNOWN_LOCATION, NULL, body,
+new_break, new_cont, false);
   add_stmt (c_end_compound_stmt (loc, block, flag_isoc99));
 }
 
@@ -6132,7 +6135,9 @@ c_parser_for_statement (c_parser *parser
   /* Silence the bogus uninitialized warning.  */
   tree collection_expression = NULL;
   location_t loc = c_parser_peek_token (parser)->location;
-  location_t for_loc = c_parser_peek_token (parser)->location;
+  location_t for_loc = loc;
+  location_t cond_loc = UNKNOWN_LOCATION;
+  location_t incr_loc = UNKNOWN_LOCATION;
   bool is_foreach_statement = false;
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_FOR));
   token_indent_info for_tinfo
@@ -6166,7 +6171,8 @@ c_parser_for_statement (c_parser *parser
  c_parser_consume_token (parser);
  is_foreach_statement = true;
  if (check_for_loop_decls (for_loc, true) == NULL_TREE)
-   c_parser_error (parser, "multiple iterating variables in fast enumeration");
+   c_parser_error (parser, "multiple iterating variables in "
+   "fast enumeration");
}
  else
check_for_loop_decls (for_loc, flag_isoc99);
@@ -6196,7 +6202,8 @@ c_parser_for_statement (c_parser *parser
  c_parser_consume_token (parser);
  is_foreach_statement = true;
  if (check_for_loop_decls (for_loc, true) == NULL_TREE)
-   c_parser_error (parser, "multiple iterating variables in fast enumeration");
+   c_parser_error (parser, "multiple iterating variables in "
+   "fast enumeration");
}
  else
check_for_loop_decls (for_loc, flag_isoc99);
@@ -6218,15 +6225,18 @@ 

[Bug target/91150] [10 Regression] wrong code with -O -mavx512vbmi due to wrong writemask

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91150

--- Comment #5 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:05:47 2019
New Revision: 275046

URL: https://gcc.gnu.org/viewcvs?rev=275046&root=gcc&view=rev
Log:
Backported from mainline
2019-07-30  Jakub Jelinek  

PR target/91150
* config/i386/i386.c (expand_vec_perm_blend): Change mask type
from unsigned to unsigned HOST_WIDE_INT.  For E_V64QImode cast
comparison to unsigned HOST_WIDE_INT before shifting it left.

* gcc.target/i386/avx512bw-pr91150.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/avx512bw-pr91150.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/i386/i386.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog
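
For readers unfamiliar with this class of bug: a comparison yields an int, so
shifting its result left by 32 or more is undefined when int is 32 bits wide,
and a 64-element mask cannot be built that way.  The sketch below shows the
widen-before-shift pattern the ChangeLog describes.  It is not the code from
i386.c; uint64_t stands in for unsigned HOST_WIDE_INT and the function name is
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Build a blend mask for up to 64 elements: bit I is set when PERM[I]
   selects from the second operand, i.e. PERM[I] >= NELT.  */
uint64_t
blend_mask (const unsigned char *perm, unsigned nelt)
{
  uint64_t mask = 0;

  assert (nelt <= 64);
  for (unsigned i = 0; i < nelt; ++i)
    /* Widen the int result of the comparison before shifting, so that
       shift counts 32..63 are well defined.  */
    mask |= (uint64_t) (perm[i] >= nelt) << i;
  return mask;
}
```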

[Bug middle-end/78884] [7/8] ICE when gimplifying VLA in OpenMP SIMD region

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78884

--- Comment #12 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:05:01 2019
New Revision: 275045

URL: https://gcc.gnu.org/viewcvs?rev=275045&root=gcc&view=rev
Log:
Backported from mainline
2019-07-04  Jakub Jelinek  

PR middle-end/78884
* gimplify.c (struct gimplify_omp_ctx): Add add_safelen1 member.
(gimplify_bind_expr): If seeing TREE_ADDRESSABLE VLA inside of simd
loop body, set ctx->add_safelen1 instead of making it GOVD_PRIVATE.
(gimplify_adjust_omp_clauses): Add safelen (1) clause if
ctx->add_safelen1 is set.

* gcc.dg/gomp/pr78884.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.dg/gomp/pr78884.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/gimplify.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug rtl-optimization/90756] [7/8 Regression] g++ ICE in convert_move, at expr.c:218 on i686 and s390x

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90756

--- Comment #23 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:04:19 2019
New Revision: 275044

URL: https://gcc.gnu.org/viewcvs?rev=275044&root=gcc&view=rev
Log:
Backported from mainline
2019-07-04  Jakub Jelinek  

PR rtl-optimization/90756
* explow.c (promote_ssa_mode): Always use TYPE_MODE, don't bypass it
for VECTOR_TYPE_P.

* gcc.dg/pr90756.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.dg/pr90756.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/explow.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug sanitizer/90954] ICE: combining undefined behavior sanitizer with openmp

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90954

--- Comment #5 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:03:31 2019
New Revision: 275043

URL: https://gcc.gnu.org/viewcvs?rev=275043&root=gcc&view=rev
Log:
Backported from mainline
2019-06-25  Jakub Jelinek  

PR sanitizer/90954
* c-omp.c (c_finish_omp_atomic): Allow tree_invariant_p in addition
to SAVE_EXPR in first operand of a COMPOUND_EXPR.

* c-c++-common/gomp/pr90954.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/c-c++-common/gomp/pr90954.c
Modified:
branches/gcc-8-branch/gcc/c-family/ChangeLog
branches/gcc-8-branch/gcc/c-family/c-omp.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug c++/90950] OpenMP clause handling rejecting references to incomplete types in templates

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90950

--- Comment #3 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:02:44 2019
New Revision: 275042

URL: https://gcc.gnu.org/viewcvs?rev=275042&root=gcc&view=rev
Log:
Backported from mainline
2019-06-21  Jakub Jelinek  

PR c++/90950
* semantics.c (finish_omp_clauses): Don't reject references to
incomplete types if processing_template_decl.

* g++.dg/gomp/lastprivate-1.C: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/g++.dg/gomp/lastprivate-1.C
Modified:
branches/gcc-8-branch/gcc/cp/ChangeLog
branches/gcc-8-branch/gcc/cp/semantics.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug c/90760] [8/9/10 Regression] ICE on attributes section and alias in set_section, at symtab.c:1573

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90760

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:01:54 2019
New Revision: 275041

URL: https://gcc.gnu.org/viewcvs?rev=275041&root=gcc&view=rev
Log:
Backported from mainline
2019-06-12  Jakub Jelinek  

PR c/90760
* symtab.c (symtab_node::set_section): Allow being called on aliases
as long as they aren't analyzed yet.

* gcc.dg/pr90760.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.dg/pr90760.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/symtab.c
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug debug/90733] [8 Regression] ICE in simplify_subreg, at simplify-rtx.c:6440

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90733

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:01:10 2019
New Revision: 275040

URL: https://gcc.gnu.org/viewcvs?rev=275040&root=gcc&view=rev
Log:
Backported from mainline
2019-06-05  Jakub Jelinek  

PR debug/90733
* var-tracking.c (vt_expand_loc_callback): Don't create raw subregs
with VOIDmode inner operands.

* gcc.dg/pr90733.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.dg/pr90733.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/var-tracking.c

[Bug libgomp/90585] libgomp hsa plugin ftbfs in the x32 multilib variant

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90585

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 15:00:20 2019
New Revision: 275039

URL: https://gcc.gnu.org/viewcvs?rev=275039&root=gcc&view=rev
Log:
Backported from mainline
2019-05-24  Jakub Jelinek  

PR libgomp/90585
* plugin/plugin-hsa.c (print_kernel_dispatch, run_kernel): Use PRIu64
macro instead of "lu".
(release_kernel_dispatch): Likewise.  Cast shadow->debug to uintptr_t
before casting to void *.

Modified:
branches/gcc-8-branch/libgomp/ChangeLog
branches/gcc-8-branch/libgomp/plugin/plugin-hsa.c
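
The two portability fixes named in this log entry follow a common pattern,
sketched below under assumptions: the helper names are hypothetical and the
code is not taken from plugin-hsa.c.  "lu" only matches uint64_t where long is
64 bits wide (not on x32), and a 64-bit integer should go through uintptr_t
before becoming a pointer when pointers may be 32 bits wide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably; PRIu64 expands to the right length
   modifier for uint64_t on the current ABI.  */
int
format_u64 (char *buf, size_t size, uint64_t value)
{
  assert (buf != NULL);
  return snprintf (buf, size, "value: %" PRIu64, value);
}

/* Convert a 64-bit integer holding an address to a pointer.  Going
   through uintptr_t avoids a cast between an integer and a pointer of
   different sizes.  */
void *
addr_to_pointer (uint64_t addr)
{
  return (void *) (uintptr_t) addr;
}
```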

[Bug debug/90197] [8 Regression] Cannot step through simple loop at -O -g

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90197

--- Comment #14 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 14:59:48 2019
New Revision: 275038

URL: https://gcc.gnu.org/viewcvs?rev=275038&root=gcc&view=rev
Log:
Backported from mainline
2019-05-15  Jakub Jelinek  

PR debug/90197
* cp-gimplify.c (genericize_cp_loop): Emit a DEBUG_BEGIN_STMT
before the condition (or if missing or constant non-zero at the end
of the loop.  Emit a DEBUG_BEGIN_STMT before the increment expression
if any.  Don't call protected_set_expr_location on incr if it already
has a location.

Modified:
branches/gcc-8-branch/gcc/cp/ChangeLog
branches/gcc-8-branch/gcc/cp/cp-gimplify.c

Re: [ARM/FDPIC v5 03/21] [ARM] FDPIC: Force FDPIC related options unless -mno-fdpic is provided

2019-08-29 Thread Christophe Lyon

On 16/07/2019 12:34, Richard Sandiford wrote:

Christophe Lyon  writes:

On 22/05/2019 10:45, Christophe Lyon wrote:

On Wed, 22 May 2019 at 10:39, Szabolcs Nagy  wrote:


On 21/05/2019 16:28, Christophe Lyon wrote:

--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -89,7 +89,7 @@
   #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:eb}"
   #endif
   #define MUSL_DYNAMIC_LINKER \
-  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}.so.1"
+  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E
"%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"


the line break seems wrong (either needs \ or no newline)


Sorry, that's a mailer artifact.


--- a/libsanitizer/configure.tgt
+++ b/libsanitizer/configure.tgt
@@ -45,7 +45,7 @@ case "${target}" in
  ;;
 sparc*-*-solaris2.11*)
  ;;
-  arm*-*-uclinuxfdpiceabi)
+  arm*-*-fdpiceabi)


should be *fdpiceabi instead of *-fdpiceabi i think.


Indeed, thanks.


FWIW, here is the updated patch:
- handles musl -fdpic suffix
- disables sanitizers for arm*-*-fdpiceabi
- does not handle -static in a special way, so using -static produces binaries 
that request the non-existing /usr/lib/ld.so.1, thus effectively making -static 
broken/unsupported (this does lead to a few more FAIL in the testsuite)

The plan is to work on -static-pie later, as discussed.


Could you make -static without -mno-fdpic an error via a %e spec,
so that the failure mode is a bit more user-friendly?

I realise this isn't your preferred option, sorry.



As discussed later, I didn't because I couldn't find a way
to catch linker (-Wl,XXX) options in the specs, and I prefer
to keep the possibility to generate a "static" binary using
"-static -Wl,-dynamic-linker XXX"

However, I also have a new patch in the series to disable tests that involve
-static, attached here.



diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index e1bacf4..6c25a1a 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -55,6 +55,8 @@
  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
"|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
  
+#define TARGET_FDPIC_ASM_SPEC  ""


Formatting nit: should be a single space before ""


OK


+
  #define BE8_LINK_SPEC \
"%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}"\
" %{mbig-endian:big}" \
@@ -64,7 +66,7 @@
  /* Tell the assembler to build BPABI binaries.  */
  #undef  SUBTARGET_EXTRA_ASM_SPEC
  #define SUBTARGET_EXTRA_ASM_SPEC \
-  "%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=5}" TARGET_FIX_V4BX_SPEC
+  "%{mabi=apcs-gnu|mabi=atpcs:-meabi=gnu;:-meabi=5}" TARGET_FIX_V4BX_SPEC 
TARGET_FDPIC_ASM_SPEC


Long line.


OK


diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
index 66ec0ea..d7cc923 100644
--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -89,7 +89,7 @@
  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:eb}"
  #endif
  #define MUSL_DYNAMIC_LINKER \
-  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}.so.1"
+  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E 
"%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"
  
  /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to

 use the GNU/Linux version, not the generic BPABI version.  */


Rich, could you confirm that this is (going to be?) the correct name?


This was confirmed by Rich.


diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
index 66ec0ea..d7cc923 100644
--- a/gcc/config/arm/linux-eabi.h
+++ b/gcc/config/arm/linux-eabi.h
@@ -89,7 +89,7 @@
  #define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:eb}"
  #endif
  #define MUSL_DYNAMIC_LINKER \
-  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E "%{mfloat-abi=hard:hf}.so.1"
+  "/lib/ld-musl-arm" MUSL_DYNAMIC_LINKER_E 
"%{mfloat-abi=hard:hf}%{mfdpic:-fdpic}.so.1"
  
  /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to

 use the GNU/Linux version, not the generic BPABI version.  */
@@ -101,11 +101,14 @@
  #undef  ASAN_CC1_SPEC
  #define ASAN_CC1_SPEC "%{%:sanitize(address):-funwind-tables}"
  
+#define FDPIC_CC1_SPEC ""

+
  #undef  CC1_SPEC
  #define CC1_SPEC  \
-  LINUX_OR_ANDROID_CC (GNU_USER_TARGET_CC1_SPEC " " ASAN_CC1_SPEC,   \
+  LINUX_OR_ANDROID_CC (GNU_USER_TARGET_CC1_SPEC " " ASAN_CC1_SPEC " "  \
+  FDPIC_CC1_SPEC,  \
   GNU_USER_TARGET_CC1_SPEC " " ASAN_CC1_SPEC " "   \
-  ANDROID_CC1_SPEC)
+  ANDROID_CC1_SPEC "" FDPIC_CC1_SPEC)
  
  #define CC1PLUS_SPEC \

LINUX_OR_ANDROID_CC ("", ANDROID_CC1PLUS_SPEC)


Does it make sense to add FDPIC_CC1_SPEC to the Android version?


No, now fixed.


diff --git a/gcc/config/arm/uclinuxfdpiceabi.h b/gcc/config/arm/uclinuxfdpiceabi.h
new file mode 100644
index 000..3180bcd
--- /dev/null

Re: [PATCH] Setup predicate for switch default case in IPA (PR ipa/91089)

2019-08-29 Thread Martin Jambor
Hi,

On Fri, Jul 12 2019, Feng Xue OS wrote:
> IPA does not construct an executability predicate for the default case of a
> switch statement.  So the execution cost of the default case is not properly
> evaluated in IPA-cp, which might prevent function cloning for a function
> containing a switch statement if a certain non-default case is proved to be
> executed after constant propagation.
>
> This patch deduces a predicate for the default case if it turns out to be a
> relatively simple one.  For example, we can try to merge case ranges and use
> comparisons against the range bounds, and also use range analysis information
> to simplify the predicate.
>

I have read through the patch and it looks OK to me but I cannot approve
it, you have to ping Honza for that.  Since you decided to use the value
range info, it would be nice if you could also add a testcase where it
plays a role.  Also, please don't post changelog entries as a part of
the patch, it basically guarantees it will not apply for anybody, not
even for you when you update your trunk.
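
For reference, the idea in the quoted description (merge case ranges, then
express the default case as comparisons against the merged bounds) can be
sketched outside of GCC as follows.  This is only an illustration with made-up
types and names, not the ipa-fnsummary.c code, and it ignores the value range
information mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct case_range { long lo, hi; };	/* Inclusive bounds of one case.  */

/* Merge ranges that overlap or are adjacent, in place.  R must be
   sorted by LO.  Returns the number of ranges left.  */
size_t
merge_case_ranges (struct case_range *r, size_t n)
{
  size_t out = 0;

  if (n == 0)
    return 0;
  for (size_t i = 1; i < n; ++i)
    {
      assert (r[i].lo >= r[i - 1].lo);
      if (r[out].hi == LONG_MAX || r[i].lo <= r[out].hi + 1)
	{
	  if (r[i].hi > r[out].hi)
	    r[out].hi = r[i].hi;
	}
      else
	r[++out] = r[i];
    }
  return out + 1;
}

/* The default case executes iff X lies outside every merged range.  */
int
default_case_taken (const struct case_range *r, size_t n, long x)
{
  for (size_t i = 0; i < n; ++i)
    if (x >= r[i].lo && x <= r[i].hi)
      return 0;
  return 1;
}
```

The fewer ranges are left after merging, the fewer comparisons the default
predicate needs, which is presumably why the patch bounds their number with a
param.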

Thanks for working on this,

Martin


> Feng
>
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3d92250b520..4de2f568990 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2019-07-12  Feng Xue  
> +
> + PR ipa/91089
> + * ipa-fnsummary.c (set_switch_stmt_execution_predicate): Add predicate
> + for switch default case using range analysis information.
> + * params.def (PARAM_IPA_MAX_SWITCH_PREDICATE_BOUNDS): New.
> +
>  2019-07-11  Sunil K Pandey  
>  


[Bug pch/90326] Using any precompiled header breaks definition of FLT_MAX

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90326

--- Comment #8 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 14:59:18 2019
New Revision: 275037

URL: https://gcc.gnu.org/viewcvs?rev=275037&root=gcc&view=rev
Log:
Backported from mainline
2019-05-10  Jakub Jelinek  

PR pch/90326
cp/
* config-lang.in (gtfiles): Remove c-family/c-lex.c, add
c-family/c-cppbuiltin.c.
objc/
* config-lang.in (gtfiles): Add c-family/c-format.c.
objcp/
* config-lang.in (gtfiles): Don't add c-family/c-cppbuiltin.c.
testsuite/
* g++.dg/pch/pr90326.C: New test.
* g++.dg/pch/pr90326.Hs: New file.

Added:
branches/gcc-8-branch/gcc/testsuite/g++.dg/pch/pr90326.C
branches/gcc-8-branch/gcc/testsuite/g++.dg/pch/pr90326.Hs
Modified:
branches/gcc-8-branch/gcc/cp/ChangeLog
branches/gcc-8-branch/gcc/cp/config-lang.in
branches/gcc-8-branch/gcc/objc/ChangeLog
branches/gcc-8-branch/gcc/objc/config-lang.in
branches/gcc-8-branch/gcc/objcp/ChangeLog
branches/gcc-8-branch/gcc/objcp/config-lang.in
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug debug/90197] [8 Regression] Cannot step through simple loop at -O -g

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90197

--- Comment #13 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 14:57:56 2019
New Revision: 275036

URL: https://gcc.gnu.org/viewcvs?rev=275036&root=gcc&view=rev
Log:
Backported from mainline
2019-04-26  Jakub Jelinek  

PR debug/90197
* c-tree.h (c_finish_loop): Add 2 further location_t arguments.
* c-parser.c (c_parser_while_statement): Adjust c_finish_loop caller.
(c_parser_do_statement): Likewise.
(c_parser_for_statement): Likewise.  Formatting fixes.
* c-typeck.c (c_finish_loop): Add COND_LOCUS and INCR_LOCUS arguments,
emit DEBUG_BEGIN_STMTs if needed.

Modified:
branches/gcc-8-branch/gcc/c/ChangeLog
branches/gcc-8-branch/gcc/c/c-parser.c
branches/gcc-8-branch/gcc/c/c-tree.h
branches/gcc-8-branch/gcc/c/c-typeck.c

[Bug middle-end/90139] [7/8 Regression] ICE in emit_block_move_hints, at expr.c:1601

2019-08-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90139

--- Comment #14 from Jakub Jelinek  ---
Author: jakub
Date: Thu Aug 29 14:57:18 2019
New Revision: 275035

URL: https://gcc.gnu.org/viewcvs?rev=275035&root=gcc&view=rev
Log:
Backported from mainline
2019-04-19  Jakub Jelinek  

PR middle-end/90139
* tree-outof-ssa.c (get_temp_reg): If reg_mode is BLKmode, return
assign_temp instead of gen_reg_rtx.

* gcc.c-torture/compile/pr90139.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.c-torture/compile/pr90139.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/tree-outof-ssa.c

Re: [ARM/FDPIC v5 02/21] [ARM] FDPIC: Handle arm*-*-uclinuxfdpiceabi in configure scripts

2019-08-29 Thread Christophe Lyon

On 12/07/2019 08:49, Richard Sandiford wrote:

Christophe Lyon  writes:

The new arm-uclinuxfdpiceabi target behaves pretty much like
arm-linux-gnueabi. In order to enable the same set of features, we
have to update several configure scripts that generally match targets
like *-*-linux*: in most places, we add *-uclinux* where there is
already *-linux*, or uclinux* when there is already linux*.

In gcc/config.gcc and libgcc/config.host we use *-*-uclinuxfdpiceabi
because there is already a different behaviour for *-*uclinux* target.

In libtool.m4, we use uclinuxfdpiceabi in cases where ELF shared
libraries support is required, as uclinux does not guarantee that.

2019-XX-XX  Christophe Lyon  

config/
* futex.m4: Handle *-uclinux*.
* tls.m4 (GCC_CHECK_TLS): Likewise.

gcc/
* config.gcc: Handle *-*-uclinuxfdpiceabi.

libatomic/
* configure.tgt: Handle arm*-*-uclinux*.
* configure: Regenerate.

libgcc/
* config.host: Handle *-*-uclinuxfdpiceabi.

libitm/
* configure.tgt: Handle *-*-uclinux*.
* configure: Regenerate.

libstdc++-v3/
* acinclude.m4: Handle uclinux*.
* configure: Regenerate.
* configure.host: Handle uclinux*

* libtool.m4: Handle uclinux*.


Has the libtool.m4 patch been submitted to upstream libtool?
I think this is supposed to be handled by submitting there first
and then cherry-picking into gcc, so that the change isn't lost
by a future import.


I added a comment to libtool.m4 about this.


[...]

diff --git a/config/tls.m4 b/config/tls.m4
index 1a5fc59..a487aa4 100644
--- a/config/tls.m4
+++ b/config/tls.m4
@@ -76,7 +76,7 @@ AC_DEFUN([GCC_CHECK_TLS], [
  dnl Shared library options may depend on the host; this check
  dnl is only known to be needed for GNU/Linux.
  case $host in
-   *-*-linux*)
+   *-*-linux* | -*-uclinux*)
  LDFLAGS="-shared -Wl,--no-undefined $LDFLAGS"
  ;;
  esac


Is this right for all uclinux targets?

I don't think so, now restricted to -*-uclinuxfdpic*




diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 84258d8..cb0fdc5 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4


It'd probably be worth splitting out the libstdc++-v3 bits and
submitting them separately, cc:ing libstd...@gcc.gnu.org.  But...


I've now split the patch into two parts (both attached here)



@@ -1404,7 +1404,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
  ac_has_nanosleep=yes
  ac_has_sched_yield=yes
  ;;
-  gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
+  gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
  AC_MSG_CHECKING([for at least GNU libc 2.17])
  AC_TRY_COMPILE(
[#include ],


is this the right thing to do?  It seems odd to be testing the glibc
version for uclibc.

Do you want to support multiple possible settings of
ac_has_clock_monotonic and ac_has_clock_realtime?  Or could you just
hard-code the values, given particular baseline assumptions about the
version of uclibc etc.?  Hard-coding would then make


@@ -1526,7 +1526,7 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
  
if test x"$ac_has_clock_monotonic" != x"yes"; then

  case ${target_os} in
-  linux*)
+  linux* | uclinux*)
AC_MSG_CHECKING([for clock_gettime syscall])
AC_TRY_COMPILE(
  [#include 


...this redundant.


Right, now fixed.


@@ -2415,7 +2415,7 @@ AC_DEFUN([GLIBCXX_ENABLE_CLOCALE], [
# Default to "generic".
if test $enable_clocale_flag = auto; then
  case ${target_os} in
-  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
+  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
enable_clocale_flag=gnu
;;
darwin*)


This too seems to be choosing a glibc setting for a uclibc target.

Indeed.




@@ -2661,7 +2661,7 @@ AC_DEFUN([GLIBCXX_ENABLE_ALLOCATOR], [
# Default to "new".
if test $enable_libstdcxx_allocator_flag = auto; then
  case ${target_os} in
-  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
+  linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu | uclinux*)
enable_libstdcxx_allocator_flag=new
;;
*)


The full case is:

   # Probe for host-specific support if no specific model is specified.
   # Default to "new".
   if test $enable_libstdcxx_allocator_flag = auto; then
 case ${target_os} in
   linux* | gnu* | kfreebsd*-gnu | knetbsd*-gnu)
enable_libstdcxx_allocator_flag=new
;;
   *)
enable_libstdcxx_allocator_flag=new
;;
 esac
   fi

which looks a bit redundant :-)


Right :-)

Thanks,

Christophe



Thanks,
Richard



From 81c84839b8f004b7b52317850f27f58e05bec6ad Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Fri, 4 May 2018 15:11:35 +
Subject: [ARM/FDPIC v6 02/24] [ARM] FDPIC: Handle 

Re: [ARM/FDPIC v5 01/21] [ARM] FDPIC: Add -mfdpic option support

2019-08-29 Thread Christophe Lyon

On 16/07/2019 12:11, Richard Sandiford wrote:

[This isn't really something that should be reviewed under global
reviewership, but if it's either that or nothing, I'll do it anyway...]

Christophe Lyon  writes:

2019-XX-XX  Christophe Lyon  
Mickaël Guêné  

gcc/
* config/arm/arm.opt: Add -mfdpic option.
* doc/invoke.texi: Add documentation for -mfdpic.

Change-Id: I0eabd1d11c9406fd4a43c4333689ebebbfcc4fe8

diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index 9067d49..2ed3bd5 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -306,3 +306,7 @@ Cost to assume for a branch insn.
  mgeneral-regs-only
  Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Save
  Generate code which uses the core registers only (r0-r14).
+
+mfdpic
+Target Report Mask(FDPIC)
+Enable Function Descriptor PIC mode.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 29585cf..805d7cc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -703,7 +703,8 @@ Objective-C and Objective-C++ Dialects}.
  -mrestrict-it @gol
  -mverbose-cost-dump @gol
  -mpure-code @gol
--mcmse}
+-mcmse @gol
+-mfdpic}
  
  @emph{AVR Options}

  @gccoptlist{-mmcu=@var{mcu}  -mabsdata  -maccumulate-args @gol
@@ -17912,6 +17913,23 @@ MOVT instruction.
  Generate secure code as per the "ARMv8-M Security Extensions: Requirements on
  Development Tools Engineering Specification", which can be found on
  
@url{http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf}.
+
+@item -mfdpic
+@itemx -mno-fdpic
+@opindex mfdpic
+@opindex mno-fdpic
+Select the FDPIC ABI, which uses function descriptors to represent


Maybe "64-bit function descriptors"?  Just a suggestion, might not be useful.

OK with that change, thanks.


OK, here is a new version, where I added a few words to explain that -static
is not supported.

Thanks,
Christophe



Richard


+pointers to functions.  When the compiler is configured for
+@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
+and implies @option{-fPIE} if none of the PIC/PIE-related options is
+provided.  On other targets, it only enables the FDPIC-specific code
+generation features, and the user should explicitly provide the
+PIC/PIE-related options as needed.
+
+The opposite @option{-mno-fdpic} option is useful (and required) to
+build the Linux kernel using the same (@code{arm-*-uclinuxfdpiceabi})
+toolchain as the one used to build the userland programs.
+
  @end table
  
  @node AVR Options




From c936684e2b77ff5716bd8b67c617dcad088c72e0 Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Thu, 8 Feb 2018 10:44:32 +0100
Subject: [ARM/FDPIC v6 01/24] [ARM] FDPIC: Add -mfdpic option support
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

2019-XX-XX  Christophe Lyon  
	Mickaël Guêné  

	gcc/
	* config/arm/arm.opt: Add -mfdpic option.
	* doc/invoke.texi: Add documentation for -mfdpic.

Change-Id: I05b98d6ae87c2b3fc04dd7fba415c730accdf33e

diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index 9067d49..2ed3bd5 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -306,3 +306,7 @@ Cost to assume for a branch insn.
 mgeneral-regs-only
 Target Report RejectNegative Mask(GENERAL_REGS_ONLY) Save
 Generate code which uses the core registers only (r0-r14).
+
+mfdpic
+Target Report Mask(FDPIC)
+Enable Function Descriptor PIC mode.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 29585cf..b77fa06 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -703,7 +703,8 @@ Objective-C and Objective-C++ Dialects}.
 -mrestrict-it @gol
 -mverbose-cost-dump @gol
 -mpure-code @gol
--mcmse}
+-mcmse @gol
+-mfdpic}
 
 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu}  -mabsdata  -maccumulate-args @gol
@@ -17912,6 +17913,27 @@ MOVT instruction.
 Generate secure code as per the "ARMv8-M Security Extensions: Requirements on
 Development Tools Engineering Specification", which can be found on
 @url{http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/ECM0359818_armv8m_security_extensions_reqs_on_dev_tools_1_0.pdf}.
+
+@item -mfdpic
+@itemx -mno-fdpic
+@opindex mfdpic
+@opindex mno-fdpic
+Select the FDPIC ABI, which uses 64-bit function descriptors to
+represent pointers to functions.  When the compiler is configured for
+@code{arm-*-uclinuxfdpiceabi} targets, this option is on by default
+and implies @option{-fPIE} if none of the PIC/PIE-related options is
+provided.  On other targets, it only enables the FDPIC-specific code
+generation features, and the user should explicitly provide the
+PIC/PIE-related options as needed.
+
+Note that static linking is not supported because it would still
+involve the dynamic linker when the program self-relocates.  If such
+behaviour is acceptable, use -static and -Wl,-dynamic-linker options.
+
+The opposite @option{-mno-fdpic} option is useful (and required) to

[GSoC-19] Adding functions in math.h as built-ins

2019-08-29 Thread Tejas Joshi
Hello.
As the GSoC deadline has passed, and regardless of the outcome,
I would like to sincerely thank GCC for giving me this opportunity to
contribute to and learn about GCC, which helped me get to know the open
source community.
Working on this project has helped me deepen my knowledge not only of
GCC and its intrinsics but also of compilers in general, C, C++ and
floating point intrinsics, and it will definitely help me in
developing my future career and opportunities.
I would like to thank Martin and Honza for guiding me throughout
this project, and also all those people, especially Joseph, Segher
and Richard, who actively assisted me with the problems I encountered.
Although the project hasn't met its specified deliverables by the
deadline, I will continue to work on the remaining items so
that they get committed too, and I will keep asking for input whenever
I need it.

Thanks,
Tejas


Re: enable_shared_from_this fails at runtime when inherited privately

2019-08-29 Thread Christian Schneider




Am 29.08.19 um 13:44 schrieb Jonathan Wakely:

On Thu, 29 Aug 2019 at 12:43, Jonathan Wakely wrote:


On Thu, 29 Aug 2019 at 11:50, Christian Schneider
 wrote:


Am 29.08.19 um 12:07 schrieb Jonathan Wakely:

On Thu, 29 Aug 2019 at 10:15, Christian Schneider
 wrote:


Hello,
I just discovered that using enable_shared_from_this while
inheriting from it privately fails at runtime.
I made a small example:

#include 
#include 
#include 
#include 

#ifndef prefix
#define prefix std
#endif

class foo:
    prefix::enable_shared_from_this<foo>
{
public:
    prefix::shared_ptr<foo> get_sptr()
    {
        return shared_from_this();
    }
};

int main()
{
    auto a = prefix::make_shared<foo>();
    auto b = a->get_sptr();
    return 0;
}

This compiles fine, but throws a bad_weak_ptr exception at runtime.
I'm aware that the implementation requires enable_shared_from_this
to be publicly inherited, but as a first-time user I had to find this
out the hard way, because the documentation I use (i.e. cppreference.com)
doesn't mention it, probably because it's not a requirement of the
standard.


It definitely is a requirement of the standard. The new wording we
added via 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0033r1.html#spec
says that the base's weak_ptr is only initialized when the base class
is "unambiguous and accessible". It doesn't say that an ambiguous or
inaccessible base class makes the program ill-formed, so we're not
allowed to reject such a program.

I see. As far as I understand, this sentence was removed:
Requires: enable_shared_from_this<T> shall be an accessible base class
of T. *this shall be a subobject of an object t of type T. There shall
be at least one shared_ptr instance p that owns &t.

As far as I read it, this required enable_shared_from_this to be publicly
accessible.


No. It only required it to be publicly accessible if you called
shared_from_this().


Do you (or does someone else) know why it was removed?


Yes (look at the author of the paper :-). As I wrote in that paper:

"The proposed wording removes the preconditions on shared_from_this so
that it is now well-defined to call it on an object which is not owned
by any shared_ptr, in which case shared_from_this would throw an
exception."

Previously it was undefined behaviour to call shared_from_this() if
the base class hadn't been initialized to share ownership with a
shared_ptr. That meant the following was undefined:

#include <memory>
struct X : std::enable_shared_from_this<X> { };
int main()
{
    X x; // not owned by a shared_ptr
    x.shared_from_this();
}

Now this program is perfectly well-defined, but it throws an
exception. There is no good reason to say that program has undefined
behaviour (which means potentially unbounded types of errors) when we
can just make it valid code that throws an exception when misused.


And in order to make it well-defined, we tightened up the
specification to say exactly how and when the weak_ptr in an
enable_shared_from_this base class gets initialized. If it's not
possible to initialize it (e.g. because it's private) then it doesn't
initialize it.


OK, thanks for the clarification and insights.
Since it is a requirement of the standard, I will add a note on
cppreference.com, so that it is clear that it needs to be publicly
inherited, and that it silently fails if you don't inherit publicly.





I find it a little, umm..., inconvenient, that the compiler happily
accepts it when it is clear that it never ever can work...


The code compiles and runs. It just doesn't do what you thought it
would do. Welcome to C++.

